17 Mar 2017

GlusterFS 3.8.10 is available

The 10th update for GlusterFS 3.8 is available for users of the 3.8 Long-Term-Maintenance version. Packages for this minor update are in many of the repositories for different distributions already. It is recommended to update any 3.8 installation to this latest release.

Release notes for Gluster 3.8.10

This is a bugfix release. The Release Notes for 3.8.0, 3.8.1, 3.8.2, 3.8.3, 3.8.4, 3.8.5, 3.8.6, 3.8.7, 3.8.8 and 3.8.9 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.8 stable release.

Improved configuration with additional 'virt' options

This release adds 5 more options to the virt group (for VM workloads) for optimal performance.
Updating to the glusterfs version containing this patch won't automatically set these newer options on existing volumes that already have group virt configured. The changes take effect only after the following command is run post-upgrade:
# gluster volume set <VOL> group virt
For existing volumes, users may run the following six commands for any option that is not already set:
# gluster volume set <VOL> performance.low-prio-threads 32
# gluster volume set <VOL> cluster.locking-scheme granular
# gluster volume set <VOL> features.shard on
# gluster volume set <VOL> cluster.shd-max-threads 8
# gluster volume set <VOL> cluster.shd-wait-qlength 10000
# gluster volume set <VOL> user.cifs off
It is most likely that features.shard would already have been set on the volume even before the upgrade, in which case the third volume set command above may be skipped.
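The per-option commands above can also be generated by a small script. The following is a minimal dry-run sketch, not part of the release notes: the function name `virt_option_commands` and the volume name `myvol` are placeholders, and the script only prints the commands so they can be reviewed before being run on a Gluster node.

```shell
#!/bin/sh
# Print one `gluster volume set` command per virt option listed above.
# Nothing is executed here; the output is meant to be reviewed first.
virt_option_commands() {
    vol="$1"
    # Each line of the here-document is "<option> <value>".
    while read -r key value; do
        printf 'gluster volume set %s %s %s\n' "$vol" "$key" "$value"
    done <<'EOF'
performance.low-prio-threads 32
cluster.locking-scheme granular
features.shard on
cluster.shd-max-threads 8
cluster.shd-wait-qlength 10000
user.cifs off
EOF
}

# "myvol" is a placeholder volume name.
virt_option_commands myvol
```

Once the printed commands look right, they can be piped to `sh` on a node that has the gluster CLI installed.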

Bugs addressed

A total of 18 patches have been merged, addressing 16 bugs:
  • #1387878: Rebalance after add bricks corrupts files
  • #1412994: Memory leak on mount/fuse when setxattr fails
  • #1420993: Modified volume options not synced once offline nodes comes up.
  • #1422352: glustershd process crashed on systemic setup
  • #1422394: Gluster NFS server crashing in __mnt3svc_umountall
  • #1422811: [Geo-rep] Recreating geo-rep session with same slave after deleting with reset-sync-time fails to sync
  • #1424915: dht_setxattr returns EINVAL when a file is deleted during the FOP
  • #1424934: Include few more options in virt file
  • #1424974: remove-brick status shows 0 rebalanced files
  • #1425112: [Ganesha] : Unable to bring up a Ganesha HA cluster on RHEL 6.9.
  • #1425307: Fix statvfs for FreeBSD in Python
  • #1427390: systemic testing: seeing lot of ping time outs which would lead to splitbrains
  • #1427419: Warning messages throwing when EC volume offline brick comes up are difficult to understand for end user.
  • #1428743: Fix crash in dht resulting from tests/features/nuke.t
  • #1429312: Prevent reverse heal from happening
  • #1429405: Restore atime/mtime for symlinks and other non-regular files.

7 Mar 2017

Access a Gluster volume as object storage (via S3)

Building gluster-object in Docker container:


This document is about accessing a Gluster volume through an object interface.

The object interface is provided by gluster-swift. (2)

Here, gluster-swift runs inside a Docker container. (1)

This object interface (the Docker container) accesses a Gluster volume that is mounted on the host.

The same Gluster volume is bind-mounted inside the Docker container, so it can be accessed using S3 GET/PUT requests.

Steps to build gluster-swift container:

Clone the docker-gluster-swift repository, which contains the Dockerfile:

$ git clone https://github.com/prashanthpai/docker-gluster-swift.git

$ cd docker-gluster-swift

Start Docker service:
$ sudo systemctl start docker.service

Build a new image using the Dockerfile:
$ docker build --rm --tag prashanthpai/gluster-swift:dev .

Sending build context to Docker daemon 187.4 kB
Sending build context to Docker daemon
Step 0 : FROM centos:7
 ---> 97cad5e16cb6
Step 1 : MAINTAINER Prashanth Pai <ppai@redhat.com>
 ---> Using cache
 ---> ec6511e6ae93
Step 2 : RUN yum --setopt=tsflags=nodocs -y update &&     yum --setopt=tsflags=nodocs -y install         centos-release-openstack-kilo         epel-release &&     yum --setopt=tsflags=nodocs -y install         openstack-swift openstack-swift-{proxy,account,container,object,plugin-swift3}         supervisor         git memcached python-prettytable &&     yum -y clean all
 ---> Using cache
 ---> ea7faccc4ae9
Step 3 : RUN git clone git://review.gluster.org/gluster-swift /tmp/gluster-swift &&     cd /tmp/gluster-swift &&     python setup.py install &&     cd -
 ---> Using cache
 ---> 32f4d0e75b14
Step 4 : VOLUME /mnt/gluster-object
 ---> Using cache
 ---> a42bbdd3df9f
Step 5 : RUN mkdir -p /etc/supervisor /var/log/supervisor
 ---> Using cache
 ---> cf5c1c5ee364
Step 6 : COPY supervisord.conf /etc/supervisor/supervisord.conf
 ---> Using cache
 ---> 537fdf7d9c6f
Step 7 : COPY supervisor_suicide.py /usr/local/bin/supervisor_suicide.py
 ---> Using cache
 ---> b5a82aaf177c
Step 8 : RUN chmod +x /usr/local/bin/supervisor_suicide.py
 ---> Using cache
 ---> 5c9971b033e4
Step 9 : COPY swift-start.sh /usr/local/bin/swift-start.sh
 ---> Using cache
 ---> 014ed9a6ae03
Step 10 : RUN chmod +x /usr/local/bin/swift-start.sh
 ---> Using cache
 ---> 00d3ffb6ccb2
Step 11 : COPY etc/swift/* /etc/swift/
 ---> Using cache
 ---> ca3be2138fa0
Step 12 : EXPOSE 8080
 ---> Using cache
 ---> 677fe3fd2fb5
Step 13 : CMD /usr/local/bin/swift-start.sh
 ---> Using cache
 ---> 3014617977e0
Successfully built 3014617977e0

Set up the Gluster volume:

Start the glusterd service, then create and mount the volume:

$ su
root@node1 docker-gluster-swift$ service glusterd start

Starting glusterd (via systemctl):                         [  OK  ]

Create gluster volume:

There are three nodes with CentOS 7.0 installed.

Ensure the glusterd service is started on all three nodes (node1, node2, node3) as below:
# systemctl start glusterd

root@node1 docker-gluster-swift$ sudo gluster volume create tv1  node1:/opt/volume_test/tv_1/b1 node2:/opt/volume_test/tv_1/b2  node3:/opt/volume_test/tv_1/b3 force

volume create: tv1: success: please start the volume to access data

- node1, node2 and node3 are the hostnames

- /opt/volume_test/tv_1/b1, /opt/volume_test/tv_1/b2 and /opt/volume_test/tv_1/b3 are the bricks

- tv1 is the volume name

root@node1 docker-gluster-swift$

Start gluster volume:
root@node1 docker-gluster-swift$ gluster vol start tv1

volume start: tv1: success

root@node1 docker-gluster-swift$ gluster vol status

Status of volume: tv1
Gluster process                             TCP Port  RDMA Port  Online  Pid
Brick node1:/opt/volume_test/tv_1/b1         49152     0          Y       5951
Brick node2:/opt/volume_test/tv_1/b2         49153     0          Y       5980
Brick node3:/opt/volume_test/tv_1/b3         49153     0          Y       5980

Task Status of Volume tv1
There are no active volume tasks
root@node1 docker-gluster-swift$

Create a directory to mount the volume:
root@node1 docker-gluster-swift$ mkdir -p /mnt/gluster-object/tv1

The path /mnt/gluster-object/ will be used when running the Docker container.

Mount the volume:

root@node1 docker-gluster-swift$ mount -t glusterfs node1:/tv1 /mnt/gluster-object/tv1

root@node1 docker-gluster-swift$

Verify mount:
sarumuga@node1 test$ mount | grep mnt

node1:/tv1 on /mnt/gluster-object/tv1 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)


Run a new container with the Gluster mount path:

root@node1 test$ docker run -d -p 8080:8080 -v /mnt/gluster-object:/mnt/gluster-object -e GLUSTER_VOLUMES="tv1" prashanthpai/gluster-swift:dev


-p 8080:8080
publishes a container port to the host, in the format hostport:containerport.

-v /mnt/gluster-object:/mnt/gluster-object
maps (a):(b), where
(a) is the location on the host where all Gluster volumes are mounted
(b) is the location inside the container where the volumes are mapped

-e GLUSTER_VOLUMES="tv1"
passes the tv1 volume name as an environment variable.

Verify container :
sarumuga@node1 test$ docker ps
CONTAINER ID        IMAGE                            COMMAND                CREATED             STATUS              PORTS                    NAMES
feb8867e1fd9        prashanthpai/gluster-swift:dev   "/bin/sh -c /usr/loc   29 seconds ago      Up 28 seconds       0.0.0.0:8080->8080/tcp   sick_heisenberg

Inspect container and get the IP address:
sarumuga@node1 test$ docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' feb8867e1fd9


Verifying S3 access :

Now, verify S3 access requests to the Gluster volume.

We are going to use s3curl (3) to verify object access.

Create bucket:
# ./s3curl.pl --debug --id 'tv1' --key 'test' --put /dev/null  -- -k -v

Put object
# ./s3curl.pl --debug --id 'tv1' --key 'test' --put  ./README -- -k -v -s

Get object
# ./s3curl.pl --debug --id 'tv1' --key 'test'   -- -k -v -s

List objects in a bucket request
# ./s3curl.pl --debug --id 'tv1' --key 'test'   -- -k -v -s

List all buckets
# ./s3curl.pl --debug --id 'tv1' --key 'test'   -- -k -v -s

Delete object
# ./s3curl.pl --debug --id 'tv1' --key 'test'   --del -- -k -v -s

Delete Bucket
# ./s3curl.pl --debug --id 'tv1' --key 'test'   --del -- -k -v -s


(1) GitHub - prashanthpai/docker-gluster-swift: Run gluster-swift inside a docker container.
(2) gluster-swift/quick_start_guide.md at master · gluster/gluster-swift · GitHub
(3) Amazon S3 Authentication Tool for Curl : Sample Code & Libraries : Amazon Web Services

27 Feb 2017

Bangalore Kubernetes Meetup #4 and update on next community event

Last weekend we had the 4th Kubernetes Meetup in Bangalore at VMware’s office. More than 100 people signed up and ~40 showed up. The first session was from Akshay Mathur and Manu Dilip Shah of A10 Networks, who shared their experience of choosing and using Kubernetes for their product.

The next session was from Krishna Kumar and Dhilip Kumar of Huawei, who talked about managing stateful applications with Kubernetes. Krishna first gave us a good overview and then Dhilip showed us a demo. He also urged participants to join the Kubernetes Stateful SIG.

The last talk was from Magesh and Kumar Gaurav of VMware on using k8s to orchestrate VMware SaaS. It was an interesting talk in which Magesh talked about some real problems his team has faced.

Some participants asked about comparing Docker Swarm, Kubernetes, Mesos Marathon and other container orchestrators. We had a brief discussion about it, and I shared my workshop from last year’s LinuxCon/ContainerCon EU on comparing different container orchestrators.

We also talked about what we should do in the next Kubernetes meetup. We might do hands-on labs.

In the end I also shared the details of the community event we are doing in conjunction with other meetup groups, like the one we did a few weeks back with the AWS, DevOps and Docker meetup groups. In April we plan to do a similar event on Microservices and Serverless. In addition to the previous participants, the Kubernetes and Mesos & CNCF meetup groups would also be joining it.

It was a good meetup. Thanks to the organisers and speakers.

21 Feb 2017

AWS, DevOps and Docker Meetup

Continuing our experiment with a community-driven conference, this time the AWS, DevOps and Docker meetups collaborated on a combined meetup. We charged each participant INR 100 to make sure they were really interested in coming. From the collected money we gave gifts to the speakers and some prizes to the participants. The meetup was hosted at LinkedIn’s Bangalore office. It was a really nice venue and everything went as expected.

AWS, DevOps and Docker Meetup – Feb’17

Out of 150 participants ~140 showed up, which is really nice. Some people came from Chennai and Cochin as well. So charging a nominal fee really works :). We started almost on time. The first talk was from Neeraj Shah of Minjar Cloud. He shared his experience of deploying an application on ECS. He covered the advantages of using ECS to deploy Docker-based applications in the cloud and why one should use it.

The next talk was from Mohit Sethi, who talked about managing storage in containerised environments.

Sreenivas Makam then talked about Docker 1.13 experimental features. He briefly introduced the new features of Docker 1.13 and then focussed on the experimental ones. Experimental features can now be enabled in the stable binary itself, which is a real help for anyone who wants to try out upcoming Docker features. In the end we also had a good discussion about Docker Stack.

We took a quick break after that and had a quick intro session with all the organisers. After the break Madan shared some great tips on saving costs with AWS.

After that Shamasis shared Postman’s scaling journey with Docker and AWS Beanstalk. It was really fascinating.

In the first flash talk Gourav Shah open-sourced a project for creating DevOps workspaces using Docker. Generally one needs more than one machine to try DevOps tools like Chef, Puppet, Ansible etc. With this tool one can build a multi-node setup using Docker and do different labs.

The last session was from Mitesh, who showed how to use Jenkins and CodeDeploy to deploy applications on AWS.

All of the sessions were very informative, the venue was a good host and the participants were eager to learn. I think it was a really good meetup.

Thanks to all the speakers and our hosts Bathri and Sethil at LinkedIn. Thanks to the organisers of the AWS and DevOps meetup groups, especially Mohit and Habeeb.

During the meetup some other meetup groups came forward to join us next time. We might do something around Microservices and Serverless in the last week of April in a similar mode. So stay tuned 🙂


16 Feb 2017

GlusterFS 3.8.9 is another Long-Term-Maintenance update

We are proud to announce the General Availability of the next update to the Long-Term-Stable releases for GlusterFS 3.8. Packages are being prepared and are expected to hit the repositories of distributions and the Gluster download server over the next few days. Details on which versions are part of which distributions can be found on the Community Packages page in the documentation.

The release notes are part of the git repository, the downloadable tarball and are included in this post for easy access.

Release notes for Gluster 3.8.9

This is a bugfix release. The Release Notes for 3.8.0, 3.8.1, 3.8.2, 3.8.3, 3.8.4, 3.8.5, 3.8.6, 3.8.7 and 3.8.8 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.8 stable release.

Bugs addressed

A total of 16 patches have been merged, addressing 14 bugs:
  • #1410852: glusterfs-server should depend on firewalld-filesystem
  • #1411899: DHT doesn't evenly balance files on FreeBSD with ZFS
  • #1412119: ganesha service crashed on all nodes of ganesha cluster on disperse volume when doing lookup while copying files remotely using scp
  • #1412888: Extra lookup/fstats are sent over the network when a brick is down.
  • #1412913: [ganesha + EC]posix compliance rename tests failed on EC volume with nfs-ganesha mount.
  • #1412915: Spurious split-brain error messages are seen in rebalance logs
  • #1412916: [ganesha+ec]: Contents of original file are not seen when hardlink is created
  • #1412922: ls and move hung on disperse volume
  • #1412941: Regression caused by enabling client-io-threads by default
  • #1414655: Upcall: Possible memleak if inode_ctx_set fails
  • #1415053: geo-rep session faulty with ChangelogException "No such file or directory"
  • #1415132: Improve output of "gluster volume status detail"
  • #1417802: debug/trace: Print iatts of individual entries in readdirp callback for better debugging experience
  • #1420184: [Remove-brick] Hardlink migration fails with "lookup failed (No such file or directory)" error messages in rebalance logs

10 Feb 2017

Avoiding a $10,000 AWS trap

If you’re a back end developer, DevOps or infrastructure person, or you’re interested in Amazon Web Services (AWS), check out this blog post from our Developer Adrian Hindle.

It follows his lightning talk for ‘Dispatches from the tech front line’, Cogapp’s Wired Sussex Open Studios event as part of Brighton Digital Festival.

Adrian delivering his talk at Cogapp’s Wired Sussex Open Studios event

Cloud computing

At Cogapp, when we need to build large load-balanced and auto-scaled infrastructure, we use AWS. Here’s AWS’ offering in their own words — “Amazon Web Services offers reliable, scalable, and inexpensive cloud computing services. Free to join, pay only for what you use”.

We also use GlusterFS on some of our sites to store large amounts of data (usually images). GlusterFS is a scalable network filesystem and we usually use a distributed and replicated setup over four servers.

I’m a back end and infrastructure developer, and part of my role is to be responsible for creating and maintaining this kind of setup.

We’ve found that this arrangement normally works well, having reliably used it across a couple of sites. I’m here to tell you about a rare occasion when it stopped working, with potentially expensive consequences! In the process of avoiding this perilous trap, we’re pleased (and relieved!) to say we also avoided any downtime to the site.

A curious problem

Everything had been running smoothly with this site until we started getting AWS Cloudwatch alerts about one of our Drupal instances dropping out of our load balancer. We looked into it, and noticed that the Cloudwatch monitoring had stopped on one of our Gluster servers (Gluster1).

It turned out that the Gluster1 instance had failed and was unavailable; in other words, it had just stopped working. The good thing with our GlusterFS setup is that even if one server stops working, the others carry on working and the data carries on being served. I’m still not sure why this caused one of our Drupal instances to drop out of the load balancer for a minute.

Cogapp investigates

Initially, I tried to replace the Gluster1 server and its ‘bricks’ with another one but for some strange reason the cluster did not accept the new server. After spending hours reading the documentation and asking questions on the IRC and on the mailing list, I gave up and we decided to replace the entire cluster. The old cluster never stopped working, but going down to three servers instead of four was dangerous because it would only take another server to go offline and the entire site would look broken or stop working properly. All our servers are provisioned with Ansible, so creating another cluster is quick and easy. But instead of getting the data from the old cluster, we thought we would use one of our backups and do a ‘fire drill’.


We back up every archive image to Glacier. We use Glacier instead of S3 because our GlusterFS setup is redundant and is, in a way, a backup of itself. The premise of Glacier is very cheap storage for infrequently accessed data, where a retrieval time of several hours is suitable. The way to put data into Glacier is through S3. You put data in S3 and create a rule on the S3 bucket to move the data to Glacier. You cannot access Glacier directly; you have to go through S3. To get data from Glacier you need to request a ‘restore’ of your files, they will then be available in the S3 bucket after a couple of hours.

The trap

In keeping with its intended niche of long term data archiving, storage costs on Glacier are low; but retrieval fees can be high. Just before I was going to restore the 5TB of images, I thought I would check how much this restoration would cost. This was quite a surprise. The estimated cost was going to be $10,000+!

Length of time for retrieval 4 hours:
Retrieval cost: $9,900.00
Transfer cost: $449.91
Total cost: $10,349.91

After some time playing around with a couple of cost calculators I realised, thankfully, that I could throttle the restoration over 2–3 days, and the cost would be divided by 10.

Length of time for retrieval 72 hours:
Retrieval cost: $547.25
Transfer cost: $449.91
Total cost: $997.16
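As a quick check of the arithmetic, the two estimates above can be recomputed from their parts. This is only a sketch: the helper names `total` and `ratio` are made up for illustration, and the dollar figures are the ones quoted in the two breakdowns, not values fetched from any AWS calculator.

```shell
#!/bin/sh
# Add a retrieval fee and a transfer fee, formatted as dollars and cents.
total() {
    awk -v r="$1" -v t="$2" 'BEGIN { printf "%.2f\n", r + t }'
}

# Ratio between two totals, to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

fast=$(total 9900.00 449.91)   # 4-hour retrieval
slow=$(total 547.25 449.91)    # 72-hour retrieval
echo "4-hour total:  \$$fast"
echo "72-hour total: \$$slow"
echo "ratio: $(ratio "$fast" "$slow")x"
```

The transfer cost is the same in both cases; only the retrieval fee shrinks when the restore is spread out, which is where the roughly tenfold saving on the total comes from.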

Problem solved…

So, instead of restoring the entire bucket, I wrote a small script that listed all the files in Glacier and wrote them to a text file. Then, looping on each file the script asked for the restoration of the file and then paused for a few seconds. The output was written to another file so that if the script stopped or if the request failed we could restart it and know where it stopped.
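The original script is not shown, so the following is only a sketch of the approach described above, under stated assumptions: the function name `restore_throttled`, the `RESTORE_CMD` and `PAUSE` variables, and the two-file layout (one file of keys, one file of keys already requested) are all hypothetical. `RESTORE_CMD` defaults to `echo`, so the sketch only prints keys until a real restore command is supplied.

```shell
#!/bin/sh
# Throttled, resumable restore loop (illustrative, not the author's script).
#   $1  file with one object key per line
#   $2  file recording keys whose restore has already been requested
# RESTORE_CMD is run with a single key as its argument (default: echo).
# PAUSE is the number of seconds to wait between requests (default: 5).
restore_throttled() {
    keys_file="$1"
    done_file="$2"
    touch "$done_file"
    while IFS= read -r key; do
        [ -n "$key" ] || continue
        # Skip keys recorded by an earlier run, so a restart resumes cleanly.
        if grep -Fxq -- "$key" "$done_file"; then
            continue
        fi
        # stdin is redirected so the command cannot swallow the key list.
        if ${RESTORE_CMD:-echo} "$key" < /dev/null; then
            printf '%s\n' "$key" >> "$done_file"
        else
            printf 'restore request failed: %s\n' "$key" >&2
            return 1
        fi
        sleep "${PAUSE:-5}"
    done < "$keys_file"
}
```

A run might look like `RESTORE_CMD=./request-restore.sh PAUSE=3 restore_throttled keys.txt requested.txt`, where `request-restore.sh` is a hypothetical wrapper, for example around the AWS CLI's `aws s3api restore-object`; the post does not say which tool was actually used. The `grep` lookup is linear per key, which keeps the sketch short but would be slow for hundreds of thousands of files.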

In our case, to restore 540,000+ JPEG 2000 (5TB+), the restoration took 4 days. Once the files were restored, the download speed from S3 to Gluster was (depending on the size of the images) 150–200 images per minute (total download ~125 hours). I started the restore script on a Monday morning, but because I started downloading the images as soon as they were available, by Friday the new Gluster cluster had all the images.

Once the new Gluster cluster was provisioned and had all the data, I did a couple of checks and got ready to replace the old cluster with the new one the following Monday.

…or was it?

When I came back to the office on Monday morning I had an email from Amazon. It said that one of the instances from the new cluster was scheduled to be retired due to problems with its underlying architecture! Which basically means I had to start all over again…but this time I didn’t get the data from Glacier!

What does this mean for you?

If you have a complicated technical project with some complex infrastructure and you need someone to set it up as cost-efficiently as possible, get in touch with Adrian and the Cogapp team.

Avoiding a $10,000 AWS trap was originally published in Cogapp on Medium, where people are continuing the conversation by highlighting and responding to this story.

31 Jan 2017

GlusterFS 3.7.20

GlusterFS 3.7.20 released

GlusterFS-3.7.20 has been released. This is a regular bug fix release for GlusterFS-3.7, and is currently the last planned release of GlusterFS-3.7. GlusterFS-3.10 is expected next month, and GlusterFS-3.7 enters EOL once it is released. The community will be notified of any changes to the EOL schedule.

The release-notes for GlusterFS-3.7.20 can be read here.

The release tarball and community provided packages can be obtained from download.gluster.org. The CentOS Storage SIG packages have been built and should be available soon from the centos-gluster37 repository.

30 Jan 2017

Gerrit OS Upgrade

When I started working on Gluster, Gerrit was a large piece of technical debt. We were running quite an old version on CentOS 5. Both of these items needed fixing. The Gerrit upgrade happened in June, causing me a good amount of stress for a whole week as I dealt with the fallout. The OS upgrade for Gerrit happened last weekend after a marathon working day that ended at 3 am. We ran into several hacks in the old setup and we worked on getting them working in a more acceptable manner. That took quite a bit of our time and energy. At the end of it, I’m happy to say, Gerrit now runs on a machine with CentOS 7. Now of course, it’s time to upgrade Gerrit again and start the whole cycle all over again.

There's light at the end of the tunnel, hopefully, it's not a train

Michael and I managed to coordinate well across timezones. We had a document going where we listed out the tasks to do. As we discovered more items, they went on the todo list. This document also listed all the hacks we discovered. We fixed some of them but did not move the fixes over to Ansible. We left some hacks in because fixing them will take some more time.

Things we learned the hard way:

  • Running the git protocol with xinetd was a trial and error process to configure. It took me hours to get it right. Here’s the right config file:

service git
{
        disable         = no
        socket_type     = stream
        wait            = no
        user            = nobody
        server          = /usr/libexec/git-core/git-daemon
        server_args     = --export-all --reuseaddr --base-path=/path/to/git/folder --inetd --verbose --base-path-relaxed
        log_on_failure  += USERID
}
  • There was some SELinux magic we needed for cgit. The documentation had some notes on how to get it right, but that didn’t work for us. Here’s what was needed:
semanage fcontext -a -t git_user_content_t "/path/to/git/folder(/.*)?"
  • When you set up replication to GitHub for the first time, you need to add the GitHub host keys to known_hosts. The easiest way is to try to ssh into GitHub. That will fail with a friendly error message and prompt you to add the keys. You could also get them from GitHub.
  • Gerrit needs AllowEncodedSlashes On and ProxyPass nocanon. Without these two bits of configuration, Gerrit returns random 404s.

We’ve moved two big items out of our tech debt backlog and into successes over the past year or so. Next step is a tie between a Jenkins upgrade and a Gerrit upgrade :)

Image credit: Captain Tenneal Steam Train (license)

24 Jan 2017

How to configure linux vxlans with multiple unicast endpoints

Sometimes you just can't use multicast. Some cloud providers just do not provide it. In that scenario, you need to configure your vxlan layer using unicast addresses. This is done easily using iproute2.

3 Node Network

With the preceding layout, we need the docker instances to be able to communicate with each other. We cannot use L3 routes because the provider will not route anything that's not on the network, so we need to set up our own L2 network layer over which we can establish our L3 routes. For this we'll use a Virtual Extensible LAN (VXLAN).

Linux has all the tools for setting up these VXLANs and the most common method is to use multicasting. This network doesn't support multicast routing so it's not a possibility. We must use unicast addressing.

We'll start by creating a vxlan interface on the first node.

ip link add vxlan0 type vxlan id 42 dev enp1s0 dstport 0

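This creates the vxlan0 device and attaches it to enp1s0. With dstport 0 the device listens on the Linux default VXLAN port (8472) rather than the IANA-assigned port 4789; pass dstport 4789 to use the latter. This does not assign any endpoints, so we'll add forwarding entries for the other two nodes: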

bridge fdb append to 00:00:00:00:00:00 dst dev vxlan0
bridge fdb append to 00:00:00:00:00:00 dst dev vxlan0

Assign an address and turn up the interface

ip addr add dev vxlan0
ip link set up dev vxlan0

Do the same on each of the other nodes.

ip link add vxlan0 type vxlan id 42 dev enp1s0 dstport 0
bridge fdb append to 00:00:00:00:00:00 dst dev vxlan0
bridge fdb append to 00:00:00:00:00:00 dst dev vxlan0
ip addr add dev vxlan0
ip link set up dev vxlan0

ip link add vxlan0 type vxlan id 42 dev enp1s0 dstport 0
bridge fdb append to 00:00:00:00:00:00 dst dev vxlan0
bridge fdb append to 00:00:00:00:00:00 dst dev vxlan0
ip addr add dev vxlan0
ip link set up dev vxlan0
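
Since the per-node steps differ only in the addresses involved, they can be generated from a few parameters. The sketch below is an illustration rather than the author's tooling: the function name `vxlan_commands` is made up, and the addresses in the example call (10.0.0.2, 10.0.0.3 and 192.168.42.1/24) are hypothetical placeholders, not the ones used in this network. It attaches the all-zero forwarding entries to vxlan0, the device whose forwarding table the VXLAN driver consults, and only prints the commands.

```shell
#!/bin/sh
# Print the commands that set up one node of a unicast VXLAN.
# Usage: vxlan_commands <underlay-dev> <vxlan-address/prefix> <peer-ip>...
vxlan_commands() {
    dev="$1"
    addr="$2"
    shift 2
    printf 'ip link add vxlan0 type vxlan id 42 dev %s dstport 0\n' "$dev"
    # One all-zero FDB entry per remote endpoint, so broadcast and
    # unknown-unicast frames are replicated to every peer.
    for peer in "$@"; do
        printf 'bridge fdb append to 00:00:00:00:00:00 dst %s dev vxlan0\n' "$peer"
    done
    printf 'ip addr add %s dev vxlan0\n' "$addr"
    printf 'ip link set up dev vxlan0\n'
}

# Hypothetical example: this node's overlay address and its two peers.
vxlan_commands enp1s0 192.168.42.1/24 10.0.0.2 10.0.0.3
```

On each node the peer list is the underlay addresses of the other nodes, and the overlay address is unique per node.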

Confirm you can ping via the vxlan.

ping -c4 ; ping -c4


PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=0.072 ms
64 bytes from icmp_seq=2 ttl=64 time=0.092 ms
64 bytes from icmp_seq=3 ttl=64 time=0.089 ms
64 bytes from icmp_seq=4 ttl=64 time=0.061 ms

--- ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.061/0.078/0.092/0.015 ms
PING ( 56(84) bytes of data.
64 bytes from icmp_seq=1 ttl=64 time=2.01 ms
64 bytes from icmp_seq=2 ttl=64 time=1.64 ms
64 bytes from icmp_seq=3 ttl=64 time=1.02 ms
64 bytes from icmp_seq=4 ttl=64 time=1.79 ms

--- ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3004ms
rtt min/avg/max/mdev = 1.027/1.619/2.015/0.367 ms

Your new VXLAN network is now ready for adding your L3 routes.

Add your docker L3 routes.

ip route add via
ip route add via

ip route add via
ip route add via

ip route add via
ip route add via

Now your docker containers can reach each other.

NOTE: This is not yet something that can be configured via systemd-networkd. https://github.com/systemd/systemd/issues/5145

15 Jan 2017

Another Gluster 3.8 Long-Term-Maintenance update with the 3.8.8 release

The Gluster team has been busy over the end-of-year holidays and this latest update to the 3.8 Long-Term-Maintenance release intends to fix quite a number of bugs. Packages have been built for many different distributions and are available from the download server. The release-notes for 3.8.8 have been included below for the ease of reference. All users on the 3.8 version are recommended to update to this current release.

Release notes for Gluster 3.8.8

This is a bugfix release. The Release Notes for 3.8.0, 3.8.1, 3.8.2, 3.8.3, 3.8.4, 3.8.5, 3.8.6 and 3.8.7 contain a listing of all the new features that were added and bugs fixed in the GlusterFS 3.8 stable release.

Bugs addressed

A total of 38 patches have been merged, addressing 35 bugs:
  • #1375849: [RFE] enable sharding with virt profile - /var/lib/glusterd/groups/virt
  • #1378384: log level set in glfs_set_logging() does not work
  • #1378547: Asynchronous Unsplit-brain still causes Input/Output Error on system calls
  • #1389781: build: python on Debian-based dists use .../lib/python2.7/dist-packages instead of .../site-packages
  • #1394635: errors appear in brick and nfs logs and getting stale files on NFS clients
  • #1395510: Seeing error messages [snapview-client.c:283:gf_svc_lookup_cbk] and [dht-helper.c:1666:dht_inode_ctx_time_update] (-->/usr/lib64/glusterfs/3.8.4/xlator/cluster/replicate.so(+0x5d75c)
  • #1399423: GlusterFS client crashes during remove-brick operation
  • #1399432: A hard link is lost during rebalance+lookup
  • #1399468: Wrong value in Last Synced column during Hybrid Crawl
  • #1399915: [SAMBA-CIFS] : IO hungs in cifs mount while graph switch on & off
  • #1401029: OOM kill of nfs-ganesha on one node while fs-sanity test suite is executed.
  • #1401534: fuse mount point not accessible
  • #1402697: glusterfsd crashed while taking snapshot using scheduler
  • #1402728: Worker restarts on log-rsync-performance config update
  • #1403109: Crash of glusterd when using long username with geo-replication
  • #1404105: Incorrect incrementation of volinfo refcnt during volume start
  • #1404583: Upcall: Possible use after free when log level set to TRACE
  • #1405004: [Perf] : pcs cluster resources went into stopped state during Multithreaded perf tests on RHGS layered over RHEL 6
  • #1405130: `gluster volume heal split-brain' does not heal if data/metadata/entry self-heal options are turned off
  • #1405450: tests/bugs/snapshot/bug-1316437.t test is causing spurious failure
  • #1405577: [GANESHA] failed to create directory of hostname of new node in var/lib/nfs/ganesha/ in already existing cluster nodes
  • #1405886: Fix potential leaks in INODELK cbk in protocol/client
  • #1405890: Fix spurious failure in bug-1402841.t-mt-dir-scan-race.t
  • #1405951: NFS-Ganesha:Volume reset for any option causes reset of ganesha enable option and bring down the ganesha services
  • #1406740: Fix spurious failure in tests/bugs/replicate/bug-1402730.t
  • #1408414: Remove-brick rebalance failed while rm -rf is in progress
  • #1408772: [Arbiter] After Killing a brick writes drastically slow down
  • #1408786: with granular-entry-self-heal enabled i see that there is a gfid mismatch and vm goes to paused state after migrating to another host
  • #1410073: Fix failure of split-brain-favorite-child-policy.t in CentOS7
  • #1410369: Dict_t leak in dht_migration_complete_check_task and dht_rebalance_inprogress_task
  • #1410699: [geo-rep]: Config commands fail when the status is 'Created'
  • #1410708: glusterd/geo-rep: geo-rep config command leaks fd
  • #1410764: Remove-brick rebalance failed while rm -rf is in progress
  • #1411011: atime becomes zero when truncating file via ganesha (or gluster-NFS)
  • #1411613: Fix the place where graph switch event is logged

11 Jan 2017

GlusterFS 3.7.19

GlusterFS 3.7.19 released

GlusterFS 3.7.19 is a regular bug fix release for GlusterFS-3.7. The release-notes for this release can be read here.

The release tarball and community provided packages can be obtained from download.gluster.org. The CentOS Storage SIG packages have been built and should be available soon from the centos-gluster37 repository.

A reminder to everyone, GlusterFS-3.7 is scheduled to be EOLed with the release of GlusterFS-3.10, which should happen sometime in February 2017.

9 Jan 2017

Bangalore Docker Meetup # 24

We did our first meetup for 2017 at GO-Jek’s engineering office in Bangalore. ~400 people RSVPed and ~120 people came for the meetup. This was the first time we tried to live-stream and record the videos. Though we could not record all the videos due to our lack of experience, we still got two complete ones, which I think is a good start. The first session was from Nishant Totla of Docker. He talked about Docker Swarm and Swarmkit.

The next session was from Saifi Khan, who talked about Docker on Windows. He had no slides, only demos. He had an issue with the display driver, so we could not see the demo, but even without slides or a demo he engaged the audience nicely and had a very interactive Q & A session.

After Saifi, Thomas Chacko talked about Serverless Computing and Docker. He gave a very informative overview with some demos.

Next was Kiran Mova from OpenEBS, who talked about emerging storage technologies for containers.

Lastly, I gave a demo of the Jenkins Docker Plugin and how we can use the Jenkins pipeline to build a CI/CD environment.

Within the community we have been discussing a community-driven mini-conference/larger meetup for some time, which I documented here. Let’s see if we can take it forward or not.

I also shared the details about the MOOC on container technologies which my company is launching on 14th Jan’17. The first course would be on Containers Fundamentals. You can find more details here.

We plan to live-stream and record future meetups. Hope to do better next time.

7 Jan 2017

Time for Community Driven Conference/Larger Meetup ?

You name the technology and you would find an active meetup for it in Bangalore, be it Docker, DevOps, AWS, GlusterFS, Machine Learning etc. Every meetup group meets at least once every two months and shares knowledge. These days the cloud, be it public or private, seems to be the common platform for everyone, and containers are slowly becoming the unit of execution. There are a few conferences dedicated to these topics which bring people together. This idea is on similar lines, but the question is whether we can do a conference which is driven by the community ?

Last year the Docker and DevOps meetup groups did a joint one-day meetup, for which we received good feedback. We could extend such a joint meetup into a mini-conference, but a few questions arise :-

  • Who would provide the venue ?
  • What if we announce the conference, sell a few tickets but don’t have enough funds to pull it off ?
  • What if we don’t find the sponsors ?
  • etc..

Can we play safe here ? Maybe yes !!! After discussing with some of the other meetup organisers, we thought of the following:-



We’ll open up 200 registrations in the beginning and ask people to pay 200 per ticket. 200 registrations, because that is what we can accommodate in a meetup if the event does not happen. We are charging some money to make sure each participant is really interested.

Once 200 people have signed up we can go to companies to sponsor the venue. The company can pay us or pay the venue/event management directly. If we don’t get the sponsorship then we’ll just do a meetup as usual. Whatever money we have collected so far, we can give back as prizes during the meetup.

Sounds good ?? This is just a concept for now. Let’s see if we can take it forward in the next week or so. Do leave your comments if you have suggestions/ideas.

31 Dec 2016

Gluster, Me and 2016

I have been working with the Gluster community since 2013.

2016 with Gluster was great. It gave me the opportunity to work on many areas of Gluster, mainly Geo-replication, Glusterfind and Events APIs. Expecting more and more challenges in this new year. Happy New Year to all.

Main highlights

  • Became Maintainer of Gluster Geo-replication component
  • Designed and implemented Events APIs for Gluster
  • Attended Gluster developer summit in Berlin

Number of patches per year

Number of Patches per Year

Number of patches per component

Number of Patches per Component

Gluster github projects

Many projects are still at an early stage. Comments and suggestions are welcome.

Projects started in 2013

Projects started in 2014

Projects started in 2015

Projects started in 2016

Gluster Projects

Charts are created using ggplot2 in R. For the code, look in the HTML comments of this page :)

28 Dec 2016

Gluster Geo-replication Dashboard Experiment

Gluster Events APIs are available with the Gluster 3.9 release. This project was created as an experiment to showcase the capabilities of the Gluster Events APIs. The dashboard shows realtime Geo-replication status without refreshing the page.

Note: This is not ready for production use yet!

Real-time notifications/UI changes only work with Gluster 3.9 or above, but the dashboard can work with older versions of Gluster (as a static display only; a manual page reload is required to check the current status).


Install the app on any one node of the cluster.

git clone https://github.com/aravindavk/gluster-georepdash.git
cd gluster-georepdash/

Install the following Python dependencies using,

sudo pip install flask flask_sockets glustercli

Install elm and bower using,

sudo npm install -g bower elm

Update the serverName in App.elm (editing serverName should be automatic; this is a code bug, will fix later) and then generate static/app.js using,

cd gluster-georepdash/
elm-package install
elm-make App.elm --output static/app.js

Install purecss for style using,

cd gluster-georepdash/static
bower install


Run main.py to start the app server. The dashboard can be accessed at http://nodename:5000

Test and register this node as an Events API subscriber using the webhook-test and webhook-add commands. Read more about starting the Events service here

gluster-eventsapi webhook-test http://nodename:5000/listen

If the webhook status is OK from all nodes, then add the webhook using,

gluster-eventsapi webhook-add http://nodename:5000/listen

That's all! If everything is okay, the dashboard will show realtime Geo-replication status.
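To make the webhook step more concrete, here is a minimal sketch of what a /listen endpoint could look like. It is a standard-library stand-in and not the dashboard's actual Flask code. The payload field names (event, nodeid, ts) and the helper names summarize_event, ListenHandler and run are assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_event(payload):
    """Build a one-line summary of an event payload (field names are assumed)."""
    event = payload.get("event", "UNKNOWN")
    node = payload.get("nodeid", "unknown-node")
    ts = payload.get("ts", 0)
    return "{} from {} at {}".format(event, node, ts)


class ListenHandler(BaseHTTPRequestHandler):
    """Accepts JSON POSTs on /listen and prints a summary of each event."""

    def do_POST(self):
        if self.path != "/listen":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length).decode("utf-8"))
        except ValueError:
            self.send_error(400, "invalid JSON")
            return
        if not isinstance(payload, dict):
            self.send_error(400, "expected a JSON object")
            return
        print(summarize_event(payload))
        # A 200 response tells the sender that the webhook is reachable.
        self.send_response(200)
        self.end_headers()


def run(port=5000):
    """Serve forever on the given port (5000 matches the URL used above)."""
    HTTPServer(("", port), ListenHandler).serve_forever()
```

Calling run() starts the listener. A real subscriber would push the received event to the browser instead of printing it.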


When Geo-replication is stopped

UI Changes when a Geo-rep session is stopped from anywhere in Cluster

When Geo-replication is stopped

UI Changes when a Geo-rep session goes to Faulty

UI/Dashboard Notes

  • UI is very raw since it was created for demo purposes
  • Frontend developed using Elm
  • No event is available for changes in the "Last Synced" column, so that column's value will not match the realtime output from the status command. Refresh the page to see the latest status.

13 Dec 2016

GlusterFS 3.7.18

GlusterFS 3.7.18 released (finally)

GlusterFS 3.7.18 has been released, about a week late.

This is a regularly scheduled release for GlusterFS-3.7 and includes 13 bug fixes since 3.7.17. The release-notes can be read here.

The tarball can be downloaded from download.gluster.org.

The CentOS Storage SIG packages have been built and should be available in the centos-gluster37 release repository.

Packages for other distros are available from download.gluster.org. See the READMEs in the respective subdirs at download.gluster.org for more details on how to obtain them.

2 Dec 2016

A year on my own…

Today I completed one year of being on my own.

In 2004, after graduating from college and a brief experiment with a startup, I came to Pune looking for a job. I failed to clear the interviews at many MNCs. In the last two and a half years of college I had spent most of my time learning Linux and computer networks, so I decided to narrow my search to places that needed some Linux admin work, and I got the job after my first interview. Anyway, coming forward, I did have some plans after leaving my full-time job but didn’t know what was coming next. As last time, Linux helped again, this time in the form of The Linux Foundation. I got an assignment from The Linux Foundation to build a self-paced course on “Cloud Infrastructure Technologies”, which was launched in June’16.

Between Dec’15 and March’16 I also gave free container (Docker) workshops in Bangalore, Pune, Chennai, Hyderabad and Kolkata. At the same time I gave some paid workshops as well.

Within a month of leaving my job I realised how much easier it is to be an employee than your own boss. One has to manage his/her time, next month’s pay check, family, health and so on without losing one’s cool. After going out on my own I started to respect entrepreneurs, the self-employed, my local vendors, auto rickshaw drivers etc. more.

Less sleep and a lot of work took a toll on my body and I was struck with Bell’s Palsy in Jan’16. It was very scary to go in for an MRI in the middle of the night for the checkup. It took me a few months to recover. This definitely affected my work and I had to re-organise myself. Like everyone, I thought of getting some interns and employees, but that did not work either. I spent my time, energy and money to bring them up to speed but did not succeed. It was difficult to part ways with some of them, but I am happy that we did so on good terms. It was a good experience, which I think can come only with a few mistakes. I learnt one very important lesson: “Never hire a full-timer until I am 100% satisfied”.

The formal registration of the company did not happen until May 2016. It is a marathon task as well. Working with a Chartered Accountant to get all the documentation done is not fun, but it has to be done anyway. For registration I also had to give a company name. I spent a good amount of time thinking about it. I was looking for some inspiration/help, which I got from Ranga Shankara, a theatre in Bangalore very near to my office. On my way back from my daily visit to a coffee shop I saw the board for the theatre festival “Youth Yuga”. In Hindi “Yuga” means “Era”, which made me think that this is the era of cloud computing. So why not name my company “CloudYuga”? I took the “.guru” domain as in the next year or two I would be focusing on trainings.

From Dec’15 to July’16 we worked from Bangalore Alpha Lab and then moved to our own office. It was fun to see our own office taking shape.


Throughout the year I was engaged with the Docker community, which helped me both professionally and psychologically. I became part of Docker Captain’s program, which was a good confidence booster. In Bangalore I organised Docker meetups, which kept me well connected with the local community. On my own I attended different international conferences, DevConf’16, DockerCon’16 and LinuxCon/ContainerCon’16, and spoke at two of them.

When I started last year I told my family, let’s see for 6 months, and if things do not work out then I can go back to a full-time job again. It’s been a year and I have things in the pipeline for the next 6 months, which is good. My family supported me very well, especially my wife Kanika, who has also been working with me part time for the last few months.

Till now my focus has been on training rather than consulting, which I believe will continue for some time. I see a good amount of skill gap in adopting container technologies... good for me !! Over the year I got some corporate clients for trainings, which I think will continue to grow. Here is the group photo of the container (Docker) training I delivered today in Pune.


Some of the learnings from last year’s experience are:-

  • There are more good people in the world than we think. They are willing to help you.
  • Reach out to people with a helping hand.
  • Give priority to health.
  • Be true to yourself and things would fall in place.

It’s been a fun ride with lots of new experiences. Let’s see how far we can go !!

Test Automation on CentOS CI with Ansible

I work on Gluster, which is a distributed file system. Testing a distributed file system needs a distributed setup. We run regressions by faking a distributed setup. We’re planning on using Glusto for real distributed functional testing. CentOS CI gives us on-demand physical hardware to run tests. I’ve been working on defining our jobs on CentOS CI with Jenkins Job Builder.

In the past, we created our jobs via the UI and committed the XML to a git repo. This version control is dependent on discipline from the person making the change. The system is not built around discipline. This does not scale well. Every person who wants to add a test needs to have access or work with someone who does have access to add a new job.

Manufacturing Line

With Jenkins Job Builder, the configuration is the single source of truth for any given job. As a bonus, this reduces code duplication. With David’s help, I wrote centos-ci-sample, which establishes a better pattern for CentOS CI jobs. David pointed me to the post-task publisher, which makes sure that the nodes are returned to the pool even on a failing job. This sample is good, but works best for jobs that need just one node.

We’re starting to set up proper multi-node tests for functional tests with Glusto. I decided to use an Ansible playbook for the setup of the nodes. Our internal QE folks will be re-using this playbook for setup.

Converting a Jenkins XML to JJB YAML is fun. It’s best to look at the UI and read the XML to get an idea of what the job does. Then write a YAML which does something close to that. Once you have a YAML, it’s best to convert it to XML and do a diff against the existing job. I use xmllint --c14n to make both XML files standardized. Then I use colordiff to compare them. This gives me an idea of what I’ve added/removed. There will always be some changes. JJB assumes some sane defaults.
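The canonicalize-then-diff step can also be sketched with the Python standard library, as an alternative to the xmllint and colordiff pair rather than a copy of that workflow. This assumes Python 3.8+ for xml.etree.ElementTree.canonicalize, which implements C14N 2.0, so its output may differ in detail from xmllint's. The function names and file labels are hypothetical.

```python
import difflib
from xml.etree.ElementTree import canonicalize


def canonical_lines(xml_text):
    """Return the C14N 2.0 form of xml_text as a list of lines."""
    # canonicalize() sorts attributes, expands empty tags and drops comments,
    # so purely cosmetic differences disappear before diffing.
    return canonicalize(xml_text).splitlines()


def diff_job_xml(existing_xml, generated_xml):
    """Unified diff between the existing job XML and the generated XML."""
    return list(
        difflib.unified_diff(
            canonical_lines(existing_xml),
            canonical_lines(generated_xml),
            fromfile="existing.xml",  # hypothetical labels for the diff header
            tofile="generated.xml",
            lineterm="",
        )
    )
```

An empty list means the two jobs are equivalent after canonicalization. Anything else is what the new YAML added or removed.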

Image credit: aldenjewell 1951 Plymouth Assembly Line (license)

21 Nov 2016

Docker Mentor Week 2016 – Bangalore

Docker organised Global Mentor Week around the world on 14-19th Nov’16. In Bangalore we scheduled it for 19th Nov at the Microsoft office. We divided the participants into two groups, Beginner and Intermediate, of 125 each, out of which ~60 turned up for each session.

The content for the mentor week was shared by Docker and had hands-on exercises for the basic and intermediate levels. Microsoft provided an Azure pass for every participant so that we could all have the same working environment. But during the workshop we faced some internet issues, so not everyone could do the hands-on exercises that day. With the Azure pass and labs, participants can try out the hands-on exercises later.

For the beginner level we had four mentors: Sreenivas Makam, Bathrinath Raveendran, Ratheesh T M and myself. Sreenivas did the heavy lifting by walking through the Docker basics and labs. He also shared the notes to try out Docker Mentor Week’s lab on Azure. We had the beginner session from ~9:40 AM to 12:145 PM.

After lunch we did the intermediate training. As the internet was not working, we decided to explain the concept of Container Orchestration in general and then showcased a sample application with Docker Swarm. I used the content from my LinuxCon/ContainerCon workshop, which I did last month in Berlin.

After that Ajeet showed how to prepare a Docker Swarm cluster on Azure and run & scale the sample application.

Both sessions were very interactive. I wish more people from the confirmed list had joined. I had to say no to a few group members as they could not get a confirmed ticket.

Thanks to Pracheta, Usha, Surubhi and Sudhir from Microsoft for helping us with the venue and Azure passes.

20 Nov 2016

Gluster Geo-replication Tools

A must-have collection of tools for Gluster Geo-replication users!

Currently this repository contains the gluster-georep-setup and gluster-georep-status tools. More tools will be added in future. Let me know if you need any specific tool to manage Gluster Geo-replication.

gluster-georep-setup was previously called georepsetup (the blog post about georepsetup is here).


Install the tools collection using pip install gluster-georep-tools. Binary packages are not yet available; hopefully I will work on the packaging in the near future.


gluster-georep-status is a wrapper around the Geo-rep status command and the Volume info command that provides more features than the Gluster CLI. This tool combines Geo-rep status and Volume info to offer the following advantages.

  • Nodes will be displayed in the same order as in Volume info
  • Offline nodes are shown with "Offline" as status
  • Status output from different sessions is not mixed.
  • Filters are available (Ex: --with-status=active, --with-crawl-status=changelog, --with-status=faulty etc)
  • Shows summary of number of workers per status

Example output (listing all the sessions with gluster-georep-status):

SESSION: gv1 ==> fvm1::gv2
| fvm1:/bricks/b1 | Active | Changelog Crawl |    fvm1    | 2016-11-14 08:34:40 |    N/A     |       N/A       |          N/A          |
| fvm1:/bricks/b2 | Active | Changelog Crawl |    fvm1    | 2016-11-14 08:32:21 |    N/A     |       N/A       |          N/A          |
Active: 2 | Passive: 0 | Faulty: 0 | Created: 0 | Offline: 0 | Stopped: 0 | Initializing: 0 | Total: 2

SESSION: gv1 ==> geoaccount@fvm1::gv3
| fvm1:/bricks/b1 | Stopped |     N/A      |    N/A     |     N/A     |    N/A     |       N/A       |          N/A          |
| fvm1:/bricks/b2 | Stopped |     N/A      |    N/A     |     N/A     |    N/A     |       N/A       |          N/A          |
Active: 0 | Passive: 0 | Faulty: 0 | Created: 0 | Offline: 0 | Stopped: 2 | Initializing: 0 | Total: 2
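The per-session summary line in the output above can be reproduced from the status rows with a few lines of Python. This is only an illustrative sketch that parses the printed rows, not the tool's actual implementation. The names STATUSES, worker_statuses and summary_line are hypothetical.

```python
from collections import Counter

# Status names in the order they appear in the summary line above.
STATUSES = ["Active", "Passive", "Faulty", "Created",
            "Offline", "Stopped", "Initializing"]


def worker_statuses(lines):
    """Extract the status column (second field) from '|'-delimited rows."""
    statuses = []
    for line in lines:
        if not line.lstrip().startswith("|"):
            continue  # skip SESSION headers and summary lines
        fields = [f.strip() for f in line.strip().strip("|").split("|")]
        if len(fields) >= 2:
            statuses.append(fields[1])
    return statuses


def summary_line(lines):
    """Count workers per status and format them like the tool's summary."""
    counts = Counter(worker_statuses(lines))
    parts = ["{}: {}".format(s, counts[s]) for s in STATUSES]
    parts.append("Total: {}".format(sum(counts.values())))
    return " | ".join(parts)
```

Feeding the two rows of the first session to summary_line gives the same counts as the summary shown for that session.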


In a previous blog post, we discussed the gluster-georep-setup tool. It simplifies the steps involved in Geo-replication setup. Now setting up Geo-replication is as easy as running one command. Yay!

Gluster Geo-rep Setup

Usage instructions of all the tools are available here

Let me know if these tools are useful.