Gluster Rocks the Vote

Rock the Vote needed a way to manage the fast growth of the data handled by its Web-based voter registration application. The organization turned to GlusterFS replicated volumes to allow for filesystem size upgrades on its virtualized hosting infrastructure without incurring downtime.

Over its twenty-one year history, Rock the Vote has registered more than five million young people to vote, and has become a trusted source of information about registering to vote and casting a ballot.

rtv

Since 2009, Rock the Vote has run a Web-based voter registration application, powered by an open source rails application stack called Rocky.

I talked to Lance Albertson, Associate Director of Operations at the Oregon State University Open Source Lab and primary technical systems operation lead for the service, about how they’re using Gluster to provide for the service’s growing storage requirements.

“During a non-election season,” Albertson explained, “the filesystem use and growth is minimal, however during a presidential election season, the growth of the filesystem can be exponential. So with Gluster we’re trying to solve the sudden growth problem we have.”

Rock the Vote’s voter registration application is served from a virtual machine instance running Gentoo Hardened, with a pair of physical servers running CentOS 6 with Gluster 3.3.0 to host voter registration form data. The storage nodes host a replicated GlusterFS volume, which the registration front end accesses via Gluster’s NFS mount support.

The Gluster-backed iteration of the voter registration application started out in September with a 100GB volume, which the team stepped up incrementally to 350GB as usage grew in the period leading up to the election.

Before implementing Gluster for their storage needs, Rock the Vote’s application hosting team was using local storage within their virtual machines to store the voter form data, which made it difficult to expand storage without bringing their VMs down to do so.

The hosting team shifted storage to an HA NFS cluster, but found the implementation fragile and prone to breakage when adding/removing NFS volumes and shares.

“Gluster allowed us more flexibility in how we manage that storage without downtime,” Albertson continued, “Gluster made it easy to add a volume and grow it as we needed.”

Looking ahead to future election seasons, and forthcoming GlusterFS releases, Albertson told me that the Gluster attribute he’s most interested in is limited-downtime upgrades between version 3.3.0 and future Gluster releases. Albertson is also looking forward to the addition of multi-master support in Gluster’s geo-replication capability, an enhancement planned for the upcoming 3.4 version.

Gluster User Story: Fedora Hosted

The Fedora Project’s infrastructure team needed a way to ensure the reliability of its Fedora Hosted service, while making the most of their available hardware resources. The team tapped GlusterFS replicated volumes to convert what had been a two-node, active/passive, eventually consistent hosting configuration into a well-synchronized setup in which both nodes could take on user load.

Hosting Fedora Hosted

The Fedora Infrastructure team develops, deploys, and maintains various services for the Fedora Project. One of these services, Fedora Hosted, provides open source projects with a place to host their code and collaborate online.

I talked to the team’s Infrastructure Lead, Kevin Fenzi, about how they’re using Gluster to ensure availability of these services while making the most of their server resources.

Fedora Hosted is served from a pair of virtual instances hosted at serverbeach.com, which donates these resources to the project. The instances run Red Hat Enterprise Linux 6 and maintain a replicated GlusterFS 3.3.0 volume to keep the 50GB of project data stored at Fedora Hosted in sync. The nodes use Gluster’s NFS mount support, which the team found to deliver better performance with the many small files that Fedora Hosted serves.

“Both servers are in DNS, so it’s round robin which one you hit for any given connection. Since the data on the backend is replicated, both of them are up to date at any given time,” Kevin explained. “This way, not only can we handle more load cpu-wise, but if we wish to reboot one node for an update or the like, we simply adjust DNS and there is no outage seen by our projects.”

The Road to Gluster

An earlier incarnation of Fedora Hosted was also run on a pair of virtual instances, one actively serving users and the other a standby kept in sync with an hourly rsync job. If the primary node failed, the standby instance could be brought up in short order, but the hourly sync window meant that the service could suffer an hour or two of data loss.

The Fedora Infrastructure team managed to close this sync window by shifting to a new configuration based on the DRBD project. While this solution dealt with the problem of data loss following an outage, the configuration left one node mostly idle.

The team’s first foray into a GlusterFS-backed configuration for Fedora Hosted turned up a couple of issues with the then-current GlusterFS version 3.2, which the Gluster project addressed in their 3.3 release.

“The Gluster folks were very responsive to our issues and were working on the patch very soon after we requested it,” Kevin explained. “Additionally, 3.3 performance seemed to be much better than 3.2 for our use cases.”

Looking ahead, Kevin and the other members of the Infrastructure team have their eyes set on continued performance enhancements. While the Gluster 3.3-backed Fedora Hosted service has handled its community collaboration load quite well, Kevin pointed out that “we could always want better performance.”