Recent Adventures in oVirt and Gluster

At the end of last week, I spied an exciting tweet about oVirt:

libgfapi-ready

Not long after I started using oVirt and Gluster together, the projects started talking about a way to improve Gluster performance by enabling virtualization hosts to access Gluster volumes directly, using Gluster’s libgfapi, rather than through a FUSE-mounted location on the virtualization host. There was a little bit of fit and finish work to be done, and then we’d all be basking in the glow of ~30% better Gluster storage performance.

That was about four years ago. There ended up being kind of a lot of different little things that needed fixing to make this feature work in oVirt. You can follow many of the twists and turns in bugzilla.

All along, I was eagerly awaiting the feature both as a cool new oVirt+Gluster development and as a welcome option for speeding up my own lab. Disk has always been the weakest part of my hardware setup. My servers each have a single pair of 1TB drives in mirrored RAID, shared between Gluster and the OS, and my VM’s virtual drives had been stored in triplicate in replica 3 Gluster volumes. More recently, with the advent of Gluster arbiter bricks, I’ve been able to get the split-brain protection of replica 3 volumes with only two copies of the data, and that sped things up a bit, but did nothing to dampen my appetite for libgfapi.

Since I need my oVirt setup to get things done, I usually don’t test RC versions of new oVirt components there, but I couldn’t wait any longer and took the plunge. I installed the RC2 updates on each of my virt hosts, and on my engine, I installed a slightly newer version of the code, from the experimental repo, which contained a few last bits that hadn’t made it into RC2. Then, on my engine, I ran:

# engine-config -s LibgfApiSupported=true
# systemctl restart ovirt-engine

Any VMs that were already running before the upgrade continued running without libgfapi, and if I migrated them to another host, they’d turn up on that host still using the old access method. When I restarted my VMs, they returned using libgfapi. I could tell which was which by grepping through the qemu processes on a particular VM host.

# ps ax | grep qemu | grep 'file=gluster\|file=/rhev'

-drive file=/rhev/data-center/00000001-0001-0001-0001-00000000025e/616be2b6-71db-4f54-befd-be6a444775d7/images/3f7877e7-e532-44a0-8735-c7b2ca06de3b/48ee34fc-ae12-494c-892f-4229fe1fef9d

-drive file=gluster://10.0.20.1/data/616be2b6-71db-4f54-befd-be6a444775d7/images/6597f45a-51cd-4da5-b078-a2652baf78e4/cc3a575e-27b8-4176-b922-9466273153be

The qemu command lines are super long, so I cut them down to include just the portion specifying the virtual drives. In the first example, the drive is being accessed through a FUSE mount; in the second, there’s a direct connection to the Gluster volume.

So, how was performance?

I tried a few different tests, starting with running dd on one of my VMs:

# dd bs=1M count=1024 if=/dev/zero of=test conv=fdatasync && rm test

I ran this a bunch of times on a VM in both storage configurations and the libgfapi configuration came out about 44% faster on average.
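
For the curious, the repetition was nothing fancy, just a loop along these lines (five runs is an arbitrary choice here, and the tail grabs dd’s throughput summary line from each run):

for i in $(seq 1 5); do
    dd bs=1M count=1024 if=/dev/zero of=test conv=fdatasync 2>&1 | tail -n 1
    rm -f test
done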

For a more “real world” test, I figured I’d measure the time it takes to complete a common task of mine: configuring a test Kubernetes cluster from three Fedora Atomic Host VMs using the upstream ansible scripts. I recorded and averaged the time it took to complete this task across multiple runs on VMs running in each storage configuration, and found that libgfapi was 11% faster.

zram madness

Not too bad, but like I said earlier, my oVirt setup can use all the storage speed help it can get. My servers don’t have a lot of disk but they do have quite a bit of RAM, 256GB apiece, so I’ve long wondered how I could use that RAM to wring more speed out of my setup. For a few months I’ve been experimenting with using Gluster volumes backed by RAM-disks, using zram devices.

This actually works pretty well, and I was seeing speeds similar to what I get running on the SSD in my laptop. Of course, RAM-disks mean losing everything on the disk in the event of a reboot (expected or otherwise), but using replica 3 Gluster volumes, I could reboot one host at a time without losing everything else. Upon bringing back the rebooted host, I’d run a little script to recreate the zram device and the mount points, and then follow the Gluster instructions for replacing a failed brick.

# cat fast.sh
# create a 50GB zram (RAM-backed) block device, format it, and mount it as a Gluster brick
ZRAMSIZE=$((1024 * 1024 * 1024 * 50))
modprobe zram
echo ${ZRAMSIZE} > /sys/class/block/zram0/disksize
mkfs -t xfs /dev/zram0
mkdir -p /gluster-bricks/fast
mount /dev/zram0 /gluster-bricks/fast
mkdir /gluster-bricks/fast/brick
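
The brick-replacement step, roughly, comes down to pointing Gluster at a fresh directory on the rebooted host and letting self-heal repopulate it. Here’s a sketch, with placeholder volume and host names; the canonical procedure is in the Gluster docs for replacing a failed brick:

# mkdir /gluster-bricks/fast/brick-new
# gluster volume replace-brick fast rebooted-host:/gluster-bricks/fast/brick rebooted-host:/gluster-bricks/fast/brick-new commit force
# gluster volume heal fast full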

However, if all of my machines went down at once, due to a power failure in the lab or something like that, replication wouldn’t help me. I wondered if I could still get a significant boost out of a mixture of zram and regular disk backed volumes, with each of my servers hosting one zram-backed brick, one regular disk-backed brick, and one regular disk-backed arbiter brick, all combined into one distributed-replicated Gluster volume.
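
As a sketch of what I mean, with made-up host names and only three hosts shown (my actual volume ended up with four replica groups), such a volume could be assembled like this, with every third brick serving as the arbiter for its replica set:

# gluster volume create fast replica 3 arbiter 1 host1:/gluster-bricks/fast/brick host2:/gluster-bricks/data/brick host3:/gluster-bricks/arbiter/brick host2:/gluster-bricks/fast/brick host3:/gluster-bricks/data/brick host1:/gluster-bricks/arbiter/brick host3:/gluster-bricks/fast/brick host1:/gluster-bricks/data/brick host2:/gluster-bricks/arbiter/brick
# gluster volume start fast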

brick-house

I ran my same ansible-kubernetes setup tests with the VM drives hosted from my “fast” Gluster domain, and the tests ran 32% faster than with my regular disk-backed (and now libgfapi-enabled) “data” storage domain. Pretty nice, and, in this sort of setup, a power loss would mean that each of four replica groups would be missing one brick, with a remaining data brick and an arbiter brick still around to maintain the data and allow me to repair things.

I want to experiment a bit further with automated tiering in Gluster, where I’d connect a RAM-disk boosted volume like this to the volume for my main data domain, and frequently-accessed files would automatically migrate to the faster storage. As it is now, my fast domain has to be relatively small, so I have to budget my use of it.

Gluster Rocks the Vote

Rock the Vote needed a way to manage the fast growth of the data handled by its Web-based voter registration application. The organization turned to GlusterFS replicated volumes to allow for filesystem size upgrades on its virtualized hosting infrastructure without incurring downtime.

Over its twenty-one year history, Rock the Vote has registered more than five million young people to vote, and has become a trusted source of information about registering to vote and casting a ballot.

rtv

Since 2009, Rock the Vote has run a Web-based voter registration application, powered by an open source rails application stack called Rocky.

I talked to Lance Albertson, Associate Director of Operations at the Oregon State University Open Source Lab and primary technical systems operation lead for the service, about how they’re using Gluster to provide for the service’s growing storage requirements.

“During a non-election season,” Albertson explained, “the filesystem use and growth is minimal, however during a presidential election season, the growth of the filesystem can be exponential. So with Gluster we’re trying to solve the sudden growth problem we have.”

Rock the Vote’s voter registration application is served from a virtual machine instance running Gentoo Hardened, with a pair of physical servers running CentOS 6 with Gluster 3.3.0 to host voter registration form data. The storage nodes host a replicated GlusterFS volume, which the registration front end accesses via Gluster’s NFS mount support.

The Gluster-backed iteration of the voter registration application started out in September with a 100GB volume, which the team stepped up incrementally to 350GB as usage grew in the period leading up to the election.

Before implementing Gluster for their storage needs, Rock the Vote’s application hosting team was using local storage within their virtual machines to store the voter form data, which made it difficult to expand storage without bringing their VMs down.

The hosting team shifted storage to an HA NFS cluster, but found the implementation fragile and prone to breakage when adding/removing NFS volumes and shares.

“Gluster allowed us more flexibility in how we manage that storage without downtime,” Albertson continued, “Gluster made it easy to add a volume and grow it as we needed.”

Looking ahead to future election seasons, and forthcoming GlusterFS releases, Albertson told me that the Gluster attribute he’s most interested in is limited-downtime upgrades between version 3.3.0 and future Gluster releases. Albertson is also looking forward to the addition of multi-master support in Gluster’s geo-replication capability, an enhancement planned for the upcoming 3.4 version.

Gluster User Story: Fedora Hosted

The Fedora Project’s infrastructure team needed a way to ensure the reliability of its Fedora Hosted service, while making the most of their available hardware resources. The team tapped GlusterFS replicated volumes to convert what had been a two-node, active/passive, eventually consistent hosting configuration into a well-synchronized setup in which both nodes could take on user load.

Hosting Fedora Hosted

The Fedora Infrastructure team develops, deploys, and maintains various services for the Fedora Project. One of these services, Fedora Hosted, provides open source projects with a place to host their code and collaborate online.

I talked to the team’s Infrastructure Lead, Kevin Fenzi, about how they’re using Gluster to ensure availability of these services while making the most of their server resources.

Fedora Hosted is served from a pair of virtual instances hosted at serverbeach.com, which donates these resources to the project. The instances run Red Hat Enterprise Linux 6 and maintain a replicated GlusterFS 3.3.0 volume to keep the 50GB of project data stored at Fedora Hosted in sync. The nodes use Gluster’s NFS mount support, which the team found to deliver better performance with the many small files that Fedora Hosted serves.

“Both servers are in DNS, so it’s round robin which one you hit for any given connection. Since the data on the backend is replicated, both of them are up to date at any given time,” Kevin explained. “This way, not only can we handle more load cpu-wise, but if we wish to reboot one node for an update or the like, we simply adjust DNS and there is no outage seen by our projects.”

The Road to Gluster

An earlier incarnation of Fedora Hosted was also run on a pair of virtual instances, one actively serving users and the other a standby kept in sync with an hourly rsync job. If the primary node failed, the standby instance could be brought up in short order, but the hourly sync window meant that the service could suffer an hour or two of data loss.

The Fedora Infrastructure team managed to close this sync window by shifting to a new configuration based on the DRBD project. While this solution dealt with the problem of data loss following an outage, the configuration left one node mostly idle.

The team’s first foray into a GlusterFS-backed configuration for Fedora Hosted turned up a couple of issues with the then-current GlusterFS version 3.2, which the Gluster project addressed in their 3.3 release.

“The Gluster folks were very responsive to our issues and were working on the patch very soon after we requested it,” Kevin explained. “Additionally, 3.3 performance seemed to be much better than 3.2 for our use cases.”

Looking ahead, Kevin and the other members of the Infrastructure team have their eyes set on continued performance enhancements. While the Gluster 3.3-backed Fedora Hosted service has handled its community collaboration load quite well, Kevin pointed out that “we could always want better performance.”

A Buzzword-Packed Return to Gluster UFO

A little while back, I tested out the Unified File and Object feature in Gluster 3.3, which taps OpenStack’s Swift component to handle the object half of the file and object combo. It took me kind of a long time to get it all running, so I was pleased to find this blog post promising a Quick and Dirty guide to UFO setup, and made a mental note to return to UFO.

When my colleague John Mark asked me about this iOS Swift client from Rackspace, I figured that now would be a good time to revisit UFO, and do it on one of the Google Compute Engine instances available to me while I’m in my free trial period with the newest member of Google’s cloud computing family. (OpenStack, iOS & Cloud: Feel the Search Engine Optimization!)

That Quick and Dirty Guide

The UFO guide, written by Kaleb Keithley, worked just as quickly as advertised: start with Fedora 16, 17 or RHEL 6 (or one of the RHEL 6 rebuilds) and end with a simple Gluster install that abides by the OpenStack Swift API. I installed on CentOS 6 because this, along with Ubuntu, is what’s supported right now in Google Compute Engine.

Kaleb notes at the bottom of his post that you might experience authentication issues with RHEL 6–I didn’t have this problem, but I did have to add in the extra step of starting the memcache service manually (service memcached start) before starting up the swift service (swift-init main start).

The guide directs you to configure a repository that contains the up-to-date Gluster packages needed. I’m familiar with this repository, as it’s the same one I use on my F17 and CentOS 6 oVirt test systems. I also had to configure the EPEL repository on my CentOS 6 instance, as UFO requires some packages not available in the regular CentOS repositories.

I diverged from the guide in one other place. Where the guide asks you to add this line to the [filter:tempauth] section of /etc/swift/proxy-server.conf:

user_$myvolname_$username=$password .admin

I found that I had to tack on an extra URL to that line to make the iOS client work:

user_$myvolname_$username=$password .admin https://$myhostname:443/v1/AUTH_$myvolname

Without the extra URL, my UFO setup was pointing the iOS client to a 127.0.0.1 address, which, not surprisingly, the iOS device wasn’t able to access.

The iOS Client (and the Android non-client)

Rackspace’s Cloud Mobile application enables users of the company’s Cloud Servers and Cloud Files offerings to access these services from iOS and Android devices. I tried out both platforms, the former on my iPod Touch (recently upgraded to iOS 6) and the latter on my Nexus S 4G smartphone (which runs a nightly build of Cyanogenmod 10).

My subhead above says Android non-client, because, as reviewers in the Google Play store and the developer in this github issue comment both indicate (but the app description and [non-existent] docs do not), the current version of the Android client doesn’t work with the recent, Swift-based incarnation of Rackspace’s Cloud Files service.

What’s more, the Android version of the client does not allow any modification of one’s account settings. When I was trial-and-erroring my way toward figuring out the right account syntax, this got pretty annoying. Also annoying was the absence of any detailed error messages.

Things were better (albeit still undocumented) with the iOS version of the client, which allowed for account details editing, for ignoring invalid ssl certs, and for viewing the error message returned by any failed API operations.

In the parlance of the above Gluster UFO setup guide, here are the correct values for the account creation screen (the one you reach in the iOS client after selecting “Other” on the Provider screen):

  • Username:    $myvolname:$username
  • API Key:    $password
  • Name:   $whateveryouwant
  • API Url:    https://$myhostname:443/auth/v1.0
  • Validate SSL Certificate:   OFF
After getting those account details in place, you’ll be able to view the Swift/Gluster containers accessible to your account, create new containers, and upload/download files to and from those containers. There were no options for managing permissions through the iOS client, so when I wanted to make a container world-readable, I did it from a terminal, using the API.
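
For reference, and sticking with the placeholder names from the setup guide above, making a container world-readable boils down to two requests: one to fetch a token (the response headers include X-Auth-Token and X-Storage-Url), and one to set the container’s read ACL so anonymous reads are allowed. The -k flag skips SSL certificate validation, the terminal equivalent of the OFF setting in the iOS client:

curl -k -i -H "X-Storage-User: $myvolname:$username" -H "X-Storage-Pass: $password" https://$myhostname:443/auth/v1.0
curl -k -X POST -H "X-Auth-Token: TOKEN_FROM_ABOVE" -H "X-Container-Read: .r:*" https://$myhostname:443/v1/AUTH_$myvolname/$containername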

Google Compute Engine

As I mentioned above, I tested this on Google Compute Engine, the Infrastructure-as-a-Service offering that the search giant announced at its last Google I/O conference. I excitedly signed up for the GCE limited preview as soon as it was announced, but for various reasons, I haven’t done as much testing with it as I’d planned.

Here are my bullet-point impressions of GCE:

  • CentOS or Ubuntu — On GCE, for now, you run the instance types they give you, and that’s either CentOS 6 or Ubuntu 10.04. You can create your own images, by modifying one of the stock images and going through a little process to export and save it. This comes in handy, because, for now, on GCE, there are…
  • No persistent instances — It’s like the earlier days of Amazon EC2. Your VMs lose all their changes when they terminate. There is, however…
  • Persistent storage available — You can’t store VMs in persistent images, but you can hook up your VMs to virtual disks that persist, for storing data.
  • No SELinux — The CentOS images come with SELinux disabled. This turned out to be annoying for me, as OpenShift Origin and oVirt both expect to find SELinux enabled. This cut short a pair of my tests. I was able to modify the oVirt Engine startup script not to complain about SELinux, but was then foiled due to…
  • Monolithic kernel (no module loading) — oVirt engine, which I’d planned to test with a Gluster-only cluster (real virt wouldn’t have worked atop the already-virtualized GCE), wanted to load modules, and there’s no module-loading allowed (for now) on GCE. All told, though…
  • GCE is a lot like EC2 — With a bit of familiarity with the ways of EC2, you should feel right at home on GCE. I opened firewall ports for access to port 443 and port 22 using security groups functionality that’s much like what you have on EC2. You launch instances in a similar way, with Web or command line options, and so on.


oVirt 3.1, Glusterized

One of the cooler new features in oVirt 3.1 is the platform’s support for creating and managing Gluster volumes. oVirt’s web admin console now includes a graphical tool for configuring these volumes, and vdsm, the service responsible for controlling oVirt’s virtualization nodes, has a new sibling, vdsm-gluster, for handling the back end work.

Gluster and oVirt make a good team — the scale out, open source storage project provides a nice way of weaving the local storage on individual compute nodes into shared storage resources.

To demonstrate the basics of using oVirt’s new Gluster functionality, I’m going to take the all-in-one engine/node oVirt rig that I stepped through recently and convert it from an all-in-one node with local storage to a multi-node-ready configuration with shared storage provided by Gluster volumes that tap the local storage available on each of the nodes. (Thanks to Robert Middleswarth, whose blog posts on oVirt and Gluster I relied on while learning about the combo.)

The all-in-one installer leaves you with a single machine that hosts both the oVirt management server, aka ovirt-engine, and a virtualization node. For storage, the all-in-one setup uses a local directory for the data domain, and an NFS share on the single machine to host an iso domain, where OS install images are stored.

We’ll start the all-in-one to multi-node conversion by putting our local virtualization host, local_host, into maintenance mode by clicking the Hosts tab in the web admin console, clicking the local_host entry, and choosing “Maintenance” from the Hosts navigation bar.

Once local_host is in maintenance mode, we click edit, change to the Default data center and host cluster from the drop down menus in the dialog box, and then hit OK to save the change.

This is assuming that you stuck with NFS as the default storage type while running through the engine-setup script. If not, head over to the Data Centers tab and edit the Default data center to set “NFS” as its type. Next, head to the Clusters tab, edit your Default cluster, fill the check box next to “Enable Gluster Service,” and hit OK to save your changes. Then, go back to the Hosts tab, highlight your host, and click Activate to bring it back from maintenance mode.

Now head to a terminal window on your engine machine. Fedora 17, the OS I’m using for this walkthrough, includes version 3.2 of Gluster. The oVirt/Gluster integration requires Gluster 3.3, so we need to configure a separate repository to get the newer packages:

# cd /etc/yum.repos.d/
# wget http://repos.fedorapeople.org/repos/kkeithle/glusterfs/fedora-glusterfs.repo

Next, install the vdsm-gluster package, restart the vdsm service, and start up the gluster service:

# yum install vdsm-gluster
# service vdsmd restart
# service glusterd start

The all-in-one installer configures an NFS share to host oVirt’s iso domain. We’re going to be exposing our Gluster volume via NFS, and since the kernel NFS server and Gluster’s NFS server don’t play nicely together, we have to disable the former.

# systemctl stop nfs-server.service && systemctl disable nfs-server.service

Through much trial and error, I found that it was also necessary to restart the wdmd service:

# systemctl restart wdmd.service

In the move from v3.0 to v3.1, oVirt dropped its NFSv3-only limitation, but that requirement remains for Gluster, so we have to edit /etc/nfsmount.conf and ensure that Defaultvers=3, Nfsvers=3, and Defaultproto=tcp.
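
In other words, after the edit, /etc/nfsmount.conf should carry these three settings, wherever your copy of the file keeps its options:

Defaultvers=3
Nfsvers=3
Defaultproto=tcp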

Next, edit /etc/sysconfig/iptables to add the firewall rules that Gluster requires. You can paste the rules in just before the reject lines in your config.

# glusterfs
-A INPUT -p tcp -m multiport --dport 24007:24047 -j ACCEPT
-A INPUT -p tcp --dport 111 -j ACCEPT
-A INPUT -p udp --dport 111 -j ACCEPT
-A INPUT -p tcp -m multiport --dport 38465:38467 -j ACCEPT

Then restart iptables:

# service iptables restart

Next, decide where you want to store your gluster volumes — I store mine under /data — and create this directory if need be:

# mkdir /data

Now, head back to the oVirt web admin console, visit the Volumes tab, and click Create Volume. Give your new volume a name, and choose a volume type from the drop down menu. For our first volume, let’s choose Distribute, and then click the Add Bricks button. Add a single brick to the new volume by typing the path you desire into the Brick Directory field, clicking Add, and then OK to save the changes.

Make sure that the box next to NFS is checked under Access Protocols, and then click OK. You should see your new volume listed — highlight it and click Start to start it up. Follow the same steps to create a second volume, which we’ll use for a new ISO domain.

For now, the Gluster volume manager neglects to set brick directory permissions correctly, so after adding bricks on a machine, you have to return to the terminal and run chown -R 36.36 /data (assuming /data is where you are storing your volume bricks) to enable oVirt to write to the volumes.

Once you’ve set your permissions, return to the Storage tab of the web admin console to add data and iso domains backed by the volumes we’ve created. Click New Domain, choose Default data center from the data center drop down, and Data / NFS from the storage type drop down. Fill the export path field with your engine’s host name and the volume name from the Gluster volume you created for the data domain. For instance: “demo1.localdomain:/data”

Wait for the data domain to become active, and repeat the above process for the iso domain. For more information on setting up storage domains in oVirt 3.1, see the quick start guide.

Once the iso domain comes up, BAM, you’re Glusterized. Now, compared to the default all-in-one install, things aren’t too different yet — you have one machine with everything packed into it. The difference is that your oVirt rig is ready to take on new nodes, which will be able to access the NFS-exposed data and iso domains, as well as contribute some of their own local storage into the pool.

To check this out, you’ll need a second test machine, with Fedora 17 installed (though you can recreate all of this on CentOS or another Enterprise Linux starting with the packages here). Take your F17 host (I start with a minimal install), install the oVirt release package, download the same fedora-glusterfs.repo we used above, and make sure your new host is accessible on the network from your engine machine, and vice versa. Also, the bug preventing F17 machines running a 3.5 or higher kernel from attaching to NFS domains isn’t fixed yet, so make sure you’re running a 3.3 or 3.4 version of the kernel.

Head over to the Hosts tab on your web admin console, click New, supply the requested information, and click OK. Your engine will reach out to your new F17 machine, and whip it into a new virtualization host. (For more info on adding hosts, again, see the quick start guide.)

Your new host will require most of the same Glusterizing setup steps that you applied to your engine server: make sure that vdsm-gluster is installed, edit /etc/nfsmount.conf, add the gluster-specific iptables rules and restart iptables, create and chown 36.36 your data directory.

The new host should see your Gluster-backed storage domains, and you should be able to run VMs on both hosts and migrate them back and forth. To take the next step and press local storage on your new node into service, the steps are pretty similar to those we used to create our first Gluster volumes.

First, though, we have to run the command “gluster peer probe NEW_HOST_HOSTNAME” from the engine server to get the engine and its new buddy hooked up Glusterwise (this is another of the wrinkles I hope to see ironed out soon, taken care of automatically in the background).

We can create a new Gluster volume, data1, of the type Replicate. This volume type requires at least two bricks, and we’ll create one in the /data directory of our engine, and one in the /data directory of our node. This works just the same as with the first Gluster volume we set up, just make sure that when adding bricks, you select the correct server in the drop down menu:

Just as before, we have to return to the command line to chown -R 36.36 /data on both of our machines to set the permissions correctly, and start the volumes we’ve created.
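
For the record, the CLI equivalent of the volume the console builds here would look roughly like the following; the host names and brick paths are just examples, so adjust them to match your own machines:

# gluster volume create data1 replica 2 engine-host:/data/data1-brick new-host:/data/data1-brick
# gluster volume start data1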

On my test setup, I created a second data domain, named data1, stored on the replicated Gluster volume, with the storage path set to localhost:/data1, on the rationale that VM images stored on the data1 domain would stay in sync across the pair of hosts, enabling either of my hosts to tap local storage for running a particular VM image. But I’m a newcomer to Gluster, so consult the documentation for more clueful Gluster guidance.

Fedora 17, OpenStack Essex & Gluster 3.3: All Smushed Together

Within the past couple weeks, Fedora and Gluster rolled out new versions, packed with too many features to discuss in a single blog post. However, a couple of the stand-out updates in each release overlap neatly enough to tackle them together–namely, the inclusion of OpenStack Essex in Fedora 17 and support for using Gluster 3.3 as a storage backend for OpenStack.

I’ve tested OpenStack a couple of times in the past, and I’m happy to report that while the project remains a fairly complicated assemblage of components, the community around OpenStack has done a good job documenting the process of setting up a basic test rig. Going head to head with Amazon Web Services, even within the confines of one’s own organization, won’t be a walk in the park, but it’s fairly easy to get OpenStack up and running in a form suitable for further learning and experimentation.

OpenStack on Fedora 17

The getting started with OpenStack on Fedora 17 howto that I followed for my latest test involves quite a bit of command line cut and paste, but it didn’t take long for me to go from a minimal install Fedora 17 virtual machine to a single node OpenStack installation, complete with compute, image hosting, authentication, and dashboard services–everything I needed to launch VMs, register images, and manage everything from the comfort of a web UI.

A couple of notes: I did everything on this minimal-install Fedora machine as root–since this is a soon-to-be blown-away test VM, I didn’t bother to create additional users. You may need to sprinkle in some sudos if you’re running as non-root. Also, I hit at least one issue with SELinux (related to glance) during my tests. I never turn off SELinux by default, but once I hit an error on a test box, I throw it into permissive mode.
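
If you want to do the same, flipping SELinux into permissive mode is a one-liner (and it only lasts until the next reboot):

setenforce 0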

Also, I elected to run the whole show (the OpenStack part of it, at least) within a single virtual machine running on my home oVirt installation, so the performance of my guest instances was very slow, but everything worked well enough for me to take OpenStack for a spin, and get to fiddling with trickier OpenStack topics, such as…

The one OpenStack element that the Fedora howto touches on only briefly is OpenStack Swift, the object storage system intended to replace Amazon’s S3. Here’s what the howto has to say about Swift:

These are the minimal steps required to setup a swift installation with keystone authentication, this wouldn’t be considered a working swift system but at the very least will provide you with a working swift API to test clients against, most notably it doesn’t include replication, multiple zones and load balancing.


(Configure swift with keystone)

What an ideal segue for Gluster 3.3, a storage software project with replication and load balancing as its stock in trade. The Gluster portion of my tests was quite a bit trickier than the OpenStack on Fedora part had been, but I learned a lot about Gluster and OpenStack along the way.

Building Gluster 3.3 Packages

First off, Gluster 3.3 shipped a bit after Fedora 17, and the version of Gluster available in the Fedora software repositories is still at 3.2. What’s more, the 3.3 packages offered by the Gluster project target Fedora 16, not Fedora 17. The Fedora folder on the Gluster download server doesn’t include any source rpms, but I found a spec file for building Fedora rpms in the Gluster source tarball on the download server.

On my Fedora 17 notebook, I fetched the build dependencies for Gluster 3.2 using the command yum-builddep from the yum-utils package:

sudo yum-builddep glusterfs

I grabbed the file glusterfs.spec from the glusterfs-3.3.0.tar.gz tarball, dropped it in ~/rpmbuild/SPECS, and put the tarball into ~/rpmbuild/SOURCES. If you don’t have rpm-build installed on your Fedora machine, you’ll need to do that, as well.

Next, I built my Gluster 3.3 packages for F17:

rpmbuild -bb ~/rpmbuild/SPECS/glusterfs.spec

Then, I copied the packages over to my OpenStack test machine and updated the glusterfs and glusterfs-fuse packages that had been pulled in as dependencies during my OpenStack on F17 install:

scp ~/rpmbuild/RPMS/x86_64/glusterfs-* root@openstackF17:/root
ssh root@openstackF17 yum install -y ./glusterfs-3.3.0-1.fc17.x86_64.rpm glusterfs-fuse-3.3.0-1.fc17.x86_64.rpm

Gluster+OpenStack: The Easy Way

As described on the Connecting with OpenStack Resource Page on the Gluster wiki, there are two ways of using Gluster with OpenStack. The first is super simple, and amounts to locating the images for your running OpenStack instances on Gluster by simply mounting a Gluster volume at the spot where OpenStack expects to place these images. On the resource page, there’s a PDF titled OpenStack VM Storage Guide that steps through the process of creating a four node distributed-replicated volume and mounting it in the right spot. Easy.
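
As a minimal sketch of that first option, assuming a hypothetical Gluster volume named vmstore and the default instances path used by the Fedora nova packages, the mount amounts to something like:

mount -t glusterfs gluster1:/vmstore /var/lib/nova/instances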

I did this with my test OpenStack setup, and it worked as advertised. I kicked off a yum update operation in one of my OpenStack instances, and then ungracefully shut down (pulled the virtual plug on) the Gluster VM node that the instance was calling home. I watched as the yum update process paused for a short time before continuing happily enough on one of the other Gluster nodes I’d configured.

Where things got quite a bit trickier was with the second OpenStack-Gluster integration option, that for Unified Object and File Storage. Gluster’s UFO is based on a slightly modified version of OpenStack Swift, where Gluster brings the storage, and users are able to access files and content either as objects, through Swift’s REST interface, or as regular files, through Gluster’s FUSE or NFS mounts.

Building Gluster UFO Packages

Again, I started by building some packages. The Gluster download site offers UFO (aka gluster-swift) packages for enterprise Linux 6 (RHEL and its relabeled children). There’s a source tarball, but unlike the main glusterfs tarball, the gluster-swift tarball doesn’t include a spec file for building rpms. I located spec files for gluster-swift and gluster-swift-plugin at Gluster’s github site, but these spec files referenced a handful of patches that weren’t in the git repository, so I wasn’t able to build them.

After Googling a while for the missing patches, I found source rpms for gluster-swift and gluster-swift-plugin in a public source repository for Red Hat Storage 2.0. Both of these packages are a hair older than the ones in the Gluster download location: gluster-swift-1.4.8-3 vs. 1.4.8-4 and gluster-swift-plugin-1.0-1 vs. 1.0-2, but I forged ahead with these.

I had to tweak the spec files slightly, changing references from the python2.6 in el6 to the python2.7 that ships with Fedora 17, but I managed to build both of them without much hassle, before copying them over to my OpenStack test machine and installing them:

rpmbuild -bb ~/rpmbuild/SPECS/gluster-swift.spec
rpmbuild -bb ~/rpmbuild/SPECS/gluster-swift-plugin.spec
scp ~/rpmbuild/RPMS/noarch/gluster-swift* root@openstackF17:/root
ssh root@openstackF17 yum install -y ./gluster-swift-*

Gluster-Swift + OpenStack

Over on our openstackF17 machine, the gluster-swift package has placed a bunch of configuration files in /etc/swift. We’re going to leave most of these configurations in place, but we need to make a few modifications, starting with fs.conf:

vi /etc/swift/fs.conf

I’m using the four-VM Gluster cluster described in the OpenStack VM Storage Guide I mentioned above, which is remote from my OpenStack server, so I have to change “mount_ip” to the IP of one of my Gluster servers, and change “remote_cluster” to yes. If my Gluster volume, or part of it, were local, I could have left these values alone.
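
For the record, the two values I ended up touching in fs.conf look like this; the IP is a placeholder for whichever of your Gluster servers you point at:

mount_ip = 10.1.1.11
remote_cluster = yes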

The other thing required to make the remote gluster cluster bit work is enabling passwordless ssh login between my openstackF17 machine and the gluster server I pointed to in fs.conf:

ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub root@gluster1

More config file editing. Next up, proxy-server.conf. In order to get gluster-swift working with OpenStack’s Keystone authentication service, we’re going to grab some of the configuration info from the Fedora 17 OpenStack guide:

vi /etc/swift/proxy-server.conf

Change the “pipeline” line under [pipeline:main], adding “authtoken keystone” to the line, and removing “tempauth”:

pipeline = healthcheck cache authtoken keystone proxy-server

And then add these sections to correspond with our added elements. As to the “are these needed” comment question, that comes from the howto in the Fedora wiki, and I don’t know the answer, so I left it in:

[filter:keystone]
paste.filter_factory = keystone.middleware.swift_auth:filter_factory
operator_roles = admin, swiftoperator
[filter:authtoken]
paste.filter_factory = keystone.middleware.auth_token:filter_factory
auth_port = 35357
auth_host = 127.0.0.1
auth_protocol = http
admin_token = ADMINTOKEN
# ??? Are these needed?
service_port = 5000
service_host = 127.0.0.1
service_protocol = http
auth_token = ADMINTOKEN

If you followed along with the Fedora 17 OpenStack howto, you’ll have a file (keystonerc) in your home directory that sets your OpenStack environment variables. Let’s make sure our variables are set correctly:

. ~/keystonerc

Next, we run these commands to replace some placeholder values in our proxy-server.conf file:

openstack-config --set /etc/swift/proxy-server.conf filter:authtoken admin_token $ADMIN_TOKEN
openstack-config --set /etc/swift/proxy-server.conf filter:authtoken auth_token $ADMIN_TOKEN

Now we add the Swift service and endpoint to Keystone:

SERVICEID=$(keystone service-create --name=swift --type=object-store --description="Swift Service" | grep "id " | cut -d "|" -f 3)
echo $SERVICEID # just making sure we got a SERVICEID
keystone endpoint-create --service_id $SERVICEID --publicurl "http://127.0.0.1:8080/v1/AUTH_$(tenant_id)s" --adminurl "http://127.0.0.1:8080/v1/AUTH_$(tenant_id)s" --internalurl "http://127.0.0.1:8080/v1/AUTH_$(tenant_id)s"

Gluster-swift will be looking for Gluster volumes that correspond to Swift account names. We need to figure out what names we need, and create Gluster volumes with those names. We ask Keystone about our account names:

keystone tenant-list

In my setup, this turns up four accounts:

+----------------------------------+--------------------+---------+
|                id                |        name        | enabled |
+----------------------------------+--------------------+---------+
| 18571133bf9b4236be0ad45f2ccff135 | invisible_to_admin | True    |
| 1918b675fa1f4b7f87c2bb3688f6f2f7 | admin              | True    |
| 42c41f15e6a24fa5b105e89b60af18fb | demo               | True    |
| decd4d68f50345eeb2eae090e2d32dcb | service            | True    |
+----------------------------------+--------------------+---------+

So far, I’ve needed volumes for the admin and demo accounts. You’ll need to name your Gluster volumes after the value in the “id” column. Following the four-node example in the OpenStack VM Storage Guide, the command (which you must run from one of your Gluster nodes) will look like this, substituting your own Gluster node IPs, and your volume name values from keystone tenant-list:

gluster volume create 42c41f15e6a24fa5b105e89b60af18fb replica 2 10.1.1.11:/vmstore 10.1.1.12:/vmstore 10.1.1.13:/vmstore 10.1.1.14:/vmstore

Run the command again so you have volumes that correspond to both the admin and demo tenant ids.

Each Gluster volume needs its own mount point. You don’t have to create your mount points manually on each server. And again, the Gluster volume doesn’t have to live on a remote cluster. Any properly named Gluster volume on a server that gluster-swift knows about (from fs.conf, which we modded earlier) and can access passwordlessly (red spell check underline be damned) ought to work.

All right, almost done. Start or restart memcached, and start gluster-swift:

service memcached restart
swift-init main start

Now, we should be able to test gluster-swift:

swift list

If all is well, gluster-swift should try to mount the admin volume (the keystonerc file is telling swift to use the admin account), and satisfying hard drive activity gurgling sounds should ensue. If you run the command “mount” you should see that you have a Gluster volume mounted at the mount point “/mnt/gluster-object/AUTH_YOURADMINVOLNAME”. Like so:

gluster1:1918b675fa1f4b7f87c2bb3688f6f2f7 on /mnt/gluster-object/AUTH_1918b675fa1f4b7f87c2bb3688f6f2f7 type fuse.glusterfs (rw,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072)

You can test uploading to the volume from the command line:

swift upload container /path/to/file

You ought to be able to ssh in to one of your gluster nodes, navigate to the mount point that corresponds to your admin account volume, and see the file you just uploaded.

For a more GUI-ful experience, we can check out our snazzy gluster-swift store from the OpenStack dashboard (you’ll have installed this if you followed the OpenStack Fedora 17 howto). Make sure your firewall is down or you have port 80 open, and restart your web server for good measure:

service httpd restart

Visit the dashboard at http://YOUROPENSTACKSERVERIP/dashboard, and log in with admin and (assuming you retained the password default from the howto) verybadpass. In the left nav column, click the “Project” tab. The default project is “demo” (which is why we had to create a demo volume). In the left nav column, under “Object Store,” click “Containers,” and create, delete, upload to, download from, etc. at will. In the background, just as with the “swift list” command, gluster-swift should be reacting to the dashboard’s requests by mounting your Gluster volume.

UFO in Action

For Further Study: Glance on Gluster-Swift

By default, OpenStack’s image-hosting service, Glance, stores its images in a local directory, but it’s possible to use Swift as a back-end for that image storage, by changing the backend listed in /etc/glance/glance-api.conf from “file” to “swift” and by correctly hooking up the authentication details there. I’ve yet to get this working, though.
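
For anyone who wants to pick up where I left off, the Essex-era glance-api.conf options involved look roughly like the following. Treat this as a starting point only, since I haven’t gotten the configuration working myself, and the user and key values here are placeholders:

default_store = swift
swift_store_auth_address = http://127.0.0.1:5000/v2.0/
swift_store_user = service:glance
swift_store_key = SERVICEPASSWORD
swift_store_create_container_on_put = True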

In this OpenStack on Ubuntu howto, the author notes that a glance package from a particular PPA is required to make this work, due to some issue in the latest (as of 5/28/12) glance package from the official repos. I took a peek at the patches included in this substitute package, and couldn’t immediately tell what, if anything, might be missing from Fedora’s glance package.

If you’re still with me, and you’re interested in setting up all or part of this yourself, don’t hesitate to ask me questions–I puzzled over this for a week or so, and if I can save you some time, that’ll make my toiling more worthwhile to me. Fire away in the comments below, or hit me up on IRC. I’m jbrooks on freenode IRC, and #gluster is one of the channels where you can find me.