Rorke Aurora Galaxy and Xsan

With Apple bundling Xsan into Lion and opening up more storage options than before, it seems like time to start exploring alternatives to Promise VTraks for Xsan storage. Active Storage makes a very nice RAID chassis and should be shipping metadata controller appliances soon. I’ve discussed both here before and they make for very nice kit. But in order to have an ‘ecosystem’ you really need a little biodiversity. And the Xsan environment needs to become more of an ecosystem and less of a vendor lock-in situation. So another option that I’d like to discuss is the Rorke Aurora Galaxy. These little firecrackers have a lot of potential upside:

  • Four 8Gbps Fibre Channel ports (in the form of ATTO Celerity cards)
  • 36 drive bays
  • 3TB drive modules
  • Low power requirements
  • Linux, so you have root access
  • For those familiar w/ Webmin, the NumaRAID plug-in will feel right at home
  • Great tech support
  • More PCI slots, so upgradeable with more cards, etc.

36 drive bays at 3TB per bay means 108TB of raw storage running at 32Gbps per chassis. I recently had the chance to put a pair of these things through their paces. Using a combination of vMeter and the QLogic Enterprise Fabric Suite Manager, we added stream after stream, and when all of the clients were running multicam edits for an aggregate throughput of well over 50Gbps, we still hadn’t found a point where we started to drop frames. But we ran out of clients, streams, media, etc., so we stopped testing there. Watching all the statistics on the RAIDs and clients, though, I do not doubt that we could have saturated a good 60Gbps.

When the RAIDs show up, they have 3 LUNs baked into them. Given the size of the RAID and how Xsan likes to have LUNs added, it seemed prudent to convert those 3 LUNs into 4. You manage the RAID using Webmin, which runs on port 10000 by default. The default IP address is 192.168.1.129, so use the address:

http://192.168.1.129:10000

When you see a login screen, the default username and password are admin/password. Click on NumaRAID in the upper left corner of the screen to see details of the chassis, with the RAIDs first and the LUNs second. Click on a RAID to delete it and its associated LUNs. When you delete LUNs, you need to restart the RAID controller for clients with Apple LSI cards to see the change. This can be done by stopping the nr_target service:

service nr_target stop

Then start it back up:

service nr_target start
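
If the init script supports a status action (I haven’t verified that it does), you can also confirm the service came back up:

service nr_target status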

You then have a choice of how to create the RAIDs. You carve a RAID up to create LUNs, so you can create 1 or 2 RAID 6 sets and carve those up into 4 LUNs. With 4 fibre channel ports, we then associated 1 LUN per port. However, on testing failover, we realized that a 1:1 mapping like that gives you no fibre channel failover. While it’s unlikely that a fibre channel port will fail, there can be plenty of points of failure on the cables, which may run through fibre channel patch panels and other potential trouble spots en route to the switch. Therefore, we associated two ports with each LUN. This represented about a 6% boost in performance over not masking any LUNs to ports, although about 3% less than the 1:1 mapping scheme. If you do 4 LUNs, you can safely use about 4GB of RAM per LUN, although I couldn’t detect any performance difference versus the factory default per-LUN cache of 1GB. You will also want to set the ports as targets in your switch, which may otherwise RSCN-suppress them, as most switches tend to assume an ATTO card is an initiator (and it often is).

When you save things in the Webmin GUI, you are executing writes into an XML file stored at /usr/libexec/webmin/NumaRAID/nrconfig.xml. Once one of these is in production, I would personally start making a backup of this file before and after any changes.
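
A quick way to do that (the timestamped copy in /root is just my habit; stash it wherever you like):

cp /usr/libexec/webmin/NumaRAID/nrconfig.xml /root/nrconfig.xml.$(date +%Y%m%d%H%M)

As an example, if you cat nrconfig.xml you can grep for certain items, such as lun: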

cat nrconfig.xml | grep -i lun

The number of LUNs that you see here should match up with the output of lsscsi:

lsscsi -s

You can also use lsscsi to list the host adapters and help confirm that your zoning is open:

lsscsi -H

You can also change various settings, such as the raid_cache size, directly in the nrconfig.xml file. When you make a change in the xml file, back it up first, and when you bring the nr_target service back up, watch /var/log/messages, where the Aurora Galaxy saves its logs:

tail -f /var/log/messages
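
Putting the whole sequence together (the exact attribute name for the cache is whatever the GUI wrote into your nrconfig.xml, so check it before editing; the choice of editor is obviously yours):

service nr_target stop
cp /usr/libexec/webmin/NumaRAID/nrconfig.xml /root/nrconfig.xml.$(date +%Y%m%d%H%M)
vi /usr/libexec/webmin/NumaRAID/nrconfig.xml
service nr_target start
tail -f /var/log/messages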

From your clients, look at /dev. You should see one device node per LUN, per port that the LUN is available on in each chassis. If you do not mask the LUNs, then you would do well to create a zone for each port of the RAID, much as you might do with Linux clients of your Xsan. If you use soft zoning and build groups, this actually doesn’t take much time at all. There isn’t any special foo for volume creation. Once the clients can see the LUNs, treat the new volume as a bigger, less power-hungry version of your existing storage. Now, while Linux clients can often get away with non-port-based zoning, the Rorke will likely show up in Xsan Admin twice on at least one (but not more than 2) LUNs per chassis if it’s not zoned just right. This can cause high PIO HiPriWr stats.
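
A quick way to eyeball that device-node count (names and numbering will vary by platform and hardware):

# On a Linux client:
ls /dev/sd*

# On a Mac client:
ls /dev/rdisk*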

Once it’s in production you’ll want to manage it. The IP and user/password can be changed pretty easily. Beyond that, there isn’t an snmp MIB specifically for the Aurora. However, it is just Linux, so a standard snmpwalk against it from your monitoring solution, as with any Linux host, should tell you all you need to know regarding possible hardware failure (see the example after this paragraph). And the fact that it’s Linux brings up an interesting point. This device is made from off-the-shelf hardware (good off-the-shelf hardware too, not some crappy Fry’s motherboard that was on sale ’cause they spilled a 1.75 liter of Crown and Coke on it while partying in the back). It’s a computer with a bunch of drives in it. And many may be concerned about putting a “software RAID” into production. But think about it this way: all RAID controllers are software. The question is whether the software is baked into some EEPROM chip or firmware, or whether it’s loaded on top of a generic kernel.
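
For example, assuming snmpd is installed and running on the chassis, and using the default IP from earlier (the community string here is a placeholder; use whatever you’ve actually configured):

snmpwalk -v 2c -c public 192.168.1.129 system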

There are certainly a few things you should not do with an appliance-oriented Linux distro. Running a torrent off of it, for example. But if you need to be told that, then you should have your RAIDs taken away… While you might be able to get away with it, don’t run Samba on these (if they’re going into an Xsan/StorNext/HyperFS environment they’re gonna have a virtualized file system anyway). Don’t install StorNext (it’s confusing, am I a target or an initiator, I’m so confused I think I’ll just fail repeatedly). Don’t install rootkit detection tools or even a lot of other security tools. If antivirus, Snort, or host-based IDS tools are required, just unplug the network cable. This is an appliance. You can treat it as such and pretend that you only have the Webmin access (I just can’t help but start tinkering around under the hood…).

Finally, I don’t have long-term statistics on how these will hold up. I’ve never heard anyone complain terribly about them, but this model is still pretty new. Given the disk density I’m curious to see how things will go. Rorke will sell you a parts kit in the form of an unlicensed, diskless chassis. Given my experience with other vendors I’d have to recommend getting one, but they didn’t seem too insistent that it was a requirement.