Tiny Deathstars of Foulness

Planning an Xsan is perhaps the most complicated part of any deployment. Start with one of two objectives: speed or size (or both). How big does the SAN need to be, and what speed (the aggregate speed of all clients) does it need to be able to sustain? That becomes the primary design consideration. Beyond that, you’ll also want to plan how and when it will get backed up, the makeup of the clients (Mac, PC, Linux), how permissions will be handled for new files written to the SAN, etc.


Xsan needs an out-of-band metadata network. This network is used to transfer metadata, i.e. information about the files being written to the SAN. The metadata network on an Xsan should be low latency, and nothing should interfere with the traffic on it. That doesn’t mean buying a cheap switch, though; make sure it’s a good one. I prefer to use a managed switch and disable all of its management features.

Additionally, each volume you create needs a dedicated Metadata Storage Pool with at least one LUN in the pool. This LUN shouldn’t be used for anything else. When looking to carve up storage into pools, consider that a mirror of 2 of the drives on your Fibre Channel array should be used for the metadata LUN. This means that a 16 bay or 24 bay chassis with, let’s say, 3TB drives in each bay is now a 14 or 22 bay chassis, and you have 6TB less raw storage. This isn’t a bad thing, as the metadata LUN should be dedicated to that task and should be really fast. The rebuild should be fast as well in the event of a drive failure, hence the mirrored-drive approach for this LUN.
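To make the bay math above concrete, here is a minimal sketch in Python. The function names are mine, purely for illustration, and the figures are raw capacity only: they ignore whatever RAID parity or hot spares you carve out of the remaining bays.

```python
def usable_data_bays(total_bays: int, metadata_mirror_drives: int = 2) -> int:
    """Bays left for data LUNs once the metadata mirror is set aside."""
    if metadata_mirror_drives >= total_bays:
        raise ValueError("chassis too small to reserve a metadata mirror")
    return total_bays - metadata_mirror_drives


def raw_capacity_tb(bays: int, drive_tb: float) -> float:
    """Raw (pre-RAID) capacity of a number of bays, in TB."""
    return bays * drive_tb


if __name__ == "__main__":
    drive_tb = 3.0  # the 3TB drives used as the example above
    for total in (16, 24):
        data = usable_data_bays(total)
        lost = raw_capacity_tb(total, drive_tb) - raw_capacity_tb(data, drive_tb)
        print(f"{total} bays -> {data} data bays, {lost:.0f}TB less raw storage")
```

Swap in your own chassis size and drive capacity to see what is left for data before you start planning the data LUNs.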

The Components

Fibre Channel is a technology for transmitting data between computer devices, similar to SCSI but with networking components based on fiber optics. Fibre Channel is especially suited for attaching computer servers to shared storage devices and for interconnecting storage controllers and drives. Apple uses Fibre Channel for Xsan, its storage virtualization platform. All of the objects that make up a Fibre Channel network are referred to as the fabric. These typically include HBAs (the card that goes in a machine), cabling, transceivers, a Fibre Channel switch and the Fibre Channel controllers on the storage.

A RAID can be split into multiple logical units, each referred to as a LUN. Each side, or channel, of the RAID is, by default, a single LUN. While the LUNs are formatting (which generally takes a while) you will start to see them in Disk Utility. Do not assign a file system to them yet if you are going to use them with Xsan. Instead, use the Xsan Admin interface or the cvlabel command to label each of your LUNs, which marks them as usable by Xsan.

In Xsan you can take multiple LUNs (when presented over Fibre Channel) and stream data to them in a round-robin fashion. When doing so, you group them together in what Xsan calls Storage Pools. Each Storage Pool has a maximum throughput of about 4 LUNs’ worth of storage, although pools can have affinities that map more storage and therefore more throughput (thus no hard maximum). You can then lump multiple Storage Pools into a given Volume to obtain substantial volume sizes as well as increase your aggregate bandwidth between hosts.
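The round-robin behavior described above can be pictured with a small Python sketch. This is only an illustration of the idea, not how Xsan is implemented: the LUN names are made up, and each "chunk" stands for one stripe breadth of data.

```python
from typing import Dict, List


def round_robin_layout(num_chunks: int, luns: List[str]) -> Dict[str, List[int]]:
    """Assign consecutive chunks to the LUNs of one storage pool in turn."""
    if not luns:
        raise ValueError("a storage pool needs at least one LUN")
    layout: Dict[str, List[int]] = {lun: [] for lun in luns}
    for chunk in range(num_chunks):
        # Chunk 0 goes to the first LUN, chunk 1 to the second, and so on,
        # wrapping back to the first LUN after the last one.
        layout[luns[chunk % len(luns)]].append(chunk)
    return layout


if __name__ == "__main__":
    pool = ["LUN0", "LUN1", "LUN2", "LUN3"]  # hypothetical 4-LUN pool
    for lun, chunks in round_robin_layout(8, pool).items():
        print(lun, chunks)
```

Because consecutive chunks land on different LUNs, a large sequential write keeps every LUN in the pool busy at once, which is where the extra throughput comes from.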

The default stripe breadth on a metadata storage pool is 256 blocks. Quantum recommends using a 16 or 64 block stripe breadth for metadata storage pools. If you have a relatively small volume with a small number of files, use 16; if you have a larger environment with big files, use 64. As with many things re: Xsan, per-environment tuning is where you will get the biggest bang for your buck, but it is worth noting that whichever way you go, this is a setting that should be changed on each deployment in order to keep with Quantum best practices.
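Since stripe breadth is expressed in blocks, it helps to see what those numbers mean in bytes. The sketch below assumes a 4 KiB file system block size purely for illustration; check the block size of your own volume before drawing conclusions.

```python
ASSUMED_BLOCK_SIZE = 4 * 1024  # bytes; an assumption, not a value from this post


def stripe_bytes(stripe_breadth_blocks: int, block_size_bytes: int) -> int:
    """Data written to one LUN before moving on to the next LUN in the pool."""
    if stripe_breadth_blocks <= 0 or block_size_bytes <= 0:
        raise ValueError("stripe breadth and block size must be positive")
    return stripe_breadth_blocks * block_size_bytes


if __name__ == "__main__":
    for breadth in (16, 64, 256):
        kib = stripe_bytes(breadth, ASSUMED_BLOCK_SIZE) // 1024
        print(f"{breadth:>3} blocks x 4 KiB = {kib} KiB per LUN")
```

Under that assumption, moving from the default of 256 blocks to 16 or 64 shrinks each per-LUN write considerably, which is a better fit for the small writes that metadata tends to generate.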

In Xsan, the PIO HiPriWr statistic shows you how latent the connection to your LUNs is. If the latency to any of your LUNs is too high, it can cause instability and, worse, potential volume integrity issues. If you run into issues with this kind of latency then you should fix the underlying cause. But if you can’t, you can deal with it programmatically using the Buffer Cache Size. Increasing the buffer allows for more caching, which in turn means that more latent LUNs have less effect on the overall performance, health and viability of the SAN. Additionally, the iNode Cache should be increased for the same purpose (although more specifically to allow iNodes to be written if you have latency on your Metadata LUN(s)).
These settings are defined in the volume setup wizard but can be updated in the VOLUMENAME.cfg file of your SAN volume, in /Library/Preferences/FileSystems/Xsan/config.
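
If you want to check what a volume is currently set to before editing anything, a small read-only script can help. The sketch below is built on assumptions: I am assuming the two settings appear in the .cfg file as `BufferCacheSize` and `InodeCacheSize`, one per line, followed by a number with an optional K, M or G suffix. The sample text is hypothetical, so compare it against your own file first.

```python
import re
from typing import Optional

# Hypothetical excerpt for illustration; not copied from a real volume.
SAMPLE_CFG = """\
# Global settings (made-up sample)
BufferCacheSize 32M
InodeCacheSize 8K
"""

_SUFFIX = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def read_setting(cfg_text: str, key: str) -> Optional[int]:
    """Return the numeric value of `key`, or None if the key is absent."""
    pattern = re.compile(
        rf"^\s*{re.escape(key)}\s+(\d+)([KMG]?)\s*$",
        re.MULTILINE | re.IGNORECASE,
    )
    match = pattern.search(cfg_text)
    if match is None:
        return None
    number, suffix = match.groups()
    return int(number) * _SUFFIX[suffix.upper()]


if __name__ == "__main__":
    for key in ("BufferCacheSize", "InodeCacheSize"):  # assumed key names
        print(key, read_setting(SAMPLE_CFG, key))
```

The script only reads text and never writes to the file, so it is safe to point at a copy of your config. Make any actual changes with the volume stopped and a backup of the .cfg file in hand.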


December 21st, 2006

Posted In: Xsan