Tag Archives: Storage

Windows Server

Configure Volume Shadow Copy on Windows Server

Working with Shadow Copy requires elevated privileges. I usually access Shadow Copy through vssuirun, which prompts to elevate privileges. Once open, use the Settings pane to select the volume you’d like to schedule shadow copies for, then choose how much space shadow copies can use. Click the Schedule button to configure how frequently snapshots run. I usually try to time these things for when the server isn’t slammed; otherwise the snapshots can compete with production I/O.
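For reference, kicking that off is as simple as running the following from Start > Run or a command prompt (at least on the Server builds I’ve worked with, where vssuirun lives in the path):

vssuirun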

By default, Shadow Copy keeps up to 64 shadow copies per volume, with snapshots running on the schedule you configured. You can restore easily by selecting a volume, although volume-based restores are not supported on system drives. Restores can be done from vssuirun using the Revert Now… button.

The quicker way to do all of this is with the vssadmin command, which also has a lot more options. Run vssadmin with the list verb to see the different types of objects it can enumerate. For example, to see the storage used by Shadow Copy, use the list verb followed by the shadowstorage noun:

vssadmin list shadowstorage

You’ll then see the storage for each volume, along with the used, allocated and maximum space for each. Run vssadmin with the list verb and the shadows noun to see your shadow copy sets:

vssadmin list shadows

Each shadow copy set is displayed along with a generated ID. Creation times, volume information, target location, the name of the server and the type (e.g. client-accessible) are all displayed. You can also use the add verb with these same nouns, along with a variety of switches for each. To add shadow storage for one volume (G:), store it on another volume (H:) and cap it at a maximum size of 64GB, use the following:

vssadmin add shadowstorage /for=g: /on=h: /maxsize=64GB

Once you’ve added shadow storage for a volume, you can then run a manual shadow copy on an enabled volume using the create verb with the shadow noun, then the /for= option specifying the volume:

vssadmin create shadow /for=g:

To revert to a shadow copy, use the revert verb along with shadow (yes, it’s singular as there’s only one) and then the /shadow option followed by the GUID of the copy to revert to. Be careful here: reverting is dangerous, as it rolls the volume back and discards anything newer than the copy:

vssadmin revert shadow /shadow={AAAAAAAA-BBBB-1111-2222-CCCCCCCCCCCC}

To delete a shadow copy, use the delete verb, along with the shadows noun (yes, that one’s randomly plural) and then the /shadow option followed by the GUID of the shadow copy to delete (yes, I made that GUID up):

vssadmin delete shadows /shadow={AAAAAAAA-BBBB-1111-2222-CCCCCCCCCCCC}

Alternatively, use my favorite option for this verb, /oldest, which just tosses the oldest shadow copy (less typing; I’m lazy):

vssadmin delete shadows /for=g: /oldest

This is interactive as well, so you’ll have to hit y to confirm. Finally, when clearing out shadow storage for a volume entirely (holy shiznit batman, we’re out of space big time), use the delete verb with the shadowstorage noun, followed by the volume to clear storage for:

vssadmin delete shadowstorage /for=g:
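One more note for the scripters out there: the delete verb also takes a /quiet switch, which suppresses that confirmation prompt in batch files (as always, test against a non-production box first):

vssadmin delete shadows /for=g: /oldest /quiet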

iPhone

iPad + Box.net = Win

Box.net is a cloud-based file sharing service that I used extensively in my last book. Similar to dropbox.com, Box.net allowed my publishers and me to automate our workflow around the publishing process. More importantly, I was actually able to do much of the review and exchange of files from the iPad, which was really nice given that the book was on iOS. I’ve been working with a few companies over the past few weeks on strategies for cloud interoperability, and Box.net has come up a few times in this regard. Looks like I’m not the only one!

Business

EMC + Isilon = ?

EMC is buying Isilon for $2.25 billion. They want the video market, which seems to just be growing and growing. EMC stock dipped a little on the news, which is not surprising because Isilon isn’t worth what EMC is paying for it. What does this mean for the video market? More uncertainty. EMC has been an acquisition marathon runner since 2002, buying up Avamar, Documentum, Epoch, McData, Iomega, Archer, Greenplum, Bus-Tech, Kashya, Dantz, Mozy, Data Domain and even VMware (not to mention a bunch of other companies).

So what does this mean for Isilon’s product line moving forward? If you look at how the acquisitions of Dantz and Iomega sparked the Insignia line at EMC, and how profits from those lines jumped well over 60%, it isn’t hard to imagine that Isilon will almost instantly become more profitable. Of course, the Mac Retrospect software was sold off and now has an uncertain future… One would like to think that the combination of EMC’s wide variety of technologies and Isilon will result in even more environments that Isilon can play in, and even more technical advances to the product line. But I guess we’ll see what happens there…

Xsan

Xsan TCO

I recently read an article in CIO magazine about the cost of storage per gig per month. In the article they quoted Google at about 6 cents per gig per month. I use Amazon for a few projects, which runs at about 12 cents per gig per month. Including labor and hardware, I decided to look at what it would cost per gigabyte per month for Xsan storage. Averaging out 30 installs that we did over the past year came to a total of about 7.2 cents per gig per month, as opposed to around $2.00 per gig per month, which is pretty average for many SAN solutions. Now, Xsan does have its drawbacks compared to a lot of other truly enterprise-class storage solutions (no snapshots, no LUN redundancy, etc.), but provided you build it properly, use it for the purposes it is actually intended for and therefore keep labor costs down, over a 3 year cycle you can get TCO numbers similar to what you might end up paying for other solutions.
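To make that kind of math concrete (these numbers are made up purely for illustration, not pulled from one of our installs): a SAN that costs $40,000 all-in for hardware and labor, with 16TB usable, amortized over a 36 month cycle works out to $40,000 ÷ (16,000GB × 36 months), or right around 7 cents per gig per month.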

Having said this, the larger Xsans typically require more infrastructure and features, which can lead to around double the cost per gig per month. For example, introducing Cloverleaf or Vmirror into the equation will typically require us to double up on storage costs and buy bigger and better switches.

I will not say that a cloud storage service such as Google or Amazon doesn’t have its place. It absolutely does: offline storage, web storage, archival if you have an existing Xsan and can’t spring for a tape drive, Final Cut Server archival (see my previous post on using that), if you travel a lot (like me), etc. But before you jump on the Storage as a Service bandwagon, run the numbers very carefully. If it makes sense on a per-use basis then absolutely go for it, but try to factor everything into the process (especially the data access speed over your WAN pipe and the additional load that will be placed on said pipe).

Windows Server

iSCSI Target Creation

The iSCSI Initiator that we use for connecting Windows to iSCSI targets has a friend.  It’s called Microsoft Windows Storage Server, which you can use to turn a DAS RAID in a Windows box into a LUN for iSCSI.  Good stuff.  Check out the data sheet here:

download.microsoft.com/download/d/8/4/d84b1c50-e0bb-45ba-b2f4-356f4f456a88/WUDSS%20Datasheet_Final.doc

Now that’s not to say they’re the only game in town.  iSCSI Target is also a feature of OpenSolaris:

http://opensolaris.org/os/project/iscsitgt/

And there’s a nifty little Open Source Project called iSCSI Enterprise Target:

http://sourceforge.net/projects/iscsitarget/?abmode=1
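Incidentally, whichever target you end up with, you can attach to it from a Windows box without touching the GUI using the built-in iscsicli tool. A quick sketch from memory (the portal address and IQN here are made up; check iscsicli /? for the exact syntax on your version):

iscsicli QAddTargetPortal 192.168.1.10
iscsicli ListTargets
iscsicli QLoginTarget iqn.2008-08.com.example:target0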

Mac OS X

Solid State Storage for the Masses

I originally posted this at http://www.318.com/TechJournal

The new MacBook Air was introduced at Macworld with the option for a 64GB solid-state drive. Toshiba is also now offering solid-state drives in 32GB, 64GB and 128GB sizes. The drives still seem to be lagging in adoption due to high costs, but they offer more durability, faster boot times and lower power requirements, all of which should lead to higher adoption over the next two years.

Toshiba will also begin making Solid-state SATA drives in May that can be used in desktop systems.

Mac OS X Windows XP

Using Trash for Storage

I’m not sure why this keeps coming up, but you don’t want to use your trash (whether in Entourage, Outlook, Mac OS X or the Recycle Bin in Windows) as a place to store files, emails or anything else you’d be bummed out about losing. Keep in mind that trash can be taken away at any given moment…

Xsan

Primordial Storage

Primordial storage refers to unallocated storage capacity on a storage device. Storage capacity can be allocated from primordial pools to create storage pools, which means that primordial pools are the disk/device sources from which storage pools are allocated. Xsan doesn’t use primordial pools as such, but there is often unused capacity in the form of LUNs that is referred to as primordial at times. This comes up especially on a Promise RAID, where certain LUNs might be smaller than the potential size of others, leaving disks left over that can be mapped and used as near-line storage later. The term primordial can be used to refer to those.
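If you want to see which LUNs fall into that bucket on an Xsan, the cvlabel tool that ships with it will list every LUN it can see along with its label; anything unlabeled is effectively your primordial capacity. From a metadata controller (path and output vary a bit by Xsan version):

sudo cvlabel -l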

Articles and Books Xsan

Article on EMC Channel Manager Retiring

Another article on EMC I was quoted in:

http://www.crn.com/storage/197006487?pgno=2

Xsan

Practical ILM

I originally posted this at http://www.318.com/TechJournal

The amount of data used by Small Businesses is on target to rise 30% to 35% in 2006. Sarbanes-Oxley, HIPAA and SEC Rule 17a-4 have introduced new regulations on the length of time data must be kept and in what format. Not only must data be kept, it must be backed up and secured. These factors have the cost of data storage for the Small Business increasing exponentially.

Corporations valued at more than 75 million dollars are generating 1.6 billion gigabytes of data per year. Small and medium sized companies can reap the benefits of developments made for larger corporations, and methods for classifying data are one of these.

Information Lifecycle Management (ILM) is a process for maximizing information availability and data protection while minimizing cost. It is a strategy for aligning your IT infrastructure with the needs of your business based on the value of data. Administrators must analyze the trade-offs between cost and availability of data in tiers by differentiating production or transactional data from reference or fixed content data.

ILM includes the policies, practices, services and tools used to align business practices with the most appropriate and cost-effective data structures. Once data has been classified into tiers then storage methods can be chosen that are in line with the business needs of each organization. The policies to govern these practices need to be clearly documented in order to keep everyone working towards the same goals.

Storage Classification

Online storage is highly available, with fast and redundant drives. The Xserve RAID and Xsan are considered online storage, which is best used for production data as it is dynamic in nature. This can include current projects and financial data. This data must be backed up often and be rapidly restored in the event of a loss. It is not uncommon to use one Xserve RAID to back up another for immediate restoration of files, with a tape library maintaining offsite backups.

Offline storage is used for data retained for long periods of time and rarely accessed. Data often found on offline media includes old projects and archived email. Media used for offline storage is often the same as media used for backup, such as tape drives and optical media. When referring to offline storage we refer to archives, not backups: archives are typically static, whereas backups change dynamically with each backup run. Offline storage still needs to be redundant or backed up, but the schedules for backup are often more lax than those of other classifications of storage. In a small or medium sized company, offline media is often backed up, or duplicated, to the same type of media that it is housed on. There may be two copies of a tape (one onsite and one offsite) or two copies of DVDs that the data has been burned onto, with each copy stored in a different physical location.

Near-line storage bridges the gap between online and offline storage by providing faster data access than archival storage at a lower cost than primary storage. FireWire drives are often considered near-line storage because they are slower and usually not redundant. Near-line storage can house recent projects, old financial data, office forms that are rarely updated and backups of online storage kept readily available for rapid recovery. Backup of near-line storage will probably be to tape.

Data Classification

Mission Critical data is typically stored in online storage. This data is the day-to-day production data that drives information-based businesses. This includes the jobs being worked on by designers, the video being edited for commercials and movies, accounting data, legal data (for law firms) and current items within an organization’s groupware system.

For the small business, Vital and Sensitive data are often one and the same. Vital data is data that is used in normal business practices but can be unavailable for minutes or longer. Sensitive data is often accounting data that a company can live without for a short period of time, but that will need to be restored quickly in the event of a loss. Small businesses will typically keep Vital and Sensitive data on the same type of media but may have different backup policies for each. For example, a company may choose to encrypt sensitive data but not vital data.

Non-Critical data includes items such as digital records and the personal data files of network users. Non-Critical data could also include a duplicate of Mission Critical data from online storage. Non-Critical data often resides on near-line or offline media (as is the case with email archives). Non-Critical data primarily refers to data kept as part of a company’s risk management strategy or for regulatory compliance, such as old emails and financial records.

Classification Methods

The chronological method for classifying data is often one of the easiest and most logical. For example, a design firm may keep its mission critical current jobs on an Xserve RAID, vital jobs less than three months old on a FireWire drive attached to a server and non-critical jobs older than three months on backup tapes or offline FireWire drives. It would not be possible to implement this classification without having the data organized into jobs first. Another way to look at this method is that data over 180 days old automatically gets archived.
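As a minimal sketch of that 180-day rule (the volume paths are made up, and you’d want to dry-run this with echo before turning it loose on real jobs):

# Move job folders whose directory entries haven't changed in 180+ days to near-line storage
find /Volumes/Jobs -mindepth 1 -maxdepth 1 -type d -mtime +180 -exec mv {} /Volumes/Nearline/Archive/ \;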

The characteristic method of data organization means that data with certain characteristics can be archived. This can be applied to accounting and legal firms: whether a client is active or not is simply a characteristic. Whether a type of clothing is in style or not represents another possible characteristic. Provided that data is arranged or labeled by characteristic, it is possible to archive using a certain characteristic as a variable or metadata. Many small and medium sized companies are not using metadata for files yet, so a good substitute can be using the file name to denote attributes of the file’s data.
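If you go the file name route, archiving by characteristic can be a one-liner against the naming convention (the -inactive- token and paths are made-up examples; substitute whatever your shop actually uses):

# Archive anything flagged inactive in its file name
find /Volumes/Clients -name "*-inactive-*" -exec mv {} /Volumes/Archive/ \;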

The hierarchical method of data organization means that files or folders within certain areas of the file system can be archived. For example, if a company decides to close down its Music Supervision department, the data stored in the Music Supervision share point on the server could be archived.
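That kind of hierarchical archive is about as simple as it gets; something like the following would copy the retired share point off to archival storage before you remove it (paths made up; double-check the rsync flags against your version):

# Copy the retired share point to the archive volume, preserving permissions and timestamps
rsync -a "/Volumes/Server/Music Supervision/" "/Volumes/Archive/Music Supervision/"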

Service Level Agreements

The final piece of the ILM puzzle is building a Service Level Agreement for data management within a company. This is where the people that use each type of data within an organization sit down with IT and define how readily available that data needs to be and how often that data needs to be backed up.

In a Small Business it is often the owner of the company that makes this decision. In many ways, this makes coming to terms with a Service Level Agreement easier than in a larger organization. The owner of a small business is more likely to have a picture of what the data can cost the company. When given the cost difference between online and near-line storage, small business owners are more likely to make concessions than managers of larger organizations, who do not have as much of an ownership mentality towards the company.

Building a good Service Level Agreement means answering questions about the data, asked per classification. Some of the most important questions are:

How much data is there?
How readily available does the data need to be?
How much does this cost the company, including backups?
Given the type of storage used to house this data, how much is it costing the company?
If nearly half the data can be moved to near-line storage, what will the savings be to the company?
In the event of a loss, how far back in time is the company willing to go for retrieval?
Is the data required to be in an inalterable format for regulatory purposes?
How fast must data be restored in the event of a loss?
How fast must data be restored in the event of a catastrophe?
Will client systems be backed up? If so, what on each client system will be backed up?

Information Lifecycle Management

Most companies will use a combination of methods to determine their data classification. Each classification should be mapped to a type of storage by building an SLA. Once this is done, software programs such as BRU or Retrospect can be configured for automated archival and backups. The backup/archival software chosen will be the component that implements the SLA, so it should fulfill the requirements of the ILM policies put into place.

The schedules for archival and backups should be set in accordance with the business’s needs. Some companies may choose to keep the same data in online storage longer than other companies in the same business, because they have invested more in online storage or because they reference the data often for other projects. The business logic of the organization will drive the schedule, using the SLA as a roadmap.

Setting schedules means having documentation for what lives where and for how long. Information Lifecycle Management means bringing the actual data locations in line with where the data needs to be. Once this has been done, the cost to house and back up data becomes more quantifiable and cost efficient. The SLA is meant to be a guideline and should be revisited at roadblocks and at intervals along the way. Checks and balances should be put into place to ensure that the actual data management situation accurately reflects the SLA.

ILM and regulatory compliance are more about people and business process than about required technology changes. The lifecycle of data is important to understand. As storage requirements spiral out of control, administrators of small and medium sized organizations can look to the methods of Enterprise networking for handling storage requirements with scalability and flexibility.