Update: More information on RAID can be found here.
And who wouldn’t put their hands up for RAID? The Redundant Array of Independent Disks can improve your disk performance, or give you data integrity, or both.
Windows Media Center provides an excellent home for your digital entertainment, and like any excellent home you want to pack more and more into it. Take a look at your physical collections: photos, albums, CDs, DVDs, video tapes, etc. If you are into home entertainment then you’ve probably collected plenty of good stuff. The same thing will happen to your digital collection. Before long you will outgrow a single drive, then two. Around the time you get your third drive and file searches need three separate operations you’ll start looking around for a better storage system.
Adding a RAID Controller card and an array of 3 or more hard drives will combine those separate drives (minus a bit of space for safe data storage) into a single virtual drive that is easier to manage and provides protection against a single disk failure. It may improve your disk performance, depending on how you configure it.
In a Nutshell:
A RAID system platform can be software or hardware — the software system is generally not favoured as a system crash can lose data. Hardware is the preferred solution and it typically consists of a card that fits into your motherboard, a driver to integrate the card and its drives into your system, and 3 or more hard drives – all of which usually should be at least the same size, and sometimes the same brand and model as well.
RAID reformats your 3 drives into one large virtual drive and reserves some space (usually hidden) for redundancy. According to Wikipedia the term RAID was first defined by three guys working at the University of California, Berkeley in 1987. In the early years the technology was used mainly in corporate and high-end systems. Today, RAID technology has become affordable for most home users.
There are several standard levels of RAID, creatively named RAID 0, RAID 1, RAID 2, RAID 3, RAID 4, RAID 5, and RAID 6. Each level provides a different combination of benefits. The three primary methods used in RAID technology are “striping”, whereby a block of data is split into chunks (stripes) that are written to, and read back from, all the drives in the array simultaneously, providing a significant performance improvement; “mirroring”, which writes the same data to two drives at the same time; and “parity” for data protection, which we’ll look at later.
Mirroring is similar, but it writes stripe A1 to two drives instead of one.
The third method provides redundancy by creating parity data in the hidden space I mentioned above. The lower RAID levels, which can run on as few as two disks, focus on striping and mirroring, while the higher RAID levels focus on parity and generally require the most disks.
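To make striping and mirroring a little more concrete, here is a minimal Python sketch. It is only a toy model: the function names (stripe, unstripe, mirror) and the idea of representing each drive as a list of chunks are my own assumptions for illustration, not how any real controller stores data.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into chunks and deal them out to the drives in turn."""
    drives: list[list[bytes]] = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        drives[chunk_index % num_drives].append(data[offset:offset + chunk_size])
    return drives


def unstripe(drives: list[list[bytes]]) -> bytes:
    """Read the chunks back in the same round-robin order they were written."""
    out = []
    for row in range(max(len(d) for d in drives)):
        for drive in drives:
            if row < len(drive):
                out.append(drive[row])
    return b"".join(out)


def mirror(data: bytes, copies: int = 2) -> list[bytes]:
    """Write the same data to every drive in the mirror set."""
    return [bytes(data) for _ in range(copies)]


data = b"A1A2A3A4A5A6"
striped = stripe(data, num_drives=3, chunk_size=2)
print(striped)    # each drive holds every third chunk
print(mirror(b"A1"))  # both drives hold the same chunk
```

With three drives, chunks A1 and A4 land on the first drive, A2 and A5 on the second, and A3 and A6 on the third, so a read can pull from all three at once. Mirroring gives no extra space, just a second copy.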
Wikipedia has an excellent page on the subject (http://en.wikipedia.org/wiki/RAID) but don’t be embarrassed if you get lost around the second or third screen. The “parity” part of RAID is rocket science in my estimation.
In addition to the levels, there are also many more flavors of RAID, many of them proprietary (RAID hardware and software written by a company for use with its own gear). Any RAID solution you select should have a management screen that allows you to change the disk configuration, build and change the size of the virtual drive, and adjust operating variables such as block size and stripe size. It should also give you the option to manage alarms and reports on any faults that may occur and manage the disk re-build and re-initialisation process.
Here’s the management screen I’m using now:
… admittedly a bit dense, but it’s got a lot of features.
RAID 5 seems to be the favorite just now — it doesn’t provide much in the way of performance enhancements, but it does consolidate all of your physical drives into a single virtual drive, and it protects your data. You can lose a whole disk and your system will continue to operate normally with perhaps a small loss of performance — when you replace the bad disk, RAID will rebuild it for you. It’s a great feeling to know your data is safe.
Another tip on data protection: keep your operating system separate. In a one-disk environment, this means partitioning your drive into two virtual drives. The “C:\” drive holds Windows and your programs. I’ve allocated 100 GB for this drive and that’s always been enough. The “D:\” portion will hold your data. In this way, even if you lose your primary drive in a system crash, your data drive will remain unaffected.
It’s great for maintenance too — sometimes it’s easier just to wipe your C:\ drive and start over from scratch – formatting C:\ and reloading Windows. This can be easier than trying to find a fault or the source of a system slow-down. Be sure to keep a copy of your favourites, cookies, etc. (personal preferences normally held on the C:\ drive) in a backup folder on D:\.
Currently I use a separate physical drive for this purpose. While all the drives in my RAID array are SATA, I use the 40-pin ATA (commonly known as IDE) port on my motherboard to connect an old 120 GB Seagate just for Windows.
This is where the rocket science kicks in. I’ve got eight 1 TB drives on my system, and they give me 6.63 TB of usable space, which means that 1.37 TB is used for the parity information. Yet under this system I can lose any single 1 TB disk, meaning that somehow 6.63 TB of data has been backed up on 1.37 TB of disk! How can that be?
Here’s a picture of how the data blocks are written to the disks (in this case 4 physical, even though they will look like 1 virtual drive to us):
This picture, and many others I came across doing the research for this blog, show how parity is written, but don’t really tell you much about how it works. To quote one source:
“Parity can be added to protect the striped data. Parity data is calculated for the stripes and placed on another disk drive.”
Great. It gets calculated then it gets written. But how do those little red stripes manage to back up all my data?
At the high-end, explanations start to look like this:
The calculation process is also displayed below:
1111 XOR 1110 XOR 1100 XOR 1000
= ((1111 XOR 1110) XOR 1100) XOR 1000
= (0001 XOR 1100) XOR 1000
= 1101 XOR 1000
= 0101
Yikes! That doesn’t help at all!
But fear not, I found an explanation that, while simple, gives a picture of how parity works.
Let’s say you have three drives, A, B, and C. In a RAID 5-type environment you would write data to two of them, A and B, and on C you would write the parity information. Let’s say the information on A is “5”, the information on B is “12”, and the parity information calculated is “7” (12 – 5). If you lose the information on one drive, B for example (12), you can calculate the missing data from the remaining stripes, A and C: 5 + 7 = 12.
So now, just imagine that the data written to A and B is much longer and more complex than a simple integer, and imagine that the adding and subtracting calculations are more like the “XOR” operation above and you can start to get a feel for what must be going on.
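Here is a minimal Python sketch of that idea, using XOR in place of the adding and subtracting. The helper name xor_parity and the sample data (“HOME”, “FILM”, “DISK”) are made up for illustration. Real RAID 5 also spreads the parity stripes across all the drives instead of keeping them on one, which this toy version ignores.

```python
from functools import reduce
from operator import xor


def xor_parity(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, one byte position at a time."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


# The four-bit calculation quoted earlier.
print(format(0b1111 ^ 0b1110 ^ 0b1100 ^ 0b1000, "04b"))  # 0101

# The A/B/C example, with XOR instead of subtraction.
a, b = 5, 12
c = a ^ b          # parity written to drive C
print(a ^ c)       # drive B is lost: rebuild it from A and C -> 12

# The same trick on longer data spread over three data drives.
drive_a, drive_b, drive_c = b"HOME", b"FILM", b"DISK"
parity = xor_parity([drive_a, drive_b, drive_c])

# Lose drive_b, then rebuild it from the survivors plus the parity.
rebuilt = xor_parity([drive_a, drive_c, parity])
print(rebuilt)     # b'FILM'
```

The parity block is only as big as one drive’s worth of data, yet any single missing drive can be rebuilt from it, because XOR-ing the parity with everything that survived leaves exactly the piece that was lost.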
The Down Side:
And a word of caution here. My controller card began to re-format my disks the moment I attached them, without notifying me or giving me a chance to cancel. Fortunately the 3 disks I was testing didn’t contain valuable information. If they had, that information would be gone.
The first card I’d used (a Highpoint RocketRAID) allowed me to convert the data on my separate drives into an array on the fly, rather like the process of growing the array. You can add a disk, and then merge that disk and its data into the array as a separate operation. As I recall the process took nearly a full day. My new RAID card doesn’t have that feature.
The safest course of action is to start with blank disks. Any existing data on your system will need to be copied to some other temporary repository (borrowing your friends’ external drives for a few days, perhaps), and depending on the level of RAID you choose, you will lose a portion of that disk space to provide the redundancy. From my 8 x 1 TB disks I get 6.63 TB of storage space, but the payoff is large in easier disk management and peace of mind.
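If you want a rough idea of how much space you will end up with before you buy, the sums can be sketched in a few lines of Python. This assumes a plain RAID 5 array, where one disk’s worth of space goes to parity; the function names are my own. The number your controller reports may not match, including the 6.63 figure quoted above, because it depends on how the card counts units and what it reserves for itself.

```python
def raid5_usable(num_disks: int, disk_size: float) -> float:
    """Usable space in a RAID 5 array: one disk's worth is given up to parity."""
    if num_disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (num_disks - 1) * disk_size


def tb_to_tib(terabytes: float) -> float:
    """Convert decimal terabytes (10**12 bytes) to binary tebibytes (2**40 bytes)."""
    return terabytes * 10**12 / 2**40


usable = raid5_usable(8, 1.0)
print(usable)                        # 7.0 (decimal TB)
print(round(tb_to_tib(usable), 2))   # 6.37 (as an operating system counting in binary units would show it)
```

The second figure is smaller only because drive makers count in powers of ten while operating systems often count in powers of two. No extra space has been lost.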
A second word of caution – make sure that you have enough power from your power supply to handle the additional hardware. Along with the added hardware is more heat so also make sure you have sufficient cooling. I went overboard with 5 fans, but as long as you keep your system from over-heating you should be fine – fans are cheap and pretty easy to install.
Perhaps more than any other type of package, Windows Media Center (and Media Center-type packages) encourages users to load their systems with the largest types of files held on a home computer. A typical movie in AVI format will be 800 MB, and a movie held in full DVD format (.VOB files) can be 4 to 5 GB, so, sadly, terabytes are becoming the new megabytes.
Upgrading to a RAID storage solution was a very good move for me and I’d recommend it to anyone who knows how to add a disk to their system.
My thanks to the good folks on the Internet that have chosen to share their knowledge with us: