A Beginner's Guide to RAID for Photographers

Protecting and efficiently using your data is one of the most important computing topics any photographer or videographer should be acquainted with. This helpful set of videos will give you a quick introduction to the most common RAID setups to help you choose the right one for you.

Coming to you from Fstoppers alum Kyle Ford, these two helpful videos talk about the basics of RAID for photographers and videographers. RAID stands for "redundant array of independent disks" and uses multiple physical disk drives as a way to improve either data redundancy or performance (or both). The first and most important thing to know about RAID is that it does not constitute a backup and should never be thought of as one. A RAID setup is still susceptible to many of the same problems a single drive would be, and even if it weren't, having both your data and a backup in the same physical location leaves you susceptible to theft and things like natural disasters. It's crucial to always have at least one offsite copy of your data.

Here's the second video on RAID 5 and 6:

A RAID setup can be quite useful if you frequently work with large chunks of data. 

Alex Cooke is a Cleveland-based portrait, events, and landscape photographer. He holds an M.S. in Applied Mathematics and a doctorate in Music Composition. He is also an avid equestrian.

11 Comments

Two excellent introductions to RAID for beginners - easy to understand and keeping things simple.

A minor correction: Around 0:15 in the second video, Kyle says that RAID5 and RAID6 have the added bonus of accelerated read and write speeds compared to RAID1, since they write and read across multiple disks.

That is not entirely accurate.

RAID5 and RAID6 writes the data to one disk and parity data to another (RAID6 writes parity data to two disks), which means that you are actually having slightly lower write speeds than when using RAID1, where data is written to both disks in parallel (and, like Kyle says, even lower write speeds for RAID6).

It is, however, correct that read speeds will be faster since data can be read simultaneously from 3 disks (4 with RAID6) where RAID1 only has two disks to read from.

If you want faster write speeds, you will need to use RAID10 which requires at least 4 drives (basically two sets of RAID1 drives) - that will double your write speed, compared to a RAID1 array.

There is a good explanation of the technical aspects of all this here:
https://www.prepressure.com/library/technology/raid

"RAID5 and RAID6 writes the data to one disk and parity data to another (RAID6 writes parity data to two disks), which means that you are actually having slightly lower write speeds than when using RAID1"

No.

From your link:
"RAID 5 is the most common secure RAID level. It requires at least 3 drives but can work with up to 16. Data blocks are striped across the drives and on one drive a parity checksum of all the block data is written. The parity data is not written to a fixed drive, they are spread across all drives, as the drawing below shows. Using the parity data, the computer can recalculate the data of one of the other data blocks, should those data no longer be available."

The write speeds are slower because of the parity calculation, not because the parity goes to a different disk. This is mitigated by many true hardware RAID solutions. The data is split over every drive in the array. The diagram in your article is drawn poorly but demonstrates this correctly: Block 1A is written to Drive 1, Block 1B is written to Drive 2, etc. The parity blocks are spread across the drives for throughput and wear leveling. There is no "parity drive" specifically, but RAID 5 does use one drive's worth of capacity for parity data. I personally recommend RAID 10 for most people.

Unless, that is, you are using a solution like UnRAID, which, as its name suggests, is not RAID. Data is not striped across the disks in the array but is instead written to one drive, with parity kept on a dedicated drive. The advantages are that you can add drives later and mix and match drive sizes (though the largest drive needs to be the "parity drive"; it really is a parity drive, as it is used exclusively for parity data).

A correction I would like added to the article: in addition to theft, if you delete the wrong file or get hit with malware and have no backup, you have no way of recovering the data, and that is far more common than theft or natural disaster.
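To make the RAID 5 parity discussion above concrete, here is a minimal Python sketch of one way striping with rotating parity can work. It is an illustration under stated assumptions, not how any particular controller is implemented: the function names (`xor_blocks`, `write_stripe`, `rebuild`) are made up for this example, the blocks are tiny byte strings, and the rotation order of the parity block is just one possible layout.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def write_stripe(data_blocks, stripe_index):
    """Lay out one stripe across len(data_blocks) + 1 drives.

    The parity block moves to a different drive on each stripe, so no single
    drive holds all the parity (the rotation order here is an assumption).
    """
    n_drives = len(data_blocks) + 1
    parity_drive = (n_drives - 1 - stripe_index) % n_drives
    stripe = list(data_blocks)
    stripe.insert(parity_drive, xor_blocks(data_blocks))
    return stripe, parity_drive

def rebuild(stripe, failed_drive):
    """Recompute the block on the failed drive by XOR-ing the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_drive]
    return xor_blocks(survivors)

# Three drives, two stripes: parity lands on a different drive each time.
stripe0, parity0 = write_stripe([b"AAAA", b"BBBB"], stripe_index=0)
stripe1, parity1 = write_stripe([b"CCCC", b"DDDD"], stripe_index=1)

# Losing any one drive still lets us recover that drive's block in each stripe.
recovered = rebuild(stripe0, failed_drive=0)
```

Because XOR is its own inverse, the same `rebuild` call works whether the lost block held data or parity. The extra XOR on every write is the parity calculation the comment refers to.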

Well, yes and no.

You are correct that calculating the parity data is part of the slower performance, but unless you have a device with a battery-backed cache, writing to disk (and waiting for confirmation that the data has been written) is definitely a factor as well.

Most of the 1-4 drive SAN devices on the market today have no battery backup for the cache (at least I know of none), although some models have the option to use an SSD as a cache, albeit with lower performance than a real RAM-based cache.

Oh, and I agree regarding RAID 10 being the best option, but it is a bit costly compared to RAID5, both in terms of enclosures and disks.
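The cost trade-off mentioned here comes down to how much of the raw disk space each level leaves usable. The following Python sketch estimates that under simplifying assumptions: all drives are the same size, RAID 1 mirrors the same data onto every drive, and RAID 10 uses two-drive mirrors. The function name `usable_capacity_tb` and its parameters are hypothetical, and real products may reserve additional space.

```python
MIN_DRIVES = {"RAID1": 2, "RAID5": 3, "RAID6": 4, "RAID10": 4}

def usable_capacity_tb(level, drives, drive_tb):
    """Estimate usable capacity in TB for an array of identical drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unsupported level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"{level} needs at least {MIN_DRIVES[level]} drives")
    if level == "RAID1":
        return drive_tb                    # every drive holds the same data
    if level == "RAID5":
        return (drives - 1) * drive_tb     # one drive's worth of parity
    if level == "RAID6":
        return (drives - 2) * drive_tb     # two drives' worth of parity
    if drives % 2:
        raise ValueError("RAID10 needs an even number of drives")
    return (drives // 2) * drive_tb        # half the drives are mirror copies

# Example: a four-bay enclosure filled with 4 TB drives.
for level in ("RAID5", "RAID6", "RAID10"):
    print(level, usable_capacity_tb(level, drives=4, drive_tb=4), "TB usable")
```

With four 4 TB drives, this gives 12 TB for RAID 5 and 8 TB for both RAID 6 and RAID 10, which is why RAID 10 tends to cost more per usable terabyte than RAID 5.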

Anyone else think he is dressed like a NASCAR driver in video 1? Also, very informative and well produced.

Haha, I get that a lot with that shirt. I wish Kodak were still so big they'd sponsor a NASCAR team.

"The first and most important thing to know about RAID is that it does not constitute a backup and should never be thought of as one."

Can you please put that in bold capital letters and underline it too!

In short, no. If you are worried about bit rot, use a file system like BTRFS or ZFS.

Thankfully even some cheap NAS units support BTRFS. The caveat is that you need to factor in the time to do regular scrubs (never fast) and remember that scrubbing the file system will also place stress on the drives for "quite some time" :-)

Another thing: you still need to back up to a system that can verify the integrity of the files.
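As a rough illustration of what verifying the integrity of backed-up files can look like, here is a small Python sketch that records a SHA-256 checksum for every file in a folder and later checks a backup copy against that record. It is only a sketch under assumptions: the function names (`sha256_of`, `build_manifest`, `find_mismatches`) are invented for this example, and it is no substitute for a file system or backup tool with built-in checksumming.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large raw files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }

def find_mismatches(manifest, backup_root):
    """Return relative paths that are missing or differ in the backup copy."""
    backup_root = Path(backup_root)
    mismatches = []
    for relative_path, expected in manifest.items():
        candidate = backup_root / relative_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            mismatches.append(relative_path)
    return mismatches
```

A typical use would be to call `build_manifest` on the working folder right after an import, store the result alongside the backup, and run `find_mismatches` against the backup location from time to time. Any path it returns is a file that is missing or no longer matches the original.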

Hmmm - if a bit is flipped on the disk and affecting the volume on the filesystem-level on one of the disks, it will be detected and corrected by the RAID and cause a rebuild, won't it?

I mean, the way I understand bitrot, it's a problem on the physical layer of the disk, and RAID should specifically be able to detect this kind of problem and do a rebuild to correct it.

The RAID standard itself does not include "scrubbing", but most subsystems (hardware, drivers, software) and operating systems have features for it: they check the data for errors, both at the file system level and at the hardware level, and remap defective sectors or rewrite the faulty block if the problem is at the file system level.

This is also true for single-disk setups. Both Windows and Linux have this; the Windows system check is automated, but I don't know how it's done on Linux or Mac.

This has actually not been a big problem on Windows for the last 15 or so years. But many people disable System Check/Maintenance in Windows, especially on early Windows versions, and without maintenance enabled the OS cannot fix this "on the run", regardless of what system you use.

I desperately need some expert advice. I am a photographer who concluded that serious storage needed to be invested in, and after some online research, settled on a RAID 1, and ordered the G-RAID with Thunderbolt 3 from Amazon.

I just received it, but have not formatted it from 0 to 1 yet, nor installed it. Somehow, I did not come upon the option of RAID 5 and up online until now (I am not sure how I missed it), but I am feeling that I really need a RAID 5, not a RAID 1! Have I made a big mistake?

Now I am stuck with this G-RAID with Thunderbolt 3 with dual drives. I am woefully lacking in knowledge and confidence when it comes to peripheral hardware. My question is: can I simply purchase one more single drive, and daisy-chain it, and then somehow (how?) set it up as a RAID 5? Or am I stuck with a RAID 1 system now, period?

Hoping very much that someone here can assist me. Thanks!