It Can Happen to You: My Photo Hard Drive Just Failed

Between my music and photography work, I have a lot of hard drives. Even so, I had never had one fail until two days ago. It went without warning: no crunching sounds, no notifications, just a red light.

The Story

It's been a rough few weeks for me computer-wise. I was editing client photos earlier this weekend, and all was fine. Then, when I turned on my computer Sunday, I received the following message: "Your device has a RAID configuration issue." I figured it was something a restart might fix, but I peered at my enclosure (it's hidden behind my monitor) and saw solid red lights (never a good sign). I shut down my computer, then restarted it and watched the enclosure as the soft white light blinked on, thinking all was fine, only to watch it suddenly switch to solid red again. I opened the drive utilities app and saw the dreaded status telling me one of the drives had failed and needed to be replaced.

Thankfully, I use two 8 TB hard drives in a single enclosure in RAID 1 for my photos. RAID isn't a backup solution, but it did allow me to immediately switch over to the redundant drive and continue working. Everything is backed up with Backblaze (at $5 a month, I really can't recommend them enough) and on another drive in my apartment, but even so, the thought of losing literal terabytes of not just data but hard work, creativity, and memories was sobering.

The 3-2-1 Strategy

If you've never heard of the 3-2-1 strategy, it's the best way to back up your data. It goes like this: at the minimum, you have three copies of your data, two of which are local and one of which is offsite. So, for example, my current setup is:

  • Main external hard drive: My photos live on an 8 TB external hard drive in RAID 1 configuration with an identical drive. I do not count the RAID configuration as a backup; it simply makes it very easy to get back up and running if a drive fails.
  • NAS: I have a second external hard drive attached to my router. My computer automatically backs up the photos drive to that drive. I prefer it this way because my router is in another room on another circuit, so I do get at least some isolation between the two local drives (imagine a burst pipe in the ceiling pouring on both if they were on the same desk). 
  • Backblaze: Every night, my computer syncs to Backblaze, so my offsite backup is always up to date. The initial backup took about 40 days, but if that's too long for you, you can send them a hard drive of your data. 
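
The rule as stated above can be sketched as a small check. This is only an illustration: the `Copy` class, the `meets_3_2_1` function, and the location labels are hypothetical names made up for this example, and the RAID 1 enclosure is counted as a single copy, in line with the point that RAID is not a backup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One copy of the data. All field names here are illustrative."""
    name: str
    location: str   # free-form label, e.g. "desk" or "cloud"
    offsite: bool


def meets_3_2_1(copies):
    """Check the rule as the article states it:
    at least three copies, at least two local, at least one offsite."""
    local = [c for c in copies if not c.offsite]
    offsite = [c for c in copies if c.offsite]
    return len(copies) >= 3 and len(local) >= 2 and len(offsite) >= 1


# The setup from the list above; the RAID 1 pair counts as ONE copy.
setup = [
    Copy("RAID 1 enclosure", "desk", offsite=False),
    Copy("router-attached drive", "other room", offsite=False),
    Copy("Backblaze", "cloud", offsite=True),
]

print(meets_3_2_1(setup))      # all three conditions hold
print(meets_3_2_1(setup[:2]))  # no offsite copy, so the rule is not met
```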

With this method, you can easily get up and running again if your local drive fails, and if something catastrophic happens, you have the offsite backup. I can't stress the importance of an offsite backup enough. It doesn't matter if you have 500 copies; if they're all in the same place and a fire, flood, or theft occurs, you're done for. At $5 a month and with the benefit of automatic backups, Backblaze is a no-brainer for me. If your connection speed precludes its use, another option is buying an extra external drive that you keep elsewhere, such as an office or a relative's house, and bring home every few weeks to update.

Yes, the extra hard drives and subscription services add cost to the equation. Nonetheless, the thought of losing my work is terrifying enough that I'll gladly pay that extra cost, and I definitely recommend you do too. 

Lead image by Pixabay user 422737, used under Creative Commons.

Alex Cooke is a Cleveland-based portrait, events, and landscape photographer. He holds an M.S. in Applied Mathematics and a doctorate in Music Composition. He is also an avid equestrian.

44 Comments

I have everything stored on one external travel-sized drive. It's called the 50-50 strategy, as every day it feels like I have a 50% chance of making it without losing it all.

You like to live your life dangerously... (Seriously Ryan please follow Alex's advice)

Ummmmm 😳

livin on ze edge

As a film shooter and self-developer I live on the edge anyway so I am pretty much in the same boat when it comes to backup! I have my negatives, but man would that be a bitch to re-scan!

Currently SSD drives are probably best suited to be a “working drive” - that is, you have your projects stored on SSD until they are delivered to the client, then it goes to your archive drive (which is usually much larger and cheaper but slower).

As for backing up to the cloud only - well, it depends.

There are two terms that you should know when trying to determine a backup strategy:

Recovery Point Objective (RPO) and Recovery Time Objective (RTO).

RPO is basically how much work you are prepared to lose if a storage medium holding data not yet backed up is lost. If you are a wedding photographer your RPO is basically zero - you shoot to dual cards, and a card is not formatted until the data has been transferred to a hard drive AND backed up AND copied to an offsite location.

If you are a portrait photographer you may be able to accept a 24-hour RPO and perform a backup every night - after all, if you lose a day's worth of portraits (or retouching work), you can probably talk the client into returning for another session.

If your “backup to the cloud” is a near-real-time synchronization (e.g. Dropbox), it is imperative that deleted or replaced files are stored and available for a while - otherwise a file that has been deleted locally will be gone from the cloud a few seconds later.

RTO is how long you can wait for a restore of your data, and this is where the cloud may be an insufficient solution. Alex briefly touched on this: A backup taking 40 days to complete will also take days or weeks to restore, and that may not be fast enough if your business depends heavily on your data.

If you want to continue using a cloud service as your only backup, I strongly suggest segmenting your data: one small set (e.g. your working drive) and several larger sets defined by time (e.g. one new folder each month) - that way you can restore your working data first and then start restoring your archive (and pray you don’t need anything from your archive).
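
To make the segmentation idea concrete, here is a rough sketch of a restore plan that brings back the highest-priority set first and tracks the cumulative time until each set is available. The function name, the data set names, the sizes, and the 100 Mbps download speed are all made-up assumptions for illustration, and the estimate ignores protocol overhead and provider throttling.

```python
def restore_plan(datasets, download_mbps):
    """Order data sets by priority and estimate when each finishes restoring.

    datasets: list of (name, size_gb, priority); a lower priority number
              is restored first.
    download_mbps: sustained download speed in megabits per second.
    Returns a list of (name, cumulative_hours).
    """
    plan = []
    elapsed_hours = 0.0
    for name, size_gb, _priority in sorted(datasets, key=lambda d: d[2]):
        bits = size_gb * 8e9                      # decimal gigabytes to bits
        seconds = bits / (download_mbps * 1e6)    # megabits/s to bits/s
        elapsed_hours += seconds / 3600
        plan.append((name, round(elapsed_hours, 1)))
    return plan


# Hypothetical sets: a small working set and two larger archive sets.
datasets = [
    ("archive-2016", 2000, 2),
    ("working", 200, 1),
    ("archive-2017", 2000, 3),
]

for name, hours in restore_plan(datasets, download_mbps=100):
    print(f"{name}: available after about {hours} hours")
```

With these assumed numbers the working set is back in a few hours, while the full archive takes days, which is the RTO trade-off described above.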

Personally I would not be happy with anything less than what Alex has described in his article, but I can easily appreciate that not everyone can afford a full blown 3-2-1 solution if photography is just a hobby. If you are a working pro, I have a hard time finding a good excuse for not having a local backup too.

Some cloud services (such as Dropbox) do keep a history so if you delete/edit a file locally and it syncs to Dropbox, you can recover both the original and the edited version. I pay a small extra fee to Dropbox and get a full 12 months of history on every file.
Also, even if it takes a month to fully restore your local copies, you can always access individual files on Dropbox - so it shouldn't be a huge inconvenience.
You always run the risk that the cloud service provider will fail, but this is very rare and IMHO the chance of this happening at the same time that you lose your local files is very, very small indeed.

Perfect article, Alex. I'd quote every single word, especially "RAID isn't a backup solution." I work in IT, and sadly most customers understand how important backup is only after a major failure.
I would add another reason to have an offsite backup: ransomware such as CryptoLocker. Photographers' and videographers' PCs and Macs are usually really fast, which makes CryptoLocker's work easier, so having an offsite backup that is not connected to your LAN is vital.

I agree that most understand how important this is only after a major failure. Everyone in my family has experienced something along these lines, including myself when I spilled a glass of wine on my laptop. I've made them all install Google Drive and pay the $2 per month to have their important documents backed up. And it's those people who need a solution like Backblaze or Google Drive that does this automatically; otherwise they get lazy and forget why it was so important in the first place.

Yes backup has to be automatic otherwise no one will do it

I shoot them --> images go from the card to my laptop --> laptop gets backed up to two different local drives --> edited session with both RAW and PSD files gets archived to two different mirrored drives --> final web JPGs and full-res TIFFs go to the cloud.

Also, if I shoot client work tethered, I back up all RAW files at 15-minute intervals to an external SSD.

Requires some manual work and has a point of failure in that all files are locally stored for a while but it is a cheap and pretty robust solution for me at the moment.

Good job with the 3-2-1, but your NAS component has a major downside. Here's your test case: turn off your router, unplug the USB3 drive from it and plug it into a computer. Can you read it? Most likely not, as the drive is formatted in EXT4, which isn't natively supported on Windows and Mac computers. If you're lucky you'll spend an hour or two setting up a virtual machine with a Linux LiveCD and hope you can share the drive back to yourself.

Yes, home routers 'can' share out a drive like a NAS, but it's not a full featured device.

Had a VERY similar experience earlier this week with my RAID drives. It was a sobering reminder that they're not a backup solution (and luckily I wasn't treating them like one!).

Which NAS brand do you use? I'm considering adding one to my backup strategy as well.

QNAP and Synology are very good NAS brands; just make sure to fill them with NAS-specific hard drives such as the Western Digital Red or Red Pro series.

I use WD as I like their drives, but I'll defer to Andrea Re Depaolini's expertise here; he's the IT guy. :)

«…but if your connection speed precludes its use….»
Just to make it clear to those who may be unaware, the Internet speed here in question is your ‘upload’ speed, not your download speed.

Your Internet “connection speed” has two values; download and upload. Most ISPs will quote your download speed to you. If you are not on a fiber connection, your connection is probably asymmetrical, meaning that your download speed is NOT the same as your upload speed.

Most non-fibre telco connections give relatively low upload speeds. Most residential non-fibre cable connections, although much better than telcos, usually are no higher than 10-15Mbps upload. Even with a 1Gbps download speed, some cable providers limit uploads to no more than 35Mbps (which isn't too shabby, but expensive).

Fibre connections tend to be symmetrical. This means that if your advertised download speed is 100Mbps, then your upload speed is probably 100Mbps, also.

The point is that many purchase a 250 Mbps download connection for ¤299.95 per month (plus modem rental, fees, taxes, for upward of ¤330), then wonder why their backup is taking so long. The answer is that their upload speed may still be as slow as 10Mbps.
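
As a back-of-the-envelope illustration of why upload speed dominates the initial backup, here is a minimal sketch. The function name and the `efficiency` parameter are assumptions for this example; real transfers lose some throughput to protocol overhead, throttling, and competing traffic, so treat the result as a best case.

```python
def transfer_days(size_tb, speed_mbps, efficiency=1.0):
    """Estimate days needed to move size_tb terabytes at speed_mbps.

    size_tb: data size in decimal terabytes (1 TB = 10**12 bytes).
    speed_mbps: link speed in megabits per second (upload, for a backup).
    efficiency: assumed fraction of the link actually usable (0 to 1).
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (speed_mbps * 1e6 * efficiency)
    return seconds / 86400


# 6 TB over the upload speeds mentioned in this comment.
for mbps in (10, 35, 100):
    print(f"{mbps} Mbps upload: about {transfer_days(6, mbps):.1f} days")
```

At 10 Mbps, 6 TB works out to roughly eight weeks of continuous uploading, and a tenth of that at 100 Mbps.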

Exactly why I pay for a 60/6 connection. I'd give anything for symmetrical d/u speeds.

From the beginning, I've stored everything on matching externals. As drive space has gotten cheaper I buy larger drives, 250GB, 500GB and now 1TB drives. Like CF and SD cards, I never have all the images on one big card but write to two in camera. If one fails, I simply replace and backup from the matching drive or card. I believe in redundancy and so far it hasn't failed me. Luckily, I've had one drive fail and it's since been replaced.

A duplication of data is not a backup. A backup is duplication of storage units. RAID is one storage unit, whether RAID1, RAID5, RAID50, RAID10, or RAID0. It is one storage unit. You may argue that RAID1 contains two copies of your data, but it is still one storage unit, so not a backup.

Using your logic, Google drive is a multi-layered backup solution because there are several copies of my data scattered on several servers in several countries. The thing is that it is still one storage unit. If a nefarious actor was to hack Google and take the AAA services down, thus preventing anyone from accessing their data, —yes, highly unlikely, but run with me on this— then my one and only storage unit is gone, regardless of the fact that it contains over a dozen copies of my work.

RAID1 is the same way. More than one copy of data, but still one storage unit, and should it go down completely —by overheating, being dropped off the desk, a liquid spill, et al— it does not matter that it has two copies of your data. You have just lost the single storage unit.

Besides, before he said, «I do not count the RAID configuration as a backup,» he had already said, «RAID isn't a backup solution….» Whether one calls it a backup, a backup system, or a backup solution, the important thing is, “copies of the data on different storage units, in different locations.”

He made that quite clear without any confusion, I think, even to the point of explaining why multiple local locations are necessary, and why off-site is necessary. His only confusing term was “local”. He ought to have used the terms online, near-line, and offline. …Except, in today's world of every-body-is-a-tech-user, people may get confused by the terms ‘online’ & ‘offline’ storage, not to mention ‘near-line’. Additionally, his solution only included one online and two near-line storage solutions (one nearer than the other).

From an IT perspective no RAID level is considered a back up. It is considered fault tolerance. RAID is only a consideration when discussing server uptime. The backup discussion is separate. For example, one of the file servers I manage is running RAID 6 with a nightly differential backup performed by a completely separate system that also uses disk in this case. File restores come from the backups. RAID 6 just means that I can have two drives fail simultaneously without the server going down. Thus I can hot swap out up to two failed drives on a live system and let the RAID array rebuild itself with zero down time. While RAID 1, the simplest form of RAID that provides fault tolerance, does provide that fault tolerance by performing a simple duplication of data, it was never intended to be a backup.
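
For readers unfamiliar with the RAID levels mentioned here, the following sketch summarizes usable capacity and the number of drive failures each level survives. It covers only RAID 0, 1, 5, and 6 with identical drives, the function name is made up for this example, and it says nothing about backups: fault tolerance is all it describes.

```python
def raid_summary(level, n_drives, drive_tb):
    """Return (usable_tb, drive_failures_tolerated) for common RAID levels.

    Assumes n_drives identical drives of drive_tb terabytes each.
    """
    if level == 0:
        if n_drives < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return n_drives * drive_tb, 0          # striping only, no redundancy
    if level == 1:
        if n_drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        return drive_tb, n_drives - 1          # every drive holds the same data
    if level == 5:
        if n_drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (n_drives - 1) * drive_tb, 1    # one drive's worth of parity
    if level == 6:
        if n_drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        return (n_drives - 2) * drive_tb, 2    # two drives' worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


print(raid_summary(1, 2, 8))   # the article's two 8 TB drives in RAID 1
print(raid_summary(6, 6, 4))   # a hypothetical six-drive RAID 6 array
print(raid_summary(0, 2, 8))   # RAID 0: more space, zero fault tolerance
```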

Of course, the real gotcha in this conversation is that when RAID was originally envisioned no one involved ever conceived of consumer level devices supporting RAID in any form.

In my opinion, the conversation about backups is slightly misguided. We really need to be discussing risk. For example, the oft cited off site backup sounds great. However, if the backup is done to a site that is in the same geographical region then a natural disaster could easily wipe out all the local data and the off site backup. The entire risk discussion can take you down quite a deep rabbit hole because the final decision needs to be how many nines of reliability do you need after the decimal? The cost/benefit for 99.99% reliable uptime is substantially better for most people than the benefit of 99.9999% as there are serious diminishing returns on investment.
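
To put the "nines" in perspective, this small sketch converts an availability percentage into allowed downtime per year. The function name is made up for this example, and it assumes a 365-day year.

```python
def downtime_minutes_per_year(availability_percent):
    """Minutes of downtime per 365-day year allowed at a given availability."""
    minutes_per_year = 365 * 24 * 60
    return (1 - availability_percent / 100) * minutes_per_year


for nines in (99.9, 99.99, 99.9999):
    minutes = downtime_minutes_per_year(nines)
    print(f"{nines}% -> {minutes:.2f} minutes per year")
```

Going from 99.99% to 99.9999% shrinks the allowance from under an hour a year to about half a minute, which is where the diminishing returns mentioned above come from.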

If anyone is still reading, please, I implore you, never use RAID 0 for your photographs!

Also, use a NAS that supports ZFS or BTRFS and learn how to use things like CoW and snapshots. Also, don't forget to scrub :-)

You can quibble over semantics all you like, but you'll be screwed if you use RAID as a backup. The misinformation is being spread by you here; listen to the experts.

RAID as a backup is the equivalent of placing all your eggs in one basket. "Hey, look at all the backup eggs I've got! If one's rotten, I'm fine!"
*Drops the basket*
"...oh I understand now."

Everything should be made as simple as possible, but not simpler. There is an important practical difference between data redundancy and backup, and your inability to understand this or defer to those describing it, just to prove a silly pedantic point, could seriously cause you or others a lot of hurt.

"I don't need an expert to provide the definition of a word."
Good luck with that pompous attitude. Nope. You're wrong here, and obviously any "reasonably educated English speaker" should understand the difference between simplistic laymen definitions and specialized terminologies.

Look, I can find stuff online too!

http://techthoughts.info/synology-raid-failure-raid-not-a-backup/

https://blog.storagecraft.com/5-reasons-raid-not-backup/

https://www.cnet.com/how-to/digital-storage-basics-part-3-backup-vs-redu...

https://www.smallnetbuilder.com/nas/nas-features/31745-data-recovery-tal...

It's okay to be wrong sometimes.

Let us make a few things clear….

① “Backup” IS a technical term, so a TECHNICAL definition is needed.

② The definition you gave,
«a copy or duplicate version,… retained for use in the event that the original is in some way rendered unusable,»
excludes RAID as a backup, as RAID was not designed to create «a duplicate… for use in the event that the original is in some way rendered unusable.» It is a duplicate for the purpose of preventing downtime should a hardware component fail, leaving the original data usable (except for RAID0, which makes no duplicate, and was designed for speed & capacity).

③ The definition you gave,
«a procedure to follow in such an event,»
means that whether by “backup,” he meant “a copy of a file for the purpose of recovery,” or “a backup solution, backup plan, backup procedure, backup system,” can be taken by context, as a ‘backup’ is also defined as a backup procedure/plan/solution/system by your own admission.

④ to reiterate my last point in this conclusion, a duplication of data, such as in a RAID1, is not the same as a backup, as its purpose is NOT to mitigate data loss, but to mitigate hardware loss and allow business continuity.

⑤ …And when it was said that, “I do not count the RAID configuration as a backup,” in context of the previous statement that, “RAID isn't a backup solution,” and the rest of the statement which goes on to state, “…it simply makes it very easy to get back up and running if a drive fails,” that the use of the term, ‘backup,’ was in the sense of, ‘a backup solution.’

It is called “language comprehension,” or “logic,” and the author of the article needs to make no clarifications for those who know how to comprehend context.

This is a succinct description of technical processes, placed in appropriate and practical usage context, and based on logic and a comprehension of relatively complex systems.

This will not end well...

succinct to any reasonably intelligent adult.

Better?

Bob, are you actually incapable of understanding the difference between common definitions and technical terms?

Here’s a definition for you:

Technical: relating to a particular subject, art, or craft, or its techniques.

All this energy arguing over your inability to understand technical terms would be better spent creating an off-site backup, by the way.

You’re right: you have a right to remain ignorant and stubborn.

① «I provided the techical [sic] definition of the word backup from five dictionaries.»

No, you did not. You provided definitions from five dictionaries. Concise dictionaries are useless. They will tell you that ‘precise’ & ‘exact’ have the same meaning. They will tell you that ‘fat’ & ‘zaftig’ are synonymous. ‘Gigantic’ and ‘enormous’. They are NOT technical dictionaries.

The only thing close to a “technical” definition, is the one, (definition 5), preceded by the word in italics, “Computers.”

② RAID is NOT designed to provide a means of restoring PRIMARY storage when the PRIMARY storage is no longer accessible, since RAID IS designed to be PRIMARY STORAGE.

③ «That particular meaning is the verb meaning.»
~~~~~~
noun
1. a person or thing that….
5. Computers.
a copy or duplicate version, especially of a file, program, or entire computer system, retained for use in the event that the original is in some way rendered unusable.
a procedure to follow in such an event.
~~~~~~~~~
You ought to learn how to read a dictionary. Apparently you DO need someone to help you. “A procedure to follow,” is a noun. “Following a procedure,” is a verb. “To back up,” is a verb, but “backup,” is listed as a noun, (as in these definitions), and as an adjective, such as, “the backup copy.”

By technical merit, a RAID1 does NOT have a backup copy. It has two, equal, primary copies.

④ «RAID 1 …copying is occurring…»

Technically, copying is NOT occurring. What is occurring is simultaneous writes, for redundancy. It is like a “balanced” microphone cord, which has two feeds of a signal. The feeds are NOT copies, (in fact, one feed is the inverse of the other), but there to mitigate signal corruption.
Copying is taking data that is written, and writing it again somewhere else.

⑤ «Alex's statement was by definition of the word backup, incorrect…»
But it perfectly matched definition ⑤ⓑ which YOU quoted from Random House Unabridged Dictionary, in the section of, “Computers.” So his statement, by DEFINITION, according to RHUD, was CORRECT!

«Your kind of tech talk, grammar and verbosity is also why novices have so much trouble understanding so much about computers. »

…and one cannot use a medical dictionary to understand physics. If a novice wants to understand computers, IT REQUIRES THEM to understand “tech talk,” a.k.a., jargon. Those who fail to get the jargon of any field, will always fail to understand that field. One cannot take non-tech terms to try and comprehend tech terms. “Backup” is a tech term, as the RHUD points out. Learn the tech term, understand the tech term, and use the tech term.

Your kind of literal interpretation, lack of grammar, and insistence on sticking with the archaic is also why novices like you have so much trouble when speaking with those who know about computers.

…I see that this is not getting anywhere. The real point is ① Get a backup solution, and ② RAID is NOT a backup solution. There was no confusion about this in the article at all.

I am done.

Preach!

① Clearly you cannot comprehend.

② Clearly you cannot comprehend.

③ Clearly you cannot comprehend.

④ Irrelevant, non-expert comment.

⑤ Clearly you cannot comprehend.

…skipping the rant…. (because, clearly you cannot comprehend).

«The fact that you are now having to include the word solution….»

When were the words, ‘backup solution’ first used regarding the term, ‘RAID’? When the author had said, “RAID isn't a backup solution….” It was the third time he used the word, ‘RAID,’ and the FIRST time he used the word, ‘backup,’ which he did in conjunction with the word, ‘solution.’

Clearly you cannot comprehend.

The real points of the article are Ⓐ Get a backup solution, and Ⓑ RAID is not a backup solution.

I’m honestly impressed with the level of effort you go into to maintain and defend your ignorance.

This is some next level stupidity!

Question about Backblaze. Is it actually unlimited, or is it one of those "unlimited" plans where the fine print indicates that X gigabytes is what they mean by "unlimited"? I already back everything up in quadruplicate, including off site, but it would be nice to have a cloud backup as well. But I've got roughly 20TB and counting, so I'm not sure if that would still be $5/month?

Truly unlimited! I've got somewhere around 6 TB with them and pay $5 a month.

Wait, are you saying you don't have off-site backup?

Attitude. Careful.

I was asking: is the only backup you don't have with your machine "off-site", or is "off-site backup" the only type you don't use?

“If a backup is not with me, or more specifically at the location of my computer, then where else would it be if not off-site.”

On a drive in another room or floor of your house, genius. Where it could still be affected by a fire or other disaster.

So you don’t have off-site back-up? Because you said earlier you “obviously understand what is required for a good backup system.”

You obviously don’t. Spend more time listening to people who know what they’re talking about, rather than pretending to be knowledgeable about something you’re not.

Your response explains a good deal about your lack of understanding of backup systems. A backup can be on-site yet not directly connected to your computer system (e.g. a portable drive physically stored in the home).

You don’t know the difference between on-site and off-site backup, you claim to understand what a good backup system is yet you don’t actually practice it, and you have no ability to differentiate between technical terms and simplistic definitions. There is an important technical distinction between redundancy data and backup which you continue to ignore. How shockingly foolish.

This conversation hit a brick wall (and a thick head). I’m not continuing it any further; it’s pointless.

Best of luck with your backup. You have a lot to learn.

I'd like to add that whilst a NAS is a good way to add resilience, thought is required if you want to do it well.

I strongly recommend only selecting a NAS that supports either BTRFS or ZFS. This will provide features like CoW and snapshots, which will give you a better chance of getting your file back if you accidentally overwrite it. Don't forget to ensure that the system regularly does a scrub. These terms may sound intimidating at first, but they're easy enough to understand with a few minutes reading.
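
If your storage does not offer BTRFS or ZFS scrubbing, a rough user-level stand-in is to keep a checksum manifest of your archive and re-verify it periodically. The sketch below is only that: it is not how a filesystem scrub works internally, it detects changes but cannot repair them, and the function names are made up for this example. A file you edited on purpose will also show up as changed, so it suits archives that should no longer be modified.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large RAW files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    root = Path(root)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify(root, manifest):
    """Return a list of (relative_path, problem) for missing or changed files."""
    root = Path(root)
    problems = []
    for relative_path, expected in manifest.items():
        path = root / relative_path
        if not path.is_file():
            problems.append((relative_path, "missing"))
        elif sha256_of(path) != expected:
            problems.append((relative_path, "changed"))
    return problems
```

A typical use would be to call `build_manifest` once after archiving a shoot, store the result somewhere safe (for example as JSON), and run `verify` against it every few months.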

Bit rot. A quick read on Wikipedia is enough to give you nightmares! Spinning rust (HDD), SSDs and CF cards are all vulnerable to some extent. Suffice to say that sometimes a drive can't read the data without error, even though modern drives have heroic abilities to correct data reads.

Today, most consumer drives are rated at one Unrecoverable Read Error per 10 to the 14th power bits read. That means that there is a _chance_ that you'll get a read error that can't be corrected for every 12.5 TB of data read. So with a 4 TB drive, if you read the entire drive more than 3x, there is a _chance_ you'll get a read error that the drive can't automatically correct.

What this means in practical terms is that if you have a really large array, there's a very real possibility that it will fail to rebuild when a drive fails, simply due to unrecoverable read errors. To be fair though, this is not usually an issue unless you have many TB of storage.
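
The arithmetic behind that can be sketched as follows. This is a simplified model, not a prediction: it assumes errors are independent and occur at exactly the rated rate of one per 10^14 bits, which is a worst-case spec figure that real drives often beat. The function name is made up for this example.

```python
import math


def p_unrecoverable_read_error(tb_read, bits_per_error=1e14):
    """Chance of at least one unrecoverable read error while reading tb_read TB.

    Models each bit as failing independently with probability
    1 / bits_per_error, so P = 1 - (1 - 1/bits_per_error) ** bits_read.
    log1p/expm1 keep the calculation numerically stable for tiny rates.
    """
    bits_read = tb_read * 1e12 * 8
    return -math.expm1(bits_read * math.log1p(-1.0 / bits_per_error))


# Reading one 4 TB drive, one 8 TB drive, and the 12.5 TB "rated" amount.
for tb in (4, 8, 12.5):
    print(f"{tb} TB read: about {p_unrecoverable_read_error(tb):.0%} chance")
```

Under these assumptions, reading 12.5 TB gives roughly a 63% chance of hitting at least one error rather than a certainty, and a full read of an 8 TB drive (as in a rebuild) comes in a little under 50%.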

On single drives, that's not to say that if some of your data was corrupted with a non-recoverable read error you'd be left with a useless file - If it's just one bit flipped, you probably won't notice the corruption.

Nowadays I have several backups. I have my NAS with two hard drives.
I have a backup on a second hard disk. I have a backup on an external HD and I have a copy in the cloud.
In the past, when I had to make backups on RW DVDs, I lost an entire year of pictures (all backups were corrupt) and I swore to do a better job in the future. Lately I erased all my pictures on my HD (tired, the flu, and a glass of red wine too many) because I deleted the Lightroom catalogue (yeah, daft, I know) and I was very glad to have an up-to-date backup.

I'm going to start the Router/External outline you mention above. Been a Backblaze user since last month. 20 days or so for initial backup, but sleeping better for....$5 bucks a month! Seriously!

My backup system is a little simpler than the one Alex uses.

My files are on my computer. Once a month they get backed up to an onsite external drive. The next day I bring my offsite drive to the office, back it up overnight, then take it back to its offsite location. If I put anything on my computer between the backups, I back it up onto a flash drive until my monthly backup.

My way is a little less secure than the one Alex uses, but it's worked well for the past 12 years.

Have Fun,
Jeff

PS On Facebook last week a photographer and artist posted that he had everything on a RAID drive, but a power surge ran through his house and fried both drives. He's now raising $2,000 to have a recovery company pull his data from the drives.

The other drive will die shortly