Okay, so I know a lot of readers may disagree with me on this and have different opinions. My writing here isn't intended to cause any offense, but rather to offer an opinion and start a discussion. My perspective comes from over 30 years in the IT industry and from working with creatives in the photographic industry. Hopefully, after reading this you'll come away with an understanding that helps your next (or first) computer storage purchase.
The Cheapest Way
The cheapest and most convenient (but least reliable) way to store your images is on the internal drive of the computer you edit on. There's no need to buy any extra equipment, and you might think this is a good place to start. On import from a camera or memory card, an editing program such as Adobe's Lightroom copies your images to a location on your hard drive. This is an incredibly bad way to store your images, and I'm now going to set out my reasons why.
Your internal drive contains the files needed to run your operating system and the programs you use. An internal hard drive slows down as it fills with data, which in turn slows your computer's access to the files it needs. Think of the files as books in a library: isn't it easier to find the book you want in a small library than in one with a million books? Ideally, you shouldn't fill your computer more than half full of data.
- Access to the files. Unless you're using a remote access program such as TeamViewer or Google Remote Desktop, or a cloud-based system like Dropbox or Google Drive, you won't be able to access the contents of the computer while away from the office, home, or place of work. Getting at these files can be problematic, and you'd normally have to log on to your computer to reach files on your desktop.
- Failure of the computer system. If you have a problem with the computer, getting the files back can be a pain, but unless it's a catastrophic hard drive failure it can generally be done. It might mean taking the hard drive out of the computer and inserting it in a caddy or another computer, or getting the computer repaired first.
- Archiving the files. Storing the files on the internal drive doesn't really give you the option to archive the files and store them for future reference in an alternate location.
The Most Expensive Way (Long Term)
External hard drives are cheap and convenient. Products such as the WD 2TB Elements Portable USB 3.0 External Hard Drive are inexpensive and portable. But even buying rugged drives such as the LaCie 2TB Rugged Mini USB 3.0 External Hard Drive doesn't protect you completely.
- Small drives are easy to misplace, or to have stolen if you're carrying them around with you. Ones that aren't rugged are likely to break when dropped.
- The data can only be accessed when the drive is connected to a computer or plugged into a device that can share its contents. For example, some routers let you plug in a USB drive and share it across the network.
- Single drives can fill up quite quickly, even if you're buying larger ones. Photo file sizes increase over the years, and the longer you keep a drive plugged in and working, the greater the chance of failure. If you go down this route, look at smaller drives that you replace every year or whenever they fill up.
- There's zero redundancy. Once a drive fails, you're looking at a hefty bill for data recovery or, worse still, lost data.
My Recommended Solution
A Network Attached Storage array such as the Synology DiskStation DS1821+ 8-Bay NAS Enclosure is definitely the way to go. This particular model is the one I personally own; it has 8 bays and allows a variety of RAID configurations. A RAID (Redundant Array of Independent Disks) configuration spreads files over multiple drives, allowing for redundancy should one of the drives fail. RAID comes in two types and multiple configurations. The types are hardware RAID and software RAID; hardware RAID uses a physical controller chip and is, in general, a little more reliable than software RAID. The most common RAID configurations are 1 and 5. The DS1821+ also has the advantage of including SHR (Synology Hybrid RAID), which lets you build your array from drives of different sizes, whereas standard RAID limits each drive's usable capacity to that of the smallest drive in the array. SHR comes in two configurations: SHR offers single-drive redundancy comparable to RAID 5, and SHR-2 offers two-drive redundancy comparable to RAID 6.
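As a quick sanity check on capacity (the disk count and sizes below are hypothetical examples, not from the article), single-parity schemes like RAID 5 or SHR give you the capacity of all drives minus one, and dual-parity schemes like RAID 6 or SHR-2 all drives minus two:

```shell
#!/bin/sh
# Usable capacity with equal-sized disks. The 4 x 8TB example is
# hypothetical; plug in your own numbers.
disks=4
disk_tb=8
echo "RAID 5 usable: $(( (disks - 1) * disk_tb )) TB"   # 24 TB (8 TB parity)
echo "RAID 6 usable: $(( (disks - 2) * disk_tb )) TB"   # 16 TB (16 TB parity)
```

With mixed drive sizes the arithmetic gets more involved, which is exactly what Synology's RAID calculator (linked at the end of the article) works out for you.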
The system is incredibly easy to set up thanks to Synology's included software, which is very self-explanatory, making your first array straightforward to build. You can start your SHR array with just 3 drives and then grow it by adding more drives at a later date. This is one of the reasons for buying the 8-bay model: expansion is just a matter of inserting extra drives, then clicking into the software and letting it rebuild.
Speed is also a factor with this device, which has 4 network ports on the back, allowing you to combine them into a link aggregation bond with up to 4 times the aggregate bandwidth of a single cable (note that any single transfer is still limited to one link's speed). If you connect your computer to your switch or router via an Ethernet cable, editing directly from the drive is entirely possible.
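To put the bonding numbers in perspective, here is a rough back-of-the-envelope sketch (theoretical line rates; real throughput will be lower due to protocol overhead):

```shell
#!/bin/sh
# Theoretical aggregate bandwidth of a 4-port 1GbE bond.
# 1 Gbps ~= 1000 Mbps; divide by 8 to get MB/s.
ports=4
gbps_per_port=1
aggregate=$(( ports * gbps_per_port ))
echo "${aggregate} Gbps aggregate ~= $(( aggregate * 1000 / 8 )) MB/s (theoretical)"
```

That theoretical ~500 MB/s is shared across all clients of the NAS, which is why bonding helps a busy household or studio more than it helps one editor on one machine.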
Another reason this is an ideal device is that you can access the contents of your share from your phone via an app created by Synology, available on the Play Store and the App Store. In fact, they have numerous apps that let you do multiple things, including viewing your photos remotely.
Lastly, this device allows connection via the included Cloud Sync app to multiple online storage providers for cloud-based offsite backup. It's easy to synchronize different folders to different cloud services.
Hopefully, this guide gives you an alternative option if you're considering your backup and editing solution. Synology has a RAID calculator on their website that will give you an idea of the type of array you can create with different-sized disks: https://www.synology.com/en-uk/support/RAID_calculator
I've been running a NAS for my photo/video work for years. I use a QNAP 9-bay with a 10Gb SFP+ direct connection to my desktop, as well as a 1Gb connection to my router. My NAS is only accessible from the outside world over OpenVPN. I run firewalls on my router and NAS. With five 12TB drives in RAID 5 and four SSDs configured as a RAID 0 read cache, I have enough speed to edit 6K ProRes RAW video directly from my NAS. Backup jobs write out to a USB-C connected RAID array.
Finished edits get uploaded to Google Drive.
Thanks for your comment, Lux Shots. This is an awesome way to work, and the Synology can certainly handle it.
Similar to my setup.
A backup is only as good as the last recovery test.
Absolutely, I agree. We ask so many people, "When did you last test your backup?", and the answer is usually NEVER!
I have the Synology DiskStation DS1621+ 6-Bay NAS Enclosure (4-core 2.2GHz, 4GB RAM, no HDDs included), filled with almost 1K TB of storage. I did a ton of research on this system and, most importantly, on what kind of drives to get. Synology publishes documentation on the drives they recommend for the different systems. Plus, their customer service is worth every single penny. They can connect to Backblaze, which is a huge plus. I've also had a PhotoShelter account since 2005. Not sure what's going on with those guys, so I'm moving everything to my NAS. I tell my friends and family never to keep files or images only on their local drives; instead, at the very least, have one large external drive and use Google for cloud storage.
After reading through all the ~50 comments here, and a previous article here concerning multi-level storage, it is apparent that a lot of people are taking things for granted instead of testing, testing, and testing each and every part of their data infrastructure.
If you are using a NAS of any form and it is running on a 1Gbps LAN, you are wasting your valuable time. The only exception is if the NAS is used solely for onsite backup.
If you want a NAS for all your storage needs, then you should look at an end-to-end 10Gbps LAN. Client cards for workstations are ~$100, or 10Gb/2.5Gb may already be on the motherboard. On the NAS side, look at an SFP+ fiber solution, as they are slightly faster, offer lower latency, and are more energy efficient. 8-12 port 1/2.5/5/10Gbps/SFP+ switches can be picked up for less than $500; e.g. QNAP offers a couple of SMB switches that just work (I use one here).
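As a rough illustration of what the jump to 10GbE buys you (the 500 GB library size and the 80% link-efficiency figure are assumptions of mine, not from the comment):

```shell
#!/bin/sh
# Approximate transfer time for a 500 GB photo library at ~80% of line rate.
size_gb=500
for gbps in 1 10; do
  mbps=$(( gbps * 1000 * 80 / 100 / 8 ))   # usable MB/s after overhead
  secs=$(( size_gb * 1000 / mbps ))        # seconds for the full library
  echo "${gbps}GbE: ~$(( secs / 60 )) minutes"
done
```

Roughly 83 minutes at 1GbE versus 8 minutes at 10GbE for the same copy, which is the difference between "run it overnight" and "run it over coffee".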
On your onsite NAS (I use a Synology DS3617xs with 10x HDD and 2x SSD for R/W cache... think more IOPS), look to use a filesystem with snapshot capability. If you set up the snapshot schedule correctly, you can do a very good job of negating ransomware attacks. Obviously the NAS is not directly attached to your Internet line... put it behind a good router with logging enabled... use remote syslog to your main server or to the NAS (Synology central rsyslog'ing)... if you are security paranoid, enable real-time analysis on the logging and send notifications via email or an HTTP push to your smartphone etc.
Onsite data integrity (stopping/minimizing bitrot) can be accomplished by checksumming all the files on your NAS, e.g.:
server1$ find /filer -type f -print0 | xargs -0 md5sum > csum-$(date +%F-%H%M).txt (or sha256sum... whatever rocks your boat... main thing is fast. Note the -print0/xargs -0: piping plain filenames into md5sum would hash the name list itself, not the files.)
Run this often as a cron/scheduler job and compare the last N runs to see if any file changes have occurred... if so, decide whether each change belongs on a whitelist or a blacklist. If you run this on the same schedule as your snapshots, you can easily select which snapshot to restore an uncorrupted file from. We have never had a bitrotted file but have heard of such cases.
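A minimal sketch of that compare step (the manifest file names are hypothetical): sort two checksum runs and print the lines that appear only in the newer one, i.e. files whose content changed or that are new:

```shell
#!/bin/sh
# Compare two md5sum manifests from successive runs.
# csum-old.txt / csum-new.txt are placeholder names for two runs' output.
sort csum-old.txt > /tmp/csum-old.sorted
sort csum-new.txt > /tmp/csum-new.sorted
# comm -13 suppresses lines unique to the old run and lines common to both,
# leaving only lines unique to the new run: changed or added files.
comm -13 /tmp/csum-old.sorted /tmp/csum-new.sorted
```

Any output is a candidate for the whitelist (an edit you made) or the blacklist (unexplained change, so restore from a snapshot).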
Something important for me over the years has been to increase the RAID volume size once it reaches 70% full. A nearly full RAID can have dog-slow performance, especially when writing data. Again, set notifications or use SNMP to proactively monitor your storage infrastructure.
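A hedged sketch of a cron-able check for that 70% threshold (the /volume1 mount point is a Synology convention used here as an example; adjust for your system):

```shell
#!/bin/sh
# Warn when a volume crosses 70% used. Requires GNU df (--output).
THRESHOLD=70
usage=$(df --output=pcent /volume1 2>/dev/null | tail -1 | tr -dc '0-9')
if [ "${usage:-0}" -ge "$THRESHOLD" ]; then
  echo "WARN: volume is ${usage}% full"
else
  echo "OK: volume is ${usage:-0}% full"
fi
```

In practice you would have the WARN branch send a notification (Synology's DSM can also do this natively via its own storage alerts, or via SNMP as the comment suggests).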
Your choice of RAID type will come down to price and performance, the number of hot standby disks, the number of cold standby disks, and the size of your HDDs/SSDs... I personally steer clear of Synology's SHR system... simple RAID 5/6 with 1-2 hot standby disks. I also do not use the newest and largest HDDs available, due to 1. price and 2. the larger the disk, the longer the RAID rebuild and the greater the chance of another disk failing during the rebuild. Synology allows you to set the rebuild speed... I always go for the fastest possible to minimize the window for a second failure during the rebuild. RAID 6 or 10 have other mitigation policies, but you will be investing in more disks to match a RAID 5 volume's net capacity. A balanced RAID is a fast RAID, so always buy the same disks in bulk (same capacity, model, and firmware). Once disk runtime reaches 50K hours, we monitor the disks more often. No disk we have used (99% WD Red) has lived past 90K hours under the workloads we run (bespoke social media and image databases).
Once you have a stable onsite NAS, you need to consider a DR (Disaster Recovery) concept. After looking at numerous offsite/cloud solutions, we went with another Synology at a location close to our primary site, with immediate access so it can be fetched and plugged in within 45 minutes. A set of bespoke scripts backs up various filesystem trees and various content based on a set of rules, then calls and controls rsync and/or rclone to do the heavy lifting. Make sure data is actually being written to your offsite system, e.g. send an email report, or mount the offsite filesystem on your onsite system via a VPN tunnel or something like SSHFS.
We sometimes release some of our admin scripts into the wild :-) so you may find something here: https://lightaffaire.com/code
Summary: test, test, and retest each stage of your storage infrastructure.
Have fun!
"Data Infrastructure"? Seriously? LOL
You're overcomplicating things. All I gotta do is copy the images that are worth saving to a few external hard drives. No need for any kind of "infrastructure" or system or anything like that. There are no "stages" to test. Just simple export instructions to send copies from Photos to the hard drive.
You should try simplifying what you do to back up your pics - you will find that you don't have to get so complicated and involved and high falutin. All it is is making copies to save in another location in case your computer dies or your house burns down. No need for anything staged and infrastructured. LOL
If you want to be a "cog" in your backup process, then you must have more time than me. A lot more time.
"Just simple export instructions to send copies from Photos to the hard drive."... so even if i wanted to be a "Cog" in my backup system where can I purchase a 110TB hard drive ? 2.5 inch prefered but in a pinch I would take a 3.5 inch drive. Do I need 2x 110TB drives or am I going to be safe with a single point of failure containing 30 years of work?
Seriously: if you feel you have a simple backup system and it works ok for you then good for you.
My personal view is: remove the human element, test every new part of the backup system, and monitor the running infrastructure. In my original post I forgot to list the most basic question of all, which drives nearly all the decisions you will want to, or have to, make: "How much is your content really worth?"
PS: I spent years working at Network Appliance, which basically came up with NAS back in the '90s (yes, Auspex Systems were first but fell away as the market started to heat up).
Did he just say copy the files to a few external backups? How is that easier than automating the process of backing up to the cloud from a secure NAS? Removing the human element removes the potential for more problems to occur.
What we have here is a "know-it-all Larry" who needs to feel loved... nothing less and nothing more.
Nothing says I love you more than a solid concern for someone else's backup process.
Funny how such wise ass remarks are always posted via profiles with zero images...
Interesting that nobody mentioned Syncthing (https://syncthing.net/) - it is open source and, for me, bulletproof. I have a small fanless Beelink PC with a big HDD attached, and it sits in my parents' house. It has been working non-stop for 4 years; I log into it from time to time via TeamViewer just to check on it. It has been really bulletproof so far. It auto-restarts if the power goes out. The PC is tiny (about the size of a 2.5" HDD), I took out the fan, and it runs Win10. I could do the same thing with an Intel Stick PC. The magic comes from Syncthing.
Thank you for the tip. I was looking for a solution for an old win10 workstation and this may fit the bill.
I could not agree more about Syncthing; I tested it for 9 months and have been using it in commercial production for the last 3 months. Only a few errors, and they were easily resolved. I also have snapshots of the dataset enabled, so if there is a big problem, I can easily roll back.
I probably should have mentioned it in my earlier post, though I noted the article was related more to storage than synchronisation of files.
I don't see how local network storage is any better than directly attached storage, unless one's storage requirements are super high, or unless you genuinely need shared storage.
You can get a 16TB disk far cheaper than the cheapest NAS.
Some here are advocating using multiple RAID drives to cheaply increase NAS storage. Keep in mind that the mean time between failure (MTBF) goes down with the product of the number of drives used. Sure, if you're lucky, losing one drive will mean you still have 3/4ths of your images on a 4-drive RAID. But in my experience, I have NEVER been able to recover a partial RAID after failure of one of its drives!
Another consideration: how often do you actually have that "hot box" in use? My 16TB directly-attached backup drive sleeps with the computer. I assume the NAS is running all the time, unless you take action to shut it down or sleep it. That increases wasted, needless power drain and reduces the long-term life of the drives — they are "consuming" MTBF whenever they are spinning.
My solution: I have a second SSD drive (named "Media") that is symlinked to my Photos, Music, and Movies directories. It does not hold active system software, although I keep a bootable partition on it for insurance. I have a third, large, slower drive that automagically backs up everything, via Time Machine.
The only vulnerability this has is locality: if you have a house fire, everything is destroyed… just as it would be with a local NAS! If you really need non-local security, go with the cloud!
I'm doing this on a Mac Pro, and can run everything inside. There's the "cables" argument against doing this if your internal storage won't exceed one drive — but you still need a cable to get to your NAS!
Using this strategy, I haven't lost ANYTHING in well over fifteen years.
Now that I have fetched a coffee, I can stop laughing at your error-ridden post. It should 1. definitely be ignored by others, or 2. be deleted to protect the synapse-negative poster!
"Some here are advocating using multiple RAID drives to cheaply increase NAS storage. ... Sure, if you're lucky, losing one drive will mean you still have 3/4ths of your images on a 4-drive RAID. But in my experience, I have NEVER been able to recover a partial RAID after failure of one of its drives!"
LOL... you have ZERO idea how a RAID system works... 3/4ths of the images, LOL... if you have RAID 5 running on a 4-disk system, then you will still have 100% (that's 4/4ths in your speak) of your images if one of the disks fails. That's the whole beauty of running a RAID system... data protection.
That you could not recover a "partial RAID" just shows how little you know... shocking that you even commented in this big boys' thread! It is either a running RAID or a degraded RAID... a "partial RAID" does not exist.
The rest of your post is just more "i am the man and will insert myself in the backup strategy".
The beauty of a GOOD backup strategy is that it works in the background, keeping the time since the last backup to a minimum without human intervention... leaving the human free to be more productive.
You don't seem to understand the article's points, nor the difference between RAID 1 and RAID 5.
The article assumed such knowledge, but seemed to imply that you could get more storage with RAID, emphasizing the "I" in the acronym, as opposed to getting more reliability, emphasizing the "R" in the acronym.
In short, going for more storage, using RAID 0 or 5, means your MTBF (that's engineer-speak for "Mean Time Between Failures") is roughly t/n, where t is a single drive's MTBF and n is the number of striped drives. And if the software or hardware has a "glitch," you might not be able to recover from the working volumes when even one goes down, and the spec more-or-less says you can't recover from two or more drives going down.
It's too bad you feel the need to attack rather than just explain — or better yet, ask questions.
I was an engineer for Tandem Computers. We implemented the equivalent of hardware RAID in the 1980s, well before the current RAID spec was even a gleam in some engineer's eye. I am well aware of the nuances, and the hidden pitfalls, of RAID. But rather than ask me for details about my bad experiences with RAID, you just made unwarranted assumptions and attacked, a sure sign of insecurity! :-)
So have a good day, but please know I won't be reading you again. Life is too short for abusive, disagreeable people!
Yeah right... and I spent years at Network Appliance... you may have heard of them... so I kind of know a little bit about RAID... just a little bit.
I stand 100% by my text above... the tech is solid... it has saved my data numerous times... and I hope no one reading this thread thinks your Mickey Mouse strategy towards data backup is the way to go.