Why you should or should not dedupe

Deduplication is the technique of eliminating duplicate data in storage. Deduplication systems are widely used in backup storage. Their achievable ratios are readily quoted as 20:1, which is partly true, but you are safer assuming 10:1 on average. In other words, at 10:1 you can fit roughly 200 terabytes of backup data on 20 terabytes of backup storage, because the data is deduplicated. Wonderful, because it saves hardware costs.
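To make the arithmetic concrete, here is a minimal Python sketch of how the ratio translates into capacity. The function names are hypothetical and chosen for illustration only; the ratios are the ones mentioned above plus a more pessimistic 5:1, and the real ratio always depends on your data.

```python
def logical_capacity_tb(physical_tb: float, dedup_ratio: float) -> float:
    """Amount of backup data (before dedup) that fits on the given disk capacity."""
    if dedup_ratio < 1:
        raise ValueError("dedup ratio must be at least 1 (1:1 means no savings)")
    return physical_tb * dedup_ratio


def physical_needed_tb(logical_tb: float, dedup_ratio: float) -> float:
    """Disk capacity needed to hold the given amount of backup data."""
    if dedup_ratio < 1:
        raise ValueError("dedup ratio must be at least 1 (1:1 means no savings)")
    return logical_tb / dedup_ratio


if __name__ == "__main__":
    # Compare the optimistic, the cautious, and a pessimistic assumption.
    for ratio in (20, 10, 5):
        needed = physical_needed_tb(200, ratio)
        print(f"{ratio}:1 -> 200 TB of backups needs {needed:.0f} TB of disk")
```

Sizing on the optimistic ratio and then achieving only half of it means you need twice the disk you bought, which is why the cautious figure is the safer one to plan with.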

Apples and oranges

At the same time, comparing the costs of deduplicated and non-deduplicated systems quickly becomes a matter of apples and oranges. To begin with, the way the required amount of storage is calculated is a bit strange: if you need 200 TB of storage, you can choose a system with 20 TB and deduplication. There are some snags, however. The main catch is this: you write your backup to a dedup system hoping that you will never have to do a restore, but knowing that one day it might be necessary. After all, that’s why you’re backing up. However, a restore from a dedup system can be many times slower than a restore from a system that does not deduplicate. On a dedup system, the data must be pieced together from different places, which is a lengthy process. And that happens exactly when you want your system back up and running quickly.

Blockwise

Another consideration is that deduplication systems deduplicate data block by block. References to the individual blocks are maintained in a database. If an error creeps into that database, the entire backup is effectively worthless: without the dedup database, you won’t get your data back. If even a single reference is wrong, every file containing that block is no longer readable.
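The following Python sketch illustrates the principle. It is a toy model and not how any particular product works: the class name DedupStore, the fixed 4-byte block size, and the use of SHA-256 as the block reference are all assumptions made for illustration. Real systems use much larger blocks and often variable-size chunking.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


class DedupStore:
    """Toy block-level dedup store (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = {}  # block reference (hash) -> block data, stored once
        self.files = {}   # file name -> ordered list of block references

    def write(self, name: str, data: bytes) -> None:
        refs = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            ref = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(ref, block)  # duplicates are not stored again
            refs.append(ref)
        self.files[name] = refs

    def read(self, name: str) -> bytes:
        # A restore pieces the file together block by block via the references.
        return b"".join(self.blocks[ref] for ref in self.files[name])


if __name__ == "__main__":
    store = DedupStore()
    store.write("a.txt", b"AAAABBBB")
    store.write("b.txt", b"AAAACCCC")
    print(len(store.blocks), "unique blocks stored for 4 blocks written")

    # Simulate one damaged entry: the block shared by both files goes missing.
    shared = hashlib.sha256(b"AAAA").hexdigest()
    del store.blocks[shared]
    for name in ("a.txt", "b.txt"):
        try:
            store.read(name)
        except KeyError:
            print(name, "can no longer be restored")
```

In this sketch a single lost reference makes both files unreadable, because both depend on the same stored block. The same loop in read also hints at why restores are slower: every file has to be reassembled from blocks that may live in different places.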

Saving on your storage

Of course, we all want to save on the cost of backup storage. It is worth considering, though, whether you would rather use the deduplication built into your backup software. Normally that method is safer, but also less effective. If you want to save money on storage hardware by needing less capacity, carefully weigh the pros and cons of hardware-based deduplication on the storage against deduplication in the backup software.

Want to know more?

Please feel free to contact us with questions and comments. We are happy to give you all the information!
