Over the years a lot has been written about deduplication (dedupe) and storage. There are people who are dedupe aficionados and there are dedupe haters. At Open vStorage we take a pragmatic approach: we use deduplication when it makes sense. When the team behind Open vStorage designed a backup storage solution 15 years ago, we developed the first CAS (Content Addressed Storage) based backup technology. Using this deduplication technology, customers required 10 times less storage for typical backup processes. As said, we use deduplication when it makes sense and that is why we have decided to disable the deduplication feature in our latest Fargo release.
What is deduplication:
Deduplication is a technique for eliminating duplicate copies of data. This is done by splitting the data into chunks and fingerprinting each unique chunk. When a duplicate chunk is found, it is replaced by a reference or pointer to the first encountered copy of that chunk. As the pointer is typically much smaller than the chunk itself, the amount of storage space needed to store the complete set of data is reduced.
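As an illustration, the idea can be sketched in a few lines of Python. This is a minimal sketch and not the Open vStorage implementation: the fixed 4 KiB chunk size, the use of SHA-256 as the fingerprint and the names dedupe and restore are all assumptions made for the example.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed chunk size in bytes, chosen for illustration


def dedupe(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Split data into fixed-size chunks and store each unique chunk once."""
    store = {}       # fingerprint -> chunk bytes (the unique chunks)
    references = []  # one fingerprint per logical chunk, in order
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).digest()
        # A duplicate chunk is not stored again; only its reference is kept.
        store.setdefault(fingerprint, chunk)
        references.append(fingerprint)
    return store, references


def restore(store, references) -> bytes:
    """Rebuild the original data by following the references."""
    return b"".join(store[fp] for fp in references)


# Four logical chunks, of which only two are unique.
data = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
store, references = dedupe(data)
unique_chunks = len(store)
logical_chunks = len(references)
```

In this toy input, four logical chunks are backed by two stored chunks, and restore(store, references) returns the original bytes.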
The Good, the Bad, the Ugly
Deduplication can be a real lifesaver when you need to store a lot of data on a small device. The deduplication ratio, the amount of storage reduction, can be quite substantial when there are many identical chunks of data (think of the same OS) and when the chunks are a couple of orders of magnitude larger than the pointer/fingerprint.
Deduplication can be CPU intensive. It requires fingerprinting each chunk of data, and fingerprinting (calculating a hash) is an expensive operation for the CPU. This performance penalty introduces additional latency in the IO write path.
The bigger the chunk, the less likely it is that chunks will be duplicates, as even a change of a single bit means the chunks are no longer identical. But the smaller the chunks, the smaller the ratio between the chunk size and the fingerprint size. As a consequence, the memory footprint for storing the fingerprints can become large when a lot of data needs to be stored and the chunk size is small. This is especially an issue in large scale environments, as the hash table in which the fingerprints are stored can be too big to fit in memory.
Another issue is that the hash table might get corrupted, which basically means your whole storage system is corrupted: the data is still on disk, but you have lost the map of where every chunk is stored.
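To get a feeling for this trade-off, the fingerprint footprint can be estimated with simple arithmetic. The sketch below assumes a 128-bit (16-byte) fingerprint per chunk and a 100 TiB data set, both picked only for illustration, and it ignores the overhead of the hash table itself and any savings from duplicate chunks. The function name fingerprint_table_bytes is hypothetical.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4


def fingerprint_table_bytes(data_bytes: int, chunk_bytes: int,
                            fingerprint_bytes: int = 16) -> int:
    """Worst-case memory for one fingerprint per chunk (no duplicates),
    ignoring any hash table overhead."""
    chunks = -(-data_bytes // chunk_bytes)  # ceiling division
    return chunks * fingerprint_bytes


data_size = 100 * TIB  # assumed data set size
for chunk_size in (4 * KIB, 64 * KIB, 1024 * KIB):
    footprint = fingerprint_table_bytes(data_size, chunk_size)
    print(f"chunk {chunk_size // KIB:>5} KiB -> "
          f"{footprint / GIB:,.2f} GiB of fingerprints")
```

Under these assumptions, 4 KiB chunks need about 400 GiB for the fingerprints alone, while 1 MiB chunks need under 2 GiB but are far less likely to find duplicates.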
Block storage reality
It is obvious that deduplication only makes sense in case the data to be stored contains many duplicate chunks. Today’s applications already have deduplication built-in at the application level or generate blocks which can’t be deduped. Hence enabling deduplication introduces a performance penalty (additional IO latency, heavier CPU usage, …) without any significant space savings.
Deduplication also made sense when SSDs were small and expensive compared with traditional SATA drives. By using deduplication it was possible to store more data on the SSD while the penalty of the deduplication overhead was still small. With the latest generation of NVMe drives both arguments have disappeared. The capacity of NVMe drives is almost on par with SATA drives and the cost has decreased significantly. The latency of these devices is also extremely low, which puts it in the same range as the overhead introduced by deduplication. The penalty of deduplication is simply too big when using NVMe.
At Open vStorage we try to make the fastest possible distributed block storage solution. To keep the performance consistently fast, it is essential that the metadata fits completely in RAM. Every time we need to go to an SSD for metadata, the performance drops significantly. With deduplication enabled, the metadata per LBA entry consisted of 8 bits for the SCO and offset and 128 bits for the hash. Hence by eliminating deduplication we can store 16 times more metadata entries in the same amount of RAM. Or in our case, we can address a storage pool which is 16 times bigger with the same performance as with deduplication enabled.
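The arithmetic behind that factor can be checked quickly. The sketch below only uses the two sizes given above; the 1 GiB RAM budget and the helper entries_in_ram are assumptions for the example.

```python
LOCATION_BITS = 8   # SCO and offset, as stated in the text
HASH_BITS = 128     # fingerprint kept per entry when deduplication is enabled

bits_with_dedupe = LOCATION_BITS + HASH_BITS   # 136 bits per LBA entry
bits_without_dedupe = LOCATION_BITS            # 8 bits per LBA entry

# Entries that fit in the same RAM: 17 times as many, i.e. 16 times more.
ratio = bits_with_dedupe / bits_without_dedupe
extra = ratio - 1


def entries_in_ram(ram_bytes: int, bits_per_entry: int) -> int:
    """Number of whole metadata entries that fit in ram_bytes."""
    return ram_bytes * 8 // bits_per_entry


ram = 1024 ** 3  # assumed 1 GiB metadata budget
print(entries_in_ram(ram, bits_with_dedupe), "entries with deduplication")
print(entries_in_ram(ram, bits_without_dedupe), "entries without deduplication")
```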
One final remark: Open vStorage still uses deduplication when a clone is made from a volume. The clone and its parent share the data up to the point at which the volume is cloned, and only the changes to the cloned volume are stored on the backend. This can be achieved easily and inexpensively with the same 8 bits, as the clone and its parent share the same SCOs and offsets.
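Conceptually, this kind of sharing needs no fingerprints at all. The following is a simplified sketch, not the actual Open vStorage code: the Backend and Volume classes, the single SCO numbered 0 and the in-memory lists are all hypothetical and only illustrate how a clone can reuse its parent's (SCO, offset) locations.

```python
class Backend:
    """Hypothetical append-only backend; a location is a (sco, offset) pair."""

    def __init__(self):
        self.blocks = []

    def append(self, data: bytes):
        self.blocks.append(data)
        return (0, len(self.blocks) - 1)  # a single SCO (0) keeps the sketch short

    def read(self, location) -> bytes:
        _sco, offset = location
        return self.blocks[offset]


class Volume:
    def __init__(self, backend: Backend, lba_map=None):
        self.backend = backend
        # LBA -> (sco, offset). A clone starts from a copy of the parent's
        # map, so it points at the same stored data without copying it.
        self.lba_map = dict(lba_map or {})

    def write(self, lba: int, data: bytes):
        # New data is appended; only this volume's map entry changes.
        self.lba_map[lba] = self.backend.append(data)

    def read(self, lba: int) -> bytes:
        return self.backend.read(self.lba_map[lba])

    def clone(self) -> "Volume":
        return Volume(self.backend, self.lba_map)


backend = Backend()
parent = Volume(backend)
parent.write(0, b"base-0")
parent.write(1, b"base-1")

child = parent.clone()
child.write(1, b"changed")  # only this change adds data to the backend
```

After the clone, both volumes resolve LBA 0 to the same location, and the backend holds three blocks instead of four.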