Hybrid cloud, the phoenix of cloud computing


Hybrid cloud, an integration of on-premises private clouds and public clouds, has been declared dead many times over the past few years, but like a phoenix it keeps resurrecting in the yearly IT technology and industry forecasts.

Limitations, hurdles and issues

Let’s first have a look at the numerous reasons why the hybrid cloud computing trend hasn’t taken off (yet):

  • Network limitations: connecting to a public cloud was often cumbersome, as it required all traffic to travel over slow, high-latency public internet links.
  • Storage hurdles: implementing a hybrid cloud approach means storing data multiple times and keeping those copies in sync.
  • Integration complexity: each cloud, whether private or public, has its own interface and standards, which makes integration unnecessarily difficult and complex.
  • Legacy IT: existing on-premises infrastructure is a reality and holds back a move to the public cloud. Beyond the infrastructure itself, applications were not built or designed so that you can scale them up and down, nor were they designed to store their data in an object store.

Taking the above into account, it shouldn’t come as a surprise that many enterprises saw public cloud computing as a check-in at Hotel California. The technical difficulties and the cost and risk of moving back and forth between clouds were simply too great. But times are changing. According to McKinsey & Company, a leading management consulting firm, enterprises plan to transition IT workloads to a hybrid cloud infrastructure at a significant rate and pace over the next three years.

Hybrid cloud (finally) taking off

I see a couple of reasons why the hybrid cloud approach is finally taking off:

Edge computing use case
Smart ‘devices’ such as self-driving cars produce such large amounts of data that they can’t rely on public clouds to process it all. The data sometimes even drives real-time decisions, where latency might be the difference between life and death. Naturally, this pushes computing power to the edges of the network. This edge (or fog) computing concept is a textbook example of a hybrid cloud, where on-site, or should we call it on-board, computing and centralized computing are grouped together into a single solution.

The network limitations are removed
The network limitations have been removed by services like AWS Direct Connect, which gives you a dedicated network connection from your premises to the Amazon cloud. All big cloud providers now offer the option of a dedicated network into their cloud. Pricing for dedicated 10GbE links in metropolitan regions like New York has also dropped significantly. For under $1,000 a month you can now get a sub-millisecond fibre connection from most buildings in New York to one of the city’s many data centers.

Recovery realisation
More and more enterprises with a private cloud realise the need for a disaster recovery plan.
In the past this meant getting a second private cloud, an approach that at least doubles the TCO, as twice the amount of hardware needs to be purchased, while keeping both private clouds in sync only makes the disaster recovery plan more complex. Enterprises are now turning disaster recovery from a cost into an asset: they use cheap public cloud storage for their off-site backups and copies, and by adding compute capacity in peak periods or when disaster strikes, they can bring these off-site copies online when needed. On top of that, additional business analytics can use these off-site copies without impacting the production workloads.

Cloud standards have crystallized
Over the past years, standards in cloud computing have crystallized. In the public cloud, Amazon has set the standard for storing unstructured data. On the private infrastructure side, the OpenStack ecosystem has made significant progress in streamlining and standardizing how complete clouds are deployed. Enterprises such as Cisco are now focusing on new services to manage and orchestrate clouds in order to smooth out the last bumps in the migration between different clouds.

Storage & legacy hardware: the problem children

Based upon the previous paragraphs, one might conclude that all obstacles to a move to the hybrid model have been cleared. This isn’t the case, as two issues still remain:

The legacy hardware problem
All current public cloud computing solutions ignore the reality that enterprises have a hardware legacy. While starting from scratch is the easiest solution, it is definitely not the cheapest. For the hybrid cloud to be successful, existing hardware must in some shape or form be able to be integrated into the hybrid cloud.

Storage roadblocks remain
If you want to make use of multiple cloud solutions, the only option is to store a copy of each bit of data in each cloud. This x-way replication scheme solves the issue of data being available in all cloud locations, but at a high cost. Next to the storage cost, replication also adds significant latency, as writes can only be acknowledged once all locations are up to date. This means that with replication, hybrid clouds which span the east and west coasts of the US are not workable.

Open vStorage removes those last obstacles

Open vStorage, a software-based storage solution, allows multi-datacenter block storage in a much more nimble and cost-effective way than any traditional solution. This removes the last roadblocks towards hybrid cloud adoption.
Solving the storage puzzle
Instead of x-way replication, Open vStorage uses a different approach, which can be compared to solving a Sudoku puzzle. All data is chopped up into chunks, and some additional parity chunks are added. All these chunks, data and parity alike, are distributed across all the nodes, datacenters and clouds in the cluster. The number of parity chunks is configurable and allows, for example, recovery from a multi-node failure or the loss of a complete data center. A failure, whether of a disk, a node or a data center, will cross out some numbers from the Sudoku puzzle, but as long as enough numbers are left, you can still solve it. The same goes for data stored with Open vStorage: as long as enough chunks (disks, nodes, data centers or clouds) are left, you can always recover the data.
Unlike x-way replication, where a write is only acknowledged once all copies are stored safely, Open vStorage can acknowledge writes before every chunk has reached disk. This makes sure that a single slow disk, datacenter or cloud doesn’t hold back applications and incoming writes. This approach lowers the write latency while keeping data safety at a high level.

Legacy hardware
Open vStorage also makes it possible to include legacy storage hardware. As a software-based storage solution, Open vStorage can turn any x86 hardware into a piece of the hybrid storage cloud.
It leverages the capabilities of new media technologies like SSDs and PCIe flash, but also those of older technologies like large-capacity traditional SATA drives. For applications that need above-par performance, additional SSDs and PCIe flash cards can be added.


Hybrid cloud has long been a model chased by many enterprises without luck. Issues such as network and storage limitations and integration complexity have been major roadblocks on the hybrid cloud path. Over the last few years many of these roadblocks have been removed, but issues with storage and legacy hardware remained. Open vStorage overcomes these last obstacles and paves the path towards hybrid cloud adoption.

Location, time based or magical storage?

Storage comes in many forms, and over the years multiple strategies to store and retrieve data on disk have been implemented. In Open vStorage terms, an I/O is a write or read operation on an LBA (Logical Block Address) of a Virtual Machine. Let’s first take a theoretical look at the three most important strategies to store data, their benefits and their drawbacks:

  • Location-based storage:
    Location-based storage stores the exact location where the data is placed: for each address, the metadata records the exact location in the storage system where the actual value is stored. The advantage of this strategy is that reads are very fast, as you know the exact location of the data even if the data changes frequently. The drawback is that you have no history: when an address gets overwritten, the location of the old value is lost, as the address now points to the location of the new data. You can find this strategy in most storage solutions, such as SANs.
  • Time-based storage:
    Time-based storage uses time to identify when data was written. The easiest way to achieve this is a log-type approach: all data writes are appended to a log as a sequence. Whenever you have new data to write, instead of finding a suitable location, you simply append it to the end of the log. The advantage is that you always have the complete history of the volume (all writes are appended), and snapshots are very easy to implement by closing the log and starting a new log file. The drawback is that after a while data gets spread across different log files (snapshots), so reads become slower. To find the latest value written for an address, you need to go from the last log file back to the first to identify the last time the address was written, which can be very time-consuming if the data was written long ago. A second problem is that the always-append strategy can’t be followed indefinitely: a garbage collection process must reclaim the space of data which is no longer needed, for example because it has fallen out of the retention period.
  • Content-addressable storage (CAS):
    With CAS, each write gets labeled with an identifier, in most cases a hash calculated from the content of the stored data. The hash and the data are stored as a key/value pair in the storage system. To find data back in a CAS system, you look up the hash in the metadata and consult the hash table; when the hash matches, the data can be found behind that hash key. One of the reasons hashing is used is to make sure objects are only stored once, so this strategy is often found in storage solutions which offer deduplication. But CAS can’t efficiently store writes or large amounts of data, and is only usable when data doesn’t change frequently, as keeping the hashes sorted requires some overhead. That is why it is mostly used in caching strategies, where the data doesn’t change very often.
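To make the contrast concrete, the location-based and time-based strategies above can be sketched in a few lines of Python. This is purely illustrative, not Open vStorage code; the class and method names are invented for this example.

```python
class LocationBased:
    """Address -> exact location of the latest value. Fast reads, no history."""
    def __init__(self):
        self.store = {}                 # address -> latest value only

    def write(self, addr, value):
        self.store[addr] = value        # the old value's location is lost

    def read(self, addr):
        return self.store[addr]         # O(1) lookup

class TimeBased:
    """Append-only log of writes. Full history, but reads must scan."""
    def __init__(self):
        self.log = []                   # append-only (addr, value) sequence

    def write(self, addr, value):
        self.log.append((addr, value))  # never overwrite, just append

    def read(self, addr):
        # scan newest -> oldest for the last write to this address
        for a, v in reversed(self.log):
            if a == addr:
                return v
        raise KeyError(addr)

    def history(self, addr):
        return [v for a, v in self.log if a == addr]

tb = TimeBased()
tb.write(0, b"v1")
tb.write(0, b"v2")                      # overwrite: old value stays in the log
assert tb.read(0) == b"v2"
assert tb.history(0) == [b"v1", b"v2"]
```

The `history` method shows why snapshots fall out of the time-based design for free, and the backwards scan in `read` shows why reads on cold, old data get slower over time.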

So far the theory, but which strategy does Open vStorage use? When we designed Open vStorage, we wanted storage with great read and write performance that also gives us a history of the volumes, so that easy snapshots, zero-copy cloning and other features come out of the box. Picking a single one of the above strategies was not an option, as each has benefits but, more importantly, drawbacks. That is why Open vStorage combines all of them: the benefits of one strategy are used to counterbalance the drawbacks of another. To achieve great performance, Open vStorage uses the SSDs or PCIe flash cards inside the host for caching. The read cache is implemented as CAS, as this offers deduplication and great performance for frequently consulted data. The write cache is implemented using a location-based approach. The storage backend is implemented using a time-based approach, aggregating writes which occurred together in time. This approach gives us features like unlimited zero-copy snapshots, cloning, and easy replication.


Before we can start with a deep dive, we need to explain how the basic write transaction is implemented. Open vStorage uses a time-aggregated, log-based approach for all writes. When a write is received, the 4k block is appended to a file, a Storage Container Object (SCO). As soon as the SCO reaches 4MB, a new file is created. Meanwhile, for every write, a transaction log (TLOG) records the address, the location (a combination of the SCO name and the offset within the SCO) and a hash of the data. These SCOs and TLOGs are stored on the storage backend once they are no longer required on the SSD inside the host.
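The write path just described can be sketched as follows. This is an illustrative Python model, not the actual implementation: in-memory buffers stand in for files, and SHA-1 is an assumed choice of hash.

```python
import hashlib

BLOCK = 4 * 1024                        # 4k blocks
SCO_LIMIT = 4 * 1024 * 1024             # roll over to a new SCO at 4MB

class VolumeWriter:
    def __init__(self):
        self.sco_num = 0
        self.sco = bytearray()          # current Storage Container Object
        self.tlog = []                  # transaction log (TLOG) entries

    def write(self, address: int, block: bytes):
        assert len(block) == BLOCK
        if len(self.sco) >= SCO_LIMIT:  # SCO full: start a new file
            self.sco_num += 1
            self.sco = bytearray()
        offset = len(self.sco)
        self.sco += block               # append-only: random I/O -> sequential
        self.tlog.append({              # record address, location and hash
            "address": address,
            "sco": f"sco_{self.sco_num}",
            "offset": offset,
            "hash": hashlib.sha1(block).hexdigest(),
        })

w = VolumeWriter()
w.write(0, b"\0" * BLOCK)
w.write(1, b"\1" * BLOCK)
assert w.tlog[1]["offset"] == BLOCK     # second block sits right after the first
```

Note how overwriting an address never touches earlier data: a new entry is simply appended to the SCO and the TLOG, which is what makes the history-based features possible.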


Let’s now put everything together …

  • Location-based storage
    The write caching works as a transaction-log-based cache on fast, redundant flash or SSD storage. In this transaction log we store the address, the location and a hash. The actual write cache is accomplished by filling up Storage Container Objects (SCOs), files containing a sequence of 4k blocks, which turns any random write I/O pattern into a sequential write operation. During each write, the address of the 4k block, the hash, the SCO number and the offset are stored as metadata in the metadata lookup database. As the metadata contains the exact location of the data for an address, the SCO and its offset within the SCO, this is clearly a location-based approach. But why do we also store a hash?
  • Content-addressable storage
    When a read request comes in, the Storage Router looks up the hash in the metadata, which contains the latest state of the volume for each address, and checks whether that hash is available in the read cache. This read cache is content-addressable (CAS) and stores hash/value combinations on SSD or flash storage. If the hash is present, the read request is served directly from the SSD or flash storage, resulting in very fast read I/O operations. Since most reads are served from the cache, the content of this cache doesn’t change very often, so there is no large penalty for maintaining the hash table. Moreover, hashing even lets us make better use of the SSD, as it enables content-based deduplication. If the data is not in the read cache but still in the write cache, because it was only recently written, we can still retrieve it quickly, as the metadata also stores the exact SCO it is in and the offset within that SCO.
    If the requested address is in neither the read nor the write cache, we need to go to the storage backend, which is time-based.
  • Time-based storage
    The Storage Router writes and reads data using SCOs and transaction logs when communicating with the backend. By adding writes to the SCOs in a log-structured, append-only way, data which needs to be evacuated from the write cache is pushed as an object (an SCO) to the storage backend. Next to the SCOs, the transaction logs, containing the sequence of the writes, the address, the location and offset, and the hash, are also stored on the backend. The combination of the always-append strategy and the address means we have a complete history of all writes done to the volume. The benefit of this approach is that it gives us enterprise features like zero-copy snapshots and cloning. Time-based storage also requires maintenance to compact older SCOs or clean up deleted snapshots. Because all transaction logs and SCOs are stored on the backend, maintenance tasks can be offloaded entirely from the Storage Router on the host. The Scrubber, the process that maintains the time-based storage, can work completely independently from the Storage Router, as it has access to all transaction logs and SCOs stored on the backend. Once the Scrubber has finished, it creates an updated set of transaction logs that the Storage Router uses to update the local metadata and to delete the obsolete SCOs on the backend. Thanks to the caching in the Storage Router, this maintenance work does not impact performance, as most read and write I/O requests are served from the read and write cache.
  • In the event of a disaster where the complete volume is lost, the volume can be rebuilt on another host from the storage backend alone. The only thing needed to bring the volume back to its latest state is to fetch all transaction logs from the backend and replay them, so that the metadata contains the latest location of the data for each address. When a read request comes in, only the correct SCO needs to be fetched from the backend and placed in the read cache for quick access.
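The replay step can be sketched as a small Python function. This is an illustrative model, not the actual implementation; the entry fields are invented for this example, mirroring what a TLOG entry records.

```python
def replay_tlogs(tlogs):
    """Rebuild the metadata from scratch.

    tlogs: list of TLOGs (oldest first), each a list of entries in
    write order. Replaying them leaves, for each address, the
    location of the most recent write.
    """
    metadata = {}
    for tlog in tlogs:
        for entry in tlog:
            # a later write simply overwrites the earlier location
            metadata[entry["address"]] = (entry["sco"], entry["offset"])
    return metadata

tlogs = [
    [{"address": 0, "sco": "sco_0", "offset": 0},
     {"address": 1, "sco": "sco_0", "offset": 4096}],
    [{"address": 0, "sco": "sco_1", "offset": 0}],   # address 0 rewritten
]
meta = replay_tlogs(tlogs)
assert meta[0] == ("sco_1", 0)          # latest write wins
assert meta[1] == ("sco_0", 4096)
```

Because the TLOGs are an ordered record of every write, the rebuilt metadata is identical to what the lost host held, without ever reading the SCOs themselves.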

Open vStorage uses different approaches for storing and reading data. On the frontend, on the SSDs inside the host, a content-based read cache offers performance and deduplication across volumes. The write cache makes sure that data is quickly written in an always-append mode; a location-based cache is used here, so a miss in the read cache can be quickly covered by the write cache if the data is recent. When data is no longer needed in the write cache, it gets pushed in a time-aggregated fashion (as SCOs) to the backend, together with the transaction logs. As the backend is implemented using a time-based approach, snapshots, zero-copy cloning and easy replication come out of the box.