Open vStorage High Availability (HA)

Last week I received an interesting question from a customer:

What about High-Availability (HA)? How does Open vStorage protect against failures?

This customer was right to ask that question. If you run a large scale, multi-petabyte storage cluster, HA should be one of your key concerns. Downtime in such a cluster doesn’t only lead to production loss but can turn into a real PR disaster or even put the business itself at risk. When end-customers start leaving your service, it can become a slippery slope and before you know it there are no customers left on your cluster. Hence, asking the HA question up front is a best practice for every storage engineer tasked with the due diligence of a new storage technology. Over the past few years we have already devoted a lot of words to Open vStorage HA, so I thought it was time for a summary.

In this blog post I will discuss the different HA scenarios, starting from the top (the Edge) and working down to the bottom (the ASD).

The Edge

To start an Edge block device, you pass it the IP and port of a Storage Router together with the vPool of the vDisk. On the initial connection, the Storage Router returns a list of fail-over Storage Routers to the Edge. The Edge caches this information and automatically switches to another Storage Router in case it can’t communicate with its current Storage Router for 15 seconds.
Periodically the Edge also asks the Storage Router to which Storage Router it should connect. This way the Storage Router can instruct the Edge to connect to another Storage Router, for example because the original Storage Router is about to be shut down.
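To make that behaviour concrete, below is a minimal Python sketch of the fail-over loop. All names (EdgeClient, _fetch_failover_list) are hypothetical; this is not the actual Edge implementation, which speaks its own protocol.

import socket

FAILOVER_TIMEOUT = 15  # seconds of silence before the Edge switches routers

class EdgeClient:
    def __init__(self, router_ip, router_port, vpool):
        self.vpool = vpool
        self.routers = [(router_ip, router_port)]  # refreshed on initial connection
        self.sock = None

    def connect(self):
        # Walk the cached fail-over list until a Storage Router answers.
        for ip, port in self.routers:
            try:
                self.sock = socket.create_connection((ip, port),
                                                     timeout=FAILOVER_TIMEOUT)
                self.sock.settimeout(FAILOVER_TIMEOUT)
                self.routers = self._fetch_failover_list()
                return
            except OSError:
                continue  # this Storage Router is down, try the next one
        raise RuntimeError("no Storage Router reachable for vPool %s" % self.vpool)

    def _fetch_failover_list(self):
        # Placeholder: the real Storage Router returns this list over the
        # Edge protocol right after the initial connection.
        return self.routers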
For more details, check the following blog post about Edge HA.

The Storage Router

The Storage Router also has multiple HA features for the data path. As a vDisk can only be active on, and owned by, a single Volume Driver (the block-to-object conversion process of the Storage Router), a mechanism is in place to make sure the ownership of a vDisk can be handed over (happy path) or stolen (unhappy path) by another Storage Router. Once the ownership is transferred, the volume is started on the new Storage Router and IO requests can be processed. In case the old Storage Router would still try to write to the backend, fencing kicks in and prevents that data from being stored on the backend.
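As a rough illustration, here is a runnable Python toy of that handover-or-steal logic. All names are hypothetical and the timeout handling is heavily simplified.

class ToyStorageRouter:
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.volumes = set()
        self.fenced = set()  # vDisks whose late backend writes must be rejected

    def hand_over(self, vdisk, new_owner):
        if not self.reachable:
            raise TimeoutError("%s is unreachable" % self.name)
        self.volumes.discard(vdisk)       # happy path: cooperative transfer
        new_owner.volumes.add(vdisk)

def take_over(vdisk, old_router, new_router):
    try:
        old_router.hand_over(vdisk, new_router)  # happy path
    except TimeoutError:
        new_router.volumes.add(vdisk)            # unhappy path: steal ownership
        old_router.fenced.add(vdisk)             # fencing blocks the old owner
    # The volume is now started on new_router and IO can be processed.

take_over("vdisk-001", ToyStorageRouter("sr-1", reachable=False), ToyStorageRouter("sr-2"))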
The ALBA proxy is responsible for encrypting, compressing and erasure coding the Storage Container Objects (SCOs) coming from the Volume Driver and for sending the fragments to the ASD processes on the SSD/SATA disks. Each Storage Router has multiple proxies and can switch between them in case of issues or timeouts.

The ALBA Backend

An ALBA backend typically consists of multiple physical disks, up to 100, across multiple servers. The proxies generate multiple fragments which are redundantly stored across all devices of the backend. As a result, a device failure or even a complete server failure doesn’t lead to data loss. On top of that, backends can be stacked on top of each other. Take as an example the case where you have 3 data centers: you could create a (local) backend containing the disks of each data center and create a (global) backend on top of these (local) backends. Data could, for example, be replicated 3 times, one copy in each data center, and erasure coded within each data center for storage efficiency. Using this approach, a data center outage wouldn’t cause any data loss.
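A quick back-of-the-envelope calculation for that 3-data-center example (the erasure coding parameters are illustrative, not a recommended preset):

k, m = 4, 2                          # per data center: 4 data + 2 parity fragments
local_overhead = (k + m) / float(k)  # 1.5x raw storage within one data center
copies = 3                           # one erasure-coded copy per data center
print("raw bytes per logical byte: %.1f" % (copies * local_overhead))  # 4.5

Losing a full data center still leaves two complete, locally repairable copies.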

The management path HA

The previous sections of this blog post discussed the HA features of the data path. The management path is also highly available. The GUI and API can be reached from all master nodes in the cluster. The metadata is also stored redundantly and is spread across multiple nodes or even data centers. Open vStorage has 2 types of metadata: the volume metadata and the backend metadata. The volume metadata is stored in a networked RocksDB using a master-slave concept. More information about that can be found here and in a video here.
The backend metadata is stored in our own, in-house developed, always consistent key-value store named Arakoon. More info on Arakoon can be found here.

That’s in a nutshell how Open vStorage makes sure a disk, server or data center disaster doesn’t lead to storage downtime.

Jobs, Jobs, Jobs, …

iNuron, the company behind Open vStorage, is growing rapidly. With more and more customers selecting Open vStorage and more and more multi-petabyte storage clusters being deployed, we need more hands to help out. Currently we are looking for 2 profiles:

OPERATIONS ENGINEER

In case you are not afraid of large storage pools and you like assisting international, high-profile customers with Proofs of Concept, then look no further as we have the ideal job for you: Operations Engineer. As part of our brilliant OPS team you will be responsible for keeping our large scale storage clusters up and running. Our engineering teams aren’t perfect, so their code may lead to actual issues and customer support requests. As a preventive measure, our OPS engineers are also responsible for developing, executing and maintaining software test plans and take part in the acceptance process of each new Open vStorage version.

Read the complete jobpost here.

SOFTWARE ENGINEER

Do you eat Python for breakfast and breathe JavaScript? In that case we have the ideal job for you, as we are looking for additional software engineers for our framework team. Being part of our framework team means you will be responsible for the GUI, the API and managing the different components that make up the whole Open vStorage cluster.

Read the complete jobpost here.

Docker and persistent Open vStorage volumes

Docker, the open-source container platform, is currently one of the hottest projects in the IT infrastructure business. With the support of some of the world’s leading companies such as PayPal, eBay, General Electric and many more, it is quickly becoming a cornerstone of any large deployment. It also introduces a paradigm shift in how administrators see servers and applications.

Pets vs. cattle

In the past, servers were treated like dogs and cats or any family pet: you give it a cute name, make sure it is in optimal condition, take care of it when it is sick, … With VMs a shift already occurred: names became more generic, like WebServer012, but keeping the VM healthy was still a priority for administrators. With Docker, VMs are decomposed into a sprawl of individual, clearly defined applications. Sometimes there can even be multiple instances of the same application running at the same time. With thousands of containerized applications running on a single platform, it becomes impossible to treat these applications as pets; instead they are treated as cattle: they get an ID and, when they have issues, they are taken offline, terminated and replaced.

Docker Storage

The original idea behind Docker was that containers would be stateless and hence wouldn’t need persistent storage. But over the years the insight has grown that some applications, and hence containers, do require persistent storage. Since the Docker platforms at large companies house thousands of containers, the required storage is also significant. Typically these platforms also span multiple locations or even clouds. Storage across locations and clouds is the sweet spot of the Open vStorage feature set. In order to offer distributed, persistent storage to containers, the Open vStorage team created a Docker plugin on top of the Open vStorage Edge, our lightweight block device. Note that the Docker plugin is part of the Open vStorage Enterprise Edition.

Open vStorage and Docker

Using Open vStorage to provision volumes for Docker is easy and straightforward thanks to Docker’s volume plugin system. To show how easy it is to create a volume for a container, I will walk you through the steps to run Minio, a minimal, open-source object store, on top of a vDisk.

First install the Open vStorage Docker plugin and the necessary packages on the compute host running Docker:
apt-get install libovsvolumedriver-ee blktap-openvstorage-ee-utils blktap-dkms volumedriver-ee-docker-plugin

Next, configure the plugin by updating /etc/volumedriver-ee-docker-plugin/config.toml:

[volumedriver]
hostname="IP"
port=26203
protocol="tcp"
username="root"
password="rooter"

Change the IP and port to those on which the vPool is exposed on the Storage Router you want to connect to (see the Storage Router detail page).

Start the plugin service:
systemctl start volumedriver-ee-docker-plugin.service

Create the Minio container and attach one vDisk for the data (minio_export) and one for the config (minio_config):

docker run --volume-driver=ovs-volumedriver-ee -p 9000:9000 --name minio \
-v minio_export:/export \
-v minio_config:/root/.minio \
minio/minio server /export

That is it. You now have a Minio object store running which stores its data on Open vStorage.

PS. Want to see more? Check the “Docker fun across Amazon Google and Packet.net” video.

The Open Core Model

It was Paul Dix, founder and CTO of InfluxDB, who rocked the boat with his opening keynote at the last PerconaLive conference. His talk, titled “The Open Source Business Model is Under Siege”, discussed the existential struggle that open source software companies are facing. The talk is based on his experience building a viable business around open source over the last three and a half years with InfluxDB. You can see the full video here.

Infrastructure Software, a tough market …

Paul is right: building a viable open source company around infrastructure software is hard. Building a company around infrastructure tout court is hard these days. Need some examples? HPE buying storage unicorn SimpliVity well below its recent valuation, Tintri doing an IPO as a last resort, Nutanix piling up losses quarter after quarter, RethinkDB and Basho shutting down, and there are many more examples.

Open Core Model

I can offer only one piece of advice to the above companies: it’s never too late to do the next right thing. For Open vStorage, that next right thing was moving away from a pure-play open source business model. Currently Open vStorage follows the open core model. This means that we have a core distributed block storage project which is open source and free to use, while on the other hand we also have a closed source, commercial Enterprise Edition which adds more functionality to the core.
Maybe the term open core sounds a bit too pejorative. What we release as the core is a fully functional distributed block storage platform. Deciding which feature ends up in the core and which in the Enterprise Edition is a difficult assessment. As a rule of thumb, the core version should allow small clusters to be set up and operated without data loss and with decent performance. Even block storage clusters which span multiple data centers can be set up with the core version. Enterprises which are looking to build their company (or part of it) on a service which couldn’t be built without the Open vStorage technology are gently steered towards the Enterprise Edition. These are typically well established, large enterprises which are looking to offer a new or better service to their customers. They also understand that one size doesn’t fit all, and they want to be able to fiddle with all the bells and whistles of Open vStorage. They want, for example, full control over which vDisk is using which part of the distributed cache. Or they want best-in-class performance, and to achieve this they need features like the High Performance Read Mesh. Over time the list of ‘Enterprise Edition only’ features will grow. On the other hand, nothing prevents us from moving features from the Enterprise Edition to the open source version down the line.

A final note

The open core model might offend some people. Yet we aren’t the only ones operating under an open core model. The open core business model is, for example, also used by Docker, MySQL, InfluxDB, MongoDB, Puppet, Midokura and many, many other software companies. It isn’t an easy business model, as there is always discussion on which features to release as part of the open source project and which as part of the Enterprise Edition. But we are confident that the open core model is the path forward, not only for us but also for the whole software infrastructure market.

PS: Keep following our blog, as over the next few weeks we will demonstrate the success of our open core business model with some extensive, multi-petabyte, multi-data-center implementations.

Cache Policy Management: A Closer Look

Don’t you hate a noisy neighbour? Someone who blasts his preferred music just loud enough so you can hear it when trying to get some sleep or having a relaxing commute. Well, the same goes for noisy neighbours in storage. It is not their deafening music that is annoying but the fact that other volumes can’t meet their desired performance as one volume gobbles up all the IOPS.

Setting cache quota

This situation typically occurs when a single volume takes up the whole cache. In order to allow every vDisk to get a fair share of the cache, the Open vStorage Enterprise Edition allows you to put a quota on the cache usage. When creating a vPool you can set a default quota per vDisk, giving each vDisk a fair share of the cache. Do note that the quota system is flexible: it is for example possible to set a larger value than the default for a specific vDisk in case it would benefit from more caching. It is even possible to oversubscribe the cache. This way the cache space can be used optimally.
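The oversubscription idea is easiest to see with some toy numbers; the Python snippet below is purely illustrative and not the actual Open vStorage API:

cache_size = 200      # GiB of SSD cache available on a Storage Router
default_quota = 10    # GiB default quota per vDisk, set at vPool creation

quotas = {"vdisk-%02d" % i: default_quota for i in range(30)}
quotas["vdisk-00"] = 50  # override: this vDisk benefits from extra cache

provisioned = sum(quotas.values())
print(provisioned, cache_size)  # 340 GiB promised against 200 GiB of cache

Since most vDisks rarely use their full quota at the same moment, the oversubscribed cache space is still used optimally.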

Block and Fragment cache

One more point about cache management in Open vStorage: there are actually 2 types of cache which can be configured. The first one caches complete fragments, the result of erasure coding a Storage Container Object (SCO). Hence it is called the fragment cache, and it is typically used for newly written data. The stored fragments are typically large, so as to limit the amount of metadata, and consequently they aren’t ideal for (read) caching: under normal circumstances the cache hit ratio is inversely proportional to the size of the fragments. For that reason another cache, specifically tuned for read caching, was added. This block cache gets filled on reads and limits the size of the cached blocks to a small, configurable size (e.g. 32-256KB). This means a more granular approach can be taken during cache eviction, eventually leading to a higher cache hit ratio.
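Some illustrative arithmetic (the sizes are examples, not Open vStorage defaults) shows why smaller cache units help: for the same cache size you get far more independently evictable entries.

cache_mib = 100 * 1024   # a 100 GiB cache
fragment_mib = 4         # a typical large fragment (write-path unit)
block_kib = 128          # a typical block cache unit (read-path unit)

print(cache_mib // fragment_mib)      # ~25k evictable fragment entries
print(cache_mib * 1024 // block_kib)  # ~800k evictable block entries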

The Open vStorage High Performance Read Mesh (HPRM)

When you are developing a storage solution, your biggest worry is data loss. As an Open vStorage platform can lose a server or even a complete data center without actual data loss, we are pretty sure we have that base covered. The next challenge is to make sure that safely stored data can be quickly accessed when needed. On this blog we already discussed a lot of the performance improvements we made over the past releases: we introduced the Edge component for guaranteed performance, the accelerated ALBA as read cache, multiple proxies per Volume Driver and various performance tuning options.

Today it is time to introduce the latest performance improvement: the High Performance Read Mesh (HPRM). The HPRM is an optimization of the read path which allows the compute host to fetch data directly from the drives where it is located. Previously the read path always had to go through the Volume Driver before the data was fetched from the ASD. This new short read path can only be taken in case the Edge has the necessary metadata describing where (SCO, fragment, disk) each LBA’s data is stored. In case the Edge doesn’t have the needed metadata, for example because the cached metadata is outdated, the slow path through the Volume Driver is taken. For the write path nothing changes: all writes go through the Volume Driver.

The short read path which bypasses the Volume Driver has 2 direct advantages: lower latency on reads and less network traffic, as data only goes over the network once. The introduction of the HPRM also allows for a cost reduction on the hardware front. Since the hosts running the Volume Driver are in many cases no longer in the read path, they are freed up and can focus on processing incoming writes. This means the ratio between compute hosts running the Edge and hosts running the Volume Driver can be increased. Since the Volume Driver hosts are typically beefy servers with expensive NVMe devices for the write buffer and the distributed databases, a significant change in the Compute/Volume Driver ratio means a significant reduction of the hardware cost.

HPRM, the technical details

Let’s have a look under the hood at how the HPRM works, starting with the write path. The application, e.g. the hypervisor, writes to the block device exposed by the Edge client. The Edge client connects to its server part which, in turn, writes the data to the write buffer of the Volume Driver. Once enough writes are accumulated in the buffer, a SCO (Storage Container Object) is created and dispatched to the ALBA backend through the proxy. The proxy makes sure the data is spread across different ASDs according to the specified ALBA preset. Which ASDs contain the fragments of the SCO is stored in a manifest.
Once a read comes in for an LBA, the Edge client checks its local metadata cache for the SCO info and the manifest of the SCO. If the info is available, the Edge fetches the LBA data through the PRACC (Partial Read ACCelerator) client, which can read directly from the ASDs. If the info isn’t available in the cache, or if it is outdated, the manifest and SCO info are retrieved by the Edge client from the Volume Driver and stored in the Edge metadata cache.
The Edge also pushes its IO statistics to the Volume Driver so they can be queried by the Framework or the monitoring components. Gathering IO statistics is done by the Edge as it is the only component that has a view on both the fast path, through the PRACC, and the slow path, through the Volume Driver.
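Below is a runnable toy model of that read decision, with hypothetical names; the real Edge and PRACC are native components, not Python.

class ToyEdge:
    def __init__(self):
        self.metadata_cache = {}  # lba -> manifest (which ASDs hold which fragments)

    def manifest_from_volume_driver(self, lba):
        # Stand-in for the slow path: ask the Volume Driver for SCO info + manifest.
        return {"lba": lba, "asds": ["asd-3", "asd-7"], "outdated": False}

    def fetch_via_pracc(self, manifest):
        # Stand-in for the PRACC client reading fragments straight from the ASDs.
        return "data for LBA %d via %s" % (manifest["lba"], manifest["asds"])

    def read(self, lba):
        manifest = self.metadata_cache.get(lba)
        if manifest is None or manifest["outdated"]:
            manifest = self.manifest_from_volume_driver(lba)  # slow path
            self.metadata_cache[lba] = manifest
        return self.fetch_via_pracc(manifest)                 # fast path

edge = ToyEdge()
edge.read(42)  # first read: slow path refreshes the metadata cache
edge.read(42)  # second read: fast path, served with cached metadata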


Note that the High Performance Read Mesh is part of the Open vStorage Enterprise Edition. Contact us for more info.

Connecting Open vStorage with Amazon

In an earlier blog post we already discussed that Open vStorage is the storage solution for implementing a hybrid cloud. In this blog post we will explain the technical details of how Open vStorage can be used in a hybrid cloud context.

The components

For frequent readers of this blog the different Open vStorage components hold no secrets anymore. For newcomers, here is a short overview of the different components:

  • The Edge: a lightweight software component which exposes a block device API and connects across the network to the Volume Driver.
  • The Volume Driver: a log structured volume manager which converts blocks into objects.
  • The ALBA Backend: an object store optimized as backend for the Volume Driver.

Let’s see how these components fit together in a hybrid cloud context.

The architecture

The 2 main components of any hybrid cloud are an on-site, private part and a public part. Key to a hybrid cloud is that data and compute can move between the private and the public part as needed. As part of this thought exercise we take the example where we want to store data on premises in our private cloud and burst into the public cloud with compute when needed. To achieve this we install the components as follows:

The Private Cloud part
In the private cloud we install the ALBA backend components to create one or more storage pools. All SATA disks are gathered in a capacity backend, while the SSD devices are gathered in a performance backend which accelerates the capacity backend. On top of these storage pools we deploy one or more vPools. To achieve this we run a couple of Volume Driver instances inside our private cloud. On-site compute nodes with the Edge component installed can use these Volume Drivers to store data on the capacity backend.

The Public Cloud part
For the public cloud part, let’s assume we use Amazon AWS. There are multiple options depending on the desired performance. In case we don’t require a lot of performance, we can use an Amazon EC2 instance with KVM and the Edge installed. To bring a vDisk live in Amazon, a connection is made across the internet with the Volume Driver in the private cloud. Alternatively, an AWS Direct Connect link can be used for a lower-latency connection. Writes to the vDisk exposed in Amazon are sent by the Edge to the write buffer of the Volume Driver. This means that writes are only acknowledged to the application using the vDisk once the on-premises write buffer has received the data. Since the Edge and the Volume Driver connect over a rather high latency link, the write performance isn’t optimal in this case.
In case more performance is required, we need an additional Storage Optimized EC2 instance with one or more NVMe SSDs. On this second EC2 instance a Volume Driver instance is installed and the vPool is extended from the on-site, private cloud into Amazon. The NVMe devices of the EC2 instance are used to store the write buffer and the metadata DBs. It is of course possible to add some more EBS Provisioned IOPS SSDs to the EC2 instance as read cache. For even better performance, use dedicated Open vStorage powered cache nodes in Amazon. Since the write buffer is located in Amazon, the latency will be substantially lower than in the first setup.
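To get a feel for the difference, some rough arithmetic with illustrative round-trip times (not measurements):

rtt_ms = {
    "internet to on-premises buffer": 40.0,
    "Direct Connect to on-premises buffer": 10.0,
    "write buffer on EC2 NVMe": 0.5,
}
for link, rtt in rtt_ms.items():
    # a synchronous write is acknowledged after at least one round trip
    print("%-38s ~%4d writes/s at queue depth 1" % (link, 1000 / rtt))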

Use cases

As the last part of this blog post we want to discuss some use cases which can be deployed on top of this hybrid cloud.

Analytics
Based upon the above architecture, a vDisk in the private cloud can be cloned into Amazon. The cloned vDisk can be used for business analytics inside Amazon without impacting the live workloads. When the analytics query is finished, the clone can be removed. The other way around is of course also possible: in that case the application data is stored in Amazon while the business analytics run on on-site compute hardware.

Disaster Recovery
Another use case is disaster recovery. As disaster recovery requires data to be both on premises and in the cloud, additional instances with a large number of HDDs need to be added. Replication or erasure coding can be used to spread the data across the private and public cloud. In case of a disaster where the private cloud is destroyed, one can just add more compute instances running the Edge to bring the workloads live in the public cloud.

Data Safety
A last use case we want to highlight is for users that want to use public clouds but don’t trust any single public cloud provider with all of their data. In that case you need to get some instances in each public cloud which are optimized for storing data. Erasure coding is used to chop the data into encrypted fragments. These fragments are spread across the public clouds in such a way that none of the public clouds stores the complete data set, while the Edges and the Volume Drivers can still see the whole data set.

NSM and ABM, Arakoon teamwork

In an earlier post we shed some light on Arakoon, our own always-consistent distributed key-value database. Arakoon is used in many parts of the Open vStorage platform. One of the use cases is to store the metadata of the native ALBA object store. Do note that ALBA is NOT a general purpose object store but one specifically crafted and optimized for Open vStorage. ALBA uses a collection of Arakoon databases to store where and how objects are stored on the disks in the backend. Typically the SCOs and TLogs of each vDisk end up in a separate bucket, a namespace, on the backend. For each object in the namespace there is a manifest that describes where and how the object is stored on the backend. To glue the namespaces, the manifests and the disks in the backend together, ALBA uses 2 types of Arakoon databases: the ALBA Backend Manager (ABM) and one or more NameSpace Managers (NSMs).

ALBA Manager

The ALBA Manager (ABM) is the entry point for all ALBA clients which want to store or retrieve something from the backend. The ALBA Manager DB knows which physical disks belong to the backend, which namespaces exist and on which NSM hosts they can be found.
To optimize the Arakoon DB, it is loaded with the albamgr plugin, a collection of ABM-specific user functions. Typically there is only a single ABM in a cluster.

NSM

A NameSpace Manager (NSM) is an Arakoon cluster which holds the manifests for the namespaces assigned to it. Which NSM manages which namespaces is registered with the ALBA Manager. The NSM also provides the remote API, offered by the NSM host, to manipulate most of the object metadata during normal operation. Its coordinates can be retrieved from the ALBA Manager by (proxy) clients and maintenance agents.

To optimize the Arakoon DB, it is loaded with the nsm_host plugin, a collection of NSM-host-specific user functions. Typically there are multiple NSM clusters for a single ALBA backend. This allows the backend to scale both capacity- and performance-wise.

IO requests

Let’s have a look at the IO path. Whenever the Volume Driver needs to store an object on the backend, a SCO or a TLog, it hands the object to one of the ALBA proxies on the same host. The ALBA proxy contains an ALBA client which communicates with the ABM to know on which NSM and disks it can store the object. Once the object is stored on the disks, the manifest with the metadata is registered in the NSM. For performance reasons the different fragments of the object and the manifest can be cached by the ALBA proxy.

In case the Volume Driver needs data from the backend, because it is no longer in the write buffer, it requests the proxy to fetch the data by asking for a SCO location and offset. In case the right fragments are in the fragment cache, the proxy returns the data immediately to the Volume Driver. Otherwise it uses the manifest from its cache or, if the manifest isn’t cached, contacts the ABM to find the right NSM and retrieves the manifest from it. Based upon the manifest the ALBA client fetches the data it needs from the physical disks and provides it to the Volume Driver.
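The two-level lookup can be summarized with a small, purely illustrative Python model (this is of course not the Arakoon wire protocol):

abm = {  # ALBA Backend Manager: namespace -> NSM cluster
    "vdisk-001": "nsm-a",
    "vdisk-002": "nsm-b",
}
nsms = {  # each NSM holds the manifests for its namespaces
    "nsm-a": {"sco_00001": {"fragments": ["asd-1", "asd-4", "asd-9"]}},
    "nsm-b": {"sco_00042": {"fragments": ["asd-2", "asd-5", "asd-8"]}},
}

def lookup_manifest(namespace, object_name):
    nsm = abm[namespace]           # step 1: ask the ABM which NSM to use
    return nsms[nsm][object_name]  # step 2: ask that NSM for the manifest

print(lookup_manifest("vdisk-001", "sco_00001"))  # -> which ASDs hold the fragments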

Fargo GA

After 3 Release Candidates and extensive testing, the Open vStorage team is proud to announce the GA (General Availability) release of Fargo. This release is packed with new features. Allow us to give a small overview:

NC-ECC presets (global and local policies)

NC-ECC (Network Connected-Error Correction Code) is an algorithm to store Storage Container Objects (SCOs) safely across multiple data centers. It consists of a global preset (across data centers) and multiple local presets (within a single data center). The NC-ECC algorithm is based on forward error correction codes and is further optimized for a multi-data-center approach. When there is a disk or node failure, additional chunks are created using only data from within the same data center. This ensures the bandwidth between data centers isn’t stressed in case of a simple disk failure.

Multi-level ALBA

The ALBA backend now supports different levels. An all-SSD ALBA backend can be used as a performance layer in front of the capacity tier. Data is removed from the cache layer using a random eviction or Least Recently Used (LRU) strategy.
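For reference, the LRU strategy boils down to evicting the entry that was touched longest ago; a minimal Python sketch (not the ALBA implementation):

from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None                       # cache miss
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used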

Open vStorage Edge

The Open vStorage Edge is a lightweight block driver which can be installed on Linux hosts and connects with the Volume Driver over the network (TCP/IP). By splitting the Volume Driver and the Edge into separate components, compute and storage can scale independently.

Performance optimized Volume Driver

By limiting the size of a volume’s metadata, the metadata now fits completely in RAM. To keep the metadata at an absolute minimum, deduplication was removed. You can read more about why we removed deduplication here. Other optimizations are multiple proxies per Volume Driver (the default amount is 2), bypassing the proxy and going straight from the Volume Driver to the ASD in case of partial reads, and local read preference in case of global backends (try to read from ASDs in the same data center instead of going over the network to another data center).

Multiple ASDs per device

For low-latency devices, adding multiple ASDs per device provides higher bandwidth to the device.

Distributed Config Management

When you are managing large clusters, keeping the configuration of every system up to date can be quite a challenge. With Fargo all config files are now stored in a distributed config management system on top of our distributed database, Arakoon. More info can be found here.

Ubuntu 16.04

Open vStorage is now supported on Ubuntu 16.04, the latest Long Term Support (LTS) version of Ubuntu.

Smaller features in Fargo:

  • Improved the speed of the non-cached API and GUI queries by a factor of 10 to 30.
  • Hardened the remove node procedure.
  • The GUI is adjusted to better highlight clusters which are spread across multiple sites.
  • The failure domain concept has been replaced by tag-based domains. ASD nodes and Storage Routers can now be tagged with one or more tags. Tags can be used to identify a rack, site, power feed, etc.
  • 64TB volumes.
  • Browsable API with Swagger.
  • ‘asd-manager collect logs’, which is identical to ‘ovs collect logs’.
  • Support for the removal of the asd-manager packages.

Since this Fargo release introduces a completely new architecture (you can read more about it here), there is no upgrade path between Eugene and Fargo. The full release notes can be found here.

Hybrid cloud, the phoenix of cloud computing

Introduction


Hybrid cloud, an integration between on-site, private clouds and public clouds, has been declared dead many times over the past few years, but like a phoenix it keeps resurrecting in the yearly IT technology and industry forecasts.

Limitations, hurdles and issues

Let’s first have a look at the numerous reasons why the hybrid cloud computing trend hasn’t taken off (yet):

  • Network limitations: connecting to a public cloud was often cumbersome as it required all traffic to go over slow, high-latency public internet links.
  • Storage hurdles: implementing a hybrid cloud approach means storing data multiple times and keeping these multiple copies in sync.
  • Integration complexity: each cloud, whether private or public, has its own interface and standards, which makes integration unnecessarily difficult and complex.
  • Legacy IT: existing on-premises infrastructure is a reality and holds back a move to the public cloud. Next to the infrastructure component, applications were not built or designed in such a way that they can scale up and down, nor were they designed to store their data in an object store.

Taking the above into account, it shouldn’t come as a surprise that many enterprises saw public cloud computing as a check-in at Hotel California. The technical difficulties, the cost and the risk of moving back and forth between clouds were just too big. But times are changing. According to McKinsey & Company, a leading management consulting firm, over the next 3 years enterprises are planning to transition IT workloads to a hybrid cloud infrastructure at a significant rate and pace.

Hybrid cloud (finally) taking off

I see a couple of reasons why the hybrid cloud approach is finally taking off:

Edge computing use case
Smart ‘devices’ such as self-driving cars produce such large amounts of data that they can’t rely on public clouds to process it all. The data sometimes even drives real-time decisions where latency might be the difference between life and death. Inevitably, this requires computing power to shift to the edges of the network. This Edge or Fog Computing concept is a textbook example of a hybrid cloud where on-site, or should we call it on-board, computing and centralized computing are grouped together into a single solution.

The network limitations are removed
The network limitations have been removed by services like AWS Direct Connect, which give you a dedicated network connection from your premises to the Amazon cloud. All big cloud providers now offer the option of a dedicated network into their cloud. Pricing for dedicated 10GbE links in metropolitan regions like New York has also dropped significantly. For under $1,000 a month you can now get a sub-millisecond fibre connection from most buildings in New York to one of the many data centers in the region.

Recovery realisation
More and more enterprises with a private cloud realise the need for a disaster recovery plan.
In the past this meant getting a second private cloud. This approach multiplies the TCO by at least a factor of 2, as twice the amount of hardware needs to be purchased, and keeping both private clouds in sync only makes disaster recovery plans more complex. Enterprises are now turning disaster recovery from a cost into an asset: they use cheap, public cloud storage to store their off-site backups and copies, and by adding compute capacity in peak periods or when disaster strikes they can bring these off-site copies online when needed. On top of that, additional business analytics can use these off-site copies without impacting the production workloads.

Standardization
Over the past years, standards in cloud computing have crystallized. In the public cloud, Amazon has set the standard for storing unstructured data. On the private infrastructure side, the OpenStack ecosystem has made significant progress in streamlining and standardizing how complete clouds are deployed. Companies such as Cisco are now focusing on new services to manage and orchestrate clouds in order to smooth out the last bumps in the migration between different clouds.

Storage & legacy hardware: the problem children

Based upon the previous paragraphs one might conclude that all obstacles to moving to the hybrid model have been cleared. This isn’t the case, as 2 issues still remain:

The legacy hardware problem
All current public cloud computing solutions ignore the reality that enterprises have a hardware legacy. While starting from scratch is the easiest solution, it is definitely not the cheapest. In order for the hybrid cloud to be successful, existing hardware must, in some form or shape, be able to be integrated into the hybrid cloud.

Storage roadblocks remain
In case you want to make use of multiple cloud solutions, the only option you have is to store a copy of each bit of data in each cloud. This x-way replication scheme solves the issue of data being available in all cloud locations, but it does so at a high cost. Next to the replication cost, replication also adds significant latency, as writes can only be acknowledged once all locations are up to date. This means that, when replication is used, hybrid clouds which span the east and west coast of the US are not workable.

Open vStorage removes those last obstacles

Open vStorage, a software-based storage solution, enables multi-datacenter block storage in a much more nimble and cost-effective way than any traditional solution. This removes the last roadblocks towards hybrid cloud adoption.
Solving the storage puzzle
Instead of x-way replication, Open vStorage uses a different approach which can be compared to solving a Sudoku puzzle. All data is chopped up in chunks and some additional parity chunks are adjoined. All these chunks, the data and the parity chunks, are distributed across all the nodes, data centers and clouds in the cluster. The number of parity chunks can be configured, and allows, for example, recovery from a multi-node failure or a complete data center loss. A failure, whether it is a disk, node or data center, will cross out some numbers from the complete Sudoku puzzle, but as long as you have enough numbers left, you can still solve the puzzle. The same goes for data stored with Open vStorage: as long as you have enough chunks (disks, nodes, data centers or clouds) left, you can always recover the data.
Unlike x-way replication, where data is only acknowledged once all copies are stored safely, Open vStorage allows data to be stored sub-optimally: writes can be acknowledged even when not all data chunks have been written to disk yet. This makes sure that a single slow disk, data center or cloud doesn’t hold back applications and incoming writes. This approach lowers the write latency while keeping data safety at a high level.
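To make the Sudoku analogy tangible, here is a tiny pure-Python example with a single XOR parity chunk (real presets use more data and parity chunks):

data = [b"chunk-one!", b"chunk-two!", b"chunk-3!!!"]  # equal-sized chunks
parity = bytes(a ^ b ^ c for a, b, c in zip(*data))   # one parity chunk

# Lose any one data chunk, say the second, and rebuild it from the rest:
rebuilt = bytes(a ^ b ^ c for a, b, c in zip(data[0], data[2], parity))
assert rebuilt == data[1]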

Legacy hardware
Open vStorage also allows legacy storage hardware to be included. As Open vStorage is a software-based storage solution, it can turn any x86 hardware into a piece of the hybrid storage cloud.
Open vStorage leverages the capabilities of new media technologies like SSDs and PCI-e flash, but also those of older technologies like large capacity traditional SATA drives. For applications that need above-par performance, additional SSDs and PCI-e flash cards can be added.

Summary

Hybrid cloud has long been a model chased by many enterprises without any luck. Issues such as network and storage limitations and integration complexity have been major roadblocks on the hybrid cloud path. Over the last few years a lot of these roadblocks have been removed, but issues with storage and legacy hardware remained. Open vStorage overcomes these last obstacles and paves the path towards hybrid cloud adoption.