The Storage and Networking DNA.

Today I want to discuss a less technical but more visionary assertion: the concept that storage and networking share the same DNA. This is especially the case for large-scale deployments. This insight surfaced as the rumor of Cisco buying NetApp flared up again. Allow me to explain why I believe exabyte storage clusters and large-scale networks have a lot in common.

The parallels in storage and networking:

The first feature both networks and exabyte storage share is that they are highly scalable. Both topologies typically start small and grow over time. Adding more capacity can be achieved seamlessly by adding more hardware to the cluster. This allows for higher bandwidth, higher capacity and more users to be served.

Downtime is typically unacceptable for both, and SLAs that guarantee multi-nine availability are common. To achieve this level of availability both rely on hyper-meshed, shared-nothing architectures. These highly redundant architectures ensure that if one component fails another component takes over. To illustrate, switches are typically deployed redundantly: a single server is connected to 2 independent switches, and if one switch fails the other one takes over. The same holds for storage. Data is also stored redundantly, either through replication or through erasure coding across multiple disks and servers. If a disk or server fails, data can still be retrieved from other disks and servers in the storage cluster.
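To make the erasure coding idea a bit more tangible, here is a minimal sketch in Python. It is not how any particular storage product implements it: it uses the simplest possible scheme (k data fragments plus a single XOR parity fragment), and the function names `encode`, `recover` and `decode` are made up for this illustration. Real clusters typically use stronger codes that survive several simultaneous failures.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(frag_len * k, b"\0")   # pad so all fragments match
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one lost fragment (marked None) from the survivors."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("a single parity fragment can repair only one loss")
    repaired = list(fragments)
    if missing:
        survivors = [frag for frag in fragments if frag is not None]
        # XOR of all surviving fragments equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


def decode(fragments: List[Optional[bytes]], original_length: int) -> bytes:
    """Repair if needed, drop the parity fragment and strip the padding."""
    repaired = recover(fragments)
    return b"".join(repaired[:-1])[:original_length]
```

Each fragment would live on a different disk or server. Losing any one of them, data or parity, still lets `decode` return the original bytes, at a storage overhead of 1/k instead of the 100% or more that plain replication costs.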

These days you can check your Facebook timeline or Twitter account from almost anywhere in the world. Large-scale networks make this possible: the global network interlinks many smaller networks. The same holds for storage as we are moving to a world where data is stored in geographically dispersed places and even in different clouds.

With new technologies like Software-Defined Networking (SDN), network management has moved towards a single point of governance. The physical network can be configured at a high level while the detailed network topology is pushed down to the physical and virtual devices that make up the network. The same trend is happening in the storage industry with Software-Defined Storage (SDS). These software applications allow you to configure and manage the physical hardware in the storage cluster, across multiple devices, sites and even different clouds, through a single high-level management view.

A last point I’d like to touch on is that for networking the hardware brands and models hardly matter, as they can all work together thanks to network standards. The same goes for storage hardware. Different brands of disks, controllers and servers can all be used to build an exabyte storage cluster. Users of the network are not aware of the exact topology of the network (brands, links, routing, …). The same holds for storage. Users shouldn’t have to know on which disk their data is stored; the only thing they care about is that they get the right data on time when needed and that it is safely stored.

Open vStorage, taking the network analogy to the next step

Let’s have a look at the components of a typical network. On the left we have the consumer of the network, in this case a server. This server is physically connected to the network through a Network Interface Controller (NIC). A NIC driver provides the necessary interfaces for the application on the server to use the network. Data which is sent down the network traverses the TCP/IP stack down to the NIC, where it is converted into individual packets. Within the network various components play a specific role. A VPN provides encrypted tunnels, WAN accelerators provide caching and compression features, DNS services store the hierarchy of the network and switches/routers route and forward the packets to the right destination. The core routers form the backbone of the network and connect multiple data centers and clouds.

Each of the above network components can be mapped to an equivalent in the Open vStorage architecture. The Open vStorage Edge offers a block interface to the applications (hypervisors, Docker, …) on the server. Just like the TCP/IP stack converts the data into network packets, the Volume Driver converts the data received through the block interface into objects (Storage Container Objects, or SCOs). Next we have the proxy, which takes up many roles: it encrypts the data for security, provides compression, chops the SCOs into fragments and routes those fragments to the right backend. For reads the proxy also plays an important caching role by fetching the data from the correct cache backend. Lastly we have Arakoon, our own distributed key-value store, which stores the metadata of all data in the backend of the storage cluster. A backend consists of SSDs and HDDs in JBODs or traditional x86 servers. There can of course be multiple backends and they can even be spread across multiple data centers.
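For readers who prefer code over diagrams, the write path described above can be sketched in a few lines of Python. This is a toy model, not the actual Volume Driver or proxy: the class and function names, the block size, the number of blocks per SCO and the append-only layout are all assumptions made for this illustration, and a plain dictionary stands in for the metadata store.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

BLOCK_SIZE = 4096      # assumed block size, for illustration only
BLOCKS_PER_SCO = 4     # assumed; kept tiny so the example is easy to follow


@dataclass
class ToyVolumeDriver:
    """Packs incoming block writes into append-only container objects."""
    # Logical block address -> (SCO id, slot inside that SCO).
    metadata: Dict[int, Tuple[int, int]] = field(default_factory=dict)
    # Sealed SCOs, ready to be handed to the proxy.
    sealed: Dict[int, bytes] = field(default_factory=dict)
    _open: List[bytes] = field(default_factory=list)
    _sco_id: int = 0

    def write(self, lba: int, block: bytes) -> None:
        if len(block) != BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        # Overwrites are appended too; only the metadata pointer moves.
        self.metadata[lba] = (self._sco_id, len(self._open))
        self._open.append(block)
        if len(self._open) == BLOCKS_PER_SCO:
            self.sealed[self._sco_id] = b"".join(self._open)
            self._open = []
            self._sco_id += 1

    def read(self, lba: int) -> bytes:
        sco_id, slot = self.metadata[lba]
        if sco_id == self._sco_id:          # still in the open SCO
            return self._open[slot]
        sco = self.sealed[sco_id]
        return sco[slot * BLOCK_SIZE:(slot + 1) * BLOCK_SIZE]


def fragment(sco: bytes, n: int) -> List[bytes]:
    """Chop a sealed SCO into n roughly equal fragments for the backend."""
    size = -(-len(sco) // n)                # ceiling division
    return [sco[i * size:(i + 1) * size] for i in range(n)]
```

The point of the sketch is the division of labour: the driver only ever appends and updates a pointer, much like a network stack only ever emits packets, while everything that happens to a sealed object afterwards (fragmenting, and in the real system also encryption, compression and routing) is someone else's job.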


When reading the first paragraph of this blog post it might have crossed your mind that I was crazy. I do hope that after reading through the whole post you realize that networking and storage have a lot in common. As a Product Manager I keep the path that networking has already covered in mind when thinking about the future of storage. How do you see the future of storage? Let me know!

2015, the Open vStorage predictions

The end of 2014 is near so it is time to look forward and see what 2015 will have to offer. Some people say there’s no point in making predictions and that it’s not worth speculating because nothing is set in stone and things change all the time in storage land. Allow us to prove these people wrong by sharing our 2015 storage predictions*:

Acceleration of (hyper-)converged platforms
Converged platforms are here to stay. Converged solutions are even the fastest growing segment for large storage vendors. But it will be players like Open vStorage who will really break through in 2015. Hyperconverged solutions showed that there is an alternative to expensive SAN or all-flash solutions by adding a software-based caching layer to the host. Alas, these overhyped hyperconverged solutions are even more expensive per TB of storage than an already expensive all-flash array. In 2015 we will see more solutions which unite the good of the hyperconverged appliance and the converged platform but at a significantly lower cost. Storage solutions that will be extremely hot and prove to be future-proof will have to have the following characteristics:

  • Caching inside the (hypervisor) host: caching on SSD or PCIe flash should be done inside the host and not a couple of hops down the network.
  • Scalability: all-flash will continue to be a waste of money due to the huge cost of flash. It is better to go for a Tiered solution: Tier 1 on flash, Tier 2 on scalable, cheap (object) storage. In case the Tier 1 and Tier 2 storage are inside the same appliance (hyperconverged), your scalability and flexibility will be limited. A much better solution is to keep the 2 Tiers completely separate in a different set of appliances but managed by the same software layer.
  • Open and programmable: storage silos should be something of the past in 2015. Integration and openness should be key. Automation will be one of the hottest and most important features of a storage solution.
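The first two characteristics, caching inside the host and a separate, cheap second tier, boil down to a pattern that is easy to sketch. The Python below is a simplified illustration and not Open vStorage code: `TieredStore` is a made-up name, a small in-memory LRU cache stands in for host flash (Tier 1), a dictionary stands in for the object store (Tier 2), and writes go straight through to Tier 2, which is a simplification chosen to keep the example short.

```python
from collections import OrderedDict
from typing import Dict


class TieredStore:
    """Small LRU cache (Tier 1) in front of a large, slow backend (Tier 2)."""

    def __init__(self, backend: Dict[str, bytes], cache_capacity: int) -> None:
        self.backend = backend            # Tier 2 stand-in, e.g. an object store
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()  # Tier 1 stand-in
        self.capacity = cache_capacity
        self.hits = 0
        self.misses = 0

    def _remember(self, key: str, value: bytes) -> None:
        self.cache[key] = value
        self.cache.move_to_end(key)       # mark as most recently used
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used

    def write(self, key: str, value: bytes) -> None:
        self.backend[key] = value         # write-through (simplification)
        self._remember(key, value)

    def read(self, key: str) -> bytes:
        if key in self.cache:             # hot data: served from the host
            self.hits += 1
            self.cache.move_to_end(key)
            return self.cache[key]
        self.misses += 1                  # cold data: fetched from Tier 2
        value = self.backend[key]
        self._remember(key, value)
        return value
```

Because the two tiers only meet in `read` and `write`, either one can be resized or swapped out without touching the other, which is the argument for keeping them in separate sets of appliances under one software layer.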

It should not come as a surprise that Open vStorage checks all of the above requirements.

OpenStack will dominate the cloud in 2015
This is probably the most evident prediction. During the OpenStack conference in Paris it was very clear that OpenStack will dominate the cloud for the next few years. In 2014 some new kids showed support for OpenStack, such as VMware (they understand that hypervisor software is now a commodity and that the data center control plane has become the high-margin battle arena). With VMware releasing their own OpenStack distribution, the OpenStack distribution battlefield will be crowded in 2015. We have Red Hat, Ubuntu, Mirantis, HP, VMware and many more, so it is safe to say that some consolidation will happen in this area.
A new OpenStack battlefield that will emerge in 2015 will be around OpenStack storage. Currently this area is dominated by the traditional arrays, but as software-defined storage solutions gain traction, solutions such as Open vStorage will grab a huge market share from these traditional vendors. They can compete with these SANs and all-flash arrays as they offer the same features and have the benefit of a much lower cost and TCO. While they may not top the total revenue achieved by the big vendors, they will definitely seize a large share of the OpenStack storage market.
If we add the fact that the Chinese government is promoting OpenStack and open-source in general, you can bet your Christmas bonus on the fact that open-source (OpenStack) storage projects (Swift, Open vStorage, …) will be booming next year. These projects will get a lot of support from Chinese companies both in adoption and development. It will be essential for traditional high-tech companies and countries not to miss the boat as once it has left the harbor it will be very hard to catch up.

New markets for object storage vendors
2015 will be the year where object storage will break out of its niche market of large video or object repositories. This has been said for many years now but 2015 will be THE year, as many companies have started to realize the benefits they achieved by implementing their first object storage projects. The next target for these enterprises is to make better use of their current object storage solution. Changing all of their legacy code will not happen in 2015 as this might impact their business. Solutions where they don’t have to change their existing code base and still benefit from the cost savings of object storage will be selected. Open vStorage is one of those solutions, but we are pretty sure other solutions, such as storage gateways to object storage, will flourish in 2015.
Another reason why object storage vendors will enter new markets is that currently too many players are after the same customer base. This means that if they want to keep growing and provide ROI for the venture capital invested, new markets will definitely need to be addressed. The 15-25 billion dollar SAN market is a logical market to address. But entering this market will not be a bed of roses, as object storage vendors have no experience in this highly competitive market and sometimes lack the right sales competencies and knowledge. They will have to look for partnerships with companies such as CloudFounders who are seasoned in this area.

Seagate Kinetic drives
The Kinetic drives are the most exciting, fundamental change in the storage industry in several years. These drives went GA at the end of 2014, but in 2015 we will gradually see new applications and solutions that make use of this technology. With these IP drives, you will, for the first time, be able to manage storage as a scalable pool of disks. Open vStorage will support the Kinetic drives as a Tier 2 backend. This means Virtual Machines will have their hot data inside the host on flash and their cold data on a pool of Kinetic drives.

* We will look back on this post at the end of 2015 to see how well we scored.

Open vStorage Q1 US Roadshow

OpenStack started to gain momentum in 2014 but will really kick off in 2015. Gartner has put software-defined storage in its top 10 technology trends for 2015. The amount of data will continue to grow as the Internet of Things is unfolding before our eyes. As Open vStorage is in the middle of OpenStack and storage, 2015 will be the year of Open vStorage. To kick off the year in style we are doing a US Roadshow in cooperation with local OpenStack User Groups. We had to make some heartrending decisions and disappoint some groups as we couldn’t do a Meetup in their community, but below are the lucky ones for the first Open vStorage Roadshow:

During this session we will discuss the current status of OpenStack storage projects and give an overview of the new features in Open vStorage 2.0 (to be released in Q1 2015).

Note that registration is required and there is an attendee limit.

PS. In case you organize an OpenStack User Group and would like to host an Open vStorage session, contact us by email at

What is the big deal with Virtual Volumes, VMware?

June 30, 2014: mark the date, people. This is the day when VMware announced their public beta of Virtual Volumes. Virtual Volumes, or VVOL as VMware likes to call them, put a Virtual Machine and its disks, rather than a LUN, into the storage management spotlight. Through a specific API, vSphere APIs for Storage Awareness (VASA), your storage array becomes aware of Virtual Machines and their Virtual Disks. VASA allows certain Virtual Machine operations, such as snapshotting and cloning, to be offloaded to the (physical) storage array.

Now, what is the big deal with Virtual Volumes, VMware? From day one, Open vStorage has been designed to allow administrators to manage each disk of a Virtual Machine individually. We don’t call it Virtual Volumes but VM-centric, just like everyone else in storageland does. VMware, don’t get me wrong, I applaud that you are validating the VM-centric approach of software-defined storage solutions like Open vStorage. For over 4 years, the Open vStorage team has worked at creating a VM-centric storage solution which supports multiple hypervisors such as VMware ESXi and KVM but also many backends. It is nice to see that the view we had back then is now validated by a leader in the virtualization industry.

What confuses me a bit is that while the whole world is shifting storage functionality into software, you take the bold, opposite approach and push VM-centric functionality towards the hardware. This behavior is strange as everyone else is taking functionality out of the legacy storage arrays and is more and more treating storage as a bunch of disks managed by intelligent software. If I remember correctly, at VMworld 2013 you declared the storage array to be something of the past by announcing VSAN. The fact that storage arrays are, according to most people, past their expiry date was recently confirmed by another IT behemoth, Dell, by OEM-ing a well-known hyperconverged storage appliance.

As said before, Open vStorage has been designed with VM-centric functionality across hypervisor flavors in mind. This means that taking a snapshot or cloning a single Virtual Machine is as easy as clicking a button. Being a VM-centric solution doesn’t stop there. One of the most important features is replication on a per Virtual Machine basis. Before implementing this critical feature, the Open vStorage team had a lot of discussions about where the replication functionality should sit in the stack. We could have taken a shortcut and pushed the replication back to the storage backend (or storage array as VMware calls it). Swift and Ceph, for example, have replication as their middle name and can replicate data across multiple locations worldwide. But by moving the replication functionality towards the storage backend you lose your VM-awareness. Pushing functionality towards the storage array is not the solution; intelligent storage software is the only answer to a VM-centric future.

The Open vStorage future looks bright

Jason Ader, analyst at William Blair & Company, released a new research document: The State of Storage – a 2014 Update. In this detailed report, which is a must-read for everyone in the storage industry, he discusses the biggest short and long term headwinds for traditional storage vendors. Some of these headwinds are caused by newcomers in the storage industry. He estimates that these newcomers will grow at a rate of around 40%, which will cause the combined impact to reach almost 10% of industry sales in 2014. These headwinds cannot be ignored in terms of revenue or growth, so it is worthwhile to discuss them in more detail and explain where Open vStorage fits in.

  • Object-Based Storage gets a second wind as migration of data to the cloud (private or public) is on the rise. Use cases and applications adjusted for Object Stores are still limited but on the rise. With Open vStorage you can turn your Object Store (Swift, Cloudian, …) into a block device for Virtual Machines and turn it into a high performance, distributed, VM-centric storage platform.
  • Emergence of Flash-Centric Architectures: Open vStorage leverages flash inside the ESXi or KVM hosts to accelerate reads and writes. It brings together the benefits of storage local to the server and external scalable storage by leveraging SSDs and PCIe flash technology inside the server for acceleration.
  • Software-defined Storage (SDS) will, according to the report, “have the largest impact on the storage market in the long term as it will drive ASPs down and push toward the commoditization of specialized storage hardware platforms.” This trend can clearly be seen in the rise of open-source file systems such as GlusterFS and Ceph and VMware’s vSAN. Open vStorage is a software-defined storage layer which brings storage intelligence into software that runs closer to compute, with the ability to use multiple storage back ends in a uniform way.

Why both converged storage and external storage make sense

Today the peace in storageland has been disturbed by a post of Storage Swiss, The Problems With Server-Side Storage, Like VSAN. In this post they highlight the problems with VSAN. At the same time Coho Data, developer of flash-tuned scale-out storage appliances, released a post about why converged storage, such as VSAN, only seems to be useful in a niche area. You can read the full post here. VMware had to respond to these “attacks” and released a post countering the problems raised by Storage Swiss. As with every black-and-white story both sides have valid points, so we took the liberty to summarize these valid points. On top of that, we believe both sides can play a role in your storage needs, and that is why we believe Open vStorage, as the only solution in storage land that allows you to mix and match converged storage and external storage, is the answer.

Valid points for the converged stack:

  • VSAN and other converged software stacks typically can run on almost any hardware as they are software based.
  • A software-based solution can deliver the same reliability as a hardware-based solution at a fraction of the cost, and will become a commodity over time. Take as an example traditional storage redundancy and fault tolerance, which is typically implemented via dual controller hardware systems. Software-based techniques provide the same level of reliability at much lower cost and with much better scaling.
  • Converged stacks are VM-centric, treating a single volume as a LUN. This allows for flexibility and cost reduction. Test tiers can have limited or no protection, while important volumes might be replicated multiple times across the storage pool and run with the best performance.
  • The network in a converged stack is also less complex as only the hosts need to be connected. With external storage appliances you also need to take the dedicated storage network into account.
  • Easy to set up and manage, even for less experienced IT support. This can well be the case in a branch office.
  • Running storage as close as possible to the compute power makes sense. If not, we all would be using Amazon S3 to power our VM storage needs.

Valid points for the external storage stack:

  • External storage arrays take resource-intensive tasks upon themselves. They can, for example, dedicate their own processor and network resources to replication and recovery.
  • External storage arrays can be linked to different hypervisors, while converged infrastructure solutions are tightly linked to the hypervisor. For example, VSAN can only be used with the VMware ESXi hypervisor.
  • The storage network can be completely separated from the compute and outbound traffic.
  • Better usage of flash drives as they are shared between multiple hypervisors and writes don’t have to be mirrored to a second appliance.

The Open vStorage view:
Open vStorage believes there isn’t a ‘One Size Fits All’ solution to storage demands. Today, specific applications and functionalities demand specific storage. That is why Open vStorage is configured to work as a truly open solution and works with any storage backend. Open vStorage provides the flexibility of using an existing distributed file system, external storage appliances and even object stores. This allows for more flexibility but at the same time keeps the ability to protect existing investments.

We are firm believers in the approach taken by VMware vSAN, where software running on a cluster of machines provides you with a reliable storage pool and where the storage intelligence is moved closer to the compute layer. However, we believe this is too important a piece of the virtualization stack for a proprietary solution that is hypervisor specific, hardware specific, management stack specific or storage back-end specific. Our goal behind Open vStorage is not only to build something open and non-proprietary but something modular enough to allow developers and hardware manufacturers to innovate on top of Open vStorage. We believe that with contributors from the open source community Open vStorage can become far superior to proprietary offerings, whether these are hardware or software based.

To conclude, Open vStorage:

  • Leverages server based flash and storage back-ends to create shared virtual machine storage
  • Redundant, shared storage without the costs and scaling problems of dual controller systems
  • Works with any storage back-end, including filesystems, NAS, S3 compatible object storage, …
  • Works with KVM and VMware
  • Delivered under the Apache License, Version 2.0

Standing at the software defined crossroads…

It seems that the term “software defined…” is rapidly becoming the hottest topic across the IT landscape. Across networking, storage, security and compute, the idea of using commodity hardware mixed with open standards to break the monopoly of hardware suppliers will radically change the face of information technology over the next decade. Software defined fundamentally changes the economics of computing and is potentially both a blessing and a curse to the vested interests within the IT industry.

Let’s start with three clear evolutionary trends that have progressed over the last three decades. The performance of Intel-based x86 architectures has grown almost exponentially on a cycle of roughly 18 months. Alongside compute, storage capacity and density have grown at an almost exponential rate on a 24-month cycle. Both these critical IT elements have also become more energy efficient and physically smaller. In the background, the performance of internal data buses and external network connections has improved, with speeds heading towards 100 Gb/s over simple Ethernet.
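To get a feel for what such cycles add up to, here is a small back-of-the-envelope calculation in Python. It assumes that the 18 and 24 month cycles are read as doubling periods, which is an interpretation and not something the trends above state exactly; the function name `growth_factor` is made up for this example.

```python
def growth_factor(years: float, doubling_period_months: float) -> float:
    """Total growth after `years` if capacity doubles every period (assumed)."""
    doublings = years * 12.0 / doubling_period_months
    return 2.0 ** doublings


# Three decades of compute on an 18-month cycle: 20 doublings.
compute = growth_factor(30, 18)

# Three decades of storage on a 24-month cycle: 15 doublings.
storage = growth_factor(30, 24)

print(f"compute: x{compute:,.0f}, storage: x{storage:,.0f}")
```

Under that assumption, thirty years yields roughly a millionfold improvement in compute and a more than thirty-thousandfold improvement in storage, which is why general-purpose hardware could eventually absorb work that once needed custom silicon.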

In previous times, to gain a competitive edge, vendors would invest in custom ASICs to make up for deficiencies in the performance and features offered by Intel’s modest CPUs, or for limitations in storage performance in the days before low-cost SSD and flash. The rise of the custom ASIC provided the leaders across areas such as storage and networking with a clear and identifiable benefit over rivals, but at the cost of expensive R&D and manufacturing processes. For this edge, customers paid a price premium and were often locked into a technology path dependent on the whims of the vendor.

The seeds of change were sown with the arrival of the hypervisor. The proof point that a software layer could radically improve the utilisation and effectiveness of the humble server prompted a change of mind-set. Why not the storage layer or the network? Why are we forced to pay a premium for proprietary technologies and custom hardware? Why are we locked into these closed stacks even as Intel, AMD, ARM, Western Digital, Matsushita and others are spending tens of billions to develop faster “standards based” CPUs, larger hard disks and connectivity? Why indeed!

In a hardware-centric world, brand values like the old “never got fired for buying IBM” adage could make sense, but well-written software that can undergo rigorous testing breaks that dependency. It is still fair to say that the software defined revolution is at an early stage, but if you look at the frenzy of start-up acquisitions by heavyweights across networking and storage, it’s clear that the firms with the biggest vested interests are acutely aware that they need to either get on board or get out of the way.

There is one significant danger. Our software defined future needs adherence to open standards, or at least an emergent standard that becomes the de facto baseline for interoperability. We are at the crossroads of a major shift, and without global support for initiatives such as OpenStack there is a danger that software defined may fragment back into the vendor-dominated silos of the legacy IT era. We and other pioneers in the software defined movement need to resist the temptation to close the gates on open APIs and backward compatibility. We must treat the rise of software defined as a real chance to change the status quo for the benefit of both customers and an industry that thrives on true innovation.