The Storage and Networking DNA.

Today I want to discuss a less technical but more visionary assertion: the concept that storage and networking share the same DNA. This is especially the case for large-scale deployments. This insight surfaced as the rumor of Cisco buying NetApp flared up again. Allow me to explain why I believe exabyte storage clusters and large-scale networks have a lot in common.

The parallels in storage and networking:

The first feature both networks and exabyte storage share is that they are highly scalable. Both topologies typically start small and grow over time. More capacity can be added seamlessly by adding more hardware to the cluster. This allows for higher bandwidth, higher capacity and more users to be served.

Downtime is typically unacceptable for both, and SLAs that guarantee multi-nine availability are common. To achieve this level of availability both rely on hyper-meshed, shared-nothing architectures. These highly redundant architectures ensure that if one component fails, another component takes over. To illustrate, switches are typically deployed redundantly: a single server is connected to 2 independent switches, and if one switch fails the other one takes over. The same holds for storage, where data is also stored redundantly. This can be achieved with replication or erasure coding across multiple disks and servers. If a disk or server fails, the data can still be retrieved from other disks and servers in the storage cluster.
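
To make the erasure coding idea a bit more tangible, here is a minimal Python sketch that uses a single XOR parity fragment, the simplest possible scheme, which survives the loss of exactly one fragment. Real storage clusters typically use stronger codes that tolerate several simultaneous failures. The function names are purely illustrative and are not taken from any Open vStorage API.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments (zero-padded) plus one XOR parity fragment."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing fragment (marked None) from the survivors."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost fragment")
    if missing:
        survivors = [f for f in fragments if f is not None]
        fragments[missing[0]] = reduce(xor_bytes, survivors)
    return fragments
```

Think of each fragment as living on a different disk: set any one entry of the list returned by encode() to None to simulate a failed disk, and recover() rebuilds it by XOR-ing the remaining fragments.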

These days you can check your Facebook timeline or Twitter account from almost anywhere in the world. Large-scale networks make this possible: a global network interlinks many smaller networks. The same holds for storage, as we are moving to a world where data is stored in geographically dispersed places and even in different clouds.

With new technologies like Software-Defined Networking (SDN), network management has moved towards a single point of governance. The physical network can be configured at a high level while the detailed network topology is pushed down to the physical and virtual devices that make up the network. The same trend is happening in the storage industry with Software-Defined Storage (SDS). These software applications allow you to configure and manage the physical hardware in the storage cluster, across multiple devices, sites and even different clouds, through a single high-level management view.

A last point I’d like to touch on is that in networking the hardware brands and models hardly matter, as they can all work together thanks to network standards. The same goes for storage hardware. Different brands of disks, controllers and servers can all be used to build an exabyte storage cluster. Users of the network are not aware of the exact topology of the network (brands, links, routing, …). The same holds for storage. Users shouldn’t need to know on which disk their data is stored exactly; the only thing they care about is that they get the right data on time when needed and that it is safely stored.

Open vStorage, taking the network analogy to the next step

Let’s have a look at the components of a typical network. On the left we have the consumer of the network, in this case a server. This server is physically connected to the network through a Network Interface Controller (NIC). A NIC driver provides the necessary interfaces for the application on the server to use the network. Data which is sent down the network traverses the TCP/IP stack down to the NIC, where it is converted into individual packets. Within the network various components each play a specific role. A VPN provides encrypted tunnels, WAN accelerators provide caching and compression features, DNS services store the hierarchy of the network and switches/routers route and forward the packets to the right destination. The core routers form the backbone of the network and connect multiple data centers and clouds.

Each of the above network components can be mapped to an equivalent in the Open vStorage architecture. The Open vStorage Edge offers a block interface to the applications (Hypervisors, Docker, …) on the server. Just like the TCP/IP stack converts the data into network packets, the Volume Driver converts the data received through the block interface into objects (Storage Container Objects, or SCOs). Next we have the proxy, which takes on many roles: it encrypts the data for security, provides compression and, after chopping the SCOs into fragments, routes them to the right backend. For reads the proxy also plays an important caching role by fetching the data from the correct cache backend. Lastly we have Arakoon, our own distributed key-value store, which stores the metadata of all data in the backend of the storage cluster. A backend consists of SSDs and HDDs in JBODs or traditional x86 servers. There can of course be multiple backends and they can even be spread across multiple data centers.
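
The write and read path described above can be mimicked with a toy Python pipeline. This is only a sketch under loud assumptions: the sizes, the round-robin placement, the function names and the plain dict standing in for the metadata store are all hypothetical, encryption and caching are left out, and the real Volume Driver and proxy are far more involved.

```python
import zlib

SCO_SIZE = 16       # bytes per SCO; tiny, illustrative value
FRAGMENT_SIZE = 8   # bytes per fragment; also illustrative


def pack_into_scos(writes):
    """Volume Driver role: concatenate block writes and cut the stream into SCOs."""
    stream = b"".join(writes)
    return [stream[i:i + SCO_SIZE] for i in range(0, len(stream), SCO_SIZE)]


def store_sco(sco_id, sco, backends, metadata):
    """Proxy role: compress an SCO, chop it into fragments, spread them over backends."""
    payload = zlib.compress(sco)
    locations = []
    for n, start in enumerate(range(0, len(payload), FRAGMENT_SIZE)):
        backend = n % len(backends)      # naive round-robin placement (assumption)
        key = (sco_id, n)
        backends[backend][key] = payload[start:start + FRAGMENT_SIZE]
        locations.append((backend, key))
    metadata[sco_id] = locations         # dict standing in for the metadata store


def load_sco(sco_id, backends, metadata):
    """Read path: look up the fragment locations, reassemble and decompress."""
    payload = b"".join(backends[b][key] for b, key in metadata[sco_id])
    return zlib.decompress(payload)
```

With three empty dicts as backends and an empty dict as metadata, writes packed by pack_into_scos() and stored with store_sco() come back unchanged from load_sco(), which mirrors how a packet that is split up in the network is reassembled at its destination.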


When reading the first paragraph of this blog post it might have crossed your mind that I was crazy. I do hope that after reading through the whole post you realize that networking and storage have a lot in common. As a Product Manager I keep the path that networking has already covered in mind when thinking about the future of storage. How do you see the future of storage? Let me know!

Open vStorage 2.1

It is with great pleasure that I introduce Open vStorage 2.1. Yes, we went straight from version 1.6 to 2.1. We had so many interesting features to add that we just couldn’t call it 2.0.

It is important to know that Open vStorage now comes in 2 flavors: a free, unrestricted version and a free, restricted community version which includes our own new Open vStorage Backend and allows you to run Open vStorage as a hyperconverged solution. At the moment both versions only feature community support. The unrestricted version is open-source and allows you to add almost any S3 compatible backend (Ceph, Swift, Cloudian, …). The community version is the restricted version of our future paid product, which includes support. A paid Open vStorage version will be released in June. In case you want to run Open vStorage hyperconverged out of the box, you will need the Open vStorage Backend, which is highly optimized to be used with Open vStorage.

So what is new in 2.1 compared to 1.6:

  • Run Open vStorage as a hyperconverged solution: you can now use local SATA disks inside the host as (cold) storage backend for data coming out of the write cache. Open vStorage is now hyperconverged and supports hot-swap disks. For our free community edition you can go up to 4 hosts, 16 disks and 49 vDisks. Currently only a limited set of RAID controllers is supported (LSI). In case you want to use Open vStorage in combination with the Seagate Kinetic drives, the Open vStorage Backend will also be required (future version).
  • Flexible cache layout: the Open vStorage setup is now more flexible and allows you to identify multiple SSDs as read cache devices. During the setup you can also indicate which device you want to select as write cache. When you create a vPool this will be taken into account when presenting default values.
  • Improved supportability: you now have the option to send heartbeats to our datacenter and if necessary open a VPN connection so we can offer remote help. There is also an option to download all logs straight from the GUI with a single mouse click.
  • New metadata server: when a volume was moved from one host to another, you typically had a few seconds up to a minute of downtime as the metadata had to be rebuilt on the new host. We now have a metadata server topology which supports a master/slave concept. In case the volume is moved and the master server is no longer accessible you can contact the slave metadata server. This means that the downtime will be only a few milliseconds. (Some more info
  • Performance improvements: we now allow more outstanding data in the write cache before the data ingest coming from the VM will be limited.
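
The master/slave idea behind the new metadata server can be illustrated with a short Python sketch. Everything in it is hypothetical (the class names, the synchronous copy to the slave, the in-memory dicts); it only shows why failing over to a slave that already holds the metadata is so much faster than rebuilding that metadata on the new host.

```python
class MetadataServer:
    """Hypothetical in-memory stand-in for a metadata server."""

    def __init__(self, name):
        self.name = name
        self.reachable = True
        self.entries = {}

    def get(self, key):
        if not self.reachable:
            raise ConnectionError(f"{self.name} is unreachable")
        return self.entries[key]


class ReplicatedMetadata:
    """Keeps a master and a slave in sync; reads fail over to the slave."""

    def __init__(self, master, slave):
        self.master = master
        self.slave = slave

    def put(self, key, value):
        self.master.entries[key] = value
        self.slave.entries[key] = value  # synchronous copy (an assumption)

    def get(self, key):
        try:
            return self.master.get(key)
        except ConnectionError:
            # Master is gone: swap roles instead of rebuilding the metadata.
            self.master, self.slave = self.slave, self.master
            return self.master.get(key)
```

Marking the master as unreachable and calling get() again returns the same value from the former slave, with no rebuild step in between.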

In case you have questions, feel free to create a post in the Support Forum.