The different ALBA Backends explained

With the latest release of Open vStorage, Fargo, the backend implementation received a complete revamp in order to better support the geoscale functionality. In a geoscale cluster, the data is spread over multiple datacenters. If one of the datacenters goes offline, the geoscale cluster stays up and running and continues to serve data.

The geoscale functionality is based upon 2 concepts: Backends and vPools. These are probably the 2 most important concepts of the Open vStorage architecture. Allow me to explain in detail what the difference is between a vPool and a Backend.

Backend

A backend is a collection of physical disks, devices or even other backends. Besides grouping disks or backends, it also defines how data is stored on its constituents: parameters such as the erasure coding or replication factor, compression and encryption need to be defined. Ordinarily a geoscale cluster will have multiple backends. While Eugene, the predecessor release of Fargo, only had 1 type of backend, there are now 2 types: a local and a global backend.

  • A local backend groups physical devices. This type is typically used to group disks within the same datacenter.
  • A global backend combines multiple (local) backends into a single (global) backend. This type of backend typically spans multiple datacenters.
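
The relation between the two backend types can be pictured with a small sketch. This is only an illustration of the concept, not Open vStorage code: the class names `LocalBackend` and `GlobalBackend`, the helper `all_devices` and all device names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LocalBackend:
    """Hypothetical model: groups physical devices, typically in one datacenter."""
    name: str
    devices: List[str] = field(default_factory=list)


@dataclass
class GlobalBackend:
    """Hypothetical model: combines local backends, typically across datacenters."""
    name: str
    members: List[LocalBackend] = field(default_factory=list)


def all_devices(backend: GlobalBackend) -> List[str]:
    """List every physical device that sits underneath a global backend."""
    devices = []
    for member in backend.members:
        devices.extend(member.devices)
    return devices


# Two local HDD backends, one per datacenter (made-up names).
dc1_hdd = LocalBackend("dc1-hdd", ["dc1-node1-sda", "dc1-node2-sda"])
dc2_hdd = LocalBackend("dc2-hdd", ["dc2-node1-sda"])

# One global backend that spans both datacenters.
geo_hdd = GlobalBackend("geo-hdd", [dc1_hdd, dc2_hdd])
print(all_devices(geo_hdd))
```

The global backend owns no devices itself in this sketch; it only refers to the local backends that do.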

Backends in practice

In each datacenter of an Open vStorage cluster there are multiple local backends. A typical segregation is based upon the performance of the devices in the datacenter. An SSD backend will be created with devices which are fast and low latency, and an HDD backend will be created with slow(er) devices which are optimised for capacity. In some cases the SSD or HDD backend will be split into more backends if it contains many devices, for example by selecting every x-th disk of a node. This approach limits the impact of a node failure on a backend.
Note that there is no restriction for a local backend to only use disks within the same datacenter. It is perfectly possible to select disks from different datacenters and add them to the same backend. Of course this doesn’t make sense for an SSD backend, as the latency between the datacenters will be a performance limiting factor.
Another reason to create multiple backends is if you want to offer each customer their own set of physical disks for security or compliance reasons. In that case a backend is created per customer.
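
Splitting a large backend by taking every x-th disk of a node can be sketched as a simple round-robin assignment. This is a minimal illustration under assumptions: the function name `split_every_xth`, the node and disk names and the exact assignment rule are hypothetical and are not taken from the Open vStorage code.

```python
from collections import defaultdict


def split_every_xth(disks_per_node, backend_count):
    """Assign disk number i of every node to backend i % backend_count."""
    backends = defaultdict(list)
    for node, disks in sorted(disks_per_node.items()):
        for index, disk in enumerate(disks):
            backends[index % backend_count].append((node, disk))
    return dict(backends)


# Made-up cluster: two nodes with four disks each, split over two backends.
nodes = {
    "node1": ["sda", "sdb", "sdc", "sdd"],
    "node2": ["sda", "sdb", "sdc", "sdd"],
}
layout = split_every_xth(nodes, backend_count=2)

for backend_id, members in sorted(layout.items()):
    print(backend_id, members)
```

Because every backend receives disks from every node, a single node failure removes only a share of each backend's disks instead of taking out one backend completely.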

vPool

A vPool is a configuration template for vDisks, the volumes served by Open vStorage. This template contains a whole range of parameters, such as the block size to be used, the SCO size on the backend, the default write buffer size, the preset to be used for data protection, the hosts on which the volume can live, the backend where the data needs to be stored and whether data needs to be cached. These last 2 are particularly interesting as they express how different ALBA backends are tied together.

When you create a vPool, you select a backend to store the volume data. This can be a local backend, for example an SSD backend for an all-flash experience, or a global backend in case you want to spread data over multiple datacenters. This backend is used by every Storage Router serving the vPool.

If you use a global backend across multiple datacenters, you will want some sort of caching in the local datacenter where the volume is running, in order to keep the read latency as low as possible. To achieve this, assign a local SSD backend when extending a vPool to a certain Storage Router. On a read, all volumes served by that Storage Router will first check whether the requested data is in the SSD backend. This means that Storage Routers in different datacenters will use a different cache backend. This approach keeps hot data in the local SSD cache and stores cold data on the capacity backend, which is distributed across datacenters. In this way Open vStorage can offer stunning performance while distributing the data across multiple datacenters for safety.
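
The read path described here, local SSD cache first and the distributed capacity backend second, can be sketched as follows. This is a simplified model and not the actual Storage Router implementation: the class `StorageRouterSketch` is hypothetical, plain dictionaries stand in for the backends, and filling the cache after a miss is an assumption made for the example.

```python
class StorageRouterSketch:
    """Hypothetical model of the read path of one Storage Router."""

    def __init__(self, cache_backend, capacity_backend):
        self.cache_backend = cache_backend        # local SSD backend (dict stand-in)
        self.capacity_backend = capacity_backend  # shared global backend (dict stand-in)

    def read(self, key):
        """Return (data, source), checking the local cache backend first."""
        if key in self.cache_backend:
            return self.cache_backend[key], "cache"
        data = self.capacity_backend[key]
        # Assumption for this sketch: data read from capacity is cached locally.
        self.cache_backend[key] = data
        return data, "capacity"


# One global capacity backend shared by Storage Routers in two datacenters,
# each with its own local cache backend.
global_backend = {"block-1": b"cold data"}
router_dc1 = StorageRouterSketch(cache_backend={}, capacity_backend=global_backend)
router_dc2 = StorageRouterSketch(cache_backend={}, capacity_backend=global_backend)

first_read = router_dc1.read("block-1")
second_read = router_dc1.read("block-1")
other_dc_read = router_dc2.read("block-1")
print(first_read[1], second_read[1], other_dc_read[1])
```

The second read in the first datacenter is served from its local cache, while the Storage Router in the other datacenter still has to go to the capacity backend because its cache is separate.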

A final note

To summarise, an Open vStorage cluster can have multiple and different ALBA backends: local vs. global backends, SSD and HDD backends. vPools, a grouping of vDisks which share the same config, are the glue between these different backends.

Seagate Kinetic Open Storage Project Plugfest

Open vStorage was invited to host a session during the Seagate Kinetic plugfest on Tuesday, September 20 to demo and discuss advances in Ethernet-connected storage. Kinetic is a drive architecture in which the drive is a key/value server with Ethernet connectivity. With Open vStorage we have created the ALBA ASD software, which mimics this key/value behaviour for normal SATA drives. Kinetic drives can of course also be used as an archiving backend for an Open vStorage cluster.

Read more about the Kinetic Open Storage Project here.

I like to move it, move it

The vibe at the Open vStorage office these days is best explained by a song from the early nineties:

I like to move it, move it ~ Reel 2 Reel

While summer is a quieter time at most companies, the Open vStorage office is buzzing like a beehive. Allow me to give you a short overview of what is happening:

  • We are moving into our new, larger and stylish offices. The address remains the same but we are moving into a completely remodeled floor of the Idola business center.
  • Next to physically moving desks at the Open vStorage HQ, we are also moving our code from BitBucket to GitHub. We have centralized all our code under https://github.com/openvstorage. To list a few of the projects: Arakoon (our consistent distributed key-value store), ALBA (the Open vStorage default ALternate BAckend) and of course Open vStorage itself. Go check it out!
  • Finishing up our Open vStorage 2.2 GA release.
  • Adding support for Red Hat and CentOS by merging in the Cent-OS branch. There is still some work to do around packaging, testing and upgrades, so feel free to lend a hand. As this was really a community effort, we owe everyone a big thank you.
  • Working on some very cool features (RDMA anyone?) but let’s keep those for a separate post.
  • Preparing for VMworld (San Francisco) and the OpenStack summit in Tokyo.

As you can see, many things are going on at once, so prepare for a hot Open vStorage fall!

Open vStorage 2.2 alpha 4

We released Open vStorage 2.2 Alpha 4, which contains the following bugfixes:

  • Update of the About section under Administration.
  • Open vStorage Backend detail page hangs in some cases.
  • Various bugfixes for the use case when adding a vPool with a vPool name which was previously used.
  • Hardening the vPool removal.
  • Fix daily scrubbing not running.
  • No log output from the scrubber.
  • Failing to create a vDisk from a snapshot tries to delete the snapshot.
  • ALBA discovery starts spinning if network is not available.
  • ASD is no longer used by the proxy even after it has been requalified.
  • Type checking through Descriptor doesn’t work consistently.

Open vStorage 2.2 alpha 3

Today we released Open vStorage 2.2 alpha 3. The only new features are on the Open vStorage Backend (ALBA) front:

  • Metadata is now stored with a higher protection level.
  • The protocol of the ASD is now more flexible in the light of future changes.

Bugfixes:

  • Make it mandatory to configure both read- and writecache during the ovs setup partitioner.
  • During add_vpool on devstack, the cinder.conf is updated with notification_driver which is incorrectly set as “nova.openstack.common.notifier.rpc_notifier” for Juno.
  • Added support for more physical disk configuration layouts.
  • ClusterNotReachableException during vPool changes.
  • Cannot extend vPool with volumes running.
  • Update button clickable when an update is ongoing.
  • Already configured storage nodes are now removed from the discovered ones.
  • Fix for ASDs which don’t start.
  • Issue where a slow long-running task could fail because of a timeout.
  • Message delivery from albamgr to nsm_host can get stuck.
  • Fix for the “ALBA Namespace doesn’t exist” error being reported while the namespace exists.