The different ALBA Backends explained

With the latest release of Open vStorage, Fargo, the backend implementation received a complete revamp to better support the geoscale functionality. In a geoscale cluster, the data is spread over multiple datacenters. If one of the datacenters goes offline, the geoscale cluster stays up and running and continues to serve data.

The geoscale functionality is based upon 2 concepts: Backends and vPools. These are probably the 2 most important concepts of the Open vStorage architecture. Allow me to explain in detail what the difference is between a vPool and a Backend.

Backend

A backend is a collection of physical disks, devices or even other backends. Next to grouping disks or backends, it also defines how data is stored on its constituents: parameters such as the erasure coding/replication factor, compression and encryption need to be defined. Ordinarily a geoscale cluster will have multiple backends. While Eugene, the predecessor release of Fargo, only had 1 type of backend, there are now 2 types: a local and a global backend (a rough sketch of both follows the list below).

  • A local backend allows you to group physical devices. This type is typically used to group disks within the same datacenter.
  • A global backend allows you to combine multiple (local) backends into a single (global) backend. This type of backend typically spans multiple datacenters.
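
To make these parameters a bit more tangible, below is a minimal, purely illustrative Python sketch that models a local and a global backend as plain dictionaries. The field names (erasure coding scheme, compression, encryption) mirror the concepts described above but are not the actual ALBA configuration keys.

    # Purely illustrative: hypothetical field names, not the real ALBA configuration schema.
    ssd_backend_dc1 = {
        "name": "ssd-datacenter1",
        "type": "local",                                  # groups physical devices in one datacenter
        "devices": ["/dev/sdb", "/dev/sdc", "/dev/sdd"],
        "preset": {
            "erasure_coding": {"data": 4, "parity": 2},   # or plain replication instead
            "compression": "snappy",
            "encryption": "aes-cbc-256",
        },
    }

    global_backend = {
        "name": "geo-capacity",
        "type": "global",                                 # combines (local) backends across datacenters
        "backends": ["hdd-datacenter1", "hdd-datacenter2", "hdd-datacenter3"],
        "preset": {
            "erasure_coding": {"data": 2, "parity": 1},   # fragments spread over the 3 local backends
            "compression": "none",
            "encryption": "none",
        },
    }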

Backends in practice

In each datacenter of an Open vStorage cluster there are multiple local backends. A typical segregation happens based upon the performance of the devices in the datacenter. An SSD backend will be created with fast, low-latency devices and an HDD backend will be created with slow(er) devices which are optimised for capacity. In some cases the SSD or HDD backend will be split into multiple backends if it contains many devices, for example by selecting every x-th disk of a node (a small sketch of this split follows below). This approach limits the impact of a node failure on a backend.
Note that there is no restriction for a local backend to only use disks within the same datacenter. It is perfectly possible to select disks from different datacenters and add them to the same backend. This doesn’t make sense for an SSD backend, of course, as the latency between the datacenters will be a performance-limiting factor.
Another reason to create multiple backends is if you want to offer each customer their own set of physical disks for security or compliance reasons. In that case a backend is created per customer.
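
As a rough illustration of the “every x-th disk of a node” idea, the following Python sketch splits the disks of a few nodes round-robin over two backends; the node and disk names are made up for the example.

    # Minimal sketch: assign every x-th disk of each node to a different backend,
    # so a single node failure only takes out a fraction of each backend's disks.
    nodes = {
        "node1": ["sda", "sdb", "sdc", "sdd"],
        "node2": ["sda", "sdb", "sdc", "sdd"],
        "node3": ["sda", "sdb", "sdc", "sdd"],
    }

    num_backends = 2
    backends = {i: [] for i in range(num_backends)}

    for node, disks in nodes.items():
        for index, disk in enumerate(disks):
            backends[index % num_backends].append("%s:%s" % (node, disk))

    for backend_id, members in backends.items():
        print("hdd-backend-%d -> %s" % (backend_id, ", ".join(members)))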

vPool

A vPool is a configuration template for vDisks, the volumes being served by Open vStorage. This template contains a whole range of parameters, such as the block size to be used, the SCO size on the backend, the default write buffer size, the preset to be used for data protection, the hosts on which the volume can live, the backend where the data needs to be stored and whether data needs to be cached. The last two are particularly interesting as they express how different ALBA backends are tied together.

When you create a vPool you select a backend to store the volume data. This can be a local backend (an SSD backend for an all-flash experience) or a global backend in case you want to spread data over multiple datacenters. This backend is used by every Storage Router serving the vPool. If you use a global backend across multiple datacenters, you will want to use some sort of caching in the local datacenter where the volume is running in order to keep the read latency as low as possible. To achieve this, you assign a local SSD backend when extending the vPool to a certain Storage Router. All volumes served by that Storage Router will, on a read, first check whether the requested data is in the SSD backend. This means that Storage Routers in different datacenters will use different cache backends. This approach keeps hot data in the local SSD cache and stores cold data on the capacity backend which is distributed across datacenters. As a result, Open vStorage can offer stunning performance while distributing the data across multiple datacenters for safety.
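
The sketch below condenses that description into a hypothetical vPool template in Python. The parameter names echo the ones listed above (block size, SCO size, write buffer, preset, backends, cache) but are not the literal Open vStorage configuration keys.

    # Hypothetical vPool template: one capacity backend shared by all Storage Routers,
    # plus a per-Storage Router (per-datacenter) SSD backend used as read cache.
    vpool = {
        "name": "geo-vpool",
        "block_size": 4096,                   # bytes
        "sco_size": 64 * 1024 * 1024,         # size of a Storage Container Object on the backend
        "write_buffer": 512 * 1024 * 1024,    # default write buffer size
        "preset": "ec-2-1",                   # data protection preset on the capacity backend
        "backend": "geo-capacity",            # global backend spread over the datacenters
        "cache_backend_per_storage_router": {
            "storagerouter-dc1": "ssd-datacenter1",   # reads check the local SSD backend first
            "storagerouter-dc2": "ssd-datacenter2",
        },
    }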

A final note

To summarise, an Open vStorage cluster can have multiple and different ALBA backends: local vs. global backends, SSD and HDD backends. vPools, a grouping of vDisks which share the same config, are the glue between these different backends.

Open vStorage 2.2 alpha 4

We released Open vStorage 2.2 Alpha 4, which contains the following bugfixes:

  • Update of the About section under Administration.
  • Open vStorage Backend detail page hangs in some cases.
  • Various bugfixes for the case where a vPool is added with a name that was previously used.
  • Hardening the vPool removal.
  • Fix daily scrubbing not running.
  • No log output from the scrubber.
  • Failing to create a vDisk from a snapshot tries to delete the snapshot.
  • ALBA discovery starts spinning if network is not available.
  • ASD is no longer used by the proxy even after it has been requalified.
  • Type checking through Descriptor doesn’t work consistently.

Open vStorage 2.2 alpha 3

Today we released Open vStorage 2.2 alpha 3. The only new features are on the Open vStorage Backend (ALBA) front:

  • Metadata is now stored with a higher protection level.
  • The protocol of the ASD is now more flexible in the light of future changes.

Bugfixes:

  • Make it mandatory to configure both the read and write cache during the ovs setup partitioner step.
  • During add_vpool on devstack, the cinder.conf is updated with notification_driver which is incorrectly set as “nova.openstack.common.notifier.rpc_notifier” for Juno.
  • Added support for more physical disk configuration layouts.
  • ClusterNotReachableException during vPool changes.
  • Cannot extend vPool with volumes running.
  • Update button clickable when an update is ongoing.
  • Already configured storage nodes are now removed from the discovered ones.
  • Fix for ASDs which don’t start.
  • Issue where a slow long-running task could fail because of a timeout.
  • Message delivery from albamgr to nsm_host can get stuck.
  • Fix for ALBA reporting that a namespace doesn’t exist while it does.

vDisks, vMachines, vPools and Backends: how does it all fit together

With the latest version of Open vStorage, we released the option to use physical SATA disks as the storage backend for Open vStorage. These disks can be inside the hypervisor host (hyper-converged) or in a storage server, an x86 server with SATA disks*. Together with this functionality we introduced some new terminology, so we thought it would be a good idea to give an overview of how it all fits together.

vPool - Backend

Let’s start from the bottom, the physical layer, and work up to the virtual layer. For the sake of simplicity we assume a host has one or more SATA drives. With the hyper-converged version (openvstorage-hc) you can unlock functionality to manage these drives and assign them to a Backend. Basically, a Backend is nothing more than a collection of physical disks grouped together. These disks can even be spread across multiple hosts. The implementation even gives you the freedom to assign some physical disks in a host to one Backend and the other disks to a second Backend. The only limitation is that a disk can only be assigned to a single Backend at a time. The Backend concept also allows you to separate customers even at the physical disk level by assigning each of them their own set of hard drives.
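
To illustrate the “one disk, one Backend” rule, here is a small toy sketch in Python that refuses to claim a disk which is already assigned; the function, host and Backend names are invented for the example and do not correspond to the Open vStorage API.

    # Toy model of claiming disks for Backends; a disk may belong to at most one Backend.
    claimed = {}   # disk -> name of the Backend it belongs to

    def claim_disk(disk, backend):
        if disk in claimed:
            raise ValueError("%s already belongs to Backend %s" % (disk, claimed[disk]))
        claimed[disk] = backend

    claim_disk("host1:/dev/sdb", "backend-customer-a")
    claim_disk("host1:/dev/sdc", "backend-customer-b")       # same host, different Backend: allowed

    try:
        claim_disk("host1:/dev/sdb", "backend-customer-b")   # second claim of the same disk
    except ValueError as error:
        print(error)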

On top of a Backend you create one or more vPools. You could compare this with creating a LUN on a traditional SAN. The split between a Backend and a vPool gives you additional flexibility (a rough sketch follows the list below):

  • You can assign a vPool per customer or per department. This means you can for example bill per used GB.
  • You can set a different encryption passphrase per vPool.
  • You can enable or disable compression per vPool.
  • You can set the replication factor per vPool. For example, a “test” vPool could store only a single copy of the Storage Container Objects (SCOs), while for production servers you can configure 3-way replication.
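
As a rough sketch of that flexibility, the hypothetical Python snippet below describes two vPools on top of the same Backend with different settings; the key names are illustrative only and not the real configuration parameters.

    # Illustrative per-vPool settings on top of a single Backend.
    vpools = {
        "test": {
            "backend": "backend-sata",
            "compression": True,
            "encryption_passphrase": None,
            "sco_copies": 1,                  # a single copy of each SCO is enough for test data
        },
        "production": {
            "backend": "backend-sata",
            "compression": True,
            "encryption_passphrase": "per-customer-secret",
            "sco_copies": 3,                  # 3-way replication for production servers
        },
    }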

On top of the vPool, configured as a Datastore in VMware or a mountpoint in KVM, you can create vMachines. Each vMachine can have multiple vDisks (virtual disks). For now vmdk and raw files are supported.

*A quick how-to for converting a server with only SATA drives into a storage server you can use as a Backend:
On the hypervisor host:

Install Open vStorage (apt-get install openvstorage-hc) as explained in the documentation and run the ovs setup command.

On the storage server:

  • The storage server must have an OS disk and at least 3 empty disks for the storage backend.
  • Set up the storage server with Ubuntu 14.04.2 LTS. Make sure the storage server is in the same network as the compute host.
  • Execute the following to set up the program that will manage the disks of the storage server:
    echo "deb http://apt-ovs.cloudfounders.com beta/" > /etc/apt/sources.list.d/ovsaptrepo.list
    apt-get update
    apt-get install openvstorage-sdm
  • Retrieve the automatically generated password from the config (a scripted alternative is sketched after this list):
    cat /opt/alba-asdmanager/config/config.json
    ...
    "password": "u9Q4pQ76e0VJVgxm6hasdfasdfdd",
    ...
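
If you prefer not to read the value by eye, a small Python snippet along these lines can extract the password from the JSON shown above (assuming the file layout stays the same):

    # Prints the generated password from the asd-manager config file shown above.
    import json

    with open("/opt/alba-asdmanager/config/config.json") as fd:
        print(json.load(fd)["password"])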

On the hypervisor host:

  • Login and go to Backends.
  • Click the add Backend button, specify a name and select Open vStorage Backend as type. Click Finish and wait until the status becomes green.
  • In the Backend details page of the freshly created Backend, click the Discover button. Wait until the storage server pops up and click on the + icon to add it. When asked for credentials use root as login and the password retrieved above.
  • Next follow the standard procedure to claim the disks of the storage server and add them to the Backend.

Open vStorage 1.0.3 Beta

We couldn’t wait until the end of April before pushing out our next release. That is why the Open vStorage Team decided to release a small version today (April 17 2014) with the following set of features:

  • Remove a vPool through the GUI: in version 1.0.2 it was already possible to create a vPool and extend the vPool across Hosts through the GUI. To make the whole vPool management process complete and more user friendly, we have added the option to remove a vPool from a Host. This is of course only possible in case there are no vDisks being served from the vPool to a vMachine on that Host.
  • New version of Arakoon: Arakoon 1.7, the distributed key-value store integrated in Open vStorage that guarantees consistency above anything else, had some major rework done. This release also fixes bugs we encountered.
  • Some small improvements include adding an NTP server to the Host by default and displaying the ports in use on the VSA detail page.

This version also includes bug fixes:

  • Issue with connecting to SAIO via S3.
  • Footer displaying incorrect values.
  • Update the sshd configuration to prevent the client from sending its own LOCALE settings.
  • Shutting down first node in 4 node setup caused ovs_consumer_volumerouter to halt on node 2.
  • KVM clones are now no longer automatically started.
  • Open vStorage GUI sometimes didn’t list a second VM.
  • Cloning from a vTemplate on KVM does not work in the absence of the “default” network on Archipel.
  • Restrict vTemplate deletion from the Open vStorage GUI when a child vMachine is present.
  • Adding a vPool without copying the ceph.* files now gives an error.

To give this version a try, download the software and install with Quality Level Test.