Domains and Recovery Domains

In the Fargo release we introduced a new concept: Domains. In this blog post you will find a description of what exactly Domains are, and why and how you should configure them.

A Domain is a logical grouping of Storage Routers. You can compare a Domain to an availability zone in OpenStack or a region in AWS. A Domain typically groups Storage Routers which can fail for a common reason, e.g. because they are on the same power feed or in the same datacenter.

Open vStorage can survive a node failure without any data loss for the VMs on that node. Even data in the write buffer which isn’t on the backend yet is safeguarded on another node by the Distributed Transaction Log (DTL). The key element in having no data loss is that the node running the volume and the node running the DTL should not be down at the same time. To limit the risk of both being down at the same time, you should make sure the DTL is on a node which is not in the same rack or on the same power feed. Open vStorage can of course not detect which servers are in the same rack, so it is up to the user to define the different Domains and assign Storage Routers to them.
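The placement rule above can be sketched as a small selection routine. The function and data structures below are purely illustrative (this is not the actual Open vStorage scheduler): a valid DTL target is simply any Storage Router outside the volume's own Domain.

```python
# Hypothetical sketch: pick a DTL target so that the node running the
# volume and the node running the DTL never share a failure domain.
# Names and data structures are illustrative, not Open vStorage code.

def pick_dtl_host(volume_host, storage_routers, domain_of):
    """Return a Storage Router in a different Domain than volume_host."""
    volume_domain = domain_of[volume_host]
    candidates = [sr for sr in storage_routers
                  if sr != volume_host and domain_of[sr] != volume_domain]
    if not candidates:
        raise RuntimeError("No Storage Router outside the volume's Domain")
    return candidates[0]

# Example: sr1 and sr2 share a rack, sr3 is in another rack.
domain_of = {"sr1": "rack-A", "sr2": "rack-A", "sr3": "rack-B"}
target = pick_dtl_host("sr1", ["sr1", "sr2", "sr3"], domain_of)  # "sr3"
```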

As a first step, create the different Domains in the Administration section (Administration > Domains). You are free to select how you want to group the Storage Routers; a few possible examples are per rack, per power feed or even per datacenter. In the below example we have grouped the Storage Routers per datacenter.


Next, go to the detail page of each Storage Router and click the edit button.

(Screenshot: Storage Router detail page)

Select the Domain where the actual volumes are hosted, and optionally select a Recovery Domain. If the Recovery Domain is empty, the DTL will be located in the Domain of the Storage Router. If a Recovery Domain is selected, it will host the DTL for the volumes being served by that Storage Router. Note that you can only assign a Domain as Recovery Domain if at least one Storage Router uses it as its Domain. To make sure the latency of the DTL doesn’t become a bottleneck for the write IO, it is strongly advised to have a low-latency network between the Storage Routers in the Domain and those in the Recovery Domain.
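The fallback rule described above boils down to a one-line decision; the sketch below is a hypothetical illustration of that rule, not the real implementation.

```python
# Illustrative rule: the DTL lives in the Recovery Domain when one is
# configured, otherwise in the Storage Router's own Domain.

def dtl_target_domain(domain, recovery_domain=None):
    """Return the Domain that should host the DTL for a Storage Router."""
    return recovery_domain if recovery_domain is not None else domain

dtl_target_domain("datacenter-1")                   # "datacenter-1"
dtl_target_domain("datacenter-1", "datacenter-2")   # "datacenter-2"
```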

Another area where Domains play a role is the location of the MetaDataServer (MDS). The master and a slave MDS will always be located in the Domain of the Storage Router.
In case you configure a Recovery Domain, an MDS slave will also be located on one of the hosts of the Recovery Domain. This additional slave makes sure only a limited metadata rebuild is necessary to bring the volume live.
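The MDS placement rule can be sketched in the same hypothetical style as above (again, illustrative names only, not the actual code): master and slave in the Storage Router's Domain, plus an extra slave in the Recovery Domain when one is configured.

```python
# Hedged sketch of the MDS placement rule: master and one slave in the
# volume's own Domain, an additional slave in the Recovery Domain.
# Function and structures are hypothetical illustrations.

def plan_mds_placement(volume_sr, domain_of, routers, recovery_domain=None):
    """Return an MDS placement plan for a volume served by volume_sr."""
    home = domain_of[volume_sr]
    in_home = [sr for sr in routers if domain_of[sr] == home]
    slaves = [sr for sr in in_home if sr != volume_sr][:1]
    if recovery_domain is not None:
        remote = [sr for sr in routers if domain_of[sr] == recovery_domain]
        if remote:
            slaves.append(remote[0])  # extra slave limits the rebuild
    return {"master": volume_sr, "slaves": slaves}

domain_of = {"sr1": "dc1", "sr2": "dc1", "sr3": "dc2"}
plan = plan_mds_placement("sr1", domain_of, ["sr1", "sr2", "sr3"],
                          recovery_domain="dc2")
# plan == {"master": "sr1", "slaves": ["sr2", "sr3"]}
```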

Eugene Release

To start the new year with a bang, the Open vStorage Team is proud to release Eugene:

The highlights of this release are:

Policy Update
Open vStorage enables you to actively add, remove and update policies for specific ALBA backend presets. Updating active policies might result in Open vStorage automatically rewriting data fragments.

ALBA Backend Encryption
When configuring a backend preset, an AES-256 encryption algorithm can be selected.

Failure Domain
A Failure Domain is a logical grouping of Storage Routers. The Distributed Transaction Log (DTL) and MetaDataServer (MDS) for a group of Storage Routers can be defined in the same Failure Domain or in a Backup Failure Domain. When the DTL and MDS are defined in a Backup Failure Domain, data loss is prevented in case of a non-functioning Failure Domain. Defining the DTL and MDS in a Backup Failure Domain requires low-latency network connections.

Distributed Scrubber
Snapshots which are out of retention period are indicated as garbage and removed by the Scrubber. With the Distributed Scrubber functionality you can now decide to run the actual scrubbing process away from the host that holds the volume. This way, hosts that are running Virtual Machines do not experience any performance hit when the snapshots of those Virtual Machines are scrubbed.
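The retention rule above can be illustrated with a small filter; this is a simplified sketch with hypothetical names, not the real Scrubber, and it also honours the ‘sticky’ flag mentioned further down in these release notes.

```python
# Simplified sketch: snapshots older than the retention window are
# garbage, unless marked 'sticky' (exempt from automated cleanup).
import datetime as dt

def snapshots_to_scrub(snapshots, retention, now):
    """Return the snapshots that fall outside the retention window."""
    cutoff = now - retention
    return [s for s in snapshots
            if s["timestamp"] < cutoff and not s.get("sticky", False)]

now = dt.datetime(2017, 1, 1)
snaps = [
    {"name": "old", "timestamp": dt.datetime(2016, 12, 20)},
    {"name": "recent", "timestamp": dt.datetime(2016, 12, 30)},
    {"name": "keep", "timestamp": dt.datetime(2016, 12, 20), "sticky": True},
]
garbage = snapshots_to_scrub(snaps, dt.timedelta(days=7), now)
# garbage contains only the "old" snapshot
```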

Scrubbing Parent vDisks
Open vStorage allows you to create clones of vDisks. The maximum depth of the clone tree is limited to 255. When a clone is created, scrubbing is still applied to the actual parent of the clone.

New API calls
The following API calls have been added:

  • vDisk templates (set and create from template)
  • Create a vDisk (name, size)
  • Clone a vMachine from a vMachine Snapshot
  • Delete a vDisk
  • Delete a vMachine snapshot

These API calls are not exposed in the GUI.
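Since these calls are API-only, you would drive them programmatically. The sketch below only assembles a request for the "Create a vDisk (name, size)" call; the endpoint path and field names are assumptions for illustration, not the documented Open vStorage API.

```python
# Hypothetical illustration of preparing a vDisk-create API request.
# The path and body fields are assumed, not taken from the real API docs.
import json

def build_create_vdisk_request(name, size_bytes, vpool_guid):
    """Assemble an HTTP request description for a vDisk-create call."""
    return {
        "method": "POST",
        "path": "/api/vdisks/",  # assumed endpoint
        "body": json.dumps({"name": name,
                            "size": size_bytes,
                            "vpool_guid": vpool_guid}),
    }

req = build_create_vdisk_request("disk01", 10 * 1024 ** 3, "vpool-guid-123")
```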

Removal of the community restrictions
The ALBA backend is no longer restricted and you are no longer required to apply for a community license to use ALBA. The cluster does need to be registered within 30 days; otherwise the GUI will stop working until the cluster is registered.

Remove Node
Open vStorage allows for nodes to be removed from the Open vStorage Cluster. With this functionality you can remove any node and scale your storage cluster along with your changing storage requirements. Both active and broken nodes can be consistently removed from the cluster.

Some smaller Feature Requests were added also:

  • Removal of the GCC dependency.
  • Option to label a manual snapshot as ‘sticky’ so it doesn’t get removed by the automated snapshot cleanup.
  • Allow stealing of a volume when no Hypervisor Management Center is configured and the node owning the volume is down.
  • Set_config_params for vDisk no longer requires the old config.
  • Automatically reconfigure the DTL when DTL is degraded.
  • Automatic triggering of a repair job when an ASD is down for 15 minutes.
  • ALBA is independent of broadcasting.
  • Encryption of the ALBA socket communication.
  • New Arakoon client (pyarakoon).

Following are the most important bug fixes in the Eugene release:

  • Fix for various issues when first node is down.
  • “An error occurred while configuring the partition” while trying to assign DB role to a partition.
  • Cached list not updated correctly.
  • Celery workers are unable to start.
  • NVMe drives are not correctly detected.
  • Volume restart fails due to failure while clearing the DTL.
  • Arakoon configs not correct on 4th node.
  • Bad MDS Slave placement.
  • DB role is required on every node running a vPool but isn’t mandatory.
  • Exception in tick crashes the ovs-scheduled-task service.
  • OVS-extensions is very chatty.
  • Voldrv python client hangs if node1 is down.
  • Bad logic to decide when to create extra alba namespace hosts.
  • Timeout of CHAINED ensure single decorator is not high enough.
  • Possibly wrong master selected during “ovs setup demote”.
  • Possible race condition when adding nodes to cluster.