Climbing into the cloud with Apache CloudStack

Cloud Cover

Article from Issue 180/2015

Everybody's talking about the OpenStack cloud, but many users prefer Apache's CloudStack – another open source cloud system with a long history and a more unified design.

A cloud management system is really just an API or a web interface that configures the underlying hypervisor infrastructure. But the devil lies in the details. Apache CloudStack helps administrators deploy and manage a cloud environment.

CloudStack began life at a start-up and, after a stopover at Citrix, ended up with the Apache Software Foundation, where it has seen several updates. CloudStack differs from its major open source competitor OpenStack in that all of the cloud system's components were developed together. (OpenStack is made up of several separately developed subprojects, which has some disadvantages in terms of deployment and operations, as well as increased complexity.) What CloudStack has going for it, in addition to its open source code and monolithic structure, is the fact that it is widely deployed. The management API, which is based on REST principles, helps third parties implement add-ons that extend the feature scope in many directions.
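
To give a rough idea of what a third-party add-on has to do before it can talk to the management API, the following Python sketch builds a signed request URL. It assumes the signing scheme described in the CloudStack developer documentation (parameters sorted by name, the query string lowercased, an HMAC-SHA1 digest keyed with the user's secret key, base64 encoded). The endpoint, API key, and secret key are placeholders, and the helper names are invented for this example, so check the details against the documentation for your version.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def canonical_query(params: dict) -> str:
    """Sort parameters by name and URL-encode each value (spaces become %20)."""
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    return "&".join(f"{key}={quote(str(value), safe='')}" for key, value in pairs)


def sign_request(params: dict, secret_key: str) -> str:
    """Return the base64-encoded HMAC-SHA1 signature of the lowercased query."""
    message = canonical_query(params).lower().encode("utf-8")
    digest = hmac.new(secret_key.encode("utf-8"), message, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def build_url(endpoint: str, params: dict, secret_key: str) -> str:
    """Append the URL-encoded signature to the query string."""
    signature = quote(sign_request(params, secret_key), safe="")
    return f"{endpoint}?{canonical_query(params)}&signature={signature}"


# Placeholder values: replace with your own management server and keys.
url = build_url(
    "http://localhost:8080/client/api",
    {"command": "listZones", "response": "json", "apiKey": "EXAMPLE-API-KEY"},
    "EXAMPLE-SECRET-KEY",
)
print(url)
```

The resulting URL can be passed to any HTTP client; the sketch deliberately stops short of sending the request, so it runs without a CloudStack installation.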

Complex System; Simple Architecture

Complex systems are easiest to understand if you look at the simplest implementation. Figure 1 shows a very simple representation of a CloudStack cluster.

Figure 1: Management and VM servers make up a simple CloudStack domain.

The intelligence of the cluster resides on the management server. A program running in a Tomcat container is responsible for managing and configuring the various instances. A MySQL database provides the storage; depending on your needs, the database can run locally or on a separate server. The management server also manages and replicates the ISO images and other images used when setting up new virtual machine instances.

The actual work then takes place on "worker machines," which can either be virtual machines or legacy workstations running on a supported host system [1]. When this article went to press, CloudStack supported the following hypervisors:

  • Windows Server 2012 R2 (Hyper-V role must be active)
  • Hyper-V 2012 R2
  • CentOS 6.2 with support for KVM
  • Red Hat Enterprise Linux 6.2 with KVM support
  • XenServer 6.0.2, 6.1, 6.2 SP1
  • VMware 5.0, 5.1, 5.5

When you deploy a hypervisor system, all hot fixes offered by the vendor must be in place. The CloudStack developers assume the hypervisor you are using is fully up to date; missing patches can lead to undefined behavior. The feature scope offered on the virtual machines depends on your choice of hypervisor. The CloudStack documentation [2] has a table with the details.

A Question of Structure

Cloud systems are only rarely restricted to a single location. One of the most powerful selling points for cloud service providers is that their globally distributed data centers can provide computing power wherever the customer is. Apache's product is capable of implementing this kind of structure. CloudStack breaks down complex systems using the schema shown in Figure 2. Regions are used to represent countries and continents: A region groups zones that are in close geographical proximity. Each region has one or multiple dedicated management servers.

Figure 2: CloudStack organizes the network into regions and zones. A pod represents one or more racks of clusters.

Zones represent data centers where one or multiple pods (that is, racks of clusters) reside. Apache recommends defining pods based on the existing Layer 2 switches. The individual groups of machines are referred to as clusters, and the individual hypervisor servers are referred to as hosts. Not all of the listed elements need to be meaningful in every deployment. In the case of a simple virtual machine provisioning server for a workgroup, a full-blown layout of regions and zones would be totally over the top. In that case, you can create a dummy zone and a dummy region that contain all the required elements.
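
The nesting described above (region, zone, pod, cluster, host) can be pictured as a small data model. The Python sketch below is purely illustrative: the class names, field names, and the `minimal_region` helper are invented for this example and are not CloudStack's own data structures. It shows how a workgroup-sized setup ends up with one dummy element at each level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str
    hypervisor: str  # e.g., "KVM" or "XenServer"


@dataclass
class Cluster:
    name: str
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)


@dataclass
class Region:
    name: str
    zones: List[Zone] = field(default_factory=list)


def minimal_region(host_names: List[str], hypervisor: str = "KVM") -> Region:
    """Wrap a handful of hosts in one dummy cluster, pod, zone, and region."""
    hosts = [Host(name, hypervisor) for name in host_names]
    cluster = Cluster("cluster-1", hosts)
    pod = Pod("pod-1", [cluster])
    zone = Zone("zone-1", [pod])
    return Region("region-1", [zone])


def all_hosts(region: Region) -> List[Host]:
    """Walk the hierarchy top-down and collect every hypervisor host."""
    return [
        host
        for zone in region.zones
        for pod in zone.pods
        for cluster in pod.clusters
        for host in cluster.hosts
    ]


region = minimal_region(["kvm01", "kvm02"])
print([host.name for host in all_hosts(region)])
```

A larger installation would simply add more entries to the lists at each level, for instance one zone per data center and one pod per Layer 2 switch.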

Two-Tier Storage

CloudStack categorizes the storage available in the cluster into two groups. Primary storage contains the hard disk images of the individual virtual machines. The precise architecture depends on the hypervisor you use. Not every piece of hypervisor software can handle all three storage systems supported by CloudStack (NFS, iSCSI, Fibre Channel). The performance of the physical storage is decisive for the speed of the overall system: Slow primary storage results in slow I/O. Images, snapshots, and other information not required directly for running virtual machines reside on secondary storage, which is accessible via NFS. Because secondary storage is only used when setting up and removing virtual machines, the performance requirements are not as strict. Smaller deployments will tend to merge primary and secondary storage. A classical NFS server is often used as a universal storage system.
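
The split between the two tiers can be summarized in a few lines of Python. This is only a sketch of the rules stated in the text: the artifact labels, the `Tier` enum, and both helper functions are hypothetical names chosen for illustration, not part of CloudStack.

```python
from enum import Enum


class Tier(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


# Which tier holds which kind of data, following the description in the text.
TIER_BY_ARTIFACT = {
    "vm_disk": Tier.PRIMARY,     # disk images of running virtual machines
    "image": Tier.SECONDARY,     # images used to set up new instances
    "iso": Tier.SECONDARY,
    "snapshot": Tier.SECONDARY,
}

# Storage systems named in the text for each tier.
PROTOCOLS_BY_TIER = {
    Tier.PRIMARY: {"NFS", "iSCSI", "Fibre Channel"},
    Tier.SECONDARY: {"NFS"},
}


def tier_for(artifact: str) -> Tier:
    """Return the storage tier for a given kind of artifact."""
    return TIER_BY_ARTIFACT[artifact]


def can_share_backend(protocol: str) -> bool:
    """True if one backend could serve both tiers, as in small deployments."""
    return all(protocol in allowed for allowed in PROTOCOLS_BY_TIER.values())


print(tier_for("snapshot").value)
print(can_share_backend("NFS"), can_share_backend("iSCSI"))
```

The second helper reflects why small installations tend to fall back on a single NFS server: it is the only option listed for both tiers. Whether a given hypervisor supports a particular primary storage protocol still has to be checked separately.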
