Thursday, September 29, 2011

OpenStack Meet & Drink: Toast to Diablo – Event Highlights

As usual, here are the highlights from the last Bay Area OpenStack Meet & Drink: Toast to Diablo – September 28th, 2011. Thanks to WiredRE for hosting us, Dave Nielsen – for helping to organize, and all the attendees – for coming. Once again, this was the biggest MeetUp thus far with 150 in attendance. For those of you who didn’t come – here is what you missed:

We started our Diablo release celebration with wine, beer and pizza. Fun mingling with fellow stackers. As people kept arriving it got almost too crowded.

Mirantis founder – Alex Freedland – passionately explaining something to David Allen.

Mike Scherbakov from Mirantis, Josh McKenty from Piston and Eric from CloudScaling debating OpenStack with noticeable vigor.

Eric Windisch proudly sporting his uber cool CloudScaling shirt, listening to Mike Scherbakov from Mirantis.

While the crowd was mingling, Dave Nielsen took people on datacenter tours. The datacenter basically looked like this.

As usual, I opened with some thank you's and acknowledgements to our sponsors and organizers. Marc Padovani of HP Cloud Services – clapping and anxiously waiting his turn to tell the crowd about OpenStack based

With 150 stackers in attendance, we didn’t have quite enough chairs to accommodate everyone.

Dave Nielsen talking about our venue host – WiredRE.

Chris Kemp – CEO and Founder of Nebula announced the OpenStack Silicon Valley LinkedIn group that Nebula recently started.

…meanwhile, Josh McKenty was waiting for his turn to speak…

Don’t remember why, but for some reason Josh’s presentation involved talking about O-Ren Ishii from Kill Bill. Whatever it was, Chris Kemp got a kick out of it.

Everybody likes Kill Bill, so the crowd was cheering.

Geva Perry shared his perspective on why OpenStack’s strength is in its ecosystem of developers and partners.

Jason Venner of talked about OpenStack and CloudFoundry. He was careful not to reveal anything with respect to the upcoming “October 13th” announcement of X.commerce platform.

In closing we had Marc Padovani from HP talk about hpcloud and HP’s commitment to OpenStack. The presentation quickly turned into a Q&A grilling session, with stackers expressing their suspicions over it being a smoke screen rather than a real offering. Marc did his best to address the questions without incriminating his big corporation… My wife got too tired of taking pictures at that point, so there are none of Marc… sorry Marc.

Hungry stackers drank most of the wine and ate most of the food. Whatever was left over, people took home. We kept one last bottle of Cloud Wine. I intend to give it as a gift to our 500th MeetUp member – Ilan Rabinovich. Ilan – if you read this, ping me on twitter @zer0tweets to claim your prize!

Thank you to everyone and we’ll do it again in 3 months.

Friday, September 23, 2011

What is this Keystone anyway?

The simplest way to authenticate a user is to ask for credentials (login+password, login+keys, etc.) and check them against some database. But when there are lots of separate services, as there are in the OpenStack world, we have to rethink that. The main problem is the inability to use one user entity that is authorized everywhere. For example, a user expects Nova to take their credentials and create or fetch some images in Glance or set up networks in Quantum. This cannot be done without a central authentication and authorization system.

So now we have one more OpenStack project - Keystone. It is intended to hold all common information about users and their capabilities across the other services, along with a list of these services themselves. We have spent some time explaining to our friends what it is, why it exists, and how it works, so we decided to blog about it. What follows is an explanation of every entity that drives Keystone’s life. Of course, this explanation can become outdated in no time, since the Keystone project is very young and is developing very fast.

The first basic entity is the user. Users are users; they represent someone or something that can gain access through Keystone. Users come with credentials that can be checked, such as passwords or API keys.

The second one is the tenant. It represents what is called the project in Nova, meaning something that aggregates a number of resources in each service. For example, a tenant can have some machines in Nova, a number of images in Swift/Glance, and a couple of networks in Quantum. Users are always bound to some tenant by default.

The third and last authorization-related kind of object is the role. A role represents a group of users that is assumed to have some access to resources, e.g. some VMs in Nova and a number of images in Glance. Users can be added to any role either globally or in a tenant. In the first case, the user gains the access implied by the role to the resources in all tenants; in the second case, the user's access is limited to the resources of the corresponding tenant. For example, a user can be an operator of all tenants and an admin of their own playground.
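The difference between a global and a tenant-scoped role can be sketched in a few lines of Python. This is only an illustration of the idea, not Keystone's actual data model: the `User` class, its `grant` and `has_role` methods, and the tenant names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class User:
    """Toy user record; names and fields are illustrative assumptions."""
    name: str
    default_tenant: str
    # Each grant is (role, tenant); a tenant of None marks a global role.
    grants: Set[Tuple[str, Optional[str]]] = field(default_factory=set)

    def grant(self, role: str, tenant: Optional[str] = None) -> None:
        self.grants.add((role, tenant))

    def has_role(self, role: str, tenant: str) -> bool:
        # A global grant applies in every tenant,
        # a scoped grant only in the tenant it names.
        return (role, None) in self.grants or (role, tenant) in self.grants


alice = User("alice", default_tenant="playground")
alice.grant("operator")                    # global: applies to all tenants
alice.grant("admin", tenant="playground")  # scoped: only her own playground
```

With these grants, `alice.has_role("operator", "any-tenant")` is true, while `alice.has_role("admin", ...)` is true only for `"playground"`.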

Now let’s talk about service discovery capabilities. With the first three primitives, any service (Nova, Glance, Swift) can check whether or not the user has access to resources. But to access some service in the tenant, the user has to know that the service exists and find a way to reach it. So the basic objects here are services. They are actually just some distinguished names. The roles we talked about earlier can be not only general but also bound to a service. For example, when Swift requires administrator access to create some object, it should not require the user to have administrator access to Nova too. To achieve that, we should create two separate Admin roles - one bound to Swift and another bound to Nova. After that, admin access to Swift can be given to a user with no impact on Nova, and vice versa.

To access a service, we have to know its endpoint. So there are endpoint templates in Keystone that provide information about all existing endpoints of all existing services. One endpoint template provides a list of URLs to access an instance of a service: a public, a private, and an admin one. The public one is intended to be accessible from the outside world, the private one can be used for access from a local network (like http://compute.example.local), and the admin one is used in case admin access to the service is separated from the common access (as it is in Keystone).

Now we have the global list of services that exist in our farm and we can bind tenants to them. Every tenant can have its own list of service instances and this binding entity is named the endpoint, which “plugs” the tenant to one service instance. It makes it possible, for example, to have two tenants that share a common image store but use distinct compute servers.
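To make the relationship between endpoint templates and endpoints more concrete, here is a small Python sketch. It is an assumption-laden illustration rather than Keystone's real schema: `EndpointTemplate`, `TEMPLATES`, `ENDPOINTS`, `resolve`, the tenant names, and every URL except the `compute.example.local` style from the text above are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class EndpointTemplate:
    """One service instance with its three URLs (hypothetical structure)."""
    service: str
    public_url: str
    private_url: str
    admin_url: str


# Global list of service instances known to the installation.
TEMPLATES: Dict[str, EndpointTemplate] = {
    "glance-shared": EndpointTemplate(
        "glance", "http://images.example.com",
        "http://images.example.local", "http://images-admin.example.local"),
    "nova-1": EndpointTemplate(
        "nova", "http://compute-1.example.com",
        "http://compute-1.example.local", "http://compute-1-admin.example.local"),
    "nova-2": EndpointTemplate(
        "nova", "http://compute-2.example.com",
        "http://compute-2.example.local", "http://compute-2-admin.example.local"),
}

# An endpoint "plugs" a tenant into exactly one instance of a service.
ENDPOINTS: Dict[Tuple[str, str], str] = {
    ("tenant-a", "glance"): "glance-shared",
    ("tenant-b", "glance"): "glance-shared",
    ("tenant-a", "nova"): "nova-1",
    ("tenant-b", "nova"): "nova-2",
}


def resolve(tenant: str, service: str, kind: str = "public") -> str:
    """Return the URL of the requested kind for a tenant's service instance."""
    template = TEMPLATES[ENDPOINTS[(tenant, service)]]
    return getattr(template, f"{kind}_url")
```

Here both tenants resolve `glance` to the same image store, while `nova` resolves to a different compute server for each of them.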

This is a long list of entities that are involved in the process but how does it actually work?

  1. To access some service, users provide their credentials to Keystone and receive a token. The token is just a string that Keystone internally associates with the user and tenant. The token travels with every user request, and with any request one service sends to another while processing that user request.
  2. The users find the URL of the service that they need. If a user, for example, wants to spawn a new VM instance in Nova, they can find a URL to Nova in the list of endpoints provided by Keystone and send an appropriate request.
  3. After that, Nova verifies the validity of the token in Keystone and then has to create an instance from the image with the provided image ID and plug it into some network.
    • First, Nova passes this token to Glance to get the image stored there.
    • After that, it asks Quantum to plug this new instance into a network; Quantum verifies that the user has access to the network in its own database, and to the VM's interface by requesting the information from Nova.
    Throughout, the token travels between services so that they can ask Keystone or each other for additional information or actions.
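The steps above can be sketched as a toy simulation in Python. None of this is real OpenStack code: `ToyKeystone`, `ToyGlance`, and `ToyNova` are hypothetical stand-ins, the Quantum step is left out for brevity, and passwords are compared in plain text purely to keep the example short.

```python
import uuid
from typing import Dict, Optional, Tuple


class ToyKeystone:
    """Issues tokens and maps them back to (user, tenant)."""

    def __init__(self, credentials: Dict[str, Tuple[str, str]]) -> None:
        self._credentials = credentials  # user -> (password, tenant)
        self._tokens: Dict[str, Tuple[str, str]] = {}

    def authenticate(self, user: str, password: str) -> str:
        record = self._credentials.get(user)
        if record is None or record[0] != password:
            raise PermissionError("invalid credentials")
        token = uuid.uuid4().hex  # the token is just an opaque string
        self._tokens[token] = (user, record[1])
        return token

    def validate(self, token: str) -> Optional[Tuple[str, str]]:
        return self._tokens.get(token)


class ToyGlance:
    def __init__(self, keystone: ToyKeystone, images: Dict[str, str]) -> None:
        self.keystone = keystone
        self.images = images  # image ID -> owning tenant

    def get_image(self, token: str, image_id: str) -> str:
        identity = self.keystone.validate(token)
        if identity is None or self.images.get(image_id) != identity[1]:
            raise PermissionError("no access to image")
        return image_id


class ToyNova:
    def __init__(self, keystone: ToyKeystone, glance: ToyGlance) -> None:
        self.keystone = keystone
        self.glance = glance

    def boot(self, token: str, image_id: str) -> str:
        identity = self.keystone.validate(token)
        if identity is None:
            raise PermissionError("invalid token")
        # Nova forwards the same token so Glance can run its own check.
        image = self.glance.get_image(token, image_id)
        return f"instance of {image} for tenant {identity[1]}"


keystone = ToyKeystone({"alice": ("secret", "playground")})
glance = ToyGlance(keystone, {"img-1": "playground"})
nova = ToyNova(keystone, glance)

token = keystone.authenticate("alice", "secret")  # step 1
result = nova.boot(token, "img-1")                # steps 2 and 3
```

The point of the sketch is that Nova never sees the password: every service only receives the token and asks the central service what it stands for.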

Here is a rough diagram of this process:

Friday, September 16, 2011

Cloudpipe Image Creation Automation

Cloudpipe is used in OpenStack to provide access to a project’s instances when using VLAN networking mode. It is just a custom Virtual Machine (VM) prepared in a special way, i.e. coming with an appropriately configured openvpn and startup scripts. More details on what cloudpipe is and why it is needed are available in the OpenStack documentation.
The process of creating an image involves a lot of manual steps that beg to be automated. To simplify these steps, I wrote a simple script that uses some libvirt features to provide a fully automated solution, so that you don't even have to bother with preparing the base VM manually.
The solution can be found on GitHub and consists of 3 parts:
  • The first is the main part. Only this part should be executed. When you run it, it will configure the virtual network and PXE. Then it will start a new VM and install a minimal Ubuntu server via kickstart, so the installation is fully automated and unattended.
  • The second is used to turn the minimal Ubuntu server into a cloudpipe image. It is executed once the VM is ready for the conversion.
  • The last, ssh.fs, is used to ssh into the VM and shut it down.
So, if you need the cloudpipe image, just run the main part and wait. You'll get the cloudpipe image without a single mouse click or key press!
More detailed information about how it works can be found in the README file.
Don’t hesitate to leave a comment if you have any questions or concerns.

Thursday, September 8, 2011

Cloud Accelerates Open Source Adoption

Historically, commercial software provided enterprises with reliability and scalability, especially for mission-critical tasks. No one wanted to risk failure in finance, operations, or any essential or enterprise-wide areas. So, enterprises considered open source technology only for less important, more tactical purposes.

Recently, however, many large IT organizations have developed significant open source strategies. Cisco, Dell, NASA, and Rackspace came together to give birth to OpenStack. VMware acquired SpringSource and shortly thereafter announced Cloud Foundry, their open source PaaS. Amazon and others built solutions entirely on an open source stack. Whole categories of technologies, such as NoSQL databases, made their way to mass adoption shortly after being open sourced by Google and Facebook. There has been more activity in open source during the last two years than in the preceding decade. So what’s going on here?

Without a doubt, cloud is the IT topic that’s been grabbing headlines and investment dollars in the past few years. The recent high level of activity in open source noticeably correlates with the cloud movement, because there is a deep, synergetic relationship between the two. In fact, cloud is the primary driver for the increased adoption of open source.

In general, open source projects require two components to get community uptake. First, the project itself has to be technologically challenging. Successful open source projects are largely about solving a set of complex technological tasks rather than just writing a lot of code to support a complex business process, as is the case with building enterprise software. Linux, MySQL and BitTorrent are all good examples here. Second, it requires a high rate of end user adoption. The more people and organizations start using the open source technology at hand, the more mature the community and the technology itself become.

Cloud has created an enormous amount of technologically challenging fodder for the open source community. The adoption of cloud translates to greater scale at the application infrastructure layer. Consequently, all cloud vendors, from infrastructure to application, are forced to innovate and build proprietary application infrastructure solutions aimed at tackling scale-driven complexity. Facebook’s Cassandra and Google’s Google File System/MapReduce/BigTable stack (the model for the open source Hadoop) are prime examples of this innovation.

However, it is important to note that neither Facebook nor Google is in the business of selling middleware. Both make money on advertising. Their middleware stack may be a competitive advantage, but it is by no means THE competitive advantage. Because companies want to keep IT investments as low as possible, a natural way to drive down costs associated with scale-driven complexity is to have the open developer community help address at least some of the issues involved in supporting and growing the stack. The result? Instances like Facebook’s open sourcing of Cassandra and Rackspace contributing its object storage code to OpenStack. Ultimately, cloud drives complexity while cloud vendors channel that complexity down to the open developer community.

What about end user adoption? Historically, enterprises were slow to adopt open source. Decades of lobbying by vendors of proprietary software have drilled the idea of commercial software superiority deep into the bureaucracy of enterprise IT. Until recently, the biggest selling point for commercial enterprise software was reliability and scalability for mission-critical tasks; open source was “OK” for less important, more tactical purposes. Today, after leading cloud vendors like Amazon, Rackspace, and Google built solutions on top of an open source stack, the argument that open source is unsuitable for mission-critical operations or incapable of supporting the required scale is no longer valid.

But the wave of open source adoption is not just about the credibility boost it received in recent years. It is largely about the consumption model. Cloud essentially refers to the new paradigm for delivery of IT services. It is an economic model that revolves around “pay for what you get, when you get it.” Surprisingly, it took enterprises a very long time to accept this approach, but last year was pivotal in showing that it is gaining traction and is the way of the future. Open source has historically been monetized through a model that is much closer to “cloud” than to that of commercial software. In the case of commercial software, you buy the license and pay for implementation upfront. If you are lucky enough to get it implemented, you continue to pay for a subscription that is sold in various forms – support, service assurance, etc. With open source, you are free to implement first, and if it works, you may (or may not) buy commercial support, which is also frequently sold as a subscription to a particular SLA. The cloud hype has helped initiate the shift in the standard for the IT services consumption model. As enterprises wrap their minds around cloud, they shy further away from the traditional commercial software model and move closer to the open source / services-focused model.

It is also important to note that the consumption model issue is not simply a matter of perception. There are concrete, tactical drivers behind it. As the world embraces the services model, it is becoming increasingly dominated by service-level agreements (SLAs). People are no longer interested in licensing software products that are just a means to an end. Today, they look for meaningful guarantees where vendors (external providers or internal IT) assure a promised end result. This shift away from end user license agreements (EULAs) and toward SLAs is important. If you are a cloud vendor, you are in the business of selling SLA-backed subscription services to your customers. If, at the same time, you rely on a third party vendor for a component of your stack, the SLA of your vendor has to provide the same or better guarantees than those you pass on to your clients. If your vendor doesn’t offer an SLA or only offers an end user license agreement, you end up having to bridge the gap. The gaps that an organization is forced to bridge ultimately affect its enterprise value. As we move away from the EULA-driven economy and more towards SLAs, open source stands to benefit.

Ultimately, as cloud continues to mature, we will continue to see more and faster growth in open source. While the largest impact so far has been in the infrastructure space, open source popularity will eventually start spreading up the stack towards the application layer.