2014-12-31

How Did 2014 Turn Out For You?

60 blog posts in 2014 – sometimes I don’t understand how that happens. Is that a lot? A little?

I have always said that I do not blog for the sake of blogging, but to share information and my thoughts. It is good to see that people find this useful – and do take an interest in what I have to say.

The 5 posts that received the highest number of visitors in the past year were:

  1. The Quickest Way to Get Started with Docker
  2. VSAN - The Unspoken Truth
  3. vCenter is Still a Single Point of Failure
  4. Nova-Docker on Juno
  5. Introducing VIRL Personal Edition

What did I blog about? I calculated this with the tags I attached to each of the posts:

  1. OpenStack (30)
  2. VMware (19)
  3. Cloud (12)
  4. DevOps (7)
  5. VMworld (6)
  6. Automation, Administration, Architecture, Design and Docker (5)

(Ok I lied – that was more than 5 topics)

If you would like more of a graphic presentation – here you go.

Tag Cloud

For me it has been interesting to see that my focus has changed, and is not VMware-centric any more. I guess that is an expected thing – with my current role and also that I have a more overall solution in mind during my daily work – which is so much more than virtualization.

I hit a milestone on December 3, 2014, when my blog surpassed 2,000,000 pageviews.

2 Million

It took me 5 years to achieve my first million; the second was achieved in only two years. I have been blogging for 7 years now, and it is always good to share my thoughts, my insights, and sometimes my rants. I hope you all benefit, and will continue to make use of the articles I write.

Here is looking forward to a great 2015, filled with opportunities, community and exciting things.

And how did 2014 turn out for you? Please feel free to leave your comments and thoughts below.

Catch you all on the flip side!

2014-12-11

vCenter is Still a Single Point of Failure

A few days ago VMware (Mike Brown, Anil Kapur and Justin King are the authors) announced the updated document for the vCenter Server 5.5 Availability guide.

I would like to make clear a few things from the start.

  • This is not a VMware bashing post. (Even if it might be perceived as such)
  • I hold all three of the authors in very high regard.

Here goes.

When reading this document I was hoping to hear something new, something refreshing, something that VMware customers have been asking and verbally complaining about for a very long time.

Alas – this is not the case.

vCenter is a single point of failure. There I have said it. I have said it before, and I will continue to say it in the future until this is fixed.

In the following article I will be taking statements directly from the text, providing my thoughts as I go along.

Overview

Great start – This document will discuss the requirements… After re-reading that statement – I understood what VMware did. VMware has not provided us with a method of providing HA for vCenter – but rather – has explained what it thinks should be defined as High Availability for your vCenter server.

SLA

The authors then go into explaining all about MTBF and MTTR – they did a great job. I will not go into the details here – you should read the document.

SLAs are extremely important – and for each and every environment an SLA is something different – yours may differ from your neighbor's, so it is important to understand what you need to achieve.

tests

They then go into describing the tests that were run in order to measure the amount of time it would take for a vCenter server to recover. Fair enough.

results

Here is where it starts to get interesting. Let us look at this in a picture.

Timeline

Bottom line is – that once a vCenter server has gone down – it will take a little over 5 minutes until it is fully functional.
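To put those 5 minutes into perspective, here is a rough sketch of the arithmetic behind an availability SLA. The SLA targets and the one-failure-a-month MTBF are my own illustrative assumptions, not numbers from the VMware document. The formulas are the usual ones: availability = MTBF / (MTBF + MTTR), and yearly downtime budget = (1 - availability) x minutes in a year.

```shell
#!/bin/sh
# Availability (in percent) from MTBF in hours and MTTR in minutes.
availability_pct() {
    awk -v mtbf_h="$1" -v mttr_m="$2" 'BEGIN {
        mtbf_m = mtbf_h * 60
        printf "%.4f\n", mtbf_m / (mtbf_m + mttr_m) * 100
    }'
}

# Yearly downtime budget (in minutes) for a given availability percentage.
downtime_budget_minutes() {
    awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 365 * 24 * 60 }'
}

# How many whole outages of a given length (minutes) fit in that budget.
outages_within_budget() {
    awk -v pct="$1" -v outage="$2" 'BEGIN {
        budget = (100 - pct) / 100 * 365 * 24 * 60
        printf "%d\n", int(budget / outage)
    }'
}

# ASSUMPTION: one vCenter failure every 30 days (720 hours), 5 minutes to recover.
echo "Availability with MTBF=720h, MTTR=5min: $(availability_pct 720 5)%"

# ASSUMPTION: these SLA targets are examples only.
for sla in 99.9 99.99 99.999; do
    echo "SLA $sla%: $(downtime_budget_minutes "$sla") min/year, room for $(outages_within_budget "$sla" 5) outages of 5 minutes"
done
```

With a 99.99% target the yearly budget is roughly 52 minutes, so about ten 5-minute vCenter outages would use it up. Whether that is acceptable depends entirely on your own SLA.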

recommendations

This part of the document states that having vSphere HA – and having vCenter running as a virtual machine actually provides some level of protection.

A dedicated management cluster is of course advised – that way you have a dedicated environment to run your management components without having to worry that the client workloads will interfere.

ESXi Hosts

Also putting the database in the same management cluster is recommended – seems logical.

I then noticed that the only SQL version that is supported for vCenter 5.5 is Enterprise and up – which was news to me. I gather this is a documentation bug – because the VMware Product Interoperability Matrix says that Standard is supported.

Matrix

So how do you protect vCenter?

Replication

It would really be great if they would explain exactly how that would be possible and how that should be done. It still might be possible? How exactly? In order to protect vCenter – I will need another vCenter? Licensing? Implications?

VDP

Emergency Restore was a new one to me. I originally wrote that it is only available in vSphere Data Protection Advanced Edition (approximately $1,500 list price per socket), and that this detail was left out of the document. As a result of the feedback received in the comments I have amended this: it seems that Emergency Restore is available in all editions of VDP, not only Advanced (more information here).

Definition

OK, enough copy and paste. This piece above is what set me off.

Essentially VMware is saying the following:

  1. Use a separate management cluster
  2. Run vCenter in a VM
  3. Run the Database in a VM
  4. No matter what happens – if your vCenter crashes then it will be down for 5 minutes.
  5. Your workloads are safe because they are running on your ESXi hosts, are protected by HA and can continue running without a vCenter server.

Points 1-4 - I totally agree. With point #5 I also agree.

But there are environments that cannot afford to have a 5 minute outage. VMware might say that having vCenter go down and out for five minutes is not really an outage per se, but I would very much like to disagree here.

If I cannot provision a new VM because my vCenter is not available – that is an outage.

Where would this be an issue?

  • VDI environments – What if a user logs in and their desktop is not provisioned because vCenter is down? How about a whole 100 or 1,000 employees?
  • Highly automated environments – ones that use products like vRealize Automation or vRealize Code Stream. Imagine having your code builds fail for 5 minutes because vCenter is not available? The whole continuous delivery process breaks down.

I might be exaggerating a bit – but I have voiced this more than once – I started more than 4 years ago with Troubleshooting Tools for vCenter.

vCenter is probably the most crucial part of your virtual infrastructure. And all that you can expect from an availability perspective is that you have to accept as a given that vCenter might go down for 5 minutes at a time.

There are environments that will accept this - I would actually say that the large majority are fine with this – but what about those who are not? Those who cannot afford having this kind of outage? What do they do?

There used to be a product called vCenter Server Heartbeat – which was retired.

Heartbeat

Where are those promised options? When will they be available? What do companies do in the interim? Pray that their vCenter does not crash?

Embedded below is the Twitter conversation that sparked this post.

 

VMware based its whole premise on one scenario: the host on which vCenter is running crashes, HA kicks in and the VM is restarted on another host within 5 minutes.

The other scenarios, such as a problem with your database or a vCenter service problem (and believe me it happens), were not covered.

Take the following scenario. You have a vCenter appliance. For some reason the vCenter service stops responding on the VM. There is no automatic restart. Eventually you get a call, something is not right. You try and restart the service, nothing happens. You restart the VM, nothing happens.

Now what? Open a call with VMware? Deploy another vCenter appliance and hope that nothing goes wrong? I can guarantee you that will take a hell of a lot longer than 5 minutes.

Why does the document even go into providing a clustered solution for the MSSQL database? Because that might fail? Yes it could happen. But guess what – the whole system is only as strong as its weakest link. So providing a clustered database solution might give some peace of mind – but it will not protect you from an outage – because there is no way to cluster a vCenter server.

conclusion

In conclusion – yes there are considerations. I would definitely not say that VMware has a High Availability solution for vCenter. They have done their best to minimize the impact when vCenter crashes – but that is not HA!

What do you think? Am I making a mountain out of a molehill? Or is this a real and valid concern? Please feel free to leave your comments and thoughts below.

2014-12-08

Keeping up to date with OpenStack Blueprints

OpenStack is a living product – and because it is community driven - changes are being proposed almost constantly.

So how do you keep up with all of these proposed changes? And even more so why would you?

The answer to the second question is that if you are interested in the projects then you should be following what is going on. In addition there could be cases where you see that the proposed blueprint could break something that you currently use or is in direct contradiction to what you are trying to do – and you should leave your feedback.

OpenStack wants you to leave your feedback – so please do!

About the first question - the answer is here – http://specs.openstack.org. This is an aggregate of the new blueprints (specs) for each of the projects as they are approved.

I use the RSS feeds available for the blueprints, which help me keep up to date as soon as a new blueprint is added.

I have compiled an OPML file with all the current projects that you can add to your favorite RSS reader.
You can download it from the link below.

file
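If you would rather build such a file yourself, here is a minimal sketch of how an OPML file could be generated. The project list and the feed URL pattern are assumptions for illustration only, so check the spec page of each project for its actual feed link before relying on it.

```shell
#!/bin/sh
# Print a minimal OPML 2.0 document with one RSS outline per project name.
# ASSUMPTION: the feed URL pattern below is illustrative and may not match
# the real feed location for every project.
make_opml() {
    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<opml version="2.0">'
    printf '%s\n' '  <head><title>OpenStack specs</title></head>'
    printf '%s\n' '  <body>'
    for project in "$@"; do
        url="http://specs.openstack.org/openstack/${project}-specs/rss.xml"
        printf '    <outline type="rss" text="%s specs" xmlUrl="%s"/>\n' "$project" "$url"
    done
    printf '%s\n' '  </body>'
    printf '%s\n' '</opml>'
}

# Example project names (hypothetical selection); output goes to stdout.
make_opml nova neutron cinder
```

Redirect the output to a file of your choice, for example `make_opml nova neutron cinder > openstack-specs.opml`, and import that file into your RSS reader.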

I hope this will be as useful to you as it is to me.

As always, comments, suggestions and thoughts are welcome.

2014-12-02

Landing your first OpenStack Contribution

In my previous post I showed you how to get your OpenStack git environment up and running by using a container.

In this post we will go through the steps needed to actually contribute code. This will not be a detailed tutorial on how to use git and gerrit and their functionality, but rather a simple step by step tutorial on how to get your code submitted for review in OpenStack.

First we start up the container.

start container

Since playing around with real OpenStack code is not a good idea when you are just learning – there is a sandbox repository where you can perform all your tests.

https://github.com/openstack-dev/sandbox

First things first, we need to clone the repository so that we have a local copy of the files.

git clone https://github.com/openstack-dev/sandbox

git clone

This copies all the files in the repository to a folder of the same name under your current working directory. Depending on the size of the repo – this could take seconds or minutes.

Enter the directory and look at the files.

cd sandbox
ls -la

file structure

You will see the files are the same as those in the repository on the web.

github

With one exception: the .git folder, which is not visible on the github repository. This link will give you some more explanation as to what is in the folder.

.git folder

Now make sure you have the latest code from Github.

git checkout master
git pull origin master

git checkout

Create a branch to do your work in and commit from.

git checkout -b MYFIRST-CONTRIBUTION

branch

Now we get to the changes.

I am going to create a folder named maish with two files inside, like the structure shown below.

folder contents

Here I just created empty files – but it could be correcting someone else’s code or adding new code; the process is the same.

Once you have completed your work you will need to add all the changes and push them back up to the original branch.

Add all the files and changes by running

git add .

Next, you commit your changes with a detailed message (and you should really understand how to commit proper changes) that will be displayed in review.openstack.org, creating a change set.

git commit -a

git add

A vi editor will open where you can now add the reasons for your change and mention any closed bugs. Follow the conventions about git commit messages: give a good patch description, with a summary line as the first line, followed by an empty line, descriptive text, backport lines and bug information:

commit message
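As a small sketch, the first two of those conventions could be checked locally before you submit. The 50-character limit on the summary line is an assumption on my part (a common git convention), and the sample message is made up. The helper name is hypothetical as well.

```shell
#!/bin/sh
# Check a commit message file against two conventions:
#   1. a short summary on the first line (ASSUMPTION: 50 characters at most)
#   2. an empty second line separating the summary from the body
check_commit_message() {
    msg_file="$1"
    summary=$(sed -n '1p' "$msg_file")
    second=$(sed -n '2p' "$msg_file")

    if [ -z "$summary" ]; then
        echo "missing summary line"
        return 1
    fi
    if [ "${#summary}" -gt 50 ]; then
        echo "summary line longer than 50 characters"
        return 1
    fi
    if [ -n "$second" ]; then
        echo "second line must be empty"
        return 1
    fi
    echo "ok"
}

# Made-up sample message: summary, empty line, then descriptive text.
sample=$(mktemp)
printf '%s\n' 'Add maish folder with two test files' '' \
    'Sandbox change to practise the review workflow.' > "$sample"
check_commit_message "$sample"
rm -f "$sample"
```

You could point the same function at the message of a commit in progress, for example `check_commit_message .git/COMMIT_EDITMSG`.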

Save the file by typing :wq, and you will see that your files and changes were added.

commit result

Set up the Gerrit Change-Id hook, which is used for reviews, and run git review to run a script in the /tools directory which sets up the remote repository correctly:

git review

You might be prompted to accept the SSH key; type yes.

If all goes well you will see something similar to the output below.

git review

Looking back at the github repo – you will not see any changes. You might ask yourself – where did my code go?

The reason you do not see any change – is that before any code is accepted in the master branch it has to be reviewed, both by an automated set of tests and also by humans.

So where did it go?

If you go to https://review.openstack.org/#/ you will see the change you just submitted

review.openstack.org

Clicking on the change will take you to the details where you can see the following:

The change information.

change info

The commit message (you will notice that the Change-Id was automatically added)

commit message

The status of the reviews and feedback. This could be an automatic test or an actual person who reviewed and left a comment.

reviewers

Here are the files that were checked in.

files

And the comments themselves

Comments

I can also make an additional change – this could be based upon feedback from one of the reviewers, a failed test, or any other reason. Here I added another file – file3.

new patch set

I need to add the changed files and commit them again – this time with the --amend flag. You can change the commit message.

git add .
git commit -a --amend

commit message2

commit result

And then push upstream.

git review

git review

Going back to the web page you will see a few differences.

The new commit message.

new commit

And that the code is now added as a new patch set – i.e. a new version of the code.

new patch set
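Why does the amended commit show up as a new patch set rather than a new change? Gerrit matches on the Change-Id line in the commit message, and amending keeps that line in place. Here is a rough sketch in a throwaway local repository. Nothing in it talks to Gerrit or GitHub, and the Change-Id value, user name and email are made-up placeholders.

```shell
#!/bin/sh
# Throwaway local repository in a temporary directory.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
# Per-repository settings so the sketch does not depend on global config.
git config user.name "Sandbox User"
git config user.email "sandbox@example.com"
git config commit.gpgsign false

# First version of the change: a folder with two empty files.
mkdir maish
touch maish/file1 maish/file2
git add .
# ASSUMPTION: placeholder Change-Id; the real one is generated by the
# Gerrit commit-msg hook.
git commit -q -m "Add maish test files" \
    -m "Change-Id: I0000000000000000000000000000000000000000"

# Follow-up work: add file3 and fold it into the same commit.
touch maish/file3
git add .
git commit -q --amend --no-edit

# Still a single commit, and the Change-Id line is unchanged.
commit_count=$(git rev-list --count HEAD)
change_id=$(git log -1 --format=%B | grep '^Change-Id:')
echo "commits: $commit_count"
echo "$change_id"
```

Because the commit count stays at one and the Change-Id survives the amend, the next git review is attached to the existing change as another patch set.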

One last thing.

Since this is a sandbox – please keep it clean. That means when you are finished with your tests you should mark your commit as Abandoned.

abandon

The status will change.

status change

And this will change the status in your list of changes to Closed.

Closed

I hope this was useful and will alleviate some of the concerns people have with contributing code back into OpenStack.

Please feel free to leave your comments and feedback below.

2014-12-01

Introducing VIRL Personal Edition

This is an internal Cisco tool which is so useful – that I am really pleased that it is finally available for public consumption.

VIRL stands for Virtual Internet Routing Lab.

What Is VIRL?

VIRL is a comprehensive network design and simulation platform. VIRL includes a powerful graphical user interface for network design and simulation control, a configuration engine that can build a complete Cisco configuration at the push of a button, and Cisco virtual machines running the same network operating systems as used in Cisco’s physical routers and switches, all running on top of OpenStack.

virl

How Does VIRL Work?

VIRL uses the Linux KVM hypervisor and OpenStack as its virtual machine control layer, with a powerful API enabling the creation and operation of VMs in a simulated network topology. Users design their network using the VM Maestro design and control interface, with network elements such as virtual routers, switches and servers. The design is translated into a set of virtual machines running real Cisco network operating systems.

What Does VIRL Offer?

Design, learn and test with virtual machines running real Cisco network operating systems – IOS, IOS XE, IOS XR and NX-OS as well as virtual machines running 3rd party operating systems. Build highly-accurate models of real-world or future networks, study the behaviour and configuration of routing protocols, break and fix your network and understand how to troubleshoot with a powerful integrated platform.

The original information can be found here

Cisco VIRL Personal Edition annual subscription license provides a scalable, extensible network design and simulation environment for several Cisco Network Operating Systems for students. This includes IOSv, IOS XRv, NX-OSv, CSR1000v as well as third party images such as Ubuntu Linux.

Educational pricing is available for this product for college students, parents buying for a college student, or teachers, homeschool teachers and staff of all grade levels – limited to one purchase.

VIRL enables users to:

• Build highly-accurate models of real-world or future networks.

• Learn and test with ‘real’ versions of Cisco network operating systems – IOSv, IOS XRv, NX-OSv and CSR1000v.

• Integrate virtual network simulations with real network environments.

The download includes VIRL Personal Edition 1.0 Pre-Release software with a single-user annual license to manage up to 15 Cisco nodes.

You can view a short demo of the product in the link below.

VIRL Demo