Category Archives: KVM

List domain disks using Python-Libvirt

A short piece of code that returns a list of DiskInfo tuples, each holding a disk's target device, source file and driver format. DiskInfo is a user-defined namedtuple that allows easy access to all the information about a disk device.

import libvirt
from xml.etree import ElementTree
from collections import namedtuple

DiskInfo = namedtuple('DiskInfo', ['device', 'source_file', 'format'])

#Return a list of block devices used by the domain
def get_target_devices(dom):

    #Create an XML tree from the domain XML description.
    tree = ElementTree.fromstring(dom.XMLDesc(0))

    #The list of disk devices found so far.
    devices = []

    #Iterate through all disks of the domain.
    for target in tree.findall("devices/disk"):
        #Source file of the disk,
        #for ex. /var/lib/libvirt/images/vmdisk01.qcow2
        source_file = None
        for src in target.findall("source"):
            source_file = src.get("file")

        #The driver type, for ex. qcow2/raw
        disk_format = None
        for drv in target.findall("driver"):
            disk_format = drv.get("type")

        #Target device like vda/vdb etc.
        dev = None
        for tgt in target.findall("target"):
            dev = tgt.get("dev")

        #Make them all into a tuple
        disk = DiskInfo(dev, source_file, disk_format)

        #Check if we have already recorded this device for the domain.
        if disk not in devices:
            devices.append(disk)

    #Completed device list.
    return devices
     

Here dom is the domain object returned by a call to conn.lookupByName() or an equivalent lookup function.

The calling code looks like this:

for dev in get_target_devices(dom):
    print("Processing disk: %s" % (dev,))

Libvirt and Python Connect to Hypervisor

These are short posts on how to connect to Qemu/KVM via libvirt using the Python bindings. Being able to talk to the hypervisor helps in various automation tasks. In this post, we show how to connect to the hypervisor and display domain details. It is assumed that you have Qemu/KVM running and have installed the libvirt-python package. If not, run yum install libvirt-python to install it.

import libvirt
import sys

#Open a readonly connection to libvirt
conn = libvirt.openReadOnly(None)

if conn is None:
    sys.stderr.write('Could not connect to the hypervisor\n')
    sys.exit(1)

try:
    dom = conn.lookupByName("centos_vm")
except libvirt.libvirtError:
    sys.stderr.write('Could not find the domain\n')
    sys.exit(1)

print("Domain: id %d running %s state = %d" % (dom.ID(), dom.OSType(), dom.state()[0]))
print(dom.info())

Output

Domain 0: id -1 running hvm state = 5
[5, 2097152L, 0L, 2, 0L]

From libvirt source code:

# virDomainState (dom.state()[0])
VIR_DOMAIN_NOSTATE     = 0 # no state
VIR_DOMAIN_RUNNING     = 1 # the domain is running
VIR_DOMAIN_BLOCKED     = 2 # the domain is blocked on resource
VIR_DOMAIN_PAUSED      = 3 # the domain is paused by user
VIR_DOMAIN_SHUTDOWN    = 4 # the domain is being shut down
VIR_DOMAIN_SHUTOFF     = 5 # the domain is shut off
VIR_DOMAIN_CRASHED     = 6 # the domain is crashed
VIR_DOMAIN_PMSUSPENDED = 7 # the domain is suspended by guest power management

More Help

Run this command to print out the entire libvirt API.

python -c "import libvirt; help(libvirt)" > help.txt

Another way to display state names:

domain_state_name = {  libvirt.VIR_DOMAIN_NOSTATE     : "no state",
                       libvirt.VIR_DOMAIN_RUNNING     : "running",
                       libvirt.VIR_DOMAIN_BLOCKED     : "idle",
                       libvirt.VIR_DOMAIN_PAUSED      : "paused",
                       libvirt.VIR_DOMAIN_SHUTDOWN    : "in shutdown",
                       libvirt.VIR_DOMAIN_SHUTOFF     : "shut off",
                       libvirt.VIR_DOMAIN_CRASHED     : "crashed",
                       libvirt.VIR_DOMAIN_PMSUSPENDED : "suspended by PM" }

print("Current State:", domain_state_name[dom.state()[0]])

References

http://www.ibm.com/developerworks/library/os-python-kvm-scripting1/

Multi Node OpenStack Kilo On Single Physical Host

Introduction

In our venture into IaaS, we considered a small experimental project: install OpenStack on a single host in such a way that, when we are budgeted for multiple physical machines, we can scale from our one physical machine to N physical machines without fear of losing configuration. AKA Project OpenStack-in-a-box. In this article, we use OpenStack Kilo on CentOS 7.1, installed with RDO Packstack.

What do we get with all-in-one or multi-node RDO Packstack?

In this article, we provide a multi-node OpenStack installation on a single physical node without using nested virtualization. In our implementation, all nodes except the compute node are virtual. RDO Packstack provides an all-in-one or multi-node install option. Depending on whether you deploy on a bare metal server or inside VMs, the lay of the land differs; the figure below shows an all-VM deployment:

[Image: OpenStack deployment in an all-VM environment]

In the above picture, the individual control nodes such as Neutron and Compute are themselves virtual machines. The instances spawned from OpenStack run inside the Compute VM. While nested virtualization works, its performance is not the best.

So why not use Packstack all-in-one on bare metal? That solves the performance problem but makes it hard to migrate out later.

One guiding principle is that moving entire VMs or LVM blocks is one of the simpler and safer ways to migrate a system, compared to moving configuration files or databases. This is especially true of control nodes such as Horizon/Neutron.

What do we want?

With performance and ease of scale-out in mind, we wanted all control nodes as VMs and the single compute node as the host itself. It is easier to replace or add a new compute node than to move data around to build a new control node, and compute nodes are treated as more dispensable. So when we get our shiny new physical servers, we can, in theory, simply move our control node VMs to the new machines. The environment we are aiming for is shown in the figure below:

[Image: The OpenStack environment we are aiming for]

In the above figure, the physical host itself acts as the Nova compute node. The other control nodes such as Neutron, Horizon, Cinder etc. are VMs that are manually created and run on the physical host.

When we scale out, we leave the physical host as the compute node and move the VMs to the new hardware, still running as VMs.

To push the limits, we decided to keep the number of available physical NICs to exactly one.

We had some requirements in this project:

  1. Instances must be as fast as running a VM directly in KVM
    1. Nested KVM was considered, but many companies don’t support their software if it is run in a nested KVM environment.
  2. Easy to implement in enterprise environments
  3. Easy to migrate to multiple physical machines when we get them
  4. Must use only one physical NIC and one physical host
  5. Must be repeatable

Security-wise, iptables on the host is difficult to control because Nova updates iptables when instances launch; at least it is all on one server for now. One big thing we lose with this configuration is High Availability, which by definition needs two physical nodes. You can implement HA as VMs, but a physical or environmental failure would mean the whole environment is lost. Beware of this case.

 We evaluated various options:

Method                          Limitation
TripleO                         Too complex for simple setups as it targets multiple hypervisors.
                                Hard to justify for experiments unless in a heterogeneous environment.
All VMs with nested KVM         Performance limited. Not supported in enterprise environments
                                (Red Hat and such).
Packstack all-in-one            We eventually use Packstack with a multi-node config, but this
                                requires a little pre-configuration.

 We need a refined approach.

Solution

The solution has a key concept: Use the physical host to do two things:

1. Be the compute host (Nova) and

2. Run a few virtual machines, manually created from XML via virsh or virt-manager, that are used as the controller, network, storage and authentication (FreeIPA) nodes. Everything else is driven by OpenStack.

…And the Trick for Packstack

The Packstack install scripts need the interface name on the compute node to match the interface name on the Neutron node. So we rename our virtual bridge on the host to match the vNIC name used inside the Neutron VM.

The overall single physical host OpenStack setup is shown in the  following diagram.

[Image: Single physical host network architecture]

In the above picture, the important pieces are:

  1. The network setup (virbrX) – manually defined networks in XML for virsh.
  2. Our OpenStack internal network would normally be named virbr1. We edited the network XML to set the bridge name to eth1. This is important to satisfy the Packstack install requirement that the interface names on the control and compute nodes match (see the sketch after this list).
  3. The manually created virtual machines used as OpenStack nodes, such as the Network, Storage and Controller nodes. These are simple CentOS templates with static IPs.
  4. After Packstack is run, the OpenStack-spawned VMs (“OpenStack VM Instance x”) communicate via the virbrX networks that were set up for OpenStack.
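
As mentioned in item 2, the rename happens in the libvirt network definition itself. The following is a minimal sketch, not our exact configuration: the network name and the 172.16.10.0/24 addressing are placeholders, and the key line is the bridge name='eth1' attribute.

#On the physical host: define the OpenStack data network with its bridge
#explicitly named eth1 so it matches the vNIC name inside the control VMs.
cat > openstack-data-net.xml <<'EOF'
<network>
  <name>openstack-data</name>
  <bridge name='eth1' stp='on' delay='0'/>
  <ip address='172.16.10.1' netmask='255.255.255.0'/>
</network>
EOF

virsh net-define openstack-data-net.xml
virsh net-start openstack-data
virsh net-autostart openstack-data

Giving the bridge an IP address also puts the physical host on the data network, which is needed because the host doubles as the Nova compute node.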

Here are the steps necessary for this setup:

Step 1: Create a few virtual machines by hand or with virt-manager.

Step 2: These virtual machines become the controller, network and storage nodes.

Step 3: Create LVM volumes on the host that you can attach to the storage VM (see the sketch after this list).

Step 4: Create the networks that OpenStack uses via virsh.

Step 5: The important part: the host itself must become a member of the private (data) network.
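
For Step 3, here is a minimal sketch of carving out a logical volume on the host and handing it to the storage VM. The volume group (vg_data), logical volume (cinder_disk) and VM name (storage01) are placeholders for illustration.

#On the physical host: create a logical volume for the storage VM.
lvcreate -L 500G -n cinder_disk vg_data

#Attach it to the storage VM as a second virtio disk (kept across reboots).
#Inside the VM it shows up as /dev/vdb.
virsh attach-disk storage01 /dev/vg_data/cinder_disk vdb --persistent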

External IP Addresses

In general, the external network is any network that is connected to the network/L3 host. It could be your upstream ISP network or another internal corporate network. Normally, you will have multiple IP addresses assigned so that they can be used as floating IP address ranges. How you connect these IP addresses to the Neutron host is a decision based on performance and feature requirements.

Note that we are using only one NIC. We may also need that NIC to access the host itself, so it may not be possible to do PCI device pass-through of the NIC to the Neutron host.

Here are a few options for using the NIC:

  1. Bridging the NIC needs to be done carefully as it has a performance impact: the NIC becomes a slave of the bridge and enters promiscuous mode, listening to all traffic on the network.
  2. Use libvirt‘s routed network to route the traffic to the Neutron host. By itself this is not enough, because the routes for your public IP block must point to your host as the router. If you have control over the upstream router, it is best to add static routes that direct the public IP block to your physical host as the gateway. The physical host then forwards the traffic to the internal virtual bridge serving the external network. This is a clean and more maintainable approach.
  3. If you have no control over the upstream router, the proxy ARP method can be used. This needs to be handled carefully (see the sketch after this list).
    1. Proxy ARP is a mode of the NIC where it answers ARP requests for all IP addresses for which it has a non-default route (i.e. a static route that is not the default gateway).
    2. proxy_arp can easily lead to network issues if used without care. A common problem is the host answering ARP requests, arriving from the external network, for internal (virtual) network addresses.
    3. To be a good citizen, you may add ARP filter rules that drop outgoing ARP replies not sourced from your public IP list, or drop incoming ARP requests that don’t match the public IP addresses you intend to answer for.
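
As a sketch of the "good citizen" filtering in option 3, arptables can limit what the host answers for. The interface name eno1 and the 111.0.0.0/28 block are assumptions carried over from this article's examples, and the exact negation syntax varies between arptables versions, so treat this as illustrative rather than a drop-in rule set.

#Illustrative only: answer and advertise ARP on the physical NIC solely for
#the public block. Verify the '!' placement against your arptables version.
arptables -A INPUT  -i eno1 ! -d 111.0.0.0/28 -j DROP   #ignore queries for other addresses
arptables -A OUTPUT -o eno1 ! -s 111.0.0.0/28 -j DROP   #do not advertise internal addresses upstream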

 OpenStack Node layout

     Upstream Gateway
     |
     |
--peth0------------------------Physical Host---------------------------,
|    |                                                                 |
|    |                                                                 |
|    |----->Routing Table                                              |
|          _______________________                                     |
|         | e.x.t.0/24 -> virbr2 |           virbr1--rename--eth1      |
|         | 0.0.0.0    -> gatewy |                            |        |
|         |_____________________ |                            |        |
|                 |                                           |        |
|                 v  Packets to virbr2 and back to GW         |        |
|          ______________                              Nova PrivIF     |
|        |   virbr2    |                                Same Net as    |
|        |_____________|                               Data (veth1)    |
|                  ^                                          |        |
|                  |                                          v        |
|         br-ex <--'                                                   |
|         veth1                               veth1                    |
|         veth0 ___________                   veth0 ___________        |
|         |               |                   |               |        |
|         |   VM0 -Net    |                   |   VM2 -stor   |        |
|         |_______________|                   |_______________|        |
|                                                                      |
|                                                                      |
|        veth1                               veth1                     |
|        veth0 ___________                   veth0 ___________         |
|        |               |                   |               |         |
|        |   VM1 -ctrl   |                   |   VM3 -Auth   |         |
|        |_______________|                   |_______________|         |
|                                                                      |
|                                                                      |
| -------------------------------------------------------------------- |

When installing with Packstack, there are some requirements that make us do interesting things. The first is renaming virbr1 on the physical host to eth1 by editing the XML used to create the network in virsh. This is needed because Packstack expects the interface on the Nova and controller nodes to have the same name (for example, eth1 on both). If you are uncomfortable with that naming, you can destroy the ‘eth1’ virtual network after the Packstack install and rename it back to ‘virbr1’ before going into full operation. For general operation it does not matter what the virtual network is named.

You’ll have to decide which way you want the naming to go. This depends mainly on what NICs you have and how many. We decided ‘eth1’ would be our NIC’s name on the control node. Our server did not use eth1, so this saved the additional step of renaming our physical NICs.

Planning the networks

The NIC-to-network mappings are provided here as a reference:

eth2 -> 111.0.0.0/29 or /28  [External network, IP address provided by data center]
eth1 -> 172.16.xx.0/24       [OpenStack Data network]
eth0 -> 192.168.xx.0/24      [Corporate NAT access network to internet ]

IP address Management – Floating and Internal

An overall network diagram is shown below, with the IP addresses used in the various networks marked. 111.0.0.x addresses are external (floating) IPs; the other addresses are internal.

[Image: OpenStack network with IP addresses assigned]

The key aspect to understand about the floating IP network is that the host owning the physical NIC (peth0) acts as a gateway. When the virtual network 111.0.0.0/28 is started in libvirt, libvirt adds routes to the physical host’s routing table. The presence of a route to the 111.0.0.0/28 network allows the physical NIC to accept packets arriving for that network and forward them to the Neutron host’s br-ex bridge, which further routes them to the OpenStack VM instance that the IP has been assigned to. proxy_arp also causes peth0 to respond to ARP requests for the IP addresses in the route table. This means that, for security, we need an iptables rule to drop packets arriving at peth0 that are not destined for the floating IP range or for peth0 itself.
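
A hedged sketch of such a rule, assuming the physical NIC is eno1 and the floating range is 111.0.0.0/28 (adjust to your interface and addressing):

#Illustrative: refuse to forward traffic arriving on the physical NIC unless
#it is destined for the floating IP block. Traffic addressed to peth0 itself
#goes through the INPUT chain and is unaffected by this FORWARD rule.
iptables -A FORWARD -i eno1 ! -d 111.0.0.0/28 -j DROP

Keep in mind that Nova and Neutron maintain their own iptables chains, so add manual rules in a way that survives their updates (for example via /etc/sysconfig/iptables with the iptables service enabled as shown later).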

Any request for the 111.0.0.0 network from the OpenStack instances goes all the way from the OpenStack router to the physical host, as they are on connected bridges:

OpenStack Router -> Br-Ex -> virbr0 -> routing table on physical host -> final interface selection.

 Any request for other IP addresses uses the gateways:

Router -> Neutron host (br-ex) -> Physical host -> Upstream gateway

Public IP address recovery:

 IF       IP          Recoverable?   Why?
 peth0    111.0.0.0   No        Network (zero) address can't be used by VMs anyway.
 virbr0   111.0.0.1   No        Hop 0 routing gateway for OpenStack instances. 
                                This gateway has a route to datacenter/ISP gateway.
                                For security, we should have IP tables to reject 
                                INPUT to 111.0.0.1 from instances.                  
 br-ex    111.0.0.2   Yes       Neutron host br-ex does not need a public IP. 
 OpStkRtr 111.0.0.x   No        Acts as gateway for instances

How to recover IP addresses?

Permanently: on the Neutron host (after you install OpenStack with Packstack), update ifcfg-br-ex to have no IP address and add a static route in route-br-ex. This means that, unless there is another NIC with access to the external network, the Neutron node itself will not have access to the external network over br-ex. This is good for security, since no one can reach the Neutron host using an external IP, and it also saves a precious external IP address.

Prepare the host for packstack

Host is CentOS 7.1 with SELinux enabled.

Disable network manager:

service NetworkManager stop
chkconfig NetworkManager off
service network start
chkconfig network on

Disable firewalld because many scripts are not yet firewalld ready.

systemctl disable firewalld
systemctl stop firewalld

Enable iptables:

yum install iptables-services
systemctl start iptables
systemctl enable iptables

Control Node Virtual Machines

The main control nodes of OpenStack – controller/Horizon, Neutron, storage – are all virtual machines. Before you begin, make sure you have at least the virsh command installed. These VMs are created using virt-manager. If you do not plan to install the X Window System, you can instead edit XMLs based on an existing template dumped with the virsh dumpxml command (see the sketch below).
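
For example, a minimal sketch of cloning a definition from an existing template VM without X (the domain and file names here are placeholders):

#Dump an existing template VM's definition and use it as a starting point.
virsh dumpxml centos-template > controller.xml

#Edit controller.xml: change <name>, drop <uuid> and the MAC addresses so
#libvirt generates new ones, and point <source file=.../> at a fresh copy
#of the disk image. Then register and start the new VM.
virsh define controller.xml
virsh start controller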

VM Sizing Guide

Overall, the size of the OpenStack cloud is determined by the power of your CPU and the amount of memory. A dual-socket machine with a copious amount of RAM usually gives a better runway for maintaining this setup a little longer. You will also need a lot of HDD space to hold all the VMs (including instances launched from OpenStack). We had two low-end Haswell Xeon CPUs with 6 cores each, giving us 24 hardware threads with hyper-threading, 64 GB of RAM and 1.5 TB of HDD.

The controller is heavy on communication and CPU; a general guideline is to allow at least 4 vCPUs and 4 to 6 GB of memory for it. Neutron is CPU-heavy, so give it 2-4 vCPUs with 2 GB of memory for a small cloud.

Roughly 12 to 14 vCPUs were used for the various control nodes (controller, Neutron, storage) with about 12 to 14 GB of memory.

In reality, only 1-2 control VMs are actively used at any time; the others are generally idle. So most of the vCPUs can be over-allocated. Memory is not as flexible.

Storage

Storage is a separate VM with LVM volumes, created on the physical host, attached to it. This allows for easy management of storage space for Cinder and Swift. Once we get a separate storage server, it will be easy to move the storage and to add more capacity to the VM.

In our scenario, we use yet another VM for NFS for internal reasons, but NFS could also be part of the storage VM. Note that for Packstack, the attached LVM storage must already contain a volume group named ‘cinder-volumes’ (see the sketch below).
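
Inside the storage VM, preparing that volume group is a couple of commands. This sketch assumes the disk attached from the host shows up as /dev/vdb.

#Inside the storage VM: turn the attached disk into the volume group
#that Packstack expects for Cinder.
pvcreate /dev/vdb
vgcreate cinder-volumes /dev/vdb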

Note that /etc/hosts and the hostname must be set correctly on each node (see the sketch below).
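
A quick sketch for one node; the host names and addresses are placeholders, not our actual layout.

#On each node (CentOS 7): set the hostname and keep /etc/hosts consistent
#across all nodes.
hostnamectl set-hostname controller01.example.local

cat >> /etc/hosts <<'EOF'
172.16.10.11  controller01.example.local  controller01
172.16.10.12  neutron01.example.local     neutron01
172.16.10.13  storage01.example.local     storage01
EOF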

External Network Setup

Conceptually, OpenStack connects to the external network via a bridge (usually OVS). For production-type usage you may have multiple external networks, but the basic concept is the same: create a bridge to the NIC that is connected to the external network and tell OpenStack about it.

Most of the material in this section is from the RDO project’s page on external networks.

On the Neutron host, set up the external bridge.
External bridge name: br-ex
NIC used for external communications: eth2
File: ifcfg-br-ex

DEVICE=br-ex
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=none
ONBOOT=yes

Next, set up how traffic is routed to br-ex. This is needed if the Neutron host has a default gateway different from the one that would carry the external traffic.
So create the following file.
File: route-br-ex (make it chmod +x)

#Network route
ADDRESS0=111.0.0.0
NETMASK0=255.255.255.0
#Gateway(Note can't add GW in 0
#  - route not found error - probably a bug)
ADDRESS1=0.0.0.0
NETMASK1=0.0.0.0
GATEWAY1=111.0.0.1  #Note this should be the reachable IP on virbr0

File: ifcfg-eth2

NAME=eth2
HWADDR=<your eth2’s HW address>
TYPE=OVSPort
DEVICETYPE=ovs
OVS_BRIDGE=br-ex
ONBOOT=yes

Edit /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini to associate the physical network with the bridge mapping:

bridge_mappings =physnet1:br-ex
service network restart

Better yet, reboot the Neutron host VM. Then check with ovs-vsctl show that br-ex lists eth2 as one of its ports.

Clean up any old / default routers

neutron subnet-delete public_subnet
neutron router-gateway-clear router1
neutron router-delete router1
neutron net-delete public

Set up the network, subnet and router

neutron net-create public_net --router:external
neutron subnet-create --name public_subnet --enable_dhcp=False --allocation-pool=start=111.0.0.2,end=111.0.0.6  --dns-nameserver=8.8.8.8 --gateway=111.0.0.1 public_net 111.0.0.0/29
neutron router-create public_router
neutron router-gateway-set public_router public_net

Opening for Business

If you used the static route method to forward packets, then you’re already in business.

If you used proxy_arp method, open the main physical NIC for business by allowing it to respond to ARP requests:

sysctl net.ipv4.conf.eno1.proxy_arp=1

Note this responds to all ARP requests that match the routes in

ip route show

So make sure your routes are really what you own.

Launch a VM

Launch an instance and assign a floating IP. For testing purposes, don’t forget to allow SSH / Ping in a security group and apply that to the instance.
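
A hedged sketch using the Kilo-era CLI; the image, flavor and network names are placeholders and will differ in your environment.

#Allow SSH and ping in the default security group.
nova secgroup-add-rule default tcp 22 22 0.0.0.0/0
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0

#Boot a test instance on the tenant network, then allocate and attach
#a floating IP from the public pool created earlier.
nova boot --flavor m1.small --image centos7 --nic net-id=<tenant-net-id> testvm
nova floating-ip-create public_net
nova floating-ip-associate testvm <allocated-floating-ip>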

Ping the floating IP!

Bugs/Things we hit
  • Packstack requires the ‘default’ network to be present so that it can delete it, else the install fails [Juno & Kilo]
  • Error 500 “Oops something went wrong” on Horizon when we try to login after session expiry [Kilo]
  • Error 500 OR ISCSI Type Error during install of storage node [Juno]
  • Cinder Failure Due to Authentication [Juno]

Thanks

We learnt quite a bit from the following:

"Networking in Too Much Detail". rdoproject.org. Retrieved Jun 7, 2015.

"Diving into OpenStack Network Architecture". Ronen Kofman. Retrieved Jun 7, 2015.

"RDO Juno Set up Two Real Node…". Boris Derzhavets. Retrieved Nov 20, 2014.

"OpenStack Juno on CentOS 7". therandomsecurityguy.com. Retrieved Nov 20, 2014.

"OpenStack Documentation". docs.openstack.org. Retrieved Jun 7, 2015.