[Openstack] Xen image starts Kernel Panic in Diablo

Rogério Vinhal Nunes rogervn at dcc.ufmg.br
Thu Oct 6 14:30:19 UTC 2011


I've got it working right now using FlatDHCPManager. :)
The problem with the ttys is that there's no tty configured in the instance
image, but I can access it just fine with ssh, both the ttylinux and the
Ubuntu image. I may open a bug later to document the change I had to make
to libvirt.xml.template for it to work.

Just clarify one thing for me: why wouldn't FlatManager work by injecting
the fixed IP? Is it fine to use FlatDHCPManager with just one interface? I
have a controller, two compute nodes and a gateway in a simple L2 network.
The controller bridge now seems to have 2 IPs, because it responds to pings
on both.
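
For reference, the switch to FlatDHCPManager mentioned above comes down to
roughly the following nova.conf flags. This is only a sketch: the bridge name
is the one from the configuration quoted below, while the flat_interface value
(eth0) is an assumption standing in for whatever the single interface is
called on each host.

```
--network_manager=nova.network.manager.FlatDHCPManager
--flat_network_bridge=xenbr0
# assumption: eth0 is the single physical interface attached to the bridge
--flat_interface=eth0
```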

I'm still having some trouble with live migration, but I will do some more
research on it first and then, if I can't get it to work, I'll start
another thread about it.

Thanks for the help. :)

On October 5, 2011 at 11:16, Rogério Vinhal Nunes
<rogervn at dcc.ufmg.br> wrote:

> It's not clear to me why I should be using DHCP. I've configured the system
> to use the simple FlatManager; shouldn't it inject the desired IP into the
> image when the instance spawns? Is there a DHCP server running even if I
> don't use Vlan or FlatDHCP?
>
> I've actually used tcpdump to look for DHCP packets and found out that the
> instances are indeed sending broadcast requests and getting no response.
>
> My system is a cloud controller with two compute nodes; shouldn't I be able
> to set it up with FlatManager? My configuration was generated by the
> nova-install script, and I made some changes as needed to make it work. My
> bridge is configured as xenbr0 in the database and also in the system.
>
> This is the controller nova.conf:
>
> --dhcpbridge_flagfile=/etc/nova/nova.conf
> --dhcpbridge=/usr/bin/nova-dhcpbridge
> --logdir=/var/log/nova
> --state_path=/var/lib/nova
> --lock_path=/var/lock/nova
> --verbose
> --s3_host=127.0.0.1
> --rabbit_host=10.0.254.6
> --cc_host=10.0.254.6
> --ec2_url=http://10.0.254.6:8773/services/Cloud
> --fixed_range=10.0.10.0/24
> --network_size=100
> --FAKE_subdomain=ec2
> --routing_source_ip=10.0.254.6
> --verbose
> --sql_connection=mysql://root:root@10.0.254.6/nova
> --network_manager=nova.network.manager.FlatManager
> --flat_network_bridge=xenbr0
> --glance_api_servers=10.0.254.6:9292
> --image_service=nova.image.glance.GlanceImageService
>
> And this is the compute nodes' nova.conf:
>
> --dhcpbridge_flagfile=/etc/nova/nova.conf
> --dhcpbridge=/usr/bin/nova-dhcpbridge
> --logdir=/var/log/nova
> --state_path=/var/lib/nova
> --lock_path=/var/lock/nova
> --verbose
> --s3_host=10.0.254.6
> --rabbit_host=10.0.254.6
> --cc_host=10.0.254.6
> --ec2_url=http://10.0.254.6:8773/services/Cloud
> --sql_connection=mysql://root:root@10.0.254.6/nova
> --network_manager=nova.network.manager.FlatManager
> --flat_network_bridge=xenbr0
> --libvirt_type=xen
> --xenapi_remap_vbd_dev=true
> --glance_api_servers=10.0.254.6:9292
> --image_service=nova.image.glance.GlanceImageService
> --nouse_cow_images
>
>
> Thanks for the support, I may be a little more confused about the
> configuration than I thought.
>
> On October 4, 2011 at 18:24, Vishvananda Ishaya <vishvananda at gmail.com> wrote:
>
>> Yes that is the rule.  But that rule is not going to work if you don't
>> receive an ip address via dhcp.  So you need to make sure the dhcp piece is
>> working.  My guess is that once you get dhcp working, the metadata rule will
>> work since it looks like it is being created correctly.
>>
>> Vish
>>
>> On Oct 4, 2011, at 12:38 PM, Rogério Vinhal Nunes wrote:
>>
>> I don't think the ubuntu image is expecting xvda anywhere. The disk is sda
>> and, after I changed the libvirt.xml.template, so is the "root=" option.
>> The image is the ubuntu localimage mentioned in the documentation.
>>
>> I've tried to "telnet 169.254.169.254 32" in the compute node and it
>> didn't find anything, while "telnet 10.0.254.6 8773" does respond. This is
>> the part of the iptables-save output that deals with nova's routing. Did
>> the Diablo configuration also change in this regard?
>>
>> -A PREROUTING -j nova-compute-PREROUTING
>> -A PREROUTING -d 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination 10.0.254.6:8773
>> -A POSTROUTING -j nova-compute-POSTROUTING
>>
>> I understand that it should forward anything destined to
>> 169.254.169.254/32 port 80 to 10.0.254.6:8773. But that isn't happening
>> even with telnet. Does the nova-compute-PREROUTING rule have a part in this?
>> The configuration I've done is just running "iptables -A PREROUTING -d
>> 169.254.169.254/32 -p tcp -m tcp --dport 80 -j DNAT --to-destination
>> 10.0.254.6:8773" .
>>
>> On October 4, 2011 at 16:22, Vishvananda Ishaya <vishvananda at gmail.com> wrote:
>>
>>> It looks like your dhcp is failing for some reason.  There are a number
>>> of things that could theoretically cause this.   You might start using
>>> tcpdump to find out if the dhcp request packet is coming out of the vm and
>>> if it is being responded to by dnsmasq on the nova-network host.  I'm not
>>> sure about the ubuntu image, is it expecting xvda there?
>>>
>>> On Oct 4, 2011, at 12:02 PM, Rogério Vinhal Nunes wrote:
>>>
>>> That pretty much solved the disk image problem, thanks. You really should
>>> put that in the official documentation; there is no trace of that option
>>> in it.
>>>
>>> But it's still not working. After that I had to change the
>>> libvirt.xml.template to use sda instead of xvda in the root option. There
>>> should be sda in the root option or xvda in the disk option; mixing both
>>> of them, as is happening right now, will never work.
>>>
>>> Even after this small setup, it won't work. ttylinux image boots and
>>> complains about not being able to reach IP 169.254.169.254 (I've done the
>>> PREROUTING iptables configuration in the compute nodes), and stalls just
>>> after the "setting shared object cache" with no console opened:
>>>
>>> -----------
>>> startup crond  [  OK  ]
>>> wget: can't connect to remote host (169.254.169.254): Network is unreachable
>>> cloud-userdata: failed to read instance id
>>> ===== cloud-final: system completely up in 31.15 seconds ====
>>> wget: can't connect to remote host (169.254.169.254): Network is unreachable
>>> wget: can't connect to remote host (169.254.169.254): Network is unreachable
>>> wget: can't connect to remote host (169.254.169.254): Network is unreachable
>>>   instance-id:
>>>   public-ipv4:
>>>   local-ipv4 :
>>> => First-Boot Sequence:
>>> setting shared object cache [running ldconfig]  [  OK  ]
>>> -----------
>>>
>>> The Ubuntu image does almost the same, though with no networking errors:
>>> it mounts the ext4 filesystem and then just hangs with no console:
>>>
>>> -----------
>>> [    0.160477] md: ... autorun DONE.
>>> [    0.160659] EXT3-fs (sda): error: couldn't mount because of unsupported optional features (240)
>>> [    0.160909] EXT2-fs (sda): error: couldn't mount because of unsupported optional features (240)
>>> [    0.161908] EXT4-fs (sda): mounted filesystem with ordered data mode. Opts: (null)
>>> [    0.161930] VFS: Mounted root (ext4 filesystem) readonly on device 202:0.
>>> [    0.178088] devtmpfs: mounted
>>> [    0.178195] Freeing unused kernel memory: 828k freed
>>> [    0.178425] Write protecting the kernel read-only data: 10240k
>>> [    0.183386] Freeing unused kernel memory: 308k freed
>>> [    0.184074] Freeing unused kernel memory: 1612k freed
>>> mountall: Disconnected from Plymouth
>>> ------------
>>>
>>> Both of them are assigned IPs for the configured nova-network in the
>>> "euca-describe-instances", but none of them ping back.
>>>
>>> I feel I'm getting really close to getting this working. If you guys
>>> could lend me a little more help I would very much appreciate it.
>>>
>>> On October 4, 2011 at 09:12, Vishvananda Ishaya <vishvananda at gmail.com> wrote:
>>>
>>>> You may need to set --nouse_cow_images
>>>> Sounds like your image might be a copy on write qcow2 with a backing
>>>> file. You can verify that with qemu-img info /var/lib/nova/instances/disk
>>>> That kind of image won't work with xen.
>>>> On Oct 3, 2011 9:44 AM, "Rogério Vinhal Nunes" <rogervn at dcc.ufmg.br>
>>>> wrote:
>>>> > Hey guys, I'm still trying to get this working, but I still don't
>>>> > understand what's happening.
>>>> >
>>>> > In the ttylinux busybox I do a fdisk -l and it says the disk is only
>>>> > 18 MB large and doesn't have a valid partition table:
>>>> >
>>>> > ------------
>>>> > / # fdisk -l
>>>> >
>>>> > Disk /dev/sda: 18 MB, 18874368 bytes
>>>> > 255 heads, 63 sectors/track, 2 cylinders
>>>> > Units = cylinders of 16065 * 512 = 8225280 bytes
>>>> >
>>>> > Disk /dev/sda doesn't contain a valid partition table
>>>> > ------------
>>>> >
>>>> > When looking in the instance directory as I said before, the image is
>>>> > only 18 MB large (while I think ttylinux should be 24 MB); this may be
>>>> > a problem. I'm using glance as an image server and mounted the
>>>> > /var/lib/instances using NFS from the cloud controller.
>>>> >
>>>> > What can I do to get more information? I need to get this
>>>> > configuration working.
>>>> >
>>>> > On September 28, 2011 at 15:50, Rogério Vinhal Nunes
>>>> > <rogervn at dcc.ufmg.br> wrote:
>>>> >
>>>> >> Yes, I've tried the ttylinux image just now. It starts the instance,
>>>> >> but it booted into a busybox shell, probably a recovery shell from the
>>>> >> initrd (see the output at the end of this e-mail). I can access the
>>>> >> instance by doing a xl console in the host; describe-instances shows
>>>> >> the status "running test".
>>>> >>
>>>> >> I've successfully booted a separate VM with an old image I used with
>>>> >> Xen, using Xen + libvirt and just changing the openstack's
>>>> >> libvirt.xml. It works just fine.
>>>> >>
>>>> >> The files in the instance dir in /var/lib/nova/instances look like this:
>>>> >>
>>>> >> 0 -rw-r----- 1 nova nogroup 0 2011-09-28 15:43 console.log
>>>> >> 15M -rw-r--r-- 1 nova nogroup 18M 2011-09-28 15:43 disk
>>>> >> 4,3M -rw-r--r-- 1 nova nogroup 4,3M 2011-09-28 15:43 kernel
>>>> >> 4,0K -rw-r--r-- 1 nova nogroup 1,3K 2011-09-28 15:43 libvirt.xml
>>>> >> 5,7M -rw-r--r-- 1 nova nogroup 5,7M 2011-09-28 15:43 ramdisk
>>>> >>
>>>> >> this is the last output I get when I get into the instance console:
>>>> >>
>>>> >> [ 0.078066] blkfront: sda: barriers enabled
>>>> >> [ 0.078394] sda: unknown partition table
>>>> >> [ 0.170040] XENBUS: Device with no driver: device/vkbd/0
>>>> >> [ 0.170051] XENBUS: Device with no driver: device/vfb/0
>>>> >> [ 0.170056] XENBUS: Device with no driver: device/console/0
>>>> >> [ 0.170074] Magic number: 1:252:3141
>>>> >> [ 0.170114] /build/buildd/linux-2.6.35/drivers/rtc/hctosys.c: unable to open rtc device (rtc0)
>>>> >> [ 0.170122] BIOS EDD facility v0.16 2004-Jun-25, 0 devices found
>>>> >> [ 0.170127] EDD information not available.
>>>> >> [ 0.170259] Freeing unused kernel memory: 828k freed
>>>> >> [ 0.170460] Write protecting the kernel read-only data: 10240k
>>>> >> [ 0.173975] Freeing unused kernel memory: 308k freed
>>>> >> [ 0.174481] Freeing unused kernel memory: 1620k freed
>>>> >> badness occurred in ramdisk
>>>> >>
>>>> >>
>>>> >> BusyBox v1.15.3 (Ubuntu 1:1.15.3-1ubuntu5) built-in shell (ash)
>>>> >> Enter 'help' for a list of built-in commands.
>>>> >>
>>>> >> /bin/sh: can't access tty; job control turned off
>>>> >> / #
>>>> >> / #
>>>> >>
>>>> >>
>>>> >> On September 28, 2011 at 14:06, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
>>>> >>
>>>> >>> Can you try with the ttylinux images and see if those work for you?
>>>> >>>
>>>> >>> I know when I tried it I had to adjust the libvirt XML that was being
>>>> >>> created (which may not have been the right solution) to get those to
>>>> >>> work.
>>>> >>>
>>>> >>> I think the ttylinux ones might work better (from the last time I
>>>> >>> tried).
>>>> >>>
>>>> >>>
>>>> >>> On 9/27/11 7:11 PM, "Todd Deshane" <todd.deshane at xen.org> wrote:
>>>> >>>
>>>> >>> 2011/9/27 Rogério Vinhal Nunes <rogervn at dcc.ufmg.br>:
>>>> >>> > Hello, I've upgraded to Diablo to see if this issue was resolved, but
>>>> >>> > apparently it isn't.
>>>> >>> >
>>>> >>> > There is already a thread talking about it, but it didn't come to a
>>>> >>> > solution that I could use. With Openstack configured with Xen and
>>>> >>> > libvirt on Ubuntu 10.04, whenever I run an instance it starts but
>>>> >>> > then stops with a kernel panic: it tries to mount root using xvda,
>>>> >>> > but sda is the only device available.
>>>> >>> >
>>>> >>> > I'm using Diablo's nova + glance and the
>>>> >>> > ubuntu1010-UEC-localuser-image.tar.gz from the manual.
>>>> >>> >
>>>> >>> > The kernel panic is like this:
>>>> >>> >
>>>> >>> > [ 0.170563] VFS: Cannot open root device "xvda" or unknown-block(0,0)
>>>> >>> > [ 0.170572] Please append a correct "root=" boot option; here are the available partitions:
>>>> >>> > [ 0.170585] ca00 32768 sda driver: vbd
>>>> >>> > [ 0.170594] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
>>>> >>> > [ 0.170604] Pid: 1, comm: swapper Not tainted 2.6.35-24-virtual #42-Ubuntu
>>>> >>> >
>>>> >>> > I've played with libvirt.xml.template, which changed a lot since
>>>> >>> > Cactus, and tried replacing xvda with sda. The kernel panic didn't
>>>> >>> > go away, it just changed a little:
>>>> >>> >
>>>> >>> > [ 0.161237] List of all partitions:
>>>> >>> > [ 0.161248] ca00 32768 sda driver: vbd
>>>> >>> > [ 0.161257] No filesystem could mount root, tried: ext3 ext2 ext4 fuseblk
>>>> >>> > [ 0.161275] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(202,0)
>>>> >>> > [ 0.161286] Pid: 1, comm: swapper Not tainted 2.6.35-24-virtual #42-Ubuntu
>>>> >>> >
>>>> >>>
>>>> >>> Are you able to start a guest manually with Xen + libvirt (without
>>>> >>> OpenStack)?
>>>> >>>
>>>> >>> That's the first step to debugging this issue.
>>>> >>>
>>>> >>> > I've tried using --xenapi_remap_vbd_dev=true, but it didn't work (it
>>>> >>> > wouldn't anyway, because I'm using libvirt, not xenapi).
>>>> >>> >
>>>> >>>
>>>> >>> Would XCP or XenServer work for you in general? (The XenAPI-based
>>>> >>> hypervisors are more tested and even have more features compared with
>>>> >>> libvirt - http://wiki.openstack.org/XenAPI)
>>>> >>>
>>>> >>>
>>>> >>> Or even Project Kronos (also uses XCP/XenServer toolstack)
>>>> >>> http://blog.xen.org/index.php/2011/07/22/project-kronos/
>>>> >>>
>>>> >>> Thanks,
>>>> >>> Todd
>>>> >>>
>>>> >>> --
>>>> >>> Todd Deshane
>>>> >>> http://www.linkedin.com/in/deshantm
>>>> >>> http://www.xen.org/products/cloudxen.html
>>>> >>> http://runningxen.com/
>>>> >>>