Monday, September 26, 2011

Stacks and stacks of PandaBoards

If you watch our scheduler at http://validation.linaro.org/lava-server/scheduler/ you may have noticed that even though we are increasing the number of continuous integration tests for Android and the Linux kernel, jobs have been clearing out much more quickly over the past few days.  We've added infrastructure and boards and now have 24 PandaBoards in the Linaro Validation Farm!  We've also updated our rack design to pack a lot more boards into less space more efficiently, while keeping them accessible and serviceable.  Here's a picture Dave sent me, showing a bit of what he's put in place there.


We did hit a bit of a snag with one thing, and I had anticipated it would be an issue quite a ways back.  We use linaro-media-create to construct the images in the same way anyone else using Linaro images would construct them, but running 30 of these in parallel will pretty much drag the server down to a crawl.  I did some quick tests of multiple linaro-media-create processes running locally, and the completion time for every l-m-c process running in parallel increases significantly with each new process you add.  Combine this with lots of boards, lots of jobs, and other I/O such as database hits, and it can take hours to complete just the image creation, which should only take minutes.  The long-term solution is that we are looking at things like celery to distribute big tasks out to other systems.  In the short term, simply serializing the l-m-c processes results in a significant performance improvement for all the jobs.
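If you're running your own instance and hit the same problem, one low-tech way to get a similar effect is to wrap each image build in a file lock so only one runs at a time on the machine. This is just a sketch of the idea, not what the dispatcher does internally, and the linaro-media-create options shown are only illustrative:

# Take an exclusive lock before building the image, so only one
# linaro-media-create runs at a time on this host; other invocations
# using the same lock file simply wait their turn.
flock /var/lock/linaro-media-create.lock \
    linaro-media-create --dev panda \
        --hwpack hwpack_linaro-panda.tar.gz \
        --binary linaro-ubuntu-desktop.tar.gz \
        --image_file lava-test.img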

Making LAVA more People-Friendly

One of the other new features of LAVA that's worth pointing out is a subtle but significant step toward making it a little friendlier for those trying to find the results they are looking for.  Internally, LAVA uses things like SHA1s on the bundles and UUIDs on the test runs as unique identifiers that can be transferred between systems.  Previously, we displayed these as the link names.  If you're looking through a results stream and trying to find the test you just ran on the ubuntu-desktop image with the lt-mx5 hardware pack, though, that's not very helpful.  You could, of course, go through the scheduler and link to the results there, but if you just wanted to browse the results in a bundle stream and look at the ones that interest you, there was no easy way to do that.

Now we use the job_name specified in the job you submit to the scheduler to give it a name. What you set the job_name field to is entirely up to you; it's all about making it mean something to the end user.  In a stream of results for daily testing of hardware packs and images, for instance, the hwpack name and datestamp and the image name and datestamp are simply used as the job_name.  Kernel CI results, Android CI results, and others will certainly have different names that mean more in their context.
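Just to illustrate (this is a hand-written sketch rather than one of our real jobs), a scheduler job is a JSON file, and the name shown in the results comes from its job_name field. Everything other than job_name below is abbreviated and only indicative of the shape of a job:

{
  "job_name": "panda-hwpack-20110926_ubuntu-desktop-20110926",
  "device_type": "panda",
  "timeout": 18000,
  "actions": []
}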

Tuesday, September 20, 2011

Configuring LAVA Dispatcher

An important new change will be landing in the release of LAVA Dispatcher this week, and it should be good news to anyone currently deploying the dispatcher. Configuration for your board types and test devices will no longer be in python modules, but in configuration files that you can keep across upgrades.

First off, if you don't have a config, a default will be provided for you. You'll probably want to tell it more about your environment though. If you are configuring it for the whole system, you will probably want to put your configs under /etc/xdg/lava-dispatcher/. If you are developing locally on your machine, you may want to use ~/.config/lava-dispatcher/ instead. 
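If you'd rather start from the shipped examples than write configs from scratch, something along these lines works for a per-user setup; the source path below assumes you're in a checkout of the lava-dispatcher source tree and is just illustrative:

# Copy the example configs into the per-user location and edit from there
mkdir -p ~/.config/lava-dispatcher
cp -r lava_dispatcher/default-config/lava-dispatcher/* ~/.config/lava-dispatcher/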

The main config file is lava-dispatcher.conf.  Here's an example:
#Main LAVA server IP in the boards farm
LAVA_SERVER_IP = 192.168.1.68

#Location for hosting rootfs/boot tarballs extracted from images
LAVA_IMAGE_TMPDIR = /var/www/images/tmp

#URL where LAVA_IMAGE_TMPDIR can be accessed remotely
#PWL - might not be needed
#LAVA_IMAGE_URL_DIR = /images/tmp
LAVA_IMAGE_URL = http://%(LAVA_SERVER_IP)s/images/tmp

#Default test result storage path
LAVA_RESULT_DIR = /lava/results

#Location for caching downloaded artifacts such as hwpacks and images
LAVA_CACHEDIR = /linaro/images/cache

# The URL pointing to the version of lava-test to install with pip
LAVA_TEST_URL = bzr+http://bazaar.launchpad.net/~linaro-validation/lava-test/trunk/#egg=lava-test

The big things to change here will be LAVA_SERVER_IP, which should be set to the address of the machine where you are running the dispatcher, and the directories.  LAVA_TEST_URL, by default, points at lava-test trunk in our bzr repository.  This means you'll always get the latest, bleeding-edge version.  If you don't like that, you can point it at a stable tarball, or even at your own branch with custom modifications.
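For instance, to pin lava-test to a fixed release instead of following trunk, you could point it at a release tarball; the URL below is only a placeholder for wherever the release actually lives:

# Pin lava-test to a specific release tarball instead of bzr trunk
LAVA_TEST_URL = http://example.com/releases/lava-test-0.4.tar.gz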

Next up is device-defaults.conf.  Look at the example under the lava_dispatcher/default-config directory in the source tree, because it's a bit longer.  Fortunately, most of it can probably go unchanged. You'll want to specify things like the default network interface, command prompts, and client types here.  For most people using Linaro images, this can just remain as-is.
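Just to give a flavour of the sort of settings it holds, here's a tiny sketch; the key names are my shorthand for the kinds of things the shipped example covers, so trust the example file over this:

# Network interface the test image brings up by default
default_network_interface = eth0

# How the dispatcher drives the board: master image, qemu, etc.
client_type = master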

The part you will almost certainly want to customize is in the devices and device-types directories.  First, a device-type config:
device-types/panda.conf


boot_cmds = mmc init,
    mmc part 0,
    setenv bootcmd "'fatload mmc 0:3 0x80200000 uImage; fatload mmc
    0:3 0x81600000 uInitrd; bootm 0x80200000 0x81600000'",
    setenv bootargs "' console=tty0 console=ttyO2,115200n8
    root=LABEL=testrootfs rootwait ro earlyprintk fixrtc nocompcache
    vram=48M omapfb.vram=0:24M mem=456M@0x80000000 mem=512M@0xA0000000'",
    boot
type = panda

boot_cmds_android = mmc init,
    mmc part 0,
    setenv bootcmd "'fatload mmc 0:3 0x80200000 uImage;
    fatload mmc 0:3 0x81600000 uInitrd;
    bootm 0x80200000 0x81600000'",
    setenv bootargs "'console=tty0 console=ttyO2,115200n8
    rootwait rw earlyprintk fixrtc nocompcache vram=48M
    omapfb.vram=0:24M,1:24M mem=456M@0x80000000 mem=512M@0xA0000000
    init=/init androidboot.console=ttyO2'",
    boot
If you are using a PandaBoard with Linaro images, you can probably just use this as it is.

Now to specify a device we want to test on:
devices/panda01.conf

device_type = panda
hostname = panda01
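Once a device config like this is in place, you can point the dispatcher at it by running a job. As a rough sketch, assuming you have a job file that targets panda01:

# Run the job against the board described by devices/panda01.conf
lava-dispatch panda01-boot-test.json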
And that's it. You'll want one of those for each board you have, and a device-type config file for each type of device you have. Many thanks to David Schwarz and Michael Hudson-Doyle for pulling this important change together and getting it merged. Oh, and what else for this release? LOTS! But more than I want to include in a single post. I'll try to hit some of the highlights in other postings around the release though. Enjoy :)