Tuesday, November 8, 2011

Some Rock Solid things are Quite Beautiful

I mentioned at the closing session of UDS last week, that it was remarkable UDS due to the amount of time spent discussing how we build Ubuntu, not just what we will build. As a community, we have blazed many new trails in software development and delivery, and I feel strongly that we are standing at a nexus where we will be able to collect the wisdom of the past seven years of developing an Open Source Community Distro and apply that wisdom in the future in a way that introduces a step function in our adoption curve as we pursue our goal of a mainstream free desktop.

Most of these "how should we build it" discussions circled around building and maintaining development velocity in 12.04 so that we could add new features that users need while maintaining and delivering the quality they also need. Fortunately, we laid some ground work in the 11.10 cycle. Pete Graner led the QA team in 11.10. Along with some new QA staff, they instituted some important practices, like automatically testing the daily ISO each day, and they set up a QA lab for running tests automatically, along with reporting of those test results.

Acceptance Criteria
How we assess and accept code from upsreams within Canonical was an area ripe for discussion. We arrived at UDS with a strong view about how we should be doing this, and we had three discussions about it at UDS. Jason Warner this area of collective effort "Acceptance Criteria". Please note that for 12.04 and the foreseeable future, this applies ONLY to code developed by Canonical, or was call them "Canonical upstreams". There are two main goals of this effort:
  • Ensure that landing code from Canonical upstreams does not introduce issues that make the development release too hard to use and develop on, thus slowing down velocity for everyone.
  • Encourage Canonical upstreams to fix bugs throughout the cycle, and not to wait until after Feature Freeze to focus all efforts on bug fixing.
As a result, we should see faster development, and a higher quality final product.

Acceptance Criteria means that upstreams have some responsibilities, and Ubuntu has some responsibilities. For upstreams, it boils down to "treat your trunk as sacred". Practically, it requires:
  • There is a trunk of code bound for Ubuntu.
  • This trunk always builds automatically.
  • This trunk has tests that are always passing automatically.
  • All branches are properly reviewed for having both good tests and good implementation before merged into trunk.
  • Any branch that breaks trunk by causing automated tests to fail or causes trunk to stop building, are immediately reverted.
For Ubuntu Engineering, the responsibilities include:
  • Every maintainer in Ubuntu must have a test plan for upstream trunks that are run before uploading to the development release.
  • Tests in the test plan that are automated can be run with the help of the QA team.
  • Tests in the test plan that are manual can be run with the help of the community team, (and the community QA Lead that is to be hired)
  • Refrain from uploading a trunk into Ubuntu if there are serious bugs found in testing that will slow down people using the development release for testing and development.
  • Revert uploads that break Ubuntu, as there is no point in having the latest of a trunk in Ubuntu if it's broken and just slowing everyone down.
  • Add tests to upstream projects for the Ubuntu test plan if serious bugs do get through that cause a revert.
ISO integration testing
Every day the QA team automatically runs tests on the ISO produced that day, if any. This was set up in 11.10. For 12.04, we will expand on this effort substantially. First, by growing the body of tests run. Secondly, by automatically reporting on the quality of the ISO each day. Finally, by responding to the results of the tests immediately, see the next section.

Daily Quality
We will strive to ensure that we have a new daily ISO each day. If the QA team finds an ISO to be "untestable" or failing critical tests that will hamper development velocity that day, we will respond by trying to fix the issues. For issues that cannot be foreseeably fixed within 3 hours, we will typically back out those changes. After the issue is addressed by being fixed or backed out, we *will spin another ISO*. We will collect metrics on what percentage of days we have a working and testable ISO.

Pre-archive Testing
Of course, catching problems in ISO testing and fixing them everyday is nice, but stopping the problems from reaching Ubuntu in the first place is even better. With that end in mind, Evan Dandrea ran a very interesting session about testing library and other changes before uploading them to the archive for the development release.

This will start out simple. The QA team will be able to install the latest ISO in their test environment, then pull an updated package from a PPA. A tool that I lovingly named "tool 2" will be created that will use rdepends to find packages that both depend on the newly upagraded package and also have tests worth running. Tool 2 will then run those tests and report the results. In this way, issues with libraries and other transitions can be fixed before they get into the development release and slow everyone down.

The next step, which I sincerely hope we get to during 12.04 development is to make tool 1. Tool 1 takes the output of tool 2, and judges if it should copy the newly updated package, or some set of packages, into the release archive. If we get tool 2 set up, then uploads to the development release would first go into the proposed archive for the development release, tested there, and only added to the release archive when found to be "ok" by the tests.

+1 Maintenance Team
Colin Watson is leading an experiment for 12.04 development. He will be leading a small staff of rotating engineers who are focused soley on the stability of the development release. We plan to learn from this effort, and see if we should repeat it, tweak it, or drop it for future versions. In any case, these efforts are meant to enhance, not replace, the generally diffused responsibilities for quality of the ISO and the archive. Colin led a good UDS session on the topic. The priorities defined there being:
  1. Deal with ISO smoke test issues, includes install images being buildable
  2. Upgrades through development releases work day to day, look at conflict checker
  3. All packages in main are installable
  4. All packages in main are buildable
  5. Component-mismatches / MIRs / etc.
  6. Finishing library/NBS transitions through archive (beyond main)
  7. All packages in the archive are buildable
Summary
All in all, these are a lot of structural changes to how we approach building Ubuntu and ensuring the quality of it. Here is a table to highlight some of the key changes.


Practice 11.10 12.04
Canonical upstream automated tests prevelant in some projects, not others all projects will have automated tests
Canonical upstream automated builds prevelant in some projets, not others all projects will ahve automated tests
Canonical upstream branch reviews all projects all projects
Canonical upstreams reverting branches that "break" trunk occurs in some projects, not others, not always
all projects will revert branches that break trunk
testing of Canonical upstream code before uploading to development release not common or systematic all Canonical upstream projects will have a test plan that is executed before each upload
reverting Canonical upstream code from the development release development release waits for an upload with "fixes" uploads that slow velocity for others are backed out
Daily ISO testing Limited test coverage, test failures not immediately responded to more test coverage, fix issues and respin within the same day
Daily ISO Can go for days or longer wihtout a working daily A broken daily ISO becomes the #1 priority for whoever broke it
Pre-archive testing "Call for testing", not totally systematic, no way to know what tests were run Add ability to run tests for rdpends before release, potentially test in proposed for the development release
Archive Maintenance Responibility diffused throughout team Responibility diffused throughout team, complimented by a small dedicated crew

2 comments:

  1. Do we have a standard way for packages to declare their tests? Or is it be a part of the package build process?

    I'm trying to figure out how we'll know what tests to run when we search the rdepends for pre-archive testing.

    ReplyDelete
  2. @scottrichie http://dep.debian.net/deps/dep8/ is what we'll be using as a basis for the pre-archive testing.

    ReplyDelete