Diff for "AutoDistUpgradeTestingSpec"

AutoDistUpgradeTestingSpec

Differences between revisions 1 and 10 (spanning 9 versions)

Launchpad Entry: https://launchpad.net/distros/ubuntu/+spec/auto-dist-upgrade-testing
Created: Date(2006-06-08T07:54:09Z) by MichaelVogt
Contributors: MichaelVogt
Packages affected: update-manager

Summary

Automatic non-interactive testing to see if upgrades from the current to the next release work.

Rationale

During the development of the ReleaseUpgrader, many bugs were found by users (the hard way) that automated testing could have caught.

Use cases

Package A overwrites files in package B without declaring a conflict.
Package C has a postinst script that fails on the upgrade to edgy, but only when upgrading from the particular version in dapper.
If package D is installed, the upgrade can't be calculated because it declares bad dependencies.

Scope

The following classes of bugs should be detected:

post-inst failing
file overwrite problems
bogus conffile prompts
dependency problems ($foo-desktop not installable/upgradable)
held-back packages (e.g. xserver-xorg-driver-$foo)
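As an illustration of how a non-interactive run might sort log output into the classes above, here is a minimal sketch. The regular expressions are assumptions based on typical dpkg and apt wording; the actual dist-upgrader logs may phrase these errors differently, and `classify_log` is a hypothetical helper, not part of update-manager.

```python
import re

# Assumed patterns, modelled on common dpkg/apt messages.
# They are illustrative and would need checking against real upgrade logs.
BUG_PATTERNS = {
    "postinst-failure": re.compile(r"post-installation script returned error exit status"),
    "file-overwrite": re.compile(r"trying to overwrite"),
    "conffile-prompt": re.compile(r"^Configuration file [`']"),
    "dependency-problem": re.compile(r"unmet dependencies"),
    "held-back": re.compile(r"packages have been kept back"),
}


def classify_log(lines):
    """Map each bug class to the 1-based line numbers where it was seen."""
    found = {}
    for number, line in enumerate(lines, start=1):
        for name, pattern in BUG_PATTERNS.items():
            if pattern.search(line):
                found.setdefault(name, []).append(number)
    return found
```

Classes that never match are simply absent from the result, so an empty dictionary means none of the listed problems was recognised in the log.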

Design

The DistUpgrade code in update-manager is used for doing the actual upgrade. A new non-interactive frontend is written which catches the above errors. Besides the automatic mode, there should be a way to quickly feed the application a single package (or a selection of packages) to test the upgradability of that particular set (quite useful for confirming bug reports).

This test is done in a chroot with a dpkg-diverted invoke-rc.d. First we build the edgy chroot; the tool then upgrades it automatically by copying the dist-upgrader into the chroot and running it there. After the upgrade, the logs from $chroot/var/log/dist-upgrade/* are copied out and stored. The result of the test is mailed to a new mailing list, test-failures@lists.ubuntu.com.
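The chroot procedure described above could be planned roughly as follows. This is only a sketch: building the chroot with debootstrap, the mirror URL, the /tmp/dist-upgrader target path, the /bin/true stub and the `upgrader_cmd` argument are all assumptions for illustration, and the function merely returns the commands instead of running them.

```python
def plan_chroot_test(chroot, upgrader_src, results_dir, upgrader_cmd,
                     release="edgy",
                     mirror="http://archive.ubuntu.com/ubuntu"):
    """Return the argv lists for one chroot upgrade test, in order.

    upgrader_cmd is the (hypothetical) command that starts the
    non-interactive upgrade inside the chroot; it is supplied by the caller.
    """
    chroot = chroot.rstrip("/")
    return [
        # 1. Build the base chroot (debootstrap is an assumption).
        ["debootstrap", release, chroot, mirror],
        # 2. Divert invoke-rc.d so maintainer scripts cannot start services.
        ["chroot", chroot, "dpkg-divert", "--local", "--rename",
         "--add", "/usr/sbin/invoke-rc.d"],
        # 3. Put a harmless stub in its place.
        ["chroot", chroot, "ln", "-s", "/bin/true", "/usr/sbin/invoke-rc.d"],
        # 4. Copy the dist-upgrader into the chroot (target path is assumed).
        ["cp", "-a", upgrader_src, chroot + "/tmp/dist-upgrader"],
        # 5. Run the upgrade inside the chroot.
        ["chroot", chroot] + list(upgrader_cmd),
        # 6. Copy the upgrade logs out for storage.
        ["cp", "-a", chroot + "/var/log/dist-upgrade/.", results_dir],
    ]
```

Returning the plan as data keeps the sketch side-effect free; a driver could pass each entry to `subprocess.run` and stop at the first failing step.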

We need to test the following cases:

{ubuntu,kubuntu,edubuntu,xubuntu}-desktop upgrade (no other packages)
server mode
all of main (that we can possibly install in parallel; report what we can't install)

MattZimmerman: as discussed, we need an algorithm for generating the maximal subset of main which can be installed in parallel. Please document here.
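The spec does not yet document this algorithm. One possible starting point, shown purely as a sketch, is a greedy pass that considers only pairwise Conflicts: it yields a maximal set (nothing rejected could be added back), not necessarily the largest one, and it ignores dependencies entirely. The `conflicts` mapping and the function name are hypothetical.

```python
def maximal_coinstallable(packages, conflicts):
    """Greedily pick packages that do not conflict with earlier picks.

    conflicts maps a package name to the set of names it conflicts with;
    a conflict declared in either direction counts.
    Returns (selected, rejected), both in sorted processing order.
    """
    selected = []
    rejected = []
    chosen = set()
    for pkg in sorted(packages):
        clashes = any(
            other in conflicts.get(pkg, ()) or pkg in conflicts.get(other, ())
            for other in chosen
        )
        if clashes:
            rejected.append(pkg)
        else:
            selected.append(pkg)
            chosen.add(pkg)
    return selected, rejected
```

The rejected list corresponds to the "report what we can't install" part of the test case; a real implementation would have to resolve dependencies with apt as well.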

We do this testing for every release architecture.

Before each test we run a simulation with a faked status file to see how the upgrade would go. This is much quicker than the actual upgrade, so we can perform more tests for obscure combinations. We can do this using python-apt or aptitude -s. This allows us to catch certain common cases of failure due to dependency problems before performing a time-consuming full upgrade test.
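A faked status file for such a simulation could be generated along these lines. This is a sketch under assumptions: the stanzas carry only a minimal set of fields (apt may expect more, such as Depends), the default architecture is illustrative, and pointing aptitude at the file through the `Dir::State::status` option is one plausible way to wire it up, not necessarily what the tool does.

```python
def fake_status(packages, arch="i386"):
    """Build dpkg-status-style text marking every package as installed.

    packages maps package name to version; fields are kept to a minimum.
    """
    stanzas = []
    for name, version in sorted(packages.items()):
        stanzas.append(
            f"Package: {name}\n"
            "Status: install ok installed\n"
            f"Architecture: {arch}\n"
            f"Version: {version}\n"
        )
    # Stanzas are separated by one blank line.
    return "\n".join(stanzas)


def simulate_command(status_path):
    """Return an argv for a simulated dist-upgrade against the fake file."""
    return ["aptitude", "-s",
            "-o", "Dir::State::status=" + status_path,
            "dist-upgrade"]
```

Because only text is generated, many obscure package combinations can be prepared cheaply and fed to the simulation one after another.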

We check for packages that were installed and in main in edgy but get removed by the upgrade (assuming that the removed packages are still in main in feisty). For the rare cases where this is not a bug, we use a whitelist.
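The removal check amounts to a set difference filtered by archive membership and the whitelist. A minimal sketch follows; the function and parameter names are hypothetical, and how the package lists are obtained is left open.

```python
def unexpected_removals(installed_before, installed_after,
                        main_before, main_after, whitelist=frozenset()):
    """List packages the upgrade removed although they are still in main.

    A package is reported when it was installed before the upgrade, is gone
    afterwards, was in main for the old release, is still in main for the
    new release, and is not whitelisted.
    """
    removed = set(installed_before) - set(installed_after)
    return sorted(
        pkg for pkg in removed
        if pkg in main_before and pkg in main_after and pkg not in whitelist
    )
```

A package that left main in the new release is not reported, which matches the assumption stated above.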

The test results will be mailed to a new testing mailing list.

Implementation

This will be implemented as an additional frontend to the ReleaseUpgrader plus a tool which drives it by building the chroot, copying the right files into place and mailing the results. It will then be deployed on a machine in the datacenter, where it will automatically run through a set of tests daily and report any errors as described.

Code

Coding started in the http://people.ubuntu.com/~mvo/bzr/update-manager/non-interactive/ branch. It will be merged into the main dist-upgrader branch eventually.

Future work

The initial code uses a chroot to do the testing, but this has the disadvantage that we don't catch all errors (e.g. because we have to divert some binaries like invoke-rc.d). So running it inside Xen is probably a good thing for the future.

Additional tests for the future:

$foo-desktop + selection of popular packages from main (various permutations)
$foo-desktop + selection of popular packages from main+universe
iwj suggests that we randomly choose a user's PopularityContest data instead of just a random package

More upgrade scenarios could be auto-tested: upgrades from stock installs (without any updates from -security or -updates), upgrades with both -security and -updates, and upgrades with everything (security, updates, backports).
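These starting states differ only in which pockets are enabled in the chroot's sources.list before the upgrade. A sketch of generating that file per scenario is shown below; the scenario names, the mirror URL and the component list are assumptions chosen for illustration.

```python
# Hypothetical scenario names mapped to the pockets they enable.
SCENARIOS = {
    "stock": [],
    "security+updates": ["security", "updates"],
    "everything": ["security", "updates", "backports"],
}


def sources_list(release, scenario,
                 mirror="http://archive.ubuntu.com/ubuntu",
                 components="main restricted"):
    """Return sources.list text for the given release and scenario."""
    suites = [release] + [f"{release}-{pocket}" for pocket in SCENARIOS[scenario]]
    return "".join(f"deb {mirror} {suite} {components}\n" for suite in suites)
```

Running the same upgrade test once per scenario would then only require writing a different file into the chroot before updating and upgrading.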

We should consider adding a feature to simulate an upgrade with a user's setup. This would perform a non-interactive dist-upgrade in a chroot with the user's settings (package selections + /etc) as the base of the setup. We could then ask users for real-world testing without risking broken systems.

The results could also be sent (via http POST) to the ScalableInstallTesting database.

CategorySpec

AutoDistUpgradeTestingSpec (last edited 2008-08-06 16:31:07 by localhost)

- Revision 1 as of 2006-06-08 07:54:09
  Size: 1544
  Editor: p54A6673E
  Comment: initial creation
+ Revision 10 as of 2006-11-10 01:49:23
  Size: 4850
  Editor: 207
  Comment: refine
-Deletions are marked like this.
+Additions are marked like this.
 Line 3:
- * '''Launchpad Entry''': https://launchpad.net/distros/ubuntu/+spec/xdeltas
+ * '''Launchpad Entry''': https://launchpad.net/distros/ubuntu/+spec/auto-dist-upgrade-testing
 Line 6:
- * '''Packages affected''':
+ * '''Packages affected''': update-manager
 Line 10:
-Automatic testing if upgrades from the current to the next release work.
+Automatic non-interactive testing to see if upgrades from the current to the next release work.
 Line 14:
-During the development of the dist-upgrader it turned out that a lot of bugs were found by users (the hard way) that could have been easily found with automated testing (post-inst failing, file overwrite problems).
+During the development of the ReleaseUpgrader it turned out that a lot of bugs were found by users (the hard way) that could have been found with automated testing.
 Line 18:
-Package A overwrites files in package B without declaring a conflict. Package C has a failing postinst script on the dapper upgrade that only fails if the upgrade happens from the particular version in breezy.
+ * Package A overwrites files in package B without declaring a conflict. 
 * Package C has a failing postinst script on the edgy upgrade that only fails if the upgrade happens from the particular version in dapper.
 * If the package D is installed the upgrade can't be calculated because it declares bad dependencies.
-Line 22:
+Line 24:
-Initially it should test upgradability of {ubuntu,kubuntu,xubuntu}-desktop, then most of main, then selected ranges of universe. Failures must be reported automatically via mail.
+The following class of bugs should be detected:
 * post-inst failing
 * file overwrite problems
 * bogus conffile prompts
 * dependency problems ($foo-desktop not installable/upgradable)
 * held-back packages (e.g. xserver-xorg-driver-$foo)
-Line 26:
+Line 33:
-Use the dist-upgrader code for the basic functinality of updating the sources.list, doing the dist-ugprade and catching errors via a non-interactive frontend.
+The DistUpgrade code in update-manager is used for doing the actual upgrade. A new `non-interactive` frontend is written which catches the above errors. Besides the automatic mode, there should be a way to quickly feed the application with a single package (or a selection of packages) to test upgradability of this particular set (quite useful to confirm bug reports).

This test is done in a chroot with dpkg-diverted invoke-rc.d. First we build the edgy chroot and then it will automatically upgrade by copying the dist-upgrader to the chroot and running it there. After the upgrade the upgrade logs from $chroot/var/log/dist-upgrade/* are copied and stored. The result of the test is mailed to a new mailing list, `test-failures@lists.ubuntu.com`.

We need to test the following cases:
 1. {ubuntu,kubuntu,edubuntu,xubuntu}-desktop upgrade (no other packages)
 1. server mode
 1. all of main (that we can possibly install, report what we can't install
    in parallel) 

''MattZimmerman: as discussed, we need an algorithm for generating the maximal subset of main which can be installed in parallel.  Please document here.''

We do this testing for every release architecture. 

Before each test we run a simulation with a faked status file and simulate the upgrade to see how it goes. This is much much quicker than the actual upgrade and we can perform more tests for obscure combinations.  We can do this using {{{python-apt}}} or {{{aptitude -s}}}.  This allows us to catch certain common cases of failure due to dependency problems before performing a time-consuming full upgrade test.

We check for packages that were in main in edgy and installed but get removed by the upgrade (assuming that the removed packages are still in main for feisty). For the rare cases that this is not a bug we use a whitelist.

The test results will be mailed to a new testing mailing list.
-Line 30:
+Line 55:
+This will be implemented as an additional frontend to the ReleaseUpgrader + a tool which drives this by building the chroot, copying the right files into place and mailing the results. It will then be deployed on a machine in the datacenter where it will automatically run through a set of tests daily and report any errors as described.
-Line 32:
+Line 59:
-Code was written for the breezy->dapper testing in the http://people.ubuntu.com/~mvo/bzr/update-manager/non-interactive/ branch. This can be used as a basis for the automatic testing.
+Coding started in the http://people.ubuntu.com/~mvo/bzr/update-manager/non-interactive/ branch. It will be merged into the main dist-upgrader branch eventually.
-Line 34:
+Line 61:
-=== Data preservation and migration ===
+== Future work ==
-Line 36:
+Line 63:
-== Outstanding issues ==
+The initial code uses a chroot to do the testing, but this has the disadvantage
that we don't catch all errors (e.g. because we have to divert some binaries
like invoke-rc.d). So running it inside XEN is probably a good thing for the future.
-Line 38:
+Line 67:
-== BoF agenda and discussion ==
+Additional tests for the future:
 * $foo-desktop + selection of popular packages from main (various permutations)
 * $foo-desktop + selection of popular packages from main+universe 
 * iwj suggests that we randomly choose a user's PopularityContest data instead of just a random package

More upgrade scenarios could be auto-tested, like upgrades from stock installs (without any updates from -security, -updates). Upgrades with both -security, -updates. Upgrade with everything (security, updates, backports).

We should consider adding a feature to simulate an upgrade with a user's setup. This would perform a non-interactive dist-upgrade in a chroot with the user's settings (package selections+/etc) as the base of the setup. We could then ask users for real-world testing without risking broken systems.

The results could also be sent (via http POST) to the ScalableInstallTesting database.
