Prefetch

Revision 4 as of 2007-11-22 09:36:32


Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

  • Launchpad Entry: prefetch

  • Packages affected: kernel, readahead, prefetch

Summary

One of our 2007 Summer of Code projects was the development of kernel patches and other necessary parts to prefetch files and speed up the boot sequence. Evaluate and plan to integrate.

Release Note

Booting the computer and starting applications is now faster, thanks to the new "prefetch" project. It profiles which files are read and in which order, and automatically optimizes the order of the data on the hard disk.

Rationale

Our current solution, readahead, has some problems which the new prefetch solves:

  • Boot order needs to be maintained and updated manually. This is easily forgotten by the developers at release crunch time.
  • Requires updating the CDs, since it has a static profile.
  • Boot time gets worse over time when the disk layout changes, since the profile does not update itself.

History and Features

The problem of optimizing hard disk reads has been tackled by the following projects so far:

readahead:

  • introduced into Hoary in 2004, Gutsy still has the very same version
  • only solution that has ever been in Ubuntu
  • no automatic profiling
  • profiling steps: inotify the entire file system as the very first step
  • in initramfs, boot and record which files are read (and their order)
  • the profile is saved, put into the package, and uploaded. This needs to happen for at least the beta, RC, and final releases, and we must not forget about it.
  • high RAM usage
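
To illustrate the static-profile approach described above, here is a minimal sketch of how a recorded file list could be replayed at boot to warm the page cache. It is not the actual readahead implementation: the profile format (plain text, one path per line, in access order) and the function names are assumptions made for this example.

```python
import os


def _warm(fd):
    """Ask the kernel to bring an open file into the page cache."""
    if hasattr(os, "posix_fadvise"):
        try:
            # Offset 0 with length 0 means "the whole file".
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
            return
        except OSError:
            pass  # Fall through to a plain read.
    # Fallback for platforms without posix_fadvise: read and discard.
    while os.read(fd, 1 << 20):
        pass


def replay_profile(profile_path):
    """Warm every file listed in the profile, in the recorded order.

    The profile format (one path per line) is hypothetical.
    Returns the paths that could be opened and warmed.
    """
    warmed = []
    with open(profile_path, encoding="utf-8") as profile:
        for line in profile:
            path = line.strip()
            if not path:
                continue
            try:
                fd = os.open(path, os.O_RDONLY)
            except OSError:
                # Stale entry: the file moved or vanished since profiling.
                continue
            try:
                _warm(fd)
                warmed.append(path)
            finally:
                os.close(fd)
    return warmed
```

Stale entries are skipped silently here, which mirrors the weakness noted in the Rationale: a static profile drifts away from the installed system and nothing corrects it.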

preload:

  • Google SoC 2005
  • Daemon, wakes up every 20 seconds and bursts reads which are currently in progress
  • pure user-space solution
  • no profiling necessary
  • inefficient, since much of the data has likely been read already in the time between wake-ups

bootcache/filecache:

  • Google SoC 2006
  • introduces sysctl "open by inode number" to speed up file access
  • above point bypasses file permissions and thus is a security risk
  • no automatic profiling
  • shuffles data toward the front of the disk, which turned out to be non-optimal; the implementation was poor, too

prefetch:

  • Google SoC 2007 (AutomaticBootAndApplicationPrefetchingSpec)

  • Consists of a kernel patch for automatically profiling boot and application startup, and a userspace daemon "prefetch-process-trace" which acquires the kernel data and dynamically updates profiles.
  • This speeds up things after booting, too.
  • Purely dynamic profiling.
  • Provides a disk reordering tool to optimize layout for boot. This is transactional and can be safely interrupted at any time. When run with ionice, it is appropriate for a cron job.
  • Only works for ext3 at the moment, but this is our default file system.
    • ColinWatson: This process is not required according to Scott (it just provides a little extra speed over and above the rest of prefetch), and the upstream README file for this tool currently says "BEWARE!!! e2remapblocks is in experimental state and can destroy your filesystem. Backup if you value your data!" This part of the specification should be optional and only implemented if the upstream confidence level increases and we audit it very carefully.

  • Tests on a loaded Kubuntu test machine:
    • 1.6s faster startup for Firefox
    • 3 seconds faster startup for OO.o
    • Boot time drops from 52 to 46 seconds
  • Usually faster than with readahead, occasionally a bit slower.
  • Cannot optimize the live system startup time (no solution provides that so far).
    • ColinWatson: Since it seems that prefetch would provide almost all of this (all we really need is a roughly sorted list of files to feed to mksquashfs; the rest of the code was written some time ago), if possible I would like one of the outputs of this specification to be a method of generating a sorted list of files that can be fed to other tools. (If that turns out not to be possible with prefetch, that's fine, but it seems straightforward to try.)
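
As a rough illustration of the sorted file list requested in the comment above, the sketch below turns a trace of file accesses into a list ordered by first access, then formats it as "path priority" lines. Everything here is an assumption made for the example: the trace format (pairs of sequence number and path), the function names, and the expectation that mksquashfs accepts such lines through its -sort option with higher priorities placed first. Check the mksquashfs documentation for the version in use before relying on this.

```python
def first_access_order(trace):
    """Return unique paths ordered by the first time each was accessed.

    `trace` is an iterable of (sequence, path) pairs. This format is
    hypothetical; the real prefetch trace data may look different.
    """
    seen = set()
    ordered = []
    for _, path in sorted(trace, key=lambda entry: entry[0]):
        if path not in seen:
            seen.add(path)
            ordered.append(path)
    return ordered


def to_sort_lines(paths, top_priority=32767):
    """Format paths as "path priority" lines, earliest access first.

    Priorities count down from `top_priority` and never drop below 1,
    so every listed file keeps a priority above an assumed default of 0.
    Paths containing whitespace are not handled in this sketch.
    """
    lines = []
    for rank, path in enumerate(paths):
        priority = max(top_priority - rank, 1)
        lines.append(f"{path} {priority}")
    return lines
```

For example, a trace that touches /sbin/init first and /bin/sh twice yields /sbin/init followed by a single /bin/sh entry. The same ordered list could be handed to other tools, which is the point of keeping the ordering step separate from the formatting step.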

Use Cases

  • Steve is Ubuntu's release manager. He is happy that he can drop the "update readahead profile" from the release checklist and does not need to update CDs for it any more.
  • Joe installs Ubuntu 8.04 and installs a few applications. The next time he reboots, Ubuntu and OpenOffice start much faster.

Design

  • The kernel patch has been proposed upstream, but there has not been a response yet. It is relatively unintrusive, though: it is just a standalone kernel module which goes into linux-ubuntu-modules and the initramfs. Get it in early in Hardy for maximum test coverage.
  • Review existing packaging of the user-space tool and bring it into Hardy. (Code is on https://launchpad.net/prefetch, documentation is on http://code.google.com/p/prefetch/.)

  • Drop readahead and the release checklist items.

Implementation

Details about the implementation itself are on AutomaticBootAndApplicationPrefetchingSpec.

Test/Demo Plan

(TODO when beta is available)