Prefetch

Differences between revisions 2 and 3
Revision 2 as of 2007-10-31 16:41:07
Size: 3790
Editor: 12
Comment: bof notes
Revision 3 as of 2007-11-16 13:28:19
Size: 4422
Editor: 195
Comment: turn this into a proper spec
Line 6: Line 6:
 * '''Packages affected''':
 * '''Packages affected''': kernel, readahead, prefetch
Line 14: Line 14:
This section should include a paragraph describing the end-user impact of this change. It is meant to be included in the release notes of the first release in which it is implemented. (Not all of these will actually be included in the release notes, at the release manager's discretion; but writing them is a useful exercise.)

It is mandatory.
Booting the computer and starting applications now happens faster, due
to the new "prefetch" project. This profiles which files are read in
which order and automatically optimizes the order of the data on the
hard disk.
Line 20: Line 21:
This should cover the _why_: why is this change being proposed, what justifies it, where we see this justified.
Our current solution, readahead, has some problems which the new
prefetch solves:

 * Boot order needs to be maintained and updated manually. It is prone
 to be forgotten by the developers at release crunch time.
 * Requires updating the CDs, since it has a static profile.
 * Boot time gets worse over time when the disk layout changes, since
 the profile does not update itself.

== History and Features ==

The problem of optimizing hard disk reads has been tackled by the
following projects so far:

readahead:
 * introduced into Hoary in 2004, Gutsy still has the very same
 version
 * only solution that has ever been in Ubuntu
 * no automatic profiling
 * profiling steps: inotify the entire file system as very first step
 * in initramfs, boot and record which files are read (and their
 order)
 * profile is saved, put into the package, and uploaded. This needs
 to happen for at least the beta, RC, and final releases and we must
 not forget about it.
 * high RAM usage

preload:
 * Google SoC 2005
 * Daemon, wakes up every 20 seconds and bursts reads which are
 currently in progress
 * pure user-space solution
 * no profiling necessary
 * inefficient, since much of the data has likely already been read
 in the time between wake-ups

bootcache/filecache:
 * Google SoC 2006
 * introduces sysctl "open by inode number" to speed up file access
 * above point bypasses file permissions and thus is a security risk
 * no automatic profiling
 * shuffling to move things to the disk front, which turned out to be
 non-optimal, and the implementation was bad, too

prefetch:
 * Google SoC 2007 (AutomaticBootAndApplicationPrefetchingSpec)
 * Consists of a kernel patch for automatically profiling boot and application
 startup, and a userspace daemon "prefetch-process-trace" which
 acquires the kernel data and dynamically updates profiles.
 * This speeds up things after booting, too.
 * Purely dynamic profiling.
 * Provides a disk reordering tool to optimize layout for boot. This
 is transactional and can be safely interrupted at any time. When run
 with ionice, it is appropriate for a cron job.
 * Only works for ext3 at the moment, though. (But this is our default
 file system)
 * Tests on a loaded Kubuntu test machine:
   * 1.6s faster startup for Firefox
   * 3 seconds faster startup for OO.o
   * Boot time drops from 52 to 46 seconds
 * Usually faster than with readahead, only rarely a bit slower.
 * Cannot optimize the live system startup time (no solution provides
 that so far).
Line 24: Line 87:
== Assumptions ==
 * Steve is Ubuntu's release manager. He is happy that he can drop the
 "update readahead profile" from the release checklist and does not
 need to update CDs for it any more.

 * Joe installs Ubuntu 8.04 and installs a few applications. The
 next time he reboots, Ubuntu and OpenOffice start much faster.
Line 28: Line 96:
You can have subsections that better describe specific parts of the issue.
 * The kernel patch has been proposed upstream, but there has not been
 a response yet. It is relatively unintrusive, though: it is just a
 standalone kernel module which goes into linux-ubuntu-modules and the
 initramfs. Get it in early in Hardy for maximum test coverage.

 * Review existing packaging of the user-space tool and bring it into
 Hardy. (Code is on https://launchpad.net/prefetch, documentation is
 on http://code.google.com/p/prefetch/).

 * Drop readahead and the release checklist items.
Line 32: Line 109:
This section should describe a plan of action (the "how") to implement the changes discussed. Could include subsections like:

=== UI Changes ===

Should cover changes required to the UI, or specific UI that is required to implement this

=== Code Changes ===

Code changes should include an overview of what needs to change, and in some cases even the specific details.

=== Migration ===

Include:
 * data migration, if any
 * redirects from old URLs to new ones, if any
 * how users will be pointed to the new way of doing things, if necessary.
Details about the implementation itself are on
AutomaticBootAndApplicationPrefetchingSpec.
Line 51: Line 114:
It's important that we are able to test new features, and demonstrate them to users. Use this section to describe a short plan that anybody can follow that demonstrates the feature is working. This can then be used during CD testing, and to show off after release.

This need not be added or completed until the specification is nearing beta.

== Outstanding Issues ==

This should highlight any issues that should be addressed in further specifications, and not problems with the specification itself; since any specification with problems cannot be approved.

== BoF agenda and discussion ==

current situation since 2004
readahead: inotify file system, boot, record which files are read
no automatic profile, high RAM usage

2005 SoC: preload: daemon, wakes up every 20 seconds, and bursts reads; purely user-space; inefficient, since likely much of the data has been sucked in already

2006: bootcache/filecache: add sysctl "open by inode number"; no automatic profile itself, evil shuffle solution to move things to disk front, bypassing of file permissions

2007 SoC: AutomaticBootAndApplicationPrefetchingSpec
 * kernel patch for automatically profiling apps at startup, light profile
 * disk reordering tool to optimize layout for boot
 * prefetch-process-trace daemon
 * loaded kubuntu test machine: 1.6s faster for FFox, 14->11 seconds for OO.o, 52->46 seconds boot time
 * usually better than with readahead, only rarely a bit slower

drop readahead, apply kernel patch, include prefetch

 * reordering is transactional, can be interrupted at any time; only works for ext3; cron'able if run with ionice

 * does not change the live system situation (no support with either solution)

 * needs lots of testing and profiling

 * kernel patch has been proposed upstream, no response yet; relatively unintrusive (standalone kernel module for l-u-m initially); get it in early for maximum test coverage

 * user-space tool is already packaged; drop readahead

----
CategorySpec
(TODO when beta is available)

Please check the status of this specification in Launchpad before editing it. If it is Approved, contact the Assignee or another knowledgeable person before making changes.

  • Launchpad Entry: prefetch

  • Packages affected: kernel, readahead, prefetch

Summary

One of our 2007 Summer of Code projects was the development of kernel patches and other necessary parts to prefetch files and speed up the boot sequence. Evaluate and plan to integrate.

Release Note

Booting the computer and starting applications now happens faster, due to the new "prefetch" project. This profiles which files are read in which order and automatically optimizes the order of the data on the hard disk.

Rationale

Our current solution, readahead, has some problems which the new prefetch solves:

  • Boot order needs to be maintained and updated manually. It is prone to be forgotten by the developers at release crunch time.
  • Requires updating the CDs, since it has a static profile.
  • Boot time gets worse over time when the disk layout changes, since the profile does not update itself.

History and Features

The problem of optimizing hard disk reads has been tackled by the following projects so far:

readahead:

  • introduced into Hoary in 2004, Gutsy still has the very same version
  • only solution that has ever been in Ubuntu
  • no automatic profiling
  • profiling steps: inotify the entire file system as very first step
  • in initramfs, boot and record which files are read (and their order)
  • profile is saved, put into the package, and uploaded. This needs to happen for at least the beta, RC, and final releases and we must not forget about it.
  • high RAM usage
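
The static-profile idea behind readahead can be illustrated with a minimal sketch. It assumes a hypothetical profile format of one file path per line, in the order the files were read during the profiled boot; the function name `prefetch_from_profile` is invented for illustration and is not part of readahead or prefetch. The sketch asks the kernel to pull each listed file into the page cache via `posix_fadvise`, and skips files that no longer exist, which is the kind of staleness a static profile is prone to.

```python
import os


def prefetch_from_profile(profile_path):
    """Hint the kernel to cache every file listed in a profile, in order.

    The profile format (one path per line) is an assumption made for this
    sketch. Returns the paths that were successfully hinted.
    """
    hinted = []
    with open(profile_path, encoding="utf-8") as profile:
        for line in profile:
            path = line.strip()
            if not path:
                continue  # ignore blank lines
            try:
                fd = os.open(path, os.O_RDONLY)
            except OSError:
                # The file may have been removed since the profile was recorded.
                continue
            try:
                if hasattr(os, "posix_fadvise"):
                    # Offset 0 with length 0 covers the whole file.
                    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
                else:
                    # Fallback for platforms without posix_fadvise: read it.
                    while os.read(fd, 1 << 20):
                        pass
                hinted.append(path)
            except OSError:
                pass
            finally:
                os.close(fd)
    return hinted
```

Because the list is fixed at recording time, the result only stays useful as long as the installed files and their on-disk layout match the profiled system.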

preload:

  • Google SoC 2005
  • Daemon, wakes up every 20 seconds and bursts reads which are currently in progress
  • pure user-space solution
  • no profiling necessary
  • inefficient, since much of the data has likely already been read in the time between wake-ups

bootcache/filecache:

  • Google SoC 2006
  • introduces sysctl "open by inode number" to speed up file access
  • above point bypasses file permissions and thus is a security risk
  • no automatic profiling
  • shuffling to move things to the disk front, which turned out to be non-optimal, and the implementation was bad, too

prefetch:

  • Google SoC 2007 (AutomaticBootAndApplicationPrefetchingSpec)

  • Consists of a kernel patch for automatically profiling boot and application startup, and a userspace daemon "prefetch-process-trace" which acquires the kernel data and dynamically updates profiles.
  • This speeds up things after booting, too.
  • Purely dynamic profiling.
  • Provides a disk reordering tool to optimize layout for boot. This is transactional and can be safely interrupted at any time. When run with ionice, it is appropriate for a cron job.
  • Only works for ext3 at the moment, though. (But this is our default file system)
  • Tests on a loaded Kubuntu test machine:
    • 1.6s faster startup for Firefox
    • 3 seconds faster startup for OO.o
    • Boot time drops from 52 to 46 seconds
  • Usually faster than with readahead, only rarely a bit slower.
  • Cannot optimize the live system startup time (no solution provides that so far).
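
The note about running the reordering tool under ionice from a cron job could look roughly like the following sketch. The tool name `prefetch-reorder` is a placeholder, since this page does not name the executable, and choosing ionice's idle class (`-c 3`) is an assumption about which priority is appropriate. The helpers only build the command and a user-crontab line; they do not run anything.

```python
import shlex


def idle_priority_command(tool_argv):
    """Prefix a command with ionice's idle I/O scheduling class (-c 3)."""
    return ["ionice", "-c", "3", *tool_argv]


def crontab_line(schedule, tool_argv):
    """Format a user-crontab entry that runs the command at idle I/O priority."""
    return f"{schedule} {shlex.join(idle_priority_command(tool_argv))}"


# "prefetch-reorder" is a hypothetical name for the disk reordering tool.
print(crontab_line("0 3 * * *", ["prefetch-reorder"]))
```

In the idle class the job only gets disk time when no other process needs it, which fits a tool that is transactional and can be interrupted safely.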

Use Cases

  • Steve is Ubuntu's release manager. He is happy that he can drop the "update readahead profile" from the release checklist and does not need to update CDs for it any more.
  • Joe installs Ubuntu 8.04 and installs a few applications. The next time he reboots, Ubuntu and OpenOffice start much faster.

Design

  • The kernel patch has been proposed upstream, but there has not been a response yet. It is relatively unintrusive, though: it is just a standalone kernel module which goes into linux-ubuntu-modules and the initramfs. Get it in early in Hardy for maximum test coverage.
  • Review existing packaging of the user-space tool and bring it into Hardy. (Code is on https://launchpad.net/prefetch, documentation is on http://code.google.com/p/prefetch/).

  • Drop readahead and the release checklist items.

Implementation

Details about the implementation itself are on AutomaticBootAndApplicationPrefetchingSpec.

Test/Demo Plan

(TODO when beta is available)

DesktopTeam/Specs/Prefetch (last edited 2008-08-06 17:01:43 by localhost)