SystemCleanUpTool

Differences between revisions 26 and 77 (spanning 51 versions)
Revision 26 as of 2006-06-22 20:54:11
Size: 13128
Editor: ALagny-109-1-2-101
Comment: more fixes
Revision 77 as of 2008-08-06 16:16:56
Size: 15731
Editor: localhost
Comment: converted to 1.6 markup
Line 5: Line 5:
 * '''Created''': [[Date(2006-06-08T17:14:13Z)]] by SivanGreen  * '''Created''': <<Date(2006-06-08T17:14:13Z)>> by SivanGreen
Line 11: Line 11:
This specification discusses implemeting a tool , that would suggest to a user several ways to keep his system running undisurbed with a minimum amount of intervention on his side. On the list of ways This specification discusses implementing a computer house keeping tool. This tool will offer a user several ways to keep his system from getting too full, cluttered and confusing to use over time. This tool will attempt to require as little intervention as possible by the user. This should result in a running system always kept tidy, easy and enjoyable to use.
Line 15: Line 15:
In due time, a a newly installed system can become cluttered with all sorts of residual content. This can be left over dotfiles that are no longer used cluttering home directories, too many installed kernels, package dependencies that are no longer needed (since their dependant is no longer installed) or log files taking up precious disk space. This sort of content tends to accumulate over time, confusing the user, or even eventually lead to the computer becoming unusable, forcing users to actively put effort into cleaning it up.
There should be a solution to warn users before hand, and suggest and carry purge operations for common left over content.
In due course, a once fresh system can become cluttered with all sorts of residual content, such as too many installed kernels, log files taking up precious system disk space, contents of the Trash folder, packaging and browser cache, and various large and small user files such as audio visual content, aging documents, old chat logs and more. These tend to accumulate over time, confusing the user, or even eventually leading to the computer becoming unusable, forcing users to actively put effort into cleaning it up. There should be a solution to warn users beforehand, and suggest and perform purge operations for common leftover cruft.
Line 20: Line 19:
 * Brian is a Launchpad developer. For developing Launchpad he has several Zope instances installed, and the PostgreSQL data base server running. Those applications produce alot of log file data, especially under heavy development and experimentation, where they are usually set to maximum verbosity for debugging purposes. After a week of heavy work, his free space on his / fs reaches the low minimum. The system clean up tool detects that, and before Brian is running into operation problems, it offers him to free some old logs, and some files under /tmp that occupie most of the space currently on his / fs. He confirms the removal, and resumes working (the wizard takes care to do the removal in the background, first hunting for the biggest files, in order to maintain operational state of the system in the shortest possilbe time). The interaction with Brian is done through the desktop notification infrastructure.  * George has been receiving lots of audio visual content recently, from relatives overseas. He has been burning these to DVDs, and put each file in the garbage bin after writing it to DVD. After a while, the free space in his home filesystem has dropped to the minimum allowed, but he does not notice. He is also downloading a big ISO image of the edgy desktop CD for testing. The system clean up tool detects that there is not enough free space, and pops up a desktop notification bubble, suggesting that there is large amount of data in .Trash that can be purged to make more room. George acknowledges; space is freed and the download is saved.
Line 22: Line 21:
 * George has been reciving lots of audio visual content recently, from relatives overseas. He burnt them all to DVDs, and moved to the garbage bin every file he had burnt. After a while of doing so, he does not notice his free space in his home folder dropped to the minimum allowed. He is also downloading a big iso file of the edgy desktop-cd for testing. The system clean up tool detects that there is not enough free space, and pops up a desktop notification for George, suggesting that there is large amount of data in .Trash, that can be purged, to make more room. George acknowledges, space is freed and the download is saved.  * '''High Priority''': John is a Dapper user. A recent kernel upgrade has been released to cater for a security bug. After finishing to install the new kernel, and before rebooting to use it, the system informs him that he has some left over kernel packages that could be removed and asks him if he wants to do so. If he acknowledges, all of the kernels that are no longer needed are removed, leaving him with a clean /boot with only the currently running kernel and the newly installed one. After he reboots to use the new kernel, he could always revert back to the previous one that was kept, if for some reason the new kernel is faulty.
Line 24: Line 23:
 * David had an issue after burning a DVD where the drive would not unlock. In his wisdom, he used a paper clip to force the tray to open and retrieved the disk. Little did David know that the kernel.log and messages.log were filling up with information about the bad IDE communication thingo until the system clean up tool detects that the hard drive is filling up and provides a notification that the data can be purged to make more room. David acknowledges, space is freed and the system operates normally.

 * Dan has been using ubuntu for some time now, and has been experimenting with the wealth of software packages available from the various ubuntu repositories. He is not aware that many of the package pulled in as dependencies are not removed when he chooses to remove the package he originally installed. When disk space becomes low, the tool checks to see if there are any packages that can be good candidates for removal (orphaned, missing dependencies etc) and asks if he would like to have them removed. He acknowledges, and those package get removed, bringing back the lost disk space.

 * '''High Priority''':John is a dapper user. A recent kernel upgrade has been released to cater for a security bug. After rebooting to his new kernel, the system informs him that he has some left over kernel packages that could be removed and asks him if he wants to do so. If he acknowledges, all of the kernels that he no-longer needs are removed, leaving him with a clean /boot with only the currently running kernel and the newly installed one.
 * Brian is a Launchpad developer. He has several Zope instances installed for developing Launchpad, and runs the PostgreSQL database server. These applications produce a lot of log file data, especially during heavy development and experimentation, when they are usually set to maximum verbosity for debugging purposes. After a week of heavy work, the free space on his root filesystem reaches the low minimum. The system clean up tool detects the problem, and before Brian runs into operational problems, it offers to delete some old logs and some files in /tmp that occupy most of the currently used space. He confirms the removal, and resumes his work. The wizard performs the removal in the background, hunting for the biggest files first, in order to keep the system operational and free space in the shortest possible time. The interaction with Brian is done through the desktop notification infrastructure.
Line 37: Line 31:
     -- How can we know if it's such a pakcage ??      -- How can we know if it's such a package?
Line 39: Line 33:
   1. Conffiles,    1. Conffiles.
Line 41: Line 35:
   1. Dotfiles that were introduced using a package and their parent package is no longer on the system.
 * General letf over content:
   1. Browser caches,
 * Packaging system leftovers:
   1. Contents of /var/cache/apt/archives.
   1. Orphaned files.
 * General left over content:
   1. Web / File Browser caches. (e.g. ~/.thumbnails)
Line 46: Line 42:
   1. Orphan dotfiles.
Line 48: Line 43:
   1. Content of /tmp
   1. Content of /var/log
Line 51: Line 48:
'''Definition:''' The '''aging''' time of a file is defined as the result of subtracting the last access time of the particular file, from the time stamp used as a reference point. The reference point time stamp can be the current date and time of the system, or an eariler time in order to enable accurate aging calculation when a system hasn't been used for a long time. '''Aging''': The time period that had passed between the last time a file has been accessed, and the recorded time reference point. The reference point can be the current date and time reported by the system, or an eariler time in order to enable more accurate aging calculation when a system hasn't been used over a long period of time.
Line 53: Line 50:
1.'''Dealing with general Left over content:''' 1.'''Dealing with general left over content:'''
Line 55: Line 52:
A weighing algorithm needs to be developed to enable the tool to identify targets of opportunity. The following factors needs to be taken in consideration in producing the weight result per file:
 1. A relative time stamp for measuring the aging time of all files must be used. Using a
A weighing algorithm needs to be developed to enable the tool to identify targets of opportunity. The following factors need to be taken into consideration in producing the weight result per file:
 1. A relative time reference point should be used for measuring the aging time of all files. This overcomes the "vacation problem": if a user has not used his system for a long period, then using the current time when he first logs in after his vacation would distort the weighing to include files that he accessed just before he left. This means that we need to measure actual usage time. To do so, we will record the last access time of files that are accessed at every login (for example, gdm files) and use the time stamp of the previous login (this_login-1) as our new reference point.
Line 58: Line 55:
 1. Which mime / file type.  1. MIME / file type.
Line 62: Line 59:
While calculating its opportunities, this alogrithm should make sure its not being mislead by raw agin
 1. The vacation problem. If a user goes on vacation and does not access his computer for a relatively long period of time, then we have to make sure that the wizard doesnt suggest files to be removed that the user "just accessed" before going on vacation. To deal with that, the time measured since the last access operation of a file
 * In order not to affect system performance too obtrusively, consideration should be given to attaching the aging measurement code to the periodic updatedb process. That process already affects system performance a great deal when it runs, but is still supported in Ubuntu. Either as a stand-alone approach or combined with the previous one, we should hold the calculation and scanning process until the system becomes idle, and build it so that it does its processing in incremental chunks, e.g. progressing a bit more each time the system is idle until all files and folders in the file system designated for clean up are covered. We should also make sure to use the fastest system call to obtain the file data we need for aging and opportunity measurement. If that can only be done in C, then we would rather code it in C and have Python bindings to access it.
Line 66: Line 62:
 * There should be also an option to configure the tool to never scan for specific directories, to be configured by more power users that would like to have some folders untouched.

 * The tools should ignore specific ~/.?* directories to not break applications in ubuntu-desktop and other special cases.

2.'''Package left over house keeping:''' (orphand files, unneeded dependency packages)

 1. Offer to remove orphaned files that no longer belong to any of the packages installed on the system.
 1. Offer to remove packages that were installed due to satisfying dependencies of other packages that are not longer installed.
 1. Offer to remove packages that are rarely or not used anymore.
2.'''Package left over house keeping:'''
 1. Offer to remove orphaned files that no longer belong to any of the packages installed on the system. Certain system configuration files that are created during installation and must not be removed should also be detected automatically and added to the blacklist. We will achieve this by gathering a list of those files and feeding it to the shipped blacklist.
 1. Offer to remove packages that are rarely or never used anymore, and consume a substantial amount of disk space.
Line 78: Line 68:
 1. When a new kenrnel is being installed by the GUI packaging tools, the installing packaging tool will call the system clean up tool with a command line that will instruct it to deal only with kernels clean up on that specific invocation:
 1. Then, the system clean up tool will mark the packages of:
    1. The currently running kernel.
    1. The new kernel that was just installed.
 1. Checking if the user has any other kernels installed other then those in the afformentioned list.
 1. When a new kernel is being installed by the high level packaging tools (apt, synaptic), the installing packaging tool will call the system clean up tool with a command line that instructs it to deal only with kernel clean up on that specific invocation. We should try to make the callback intelligent enough to detect whether it can use an X GUI or a text UI, to cater for people using this tool on systems that do not have X/GNOME installed.
 1. In order to not make the clean up tool mandatory on one's system, any call made to start it should first check if the executable file to be called exists. Any calling tools should gracefully ignore its absence and continue as usual.
 1. If installed, the system clean up tool then starts and marks the packages of:
    1. The currently running kernel. This kernel is used as a reference point (it is already in use and running), as we will be marking for removal older kernels that were installed previously, excluding:
         1. Manually installed kernels (e.g. using dpkg -i ..),
         1. The reference point kernel package (i.e. the currently running kernel package version),
         1. The newly installed kernel package (i.e. the new kernel package just downloaded and installed).
    1. If the currently running kernel was in fact installed manually, this logic is still valid.
 1. Checking if the user has any kernels installed other than those detected for keeping in the previous items.
Line 85: Line 79:
 1. Popup a desktop notification to the user "You have un-used kernels installed on the system. Would you like me to clean them up?".
 1. If the user confirms, then present a dialog displaying the list of kernels that was constructed in steps 2-4 , in the a window that will contain a columnd table, each row representing a kernel package:
 1. Pop up a desktop notification to the user: "You have unused kernels installed on the system. Would you like to purge them?".
 1. If the user confirms, then present a dialog displaying the list of kernels that was constructed in steps 2-4 in a window that will contain a columned table, each row representing a kernel package:
Line 90: Line 84:
 1. All items in the list are by default unchecked (meaning , tool will remove all kenrels on the remove-list)
 1. The user can choose to keep any of the kernls on the list by checking the checkbox next to the name/version.
 1. Pressing "Commit" , will calculate the kenrels packages to be removed from the check list, remove the kernel packages, and notify the user of success, or error if any issues were encountered while removing.
 1. Left over kenrel removals should be also proposed when a user is running out of sufficient free space in his /boot fs or in his / while /boot is part of it. (this should be the last priority if /boot and / are on the same fs, we should first check for other bigger files that can be removed)
 
 1. All items in the list are by default unchecked (meaning that the tool will remove all kernels on the remove-list)
 1. The user can choose to keep any of the kernels on the list by checking the checkbox next to the name/version.
 1. Pressing "Commit" will calculate the kernel packages to be removed from the check list, remove them, and notify the user of success, or of failure if any issues were encountered during removal.
 1. Left over kernel removals should also be proposed when a user is running out of free space in his /boot fs, or in his / when /boot is part of it. (This should be the last priority if /boot and / are on the same fs; we should first check for other bigger files that can be removed.)
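
The selection logic in the kernel steps above can be sketched as plain set arithmetic. This is only an illustration under stated assumptions: the function names and the example package names are hypothetical, and how manually installed kernels are detected is left open, as it is in this specification.

```python
def kernels_to_offer(installed, running, newly_installed, manually_installed=()):
    """Return the kernel packages that may be offered for removal.

    The running kernel, the newly installed kernel and any manually
    installed kernels are always kept.
    """
    keep = {running, newly_installed} | set(manually_installed)
    # Sorted by name only so the dialog order is stable; this is not a
    # real package version comparison.
    return sorted(pkg for pkg in set(installed) if pkg not in keep)


def kernels_to_remove(offered, checked_to_keep=()):
    """Apply the dialog state: unchecked rows are removed, checked rows are kept."""
    keep = set(checked_to_keep)
    return [pkg for pkg in offered if pkg not in keep]


# Hypothetical package names, for illustration only.
installed = [
    "linux-image-2.6.15-23-386",
    "linux-image-2.6.15-25-386",
    "linux-image-2.6.15-26-386",
    "linux-image-2.6.15-27-386",
]
offered = kernels_to_offer(
    installed,
    running="linux-image-2.6.15-26-386",
    newly_installed="linux-image-2.6.15-27-386",
)
```

With the dialog defaults (nothing checked), everything in `offered` is removed; checking a row moves that kernel back to the keep side.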
Line 98: Line 91:
 1. When being executed as an unpriv'd user, the tool should touch running user's home folder only.
 1. Executed as a priv'd user, (sudo'd) tool should care about system wide cleanup (kernel, system folders, etc)
 1. When being executed as an unprivileged user, the tool should touch only the running user's home directory.
 1. When being executed as a privileged user (`sudo`ed), the tool should care about system wide cleanup (kernel, system folders, etc.).
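
A minimal sketch of how the tool might pick its scope from the privilege level, assuming the effective user ID is the deciding factor. The function name is hypothetical, and the system-wide directory list is only an example taken from the Scope section, not a settled design.

```python
import os


def cleanup_roots(euid=None, home=None):
    """Return the directories the tool may touch for the given user."""
    if euid is None:
        euid = os.geteuid()
    if home is None:
        home = os.path.expanduser("~")
    if euid == 0:
        # Privileged run: system wide clean up (example locations only).
        return ["/boot", "/tmp", "/var/log", "/var/cache/apt/archives"]
    # Unprivileged run: only the running user's home directory.
    return [home]
```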
Line 103: Line 96:
 1. The tools should either use gnome-volume-manager to catch for low disk space events being dispatched.
 1. If it cannot be customized to allow different criteria for this events to dispatch (by quota, user selected minimum free space) then the tool should probably have a deamon of it's own to do that. This would be suboptimal since it will mean another python interpreter running and another daemon consuming system resources.
 1. The tool should use gnome-volume-manager to catch low disk space events. If g-v-m doesn't allow the flexibility to dispatch notifications according to the user's quota or a custom minimum free space, we should implement our own daemon for monitoring the amount of free space on a given file system.
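
If a dedicated monitor turns out to be needed, the free space check itself could look roughly like the sketch below. It uses Python's `shutil.disk_usage`; the function name and both threshold parameters are assumptions made for illustration, and per-user quotas are not covered, since `disk_usage` reports on the file system as a whole.

```python
import shutil


def is_low_on_space(path, min_free_bytes=None, min_free_fraction=None):
    """Return True if the file system holding `path` is below either threshold."""
    usage = shutil.disk_usage(path)
    if min_free_bytes is not None and usage.free < min_free_bytes:
        return True
    if (
        min_free_fraction is not None
        and usage.total > 0
        and usage.free / usage.total < min_free_fraction
    ):
        return True
    return False
```

A daemon would call this periodically for each monitored mount point, with thresholds chosen per file system, and raise the desktop notification when it returns True.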
Line 106: Line 98:
6. '''User Interface:'''
Line 107: Line 100:
 1. After gathering all required information and building the opportunity list, the tool will present the targets of opportunity it found, together with the amount of free space that will be reclaimed when they are removed. We should take care to describe those in a way the user can understand, rather than just showing cluttered lists of files. So this dialog could list items like:
   * 31MB of historical log files, last accessed: NEVER.
   * 200MB of audio visual content, last accessed: 2 Years ago
   * 46MB of old un-used kernels
   * 700MB of old downloaded package files
   * etc..

 1. Each item shall be accompanied by a drop-down offering identical predefined actions. Those actions will be:
    * "Leave on system" - Will just ignore the item for this run of the tool. It will show up again the next time the tool is run.
    * "Always leave" - Will add the item to the whitelist, so it will never come up as a removal target again.
    * "Remove" - Will remove the files identified under this item, actually freeing space.

 1. For each top level item where an item list is applicable, we should also let the user select item by item whether to keep, remove, or never be bothered about an item again. We should probably consider using checkboxes spread horizontally for this list, as using the drop-down over a large number of files could be annoying UI-wise.
 
Line 110: Line 117:
 * PackageDependencyManagement should be used as much as possible to achieve design item #2.
 
 * The weight of a file for the opportunity calculation will be `weight = file size + aging factor`.
 * Kernel clean up will use the packaging interface to remove the old kernels; a bash or Python script will be used to record the currently running kernel and the newly installed one.
 * PyGTK and Glade will be used for the UI development.
 * The desktop notification framework will be used to deliver the first interaction with the user prior to launching the clean up application (we will have to replace or patch the current low disk space notification available from gnome-vfs).
 * Removal of conffiles, cron scripts and init scripts should be addressed using the "residual config removal" functionality available through python-apt.
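
The weight formula above leaves the units of the aging factor open. As a hedged sketch, the code below assumes the aging factor is the aging time in days multiplied by a tunable number of bytes per day, so that it can be added to the file size; the names `file_weight`, `rank_candidates` and the `BYTES_PER_DAY` constant are hypothetical.

```python
import os

SECONDS_PER_DAY = 86400
# Hypothetical scale: each day of aging counts like one extra MiB of size.
BYTES_PER_DAY = 1024 * 1024


def file_weight(size_bytes, last_access, reference_time, bytes_per_day=BYTES_PER_DAY):
    """weight = file size + aging factor, with the aging factor scaled to bytes."""
    aging_days = max(0.0, reference_time - last_access) / SECONDS_PER_DAY
    return size_bytes + aging_days * bytes_per_day


def rank_candidates(paths, reference_time):
    """Return paths ordered from heaviest (best removal candidate) to lightest."""
    weighted = []
    for path in paths:
        st = os.stat(path)
        weighted.append((file_weight(st.st_size, st.st_atime, reference_time), path))
    weighted.sort(reverse=True)
    return [path for _weight, path in weighted]
```

Passing the reference time in explicitly, rather than reading the clock inside the function, is what lets the caller substitute the previous login time to handle the vacation problem.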
Line 113: Line 123:
=== Code === == Kubuntu ==
 * What about KDE and Kubuntu?
    * I contacted the [[http://www.kde-apps.org/content/show.php?content=28631|KleanSweep]] author a while ago, and we are working together to deliver a unified back end, assisted by two KDE/GNOME front ends, so that KDE and GNOME users have a consistent GUI for this kind of task. --SivanGreen
Line 115: Line 127:
=== Data preservation and migration ===

== Outstanding issues ==

== BoF agenda and discussion ==

* It would be better if the computer can handle this problem by itself. Certainly there should be an alert warning you that a disk is nearly full. But sources of ever-expanding files are bugs that should be fixed at the source: for example, [http://bugzilla.gnome.org/show_bug.cgi?id=149572 Gnome bug 149572]. Another very cool way to help people free space on a disk would be a folder view for Nautilus that does the same thing as [http://xdiskusage.sourceforge.net/ `xdiskusage`] or [http://derlien.com/ Disk Inventory X]. -- MatthewPaulthomas

* May be useful to have an enterprise mode that assists users in cleaning out their home directory if they meet their quota. Would entail adding quota support. (may need this anyhow?) Also limiting the tool to specific directories may be useful. -- ScottDier

* When running as an unprivileged user, the tool should only allow the user to clean and tidy his own home folder. Possibly use two different front ends for privileged and unprivileged operation.

* Use gnome-volume-manager to catch notifications about low disk space. Research should be put into evaluating how customizable gnome-volume-manager is for the purpose (we may need to check quotas and percentages, and to have different minimum free space values for different file systems).

* Have a weighting to order files to display for possible deletion that uses space, last access time, and filetype detected by libmagic (i.e. large and old core files may be listed as highly probable for deletion).

* Only show ~/.?* directories as a summary unless user specifically asks. When a user wants to investigate these files further then warn user that removing such files may break applications or cause data loss and should only be used by experienced users.

* For ~/.?* directories only summarise them in the list if all contents are not used for long period of time.

* Ignore specific ~/.?* directories to not break applications in ubuntu-desktop and special cases.

* Write a research script , that will be executed on users machines , to find out:
   * How much time passed since last access time of the file.
   * Which mime / file type.
   * Size of file
   * Weigh those files, present the result to the user and ask for his opinion on how good the weighing is: are the right files on the list? Maybe some files can grow big without ever being touched, etc.
   
   * Have a feedback system integrated into the application, for example a checkbox to send anonymized data. This data might include a-f-wizard usage patterns and calculated weights.
   
   * Vacation problem -> if a user goes on vacation and gets back from it, then we have to make sure that the wizard doesn't suggest removing files that the user "just accessed" before going on vacation. So actual computer use is important for the weighting.
   
   * Human readable weighting file, in some simple scripting language. Users can adjust the weighting algorithm themselves and give feedback on things that work..... eventually templates.

* Allow a user to bring up nautilus in order to view or work out a file that has been on the "under-the-gun" list.
== Comments ==
 * [[JoeyStanford|Joey Stanford]] - It would be very nice if this also included, even at a rudimentary level, a home directory dot file cleanup wizard.
  * e.g. A user installs the Holotz Castle game. They decide they don't like it and remove it. The /home/user/.holotz-castle directory still exists and is not removed.
   * this may seem trivial but I, as example, copy my home directory over during upgrades. I've had the same directory since breezy (now on Edgy) and it's a royal mess of unused dot files.
  * A better way to do this might be to force all packages to include dot files in the postrm script as well as keeping "not installed (residual config)" entry in synaptic around while dot files exist.
  * An alternative way would be to incorporate the code inside [[http://linux.bydg.org/~yogin/|kleansweep]] (a KDE tool) into the system cleanup tool.
 * Pádraig Brady
  * An alternative to kleansweep is [[http://www.pixelbeat.org/fslint/|fslint]] (a pygtk tool).
 * PaulKishimoto
  * Removing old backup files with names like "filename.ext~" would also be helpful. I find these files in various places in my /home/ directory after the original files have been removed or moved.
   * Wouldn't that be more of a job for Nautilus? --JeremyVisser
  * Packages identified as orphaned (by deborphan or a simpler method) could be among those suggested for removal.
  * It would be nice if tools like hubackup and sbackup could require this tool be run first to clear cruft out of /home/ that would otherwise end up in backups.
 * Jean-Michel Frouin
  * An attempt to write this tool can be followed here: https://savannah.nongnu.org/projects/scleaner/ (currently beta 2 :D but I improve it daily).
 * How about letting the user insert a removable drive (USB or CD-R) to move files on to. --SamTygier
 * We are running some thin client environments where the users have quotas on their home directories. When their home directories get full and hit the hard limit, they can no longer log in to clean them up. When the size reaches the soft limit, they currently get no warning. I have built a script that is shown to users on login when their soft limit is reached. The message says that their home directory is full. A nice way to handle this would be the ability to attach a script to a disk full or quota reached event (soft limit and hard limit separately). --MichielEghuizen

Summary

This specification discusses implementing a computer house keeping tool. This tool will offer a user several ways to keep his system from getting too full, cluttered and confusing to use over time. This tool will attempt to require as little intervention as possible by the user. This should result in a running system always kept tidy, easy and enjoyable to use.

Rationale

In due course, a once fresh system can become cluttered with all sorts of residual content, such as too many installed kernels, log files taking up precious system disk space, contents of the Trash folder, packaging and browser cache, and various large and small user files such as audio visual content, aging documents, old chat logs and more. These tend to accumulate over time, confusing the user, or even eventually leading to the computer becoming unusable, forcing users to actively put effort into cleaning it up. There should be a solution to warn users beforehand, and suggest and perform purge operations for common leftover cruft.

Use cases

  • George has been receiving lots of audio visual content recently from relatives overseas. He has been burning these files to DVDs, putting each one in the garbage bin after writing it to DVD. After a while, the free space in his home filesystem has dropped to the minimum allowed, but he does not notice. He is also downloading a big ISO image of the edgy desktop CD for testing. The system clean up tool detects that there is not enough free space, and pops up a desktop notification bubble, suggesting that there is a large amount of data in .Trash that can be purged to make more room. George acknowledges; space is freed and the download is saved.
  • High Priority: John is a Dapper user. A recent kernel upgrade has been released to fix a security bug. After he finishes installing the new kernel, and before he reboots to use it, the system informs him that he has some left over kernel packages that could be removed and asks him if he wants to do so. If he acknowledges, all of the kernels that are no longer needed are removed, leaving him with a clean /boot with only the currently running kernel and the newly installed one. After he reboots to use the new kernel, he can always revert to the previous one that was kept, if for some reason the new kernel is faulty.

  • Brian is a Launchpad developer. He has several Zope instances installed for developing Launchpad, and runs the PostgreSQL database server. These applications produce a lot of log file data, especially during heavy development and experimentation, when they are usually set to maximum verbosity for debugging purposes. After a week of heavy work, the free space on his root filesystem reaches the low minimum. The system clean up tool detects the problem, and before Brian runs into operational problems, it offers to delete some old logs and some files in /tmp that occupy most of the currently used space. He confirms the removal, and resumes his work. The wizard performs the removal in the background, hunting for the biggest files first, in order to keep the system operational and free space in the shortest possible time. The interaction with Brian is done through the desktop notification infrastructure.

Scope

  • Kernel left overs:
    1. Due to security upgrades.
    2. General bug fixes and version upgrades.
    3. Make sure never to touch a kernel package created by the user.
      • -- How can we know if it's such a package?
  • Residual packaging related content:
    1. Conffiles.
    2. Init scripts.
  • Packaging system leftovers:
    1. Contents of /var/cache/apt/archives.
    2. Orphaned files.
  • General left over content:
    1. Web / File Browser caches. (e.g. ~/.thumbnails)
    2. Aged audiovisual content.
    3. Aged and/or large log files.
    4. Large ISO files.
    5. Content of /tmp
    6. Content of /var/log

Design

Aging: The time period that has passed between the last time a file was accessed and the recorded time reference point. The reference point can be the current date and time reported by the system, or an earlier time, in order to enable more accurate aging calculation when a system hasn't been used over a long period of time.
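
As a minimal sketch of this definition, the aging time can be computed from the file's last access time as reported by `os.stat`. The function name and the `reference_time` parameter are hypothetical; note also that on file systems mounted with `noatime` or `relatime` the recorded access time may lag behind real usage, which the tool would need to account for.

```python
import os
import time


def aging_seconds(path, reference_time=None):
    """Return the aging time of `path` in seconds.

    `reference_time` is a Unix time stamp; it defaults to the current
    system time, but an earlier reference point can be passed in.
    """
    if reference_time is None:
        reference_time = time.time()
    last_access = os.stat(path).st_atime
    # A file accessed after the reference point has no aging yet.
    return max(0.0, reference_time - last_access)
```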

1.Dealing with general left over content:

A weighing algorithm needs to be developed to enable the tool to identify targets of opportunity. The following factors need to be taken into consideration in producing the weight result per file:

  1. A relative time reference point should be used for measuring the aging time of all files. This overcomes the "vacation problem": if a user has not used his system for a long period, then using the current time when he first logs in after his vacation would distort the weighing to include files that he accessed just before he left. This means that we need to measure actual usage time. To do so, we will record the last access time of files that are accessed at every login (for example, gdm files) and use the time stamp of the previous login (this_login-1) as our new reference point.
  2. How much time passed since last access time of the file.
  3. MIME / file type.
  4. Size of file
  5. Capacity of the holding volume or the user set quota.
  6. To avoid affecting system performance too obtrusively, consideration should be given to attaching the aging measurement code to the periodic updatedb process, which already has a considerable impact on system performance when it runs but is still supported in Ubuntu. Either as a stand-alone approach or combined with the previous one, the calculation and scanning process should be held back until the system becomes idle, and built so that it does its processing in incremental chunks, e.g. progressing a bit further each time the system is idle until all files and folders in the file system designated for clean up are covered. We should also make sure to use the fastest system call available to retrieve the file data we need for aging and opportunity measurement. If that can only be done in C, then we would rather code it in C and provide Python bindings to access it.
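
The aging measurement against a movable reference point could look roughly like the sketch below. It is only an illustration: the function name `file_age_seconds` and the `reference_time` parameter are assumptions, not part of this specification, and it relies on the file system actually recording access times (volumes mounted with noatime would not).

```python
import os
import time


def file_age_seconds(path, reference_time=None):
    """Seconds between the file's last access and the reference point.

    reference_time is a Unix timestamp. Passing the time stamp of the
    previous login instead of "now" avoids the vacation problem.
    """
    if reference_time is None:
        reference_time = time.time()
    last_access = os.stat(path).st_atime
    # A file accessed after the reference point counts as not aged at all.
    return max(0.0, reference_time - last_access)
```

With the previous login as the reference point, a file that was opened the day before a three-week vacation ages by one day rather than by three weeks.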

2. Package left over house keeping:

  1. Offer to remove orphaned files that no longer belong to any of the packages installed on the system. Certain system configuration files created during installation that are not to be removed should also be automatically detected and added to the blacklist. We will achieve this by gathering a list of those files and feeding it to the shipped blacklist.
  2. Offer to remove packages that are rarely used or no longer used at all, and that consume a substantial amount of disk space.

3. Unused left over kernels:

  1. When a new kernel is being installed by the high level packaging tools (apt, synaptic), the installing packaging tool will call the system clean up tool with a command line that instructs it to deal only with kernel clean up on that specific invocation. We should try to make the callback intelligent and able to detect whether it can use an X GUI or a text UI, to cater for people using this tool on systems that do not have X/GNOME installed.
  2. In order to not make the clean up tool mandatory on one's system, any call made to start it should first check if the executable file to be called exists. Any calling tools should gracefully ignore its absence and continue as usual.
  3. If installed, the system clean up tool fires up and marks the packages of:
    1. The currently running kernel. This kernel is used as a reference point (it is already in use and running), as we will be marking for removal older kernels that were installed previously, excluding:
      1. Manually installed kernel packages (e.g. using dpkg -i ..),
      2. The reference point kernel package (i.e. the currently running kernel package version),
      3. The newly installed kernel package (i.e. the new kernel package just downloaded and installed).
    2. If the currently running kernel was in fact installed manually, this logic is still valid.
  4. Check whether the user has any kernels installed other than those marked for keeping in the previous items.
  5. If he does not, do nothing.
  6. If he does, gather a list of all those kernel packages.
  7. Pop up a desktop notification to the user: "You have unused kernels installed on the system. Would you like to purge them?".
  8. If the user confirms, present a dialog displaying the list of kernels constructed in the previous steps, in a window containing a table with the following columns, each row representing a kernel package:
    • Column 1: Kernel Version. (e.g. "2.6.15-25-686")
    • Column 2: Kernel Package Name. (e.g. "linux-image-2.6.15-25-686").
    • Column 3: A check box indicating whether this kernel package is to be removed or left installed. (checked->keep, unchecked->remove)

  9. All items in the list are by default unchecked (meaning that the tool will remove all kernels on the remove-list)
  10. The user can choose to keep any of the kernels on the list by checking the checkbox next to the name/version.
  11. Pressing "Commit" will work out from the check list which kernel packages are to be removed, remove them, and notify the user of success, or of failure if any issues were encountered during removal.
  12. Left over kernel removal should also be proposed when a user is running out of free space on his /boot file system, or on / when /boot is part of it. (This should be the last priority if /boot and / are on the same fs; we should first check for other bigger files that can be removed.)
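
The selection logic in the steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the executable name `system-cleanup-tool` and all function names are hypothetical, the package names follow the linux-image-<version> pattern from the example above, and the actual removal through the packaging interface is left out.

```python
import platform
import shutil

# Hypothetical executable name; the specification does not name the binary.
CLEANUP_TOOL = "system-cleanup-tool"


def cleanup_tool_path(which=shutil.which):
    """Return the tool's path, or None so that callers can skip it silently."""
    return which(CLEANUP_TOOL)


def running_kernel_package():
    """Package of the running kernel, assuming the linux-image-<release> scheme."""
    return "linux-image-" + platform.release()


def removal_candidates(installed, running, newly_installed=None,
                       manually_installed=()):
    """Kernel packages that may be offered for removal.

    The running kernel, the newly installed kernel and any manually
    installed kernels are always kept.
    """
    keep = set(manually_installed)
    keep.add(running)
    if newly_installed is not None:
        keep.add(newly_installed)
    # Alphabetical order only gives a stable display; it is not a version sort.
    return sorted(pkg for pkg in installed if pkg not in keep)


def packages_to_remove(rows):
    """rows: (package_name, keep_checked) pairs taken from the dialog.

    Unchecked rows are removed, matching the default described above.
    """
    return [pkg for pkg, keep_checked in rows if not keep_checked]
```

A packaging tool would first call `cleanup_tool_path()` and simply carry on if it returns None, which keeps the clean up tool optional.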

4. Modes of operation:

  1. When being executed as an unprivileged user, the tool should touch only the running user's home directory.
  2. When being executed as a privileged user (sudoed), the tool should care about system wide cleanup (kernel, system folders, etc.).
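
One possible way to derive the clean up scope from the privileges of the invoking user is sketched below. The function names and the returned dictionary layout are assumptions made for illustration; the system-wide directories are taken from the Scope section.

```python
import os


def cleanup_scope(euid=None, home=None):
    """Decide what the tool may touch, based on the effective user id."""
    if euid is None:
        euid = os.geteuid()
    if euid == 0:
        # Privileged run: system-wide clean up, including kernels.
        return {
            "mode": "system",
            "roots": ["/var/log", "/tmp", "/var/cache/apt/archives"],
            "kernels": True,
        }
    if home is None:
        home = os.path.expanduser("~")
    # Unprivileged run: only the running user's home directory.
    return {"mode": "user", "roots": [home], "kernels": False}


def in_scope(path, roots):
    """True if path lies inside one of the allowed root directories."""
    real = os.path.realpath(path)
    for root in roots:
        real_root = os.path.realpath(root)
        if os.path.commonpath([real, real_root]) == real_root:
            return True
    return False
```

Checking every removal target with `in_scope` before deleting it guards against symlinks that lead outside the user's home directory.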

5. Catching disk space events:

  1. The tool should use gnome-volume-manager to catch low disk space events. If g-v-m doesn't offer the flexibility to dispatch notifications according to the user's quota or a custom minimum free space setting, we should implement our own daemon for monitoring the amount of free space on a given file system.
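
If a custom monitoring daemon turns out to be necessary, its core check might look like this sketch. The function name `low_volumes` and the threshold mapping are assumptions; quota handling and the notification itself are not covered.

```python
import shutil


def low_volumes(min_free_bytes, usage_fn=shutil.disk_usage):
    """Return (mount_point, free_bytes) for every volume below its minimum.

    min_free_bytes maps a mount point to its custom minimum free space.
    usage_fn is injectable so that the check can be tested without a disk.
    """
    low = []
    for mount_point, threshold in min_free_bytes.items():
        free = usage_fn(mount_point).free
        if free < threshold:
            low.append((mount_point, free))
    return low
```

A daemon would call `low_volumes({"/": 500 * 1024 * 1024})` periodically (the 500MB figure is just an example) and raise a desktop notification for each entry returned.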

6. User Interface:

  1. After gathering all required information and building the opportunity list, the tool will present the targets of opportunity it found, together with the amount of free space that will be reclaimed when they are removed. We should take care to describe those in a way the user can understand(tm), rather than just showing cluttered lists of files. So this dialog could list items like:
    • 31MB of historical log files, last accessed: NEVER.
    • 200MB of audio visual content, last accessed: 2 Years ago
    • 46MB of old un-used kernels
    • 700MB of old downloaded package files
    • etc..
  2. Each item will have a drop-down next to it offering the same predefined actions:
    • "Leave on system" - Will just ignore the item for this run of the tool. It will show up again the next time the tool is run.
    • "Always leave" - Will add the item to the whitelist, so it will never come up as a removal target again.
    • "Remove" - Will remove the files identified under this item, actually freeing space.
  3. For each top level item where an item list is applicable, we should also let the user go through the list item by item and choose whether to keep an item, remove it, or never be bothered about it again. We should probably consider using checkboxes spread horizontally for this list, as using the drop-down over a large number of files could be annoying UI wise.
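
The summary lines and the three drop-down actions could be modelled along these lines. This is an illustrative sketch only: the `Action` enum, the helper names and the rounding of sizes are assumptions and do not describe any existing code.

```python
from enum import Enum


class Action(Enum):
    LEAVE = "Leave on system"
    ALWAYS_LEAVE = "Always leave"
    REMOVE = "Remove"


def human_size(num_bytes):
    """Format a byte count in the coarse style used in the dialog."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB"):
        if size < 1024:
            return f"{size:.0f}{unit}"
        size /= 1024
    return f"{size:.0f}GB"


def summary_line(description, file_sizes):
    """One understandable line per top level item instead of a file list."""
    return f"{human_size(sum(file_sizes))} of {description}"


def apply_choices(choices, whitelist):
    """choices maps an item name to the Action picked in its drop-down.

    Returns the items to remove now and the updated whitelist.
    LEAVE items are skipped for this run only.
    """
    to_remove = [name for name, action in choices.items()
                 if action is Action.REMOVE]
    always_leave = {name for name, action in choices.items()
                    if action is Action.ALWAYS_LEAVE}
    return to_remove, set(whitelist) | always_leave
```

Items left with "Leave on system" appear in neither result, so they show up again on the next run.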

Implementation

  • The weight of a file for the opportunity calculation will be weight = file size + aging factor.
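
As a rough illustration of this formula, the sketch below ranks candidates by weight. The specification does not say how size and aging are scaled against each other, so the units used here (one point per megabyte, one point per day of age) are assumptions, as are the names `Candidate`, `weight` and `rank`.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    path: str
    size_bytes: int
    age_seconds: float


def weight(candidate, bytes_per_point=1024 * 1024, seconds_per_point=86400):
    """weight = file size + aging factor, each scaled to comparable points."""
    size_part = candidate.size_bytes / bytes_per_point
    aging_part = candidate.age_seconds / seconds_per_point
    return size_part + aging_part


def rank(candidates):
    """Heaviest candidates first: these are the best targets of opportunity."""
    return sorted(candidates, key=weight, reverse=True)
```

With these scales, a 1MB file untouched for 30 days outranks a 10MB file accessed today; different scale factors would shift that balance.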

  • Kernel clean up will use the packaging interface to remove the old kernels. A bash or Python script will be used to record the currently running kernel and the newly installed one.
  • PyGTK and Glade will be used for the UI development.
  • The desktop notification framework will be used to deliver the first interaction with the user prior to launching the clean up application (we will have to replace or patch the current low disk space notification available from gnome-vfs).
  • Removal of conffiles, cron scripts and init scripts should be addressed using the "residual config removal" functionality available through python-apt.

Kubuntu

  • What about KDE and Kubuntu?
    • Some time ago I contacted the KleanSweep author, and we are working together to deliver a unified back end served by two front ends, one for KDE and one for GNOME, so that KDE and GNOME users have a consistent GUI for this kind of task. --SivanGreen

Comments

  • Joey Stanford - It would be very nice if this also included, even at a rudimentary level, a home directory dot file cleanup wizard.

    • e.g. A user installs the Holotz Castle game. They decide they don't like it and remove it. The /home/user/.holotz-castle directory still exists and is not removed.
      • This may seem trivial, but I, for example, copy my home directory over during upgrades. I've had the same directory since Breezy (now on Edgy) and it's a royal mess of unused dot files.
    • A better way to do this might be to force all packages to include dot files in the postrm script as well as keeping "not installed (residual config)" entry in synaptic around while dot files exist.
    • An alternative way would be to incorporate the code inside kleansweep (a KDE tool) into the system cleanup tool.

  • Pádraig Brady
    • An alternative to kleansweep is fslint (a pygtk tool).

  • PaulKishimoto

    • Removing old backup files with names like "filename.ext~" would also be helpful. I find these files in various places in my /home/ directory after the original files have been removed or moved.
      • Wouldn't that be more of a job for Nautilus? --JeremyVisser

    • Packages identified as orphaned (by deborphan or a simpler method) could be among those suggested for removal.
    • It would be nice if tools like hubackup and sbackup could require this tool be run first to clear cruft out of /home/ that would otherwise end up in backups.
  • Jean-Michel Frouin
  • How about letting the user insert a removable drive (USB or CD-R) to move files on to. --SamTygier

  • We are running some thin client environments where the users have quotas on their home directories. When a home directory hits the hard limit, the user can no longer log in to clean it up. When its size reaches the soft limit, the user currently gets no warning. I have built a script that is shown to users on login once their soft limit is reached; the message says that their home directory is full. A nice way to handle this would be the ability to attach a script to a disk full or quota reached event (with soft limit and hard limit handled separately). --MichielEghuizen


CategorySpec

SystemCleanUpTool (last edited 2008-08-06 16:16:56 by localhost)