CrashReporting
Launchpad entry: https://launchpad.net/distros/ubuntu/+spec/crash-reporting
Created: Date(2006-07-31T12:05:45Z) by MatthewPaulThomas
Packages affected: apport, apport-gtk
Summary
When a program crashes, an alert should appear that explains what just happened, makes it easy to report the problem to Ubuntu developers, and makes it easy to reopen the crashed program if appropriate. Because a few bugs cause most crashes, this system should eventually involve a database of crash reports, automatically aggregated by type so that developers can allocate their time to the top crashers.
Rationale
We want to improve Ubuntu's reliability. See also [:UbuntuDownUnder/BOFs/AutomatedCrashReporting:AutomatedCrashReporting], BugReportingTool.
Use cases
- Willy was creating a logo for his soccer club in Inkscape when it crashed. Being the family's Ubuntu expert, he feels a responsibility to help improve the system. He's reported one or two bugs in Malone before, though it wasn't a particularly enjoyable experience.
- Aunt Martha was adding to her genealogical records in Gramps when it crashed. She doesn't know anything about reporting bugs, and has no desire ever to report any.
- Millie was logging in to her bank account when Firefox crashed. She's used to clicking the "Send" button for Windows Error Reporting, but it would be a bad idea for her to report this problem, since anyone could find her banking password in the crash information.
- Thunderbird has just crashed on Billie's machine for the third time in three minutes. She angrily blats away the error reporting alert within 0.8 seconds of it opening.
Design
The crash reporting interface is an interruption; it is not at all related to the user's goal (designing a logo, recording genealogy, online banking, etc). To make things worse, the crash itself has probably just eaten some of the victim's work. They will likely be angry with computers in general, and Ubuntu in particular. Therefore the crash reporting interface must be very simple and apologetic.
Another design problem is that most people who have come from Windows XP or later, or Mac OS X 10.3 or later, will be used to crash reports that are confidential to Microsoft, Apple, or trusted ISVs. For Edgy, Ubuntu crash reports will not be like this: if someone reports a bug and attaches their crash report, anyone will be able to see it. Bug reports can be marked private after the fact, but the crash reporting interface itself must take some responsibility for discouraging leaking of sensitive information. (Mozilla.org can [http://www.mozilla.org/quality/qfa.html say that] "Sensitive data, such as passwords, Web sites visited, and e-mail addresses will not be collected". We can't guarantee the same, because our crash reporting system is package-agnostic and can't make assumptions about the data involved.)
Comparisons
[http://windowsdevcenter.com/pub/a/windows/2004/03/16/wer.html Windows Error Reporting under the covers]
[http://developer.apple.com/technotes/tn2004/tn2123.html Apple TN2123: CrashReporter]
[http://ramikayyali.com/archives/2005/07/26/krash Kool Krashing] (see also [http://amarok.kde.org/blog/archives/4-amaroK-1.2-It-Crashes-Somewhat-Less.html amaroK 1.2 - it crashes somewhat less])
[http://flickr.com/photos/jfpoole/143205824/ Adium Crash Reporter]
Edgy
For Edgy, it will not be possible to report a bug without using Launchpad's Web interface. So we should (a) apologize for the error, (b) make it easy to report a bug if that would help, and (c) make it easy to reopen the program if appropriate.
The crash reporter should determine whether the crashed program generated a useful backtrace, then wait until three seconds have passed since the crash, and determine whether the crashed program is now running (meaning that it restarted automatically, that multiple copies were running, or that the user restarted it quickly).
|| || There is a useful backtrace || There is not a useful backtrace ||
|| The program is not running || attachment:edgy.jpg || Same, but without the secondary text, and without the "Report a Bug…" button. ||
|| The program is running || attachment:edgy-no-reopen.jpg || No alert at all. ||
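The four cases above can be sketched as a small decision function. This is only an illustration, not apport code: the names `AlertPlan` and `plan_alert` are hypothetical, and it assumes (from the attachment name edgy-no-reopen.jpg) that the alert for a program that is already running offers "OK" rather than "Reopen".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertPlan:
    show_alert: bool    # whether any alert appears at all
    offer_report: bool  # secondary text plus the "Report a Bug…" button
    offer_reopen: bool  # "Reopen" button instead of a plain "OK" (assumption)


def plan_alert(useful_backtrace: bool, program_running: bool) -> AlertPlan:
    """Map the two checks made three seconds after the crash to an alert."""
    offer_report = useful_backtrace
    offer_reopen = not program_running
    # With nothing to report and nothing to reopen, no alert is shown.
    show_alert = offer_report or offer_reopen
    return AlertPlan(show_alert, offer_report, offer_reopen)
```

For example, `plan_alert(useful_backtrace=False, program_running=True)` yields a plan with `show_alert` set to `False`, matching the "No alert at all" cell.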
If a human-readable name can be found for the program (from a .desktop file), the primary text of the alert should be "Sorry, Name of Program closed unexpectedly." Otherwise, it should be "Sorry, the program “binary-name” closed unexpectedly."
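A minimal sketch of the primary text rule follows. The function names are hypothetical. It assumes the .desktop file content is already available as a string, and it reads only the unlocalized `Name` key of the `[Desktop Entry]` group, using Python's `configparser` as a convenient (not fully spec-compliant) parser.

```python
import configparser
from typing import Optional


def name_from_desktop_entry(text: str) -> Optional[str]:
    """Return the unlocalized Name from .desktop file content, if present."""
    # interpolation=None keeps field codes such as "%F" in Exec= harmless.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    return parser.get("Desktop Entry", "Name", fallback=None)


def primary_text(binary_name: str, display_name: Optional[str] = None) -> str:
    """Build the alert's primary text as described above."""
    if display_name:
        return f"Sorry, {display_name} closed unexpectedly."
    return f"Sorry, the program \u201c{binary_name}\u201d closed unexpectedly."
```

How the crashed binary is matched to its .desktop file is left open here.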
The keyboard equivalent for the "Reopen" or "OK" button should be Enter, not a letter.
Clicking "Report a Bug…" should open both a Web browser showing Ubuntu's Bugs page in Launchpad and a floating window near the top left corner of the screen, containing the bug information.
attachment:edgy-report.jpg
In the floating window, the icon should be draggable into the browser's filepicker to select that file, and the pathname should also be copyable text. The "What does the file contain?" expander should disclose a read-only text field containing the crash log as wrapped text.
Later Ubuntu versions
Eventually, crash reports should be stored in a separate database in LP (like [http://talkback-public.mozilla.org/search/start.jsp Mozilla Talkback] or [https://sodium.ubuntu.com/~jamesh/oops.cgi Launchpad Oops]). Crash victims should no longer fill out bug reports manually, because it's time-consuming, complicated, and usually not something they're interested in, and because [http://www.microsoft.com/whdc/maintain/WERHelp.mspx 80 percent of crashes come from 20 percent of the bugs].
This simplifies the initial crash alert a little ...
attachment:funky.jpg
... And it simplifies the resulting reporting interface a lot.
attachment:funky-report.jpg
The window should now be a normal window, not a floating window or a dialog. When the "Send" button is clicked, it should be made insensitive, and a progress animation should be shown in the bottom left corner of the window until the transmission succeeds or fails. If it fails, the progress animation should disappear, an error alert should be shown explaining the problem (for example, "The error report could not be sent because there is no Internet connection. Try again later."), and the "Send" button should be made available again. If the transmission succeeds, the window should disappear. If transmission fails, the report is kept for at most 7 days, so that users can still send it later by clicking on the bomb icon in the panel.
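The behaviour of the "Send" button can be modelled as a small state change, sketched below without any toolkit code. All names are hypothetical. The sketch assumes that a failed transmission surfaces as an `OSError`, that the transmission is a blocking call (a real interface would run it asynchronously), and that the 7-day limit is counted from the first failed attempt, which the text above does not specify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Optional

RETENTION = timedelta(days=7)  # unsent reports are kept for at most 7 days


@dataclass
class SendWindow:
    window_open: bool = True
    send_enabled: bool = True
    progress_visible: bool = False
    error_text: Optional[str] = None


def click_send(window: SendWindow, transmit: Callable[[], None]) -> SendWindow:
    """Apply the state changes described above around one transmission."""
    window.send_enabled = False      # "Send" becomes insensitive
    window.progress_visible = True   # progress animation appears
    try:
        transmit()
    except OSError as exc:
        window.progress_visible = False
        window.error_text = (
            f"The error report could not be sent because {exc}. Try again later."
        )
        window.send_enabled = True   # "Send" becomes available again
    else:
        window.progress_visible = False
        window.window_open = False   # success: the window disappears
    return window


def keep_unsent_report(first_failure: datetime, now: datetime) -> bool:
    """Whether an unsent report is still within the retention period."""
    return now - first_failure <= RETENTION
```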
Implementation
Web user interface mockup
Start page and query form. This also shows the most recent crashes, top crashers first:
attachment:crashdb-query.jpg
The search form results are displayed in a list:
attachment:crashdb-result.jpg
Crash report details:
attachment:crashdb-details.jpg
Code
The crash database needs to offer an XML-RPC or HTTP POST interface for anonymous crash report submission (it might be possible to reuse the Malone cloakroom for this). apport uses this interface to send the report to the crash database.
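If the HTTP POST variant is chosen, an anonymous submission could look roughly like the sketch below. The URL is a placeholder, the field names are illustrative, and no such endpoint exists yet. The function only builds the request; sending it would be a call to `urllib.request.urlopen`.

```python
import urllib.parse
import urllib.request
from typing import Mapping

# Placeholder address: the real crash database URL is not defined yet.
CRASHDB_URL = "https://crashdb.example.invalid/submit"


def build_submission(report: Mapping[str, str],
                     url: str = CRASHDB_URL) -> urllib.request.Request:
    """Build an anonymous form-encoded POST carrying the report fields."""
    body = urllib.parse.urlencode(report).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        # No credentials are attached: submission is anonymous.
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

A report containing a core dump would more likely need a multipart or compressed upload; form encoding is used here only to keep the sketch short.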
Gnome's crash database has a fairly good duplicate finder even without full symbols in the stack trace; we need to do the same to avoid human work for duplicate elimination.
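One simple way to approach duplicate finding is to reduce each stack trace to a signature made of its topmost resolved function names, then count reports per signature. The sketch below is not Gnome's algorithm, which is not described here. It is a naive stand-in that assumes gdb-style backtrace lines and skips unresolved `??` frames, and its function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches gdb-style frames such as "#0  0x0804 in raise () from ..." or
# "#3  main (argc=1) at app.c:10" and captures the function name.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\S+)\s*\(")


def crash_signature(stacktrace: str, depth: int = 3) -> str:
    """Join the first `depth` resolved function names of a backtrace."""
    functions = []
    for line in stacktrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group(1) != "??":  # skip frames without symbols
            functions.append(match.group(1))
    return "|".join(functions[:depth])


def top_crashers(stacktraces: Iterable[str]) -> List[Tuple[str, int]]:
    """Count reports per signature, most frequent first."""
    return Counter(crash_signature(s) for s in stacktraces).most_common()
```

Addresses are ignored on purpose, so two reports of the same bug from different runs get the same signature.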
In a later stage of implementation, the crash database should automatically invoke apport-retrace to obtain useful, symbolic stack traces for crash reports. However, this requires root access to a sandbox system where the package in which the crash occurred can be installed.
Access to the raw crash reports should be very limited, since they potentially contain sensitive information. Thus the web interface needs to ask for LP authentication, and limit access to a trusted crash report triage team (initially, ubuntu-core-dev). Bug reports (without a core dump, just with textual data) can be created from crash reports for wider triaging and solving.
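The split between raw crash reports and derived bug reports could be sketched as follows. The function names are hypothetical, and the sketch assumes a report is a mapping whose core dump is stored under a `CoreDump` key and whose binary fields are `bytes`. How team membership is obtained from LP is not covered.

```python
from typing import Any, Dict, Iterable, Mapping

TRIAGE_TEAMS = {"ubuntu-core-dev"}  # initial triage team named above
EXCLUDED_FIELDS = {"CoreDump"}      # assumed name of the core dump field


def may_view_raw_report(user_teams: Iterable[str]) -> bool:
    """Only members of a trusted triage team may see raw crash reports."""
    return bool(TRIAGE_TEAMS.intersection(user_teams))


def bug_report_fields(crash_report: Mapping[str, Any]) -> Dict[str, str]:
    """Keep only textual fields, dropping the core dump and binary data."""
    return {
        key: value
        for key, value in crash_report.items()
        if key not in EXCLUDED_FIELDS and isinstance(value, str)
    }
```

Textual fields can still carry private data, so the resulting bug report would still deserve a glance from the triager before it is made public.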
When the system works, we want to disable bug-buddy by default and use apport to intercept and report Gnome-related crashes, too.
Discussion with Sebastien revealed that email notifications about crashes are not requested. The preference is to check the crash database regularly for new issues, so it should provide good search options and good default filters.
Discussion
Telling [people] they will not receive a reply is awful, worse than awful even. One of the best things about open source is the fact that we have an open bug tracking system. -- CoreyBurger
It's unfortunate, but nowhere close to awful. This is about improving the quality of Ubuntu, not about giving support (compare [http://hendrix.mozilla.org/ hendrix.mozilla.org]). The openness of the bug tracking system is not relevant to this issue, especially as the user base becomes less geeky on average. -- mpt