CouchDB

Revision 2 as of 2009-09-02 18:25:39

Dev Week -- Hooking your app into your desktop CouchDB - aquarius -- Wed Sep 2nd, 2009

UTC

(02:03:55 PM) aquarius: Hi, all. Welcome to Stuart's House Of Desktop Couch Knowledge.
(02:04:15 PM) aquarius: I'm Stuart Langridge, and I hack on the desktopcouch project!
(02:04:18 PM) aquarius: Over the next hour I'm going to explain what desktopcouch is, how to use it, who else is using it, and some of the things you will find useful to know about the project.
(02:04:28 PM) aquarius: I'll talk for a section, and then stop for questions.
(02:04:35 PM) aquarius: Please feel free to ask questions in #ubuntu-classroom-chat, and I'll look at the end of each section to see which questions have been posted. Ask at any time; you don't have to wait until the end of a section.
(02:04:40 PM) aquarius: You should prefix your question in #ubuntu-classroom-chat with QUESTION: so that I notice it :-)
(02:04:57 PM) aquarius: So, firstly, what's desktopcouch?
(02:05:06 PM) aquarius: Well, it's giving every Ubuntu user a CouchDB on their desktop.
(02:05:17 PM) aquarius: CouchDB is an Apache project to provide a document-oriented database. If you're familiar with SQL databases, where you define a table and then a table has a number of rows and each row has the same columns in it...this is not like that.
(02:05:39 PM) aquarius: Instead, in CouchDB you store "documents", where each document is a set of key/value pairs. Think of this like a Python dictionary, or a JSON document.
(02:05:57 PM) aquarius: So you can store one document like this:
(02:06:00 PM) aquarius: { "name": "Stuart Langridge", "project": "Desktop Couch", "hair_colour": "red" }
(02:06:16 PM) aquarius: and another document which is completely different:
(02:06:19 PM) aquarius: { "name": "Stuart Langridge", "outgoings": [ { "shop": "In and Out Burger", "cost": "$12.99" } , { "shop": "Ferrari dealership", "cost": "$175000" } ] }
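The two documents above can be sketched as plain Python dictionaries to show that they need not share a schema. This is only an illustration using the standard json module; nothing here talks to CouchDB.

```python
import json

# Two documents with entirely different shapes, copied from the examples above.
profile = {
    "name": "Stuart Langridge",
    "project": "Desktop Couch",
    "hair_colour": "red",
}
spending = {
    "name": "Stuart Langridge",
    "outgoings": [
        {"shop": "In and Out Burger", "cost": "$12.99"},
        {"shop": "Ferrari dealership", "cost": "$175000"},
    ],
}

# Each document serializes to JSON on its own; there is no shared table definition.
for doc in (profile, spending):
    print(json.dumps(doc, sort_keys=True))

# Work out which keys the two documents have in common.
shared_keys = set(profile) & set(spending)
print(shared_keys)
```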
(02:06:46 PM) aquarius: The interface to CouchDB is pure HTTP. Just like the web. It's RESTful, for those of you who are familiar with web development.
(02:06:59 PM) aquarius: This means that every programming language already knows how to speak it, at least in basic terms.
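Because the interface is plain HTTP, a document can be written with nothing more than a language's standard HTTP library. The sketch below only builds the request and does not send it. The host, port, database name, and document ID are assumptions for illustration: 5984 is stock CouchDB's default port, and a desktop Couch may listen elsewhere and may require authentication.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust for your own CouchDB.
BASE_URL = "http://localhost:5984"
DATABASE = "testing"
DOC_ID = "example-doc"  # hypothetical document ID

document = {"name": "Stuart Langridge", "project": "Desktop Couch"}

# CouchDB stores a document when it receives a PUT to /<database>/<doc id>
# with a JSON body.
request = urllib.request.Request(
    url=f"{BASE_URL}/{DATABASE}/{DOC_ID}",
    data=json.dumps(document).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

print(request.get_method(), request.full_url)
# To actually send it to a running server:
#     urllib.request.urlopen(request)
```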
(02:07:10 PM) aquarius: CouchDB also comes with an in-built in-browser editor, so you can look at and browse around and edit all the data stored in it.
(02:07:18 PM) aquarius: So, the desktopcouch project is all about providing these databases for every user, so each user's applications can store their data all in one place.
(02:07:31 PM) aquarius: You can have as many databases in your desktop Couch as you or your applications want, and storage is unlimited.
(02:07:43 PM) aquarius: Desktop Couch is built to do "replication", synchronizing your data between different machines. So if you have, say, Firefox storing your bookmarks in your desktop Couch on your laptop, those bookmarks could be automatically synchronized to your Mini 9 netbook, or to your desktop computer.
(02:07:54 PM) aquarius: They can also be synchronized to Ubuntu One, or another running-in-the-cloud service, so you can see that data on the web, or synchronize between two machines that aren't on the same network.
(02:08:05 PM) aquarius: So you've got your bookmarks everywhere. Your own personal del.icio.us, but it's your data, not locked up solely on anyone else's servers.
(02:08:19 PM) aquarius: Imagine if your apps stored their preferences in desktop Couch. Santa Claus brings you a new laptop, you plug it in, pair it with your existing machine, and all your apps are set up. No work.
(02:08:35 PM) aquarius: But sharing data between machines is only half the win. The other half is sharing data between applications.
(02:08:45 PM) aquarius: I want all my stuff to collaborate. I don't want to have to "import" data from one program to another, if I switch from Thunderbird to Evolution to KMail to mutt.
(02:08:57 PM) aquarius: I want any application to know about my address book, to allow any application to easily add "send this to another person", so that I can work with people I know.
(02:09:05 PM) aquarius: I want to be able to store my songs in Banshee and rate them in Rhythmbox if I want -- when people say that the Ubuntu desktop is about choice, that shouldn't mean choosing between different incompatible data silos. I can choose one application and then choose another, you can choose a third, and we can all cooperate on the data.
(02:09:14 PM) aquarius: My choice should be how I use my applications, and how they work; I shouldn't have to choose between underlying data storage. With apps using desktopcouch I don't have to.
(02:09:24 PM) aquarius: All my data is stored in a unified place in a singular way -- and I can look at my data any time I want, no matter which application put it there! Collaboration is what the open source desktop is good at, because we're all working together. It should be easy to collaborate on data.
(02:09:27 PM) aquarius: That's a brief summary of what desktopcouch *is*: any questions so far before we get on to the meat: how do you actually Use This Thing?
(02:10:33 PM) aquarius: mandel_macaque (hey, mandel :)) -- that's what the desktopcouch mailing list is for, so people can get together and talk about what should be in a standard record
(02:10:55 PM) aquarius: there's no ivory tower which hands down standard formats from the top of the mountain :)
(02:11:26 PM) aquarius: mandel_macaque's question was: will there be a "group" that will try to define standard records?
(02:11:35 PM) aquarius: <mhall119|work> QUESTION: how does desktopcouch differ from/replace gconf?
(02:12:05 PM) aquarius: mhall119|work, desktopcouch is for storing all sorts of user data. It's not just about preferences, although you could store preferences in it
(02:12:36 PM) aquarius: <sandy|lu1k> QUESTION: What about performance? Why would Banshee/rhythmbox switch to a slower way to store metadata?
(02:13:18 PM) aquarius: sandy|lu1k, performance hasn't really been an issue in our testing, and couchdb provides some serious advantages over existing things like sqlite or text files, like replication and user browseability
(02:13:34 PM) aquarius: <mandel_macaque> QUESTION: Is desktopcouch creating the required infrastructure to allow user sync, or should applications take care of that?
(02:14:06 PM) aquarius: desktopcouch is providing infrastructure and UI to "pair" machines and handle all the replication; applications do not have to know or worry about data being replicated to your other computers
(02:14:19 PM) aquarius: <jopojop> QUESTION: can you store media like images, audio and video?
(02:14:41 PM) aquarius: jopojop, not really -- couchdb is designed for textual, key/value pair, dictionary data, not for binary data
(02:15:37 PM) aquarius: it's possible to store binary data in desktopcouch, but I'd suggest not importing your whole mp3 collection into it; store the metadata. The filesystem is good at handling binary data
(02:15:57 PM) aquarius: <sandy|lu1k> QUESTION: the real performance concern that media apps have is query speed for doing quick searches
(02:16:36 PM) aquarius: sandy|lu1k, that's something we'd really like to see more experimentation with. couchdb's views architecture makes it really, really quick for some uses,
(02:17:05 PM) aquarius: ok, let's talk about how to use it :)
(02:17:10 PM) aquarius: The easiest way to use desktopcouch is from Python, using the desktopcouch.records module.
(02:17:20 PM) aquarius: This is installed by default in Karmic.
(02:17:25 PM) aquarius: An individual "document" in desktop Couch is called a "record", because there are certain extra things that are in a record over and above what stock CouchDB requires, and desktopcouch.records takes care of this for you.
(02:17:37 PM) aquarius: First, a bit of example Python code! This is taken from the docs at /usr/share/doc/python-desktopcouch-records/api/records.txt.
(02:17:44 PM) aquarius: >>> from desktopcouch.records.server import CouchDatabase
(02:17:46 PM) aquarius: >>> from desktopcouch.records.record import Record
(02:17:47 PM) aquarius: >>> my_database = CouchDatabase("testing", create=True)
(02:17:48 PM) aquarius: # get the "testing" database. In your desktop Couch you can have many databases; each application can have its own with whatever name it wants. If it doesn't exist already, this creates it.
(02:17:59 PM) aquarius: >>> my_record = Record({ "name": "Stuart Langridge", "project": "Desktop Couch", "hair_colour": "red" }, record_type='http://example.com/testrecord')
(02:18:03 PM) aquarius: # Create a record, currently not stored anywhere. Records must have a "record type", a URL which is unique to this sort of record.
(02:18:26 PM) aquarius: >>> my_record["weight"] = "too high!"
(02:18:29 PM) aquarius: # A record works just like a Python dictionary, so you can add and remove keys from it.
(02:18:43 PM) aquarius: >>> my_record_id = my_database.put_record(my_record)
(02:18:44 PM) aquarius: # Actually save the record into the database. Records each have a unique ID; if you don't specify one, the records API will choose one for you, and return it.
(02:19:06 PM) aquarius: >>> fetched_record = my_database.get_record(my_record_id)
(02:19:07 PM) aquarius: # You can retrieve records by ID
(02:19:11 PM) aquarius: >>> print fetched_record["name"]
(02:19:11 PM) aquarius: Stuart Langridge
(02:19:15 PM) aquarius: # and the record you get back is a dictionary, just like when you're creating it.
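To trace the same put-then-get flow on a machine without desktopcouch installed, here is a minimal sketch using a hypothetical in-memory stand-in. The class InMemoryDatabase is invented for illustration and is not part of desktopcouch; it only mimics the two calls shown above (put_record and get_record) and does no replication or persistence.

```python
import uuid


class InMemoryDatabase:
    """Hypothetical stand-in for illustration; NOT part of desktopcouch."""

    def __init__(self):
        self._records = {}

    def put_record(self, record):
        # Choose a unique ID for the record, store a copy, and return the ID.
        record_id = uuid.uuid4().hex
        self._records[record_id] = dict(record)
        return record_id

    def get_record(self, record_id):
        # Return a copy so callers cannot mutate stored data by accident.
        return dict(self._records[record_id])


database = InMemoryDatabase()
record = {"name": "Stuart Langridge", "project": "Desktop Couch"}
record["weight"] = "too high!"  # records behave like dictionaries

record_id = database.put_record(record)
fetched = database.get_record(record_id)
print(fetched["name"])
```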
(02:19:59 PM) aquarius: That's some very basic code for working with desktop Couch; it's dead easy to save records into the database.
(02:20:00 PM) aquarius: You can work with it like any key/value pair database.
(02:20:06 PM) aquarius: And then desktopcouch itself takes care of things like replicating your data to your netbook and your desktop without you having to do anything at all.
(02:20:16 PM) aquarius: And the users of your application can see their data directly by using the web interface; no more grovelling around in dotfiles or sqlite3 databases from the command line to work out what an application has stored.
(02:20:30 PM) aquarius: You can get at the web interface by browsing to file:///home/aquarius/.local/share/desktop-couch/couchdb.html in a web browser, which will take you to the right place.
(02:20:42 PM) aquarius: (er, if your username is aquarius you can, anyway :))
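Since the path depends on the username, a small sketch can build the equivalent file:// URL for whoever is logged in. It assumes the page lives at the same location under the home directory as in the URL quoted above.

```python
from pathlib import Path

# Location relative to the home directory, as given in the session.
RELATIVE_PAGE = Path(".local/share/desktop-couch/couchdb.html")

page = Path.home() / RELATIVE_PAGE
url = page.as_uri()  # file:// URL for the current user
print(url)
# To open it in the default browser:
#     import webbrowser; webbrowser.open(url)
```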
(02:20:48 PM) aquarius: I'll stop there for some questions about this section!
(02:21:24 PM) aquarius: ah, people in the chat channel are trying it out. You might need to install python-desktopcouch-records
(02:21:59 PM) aquarius: the version in karmic right now has a couple of strange outstanding bugs which we're working on which might make it a little difficult to follow along
(02:22:02 PM) aquarius: <mandel_macaque> QUESTION: (about views) which is the policy for design documents (views), one per app?
(02:22:34 PM) aquarius: mandel_macaque, no policy, thus far. Create whichever design docs you want to -- having one per app sounds sensible, but an app might want more than one
(02:22:48 PM) aquarius: mandel_macaque, this is an ideal topic to bring up for discussion on the mailing list :)
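For readers new to views: in stock CouchDB a design document is an ordinary JSON document whose ID starts with `_design/` and whose `views` key maps view names to JavaScript map functions. Below is a minimal sketch of one such document, built as a Python dictionary. The design document name, the view name, and the `record_type` field checked by the map function are assumptions for illustration, not an agreed desktopcouch convention.

```python
import json

# JavaScript map function, stored as a string inside the design document.
MAP_BY_NAME = """
function(doc) {
  if (doc.record_type == "http://example.com/testrecord" && doc.name) {
    emit(doc.name, null);
  }
}
"""

design_doc = {
    "_id": "_design/myapp",  # hypothetical per-application design document
    "language": "javascript",
    "views": {
        "by_name": {"map": MAP_BY_NAME.strip()},
    },
}

print(json.dumps(design_doc, indent=2))
```

An application that wants more than one view can add further entries under `views`, or keep separate design documents.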
(02:23:14 PM) aquarius: <test1> QUESTION: Does desktopCouch/CouchDB provide a means to control access to my data on a per-application basis? I would not necessarily want any application to be able to access any data - I might want to silo two mail apps to different databases, etc.
(02:23:51 PM) aquarius: test1, at the moment it does not (in much the same way as the filesystem doesn't), but it would be possible to build that in
(02:24:09 PM) aquarius: <mhall119|work> QUESTION: how does the HTML interact with couchdb?  Javascript?
(02:24:33 PM) aquarius: mhall119|work, (I assume you mean: how does the HTML web interface for browsing your data interact with couchdb?) yes, JavaScript
(02:24:43 PM) aquarius: <AntoineLeclair> QUESTION: so when I do CRUD, it's done locally, then replicated on the web DB? (and replicated locally from the web some other time to keep sync?)
...