How to build web applications that can work offline with PouchDB?

In some of our projects, we need to build mobile applications that can be used offline. In a previous article, Valentin presented a fast way to develop native applications for both iOS and Android using a single JavaScript code base. This time, instead of writing native applications, we decided that current mobile browsers were fast enough to run a JavaScript application efficiently.

Thanks to that choice, the application is closer to what we are used to developing every day at Theodo, and we can use cool technologies such as NodeJS, AngularJS, Gulp… (some Theodoers have written articles on these subjects, like this Angular tutorial or this Gulp book).
But there are still some questions to answer. The main one concerns the data circulating in our application. Indeed, most of the features of our application can be used offline (the joy of working with a client-side language :-)), but that is worthless if all the work done while offline is lost or unavailable to other users.

Thus, we were looking for a way to store data when you are offline and to make it available when you are back online. PouchDB does exactly that. This JavaScript library works the same way as a CouchDB database and enables data replication between a server-side database and a client-side database.

But first, what is CouchDB?

CouchDB is an open source NoSQL database that stores its data as JSON. It is a document-oriented database that can be queried over HTTP.

In other words, if you have a CouchDB instance running locally on port 5984 and you want to see the document with the id ‘document_id’ in the database ‘test’, all you have to do is make a GET request to the URL:

    http://localhost:5984/test/document_id

Then, the response will look like this:

{
  "_id": "document_id",
  "_rev": "946B7D1C",
  "subject": "CouchDB presentation",
  "author": "Yann",
  "postedDate": "2014-07-04T17:30:12-04:00",
  "tags": ["couchdb", "relax", "nosql"],
  "body": "It is as simple as this to retrieve a document from a CouchDB database!"
}

As you may guess, creating, updating or deleting a document is just as easy: make a POST, PUT or DELETE request to the database.
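To make the mapping between operations and HTTP verbs concrete, here is a minimal sketch. The helper name `couchRequest` is made up for this illustration, and it only builds a description of each request without sending it. It assumes regular (non-design) document ids, and relies on the fact that CouchDB expects the current `_rev` when a document is updated or deleted.

```javascript
// Hypothetical helper: builds (but does not send) the HTTP request
// matching each CRUD operation on a regular CouchDB document.
function couchRequest(baseUrl, dbName, operation, doc) {
  const jsonHeaders = { "Content-Type": "application/json" };
  const dbUrl = `${baseUrl}/${encodeURIComponent(dbName)}`;
  const docUrl = doc && doc._id ? `${dbUrl}/${encodeURIComponent(doc._id)}` : dbUrl;

  switch (operation) {
    case "read":
      return { method: "GET", url: docUrl };
    case "create":
      // POSTing to the database lets CouchDB generate the document id.
      return { method: "POST", url: dbUrl, headers: jsonHeaders, body: JSON.stringify(doc) };
    case "update":
      // The body must carry the current _rev, otherwise CouchDB answers 409.
      return { method: "PUT", url: docUrl, headers: jsonHeaders, body: JSON.stringify(doc) };
    case "delete":
      // The revision to delete is passed in the query string.
      return { method: "DELETE", url: `${docUrl}?rev=${encodeURIComponent(doc._rev)}` };
    default:
      throw new Error(`Unknown operation: ${operation}`);
  }
}

const request = couchRequest("http://localhost:5984", "test", "read", { _id: "document_id" });
console.log(request.method, request.url);
// GET http://localhost:5984/test/document_id
```

The returned object can be handed to whatever HTTP client the application already uses.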

CouchDB comes with other features, like the possibility to define filters. For instance, if I have a CouchDB database containing a set of messages whose author can be Alice or Bob, and I define the following document:

    {
        "_id": "_design/app",
        "_rev": "1-b20db05077a51944afd11dcb3a6f18f1",
        "filters": {
            "name": "function(doc, req) { if (doc.name == req.query.author) { return true; } else { return false; } }"
        }
    }

Then, on this URL:

    http://localhost:5984/db/_changes?filter=app/name&author=Alice

I will see all the documents matching the filter ‘name’ with ‘Alice’ standing for the parameter ‘author’, another way to say that the response will contain all the messages written by Alice!
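To see what the filter does without a running server, the sketch below applies the same predicate to a few in-memory documents. The sample messages are invented for the illustration, and the `req` object only mimics the part of the request that the filter reads (`req.query`).

```javascript
// The same predicate as the "name" filter stored in the design document.
function nameFilter(doc, req) {
  return doc.name == req.query.author;
}

// Hypothetical sample data standing in for the contents of the database.
const messages = [
  { _id: "m1", name: "Alice", body: "Hello Bob" },
  { _id: "m2", name: "Bob", body: "Hi Alice" },
  { _id: "m3", name: "Alice", body: "How are you?" },
];

// CouchDB hands the query-string parameters to the filter as req.query.
const req = { query: { author: "Alice" } };
const matchingIds = messages
  .filter((doc) => nameFilter(doc, req))
  .map((doc) => doc._id);
console.log(matchingIds); // [ 'm1', 'm3' ]

// URLSearchParams takes care of the "?" and "&" separators (and of escaping).
const params = new URLSearchParams({ filter: "app/name", author: "Alice" });
const changesUrl = `http://localhost:5984/db/_changes?${params}`;
console.log(changesUrl);
```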

But we haven’t yet seen the main reason why CouchDB should be chosen over any other database system for our initial needs: CouchDB is made to replicate databases easily. At the end of a replication between two CouchDB databases, all active documents in the source database are also in the destination database, and all documents that were deleted in the source database are also deleted (if they existed) in the destination database.

You need not be afraid of overwriting important data during this process: each document comes with a revision id, and the history of a document’s revisions is stored and available (at least until the database is compacted). It is up to you to handle the conflicts that can be introduced by incompatible changes made by different users on a database.
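The role of the revision id can be illustrated with a toy in-memory store. This is only a simulation of the revision check applied to direct writes, not CouchDB’s implementation: the `TinyStore` class and the `N-sim` revision format are invented for the example. During replication, CouchDB does not reject the competing edit; it keeps both revisions and picks a winner, leaving the application free to resolve the conflict afterwards.

```javascript
// Toy in-memory store imitating CouchDB's revision check on writes.
// This is a simulation for illustration, not how CouchDB is implemented.
class TinyStore {
  constructor() {
    this.docs = new Map();
  }

  put(doc) {
    const current = this.docs.get(doc._id);
    // An update must name the revision it was based on.
    if (current && current._rev !== doc._rev) {
      const error = new Error("Document update conflict");
      error.status = 409;
      throw error;
    }
    const generation = current ? parseInt(current._rev, 10) + 1 : 1;
    const saved = { ...doc, _rev: `${generation}-sim` };
    this.docs.set(doc._id, saved);
    return saved;
  }
}

const store = new TinyStore();
const first = store.put({ _id: "note", text: "draft" });

// Alice and Bob both start from revision 1.
store.put({ ...first, text: "Alice's edit" }); // accepted, becomes revision 2

let conflictStatus = null;
try {
  store.put({ ...first, text: "Bob's edit" }); // still based on revision 1
} catch (error) {
  conflictStatus = error.status;
}
console.log(conflictStatus); // 409
```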

Now that we have seen how CouchDB can be used, let’s see how PouchDB can be used in our project, and how it interacts with CouchDB.

PouchDB, the JavaScript database that syncs

First, to install PouchDB you can use npm or bower, or simply download the sources if you don’t use either of these tools (I recommend using them).

Once ready, you will see that creating a new PouchDB database is as simple as:

var db = new PouchDB("dbname");

The CRUD operations are also intuitive to write, for instance the method used to fetch a document is

db.get(docId, [options], [callback]);
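As an illustration of how `get` combines with the other CRUD methods, here is a minimal sketch of a "create or update" helper. The name `upsert` is hypothetical (it is not a method of PouchDB itself), and the sketch assumes the promise-based form of the API, where `get` rejects with a `status` of 404 when the document does not exist.

```javascript
// Hypothetical helper: create the document, or update it if it already exists.
// `db` is expected to expose PouchDB-style promise-based get() and put().
async function upsert(db, docId, changes) {
  let existing = {};
  try {
    existing = await db.get(docId);
  } catch (error) {
    // A missing document is fine here; anything else is a real error.
    if (error.status !== 404) throw error;
  }
  // Spreading `existing` carries over _rev, which put() needs for updates.
  return db.put({ ...existing, ...changes, _id: docId });
}

// Usage, assuming PouchDB is loaded:
// upsert(new PouchDB("dbname"), "settings", { theme: "dark" });
```

Taking the database as a parameter keeps the helper easy to test with a stub object.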

For the other methods, you can take my word for it or check the PouchDB documentation. I will just highlight the methods used to replicate to or from a remote CouchDB database:

db.replicate.to(remoteDB, [options]);
// or
db.replicate.from(remoteDB, [options]);

Given all these tools, we built our application following this general architecture:

After being authenticated by the server, we create a new PouchDB database and we replicate this user’s data from the CouchDB database running on the server, thanks to a filter similar to the one presented earlier.

While a user is logged in, all their actions are stored in the PouchDB database. Whenever possible (i.e. when the user is online), a continuous synchronization process sends all the PouchDB data to the CouchDB database and vice versa.

Just before the user logs out, we launch one last replication from the PouchDB database to the CouchDB one, then we destroy the PouchDB database.
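The three steps above can be sketched as follows. This is one possible arrangement rather than the code of our application: the function names are made up, the filter name `app/name` and its `author` parameter are borrowed from the earlier example, and `local` and `remote` are assumed to be PouchDB instances (or objects exposing the same `replicate` and `destroy` methods).

```javascript
// Sketch of the login / sync / logout lifecycle described above.
// The filter name and its `author` parameter are assumptions.
async function startSession(local, remote, username) {
  const filterOptions = { filter: "app/name", query_params: { author: username } };

  // 1. After login: one-off replication of this user's data from the server.
  await local.replicate.from(remote, filterOptions);

  // 2. While logged in: continuous replication in both directions.
  //    `retry` lets PouchDB resume by itself when the connection comes back.
  const push = local.replicate.to(remote, { live: true, retry: true });
  const pull = local.replicate.from(remote, { live: true, retry: true, ...filterOptions });

  // 3. The returned function is meant to be called just before logout.
  return async function endSession() {
    push.cancel();
    pull.cancel();
    await local.replicate.to(remote); // push the last local changes
    await local.destroy(); // drop the local copy
  };
}

// Usage, assuming PouchDB is loaded and the URL points at your server:
// const endSession = await startSession(
//   new PouchDB("dbname"), new PouchDB("http://localhost:5984/db"), "Alice");
// ...later, on logout:
// await endSession();
```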

It works like a charm, but there are a few issues to be careful about. First, even though it is possible, it is not recommended to store your attachments in a PouchDB or CouchDB database. As explained in this article, it bloats your database and makes the replication at login last much longer.
Next, be restrictive about the data you replicate. The lighter it is, the faster it is to replicate or query. For example, apply your CouchDB filter only to the latest revisions of your documents by adding the ‘style=main_only’ option to your request. The idea is to avoid outdated documents that are no longer compatible with your code.

To conclude, thanks to PouchDB we managed to build an application that stores data locally while it is offline, and sends it to a central CouchDB database as soon as it is back online. Enjoy it, and if you need an extra feature, develop it and make a Pull Request to the GitHub project!

Bonus

How can I migrate a MySQL database to a CouchDB one?

If you are considering writing a new app using CouchDB for an existing business that already has an SQL database, keeping its data is a key point. One way to do it is to use the Node.js library cradle. It is a CouchDB client that supports every operation we are used to performing with CouchDB. Coupled with a MySQL client such as node-mysql, it lets you run a MySQL query and store everything you need in new CouchDB documents. Run this task periodically and your CouchDB database will be “in sync” with the MySQL database.
Be careful to save a CouchDB document only if it has been modified. CouchDB stores all the revisions of a document, so saving unmodified documents increases the database size for no good reason. The following script can be used as a skeleton (it is written in CoffeeScript):

fs      = require 'fs'
_       = require 'underscore'
yamljs  = require 'yamljs'
mysql   = require 'mysql'
cradle  = require 'cradle'

file    = fs.readFileSync(__dirname + '/../config.yml', 'utf8')
options = yamljs.parse(file)

connection = mysql.createConnection(
  user:     options.mysql.user
  password: options.mysql.password
  database: options.mysql.db
)

query = """
        SELECT * FROM ... WHERE ...
        """

db = new (cradle.Connection)().database(options.couchdb)

connection.query query, (err, rows, fields) =>
    throw err  if err

    _.each rows, (row) ->
        # Fix strings whose UTF-8 bytes were read as Latin-1
        _.each row, (field, key) ->
            try
                row[key] = decodeURIComponent(escape(field)) if _.isString(field)
            catch error
                console.log "#{error} | #{field}"

        newDocument =
            user:
                firstname:  row.firstname
                lastname:   row.lastname
                username:   row.username
            location:
                address:    row.address
                city:       row.city
                postalCode: row.postalCode
                country:    row.country

        db.save(newDocument)
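The skeleton above saves every row on each run. To follow the advice about saving only modified documents, the freshly built document can be compared with the one already stored. Below is a minimal sketch in JavaScript; the helper name `hasChanged` is hypothetical, and it assumes that each MySQL row maps to a stable CouchDB document id so that the stored version can be fetched first (something the skeleton does not show).

```javascript
// Hypothetical helper: true when `candidate` differs from the stored document,
// ignoring underscore-prefixed fields such as _id and _rev.
function hasChanged(stored, candidate) {
  // Rebuild objects with sorted keys so key order causes no false positives.
  const normalise = (value) => {
    if (Array.isArray(value)) return value.map(normalise);
    if (value && typeof value === "object") {
      return Object.keys(value)
        .filter((key) => !key.startsWith("_"))
        .sort()
        .reduce((acc, key) => ({ ...acc, [key]: normalise(value[key]) }), {});
    }
    return value;
  };
  return JSON.stringify(normalise(stored)) !== JSON.stringify(normalise(candidate));
}

const stored = {
  _id: "user-1",
  _rev: "3-abc",
  user: { firstname: "Ada", lastname: "Lovelace" },
};

console.log(hasChanged(stored, { user: { lastname: "Lovelace", firstname: "Ada" } })); // false
console.log(hasChanged(stored, { user: { firstname: "Ada", lastname: "Byron" } })); // true
```

In the migration script, the save would then only happen when `hasChanged` returns true.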

How does PouchDB work?

To store the documents locally, PouchDB uses the database embedded in the user’s browser. By default, this is an IndexedDB database in Firefox/Chrome/IE, WebSQL in Safari and LevelDB in Node.js.
Depending on the browser, different size limits apply to this local database, but as long as you stick to JSON documents and small attachments you don’t have to worry about it.
You can override this choice by creating your PouchDB database this way:

var pouch = new PouchDB("myDB", { adapter: "websql" }); // can also be idb, leveldb or http
if (!pouch.adapter) {
  // websql not supported by this browser
  pouch = new PouchDB("myDB");
}

For most of its operations, PouchDB acts as an intermediary between a JavaScript app and these local databases. Replication is different: there, PouchDB has to communicate with another database, for instance a CouchDB one. Moreover, in the case of a continuous replication, it is better to replicate only the latest changes made in the source database.

In fact, an incremental id is given to every modification of a PouchDB or CouchDB document. These ids are used as checkpoints in the replication process. After listing all the changes between the last replicated checkpoint and the latest change, these modifications are sent in batches to the destination database. Batches are processed one at a time, and the id of the last replicated change in a batch is marked as the new checkpoint.

This way, the replication process only copies the changes needed.
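The checkpoint mechanism can be illustrated with a toy model. The data structures below (a `changes` array with `seq` numbers and a `docs` object) are assumptions made for the example, not the internal format of PouchDB or CouchDB, and the batch size of 2 is arbitrary.

```javascript
// Toy model of checkpoint-based replication. The data structures are
// illustrative assumptions, not PouchDB's or CouchDB's internal format.
function replicate(source, target, checkpoint, batchSize = 2) {
  // Only changes made after the last checkpoint need to be copied.
  const pending = source.changes.filter((change) => change.seq > checkpoint);

  for (let start = 0; start < pending.length; start += batchSize) {
    const batch = pending.slice(start, start + batchSize);
    for (const change of batch) {
      target.docs[change.id] = change.doc;
    }
    // The id of the last change in the batch becomes the new checkpoint.
    checkpoint = batch[batch.length - 1].seq;
  }
  return checkpoint;
}

const source = {
  changes: [
    { seq: 1, id: "a", doc: { text: "first" } },
    { seq: 2, id: "b", doc: { text: "second" } },
    { seq: 3, id: "a", doc: { text: "first, edited" } },
  ],
};
const target = { docs: {} };

let checkpoint = replicate(source, target, 0);
console.log(checkpoint); // 3

// A later run only copies what happened after the checkpoint.
source.changes.push({ seq: 4, id: "c", doc: { text: "third" } });
checkpoint = replicate(source, target, checkpoint);
console.log(Object.keys(target.docs)); // [ 'a', 'b', 'c' ]
```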

Yann Jacquot

Web Developer at Theodo

Theodo France
