Database Management

Note

See Command Line Interface for details on running the paster commands mentioned below.

Initialization

Before you can run CKAN for the first time, you need to run db init to initialize your database:

paster db init -c /etc/ckan/default/production.ini

If you forget to do this you’ll see this error message in your web browser:

503 Service Unavailable: This site is currently off-line. Database is not initialised.

Cleaning

Warning

This will delete all data from your CKAN database!

You can delete everything in the CKAN database, including the tables, to start from scratch:

paster db clean -c /etc/ckan/default/production.ini

After cleaning the database you must do either initialize it or import a previously created dump.

Import and Export

Dumping and Loading databases to/from a file

PostgreSQL offers the command line tools pg_dump and pg_restore for dumping and restoring a database and its content to/from a file.

For example, first dump your CKAN database:

sudo -u postgres pg_dump --format=custom -d ckan_default > ckan.dump

Warning

The exported file is a complete backup of the database, and includes API keys and other user data which may be regarded as private. So keep it secure, like your database server.

Note

If you’ve chosen a non-default database name (i.e. not ckan_default) then you need to adapt the commands accordingly.

Then restore it again:

paster db clean -c /etc/ckan/default/production.ini
sudo -u postgres pg_restore --clean --if-exists -d ckan_default < ckan.dump

If you’re importing a dump from an older version of CKAN you must upgrade the database schema after the import.

Once the import (and a potential upgrade) is complete you should rebuild the search index.

Note

Earlier versions of CKAN offered the paster commands db dump and db load. These are still available but are deprecated in favor of the native tools of PostgreSQL mentioned above. db dump and db load will be removed in future versions of CKAN.

Exporting Datasets to JSON Lines

You can export all of your CKAN site’s datasets from your database to a JSON Lines file using ckanapi:

ckanapi dump datasets -c /etc/ckan/default/production.ini -O my_datasets.jsonl

This is useful to create a simple public listing of the datasets, with no user information. Some simple additions to the Apache config can serve the dump files to users in a directory listing. To do this, add these lines to your virtual Apache config file (e.g. /etc/apache2/sites-available/ckan_default.conf):

Alias /dump/ /home/okfn/var/srvc/ckan.net/dumps/

# Disable the mod_python handler for static files
<Location /dump>
    SetHandler None
    Options +Indexes
</Location>

Warning

Don’t serve an SQL dump of your database (created using the pg_dump command), as those contain private user information such as email addresses and API keys.

Exporting User Accounts to JSON Lines

You can export all of your CKAN site’s user accounts from your database to a JSON Lines file using ckanapi:

ckanapi dump users -c /etc/ckan/default/production.ini -O my_database_users.jsonl

Upgrading

Warning

You should create a backup of your database before upgrading it.

To avoid problems during the database upgrade, comment out any plugins that you have enabled in your ini file. You can uncomment them again when the upgrade finishes.

If you are upgrading to a new CKAN major release update your CKAN database’s schema using the paster db upgrade command:

paster db upgrade -c /etc/ckan/default/production.ini