DataStore extension
The CKAN DataStore extension provides an ad hoc database for storage of structured data from CKAN resources. Data can be pulled out of resource files and stored in the DataStore.
When a resource is added to the DataStore, you get:
Automatic data previews on the resource’s page, using for instance the DataTables view extension
The Data API: search, filter and update the data, without having to download and upload the entire data file
The DataStore is integrated into the CKAN API and authorization system.
The DataStore is generally used alongside tools which will automatically upload data to the DataStore from suitable files, whether uploaded to CKAN’s FileStore or externally linked. See Automatically Adding Data to the DataStore for more details.
Relationship to FileStore
The DataStore is distinct but complementary to the FileStore (see FileStore and file uploads). In contrast to the FileStore which provides ‘blob’ storage of whole files with no way to access or query parts of that file, the DataStore is like a database in which individual data elements are accessible and queryable. To illustrate this distinction, consider storing a spreadsheet file like a CSV or Excel document. In the FileStore this file would be stored directly. To access it you would download the file as a whole. By contrast, if the spreadsheet data is stored in the DataStore, one would be able to access individual spreadsheet rows via a simple web API, as well as being able to make queries over the spreadsheet contents.
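For example, once a resource's rows are in the DataStore, a query like the following (a sketch; the resource id is a placeholder, and datastore_search is described in The Data API below) returns just the first five rows as JSON rather than the whole file:
curl "{CKAN-URL}/api/3/action/datastore_search?resource_id={RESOURCE-ID}&limit=5"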
Setting up the DataStore
1. Enable the plugin
Add the datastore plugin to your CKAN config file:
ckan.plugins = datastore
2. Set-up the database
Warning
Make sure that you follow the steps in Set Permissions below correctly. Wrong settings could lead to serious security issues.
The DataStore requires a separate PostgreSQL database to save the DataStore resources to.
List existing databases:
sudo -u postgres psql -l
Check that the encoding of the databases is UTF8; if it is not, internationalisation may be a problem. Since changing the encoding of PostgreSQL may mean deleting existing databases, it is suggested that this is fixed before continuing with the DataStore setup.
Create users and databases
Tip
If your CKAN database and DataStore databases are on different servers, then you need to create a new database user on the server where the DataStore database will be created. As in Installing CKAN from source we’ll name the database user ckan_default:
sudo -u postgres createuser -S -D -R -P -l ckan_default
Create a database user called datastore_default. This user will be given read-only access to your DataStore database in the Set Permissions step below:
sudo -u postgres createuser -S -D -R -P -l datastore_default
Create the database (owned by ckan_default), which we’ll call datastore_default:
sudo -u postgres createdb -O ckan_default datastore_default -E utf-8
Set URLs
Now, uncomment the ckan.datastore.write_url and ckan.datastore.read_url lines in your CKAN config file and edit them if necessary, for example:
ckan.datastore.write_url = postgresql://ckan_default:pass@localhost/datastore_default
ckan.datastore.read_url = postgresql://datastore_default:pass@localhost/datastore_default
Replace pass with the passwords you created for your ckan_default and datastore_default database users.
Set permissions
Once the DataStore database and the users are created, the permissions on the DataStore and CKAN database have to be set. CKAN provides a ckan command to help you correctly set these permissions.
If you are able to use the psql command to connect to your database as a superuser, you can use the datastore set-permissions command to emit the appropriate SQL to set the permissions.
For example, if you can connect to your database server as the postgres superuser using:
sudo -u postgres psql
Then you can use this connection to set the permissions:
ckan -c /etc/ckan/default/ckan.ini datastore set-permissions | sudo -u postgres psql --set ON_ERROR_STOP=1
Note
If you performed a package install, you will need to replace all references to ‘ckan -c /etc/ckan/default/ckan.ini …’ with ‘sudo ckan …’ and provide the path to the config file, e.g.:
sudo ckan datastore set-permissions | sudo -u postgres psql --set ON_ERROR_STOP=1
If your database server is not local, but you can access it over SSH, you can pipe the permissions script over SSH:
ckan -c /etc/ckan/default/ckan.ini datastore set-permissions | ssh dbserver sudo -u postgres psql --set ON_ERROR_STOP=1
If you can't use the psql command in this way, you can simply copy and paste the output of:
ckan -c /etc/ckan/default/ckan.ini datastore set-permissions
into a PostgreSQL superuser console.
3. Test the set-up
The DataStore is now set up. To test the set-up, (re)start CKAN and run the following command to list all DataStore resources:
curl -X GET "http://127.0.0.1:5000/api/3/action/datastore_search?resource_id=_table_metadata"
This should return a JSON page without errors.
To test whether the set-up allows writing, you can create a new DataStore resource. To do so, run the following command:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_create -H "Authorization: {YOUR-API-KEY}" -d '{"resource": {"package_id": "{PACKAGE-ID}"}, "fields": [ {"id": "a"}, {"id": "b"} ], "records": [ { "a": 1, "b": "xyz"}, {"a": 2, "b": "zzz"} ]}'
Replace {YOUR-API-KEY} with a valid API key and {PACKAGE-ID} with the id of an existing CKAN dataset.
A table named after the resource id should have been created on your DataStore database. Visiting this URL should return a response from the DataStore with the records inserted above:
http://127.0.0.1:5000/api/3/action/datastore_search?resource_id={RESOURCE_ID}
Replace {RESOURCE-ID} with the resource id that was returned as part of the response of the previous API call.
You can now delete the DataStore table with:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_delete -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}"}'
To find out more about the Data API, see The Data API.
Automatically Adding Data to the DataStore
In most cases, you will want data that is added to CKAN (whether it is linked to or uploaded to the FileStore) to be automatically added to the DataStore. This requires some processing, to extract the data from your files and to add it to the DataStore in the format the DataStore can handle.
This task of automatically parsing and then adding data to the DataStore can be performed by different tools; you can choose the one that best fits your requirements:
XLoader is the officially supported extension for automated uploads to the DataStore. It runs as a background job and supports type guessing and limiting the number of rows imported among other settings.
DataPusher+ (DataPusher Plus) is a next-generation replacement for the DataPusher, maintained by datHere. It focuses on increased performance and robustness and includes data pre-processing capabilities to infer fields, transform data, etc.
AirCan is a tool built on top of Apache Airflow maintained by Datopian that among other functionalities supports automated data uploads to the DataStore.
DataPusher is a legacy tool that is no longer maintained. It presents significant limitations so users are encouraged to migrate to one of the tools above.
Data Dictionary
DataStore columns may be described with a Data Dictionary. A Data Dictionary tab will appear when editing any resource with a DataStore table. The Data Dictionary form allows entering the following values for each column:
Type Override: the type to be used the next time DataPusher is run to load data into this column
Label: a human-friendly label for this column
Description: a full description for this column in markdown format
The Data Dictionary is set through the API as part of the Fields passed to datastore_create() and returned from datastore_search().
See also
For information on customizing the Data Dictionary form, see Customizing the DataStore Data Dictionary Form.
Downloading Resources
A DataStore resource can be downloaded in the CSV file format from {CKAN-URL}/datastore/dump/{RESOURCE-ID}.
For an Excel-compatible CSV file use {CKAN-URL}/datastore/dump/{RESOURCE-ID}?bom=true.
Other formats supported include tab-separated values (?format=tsv), JSON (?format=json) and XML (?format=xml). E.g. to download an Excel-compatible tab-separated file use {CKAN-URL}/datastore/dump/{RESOURCE-ID}?format=tsv&bom=true.
- A number of parameters from datastore_search() can be used: offset, limit, filters, q, full_text, distinct, plain, language, fields, sort (see the example below).
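For example, a sketch of a dump URL combining a few of these parameters (the resource id is a placeholder) might look like:
{CKAN-URL}/datastore/dump/{RESOURCE-ID}?format=json&limit=1000&offset=1000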
The Data API
The CKAN DataStore offers an API for reading, searching and filtering data without the need to download the entire file first. The DataStore is an ad hoc database, which means that it is a collection of tables with unknown relationships. This allows you to search in one DataStore resource (a table in the database) as well as to run queries across DataStore resources.
Data can be written incrementally to the DataStore through the API. New data can be inserted, existing data can be updated or deleted. You can also add a new column to an existing table even if the DataStore resource already contains some data.
Triggers may be added to enforce validation, clean data as it is loaded or even record histories. Triggers are PL/pgSQL functions that must be created by a sysadmin.
You will notice that we tried to keep the layer between the underlying PostgreSQL database and the API as thin as possible to allow you to use the features you would expect from a powerful database management system.
A DataStore resource can not be created on its own. It is always required to have an associated CKAN resource. If data is stored in the DataStore, it can automatically be previewed by a preview extension.
Making a Data API request
Making a Data API request is the same as making an Action API request: you post a JSON dictionary in an HTTP POST request to an API URL, and the API also returns its response in a JSON dictionary. See the API guide for details.
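For instance, a minimal sketch of such a request against the datastore_search action (the URL and resource id are placeholders; an Authorization header is only needed for private resources) could be:
curl -X POST {CKAN-URL}/api/3/action/datastore_search -H "Content-Type: application/json" -d '{"resource_id": "{RESOURCE-ID}", "limit": 5}'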
API reference
Note
Lists can always be expressed in different ways. It is possible to use lists, comma separated strings or single items. These are valid lists: ['foo', 'bar'], 'foo, bar', "foo", "bar" and 'foo'. Additionally, there are several ways to define a boolean value. True, on and 1 are all valid boolean values.
Note
The table structure of the DataStore is explained in Internal structure of the database.
- ckanext.datastore.logic.action.datastore_create(context: Context, data_dict: dict[str, Any])
Adds a new table to the DataStore.
The datastore_create action allows you to post JSON data to be stored against a resource. This endpoint also supports altering tables, aliases and indexes and bulk insertion. This endpoint can be called multiple times to initially insert more data, add/remove fields, change the aliases or indexes as well as the primary keys.
To create an empty datastore resource and a CKAN resource at the same time, provide resource with a valid package_id and omit the resource_id.
If you want to create a datastore resource from the content of a file, provide resource with a valid url.
See Fields and Records for details on how to lay out records.
- Parameters:
resource_id (string) – resource id that the data is going to be stored against.
force (bool (optional, default: False)) – set to True to edit a read-only resource
resource (dictionary) – resource dictionary that is passed to resource_create(). Use instead of resource_id (optional)
aliases (list or comma separated string) – names for read only aliases of the resource. (optional)
fields (list of dictionaries) – fields/columns and their extra metadata. (optional)
delete_fields (bool (optional, default: False)) – set to True to remove existing fields not passed
records (list of dictionaries) – the data, eg: [{“dob”: “2005”, “some_stuff”: [“a”, “b”]}] (optional)
primary_key (list or comma separated string) – fields that represent a unique key (optional)
indexes (list or comma separated string) – indexes on table (optional)
triggers (list of dictionaries) – trigger functions to apply to this table on update/insert. Functions may be created with datastore_function_create(). eg: [ {"function": "trigger_clean_reference"}, {"function": "trigger_check_codes"}]
calculate_record_count (bool (optional, default: False)) – updates the stored count of records, used to optimize datastore_search in combination with the total_estimation_threshold parameter. If doing a series of requests to change a resource, you only need to set this to True on the last request.
Please note that setting the aliases, indexes or primary_key replaces the existing aliases or constraints. Setting records appends the provided records to the resource. Setting fields without including all existing fields will remove the others and the data they contain.
Results:
- Returns:
The newly created data object, excluding records passed.
- Return type:
dictionary
See Fields and Records for details on how to lay out records.
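As an illustrative sketch (the resource id, API key and field names are placeholders, not from the official examples), a datastore_create call that defines typed fields, a primary key and an alias might look like:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_create -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}", "fields": [{"id": "code_number", "type": "int"}, {"id": "description", "type": "text"}], "primary_key": "code_number", "aliases": "status_codes", "records": [{"code_number": 10, "description": "Submitted successfully"}]}'
Note that, as described above, passing aliases or primary_key again on later calls replaces the existing ones.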
- ckanext.datastore.logic.action.datastore_run_triggers(context: Context, data_dict: dict[str, Any]) → int
update each record with trigger
The datastore_run_triggers API action allows you to re-apply existing triggers to an existing DataStore resource.
- Parameters:
resource_id (string) – resource id that the data is going to be stored under.
Results:
- Returns:
The rowcount in the table.
- Return type:
int
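A hedged example call (the API key and resource id are placeholders) could be:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_run_triggers -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}"}'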
- ckanext.datastore.logic.action.datastore_upsert(context: Context, data_dict: dict[str, Any])
Updates or inserts into a table in the DataStore
The datastore_upsert API action allows you to add or edit records to an existing DataStore resource. In order for the upsert and update methods to work, a unique key has to be defined via the datastore_create action. The available methods are:
- upsert
Update if record with same key already exists, otherwise insert. Requires unique key or _id field.
- insert
Insert only. This method is faster than upsert, but will fail if any inserted record matches an existing one. Does not require a unique key.
- update
Update only. An exception will occur if the key that should be updated does not exist. Requires unique key or _id field.
- Parameters:
resource_id (string) – resource id that the data is going to be stored under.
force (bool (optional, default: False)) – set to True to edit a read-only resource
records (list of dictionaries) – the data, eg: [{“dob”: “2005”, “some_stuff”: [“a”,”b”]}] (optional)
method (string) – the method to use to put the data into the datastore. Possible options are: upsert, insert, update (optional, default: upsert)
calculate_record_count (bool (optional, default: False)) – updates the stored count of records, used to optimize datastore_search in combination with the total_estimation_threshold parameter. If doing a series of requests to change a resource, you only need to set this to True on the last request.
dry_run (bool (optional, default: False)) – set to True to abort transaction instead of committing, e.g. to check for validation or type errors.
Results:
- Returns:
The modified data object.
- Return type:
dictionary
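For example, assuming the resource was created with a unique key as in the datastore_create sketch above (placeholders as before), an upsert that updates one record and inserts another might look like:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_upsert -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}", "method": "upsert", "records": [{"code_number": 10, "description": "Submitted and archived"}, {"code_number": 42, "description": "In progress"}]}'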
- ckanext.datastore.logic.action.datastore_info(context: Context, data_dict: dict[str, Any]) → dict[str, Any]
Returns detailed metadata about a resource.
- Parameters:
resource_id (string) – id or alias of the resource we want info about.
Results:
- Return type:
dictionary
- Returns:
meta: resource metadata dictionary with the following keys:
aliases - aliases (views) for the resource
count - row count
db_size - size of the datastore database (bytes)
id - resource id (useful for dereferencing aliases)
idx_size - size of all indices for the resource (bytes)
size - size of resource (bytes)
table_type - BASE TABLE, VIEW, FOREIGN TABLE or MATERIALIZED VIEW
fields: A list of dictionaries based on Fields, with an additional nested dictionary per field called schema, with the following keys:
native_type - native database data type
index_name
is_index
notnull
uniquekey
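A sketch of a call (the resource id is a placeholder; an alias would also work, and the Authorization header may be omitted for public resources):
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_info -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}"}'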
- ckanext.datastore.logic.action.datastore_delete(context: Context, data_dict: dict[str, Any])
Deletes a table or a set of records from the DataStore. (Use datastore_records_delete() to keep tables intact.)
- Parameters:
resource_id (string) – resource id that the data will be deleted from. (optional)
force (bool (optional, default: False)) – set to True to edit a read-only resource
filters (dictionary) – Filters to apply before deleting (eg {“name”: “fred”}). If missing delete whole table and all dependent views. (optional)
calculate_record_count (bool (optional, default: False)) – updates the stored count of records, used to optimize datastore_search in combination with the total_estimation_threshold parameter. If doing a series of requests to change a resource, you only need to set this to True on the last request.
Results:
- Returns:
Original filters sent.
- Return type:
dictionary
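For example, to delete only the records matching a filter rather than the whole table (a sketch with placeholder values), omit nothing but the filters you need:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_delete -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}", "filters": {"description": "In progress"}}'
Leaving out filters entirely deletes the whole table, as in the earlier test example.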
- ckanext.datastore.logic.action.datastore_records_delete(context: Context, data_dict: dict[str, Any])
Deletes records from a DataStore table but will never remove the table itself.
- Parameters:
resource_id (string) – resource id that the data will be deleted from. (required)
force (bool (optional, default: False)) – set to True to edit a read-only resource
filters (dictionary) – Filters to apply before deleting (eg {“name”: “fred”}). If {} delete all records. (required)
calculate_record_count (bool (optional, default: False)) – updates the stored count of records, used to optimize datastore_search in combination with the total_estimation_threshold parameter. If doing a series of requests to change a resource, you only need to set this to True on the last request.
Results:
- Returns:
Original filters sent.
- Return type:
dictionary
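For instance, to empty a table while keeping its structure, pass an empty filters dictionary (a sketch with placeholder values):
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_records_delete -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}", "filters": {}}'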
- ckanext.datastore.logic.action.datastore_search(context: Context, data_dict: dict[str, Any])
Search a DataStore resource.
The datastore_search action allows you to search data in a resource. By default 100 rows are returned - see the limit parameter for more info.
A DataStore resource that belongs to a private CKAN resource can only be read by you if you have access to the CKAN resource and send the appropriate authorization.
- Parameters:
resource_id (string) – id or alias of the resource to be searched against
filters (dictionary) – Filters for matching conditions to select, e.g {“key1”: “a”, “key2”: “b”} (optional)
q (string or dictionary) – full text query. If it’s a string, it’ll search on all fields on each row. If it’s a dictionary as {“key1”: “a”, “key2”: “b”}, it’ll search on each specific field (optional)
full_text (string) – full text query. It searches on all fields on each row. This should be used in place of q when performing a string search across all fields
distinct (bool) – return only distinct rows (optional, default: false)
plain (bool) – treat as plain text query (optional, default: true)
language (string) – language of the full text query (optional, default: english)
limit (int) – maximum number of rows to return (optional, default: 100, unless set in the site's configuration ckan.datastore.search.rows_default; upper limit: 32000 unless set in site's configuration ckan.datastore.search.rows_max)
offset (int) – offset this number of rows (optional)
fields (list or comma separated string) – fields to return (optional, default: all fields in original order)
sort (string) – comma separated field names with ordering e.g.: “fieldname1, fieldname2 desc nulls last”
include_total (bool) – True to return total matching record count (optional, default: true)
total_estimation_threshold (int or None) – If “include_total” is True and “total_estimation_threshold” is not None and the estimated total (matching record count) is above the “total_estimation_threshold” then this datastore_search will return an estimate of the total, rather than a precise one. This is often good enough, and saves computationally expensive row counting for larger results (e.g. >100000 rows). The estimated total comes from the PostgreSQL table statistics, generated when Express Loader or DataPusher finishes a load, or by autovacuum. NB Currently estimation can’t be done if the user specifies ‘filters’ or ‘distinct’ options. (optional, default: None)
records_format (controlled list) – the format for the records return value: ‘objects’ (default) list of {fieldname1: value1, …} dicts, ‘lists’ list of [value1, value2, …] lists, ‘csv’ string containing comma-separated values with no header, ‘tsv’ string containing tab-separated values with no header
Setting the plain flag to false enables the entire PostgreSQL full text search query language.
A listing of all available resources can be found at the alias _table_metadata.
If you need to download the full resource, read Downloading Resources.
Results:
The result of this action is a dictionary with the following keys:
- Return type:
A dictionary with the following keys
- Parameters:
fields (list of dictionaries) – fields/columns and their extra metadata
offset (int) – query offset value
limit (int) – queried limit value (if the requested limit was above the ckan.datastore.search.rows_max value then this response limit will be set to the value of ckan.datastore.search.rows_max)
filters (list of dictionaries) – query filters
total (int) – number of total matching records
total_was_estimated (bool) – whether or not the total was estimated
records (depends on records_format value passed) – list of matching results
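As an illustration (placeholders throughout; the field names follow the earlier sketches, and an Authorization header would only be needed for a private resource), a search combining filters, sorting and a limit might look like:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_search -H "Content-Type: application/json" -d '{"resource_id": "{RESOURCE-ID}", "filters": {"description": "In progress"}, "sort": "code_number desc", "limit": 10, "fields": "code_number,description"}'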
- ckanext.datastore.logic.action.datastore_search_sql(context: Context, data_dict: dict[str, Any])
Execute SQL queries on the DataStore.
The datastore_search_sql action allows a user to search data in a resource or connect multiple resources with join expressions. The underlying SQL engine is the PostgreSQL engine. There is an enforced timeout on SQL queries to avoid an unintended DOS. The number of results returned is limited to 32000, unless set in the site's configuration ckan.datastore.search.rows_max. Queries are only allowed if you have access to all the CKAN resources in the query and send the appropriate authorization.
Note
This action is not available by default and needs to be enabled with the ckan.datastore.sqlsearch.enabled setting.
Note
When source data columns (i.e. CSV) heading names are provided in all UPPERCASE you need to double quote them in the SQL select statement to avoid returning null results.
- Parameters:
sql (string) – a single SQL select statement
Results:
The result of this action is a dictionary with the following keys:
- Return type:
A dictionary with the following keys
- Parameters:
fields (list of dictionaries) – fields/columns and their extra metadata
records (list of dictionaries) – list of matching results
records_truncated (bool) – indicates whether the number of records returned was limited by the internal limit, which is 32000 records (or other value set in the site's configuration ckan.datastore.search.rows_max). If records are truncated by this, this key has value True, otherwise the key is not returned at all.
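Assuming ckan.datastore.sqlsearch.enabled is set, a hedged example of a query (the resource id placeholder is the table name and must be double-quoted inside the SQL statement) could be:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_search_sql -H "Content-Type: application/json" -d '{"sql": "SELECT code_number, description FROM \"{RESOURCE-ID}\" WHERE code_number > 10 LIMIT 5"}'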
- ckanext.datastore.logic.action.set_datastore_active_flag(context: Context, data_dict: dict[str, Any], flag: bool)
Set appropriate datastore_active flag on CKAN resource.
Called after creation or deletion of DataStore table.
- ckanext.datastore.logic.action.datastore_function_create(context: Context, data_dict: dict[str, Any])
Create a trigger function for use with datastore_create
- Parameters:
name (string) – function name
or_replace (bool) – True to replace if function already exists (default: False)
rettype (string) – set to ‘trigger’ (only trigger functions may be created at this time)
definition (string) – PL/pgSQL function body for trigger function
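As a sketch (the function name and body are illustrative, not from the official docs; only sysadmins may create trigger functions, as noted earlier), a simple trigger function that trims whitespace from a column as rows are loaded could be created like this:
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_function_create -H "Authorization: {YOUR-API-KEY}" -d '{"name": "trigger_clean_description", "or_replace": true, "rettype": "trigger", "definition": "BEGIN NEW.description := trim(NEW.description); RETURN NEW; END;"}'
The function can then be attached to a table through the triggers parameter of datastore_create(), e.g. "triggers": [{"function": "trigger_clean_description"}].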
- ckanext.datastore.logic.action.datastore_function_delete(context: Context, data_dict: dict[str, Any])
Delete a trigger function
- Parameters:
name (string) – function name
Fields
Fields define the column names and the type of the data in a column. A field is defined as follows:
{
"id": # the column name (required)
"type": # the data type for the column
"info": {
"label": # human-readable label for column
"notes": # markdown description of column
"type_override": # type for datapusher to use when importing data
...: # free-form user-defined values
}
...: # values defined and validated with IDataDictionaryForm
}
Field types not provided will be guessed based on the first row of provided data. Set the types to ensure that future inserts will not fail because of an incorrectly guessed type. See Field types for details on which types are valid.
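For example, a fields list with explicit types and Data Dictionary info (the column names are illustrative) could look like:
[
    {"id": "code_number", "type": "int"},
    {
        "id": "description",
        "type": "text",
        "info": {
            "label": "Description",
            "notes": "A short, human-readable status message"
        }
    }
]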
See also
For more on custom field values and customizing the Data Dictionary form, see Customizing the DataStore Data Dictionary Form.
Records
A record is the data to be inserted in a DataStore resource and is defined as follows:
{
column_1_id: value_1,
column_2_id: value_2,
...
}
Example:
[
{
"code_number": 10,
"description": "Submitted successfully"
},
{
"code_number": 42,
"description": "In progress"
}
]
Field types
The DataStore supports all types supported by PostgreSQL as well as a few additions. A list of the PostgreSQL types can be found in the type section of the documentation. Below you can find a list of the most common data types. The json type has been added as a storage for nested data.
In addition to the types listed below, you can also use array types. They are defined by prepending a _ or appending [] or [n], where n denotes the length of the array. An arbitrarily long array of integers would be defined as int[].
- text
Arbitrary text data, e.g. Here's some text.
- json
Arbitrary nested json data, e.g. {"foo": 42, "bar": [1, 2, 3]}. Please note that this type is a custom type that is wrapped by the DataStore.
- date
Date without time, e.g. 2012-5-25.
- time
Time without date, e.g. 12:42.
- timestamp
Date and time, e.g. 2012-10-01T02:43Z.
- int
Integer numbers, e.g. 42, 7.
- float
Floats, e.g. 1.61803.
- bool
Boolean values, e.g. true, 0.
You can find more information about the formatting of dates in the date/time types section of the PostgreSQL documentation.
Filters
Filters define the matching conditions to select from the DataStore. A filter is defined as follows:
{
"resource_id": # the resource ID (required)
"filters": {
# column name: # field value
# column name: # List of field values
...: # other user-defined filters
}
}
Filters must be supplied as a dictionary. Filters are used as WHERE statements. The filters have to be valid key/value pairs. The key must be a valid column name and the value must match the respective column type. The value may be provided as a list of multiple matching values. See Field types for details on which types are valid.
Example (single filter values, used as WHERE = statements):
{
"resource_id": "5f38da22-7d55-4312-81ce-17f1a9e84788",
"filters": {
"name": "Fred",
"dob": "1994-7-07"
}
}
Example (multiple filter values, used as WHERE IN statements):
{
"resource_id": "5f38da22-7d55-4312-81ce-17f1a9e84788",
"filters": {
"name": ["Fred", "Jones"],
"dob": ["1994-7-07", "1992-7-27"]
}
}
Resource aliases
A resource in the DataStore can have multiple aliases that are easier to remember than the resource id. Aliases can be created and edited with the datastore_create() API endpoint. All aliases can be found in a special view called _table_metadata. See Internal structure of the database for full reference.
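For example, a sketch of adding an alias to an existing DataStore resource and then querying it by that alias (placeholders as before; remember that setting aliases replaces any existing ones):
curl -X POST http://127.0.0.1:5000/api/3/action/datastore_create -H "Authorization: {YOUR-API-KEY}" -d '{"resource_id": "{RESOURCE-ID}", "aliases": "status_codes"}'
curl "http://127.0.0.1:5000/api/3/action/datastore_search?resource_id=status_codes&limit=5"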
Comparison of different querying methods
The DataStore supports querying with two API endpoints. They are similar but support different features. The following list gives an overview of the different methods.
|                | datastore_search | datastore_search_sql |
| Ease of use    | Easy             | Complex              |
| Flexibility    | Low              | High                 |
| Query language | Custom (JSON)    | SQL                  |
| Join resources | No               | Yes                  |
Internal structure of the database
The DataStore is a thin layer on top of a PostgreSQL database. Each DataStore resource belongs to a CKAN resource. The name of a table in the DataStore is always the resource id of the CKAN resource for the data.
As explained in Resource aliases, a resource can have mnemonic aliases which are stored as views in the database.
All aliases (views) and resources (tables respectively relations) of the DataStore can be found in a special view called _table_metadata. To access the list, open http://{YOUR-CKAN-INSTALLATION}/api/3/action/datastore_search?resource_id=_table_metadata.
_table_metadata has the following fields:
- _id
Unique key of the relation in _table_metadata.
- alias_of
Name of the relation that this alias points to. This field is null iff the name is not an alias.
- name
Contains the name of the alias if alias_of is not null. Otherwise, this is the resource id of the CKAN resource for the DataStore resource.
- oid
The PostgreSQL object ID of the table that belongs to name.
Extending DataStore
Starting from CKAN version 2.7, the backend used by the DataStore can be replaced with a custom one. For this purpose, a custom extension must implement ckanext.datastore.interfaces.IDatastoreBackend, which provides one method, register_backends. It should return a dictionary with the names of custom backends as keys and the classes that represent those backends as values. Each class is supposed to inherit from ckanext.datastore.backend.DatastoreBackend.
Note
An example of a custom implementation can be found at ckanext.example_idatastorebackend.
- ckanext.datastore.backend.get_all_resources_ids_in_datastore() → list[str]
Helper for getting the ids of all resources in the datastore.
Uses get_all_ids of active datastore backend.
- exception ckanext.datastore.backend.DatastoreException
- class ckanext.datastore.backend.DatastoreBackend
Base class for all datastore backends.
A very simple example of an implementation based on SQLite can be found in ckanext.example_idatastorebackend. In order to use it, set datastore.write_url to 'example-sqlite:////tmp/database-name-on-your-choice'.
- Prop _backend:
mapping(schema, class) of all registered backends
- Prop _active_backend:
current active backend
- classmethod register_backends()
Register all backend implementations inside extensions.
- classmethod set_active_backend(config: CKANConfig)
Choose most suitable backend depending on configuration
- Parameters:
config – configuration object
- Return type:
ckan.common.CKANConfig
- classmethod get_active_backend()
Return currently used backend
- configure(config: CKANConfig)
Configure backend, set inner variables, make some initial setup.
- Parameters:
config – configuration object
- Returns:
config
- Return type:
CKANConfig
- create(context: Context, data_dict: dict[str, Any], plugin_data: dict[int, dict[str, Any]]) → Any
Create new resource inside datastore.
Called by datastore_create.
- Parameters:
data_dict – See ckanext.datastore.logic.action.datastore_create
- Returns:
The newly created data object
- Return type:
dictionary
- upsert(context: Context, data_dict: dict[str, Any]) → Any
Update or create resource depending on data_dict param.
Called by datastore_upsert.
- Parameters:
data_dict – See ckanext.datastore.logic.action.datastore_upsert
- Returns:
The modified data object
- Return type:
dictionary
- delete(context: Context, data_dict: dict[str, Any]) → Any
Remove resource from datastore.
Called by datastore_delete.
- Parameters:
data_dict – See ckanext.datastore.logic.action.datastore_delete
- Returns:
Original filters sent.
- Return type:
dictionary
- search(context: Context, data_dict: dict[str, Any]) → Any
Base search.
Called by datastore_search.
- Parameters:
data_dict – See ckanext.datastore.logic.action.datastore_search
fields (list of dictionaries) – fields/columns and their extra metadata
offset (int) – query offset value
limit (int) – query limit value
filters (list of dictionaries) – query filters
total (int) – number of total matching records
records (list of dictionaries) – list of matching results
- Return type:
dictionary with the following keys
- search_sql(context: Context, data_dict: dict[str, Any]) → Any
Advanced search.
Called by datastore_search_sql.
- Parameters:
sql (string) – a single search statement
- Return type:
dictionary
- Parameters:
fields (list of dictionaries) – fields/columns and their extra metadata
records (list of dictionaries) – list of matching results
- resource_exists(id: str) → bool
Define whether resource exists in datastore.
- resource_fields(id: str) → Any
Return dictionary with resource description.
Called by datastore_info.
- Returns:
A dictionary describing the columns and their types.
- resource_info(id: str) → Any
Return DataDictionary with resource's info - #3414
- resource_id_from_alias(alias: str) → Any
Convert resource’s alias to real id.
- Parameters:
alias (string) – resource’s alias or id
- Returns:
real id of resource
- Return type:
string
- get_all_ids() → list[str]
Return the ids of all resources registered in the datastore.
- Returns:
all resources ids
- Return type:
list of strings
- create_function(*args: Any, **kwargs: Any) → Any
Called by datastore_function_create action.
- drop_function(*args: Any, **kwargs: Any) → Any
Called by datastore_function_delete action.