Customizing the DataStore Data Dictionary Form

Extensions can customize the Data Dictionary form, the keys available and the values stored for each column by implementing the IDataDictionaryForm interface.

class ckanext.datastore.interfaces.IDataDictionaryForm

Allow data dictionary validation and per-plugin data storage by extending the datastore_create schema and adding values to the fields returned from datastore_info.

update_datastore_create_schema(schema: Schema) → Schema

Return a modified schema for handling field input in the data dictionary form and datastore_create parameters.

Validators are provided a plugin_data dict in the context that can be used to store per-field values. Top-level keys in this dict should match the field index, second-level keys should match the plugin name and values should be a dict with string keys storing data for that plugin.

e.g. a statistics plugin that needs to store per-column information might store this with plugin_data by inserting values like:

{0: {'statistics': {'minimum': 34, ...}, ...}, ...}
#                   ^ the data stored for this field+plugin
#     ^ the name of the plugin
#^ 0 for the first field passed in fields

Values not removed from field info by validation will be available in the field info dict returned from datastore_search and datastore_info.
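
The nested plugin_data layout described above can be sketched with plain dictionaries. This is only an illustration: store_plugin_value is a hypothetical helper that is not part of CKAN, and context here is an ordinary dict standing in for the validation context.

```python
from __future__ import annotations

from typing import Any


def store_plugin_value(
        context: dict[str, Any], field_index: int,
        plugin_name: str, key: str, value: Any) -> None:
    '''hypothetical helper: record one value for one field and plugin'''
    # first level: field index, second level: plugin name,
    # third level: string keys holding that plugin's data
    context['plugin_data'].setdefault(
        field_index, {}).setdefault(plugin_name, {})[key] = value


# a plain dict standing in for the validation context (assumption)
context: dict[str, Any] = {'plugin_data': {}}
store_plugin_value(context, 0, 'statistics', 'minimum', 34)
store_plugin_value(context, 0, 'statistics', 'maximum', 98)
store_plugin_value(context, 1, 'statistics', 'minimum', 2)
print(context['plugin_data'])
```

Using setdefault at each level means a plugin never overwrites values stored for the same field by another plugin.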

update_datastore_info_field(field: dict[str, Any], plugin_data: dict[str, Any])

Return a modified version of the datastore_info field dict based on this field’s plugin_data to provide additional information to users and existing values for new form fields in the data dictionary page.

Let’s add five new keys with custom validation rules to the data dictionary fields.

With this plugin enabled, each field in the Data Dictionary form will have an input for:

  • an integer value

  • a JSON object

  • a numeric value that can only be increased when edited

  • a “sticky” value that will not be removed if left blank

  • a secret value that will be stored but never displayed in the form.

First extend the form template to render the form inputs:

{% ckan_extends %}

{% block additional_fields %}
  {{ form.input('fields__' ~ position ~ '__an_int',
    label=_('An integer'), id='example-plugin-f' ~ position ~ 'an_int',
    value=data.get('an_int', field.get('an_int', '')),
    classes=['control-full'], error=errors.an_int) }}

  {{ form.input('fields__' ~ position ~ '__json_obj',
    label=_('JSON object'), id='example-plugin-f' ~ position ~ 'json_obj',
    value=
      data.json_obj if 'json_obj' in data else
      h.dump_json(field['json_obj']) if 'json_obj' in field else '',
    classes=['control-full'], error=errors.json_obj) }}

  {{ form.input('fields__' ~ position ~ '__only_up',
    label=_('Always increasing'), id='example-plugin-f' ~ position ~ 'only_up',
    value=data.get('only_up', field.get('only_up', '')),
    classes=['control-full'], error=errors.only_up) }}

  {{ form.input('fields__' ~ position ~ '__sticky',
    label=_('Sticky input'), id='example-plugin-f' ~ position ~ 'sticky',
    value=data.get('sticky', field.get('sticky', '')),
    classes=['control-full'], error=errors.sticky) }}

  {{ form.input('fields__' ~ position ~ '__secret',
    label=_('Secret (write-only)'),
    id='example-plugin-f' ~ position ~ 'secret',
    value='', classes=['control-full'],
    error=errors.secret) }}
{% endblock %}

We use the form.input macro to render the form fields. The name of each field starts with fields__ and includes a position index because this block will be rendered once for every field in the data dictionary.
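
To make the naming scheme concrete, here is a small sketch of how such flattened input names are built and how they can be split back into their parts. Both functions are hypothetical and written only for this illustration; they are not CKAN's own form-parsing code, and the position values used are arbitrary.

```python
from __future__ import annotations


def input_name(position: int, key: str) -> str:
    '''build an input name the same way the template concatenates it'''
    return 'fields__' + str(position) + '__' + key


def split_input_name(name: str) -> tuple[str | int, ...]:
    '''split a flattened name on "__", turning numeric parts into ints'''
    return tuple(
        int(part) if part.isdigit() else part
        for part in name.split('__'))


# one input per data dictionary field, distinguished by position
names = [input_name(position, 'an_int') for position in (1, 2, 3)]
print(names)
print(split_input_name(names[1]))
```

Because the position is part of the name, the same key can be submitted once per data dictionary field without the inputs colliding.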

The value for each input is taken from one of two places: data, the text passed back when re-rendering a form that contains errors, or field, the JSON value (text, number, object etc.) currently stored in the data dictionary, used when rendering the form for the first time. The secret input is the exception: its value is always rendered empty.
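
The fallback between data and field can be expressed in plain Python. This is a sketch of the template logic only: input_value and json_input_value are hypothetical names, the sample dicts are made up, and json.dumps stands in for the h.dump_json helper used in the template.

```python
from __future__ import annotations

import json
from typing import Any


def input_value(data: dict[str, Any], field: dict[str, Any], key: str) -> Any:
    '''prefer the submitted text, then the stored value, then an empty string'''
    return data.get(key, field.get(key, ''))


def json_input_value(
        data: dict[str, Any], field: dict[str, Any], key: str) -> str:
    '''same fallback, but a stored object is serialised to text for the input'''
    if key in data:
        return data[key]
    if key in field:
        return json.dumps(field[key])
    return ''


# first render: nothing submitted yet, stored values are shown
stored = {'an_int': 5, 'json_obj': {'a': 1}}
first_int = input_value({}, stored, 'an_int')
first_json = json_input_value({}, stored, 'json_obj')

# re-render after a validation error: the user's text is shown again
submitted = {'an_int': 'abc', 'json_obj': '{oops'}
retry_int = input_value(submitted, stored, 'an_int')
retry_json = json_input_value(submitted, stored, 'json_obj')
print(first_int, first_json, retry_int, retry_json)
```

Showing the submitted text again, even when it is invalid, lets the user correct the input instead of retyping it.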

The error for each field is set from errors.

Next we create a plugin to apply the template and validation rules for each data dictionary field key.

# encoding: utf-8

from __future__ import annotations

import json
from typing import Any, cast

from ckan.types import (
    Context, FlattenDataDict, FlattenErrorDict, FlattenKey,
    Schema, ValidatorFactory,
)
from ckan.common import CKANConfig
from ckan.lib.navl.dictization_functions import missing
from ckan.plugins.toolkit import (
    Invalid, get_validator, add_template_directory, _
)
from ckan import plugins
from ckanext.datastore.interfaces import IDataDictionaryForm


class ExampleIDataDictionaryFormPlugin(plugins.SingletonPlugin):
    plugins.implements(IDataDictionaryForm)
    plugins.implements(plugins.IConfigurer)

    # IConfigurer

    def update_config(self, config: CKANConfig):
        add_template_directory(config, 'templates')

    # IDataDictionaryForm

    def update_datastore_create_schema(self, schema: Schema):
        ignore_empty = get_validator('ignore_empty')
        int_validator = get_validator('int_validator')
        unicode_only = get_validator('unicode_only')
        datastore_default_current = get_validator('datastore_default_current')
        to_datastore_plugin_data = cast(
            ValidatorFactory, get_validator('to_datastore_plugin_data'))
        to_eg_iddf = to_datastore_plugin_data('example_idatadictionaryform')

        f = cast(Schema, schema['fields'])
        f['an_int'] = [ignore_empty, int_validator, to_eg_iddf]
        f['json_obj'] = [ignore_empty, json_obj, to_eg_iddf]
        f['only_up'] = [
            only_increasing, ignore_empty, int_validator, to_eg_iddf]
        f['sticky'] = [
            datastore_default_current, ignore_empty, unicode_only, to_eg_iddf]

        # use different plugin_key so that value isn't removed
        # when above fields are updated & value not exposed in
        # datastore_info
        f['secret'] = [
            ignore_empty,
            to_datastore_plugin_data('example_idatadictionaryform_secret')
        ]
        return schema

    def update_datastore_info_field(
            self, field: dict[str, Any], plugin_data: dict[str, Any]):
        # expose all our non-secret plugin data in the field
        field.update(plugin_data.get('example_idatadictionaryform', {}))
        return field


def json_obj(value: str | dict[str, Any]) -> dict[str, Any]:
    '''accept only json objects i.e. dicts or "{...}"'''
    try:
        if isinstance(value, str):
            value = json.loads(value)
        else:
            json.dumps(value)
        if not isinstance(value, dict):
            raise TypeError
        return value
    except (TypeError, ValueError):
        raise Invalid(_('Not a JSON object'))


def only_increasing(
        key: FlattenKey, data: FlattenDataDict,
        errors: FlattenErrorDict, context: Context):
    '''once set only accept new values larger than current value'''
    value = data[key]
    field_index = key[-2]
    field_name = key[-1]
    # current values for plugin_data are available as
    # context['plugin_data'][field_index]['_current']
    current = context['plugin_data'].get(field_index, {}).get(
        '_current', {}).get('example_idatadictionaryform', {}).get(
        field_name)
    if current is None:
        return
    if value is not None and value != '' and value is not missing:
        try:
            # only smaller values are rejected; resubmitting the
            # current value is allowed
            if int(value) < current:
                errors[key].append(
                    _('Value must be larger than %d') % current)
        except ValueError:
            return  # allow int_validator to handle the error
    else:
        # keep current value when empty/missing
        data[key] = current

In update_datastore_create_schema the to_datastore_plugin_data factory generates a validator that will store our new keys as plugin data. The string passed is used to group keys for this plugin to allow multiple separate IDataDictionaryForm plugins to store data for Data Dictionary fields at the same time. It’s possible to use multiple groups from the same plugin: here we use a different group for the secret key because we want to treat it differently.
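
The grouping idea can be sketched without CKAN. The factory below is a simplified, hypothetical stand-in written for this illustration and is not the real to_datastore_plugin_data; in particular, removing the value from data after storing it is an assumption. The context, data and errors arguments are plain dicts.

```python
from __future__ import annotations

from typing import Any, Callable


def to_plugin_data_sketch(plugin_key: str) -> Callable[..., None]:
    '''hypothetical stand-in for a factory like to_datastore_plugin_data'''
    def validator(
            key: tuple[Any, ...], data: dict[Any, Any],
            errors: dict[Any, Any], context: dict[str, Any]) -> None:
        # key is a flattened tuple such as ('fields', 0, 'an_int')
        field_index, field_name = key[-2], key[-1]
        # assumption: the value moves out of data into this plugin's group
        value = data.pop(key)
        context['plugin_data'].setdefault(
            field_index, {}).setdefault(plugin_key, {})[field_name] = value
    return validator


context: dict[str, Any] = {'plugin_data': {}}
errors: dict[Any, Any] = {}
data = {('fields', 0, 'an_int'): 7, ('fields', 0, 'secret'): 'hunter2'}

# two groups from the same plugin, as in the example plugin above
to_main = to_plugin_data_sketch('example_idatadictionaryform')
to_secret = to_plugin_data_sketch('example_idatadictionaryform_secret')
to_main(('fields', 0, 'an_int'), data, errors, context)
to_secret(('fields', 0, 'secret'), data, errors, context)
print(context['plugin_data'])
```

Each group ends up under its own second-level key for the field, so code that reads one group never sees the values stored in the other.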

In update_datastore_info_field we can add keys stored as plugin data to the field objects returned by datastore_info. Here we add everything but the secret key. These values are also passed to the form template above as field.
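
The effect of keeping the secret in its own group can be checked with plain dicts. This sketch copies the body of update_datastore_info_field into a standalone function with a hypothetical name; the field and plugin_data values are made up for the illustration.

```python
from __future__ import annotations

from typing import Any


def info_field_sketch(
        field: dict[str, Any], plugin_data: dict[str, Any]) -> dict[str, Any]:
    '''same logic as update_datastore_info_field in the plugin above'''
    # only the non-secret group is merged into the field
    field.update(plugin_data.get('example_idatadictionaryform', {}))
    return field


# made-up stored plugin data for one field
plugin_data = {
    'example_idatadictionaryform': {'an_int': 7, 'sticky': 'kept'},
    'example_idatadictionaryform_secret': {'secret': 'hunter2'},
}
field = info_field_sketch({'id': 'temperature', 'type': 'numeric'}, plugin_data)
print(field)
```

The secret value stays in plugin_data but never reaches the returned field, so it is neither exposed to users nor offered to the form template.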