@simonwo
Created September 26, 2019 16:49

GOV.UK Registers Profile

This page is a summary of the GDS profile of Registers (e.g. specific choices above the entry log level). This is the version that orc processes.

Mostly this is from https://raw.githubusercontent.com/openregister/specification/rsf-spec/index.html which can be downloaded and displayed locally.

RSF

RSF as per: https://github.com/openregister/registers-rfcs/blob/rsf-spec/content/rsf-spec/index.md

Keys

Key names are alphanumeric strings with dashes or underscores. (Registers.app doesn't enforce this because there's no technical reason why this needs to be true, so long as the key doesn't include a tab character.)

Fields

Field names must consist only of lowercase letters and hyphens. (As with key names, Registers.app doesn't enforce this.)
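
The two naming rules above can be sketched as a pair of checks. This is a minimal illustration, not code from orc or Registers.app: the function names are hypothetical, and "alphanumeric" is assumed to mean ASCII letters and digits.

```python
import re

# Patterns written from the prose rules above (assumptions, not taken from orc):
# key names are ASCII alphanumerics plus dashes and underscores,
# field names are lowercase letters plus hyphens.
KEY_NAME = re.compile(r"[A-Za-z0-9_-]+")
FIELD_NAME = re.compile(r"[a-z-]+")


def is_valid_key(key: str) -> bool:
    """True if the whole key matches the GDS profile's key-name rule."""
    return KEY_NAME.fullmatch(key) is not None


def is_valid_field_name(name: str) -> bool:
    """True if the whole name matches the GDS profile's field-name rule."""
    return FIELD_NAME.fullmatch(name) is not None
```

A looser check in the spirit of Registers.app would only reject keys that contain a tab character.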

Items

All items are encoded as JSON, with some constraints on what that JSON looks like.

The canonicalisation algorithm is as follows:

  • JSON object members MUST be sorted into lexicographical order of their keys. Each key of a JSON object must be a valid field name, which is restricted to the alphabet of lower-case letters and hyphens, so this ordering is relatively simple to implement.
  • All whitespace MUST be removed.
  • Characters in strings must be represented as follows:
    • For ASCII control characters (codepoints 0x00 - 0x1f):
      • If it has a short representation (\b, \f, \n, \r, or \t), that short representation MUST be used.
      • Other control characters (such as NULL) MUST be represented as a \u00XX escape sequence. Hexadecimal digits MUST be upper-case.
    • Backslash (\) and double quote (") MUST be escaped as \\ and \" respectively.
    • All other characters MUST be included literally (i.e. unescaped). This includes forward-slash (/).

This canonicalisation algorithm is very similar to that used in JCS, except that we stipulate an ordering of keys, and we enforce upper-case rather than lower-case hex digits.
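
The canonicalisation rules above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used by orc or Registers.app: the function names are hypothetical, and because the rules only describe objects and strings, numbers, booleans and null are simply passed to json.dumps as an assumption.

```python
import json

# Characters with a mandatory escape: the five short control escapes,
# plus backslash and double quote.
ESCAPES = {
    "\b": "\\b", "\f": "\\f", "\n": "\\n", "\r": "\\r", "\t": "\\t",
    "\\": "\\\\", '"': '\\"',
}


def encode_string(s: str) -> str:
    """Encode one string following the character rules above."""
    out = ['"']
    for ch in s:
        if ch in ESCAPES:
            out.append(ESCAPES[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters: \u00XX with upper-case hex digits.
            out.append("\\u%04X" % ord(ch))
        else:
            # Everything else, including "/" and non-ASCII, is written literally.
            out.append(ch)
    out.append('"')
    return "".join(out)


def canonicalise(value) -> str:
    """Serialise an already-parsed JSON value with sorted keys and no whitespace."""
    if isinstance(value, dict):
        members = (encode_string(k) + ":" + canonicalise(value[k]) for k in sorted(value))
        return "{" + ",".join(members) + "}"
    if isinstance(value, list):
        return "[" + ",".join(canonicalise(v) for v in value) + "]"
    if isinstance(value, str):
        return encode_string(value)
    # Assumption: other JSON types are not covered by the rules above.
    return json.dumps(value)
```

Sorting with Python's built-in sorted compares keys by code point, which matches lexicographical order for keys made of lower-case letters and hyphens.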

Metadata

Within the system region, there should be the following keys and items. Examples are given in the TSV format returned by orc for the country Register, e.g. by running orc ls country system <key>.

  • name: JSON object with key "name" => value is name of the Register. e.g. system name 2017-07-17T10:59:47Z {"name":"country"}

  • custodian: JSON object with key "custodian" => value is the name of the custodian. e.g. system custodian 2017-11-02T11:18:00Z {"custodian":"David de Silva"}

  • register:<name> where <name> is the name of the Register: JSON object with:

    • key "fields" => value is an array of field names. The array order determines the column order on the GOV.UK Registers service (and Registers.app).
    • key "register" => value is name of the Register (again)
    • key "registry" => value is the organisation that owns the Register. For GOV.UK Registers this is reasonably clear because it's the organisation that has legislative or policy authority. (For Registers.app Registers what this means is less clear. At the moment, each user has a "default collection" which we use to populate this field, and we display this in the URL and above the title of the Register. For non-Gov Registers, we have tended to use either "register-dynamics" for Registers we've made or just the name of the person (e.g. "peter-k-wells"). We don't think this is a particularly helpful thing for Registers to contain outside of Gov, so we may drop it.)
    • key "text" => value is a string which talks about the Register (like its README). For Gov Registers these tend to be quite short; we'd prefer them to be longer. There's a ticket #243 to render Markdown from this field.
    • key "phase" => value is alpha, beta etc. depending on the status of the Register on GOV.UK. (Registers.app ignores this value and doesn't write it.) e.g. system register:country 2016-08-04T14:45:41Z {"fields":["country","name","official-name","citizen-names","start-date","end-date"],"phase":"beta","register":"country","registry":"foreign-commonwealth-office","text":"British English-language names and descriptive terms for countries"}
  • field:<name> where <name> is one of the field names from register:<name>. JSON object with:

    • key "cardinality" => value "1" if the field is single-valued, or "n" if it is multi-valued.
    • key "datatype" => value is the datatype of the field, so one of string, integer, curie, datetime, timestamp, period, url, text. See the spec document for how these are structured.
    • key "field" => value is the name of the field.
    • key "text" => value is a human-readable description string of the field. (We display this in the column header on Registers.app.) e.g. system field:name 2017-01-10T17:16:07Z {"cardinality":"1","datatype":"string","field":"name","phase":"beta","text":"The commonly-used name of a record."}
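
The field:<name> rules above can be checked with a short sketch like the following. The function name and the problem messages are hypothetical, and treating all four listed keys as required is an assumption. Extra keys, such as the "phase" seen in the example, are ignored.

```python
# Allowed values as listed in the description above.
DATATYPES = {"string", "integer", "curie", "datetime", "timestamp", "period", "url", "text"}
CARDINALITIES = {"1", "n"}


def check_field_item(key: str, item: dict) -> list:
    """Return a list of problems found in one field:<name> metadata item."""
    problems = []
    prefix, _, name = key.partition(":")
    if prefix != "field" or not name:
        return ["key %r is not of the form field:<name>" % key]
    if item.get("field") != name:
        problems.append('"field" does not match the name in the key')
    if item.get("cardinality") not in CARDINALITIES:
        problems.append('"cardinality" must be "1" or "n"')
    if item.get("datatype") not in DATATYPES:
        problems.append('"datatype" is not one of the listed datatypes')
    if not isinstance(item.get("text"), str):
        problems.append('"text" must be a string')
    return problems
```

An empty list means the item passed every check in this sketch.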

There is the additional restriction that the name of the Register must be one of the keys in the item e.g. the Country register must have a country field. (We currently honour this in Registers.app but may drop it, as there are a lot of real use-cases where this is not a helpful column name.)
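
As a rough sketch of how the register:<name> entry and this restriction could be checked together, the code below parses one line of system-region output and reports any problems. It assumes the four columns (region, key, timestamp, item) are separated by tab characters; the function names are hypothetical and not part of orc.

```python
import json


def parse_system_line(line: str) -> dict:
    """Split one tab-separated line into region, key, timestamp and decoded item."""
    region, key, timestamp, item = line.rstrip("\n").split("\t", 3)
    return {"region": region, "key": key, "timestamp": timestamp, "item": json.loads(item)}


def check_register_entry(entry: dict) -> list:
    """Return a list of problems found in a register:<name> entry."""
    prefix, _, name = entry["key"].partition(":")
    if prefix != "register" or not name:
        return ["key %r is not of the form register:<name>" % entry["key"]]
    problems = []
    item = entry["item"]
    if item.get("register") != name:
        problems.append('"register" does not match the name in the key')
    fields = item.get("fields")
    if not isinstance(fields, list):
        problems.append('"fields" must be an array of field names')
    elif name not in fields:
        # The additional restriction: the Register's name must be one of its fields.
        problems.append("register name %r is not one of its fields" % name)
    return problems
```

A tool following Registers.app's possible future behaviour would drop the final check and keep the rest.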
