Dataset Sites

Dataset Sites help data users find your data and contact you if they find any issues. They are both human- and machine-readable, and allow your data to appear everywhere from the OpenActive Status Page to Google Dataset Search.

Example dataset sites: GLL, Fusion Lifestyle

What do I need?

  • If you are creating just one Dataset Site for your organisation, then jump to the Dataset Site Generator.

  • If you are a large booking system generating a Dataset Site for each of your customers, jump to the Dataset Site Template.

What is a Dataset Site?

  • A web page that can be referenced when discussing the dataset.

  • A human and machine readable licence associated with the data (the Dataset Site contains invisible metadata which allows its details to be read automatically).

  • A human and machine readable rights statement to specify how dataset users (innovators who want to build on top of/use your data) should attribute your data.

  • An accessible "single point of truth" that explains where the data can be found.

  • Links to documentation relating to the format of the data, including the specifications it follows, and the data fields it contains.

  • A place where the community can contribute with comments, and raise issues - all Dataset Sites are linked to a GitHub issues board (e.g. this one) that allows data users to raise issues in the open.

Option 1: Dataset Site Generator: Create one Dataset Site

The Dataset Site Generator and associated guides very quickly create a minimal Dataset Site covering all of the criteria above, using freely available, open source tools. A generated site contains features sufficient for publishing a single dataset, which in most cases is enough for initial publishing of data relating to OpenActive.

Please find out more at the Dataset Site Generator instructions page.

Option 2: Dataset Site Template: Build Dataset Sites into your system

The Dataset Site Template is very easy to use and quick to apply - it's basically a single mustache template and associated JSON structure. It is designed to work with minimal effort with an extremely wide range of platforms and languages.

The Dataset Site Template repository contains a mustache template for creating an OpenActive Dataset Site. The Dataset Site it produces is slightly more advanced than those generated by the Dataset Site Generator.

It is designed to be embedded in a booking system that outputs open data feeds for each customer, and allows the booking system to easily generate a dataset site for each customer.

Getting Started

The Dataset Site Template is a single self-contained mustache template of an HTML page that contains embedded CSS, an embedded encoded image, and references to CDNs of Font Awesome and Google Fonts. It works across all browsers, and includes fully compliant DCAT and schema.org machine-readable metadata to ensure it is compatible with Google Dataset Search.

Steps to render the template:

  1. Construct the JSON-LD found in example.json based on your customers' own properties.

  2. Find a mustache library for your platform or language.

  3. Write code to do the following:

    • Stringify the input JSON, and place the contents of the string within the "json" property at the root of the JSON itself (i.e. serialised JSON embedded in the original deserialised object).

    • Use the resulting JSON with the mustache template to render the dataset site.
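The render steps above can be sketched in JavaScript as follows. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the sample properties are placeholders (the real set is defined in example.json), and `render` is assumed to be any mustache implementation with the shape `(template, data) => string`, such as `Mustache.render` from mustache.js.

```javascript
// Step 3a: serialise the JSON-LD and embed the string in a "json"
// property at the root of a copy of the original object.
// (prepareDatasetSiteData is a hypothetical helper name.)
function prepareDatasetSiteData(jsonLd) {
  // Serialise before adding "json" so the string does not contain itself.
  const serialised = JSON.stringify(jsonLd, null, 2);
  // Return a copy so the caller's object is left unchanged.
  return { ...jsonLd, json: serialised };
}

// Step 3b: pass the resulting data to a mustache renderer.
// `render` is an assumption: any function (template, data) => string.
function renderDatasetSite(template, jsonLd, render) {
  return render(template, prepareDatasetSiteData(jsonLd));
}

// Placeholder input: a real dataset site uses the full structure
// from example.json, populated with the customer's own values.
const sampleJsonLd = {
  name: "Example Leisure Trust Sessions and Facilities",
  discussionUrl: "https://github.com/example-org/example-leisure-trust/issues",
  documentation: "https://example.com/openactive-docs",
};

const sampleData = prepareDatasetSiteData(sampleJsonLd);
console.log(Object.keys(sampleData));
```

In a server-side handler, the template file would be read once at startup and `renderDatasetSite` called per customer with that customer's JSON-LD.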

Personalising the Dataset Site

The Dataset Site Template is designed to carry the customer's brand with minimal configuration.

See the settings specified here for an example of how a minimal number of configurable properties can be used to generate the whole dataset site in a way that is personalised to each customer.

We suggest giving the customer a means of customising the logo and background image, as these have the largest effect on the brand feel of the page. One option is to let them upload an image to the cloudinary.com CDN using its widget, which is free at low volume.

Although the customer will likely be able to fill in most properties specific to them, there are two where they will require guidance:

  • discussionUrl - you will need to create a new GitHub repository for each customer, and copy the URL of its Issues board into the property value. We recommend that you create each repository within your own GitHub organisation either manually or via an API call. If you "follow" these repositories using a new GitHub account created with your support e-mail address then you will receive notifications for each query, and be able to reply via e-mail to the notifications from your support e-mail address - these replies then appear directly in GitHub. Note that any administrator accounts automatically follow newly created GitHub repositories within your organisation.

  • documentation - as a booking system you should provide at least a single page on your website that explains the OpenActive feeds. Each customer will have the option of providing their own documentation for their dataset site that links to this page, or of linking directly to your documentation.

Issues board creation

The discussionUrl is the URL of the GitHub issues board for that dataset site. There are two ways of creating a GitHub issues board.

Manually

A guide for creating a new GitHub repository for each customer can be found below:

Automatically

The GitHub API provides a mechanism to automatically create GitHub repositories. The recommended properties for a new repository are included below:

{
  "name": "AshfordLeisureTrust",
  "description": "Issues relating to open data from Ashford Leisure Trust",
  "homepage": "https://ashfordleisuretrust.leisurecloud.net/OpenActive/",
  "private": false,
  "has_issues": true,
  "has_projects": false,
  "has_wiki": false,
  "auto_init": false
}
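As an illustration, the properties above could be sent to the GitHub REST API endpoint for creating an organisation repository (`POST /orgs/{org}/repos`). The sketch below assumes Node.js 18+ (for the global `fetch`) and an access token with permission to create repositories in your organisation. The function names, the `customer` object shape, and the `User-Agent` value are hypothetical.

```javascript
// Build the HTTP request for creating one customer's repository.
// `org` is your GitHub organisation; `token` is an access token (assumed
// to have permission to create repositories in that organisation).
function buildCreateRepoRequest(org, token, customer) {
  return {
    url: `https://api.github.com/orgs/${encodeURIComponent(org)}/repos`,
    options: {
      method: "POST",
      headers: {
        "Accept": "application/vnd.github+json",
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/json",
        "User-Agent": "dataset-site-example", // placeholder value
      },
      // The recommended repository properties from this section.
      body: JSON.stringify({
        name: customer.name,
        description: `Issues relating to open data from ${customer.displayName}`,
        homepage: customer.datasetSiteUrl,
        private: false,
        has_issues: true,
        has_projects: false,
        has_wiki: false,
        auto_init: false,
      }),
    },
  };
}

// Create the repository and return the URL to use as discussionUrl.
async function createIssuesBoard(org, token, customer) {
  const { url, options } = buildCreateRepoRequest(org, token, customer);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`GitHub API returned ${response.status}`);
  }
  const repo = await response.json();
  // html_url is the repository's web address; its Issues board is at /issues.
  return `${repo.html_url}/issues`;
}
```

Separating request construction from the network call keeps the payload easy to inspect and test without contacting GitHub.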

Providing a Data Catalog

For large booking systems, a Data Catalog must also be provided to allow the many Dataset Sites that are created to be easily indexed by the OpenActive Status Page and other data users.

A Data Catalog is very simply an array of the URLs of all your Dataset Sites (the dataset array), presented within a DataCatalog wrapper following a specific format. An example of a live Data Catalog from the Gladstone system can be found here.
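A rough sketch of assembling such a catalog is shown below. Only the `dataset` array and the `DataCatalog` type come from the description above. The `@context` and `@id` fields, the function name, and the URLs are assumptions for illustration, so check the live example for the exact set of required properties before relying on this shape.

```javascript
// Build a Data Catalog object from the URLs of all your Dataset Sites.
// (buildDataCatalog is a hypothetical helper name.)
function buildDataCatalog(catalogUrl, datasetSiteUrls) {
  return {
    "@context": "https://schema.org/", // assumption: schema.org vocabulary
    "@type": "DataCatalog",
    "@id": catalogUrl, // assumption: the catalog identifies itself by its URL
    // One entry per customer Dataset Site.
    dataset: [...datasetSiteUrls],
  };
}

// Placeholder URLs: in practice these come from your customer records.
const exampleCatalog = buildDataCatalog(
  "https://example.com/openactive/data-catalog.json",
  [
    "https://customer-a.example.com/OpenActive/",
    "https://customer-b.example.com/OpenActive/",
  ]
);

console.log(JSON.stringify(exampleCatalog, null, 2));
```

Generating the catalog from the same customer list that drives the Dataset Sites helps keep the two in step as customers are added or removed.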

.NET Example

This simple console app demonstrates the Dataset Site Template render steps outlined above using OpenActive.NET.

JavaScript Example

This JSFiddle demonstrates the Dataset Site Template render steps outlined above using plain JavaScript.

Please note this is only an example to demonstrate the logic and is not intended for production. The mustache template must be rendered server-side as one of its primary purposes is SEO.
