Apply for QTS in England - Setting up analytics
1. Configure dfe-analytics
Follow the instructions in the README, and set it up: https://github.com/DFE-Digital/apply-for-qualified-teacher-status/pull/99
2. Get a BigQuery project setup and add initial owners
Ask in Slack on the #twd_data_insights
channel for someone to help you
procure a BigQuery instance in the digital.education.gov.uk
Google Cloud
Organisation.
Ask for your @digital.education.gov.uk
Google account to be setup as an owner
via the IAM and Admin settings. Add other team members as necessary.
Set up billing
You also need to set up your BigQuery instance with paid billing. This is
because dfe-analytics
uses streaming, and streaming isn't allowed in the free
tier:
accessDenied: Access Denied: BigQuery BigQuery: Streaming insert is not allowed
in the free tier
3. Create a data set and table
Update: The dfe_analytics terraform module now automates these steps.
You should create separate data sets for each environment (dev/preprod/prod).
- Select the BigQuery instance
- Go to the Analysis -> SQL Workspace section
- Tap on the 3 dots next to the project name, "Create data set"
- Name it
events_ENVIRONMENT
, such asevents_local
for local development testing, and set the location toeurope-west2 (London)
- Select your new
events_local
data set - Create a table
- Name it
events
- Set the schema to match the one below (including the nested fields inside
request_query
anddata
) - Set Partitioning to
occurred_at
- Set Partitioning type to
By day
- Set Clustering order to
event_type
- Click on "Create table"
Tip: You can copy this empty table between environments to save time and not have to do the last few steps over and over.
Schema
Field name | Type | Mode |
---|---|---|
occurred_at | TIMESTAMP | REQUIRED |
event_type | STRING | REQUIRED |
environment | STRING | REQUIRED |
namespace | STRING | NULLABLE |
user_id | STRING | NULLABLE |
request_uuid | STRING | NULLABLE |
request_method | STRING | NULLABLE |
request_path | STRING | NULLABLE |
request_user_agent | STRING | NULLABLE |
request_referer | STRING | NULLABLE |
request_query | RECORD | REPEATED |
request_query.key | STRING | REQUIRED |
request_query.value | STRING | REPEATED |
response_content_type | STRING | NULLABLE |
response_status | STRING | NULLABLE |
data | RECORD | REPEATED |
data.key | STRING | REQUIRED |
data.value | STRING | REPEATED |
entity_table_name | STRING | NULLABLE |
event_tags | STRING | REPEATED |
anonymised_user_agent_and_ip | STRING | NULLABLE |
If you edit as text, you can paste this:
[
{
"name": "occurred_at",
"type": "TIMESTAMP",
"mode": "REQUIRED"
},
{
"name": "event_type",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "environment",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "namespace",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "user_id",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "request_uuid",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "request_method",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "request_path",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "request_user_agent",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "request_referer",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "request_query",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "key",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "value",
"type": "STRING",
"mode": "REPEATED"
}
]
},
{
"name": "response_content_type",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "response_status",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "data",
"type": "RECORD",
"mode": "REPEATED",
"fields": [
{
"name": "key",
"type": "STRING",
"mode": "REQUIRED"
},
{
"name": "value",
"type": "STRING",
"mode": "REPEATED"
}
]
},
{
"name": "entity_table_name",
"type": "STRING",
"mode": "NULLABLE"
},
{
"name": "event_tags",
"type": "STRING",
"mode": "REPEATED"
},
{
"name": "anonymised_user_agent_and_ip",
"type": "STRING",
"mode": "NULLABLE"
}
]
4. Create custom roles
Update: The dfe_analytics terraform module now automates these steps.
- Go to IAM and Admin settings > Roles
- Click on "+ Create role"
- Create the 3 roles outlined below
Analyst
Field | Value |
---|---|
Title | BigQuery Analyst Custom |
Description | Assigned to accounts used by performance analysts. |
ID | bigquery_analyst_custom |
Role launch stage | General Availability |
+ Add permissions | See below |
Developer
Field | Value |
---|---|
Title | BigQuery Developer Custom |
Description | Assigned to accounts used by developers. |
ID | bigquery_developer_custom |
Role launch stage | General Availability |
+ Add permissions | See below |
Appender
Field | Value |
---|---|
Title | BigQuery Appender Custom |
Description | Assigned to accounts used by appenders (apps and scripts). |
ID | bigquery_appender_custom |
Role launch stage | General Availability |
+ Add permissions | See below |
5. Create an appender service account
Update: The dfe_analytics terraform module now automates these steps.
- Go to IAM and Admin settings > Create service account
- Name it like "Appender NAME_OF_SERVICE ENVIRONMENT", so "Appender Apply Local"
- Add a description, like "Used when developing locally."
- Grant the service account access to the project, use the "BigQuery Appender Custom" role you set up earlier
6. Get an API JSON key :key:
Update: The dfe_analytics terraform module now automates these steps.
- Access the service account you previously setup
- Go to the keys tab, click on "Add key > Create new key"
- Create a JSON private key
The full contents of this JSON file is your BIGQUERY_API_JSON_KEY
.
6. Set up environment variables
Update: The dfe_analytics terraform module now automates these steps.
Putting the previous things together, to finish setting up dfe-analytics
, you
need these environment variables:
BIGQUERY_TABLE_NAME=events
BIGQUERY_PROJECT_ID=apply-for-qts-in-england
BIGQUERY_DATASET=events_local
BIGQUERY_API_JSON_KEY=<contents of the JSON, make sure to strip or escape newlines>
7. Configure dfe-analytics-dataform
Follow the instructions in the README, and set it up: https://github.com/DFE-Digital/dfe-analytics-dataform#how-to-install