Integration schemas can be automatically updated using the JSON schema files in tap repositories.
What is automated?
Updates to integration schemas are only partially automated; some elements still need to be updated manually.
In the integration folders under _data/taps/schemas, only the contents of the json folder are updated automatically. Any updates to the table details (primary key, replication key, replication method, etc.) have to be made manually.
If a new table is added, the new JSON file is added and a new entry is created in the <integration>-<version>-tables.yml file, but the details have to be filled out manually.
If an existing table is no longer found in the repository, the element status: not_found is added in the <integration>-<version>-tables.yml file to unpublish it.
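The rules above for new and missing tables can be sketched in Python. This is only an illustration, not the script used by the GitHub Actions: the function name reconcile_tables and the name key on each table entry are assumptions, and the entries are shown as plain dictionaries rather than being read from the YAML file.

```python
def reconcile_tables(existing_entries, schema_table_names):
    """Return updated table entries, given the table names found in the tap repository.

    existing_entries: list of dicts standing in for the entries of the tables YAML
    file (the "name" key is a hypothetical field name).
    schema_table_names: names of the tables that currently have a JSON schema file.
    """
    found = set(schema_table_names)
    known = {entry["name"] for entry in existing_entries}

    updated = []
    for entry in existing_entries:
        entry = dict(entry)  # copy, so the caller's data is left untouched
        if entry["name"] not in found:
            # Table no longer exists in the repository: flag it to unpublish it.
            entry["status"] = "not_found"
        updated.append(entry)

    # New tables get a bare entry; their details still have to be filled out manually.
    for name in sorted(found - known):
        updated.append({"name": name})

    return updated


entries = [{"name": "ads"}, {"name": "campaigns"}]
print(reconcile_tables(entries, ["ads", "adsets"]))
```

In this sketch, existing details are never modified, which matches the fact that table details are maintained by hand.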
How does it work?
There are two different GitHub Actions for two use cases: integrations with JSON schemas in their repositories, and integrations without JSON schemas.
Integrations with JSON schemas
For integrations with JSON schemas, the GitHub Action is Import JSON schemas. It can be launched by anyone who has access to it and it takes two parameters:
- The name of the Singer repository, tap-facebook for example. This parameter is required.
- The name of the branch to use on the Singer repository. This is optional; if no value is provided, the default branch (master or main) is used.
Once it is launched, the job performs a series of actions:
- Creates a new branch on the docs repository.
- Retrieves the JSON schema files from the specified tap repository.
- Formats the JSON files to make them compatible with the HTML templates used to display them.
- Updates the YAML file containing table details (adds new tables, marks missing tables as not found).
- Checks that the primary keys, replication keys, and foreign keys listed are still found in the schema. If any issues are found, a text file with the list of errors will be added to the integration version folder.
- Commits and pushes the changes.
- Creates a pull request from the new branch to the master branch on the docs repository.
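As an illustration of the key check in the list above, here is a minimal Python sketch. It assumes that each table entry lists its keys under the hypothetical field names primary_keys, replication_keys, and foreign_keys, and that a key counts as found when it appears in the top-level properties of the table's JSON schema. The actual script may use different field names and may also handle nested fields.

```python
def find_missing_keys(table_entry, schema):
    """Return one message per key that is listed for a table but absent from its schema.

    table_entry: dict standing in for one entry of the tables YAML file
    (field names are hypothetical).
    schema: the table's JSON schema, loaded as a dict.
    """
    properties = schema.get("properties", {})
    issues = []
    for field in ("primary_keys", "replication_keys", "foreign_keys"):
        for key in table_entry.get(field, []):
            if key not in properties:
                issues.append(
                    f"{table_entry['name']}: {field} '{key}' not found in schema"
                )
    return issues


entry = {"name": "ads", "primary_keys": ["id"], "replication_keys": ["updated_time"]}
schema = {"type": "object", "properties": {"id": {"type": "string"}}}
for issue in find_missing_keys(entry, schema):
    print(issue)
```

The returned messages correspond to the kind of content that would go into the text file listing the errors.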
Integrations without JSON schemas
For integrations without JSON schemas, the GitHub Action is Get JSON schemas from catalog JSON file. This one is a bit more complex: it can be launched by anyone who has access to it, but it requires an input file that must be generated by a developer.
Before launching the job, someone needs to:
- Run the integration in discovery mode to generate a catalog JSON file (this needs to be done by a developer).
- Create a new branch on the docs repository and add the catalog file in the script/json folder.
- Commit and push the new file on the new branch.
Once this is done, the GitHub Action can be launched with the following parameters:
- The name of the Singer repository, tap-facebook for example. This parameter is required.
- The major version of the tap, 2 for example.
- The name of the branch created to commit the catalog file.
- The name of the catalog JSON file added, catalog.json for example.
- The check box to indicate whether a pull request should be created.
Once it is launched, the job performs a series of actions:
- Splits the catalog file into separate JSON files for each table and adds them in the correct folder, based on the tap name and version.
- Formats the JSON files to make them compatible with the HTML templates used to display them.
- Updates the YAML file containing table details (adds new tables, marks missing tables as not found).
- Checks that the primary keys, replication keys, and foreign keys listed are still found in the schema. If any issues are found, a text file with the list of errors will be added to the integration version folder.
- Deletes the catalog file.
- Commits and pushes the changes.
- If the check box is selected, creates a pull request from your branch to the master branch on the docs repository. Otherwise, you can create your own pull request.
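The first step of this job, splitting the catalog file, can be sketched as follows. This is a rough illustration rather than the actual script. It relies on the Singer catalog layout, where a top-level streams list holds one object per table with stream, tap_stream_id, and schema fields. The function name split_catalog and the output folder passed to it are assumptions.

```python
import json
from pathlib import Path


def split_catalog(catalog, output_dir):
    """Write one JSON schema file per stream in a Singer catalog and return the paths.

    catalog: the catalog JSON file, loaded as a dict.
    output_dir: destination folder (hypothetical; the real job derives it from
    the tap name and version).
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for stream in catalog.get("streams", []):
        # Prefer the stream name; fall back to the stream ID if it is missing.
        name = stream.get("stream") or stream["tap_stream_id"]
        path = output_dir / f"{name}.json"
        path.write_text(json.dumps(stream["schema"], indent=2) + "\n", encoding="utf-8")
        written.append(path)
    return written
```

For example, calling split_catalog(json.loads(Path("catalog.json").read_text()), "out") on a catalog with two streams would produce two files in the out folder, one per table.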
What do I need to do after the job runs?
Once the pull request is created, you need to review it. Here are some things to do:
- If a new table was added, fill out the table details.
- If a table was marked as not found, check that it was also removed from the tap repository. If so, remove it completely. If the table was supposed to be found, there may be an issue in the script.
- If an <integration>-<version>-issues.txt file was added, check and fix the issues found.
- Check that the changes in the updated JSON files match the changes on the tap repository.
Once the changes are ready to go to production and have been validated by someone from the sources team and someone from the documentation team, you can merge the pull request to publish the updates.
Last updated: 13 March 2024