Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add data products for snowplow-cli (#1101)
* Add snowplow-cli documentation for data products
* Add data products cli instructions
* Apply suggestions from code review
  Co-authored-by: Costas Kotsokalis <[email protected]>
* One more missed spec -> specification
* Remove github annnotate on publish
* Even more renames
* Amend to new sidebar
* Add the manage data products in CLI section
* Fix typo
---------
Co-authored-by: Costas Kotsokalis <[email protected]>
1 parent 8e8b7b5, commit 8035f87
Showing 9 changed files with 933 additions and 75 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,100 @@ | ||
--- | ||
title: "Managing data products via the CLI" | ||
description: "Use the 'snowplow-cli data-products' command to manage your data products." | ||
sidebar_label: "Using the CLI" | ||
sidebar_position: 999 | ||
--- | ||
```mdx-code-block | ||
import Tabs from '@theme/Tabs'; | ||
import TabItem from '@theme/TabItem'; | ||
``` | ||
The `data-products` subcommand of [Snowplow CLI](/docs/data-product-studio/snowplow-cli/index.md) provides commands that help you integrate data product management into custom development and publishing workflows. | ||
## Snowplow CLI Prerequisites | ||
You need [Snowplow CLI](/docs/data-product-studio/snowplow-cli/index.md) installed and configured. | ||
## Available commands | ||
### Creating a data product | ||
```bash | ||
./snowplow-cli dp generate --data-product my-data-product | ||
``` | ||
This command creates a minimal data product template in a new file `./data-products/my-data-product.yaml`. | ||
### Creating a source application | ||
```bash | ||
./snowplow-cli dp generate --source-app my-source-app | ||
``` | ||
This command creates a minimal source application template in a new file `./data-products/source-apps/my-source-app.yaml`. | ||
### Creating an event specification | ||
To create an event specification, modify an existing data product file and add an event specification object to it. Here's a minimal example: | ||
```yaml title="./data-products/test-cli.yaml" | ||
apiVersion: v1 | ||
resourceType: data-product | ||
resourceName: 3d3059c4-d29b-4979-a973-43f7070e1dd0 | ||
data: | ||
name: test-cli | ||
sourceApplications: [] | ||
eventSpecifications: | ||
- resourceName: 11d881cd-316e-4286-b5d4-fe7aebf56fca | ||
name: test | ||
event: | ||
source: iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0 | ||
``` | ||
:::caution Warning | ||
The `source` fields of events and entities must refer to a deployed data structure. Referring to a locally created data structure is not yet supported. | ||
::: | ||
### Linking a data product to a source application | ||
To link a data product to a source application, provide a list of references to the source application files in the `data.sourceApplications` field. Here's an example: | ||
```yaml title="./data-products/test-cli.yaml" | ||
apiVersion: v1 | ||
resourceType: data-product | ||
resourceName: 3d3059c4-d29b-4979-a973-43f7070e1dd0 | ||
data: | ||
name: test-cli | ||
sourceApplications: | ||
- $ref: ./source-apps/my-source-app.yaml | ||
``` | ||
### Modifying an event specification's source applications | ||
By default, event specifications inherit all the source applications of the data product. To customise this, use the `excludedSourceApplications` field of an event specification to remove a given source application from it. | ||
```yaml title="./data-products/test-cli.yaml" | ||
apiVersion: v1 | ||
resourceType: data-product | ||
resourceName: 3d3059c4-d29b-4979-a973-43f7070e1dd0 | ||
data: | ||
name: test-cli | ||
sourceApplications: | ||
- $ref: ./source-apps/generic.yaml | ||
- $ref: ./source-apps/specific.yaml | ||
eventSpecifications: | ||
- resourceName: 11d881cd-316e-4286-b5d4-fe7aebf56fca | ||
name: All source apps | ||
event: | ||
source: iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0 | ||
- resourceName: b9c994a0-03b2-479c-b1cf-7d25c3adc572 | ||
name: Not quite everything | ||
excludedSourceApplications: | ||
- $ref: ./source-apps/specific.yaml | ||
event: | ||
source: iglu:com.snowplowanalytics.snowplow/button_click/jsonschema/1-0-0 | ||
``` | ||
In this example, the event specification `All source apps` is related to both the `generic` and `specific` source applications, while `Not quite everything` is related only to the `generic` source application. | ||
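The inheritance rule can be illustrated with a small shell sketch. It is not how Snowplow CLI resolves references: the function name `effective_source_apps` is hypothetical, and plain names stand in for the `$ref` paths.

```bash
# effective_source_apps INHERITED EXCLUDED
# Hypothetical helper: both arguments are space-separated lists of source
# application names. Prints the inherited names that are not excluded.
effective_source_apps() {
  inherited=$1
  excluded=$2
  result=""
  for app in $inherited; do
    keep=1
    for ex in $excluded; do
      if [ "$app" = "$ex" ]; then
        keep=0
      fi
    done
    if [ "$keep" -eq 1 ]; then
      # Append with a separating space only when result is non-empty.
      result="${result:+$result }$app"
    fi
  done
  printf '%s\n' "$result"
}

effective_source_apps "generic specific" ""          # "All source apps"
effective_source_apps "generic specific" "specific"  # "Not quite everything"
```

The first call prints `generic specific` and the second prints `generic`, matching the two event specifications in the example.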
### Downloading data products, event specifications and source apps | ||
```bash | ||
./snowplow-cli dp download | ||
``` | ||
This command retrieves all of your organization's data products, event specifications, and source applications. By default, it creates a folder named `data-products` in your current working directory. You can specify a different folder name as an argument if needed. | ||
The command creates the following structure: | ||
- A main `data-products` folder containing your data product files | ||
- A `source-apps` subfolder containing source application definitions | ||
- Event specifications embedded within their related data product files | ||
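To check what a download produced, a sketch along these lines lists the files by group. The default folder name comes from the text above. The function name `list_downloaded_files` and the `.yaml`-only filter are assumptions, so adjust the filter if you use JSON output.

```bash
# list_downloaded_files [DIR]
# Hypothetical helper: prints data product files found directly in DIR and
# source application files found under DIR/source-apps.
list_downloaded_files() {
  dir=${1:-data-products}
  echo "Data products:"
  find "$dir" -maxdepth 1 -type f -name '*.yaml' | sort
  echo "Source applications:"
  if [ -d "$dir/source-apps" ]; then
    find "$dir/source-apps" -type f -name '*.yaml' | sort
  fi
}
# Usage: list_downloaded_files             (inspects ./data-products)
#        list_downloaded_files my-folder   (inspects a custom folder)
```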
### Validating data products, event specifications and source applications | ||
```bash | ||
./snowplow-cli dp validate | ||
``` | ||
This command scans all files under `./data-products` and validates them using BDP Console. It checks: | ||
1. Whether each file is in a valid format (YAML/JSON) with correctly formatted fields | ||
2. Whether all source application references in the data product files are valid | ||
3. Whether event specification rules are compatible with their schemas | ||
If validation fails, the command displays the errors in the console and exits with status code 1. | ||
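Because a failed validation exits with a non-zero status, the command can gate a script or CI step. The wrapper below is a minimal sketch: `run_gate` is a hypothetical name, and it takes the command to run as its arguments so it can be tried with any command.

```bash
# run_gate CMD [ARGS...]
# Hypothetical wrapper: runs the given command, reports the outcome, and
# returns the command's exit status on failure.
run_gate() {
  if "$@"; then
    echo "validation passed"
  else
    status=$?
    echo "validation failed with exit status $status" >&2
    return "$status"
  fi
}
# Usage in a CI step:
#   run_gate ./snowplow-cli dp validate || exit 1
```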
### Publishing data products, event specifications and source applications | ||
```bash | ||
./snowplow-cli dp publish | ||
``` | ||
This command locates all files under `./data-products`, validates them, and publishes them to BDP Console. | ||
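In a git-based workflow, one possible policy is to publish only from the main branch and validate everywhere else. The sketch below assumes a branch named `main`, and the helper `choose_dp_command` is hypothetical. Neither is prescribed by Snowplow CLI.

```bash
# choose_dp_command BRANCH
# Hypothetical helper: prints the `dp` subcommand to run for a branch.
choose_dp_command() {
  case "$1" in
    main) echo "publish" ;;
    *)    echo "validate" ;;
  esac
}
# Usage, with BRANCH set by your CI system:
#   ./snowplow-cli dp "$(choose_dp_command "$BRANCH")"
```

Since `publish` validates files before publishing, this policy does not need a separate validation step on the main branch.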
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,84 @@ | ||
--- | ||
title: Snowplow CLI | ||
sidebar_label: Snowplow CLI | ||
sidebar_position: 7 | ||
--- | ||
import Tabs from '@theme/Tabs'; | ||
import TabItem from '@theme/TabItem'; | ||
|
||
`snowplow-cli` brings data management elements of Snowplow Console into the command line. It allows you to download your data structures and data products to YAML/JSON files and publish them back to Console. This enables git-ops-like workflows, with reviews and branching. | ||
|
||
# Install | ||
|
||
Snowplow CLI can be installed with [Homebrew](https://brew.sh/): | ||
``` | ||
brew install snowplow-product/taps/snowplow-cli | ||
``` | ||
|
||
Verify the installation with: | ||
``` | ||
snowplow-cli --help | ||
``` | ||
|
||
For systems where Homebrew is not available, binaries for multiple platforms can be found in [releases](https://github.com/snowplow-product/snowplow-cli/releases). | ||
|
||
Example installation for `linux_x86_64` using `curl`: | ||
|
||
```bash | ||
curl -L -o snowplow-cli https://github.com/snowplow-product/snowplow-cli/releases/latest/download/snowplow-cli_linux_x86_64 | ||
chmod u+x snowplow-cli | ||
``` | ||
|
||
Verify the installation with: | ||
``` | ||
./snowplow-cli --help | ||
``` | ||
|
||
# Configure | ||
|
||
You will need three values. | ||
|
||
The first two are an API key ID and the corresponding API key (secret), which are generated from the [credentials section](https://console.snowplowanalytics.com/credentials) in BDP Console. | ||
|
||
The third is the organization ID, which appears in the URL immediately after `.com` when you visit BDP Console: | ||
|
||
![](./images/orgID.png) | ||
|
||
Snowplow CLI can take its configuration from a variety of sources. More details are available from `./snowplow-cli data-structures --help`. Variations on these three examples should serve most cases. | ||
|
||
<Tabs groupId="config"> | ||
<TabItem value="env" label="env variables" default> | ||
|
||
```bash | ||
SNOWPLOW_CONSOLE_API_KEY_ID=********-****-****-****-************ | ||
SNOWPLOW_CONSOLE_API_KEY=********-****-****-****-************ | ||
SNOWPLOW_CONSOLE_ORG_ID=********-****-****-****-************ | ||
``` | ||
|
||
</TabItem> | ||
<TabItem value="defaultconfig" label="$HOME/.config/snowplow/snowplow.yml" > | ||
|
||
```yaml | ||
console: | ||
api-key-id: ********-****-****-****-************ | ||
api-key: ********-****-****-****-************ | ||
org-id: ********-****-****-****-************ | ||
``` | ||
</TabItem> | ||
<TabItem value="args" label="inline arguments" > | ||
```bash | ||
./snowplow-cli data-structures --api-key-id ********-****-****-****-************ --api-key ********-****-****-****-************ --org-id ********-****-****-****-************ | ||
``` | ||
|
||
</TabItem> | ||
</Tabs> | ||
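When using environment variables, it can help to confirm that all three are set before calling the CLI. The check below is a minimal sketch. The variable names come from the example above, while the function name `check_snowplow_env` is hypothetical.

```bash
# check_snowplow_env
# Hypothetical helper: returns 0 when all three variables are non-empty,
# otherwise names each missing variable on stderr and returns 1.
check_snowplow_env() {
  missing=0
  for name in SNOWPLOW_CONSOLE_API_KEY_ID SNOWPLOW_CONSOLE_API_KEY SNOWPLOW_CONSOLE_ORG_ID; do
    # Indirect lookup of the variable called $name, empty if unset.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
# Usage: check_snowplow_env && ./snowplow-cli data-structures --help
```

Note that a variable must be exported (for example with `export SNOWPLOW_CONSOLE_ORG_ID=...`) to be visible to programs started from the shell.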
|
||
Snowplow CLI defaults to YAML format. To switch to JSON, either pass the `--output-format json` flag or set the `output-format: json` config value. This applies to all commands where the format matters, not only `generate`. | ||
|
||
|
||
# Use cases | ||
|
||
- [Manage your data structures with snowplow-cli](/docs/data-product-studio/data-structures/manage/cli/index.md) | ||
- [Set up a GitHub CI/CD pipeline to manage data structures and data products](/docs/resources/recipes-tutorials/recipe-data-structures-in-git/index.md) |