Skip to content

Commit

Permalink
Update _dq.md
Browse files Browse the repository at this point in the history
  • Loading branch information
kyyberi committed Mar 17, 2024
1 parent 9cf6277 commit d15d140
Showing 1 changed file with 32 additions and 36 deletions.
68 changes: 32 additions & 36 deletions source/includes/_dq.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,56 +2,52 @@

The availability of the service/data. Use common SLA apprach to define percentage of guaranteed availability

## Elements and structure

| <div style="width:150px">Component name</div> | Type | Options | Description |
|---|---|---|---|
| extension | element | - | Binds together extension. This is used only in the example. This is part of unified method to add extensions to various data economy standards |
| $schema | valid URL | URL | URL to Schema that defines spec element content options. This is provided and maintained by vendor. |
| kind | attribute | string | Defines the class in which extension belongs to. Options: dataquality, access, pricing, sla, stakeholders, provider |
| vendor | attribute | string | Name of the vendor |
| spec | element | - | If extension contains EaC approach, this is the element in which vendor system specific "as code" specification is provided. |


## Montecarlo

> Example of Montecarlo extension usage
```yml
objectives:
- displayName: Availability
target: 0.98
ratioMetric:
counter: true
good:
metricSource:
type: Prometheus
metricSourceRef: prometheus-datasource
spec:
query: sum(localhost_server_requests{code=~"2xx|3xx",host="*",instance="127.0.0.1:9090"})
total:
metricSource:
type: Prometheus
metricSourceRef: prometheus-datasource
spec:
query: localhost_server_requests{code="total",host="*",instance="127.0.0.1:9090"}

extension:
$schema: URL
kind: dataquality
vendor: Montecarlo
spec:
field_health:
- table: project:dataset.table_name
timestamp_field: created
dimension_tracking:
- table: project:dataset.table_name
timestamp_field: created
field: order_status
```
| <div style="width:150px">Component name</div> | Type | Options | Description |
|---|---|---|---|
| **availability** | element | - | Binds together availability indicator description with objectives and monitoring. Follows OpenSLO standard model. |
| description | attribute | string | Short description to be used in displying more detailed information for consumers and operations staff. |
| monitoring | object | - | Binds together both monitoring and objectives (threshold values) structure |
| type | attribute | string | Defines the standard used in describing the monitoring object contant. Call also be vendor specific such as MonteCarlo and SodaCL. Details in type definition (link) |
| spec | object | - | Inside this object you write the type specified description or objectives and monitoring as code. |
| objectives | array | - | Define the objectives (threshold values) for expected quality of this indicator. |
## SodaCL
> Example of SodaCL extension usage
```yml

objectives:
- displayName: Completeness
target: 98
spec:
- for each column:
name: [member_id, gender, age_band]
checks:
- not null:
fail: when > 2% # Fail if more than 2% of records are null
extension:
$schema: URL
kind: dataquality
vendor: SodaCL
spec:
- for each column:
name: [member_id, gender, age_band]
checks:
- not null:
fail: when > 2% # Fail if more than 2% of records are null


```
Expand Down

0 comments on commit d15d140

Please sign in to comment.