great-docs.yml
# Great Docs Configuration for Pointblank
# See https://posit-dev.github.io/great-docs/user-guide/configuration.html
display_name: Pointblank
# Docstring Parser
parser: numpy
# Dynamic Introspection
dynamic: true
# Logo
logo:
  light: assets/pointblank_logo_small.svg
  dark: assets/pointblank_logo_small.svg
# Homepage
# Use a simple index.qmd as the landing page
homepage: index
hero:
  logo: assets/pointblank_logo.svg
  logo_height: 250px
# Scale-to-fit for Pointblank validation report tables
scale_to_fit:
  - "#pb_tbl"
# GitHub Integration
github_style: widget
# Source Links
source:
  enabled: true
  branch: main
# CLI Documentation (Click-based)
cli:
  enabled: true
  module: pointblank.cli
  name: cli
# Changelog
changelog:
  enabled: true
# Custom Sections
sections:
  - title: Demos
    dir: examples
    index: true
    index_columns: 2
  - title: Blog
    dir: blog
    type: blog
# Author Information
authors:
  - name: Rich Iannone
    role: Maintainer
    affiliation: Posit, PBC
    email: riannone@me.com
    github: rich-iannone
    orcid: 0000-0003-3925-190X
# Funding / Copyright Holder
funding:
  name: "Posit Software, PBC"
  roles:
    - Copyright holder
    - funder
  homepage: https://posit.co
  ror: https://ror.org/03wc8by49
# API Discovery
exclude:
  - config
  - assistant.MODEL_PROVIDERS
# API Reference Structure
# Organized to match the current quartodoc sections
reference:
  title: "API Reference"
  sections:
    - title: Validate
      desc: >
        When performing data validation, use the `Validate` class to get the process started.
        It takes the target table and options for metadata and failure thresholds (using the
        `Thresholds` class or shorthands). The `Validate` class has numerous methods for
        defining validation steps and for obtaining post-interrogation metrics and data.
      contents:
        - name: Validate
          members: false
        - Thresholds
        - name: Actions
          members: false
        - FinalActions
        - name: Schema
          members: false
        - name: DraftValidation
          members: false
    - title: Validation Steps
      desc: >
        Validation steps are sequential validations on the target data. Call Validate's
        validation methods to build up a validation plan: a collection of steps that provides
        good validation coverage.
      contents:
        - Validate.col_vals_gt
        - Validate.col_vals_lt
        - Validate.col_vals_ge
        - Validate.col_vals_le
        - Validate.col_vals_eq
        - Validate.col_vals_ne
        - Validate.col_vals_between
        - Validate.col_vals_outside
        - Validate.col_vals_in_set
        - Validate.col_vals_not_in_set
        - Validate.col_vals_increasing
        - Validate.col_vals_decreasing
        - Validate.col_vals_null
        - Validate.col_vals_not_null
        - Validate.col_vals_regex
        - Validate.col_vals_within_spec
        - Validate.col_vals_expr
        - Validate.col_exists
        - Validate.col_pct_null
        - Validate.rows_distinct
        - Validate.rows_complete
        - Validate.col_schema_match
        - Validate.row_count_match
        - Validate.col_count_match
        - Validate.data_freshness
        - Validate.tbl_match
        - Validate.conjointly
        - Validate.specially
        - Validate.prompt
    - title: Aggregation Steps
      desc: >
        These validation methods check aggregated column values (sums, averages, standard
        deviations) against fixed values or column references.
      contents:
        - Validate.col_sum_gt
        - Validate.col_sum_lt
        - Validate.col_sum_ge
        - Validate.col_sum_le
        - Validate.col_sum_eq
        - Validate.col_avg_gt
        - Validate.col_avg_lt
        - Validate.col_avg_ge
        - Validate.col_avg_le
        - Validate.col_avg_eq
        - Validate.col_sd_gt
        - Validate.col_sd_lt
        - Validate.col_sd_ge
        - Validate.col_sd_le
        - Validate.col_sd_eq
    - title: Column Selection
      desc: >
        Use the `col()` function along with column selection helpers to flexibly select columns
        for validation. Combine `col()` with `starts_with()`, `matches()`, etc. for selecting
        multiple target columns.
      contents:
        - col
        - starts_with
        - ends_with
        - contains
        - matches
        - everything
        - first_n
        - last_n
        - expr_col
    - title: Segment Groups
      desc: >
        Combine multiple values into a single segment using `seg_*()` helper functions.
      contents:
        - seg_group
    - title: Interrogation and Reporting
      desc: >
        The validation plan is executed when `interrogate()` is called. After interrogation,
        view validation reports, extract metrics, or split data based on results.
      contents:
        - Validate.interrogate
        - Validate.set_tbl
        - Validate.get_tabular_report
        - Validate.get_step_report
        - Validate.get_json_report
        - Validate.get_dataframe_report
        - Validate.get_sundered_data
        - Validate.get_data_extracts
        - Validate.all_passed
        - Validate.assert_passing
        - Validate.assert_below_threshold
        - Validate.above_threshold
        - Validate.n
        - Validate.n_passed
        - Validate.n_failed
        - Validate.f_passed
        - Validate.f_failed
        - Validate.warning
        - Validate.error
        - Validate.critical
    - title: Inspection and Assistance
      desc: >
        Functions for getting to grips with a new data table. Use DataScan for a quick
        overview, `preview()` for first/last rows, `col_summary_tbl()` for column summaries,
        and `missing_vals_tbl()` for missing value analysis.
      contents:
        - DataScan
        - preview
        - col_summary_tbl
        - missing_vals_tbl
        - load_dataset
        - get_data_path
        - connect_to_table
        - print_database_tables
    - title: Table Pre-checks
      desc: >
        Helper functions for use with the `active=` parameter of validation methods. These
        inspect the target table before a step runs and conditionally skip the step when
        preconditions are not met.
      contents:
        - has_columns
        - has_rows
    - title: YAML
      desc: >
        Functions for using YAML to orchestrate validation workflows.
      contents:
        - yaml_interrogate
        - validate_yaml
        - yaml_to_python
    - title: Utility Functions
      desc: >
        Functions for accessing metadata about the target data and managing configuration.
      contents:
        - get_column_count
        - get_row_count
        - get_action_metadata
        - get_validation_summary
        - write_file
        - read_file
        - ref
    - title: Test Data Generation
      desc: >
        Generate synthetic test data based on schema definitions. Use `generate_dataset()` to create
        data from a Schema object, or `schema_from_tbl()` to infer a generation-ready schema from an
        existing table (Polars, Pandas, or Ibis/DuckDB).
      contents:
        - generate_dataset
        - schema_from_tbl
        - int_field
        - float_field
        - string_field
        - bool_field
        - date_field
        - datetime_field
        - time_field
        - duration_field
        - profile_fields
    - title: Prebuilt Actions
      desc: >
        Prebuilt action functions for common notification patterns.
      contents:
        - send_slack_notification
        - emit_otel
    - title: Integrations
      desc: >
        Classes for integrating Pointblank with external observability and monitoring
        systems. Use `OTelExporter` to export validation results as OpenTelemetry
        metrics, traces, and logs.
      contents:
        - name: integrations.otel.OTelExporter
          members: true