[MAINTENANCE] Improvement to contributor documentation (#8043)
Co-authored-by: Ken Wade <ken@greatexpectations.io>
christian-bromann and kenwade4 authored Jul 11, 2023
1 parent 0905c8a commit e6e1f5b
Showing 11 changed files with 199 additions and 85 deletions.
2 changes: 0 additions & 2 deletions .gitignore
Expand Up @@ -41,8 +41,6 @@ wheels/
*.egg-info/
.installed.cfg
*.egg
.vscode/


# PyInstaller
# Usually these files are written by a python script from a template
Expand Down
46 changes: 25 additions & 21 deletions CONTRIBUTING_CODE.md
Expand Up @@ -67,7 +67,7 @@ Python dependencies are required to modify Great Expectations code, submit a new

1. Run the following command to create a virtual environment in your local repository using Python versions 3.8 to 3.11, activate the environment, and then install the necessary dependencies:

```python
```sh
python3 -m venv gx_dev
source gx_dev/bin/activate
Expand All @@ -76,6 +76,7 @@ Python dependencies are required to modify Great Expectations code, submit a new
pip install -c constraints-dev.txt -e ".[test]"
```

To specify other dependencies, add a comma after `test` and enter the dependency name. For example, "[test,postgresql,trino]". The supported dependencies include: `arrow`, `athena`, `aws_secrets`, `azure`, `azure_secrets`, `bigquery`, `dev`, `dremio`, `excel`, `gcp`, `hive`, `mssql`, `mysql`, `pagerduty`, `postgresql`, `redshift`, `s3`, `snowflake`, `spark`, `sqlalchemy`, `teradata`, `test`, `trino`, `vertica`.

2. Optional. If you're using Amazon Redshift, run the following command to install the `libpq-dev` package:
Expand All @@ -97,16 +98,17 @@ Python dependencies are required to modify Great Expectations code, submit a new
```
or
```sh
brew install unixodbc
brew install unixodbc
```
If your Mac computer has an Apple Silicon chip, you might need to
1. specify additional compiler or linker options. For example:
`export LDFLAGS="-L/opt/homebrew/Cellar/unixodbc/[your version]/lib"`
`export CPPFLAGS="-I/opt/homebrew/Cellar/unixodbc/[your version]/include"`
```sh
export LDFLAGS="-L/opt/homebrew/Cellar/unixodbc/[your version]/lib"
export CPPFLAGS="-I/opt/homebrew/Cellar/unixodbc/[your version]/include"
```
2. reinstall pyodbc:
Expand Down Expand Up @@ -134,25 +136,25 @@ A virtual environment allows you to create and test code without affecting the p
1. Run the following command to create a virtual environment named `great_expectations_dev`:
```python
python3 -m venv <path_to_environments_folder\>/great_expectations_dev
```sh
python3 -m venv <path_to_environments_folder\>/great_expectations_dev
```
2. Run the following command to activate the virtual environment:
```python
source <path_to_environments_folder\>/great_expectations_dev/bin/activate
```sh
source <path_to_environments_folder\>/great_expectations_dev/bin/activate
```
### Anaconda
1. Run the following command to create a virtual environment named `great_expectations_dev`:
```python
conda create --name great_expectations_dev
```sh
conda create --name great_expectations_dev
```
2. Run the following command to activate the virtual environment:
```python
conda activate great_expectations_dev
```sh
conda activate great_expectations_dev
```
## Install dependencies from requirements-dev.txt
Expand Down Expand Up @@ -346,7 +348,7 @@ One of the most significant features of an Expectation is that it produces the s
The following is the test fixture file structure:
````json
```json
{
"expectation_type" : "expect_column_max_to_be_between",
"datasets" : [{
Expand All @@ -355,14 +357,15 @@ The following is the test fixture file structure:
"tests" : [...]
}]
}
````
```
Below `datasets` are three entries: `data`, `schemas`, and `tests`.
#### Data
The `data` parameter defines a DataFrame of sample data to apply Expectations against. The DataFrame is defined as a dictionary of lists, with keys containing column names and values containing lists of data entries. All lists within a dataset must have the same length. For example:
````console
```console
"data" : {
"w" : [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
"x" : [2, 3, 4, 5, 6, 7, 8, 9, null, null],
Expand All @@ -371,12 +374,13 @@ The `data` parameter defines a DataFrame of sample data to apply Expectations ag
"zz" : ["1/1/2016", "1/2/2016", "2/2/2016", "2/2/2016", "3/1/2016", "2/1/2017", null, null, null, null],
"a" : [null, 0, null, null, 1, null, null, 2, null, null],
},
````
```
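The equal-length rule for `data` can be checked before a fixture is run. The following is a minimal sketch and is not part of the Great Expectations test runner; the helper name `check_equal_lengths` is hypothetical, and JSON `null` values are written as Python `None`.

```python
def check_equal_lengths(data):
    """Return the shared column length, or raise ValueError if columns differ.

    `data` is a dictionary of lists, as in the `data` entry of a fixture.
    This helper is hypothetical and is not part of Great Expectations.
    """
    lengths = {column: len(values) for column, values in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Columns have different lengths: {lengths}")
    # An empty dictionary has no columns, so report a length of 0.
    return next(iter(lengths.values()), 0)


sample = {
    "w": [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
    "x": [2, 3, 4, 5, 6, 7, 8, 9, None, None],
}
print(check_equal_lengths(sample))
```

Running a check like this when you write a fixture surfaces a mismatched column early, before the DataFrame is built.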
#### Schemas
The `schemas` parameter defines the types to be used when instantiating tests against different execution environments, including different SQL dialects. Each schema is defined as a dictionary with column names and types as key-value pairs. If the schema isn’t specified for a given execution environment, Great Expectations introspects values and attempts to identify the schema. For example:
````console
```console
"schemas": {
"sqlite": {
"w" : "INTEGER",
Expand All @@ -395,7 +399,7 @@ The `schema` parameter defines the types to be used when instantiating tests aga
"a" : "INTEGER",
}
},
````
```
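One way to picture the fallback described above is a lookup that returns `None` when a backend has no schema entry, which leaves type detection to introspection. This is an illustrative sketch only; `schema_for_backend` is a hypothetical helper, not a Great Expectations API.

```python
def schema_for_backend(schemas, backend):
    """Return the column-to-type mapping for `backend`, or None if unspecified.

    Hypothetical helper for illustration; not part of Great Expectations.
    """
    return schemas.get(backend)


schemas = {"sqlite": {"w": "INTEGER"}}

# A backend with an explicit schema returns its column types.
print(schema_for_backend(schemas, "sqlite"))
# A backend without an entry returns None, so types would be introspected.
print(schema_for_backend(schemas, "postgresql"))
```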
#### Tests
The `tests` parameter defines the tests to be executed against the DataFrame. Each item in `tests` must include `title`, `exact_match_out`, `in`, and `out`. The test runner executes the named Expectation once for each item, with the values in `in` supplied as kwargs.
Expand All @@ -404,7 +408,7 @@ The test passes if the values in the expectation Validation Result correspond wi
`suppress_test_for` is an optional parameter to disable an Expectation for a specific list of backends. For example:
````sh
```sh
"tests" : [{
"title": "Basic negative test case",
"exact_match_out" : false,
Expand All @@ -423,7 +427,7 @@ The test passes if the values in the expectation Validation Result correspond wi
...
]
````
```
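The required keys for each item in `tests` can be verified with a small check. This is a sketch under stated assumptions: `missing_test_keys` is a hypothetical helper, and the `in` and `out` values below are illustrative placeholders rather than values from a real fixture.

```python
REQUIRED_TEST_KEYS = {"title", "exact_match_out", "in", "out"}


def missing_test_keys(test_case):
    """Return a sorted list of required keys absent from one `tests` item.

    Optional keys such as `suppress_test_for` are ignored.
    Hypothetical helper for illustration; not part of Great Expectations.
    """
    return sorted(REQUIRED_TEST_KEYS - set(test_case))


complete = {
    "title": "Basic negative test case",
    "exact_match_out": False,
    "in": {"column": "w"},      # illustrative kwargs
    "out": {"success": False},  # illustrative expected result
}
incomplete = {"title": "Missing everything else"}

print(missing_test_keys(complete))
print(missing_test_keys(incomplete))
```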
The test fixture files are stored in subdirectories of `tests/test_definitions/` corresponding to the class of Expectation:
Expand Down
51 changes: 51 additions & 0 deletions CONTRIBUTING_WORKFLOWS.md
@@ -0,0 +1,51 @@
# Workflows

The Great Expectations code base has several areas where you can contribute code. This document describes workflows you might want to run to get started.

First, make sure you have cloned the repository and installed the Python dependencies. For details, see [Contribute a code change](CONTRIBUTING_CODE.md).

This code base provides the following workflows:

- [Code Linting](#code-linting)
- [Locally deploy docs](#locally-deploy-docs)
- [Verify links in docs](#verify-links-in-docs)
- [Generate Glossary](#generate-glossary)

## Code Linting

Before submitting a pull request, make sure that your code passes the lint check. To do so, run:

```sh
black .
ruff . --fix
```

## Locally Deploy Docs

You can find more information on developing the Great Expectations docs in [/docs/docusaurus/README.md](/docs/docusaurus/README.md). To deploy a local version of the docs, run:

```sh { name=docs background=false }
invoke docs
```

The website should be available at:

```sh
open http://localhost:3000/docs
```

## Verify links in docs

We use a link checker tool to verify that links within our docs are valid. To run it:

```sh { name=linkcheck }
python3 docs/checks/docs_link_checker.py -p docs -r docs -s docs --skip-external
```

## Generate Glossary

Run the following command to generate a glossary page in our docs:

```sh { name=glossary cwd=./scripts }
python3 ./build_glossary_page.py
```
64 changes: 64 additions & 0 deletions IDE_SETUP_TIPS.md
@@ -0,0 +1,64 @@
# IDE Setup Tips

This document describes useful setup tips for contributors to this repository. Feel free to suggest further improvements.

## VS Code

Create a `.vscode` directory and add the following files to it:

_.vscode/extensions.json_
```json
{
// See https://go.microsoft.com/fwlink/?LinkId=827846 to learn about workspace recommendations.
// Extension identifier format: ${publisher}.${name}. Example: vscode.csharp
// List of extensions which should be recommended for users of this workspace.
"recommendations": [
"stateful.runme"
],
// List of extensions recommended by VS Code that should not be recommended for users of this workspace.
"unwantedRecommendations": []
}
```

_.vscode/launch.json_
```json
{
"version": "0.2.0",
"configurations": [
{
"name": "GX Docusaurus Docs",
"type": "node-terminal",
"request": "launch",
"command": "invoke docs"
},
{
"name": "GX Start MySQL Container",
"type": "node-terminal",
"request": "launch",
"command": "docker-compose up -d",
"cwd": "${workspaceFolder}/assets/docker/mysql"
},
{
"name": "GX Start PostgreSQL Container",
"type": "node-terminal",
"request": "launch",
"command": "docker-compose up -d",
"cwd": "${workspaceFolder}/assets/docker/postgresql"
}
]
}
```

_.vscode/settings.json_
```json
{
"workbench.editorAssociations": {
"CONTRIBUTING_WORKFLOWS.md": "runme",
"CONTRIBUTING_CODE.md": "runme"
}
}
```

## PyCharm

tbd.
2 changes: 1 addition & 1 deletion ci/checks/check_repo_root_size.sh
Expand Up @@ -5,7 +5,7 @@
# Please take care to only add files or directories to the repo root unless they are
# required to be in the repo root, otherwise please find a more appropriate location.

NUM_ITEMS_SHOULD_BE=40
NUM_ITEMS_SHOULD_BE=42
NUM_ITEMS=$(ls -la | wc -l)

echo "Items found in repo root:"
Expand Down
2 changes: 1 addition & 1 deletion contrib/README.md
Expand Up @@ -13,7 +13,7 @@ Using a new Expectation contributed by a user is easy. Simply `pip install --upg

For example:

```
```python
from great_expectations_contrib.expectations import ExpectNelsonsColumnToExist

# ... obtain Validator
Expand Down
7 changes: 0 additions & 7 deletions docker/README.md
Expand Up @@ -51,10 +51,3 @@ to the great expectations default value.

This will mount your local great expectations repo on top the image's version of great expectations.
You can now edit your repo locally and the changes will be reflected in the docker container.






```
2 changes: 1 addition & 1 deletion docs/checks/docs_link_checker.py
Expand Up @@ -2,7 +2,7 @@
"""A command-line tool used to check links in docusaurus markdown documentation
To check all of our markdown documentation, from the repo root run:
python scripts/docs_link_checker.py -p docs -r docs -s docs --skip-external
python ./docs/checks/docs_link_checker.py -p docs -r docs -s docs --skip-external
The above command:
- -p docs (also --path): The path to the markdown files you want to check. For example, if you wanted to check only the tutorial files, you could specify docs/tutorials
Expand Down