
Commit 1c371e0

Restructuring and editing docs. (#171)
* Restructuring, rewriting, editing docs.
* Added more content; editing.
* Style changes for tobiko theme.
* Updated per PR review.
* Indigo -> deep purple.
1 parent bc964eb commit 1c371e0

21 files changed

Lines changed: 264 additions & 255 deletions

README.md

Lines changed: 11 additions & 20 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,6 @@
1-
# About SQLMesh
2-
SQLMesh is a next-generation SQL transformation platform. It provides you with powerful automation for versioning, backfilling, deployment, and testing -- allowing you to focus on simply writing SQL.
1+
# Overview
2+
3+
SQLMesh is a next-generation SQL transformation platform. It provides you with powerful automation for versioning, backfilling, deployment, and testing — allowing you to focus on simply writing SQL.
34

45
SQLMesh is able to achieve all of this with minimal setup; there are no additional services or dependencies required to get started using SQLMesh other than a connection to your existing data warehouse or engine.
56

@@ -9,42 +10,32 @@ One of the main advantages over other transformation frameworks is that SQLMesh
910

1011
SQLMesh also automates away complexity, so configuring models is no longer tricky due to complex macros that require understanding of the context for execution. Writing your data pipelines incrementally with SQLMesh not only saves you money and time, but keeps your systems maintainable, reliable, and accessible to all of your data practitioners.
1112

12-
## Reduced cost
13+
### Reduced cost
1314
As discussed above, incremental compute is significantly cheaper than full refresh compute.
1415

1516
For example, if you have one year of history but only receive new data on a daily basis, only processing that new data is ~365x cheaper than reprocessing one year each day. As your data grows, it's possible that refreshing your tables may take longer than a day, which means you would never be able to catch up!
1617

1718
In addition, you may not be able to refresh particular tables all at once; they may need to be batched into smaller intervals. The cost of your data pipelines compounds as more dependent pipelines are created. Therefore, writing your data pipelines incrementally as much as possible can result in substantial, compounding savings.
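
The ~365x figure above can be checked with a minimal sketch. It assumes, as a simplification, that cost is proportional to the number of days of data a run scans; the function names are hypothetical and are not part of SQLMesh.

```python
def full_refresh_days(history_days: int) -> int:
    """Days of data scanned by one run that recomputes all history."""
    return history_days


def incremental_days(new_days: int = 1) -> int:
    """Days of data scanned by one run that processes only new data."""
    return new_days


def cost_ratio(history_days: int, new_days: int = 1) -> float:
    """How many times more data a full refresh scans than an incremental run.

    Assumption: cost scales linearly with days scanned.
    """
    return full_refresh_days(history_days) / incremental_days(new_days)


# One year of history, one new day of data per run.
print(cost_ratio(365))  # 365.0
```

Real costs also depend on the engine, data skew, and fixed per-query overhead, so the ratio is best read as an order-of-magnitude estimate.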
1819

19-
## Increased efficiency
20-
SQLMesh safely reuses physical tables across isolated environments. Some databases, such as Snowflake, have [zero-copy cloning](https://docs.snowflake.com/en/user-guide/tables-storage-considerations.html#label-cloning-tables) -- but this is a manual process, and not widely supported.
20+
### Increased efficiency
21+
SQLMesh safely reuses physical tables across isolated environments. Some databases, such as Snowflake, have [zero-copy cloning](https://docs.snowflake.com/en/user-guide/tables-storage-considerations.html#label-cloning-tables) — but this is a manual process, and not widely supported.
2122

2223
SQLMesh is able to automatically reuse tables regardless of which data warehouse or engine you're using. This is achieved by storing fingerprints of your models and by employing [views](https://en.wikipedia.org/wiki/View_(SQL)) like pointers to physical locations. Therefore, spinning up a new development environment is fast and cheap; only models with incompatible changes need to be materialized, once again saving time and money.
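
The idea of fingerprints plus views acting as pointers can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the hashing scheme, the table naming, and the `promote` helper are hypothetical and do not describe SQLMesh's actual implementation.

```python
import hashlib

# fingerprint -> physical table name (hypothetical naming scheme)
physical_tables = {}
# environment -> {model name -> physical table the environment's view points to}
environments = {}


def fingerprint(model_sql: str) -> str:
    """Hash a model's SQL after collapsing whitespace (a deliberately naive fingerprint)."""
    normalized = " ".join(model_sql.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:8]


def promote(env: str, model_name: str, model_sql: str) -> bool:
    """Point the environment's view at a physical table.

    Returns True if a new physical table had to be created (materialized),
    or False if an existing table with the same fingerprint was reused.
    """
    fp = fingerprint(model_sql)
    created = fp not in physical_tables
    if created:
        physical_tables[fp] = f"{model_name}__{fp}"
    environments.setdefault(env, {})[model_name] = physical_tables[fp]
    return created


created_prod = promote("prod", "orders", "SELECT id, amount FROM raw_orders")
# Same logic with different whitespace: the dev view reuses the prod table.
created_dev = promote("dev", "orders", "SELECT id,  amount\nFROM raw_orders")
# Changed logic: a new physical table is needed, only for dev.
created_changed = promote("dev", "orders", "SELECT id, amount * 2 AS amount FROM raw_orders")
```

In this sketch a new environment costs nothing until a model's fingerprint changes, which mirrors the claim that only models with incompatible changes need to be materialized.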
2324

24-
## Automation for everyone
25+
### Automation for everyone
2526
Creating maintainable and scalable data pipelines is extremely difficult, and is a task usually reserved for data engineers. As your data grows, incremental compute becomes necessary due to cost and time constraints.
2627

2728
Incremental models carry inherent state: the set of partitions that have already been computed. This makes it challenging to keep the data consistent and accurate, with no data leakage or gaps. Although a seasoned engineer may have the expertise or tooling to operate one of these tables, an analyst would not. In these organizations, analysts would either need to file a ticket and wait on data engineering resources, or bypass core data models by running their own custom jobs, which inevitably leads to an ungoverned data mess. SQLMesh democratizes the ability to write safe and scalable data pipelines to all data practitioners, regardless of technical ability.
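
To make the state-tracking problem concrete, here is a minimal sketch of finding gaps in an incremental model. It assumes daily partitions and a plain set of already-computed dates; the `missing_days` helper is hypothetical and is not how SQLMesh stores or queries its state.

```python
from datetime import date, timedelta


def missing_days(start: date, end: date, computed: set) -> list:
    """Return every day in [start, end] that has not been computed yet."""
    total = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(total))
    return [day for day in all_days if day not in computed]


computed = {date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 4)}
gaps = missing_days(date(2023, 1, 1), date(2023, 1, 5), computed)
print(gaps)  # [datetime.date(2023, 1, 3), datetime.date(2023, 1, 5)]
```

Keeping this bookkeeping correct by hand across many models and environments is the burden the paragraph above describes.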
2829

29-
## Complexity made simple
30+
### Complexity made simple
3031
As more and more models and users depend on core tables, the complexity of making changes increases. You must ensure that all downstream data consumers are compatible and updated with any new changes.
3132

3233
Propagating a change throughout a complex graph of dependencies is difficult to communicate, and also challenging to do accurately. The introduction of other schedulers such as [Airflow](https://airflow.apache.org/) adds even more complexity. SQLMesh seamlessly integrates directly with your existing scheduler so that your entire data pipeline, including jobs outside of SQLMesh, will be unified and robust.
3334

34-
## Collaboration and integration
35+
### Collaboration and integration
3536
SQLMesh allows for data pipelines to be a collaborative experience. It both empowers less technical data users to contribute and enables them to collaborate with others who may be more familiar with data engineering. Development can be done in a fully isolated environment that can be accessed and validated by others.
3637

3738
SQLMesh provides information about changes and how they may affect your downstream consumers. This transparency, along with the ability to categorize changes, makes it more feasible for a less technically savvy user to make updates to core data pipelines. By integrating with our Continuous Integration/Continuous Delivery (CI/CD) flows, you can require approval for any changes before going to production, ensuring that the relevant data owners or experts can review and validate the changes.
3839

39-
## Testing and reliability
40-
SQLMesh supports both [audits](#audits) and [tests](#tests). Although unit tests has been commonplace in the world of software engineering, they are relatively unknown in the data world. SQLMesh's data unit tests allow for stability and reliability, as data pipeline owners can ensure that changes to models don't change underlying logic. These tests can run quickly in CI, or locally without having to create full scale tables.
41-
42-
Ready to jump in? Refer to `sqlmesh.docs.getting_started`.
43-
44-
## Community
45-
46-
We'd love to help guide you along your data journey. Follow the links below to connect with us:
47-
48-
* Join the [tobiko Slack community](https://join.slack.com/t/tobiko-data/shared_invite/zt-1je7o3xhd-C7~GuZTj0a8xz_uQbTJjHg) to ask questions, or just to say hi!
49-
* File an issue on our [GitHub](https://github.com/TobikoData/sqlmesh/issues/new).
50-
* Send us an email at [hello@tobikodata.com](hello@tobikodata.com) with your questions or feedback.
40+
### Testing and reliability
41+
SQLMesh supports both audits and tests. Although unit tests have been commonplace in the world of software engineering, they are relatively unknown in the data world. SQLMesh's data unit tests allow for stability and reliability, as data pipeline owners can ensure that changes to models don't change underlying logic. These tests can run quickly in CI, or locally without having to create full-scale tables.
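
The general idea of a data unit test, running a model's query against a handful of fixture rows and comparing the result to expected output, can be sketched with Python's built-in `sqlite3` module. This is an illustration of the concept only: the table, query, and `run_model` helper are hypothetical, and this is not SQLMesh's test format.

```python
import sqlite3

# Hypothetical model logic under test.
MODEL_QUERY = """
SELECT customer_id, SUM(amount) AS total_amount
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""


def run_model(rows):
    """Run the model query against a tiny in-memory fixture table."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
        return conn.execute(MODEL_QUERY).fetchall()
    finally:
        conn.close()


# Three fixture rows stand in for a full-scale table.
result = run_model([(1, 10.0), (1, 5.0), (2, 7.5)])
assert result == [(1, 15.0), (2, 7.5)]
```

Because the fixture lives in memory, a test like this runs in milliseconds locally or in CI, and it fails as soon as a change to the query alters the expected totals.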

docs/api/cli.md

Lines changed: 26 additions & 26 deletions
@@ -31,20 +31,20 @@ Usage: sqlmesh plan [OPTIONS] [ENVIRONMENT]
3131
Plan a migration of the current context's models with the given environment.
3232
3333
Options:
34-
-s, --start TEXT The start datetime of the interval this command
35-
will be applied for.
36-
-e, --end TEXT The end datetime of the interval this command will
37-
be applied for.
38-
-f, --from TEXT The environment to base the plan on instead of
34+
-s, --start TEXT The start datetime of the interval for which this
35+
command will be applied.
36+
-e, --end TEXT The end datetime of the interval for which this
37+
command will be applied.
38+
-f, --from TEXT The environment to base the plan on rather than
3939
local files.
4040
--skip-tests TEXT Skip tests prior to generating the plan if they
4141
are defined.
4242
-r, --restate-model TEXT Restate data for specified models and models
4343
downstream from the one specified. For production
44-
environment all related model versions will have
45-
their intervals wiped but only the current
44+
environment, all related model versions will have
45+
their intervals wiped, but only the current
4646
versions will be backfilled. For development
47-
enviornment only the current model versions will
47+
environment, only the current model versions will
4848
be affected.
4949
--no-gaps Ensure that new snapshots have no data gaps when
5050
comparing to existing snapshots for matching
@@ -53,7 +53,7 @@ Options:
5353
--forward-only Create a plan for forward-only changes.
5454
--no-prompts Disable interactive prompts for the backfill time
5555
range. Please note that if this flag is set and
56-
there are uncategorized changes the plan creation
56+
there are uncategorized changes, plan creation
5757
will fail.
5858
--auto-apply Automatically apply the new plan after creation.
5959
--help Show this message and exit.
@@ -66,13 +66,13 @@ Usage: sqlmesh evaluate [OPTIONS] MODEL
6666
Evaluate a model and return a dataframe with a default limit of 1000.
6767
6868
Options:
69-
-s, --start TEXT The start datetime of the interval this command will be
70-
applied for.
71-
-e, --end TEXT The end datetime of the interval this command will be
72-
applied for.
73-
-l, --latest TEXT The latest time used for non incremental datasets
69+
-s, --start TEXT The start datetime of the interval for which this
70+
command will be applied.
71+
-e, --end TEXT The end datetime of the interval for which this
72+
command will be applied.
73+
-l, --latest TEXT The latest time used for non-incremental datasets
7474
(defaults to now).
75-
--limit INTEGER The number of rows which the query should be limited to.
75+
--limit INTEGER The number of rows the query should be limited to.
7676
--help Show this message and exit.
7777
```
7878

@@ -83,11 +83,11 @@ Usage: sqlmesh render [OPTIONS] MODEL
8383
Renders a model's query, optionally expanding referenced models.
8484
8585
Options:
86-
-s, --start TEXT The start datetime of the interval this command will be
87-
applied for.
88-
-e, --end TEXT The end datetime of the interval this command will be
89-
applied for.
90-
-l, --latest TEXT The latest time used for non incremental datasets
86+
-s, --start TEXT The start datetime of the interval for which this
87+
command will be applied.
88+
-e, --end TEXT The end datetime of the interval for which this
89+
command will be applied.
90+
-l, --latest TEXT The latest time used for non-incremental datasets
9191
(defaults to now).
9292
--expand TEXT Whether or not to expand materialized models (defaults to
9393
False). If True, all referenced models are expanded as
@@ -126,11 +126,11 @@ Usage: sqlmesh audit [OPTIONS]
126126
127127
Options:
128128
--model TEXT A model to audit. Multiple models can be audited.
129-
-s, --start TEXT The start datetime of the interval this command will be
130-
applied for.
131-
-e, --end TEXT The end datetime of the interval this command will be
132-
applied for.
133-
-l, --latest TEXT The latest time used for non incremental datasets
129+
-s, --start TEXT The start datetime of the interval for which this
130+
command will be applied.
131+
-e, --end TEXT The end datetime of the interval for which this
132+
command will be applied.
133+
-l, --latest TEXT The latest time used for non-incremental datasets
134134
(defaults to now).
135135
--help Show this message and exit.
136136
```
@@ -165,6 +165,6 @@ Usage: sqlmesh dag [OPTIONS]
165165
graphviz package.
166166
167167
Options:
168-
--file TEXT The file to write the dag image to.
168+
--file TEXT The file to which the dag image should be written.
169169
--help Show this message and exit.
170170
```

docs/api/notebook.md

Lines changed: 17 additions & 16 deletions
@@ -1,6 +1,6 @@
11
# Notebook
22

3-
SQLMesh supports JupyterLabs and Databricks Notebooks. Magics are loaded automatically and use the variable `context`
3+
SQLMesh supports JupyterLab and Databricks Notebooks. Magics are loaded automatically and use the variable `context`.
44

55
```python
66
from sqlmesh import Context
@@ -15,25 +15,25 @@ context = Context(path="example")
1515
[--skip-backfill] [--forward-only] [--no-prompts] [--auto-apply]
1616
[environment]
1717
18-
Goes through a set of prompts to both establish a plan and apply it
18+
Iterates through a set of prompts to both establish a plan and apply it.
1919
2020
positional arguments:
21-
environment The environment to run the plan against
21+
environment The environment to run the plan against.
2222
2323
options:
2424
--start START, -s START
2525
Start date to backfill.
2626
--end END, -e END End date to backfill.
2727
--from FROM_, -f FROM_
28-
The environment to base the plan on instead of local
28+
The environment to base the plan on rather than local
2929
files.
3030
--skip-tests, -t Skip the unit tests defined for the model.
3131
--restate-model <[RESTATE_MODEL ...]>, -r <[RESTATE_MODEL ...]>
3232
Restate data for specified models (and models
3333
downstream from the one specified). For production
34-
environment all related model versions will have their
35-
intervals wiped but only the current versions will be
36-
backfilled. For development enviornment only the
34+
environment, all related model versions will have their
35+
intervals wiped, but only the current versions will be
36+
backfilled. For development environment, only the
3737
current model versions will be affected.
3838
--no-gaps, -g Ensure that new snapshots have no data gaps when
3939
comparing to existing snapshots for matching models in
@@ -42,7 +42,7 @@ options:
4242
--forward-only Create a plan for forward-only changes.
4343
--no-prompts Disables interactive prompts for the backfill time
4444
range. Please note that if this flag is set and there
45-
are uncategorized changes the plan creation will fail.
45+
are uncategorized changes, plan creation will fail.
4646
--auto-apply Automatically applies the new plan after creation.
4747
```
4848

@@ -52,7 +52,7 @@ options:
5252
%evaluate [--start START] [--end END] [--latest LATEST] [--limit LIMIT]
5353
model
5454
55-
Evaluate a model query and fetches a dataframe.
55+
Evaluate a model query and fetch a dataframe.
5656
5757
positional arguments:
5858
model The model.
@@ -63,8 +63,8 @@ options:
6363
--end END, -e END End date to render.
6464
--latest LATEST, -l LATEST
6565
Latest date to render.
66-
--limit LIMIT The number of rows which the query should be limited
67-
to.
66+
--limit LIMIT The number of rows to which the query
67+
should be limited.
6868
```
6969

7070
## render
@@ -77,21 +77,22 @@ TODO
7777
Fetches a dataframe from sql, optionally storing it in a variable.
7878
7979
positional arguments:
80-
df_var An optional variable name to the store the resulting dataframe in.
80+
df_var An optional variable name to store the resulting dataframe.
8181
```
8282

8383
## test
8484
```
8585
%test [--ls] model [test_name]
8686
87-
Allow the user to list tests for a model, output a specific test and then write their changes back
87+
Allow the user to list tests for a model, output a specific test, and
88+
then write their changes back.
8889
8990
positional arguments:
9091
model The model.
91-
test_name The test name to display
92+
test_name The test name to display.
9293
9394
options:
94-
--ls List tests associated with a model
95+
--ls List tests associated with a model.
9596
```
9697

9798
## audit
@@ -107,5 +108,5 @@ TODO
107108
```
108109
%dag
109110
110-
Displays the dag
111+
Displays the dag.
111112
```

docs/api/overview.md

Lines changed: 3 additions & 3 deletions
@@ -1,9 +1,9 @@
11
# Overview
22

3-
SQLMesh can be used with a [cli](cli.md), [notebook](notebook.md), or directly through [Python](python.md). Each interface tries to have equivalent functionality and arguments.
3+
SQLMesh can be used with a [cli](cli.md), [notebook](notebook.md), or directly through [Python](python.md). Each interface aims to have parity in both functionality and arguments.
44

55
## plan
6-
Plan is the main command of SQLMesh. It allows you to interactively create a migration plan, understand the downstream impact, and apply it. All changes to models and environments will be materialized through plan.
6+
Plan is the main command of SQLMesh. It allows you to interactively create a migration plan, understand the downstream impact, and apply it. All changes to models and environments are materialized through plan.
77

88
Read more about [plan](/concepts/plans).
99

@@ -29,4 +29,4 @@ Formats all SQL model files in place.
2929
Shows the diff between the local model and a model in an environment.
3030

3131
## dag
32-
Shows the dag.
32+
Shows the [DAG](../glossary.md).

docs/community.md

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
1+
# Join our community
2+
3+
We'd love to help guide you along your data journey. Connect with us in the following ways:
4+
5+
* Join the [tobiko Slack community](https://join.slack.com/t/tobiko-data/shared_invite/zt-1je7o3xhd-C7~GuZTj0a8xz_uQbTJjHg) to ask questions, or just to say hi!
6+
* File an issue on our [GitHub](https://github.com/TobikoData/sqlmesh/issues/new).
7+
* Send us an email at [hello@tobikodata.com](mailto:hello@tobikodata.com) with your questions or feedback.
8+
