You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
SQLMesh is a next-generation SQL transformation platform. It provides you with powerful automation for versioning, backfilling, deployment, and testing -- allowing you to focus on simply writing SQL.
1
+
# Overview
2
+
3
+
SQLMesh is a next-generation SQL transformation platform. It provides you with powerful automation for versioning, backfilling, deployment, and testing — allowing you to focus on simply writing SQL.
3
4
4
5
SQLMesh is able to achieve all of this with minimal setup; there are no additional services or dependencies required to get started using SQLMesh other than a connection to your existing data warehouse or engine.
5
6
@@ -9,42 +10,32 @@ One of the main advantages over other transformation frameworks is that SQLMesh
9
10
10
11
SQLMesh also automates away complexity, so configuring models is no longer tricky due to complex macros that require understanding of the context for execution. Writing your data pipelines incrementally with SQLMesh not only saves you money and time, but keeps your systems maintainable, reliable, and accessible to all of your data practicioners.
11
12
12
-
## Reduced cost
13
+
###Reduced cost
13
14
As discussed above, incremental compute is significantly cheaper than full refresh compute.
14
15
15
16
For example, if you have one year of history but only receive new data on a daily basis, only processing that new data is ~365x cheaper than reprocessing one year each day. As your data grows, it's possible that refreshing your tables may take longer than a day, which means you would never be able to catch up!
16
17
17
18
In addition, you may not be able to refresh particular tables all at once; they may need to be batched into smaller intervals. The cost of your data pipelines compound as more dependent pipelines are created. Therefore, writing your data pipelines incrementally as much as possible can result in exponential savings.
18
19
19
-
## Increased efficiency
20
-
SQLMesh safely reuses physical tables across isolated environments. Some databases, such as Snowflake, have [zero-copy cloning](https://docs.snowflake.com/en/user-guide/tables-storage-considerations.html#label-cloning-tables)-- but this is a manual process, and not widely supported.
20
+
###Increased efficiency
21
+
SQLMesh safely reuses physical tables across isolated environments. Some databases, such as Snowflake, have [zero-copy cloning](https://docs.snowflake.com/en/user-guide/tables-storage-considerations.html#label-cloning-tables)— but this is a manual process, and not widely supported.
21
22
22
23
SQLMesh is able to automatically reuse tables regardless of which data warehouse or engine you're using. This is achieved by storing fingerprints of your models and by employing [views](https://en.wikipedia.org/wiki/View_(SQL)) like pointers to physical locations. Therefore, spinning up a new development environment is fast and cheap; only models with incompatible changes need to be materialized, once again saving time and money.
23
24
24
-
## Automation for everyone
25
+
###Automation for everyone
25
26
Creating maintainable and scalable data pipelines is extremely difficult, and is a task usually reserved for data engineers. As your data grows, the need for incremental compute becomes mandatory due to the cost and time constaints.
26
27
27
28
Incremental models have inherent state of which partitions have been computed. This makes managing the consistency and accuracy challenging (leaving no data leakages or gaps). Although a seasoned engineer may have the expertise or tooling to operate one of these tables, an analyst would not. In these organizations, analysts would either need to file a ticket and wait on data engineering resources, or bypass core data models by running their own custom jobs, which inevitably leads to an ungoverned data mess. SQLMesh democratizes the ability to write safe and scalable data pipelines to all data practitioners, regardless of technical ability.
28
29
29
-
## Complexity made simple
30
+
###Complexity made simple
30
31
As more and more models and users depend on core tables, the complexity of making changes increases. You must ensure that all downstream data consumers are compatible and updated with any new changes.
31
32
32
33
Propagating a change throughout a complex graph of dependencies is difficult to communicate, and also challenging to do accurately. The introduction of other schedulers such as [Airflow](https://airflow.apache.org/) adds even more complexity. SQLMesh seamlessly integrates directly with your existing scheduler so that your entire data pipeline, including jobs outside of SQLMesh, will be unified and robust.
33
34
34
-
## Collaboration and integration
35
+
###Collaboration and integration
35
36
SQLMesh allows for data pipelines to be a collaborative experience. It both empowers less technical data users to contribute and enables them to collaborate with others who may be more familiar with data engineering. Development can be done in a fully isolated environment that can be accessed and validated by others.
36
37
37
38
SQLMesh provides information about changes and how they may affect your downstream consumers. This transparency, along with the ability to categorize changes, makes it more feasible for a less technically savvy user to make updates to core data pipelines. By integrating with our Continuous Integration/Continuous Delivery (CI/CD) flows, you can require approval for any changes before going to production, ensuring that the relevant data owners or experts can review and validate the changes.
38
39
39
-
## Testing and reliability
40
-
SQLMesh supports both [audits](#audits) and [tests](#tests). Although unit tests has been commonplace in the world of software engineering, they are relatively unknown in the data world. SQLMesh's data unit tests allow for stability and reliability, as data pipeline owners can ensure that changes to models don't change underlying logic. These tests can run quickly in CI, or locally without having to create full scale tables.
41
-
42
-
Ready to jump in? Refer to `sqlmesh.docs.getting_started`.
43
-
44
-
## Community
45
-
46
-
We'd love to help guide you along your data journey. Follow the links below to connect with us:
47
-
48
-
* Join the [tobiko Slack community](https://join.slack.com/t/tobiko-data/shared_invite/zt-1je7o3xhd-C7~GuZTj0a8xz_uQbTJjHg) to ask questions, or just to say hi!
49
-
* File an issue on our [GitHub](https://github.com/TobikoData/sqlmesh/issues/new).
50
-
* Send us an email at [hello@tobikodata.com](hello@tobikodata.com) with your questions or feedback.
40
+
### Testing and reliability
41
+
SQLMesh supports both audits and tests. Although unit tests has been commonplace in the world of software engineering, they are relatively unknown in the data world. SQLMesh's data unit tests allow for stability and reliability, as data pipeline owners can ensure that changes to models don't change underlying logic. These tests can run quickly in CI, or locally without having to create full scale tables.
Copy file name to clipboardExpand all lines: docs/api/overview.md
+3-3Lines changed: 3 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,9 +1,9 @@
1
1
# Overview
2
2
3
-
SQLMesh can be used with a [cli](cli.md), [notebook](notebook.md), or directly through [Python](python.md). Each interface tries to have equivalent functionality and arguments.
3
+
SQLMesh can be used with a [cli](cli.md), [notebook](notebook.md), or directly through [Python](python.md). Each interface aims to have parity in both functionality and arguments.
4
4
5
5
## plan
6
-
Plan is the main command of SQLMesh. It allows you to interactively create a migration plan, understand the downstream impact, and apply it. All changes to models and environments will be materialized through plan.
6
+
Plan is the main command of SQLMesh. It allows you to interactively create a migration plan, understand the downstream impact, and apply it. All changes to models and environments are materialized through plan.
7
7
8
8
Read more about [plan](/concepts/plans).
9
9
@@ -29,4 +29,4 @@ Formats all SQL model files in place.
29
29
Shows the diff between the local model and a model in an evironment.
We'd love to help guide you along your data journey. Connect with us in the following ways:
4
+
5
+
* Join the [tobiko Slack community](https://join.slack.com/t/tobiko-data/shared_invite/zt-1je7o3xhd-C7~GuZTj0a8xz_uQbTJjHg) to ask questions, or just to say hi!
6
+
* File an issue on our [GitHub](https://github.com/TobikoData/sqlmesh/issues/new).
7
+
* Send us an email at [hello@tobikodata.com](hello@tobikodata.com) with your questions or feedback.
0 commit comments