This repository was archived by the owner on May 17, 2024. It is now read-only.

Tidying up duplication between /docs and docs.datafold.com #495

Merged
merged 5 commits into from
Apr 13, 2023
25 changes: 1 addition & 24 deletions README.md
@@ -32,26 +32,6 @@ For their corresponding connection strings, check out our [detailed table](https
#### Looking for a database not on the list?
If a database is not on the list, we'd still love to support it. [Please open an issue](https://github.com/datafold/data-diff/issues) to discuss it, or vote on existing requests to push them up our todo list.

## Use cases

### Diff Tables Between Databases
#### Quickly identify issues when moving data between databases

<p align="center">
<img alt="diff2" src="https://user-images.githubusercontent.com/1799931/196754998-a88c0a52-8751-443d-b052-26c03d99d9e5.png" />
</p>

### Diff Tables Within a Database
#### Improve code reviews by identifying data problems you don't have tests for
<p align="center">
<a href=https://www.loom.com/share/682e4b7d74e84eb4824b983311f0a3b2 target="_blank">
<img alt="Intro to Diff" src="https://user-images.githubusercontent.com/1799931/196576582-d3535395-12ef-40fd-bbbb-e205ccae1159.png" width="50%" height="50%" />
</a>
</p>

&nbsp;
&nbsp;

## Get started

### Installation
@@ -126,10 +106,7 @@ In both code examples, I've used `<>` carrots to represent values that **should

### We're here to help!

We know that in some cases, the data-diff command can become long and dense. And maybe you're new to the command line.

* We're here to help [on slack](https://getdbt.slack.com/archives/C03D25A92UU) if you have ANY questions as you use `data-diff` in your workflow.
* You can also post a question in [GitHub Discussions](https://github.com/datafold/data-diff/discussions).
We're here to help! Please post any questions in [GitHub Discussions](https://github.com/datafold/data-diff/discussions).

## How to Use

159 changes: 0 additions & 159 deletions docs/how-to-use.md

This file was deleted.

16 changes: 3 additions & 13 deletions docs/index.rst
@@ -4,22 +4,12 @@
:hidden:

python-api
python_examples

data-diff
---------

**Data-diff** is a command-line tool and Python library to efficiently diff
rows across two different databases.

⇄ Verifies across many different databases (e.g. *PostgreSQL* -> *Snowflake*) !

🔍 Outputs diff of rows in detail

🚨 Simple CLI/API to create monitoring and alerts

🔥 Verify 25M+ rows in <10s, and 1B+ rows in ~5min.

♾️ Works for tables with 10s of billions of rows
**Data-diff** is a command-line tool and Python library for comparing tables in and across databases.

For more information, `See our README <https://github.com/datafold/data-diff#readme>`_

@@ -32,4 +22,4 @@ Resources
- :doc:`python-api`
- The rest of the `documentation`_

.. _documentation: https://docs.datafold.com/os_diff/about/
.. _documentation: https://docs.datafold.com/guides/os_data_diff
44 changes: 44 additions & 0 deletions docs/python_examples.rst
@@ -0,0 +1,44 @@
Python API Examples
-------------------

**Example 1: Diff tables in MySQL and PostgreSQL**

.. code-block:: python

    # Optional: Set logging to display the progress of the diff
    import logging
    logging.basicConfig(level=logging.INFO)

    from data_diff import connect_to_table, diff_tables

    # Arguments: connection string, table name, key column
    table1 = connect_to_table("postgresql:///", "table_name", "id")
    table2 = connect_to_table("mysql:///", "table_name", "id")

    # Each yielded item is a pair: a "+" or "-" sign and the differing row
    for different_row in diff_tables(table1, table2):
        plus_or_minus, columns = different_row
        print(plus_or_minus, columns)
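
The loop above prints each differing row as it arrives. As a minimal sketch, the same pairs can be grouped by sign before reporting. The helper ``split_diff`` and the sample rows are hypothetical and stand in for real ``diff_tables`` output; the sketch assumes that ``-`` marks rows found only in the first table and ``+`` marks rows found only in the second.

```python
from collections import defaultdict


def split_diff(diff_rows):
    """Group (sign, row) pairs by sign.

    Returns two lists: rows marked "-" and rows marked "+".
    Hypothetical helper, not part of data_diff.
    """
    grouped = defaultdict(list)
    for sign, row in diff_rows:
        grouped[sign].append(row)
    return grouped["-"], grouped["+"]


# Hypothetical sample standing in for the output of diff_tables(table1, table2).
sample_diff = [
    ("-", ("1", "alice")),
    ("+", ("1", "alicia")),
    ("+", ("2", "bob")),
]

only_in_table1, only_in_table2 = split_diff(sample_diff)
print(len(only_in_table1), "row(s) marked '-'")
print(len(only_in_table2), "row(s) marked '+'")
```

With a real connection, passing ``diff_tables(table1, table2)`` in place of ``sample_diff`` should work the same way, since the helper only relies on iterating over pairs.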


**Example 2: Connect to Snowflake using dictionary configuration**

.. code-block:: python

    from data_diff import connect_to_table

    # Connection settings as a dictionary instead of a connection string
    SNOWFLAKE_CONN_INFO = {
        "driver": "snowflake",
        "user": "erez",
        "account": "whatever",
        "database": "TESTS",
        "warehouse": "COMPUTE_WH",
        "role": "ACCOUNTADMIN",
        "schema": "PUBLIC",
        "key": "snowflake_rsa_key.p8",
    }

    snowflake_table = connect_to_table(SNOWFLAKE_CONN_INFO, "table_name")  # Uses id by default
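
The dictionary above hard-codes its values. One possible variation, sketched here, builds the same dictionary from environment variables so that account details stay out of source code. The function ``snowflake_conn_info`` and the ``SNOWFLAKE_*`` variable names are assumptions made for illustration; only the dictionary keys come from the example above.

```python
import os


def snowflake_conn_info(env=os.environ):
    """Build a Snowflake connection dictionary from a mapping of settings.

    Hypothetical helper: the SNOWFLAKE_* names are illustrative only.
    Raises KeyError if a required setting is missing.
    """
    return {
        "driver": "snowflake",
        "user": env["SNOWFLAKE_USER"],
        "account": env["SNOWFLAKE_ACCOUNT"],
        "database": env["SNOWFLAKE_DATABASE"],
        "warehouse": env["SNOWFLAKE_WAREHOUSE"],
        "role": env["SNOWFLAKE_ROLE"],
        "schema": env["SNOWFLAKE_SCHEMA"],
        "key": env["SNOWFLAKE_KEY_PATH"],
    }


# Sample settings mirroring the values in Example 2; a plain dict stands in
# for os.environ so the sketch runs without any real configuration.
sample_env = {
    "SNOWFLAKE_USER": "erez",
    "SNOWFLAKE_ACCOUNT": "whatever",
    "SNOWFLAKE_DATABASE": "TESTS",
    "SNOWFLAKE_WAREHOUSE": "COMPUTE_WH",
    "SNOWFLAKE_ROLE": "ACCOUNTADMIN",
    "SNOWFLAKE_SCHEMA": "PUBLIC",
    "SNOWFLAKE_KEY_PATH": "snowflake_rsa_key.p8",
}

conn_info = snowflake_conn_info(sample_env)
print(sorted(conn_info))
```

The resulting dictionary has the same shape as ``SNOWFLAKE_CONN_INFO`` and could be passed to ``connect_to_table`` in its place.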

Run ``help(connect_to_table)`` and ``help(diff_tables)``, or read our API reference, to learn more about the available options:

- connect_to_table_

- diff_tables_

.. _connect_to_table: https://data-diff.readthedocs.io/en/latest/python-api.html#data_diff.connect_to_table
.. _diff_tables: https://data-diff.readthedocs.io/en/latest/python-api.html#data_diff.diff_tables