Commit 3bce0ad

add video and update readme
1 parent 4a68f74 commit 3bce0ad

3 files changed: +73 -14 lines changed

README.md

Lines changed: 50 additions & 14 deletions
@@ -1,31 +1,67 @@
 # sketch
 
-# Currently a work in progress.
+Co-pilot for pandas users, AI that understands the content of data, greatly enhancing the relevance of suggestions. Adding data context to AI code-writing assistants, usable in any Jupyter in seconds.
 
-Co-pilot for pandas users, AI that understands the content of data, greatly enhancing the relevance of suggestions.
+Enhance your workflow by asking questions of your data, and getting code suggestions to answer those questions. Reduce the time spent googling, asking chat-gpt3, and even re-writing co-pilot suggestions. Get more accurate code suggestions for code and pandas, all without adding any plugin to your IDE.
 
-Adding data context to AI code-writing assistants, usable in any Jupyter in seconds.
-
-```
+```bash
 pip install sketch
 ```
 
-## Example (gif)
+## Demo
+
+![](./sketch-demo-basic.mp4)
+
+## How to use
+
+It's as simple as importing sketch, and then using the `.sketch` extension on any pandas dataframe.
+
 ```python
 import sketch
-...
-df.sketch.howto("Check for any duplicate rows, and keep the first one based on the time feature")
 ```
 
-## It would also be pretty good to compare this to copilot directly
-(Show a copilot suggestion with comment block, and its output)
-(Show a GPT-3 codex response)
+Now, any pandas dataframe you have will have an extension registered to it. Access this new extension with your dataframes name `.sketch`
+
+### `.sketch.ask`
+
+Ask is a basic question-answer system on sketch, this will return an answer in text that is based off of the summary statistics and description of the data.
+
+Use ask to get an understanding of the data, get better column names, ask hypotheticals (how would I go about doing X with this data), and more.
+
+```python
+df.sketch.ask("Which columns are integer type?")
+```
+
+### `.sketch.howto`
+
+Howto is the basic "code-writing" prompt in sketch. This will return a code-block you should be able to copy paste and use as a starting point (or possibly ending!) for any question you have to ask of the data. Ask this how to clean the data, normalize, create new features, plot, and even build models!
+
+```python
+df.sketch.howto("Plot the sales versus time")
+```
+
+### `.sketch.apply`
+
+apply is a more advanced prompt that is more useful for data generation. Use it to parse fields, generate new features, and more. This is built directly on [lambdaprompt](https://github.com/approximatelabs/lambdaprompt). In order to use this, you will need to set up a free account with OpenAI, and set an environment variable with your API key. `OPENAI_API_KEY=YOUR_API_KEY`
+
+```python
+df['review_keywords'] = df.sketch.apply("Keywords for the review [{{ review_text }}] of product [{{ product_name }}] (comma separated):")
+```
+
+```python
+df['capitol'] = pd.DataFrame({'State': ['Colorado', 'Kansas', 'California', 'New York']}).sketch.apply("What is the capitol of [{{ State }}]?")
+```
+
+## Sketch currently uses `prompts.approx.dev` to help run with minimal setup
+
+In the future, we plan to update the prompts at this endpoint with our own custom foundation model, built to answer questions more accurately than GPT-3 can with its minimal data context.
 
-## How to run with your own OpenAI API key
+You can also directly call OpenAI directly (and not use our endpoint) by using your own API key. To do this, set 2 environment variables.
 
-If you add `OPENAI_API_KEY` environment variable and `LOCAL_LAMBDA_PROMPT=True`, then sketch will run the prompts locally, directly using your API key with openAI's endpoints.
+(1) `SKETCH_USE_REMOTE_LAMBDAPROMPT=False`
+(2) `OPENAI_API_KEY=YOUR_API_KEY`
 
 ## How it works
 
-Sketch uses efficient approximation algorithms (data sketches) to quickly summarize your data, and feed that information into language models. Right now it does this by summarizing the columns and writing these summary statistics as additional context to be used by the code-writing prompt. In the future we hope to feed these sketches directly into custom made "data + language" foundation models.
+Sketch uses efficient approximation algorithms (data sketches) to quickly summarize your data, and feed that information into language models. Right now it does this by summarizing the columns and writing these summary statistics as additional context to be used by the code-writing prompt. In the future we hope to feed these sketches directly into custom made "data + language" foundation models to get more accurate results.
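
Note on the "How it works" paragraph: the README describes summarizing each column and writing those summary statistics into the prompt as extra context. The following is only a minimal sketch of that idea, not the code sketch ships. It uses a plain dict of lists in place of a DataFrame and exact statistics in place of approximate data sketches, and the names `summarize_column` and `build_prompt`, along with the prompt layout, are hypothetical.

```python
from statistics import mean


def summarize_column(name, values):
    """Return a one-line text summary of one column (hypothetical helper)."""
    non_null = [v for v in values if v is not None]
    parts = [
        f"column={name}",
        f"count={len(non_null)}",
        f"nulls={len(values) - len(non_null)}",
        f"unique={len(set(non_null))}",
    ]
    # Numeric columns additionally get min / max / mean.
    is_numeric = non_null and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        parts += [
            f"min={min(non_null)}",
            f"max={max(non_null)}",
            f"mean={mean(non_null):.2f}",
        ]
    return ", ".join(parts)


def build_prompt(table, question):
    """Prepend the per-column summaries to the user's question (assumed layout)."""
    context = "\n".join(summarize_column(n, v) for n, v in table.items())
    return f"Data summary:\n{context}\n\nQuestion: {question}\nAnswer:"


# Toy data standing in for a DataFrame.
table = {
    "sales": [10, 20, None, 40],
    "region": ["east", "west", "east", "east"],
}
prompt = build_prompt(table, "Plot the sales versus time")
print(prompt)
```

The point of the design is that the model sees a compact description of the data rather than the data itself, so the prompt size stays bounded regardless of row count.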

sketch-demo-basic.mp4

5.36 MB
Binary file not shown.

sketch/pandas_extension.py

Lines changed: 23 additions & 0 deletions
@@ -269,3 +269,26 @@ def apply_func(row):
             return new_gpt3_prompt(**row_dict)
 
         return self._obj.apply(apply_func, axis=1)
+
+        # # Async version
+
+        # new_gpt3_prompt = lambdaprompt.AsyncGPT3Prompt(prompt_template_string)
+        # named_args = new_gpt3_prompt.get_named_args()
+        # known_args = set(self._obj.columns) | set(kwargs.keys())
+        # needed_args = set(named_args)
+        # if needed_args - known_args:
+        #     raise RuntimeError(
+        #         f"Missing: {needed_args - known_args}\nKnown: {known_args}"
+        #     )
+
+        # ind, vals = [], []
+        # for i, row in self._obj.iterrows():
+        #     ind.append(i)
+        #     row_dict = row.to_dict()
+        #     row_dict.update(kwargs)
+        #     vals.append(new_gpt3_prompt(**row_dict))
+
+        # # gather the results
+        # vals = asyncio.run(asyncio.gather(*vals))
+
+        # return pd.Series(vals, index=ind)
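
Note on the commented-out async version: it collects one awaitable per row and then passes `asyncio.gather(*vals)` straight to `asyncio.run`. `asyncio.run` is documented to take a coroutine, so one common pattern is to wrap the `gather` call in a small `async def`. The sketch below shows that pattern under stated assumptions: `fake_prompt` is a hypothetical stand-in for the async prompt object (it makes no network call), and `run_all` is an illustrative name, not part of sketch or lambdaprompt.

```python
import asyncio


async def fake_prompt(**row):
    """Hypothetical stand-in for an async prompt call; no network involved."""
    await asyncio.sleep(0)  # yield control, as a real I/O call would
    return f"capital of {row['State']}?"


async def run_all(rows):
    # gather returns results in the same order as the awaitables passed in,
    # so results can be paired back with the row index afterwards.
    return await asyncio.gather(*(fake_prompt(**row) for row in rows))


rows = [{"State": "Colorado"}, {"State": "Kansas"}]
results = asyncio.run(run_all(rows))
print(results)
```

Inside Jupyter, where an event loop is typically already running, `asyncio.run` cannot be used; there `await run_all(rows)` in a cell is the usual alternative.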
