Visualizations In Daft #1169

Closed Answered by jaychia
RagingTiger asked this question in Q&A

Hi @RagingTiger!

To interact with libraries such as Matplotlib, it's easiest to use Pandas as an interchange format. Because Daft uses Arrow as a backend, the conversion from a Daft dataframe into a Pandas dataframe with .to_pandas() is very cheap.

You can think of the workflow as:

  1. Use Daft for data heavy-lifting:
    1. Read data
    2. Process data
    3. Perform aggregations
  2. Now that your dataframe is processed and aggregated, it is small enough to pull into memory on your driver machine as a Pandas dataframe, which you can then visualize with libraries such as Matplotlib or seaborn.

Here is some sample code:

import daft

# Do work in Daft
df = daft.read_parquet(...)
df = df.with_column(...)
df = df.agg(...)

# Execute
df.collect(…

Answer selected by RagingTiger