Skip to content

Conversation

@pared
Copy link
Contributor

@pared pared commented Sep 20, 2021

Per treeverse/dvc#6431 (specs)

What we need to document:

images support
plots modify will not have an effect on image plots (not restricting it as we support dirs, which can have mixed content)

closes #2807

@pared pared changed the title Plots images plots: document images support Sep 20, 2021
@pared pared changed the title plots: document images support [WIP] plots: document images support Sep 20, 2021
@shcheklein shcheklein temporarily deployed to dvc-org-plots-images-dxygj9x6v September 20, 2021 17:22 Inactive
Data series plots utilize [Vega-Lite](https://vega.github.io/vega-lite/) for
rendering. Vega-Lite is a declarative grammar for defining plots using JSON. The
plots can also be saved as SVG or PNG image filed from the browser. Image type
plots are rendered using `<img>` tag.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

out of curiosity - how do we deal with an image file from a past commit? when I do dvc plots diff? do we embed it into HTML, copy it nearby?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we just insert files with identical path into same <div> tag.:

Screenshot 2021-09-21 at 22 44 56

During treeverse/dvc#6603 I will try to make it look less terrible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, but where does it take the image itself from HEAD? the png file - how does it read from the Git history?

Copy link
Contributor Author

@pared pared Sep 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We read it as a binary file and later write the same way into the webpage.

It can become problematic (memory-wise) in case of multiple images and revision, so further development assumes
using cache URL instead of binary data (in cached plots use case). In the case of git, well this approach will probably have to remain.

related:
treeverse/dvc#6145 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shcheklein Not sure if you meant how images are collected or how they are sourced into the HTML. If you meant the latter, dvc plots diff now generates a directory that looks like:

dvc_plots
├── index.html
└── static
    ├── HEAD_image.png
    └── workspace_image.png

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @dberenbaum ! I'm not sure if then We read it as a binary file by Pawel is still relevant?

I was asking this because we'll need access to these data from the VS Code env. We are iterating right now on the requirements and I'm curious what kind of interface we can expect/build for it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment from @pared is relevant for Studio, but probably not for VSCode. We don't have any --show-json or other machine-readable output from dvc plots diff today for images, so either VSCode can read directly from the generated dvc_plots directory, or we will need to add some other option to provide the needed info.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine for us to start with dvc plots diff if it supports experiments/checkpoints. But it doesn't sound like a very efficient thing. Ideally yes, we'll need an interface to see the list of available things across commits/experiments and either path to those or a command that would stream them for us (if it's a Git history) and/or find another way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it doesn't sound like a very efficient thing.

Do you mean by compute, memory, or storage? @pared linked to the discussion about memory concerns above, where we can make optimizations if needed. We chose to make copies of the files in the dvc_plots directory so that the results are self-contained and can be shared, moved, etc. This seems like a case where the user requirements and downstream product requirements might not align well.

@dberenbaum
Copy link
Contributor

Related: treeverse/dvc#6431

@shcheklein shcheklein temporarily deployed to dvc-org-plots-images-dxygj9x6v September 21, 2021 22:02 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-plots-images-dxygj9x6v September 22, 2021 08:21 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-plots-images-dxygj9x6v September 22, 2021 09:17 Inactive
@pared pared changed the title [WIP] plots: document images support plots: document images support Sep 22, 2021
@pared pared requested a review from shcheklein September 22, 2021 09:23
Comment on lines +3 to 6
A set of commands to visualize and compare _plot metrics_:
[show](/doc/command-reference/plots/show),
[diff](/doc/command-reference/plots/diff), and
[modify](/doc/command-reference/plots/modify).
Copy link
Contributor

@jorgeorpinel jorgeorpinel Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should clarify here (in the usage block below) and in https://dvc.org/doc/command-reference/plots/modify that it only works with structured "data series" file plots.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorgeorpinel added context to modify
Also created:
treeverse/dvc#6742

Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few first quick copy edits (committing)...

@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-plots-images-dxygj9x6v September 27, 2021 03:00 Inactive
Co-authored-by: Restyled.io <commits@restyled.io>
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-plots-images-dxygj9x6v September 27, 2021 03:02 Inactive
Copy link
Contributor

@jorgeorpinel jorgeorpinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More copy edits (committing)

@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-plots-images-dxygj9x6v September 27, 2021 03:12 Inactive
Co-authored-by: Restyled.io <commits@restyled.io>
@jorgeorpinel jorgeorpinel temporarily deployed to dvc-org-plots-images-dxygj9x6v September 27, 2021 03:13 Inactive
@shcheklein shcheklein temporarily deployed to dvc-org-plots-images-dxygj9x6v October 4, 2021 16:28 Inactive
pared and others added 3 commits October 11, 2021 11:32
Co-authored-by: Jorge Orpinel <jorgeorpinel@users.noreply.github.com>
Co-authored-by: Jorge Orpinel <jorgeorpinel@users.noreply.github.com>
Co-authored-by: Jorge Orpinel <jorgeorpinel@users.noreply.github.com>
@shcheklein shcheklein temporarily deployed to dvc-org-plots-images-dxygj9x6v October 11, 2021 09:33 Inactive
@jorgeorpinel
Copy link
Contributor

Sorry for the delay and thanks @pared !

@jorgeorpinel jorgeorpinel merged commit d59a8c7 into master Oct 20, 2021
@jorgeorpinel jorgeorpinel deleted the plots_images branch October 20, 2021 16:20
Comment on lines +40 to +43
DVC can work with two types of plots files:

In contrast to `dvc metrics`, these metrics should be stored as data series.
Unlike its `dvc metrics` counterpart, `dvc plots diff` cannot calculate numeric
differences between the metrics in different experiments.
1. Data series files, which can be JSON, YAML, CSV or TSV.
2. Image files in JPEG, GIF, or PNG format.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know what we're going to need to revisit this explanation. I think... We already have 2 types of metrics in this same doc so this is going to be confusing... I'm not sure how or where else to present this feature so that it's clearer though 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe the handling of different types of plots should be explained in a separate document?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah not sure... Maybe the types of metrics (metrics vs. plots) shouldn't be there. Moved to #2956

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

plots: document images support

5 participants