regular updates and fixes (early September) #601

Merged
merged 28 commits into from
Sep 6, 2019
Commits
ff3e9d6
term: change `ENV` for env(ironment) in contributing user guide
jorgeorpinel Sep 1, 2019
d86008a
remove: update help output
jorgeorpinel Sep 1, 2019
c38e6ce
get/import: add link and clarify HEAD `--rev` option default
jorgeorpinel Sep 1, 2019
07fb5b0
changelog: remove extra space in changelog/0.35
jorgeorpinel Sep 1, 2019
5bb7ede
get-started: reformat add-files
jorgeorpinel Sep 1, 2019
8ec0d54
docs: review usage of "DVC" branding of terms (1)
jorgeorpinel Sep 2, 2019
0413387
term: "remote cache" -> "remote storage"
jorgeorpinel Sep 2, 2019
182ff1b
term: review usage of "DVC" branding (3) through static/docs/commands…
jorgeorpinel Sep 2, 2019
26d3692
term: most "local cache" -> "cache directory" / "project cache"
jorgeorpinel Sep 2, 2019
fa93646
term: data set -> dataset
jorgeorpinel Sep 2, 2019
e91df63
term: "run(s)/ran again" -> "regenerate" (repro context)
jorgeorpinel Sep 2, 2019
903cea3
Merge branch 'master' into jorgeorpinel
jorgeorpinel Sep 2, 2019
b022c85
term: review usage of "dependency graph" (and related), "DAG", and
jorgeorpinel Sep 3, 2019
1665812
cmd ref: update "Data and pipelines are up to date." phrase
jorgeorpinel Sep 3, 2019
492cfc6
term: improve usage of "regenreate" and "execute" for stages/pipeline…
jorgeorpinel Sep 3, 2019
d81791d
term: reduse usage of "again", especially in the contest of `dvc repro`
jorgeorpinel Sep 3, 2019
65fbec3
glossary: update "workspace" term, and improve related user-guide des…
jorgeorpinel Sep 4, 2019
84c42fe
Merge branch 'master' into jorgeorpinel
jorgeorpinel Sep 4, 2019
3f7884b
term: stop using glossary entry "cache directory", related updates
jorgeorpinel Sep 5, 2019
3c0db9f
user-guide: link "cache directory" term where appropriate
jorgeorpinel Sep 5, 2019
5fe82e7
cmd ref: change from HEAD to "tip of default branch" in --rev option …
jorgeorpinel Sep 5, 2019
a5019ef
get-started: reword stage file commands explanation
jorgeorpinel Sep 5, 2019
46aa961
cmd ref: fix closing `)` in run and hyphenate "non-deterministic" in …
jorgeorpinel Sep 5, 2019
8753d05
cmd ref: explain outputs better in `add`
jorgeorpinel Sep 5, 2019
e4ce024
comlpemenet last commit
jorgeorpinel Sep 5, 2019
62742ed
term: review DVC branding up to static/docs/commands-reference/metrics
jorgeorpinel Sep 5, 2019
f9ab91a
term: review "runs" throughout
jorgeorpinel Sep 5, 2019
03eaa8b
term: review usage of "data remote" and include "remote storage" more
jorgeorpinel Sep 5, 2019
4 changes: 2 additions & 2 deletions static/docs/commands-reference/import-url.md
@@ -326,7 +326,7 @@ Saving information to 'data.xml.dvc'.

DVC has noticed the "external" data source has changed, and updated the import
stage (reproduced it). In this case it's also necessary to run `dvc repro` so
-that the rest of the pipeline is also run again. We can confirm so with:
+that the rest of the pipeline is also regenerated. We can confirm so with:

```dvc
$ dvc status
@@ -348,6 +348,6 @@ $ dvc status
Data and pipelines are up to date.
```

-`dvc repro` runs again the given stage `prepare.dvc`, noticing that its
+`dvc repro` regenerates the given `prepare.dvc` stage, noticing that its
dependency `data/data.xml` has changed. `dvc status` should report "Nothing to
reproduce." after this.
6 changes: 3 additions & 3 deletions static/docs/commands-reference/install.md
@@ -285,6 +285,6 @@ Data and pipelines are up to date.

After reproducing this pipeline up to the "evaluate" stage, the data files are
in sync with the code/config files, but we must now commit the changes to the
-Git repository. Looking closely we see that `dvc status` is run again, informing
-us that the data files are synchronized with the `Pipelines are up to date.`
-message.
+Git repository. Looking closely we see that `dvc status` is used again,
+informing us that the data files are synchronized with the
+`Pipelines are up to date.` message.
18 changes: 9 additions & 9 deletions static/docs/commands-reference/repro.md
@@ -1,9 +1,9 @@
# repro

-Run again commands recorded in the [stages](/doc/commands-reference/run) of one
-or more [pipelines](/doc/commands-reference/pipeline), in the correct order. The
-commands to be run are determined by recursively analyzing target stages and
-changes in their dependencies.
+Regenerate [stages](/doc/commands-reference/run) of one or more
+[pipelines](/doc/commands-reference/pipeline) by executing commands recorded in
+them again, in the correct order. The commands to be executed are determined by
+recursively analyzing target stages and changes in their dependencies.

## Synopsis

@@ -24,7 +24,7 @@ positional arguments:
<abbr>project</abbr>. (A pipeline is typically defined using the `dvc run`
command, while data input nodes are defined by the `dvc add` command.)

-There's a few ways to restrict the stages that will be run again by this
+There's a few ways to restrict the stages that will be regenerated by this
command: by specifying stage file `targets`, or by using the `--single-item`,
`--cwd`, or other options.

@@ -92,13 +92,13 @@ specified), and updates stage files with the new checksum information.
`requirements.txt`, we can specify it only once in `A`, omitting it in `B` and
`C`. To be precise , it reproduces all descendants of a changed stage or the
stages following the changed stage, even if their direct dependencies did not
-change. Like with the same option on `dvc run`, this is a way to force stages
-without changes to run again. This can also be useful for pipelines containing
-stages that produce nondeterministic (semi-random) outputs. For
+change. Like with the same option on `dvc run`, this is a way to force
+regenerating stages without changes. This can also be useful for pipelines
+containing stages that produce nondeterministic (semi-random) outputs. For
nondeterministic stages the outputs can vary on each execution, meaning the
cache cannot be trusted for such stages.

-- `--downstream` - only run again the stages after the given `targets` in their
+- `--downstream` - only regenerate the stages after the given `targets` in their
corresponding pipelines, including the target stages themselves.

- `-h`, `--help` - prints the usage/help message, and exit.
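
The `--downstream` wording in the hunk above (regenerate the given targets plus the stages after them in the pipeline) can be pictured as a walk over the stage graph. The following is only a minimal sketch in Python under stated assumptions: the four-stage pipeline and the names `DEPENDS_ON`, `downstream`, and `stages_to_regenerate` are hypothetical illustrations, not DVC's internals or API.

```python
from collections import deque

# Hypothetical toy pipeline: each stage maps to the stages it depends on.
DEPENDS_ON = {
    "prepare": [],
    "featurize": ["prepare"],
    "train": ["featurize"],
    "evaluate": ["train"],
}


def downstream(stage, depends_on):
    """Return the given stage plus every stage that depends on it, directly or indirectly."""
    # Invert the edges so we can walk from a stage to its consumers.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk starting at the target stage itself.
    seen = {stage}
    queue = deque([stage])
    while queue:
        current = queue.popleft()
        for consumer in dependents[current]:
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    return seen


def stages_to_regenerate(targets, depends_on):
    """Union of the downstream sets of all given target stages."""
    selected = set()
    for target in targets:
        selected |= downstream(target, depends_on)
    return selected
```

In this toy model, `stages_to_regenerate(["featurize"], DEPENDS_ON)` selects `featurize`, `train`, and `evaluate` but leaves `prepare` alone, which mirrors "including the target stages themselves" in the new wording.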
2 changes: 1 addition & 1 deletion static/docs/commands-reference/status.md
@@ -50,7 +50,7 @@ Data and pipelines are up to date.
```

This indicates that no differences were detected, and therefore no stages would
-be run again by `dvc repro`.
+be regenerated by `dvc repro`.

If instead, differences are detected, `dvc status` lists those changes. For each
DVC-file (stage) with differences, the changes in _dependencies_ and/or
2 changes: 1 addition & 1 deletion static/docs/tutorial/define-ml-pipeline.md
@@ -398,4 +398,4 @@ focus is DVC, not ML modeling and we use a relatively small dataset without any
advanced ML techniques.

In the next chapter we will try to improve the metrics by changing our modeling
-code and using reproducibility in our pipeline regeneration.
+code and using reproducibility in our pipeline.
3 changes: 2 additions & 1 deletion static/docs/tutorial/reproducibility.md
@@ -86,7 +86,8 @@ Reproducing 'Dvcfile':

The process started with the feature creation stage because one of its
parameters was changed — the edited source code file `code/featurization.py`.
-All dependent stages were ran again as well.
+All dependent stages were regenerated as well. (See `--downstream` option in
+`dvc repro`.)

Let’s take a look at the metric’s change. The improvement is close to zero
(+0.0075% to be precise):