Commit

api ref: address another round of feedback on open fn
per iterative#908 (review)
and several other comments
jorgeorpinel committed Mar 8, 2020
1 parent f245cc8 commit 6bd2740
Showing 3 changed files with 39 additions and 40 deletions.
2 changes: 1 addition & 1 deletion public/static/docs/api-reference/get_url.md
@@ -1,4 +1,4 @@
# get_url()
# dvc.api.get_url()

Returns the URL to the storage location of a data file or directory tracked in a
<abbr>DVC project</abbr>.
56 changes: 28 additions & 28 deletions public/static/docs/api-reference/open.md
@@ -1,14 +1,14 @@
# open()
# dvc.api.open()

Opens a tracked file.

```py
dvc.api.open(path: str,
repo: str = None,
rev: str = None,
remote: str = None,
mode: str = "r",
encoding: str = None)
def open(path: str,
repo: str = None,
rev: str = None,
remote: str = None,
mode: str = "r",
encoding: str = None)
```

#### Usage:
@@ -25,9 +25,10 @@ with dvc.api.open(

## Description

Open file or model (`path`) tracked in a <abbr>DVC project</abbr> (by DVC or
Git), and generate a corresponding
[file object](https://docs.python.org/3/glossary.html#term-file-object).
Open a data or model file tracked in a <abbr>DVC project</abbr> and generate a
corresponding
[file object](https://docs.python.org/3/glossary.html#term-file-object). The
file can be tracked by DVC or by Git.

> The exact type of file object depends on the `mode` used. For more details,
> please refer to Python's
@@ -38,12 +39,13 @@ Git), and generate a corresponding
[context manager](https://www.python.org/dev/peps/pep-0343/#context-managers-in-the-standard-library)
(using the `with` keyword, as shown in the examples).

> Use `dvc.api.read()` to get the file's contents directly – no _context
> manager_ involved.
> Use `dvc.api.read()` to get the complete file contents in a single function
> call – no _context manager_ involved.
This function reads (streams) the file through a direct connection to the storage
whenever possible, so it does not require any space on the disk to save the file
before making it accessible. The only exception is when using a Google Drive
This function makes a direct connection to the storage most of the time, so the
file contents can be streamed as they are read (which requires an active network
connection). This means it does not require space on the disk to save the file
before making it accessible. The only exception is when using Google Drive as
[remote type](/doc/command-reference/remote/add#supported-storage-types).
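
As a rough illustration of what streaming buys you, the sketch below processes a file object one line at a time instead of loading it whole. It is only a sketch under assumptions: `count_lines` is a hypothetical helper, and `io.StringIO` stands in for the file object that `dvc.api.open()` would provide, so the snippet runs without DVC or network access.

```py
import io


def count_lines(fd):
    """Count lines by iterating over a file object, one line at a time."""
    return sum(1 for _ in fd)


# io.StringIO is a stand-in here so the sketch runs without DVC or a network.
# With DVC installed, the same helper would be used roughly like this
# (the path and repo URL are placeholders, not real locations):
#
#   with dvc.api.open("path/to/data.csv", repo="<repo URL>") as fd:
#       total = count_lines(fd)
with io.StringIO("id,label\n1,cat\n2,dog\n") as fd:
    total = count_lines(fd)

print(total)
```

Because the helper only iterates over the file object, it never needs the complete contents in memory at once, which is the main reason to prefer a file object over a single read for large files.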

## Parameters
@@ -86,13 +88,11 @@ before making it accessible. The only exception is when using a Google Drive

- `dvc.exceptions.NoRemoteError` - no `remote` is found.

## Example: Use data tracked in a DVC repository online
## Example: Use data or models from DVC repositories online

Any <abbr>data artifact</abbr> can be employed directly in your Python app by
using this API.

For example, an XML file from a public DVC repo online can be processed directly
in your Python app with:
using this API. For example, an XML file tracked in a public DVC repo on GitHub
can be processed directly in your Python app with:

```py
from xml.dom.minidom import parse
@@ -106,8 +106,8 @@ with dvc.api.open(
# ... Process DOM
```

> Notice that you could read the contents of a tracked file faster with
> `dvc.api.read()`:
> Notice that if you just need to load the complete file contents to memory, you
> can use `dvc.api.read()` instead:
>
> ```py
> xmldata = dvc.api.read('get-started/data.xml',
@@ -116,7 +116,7 @@ with dvc.api.open(
> ```
Now let's imagine you want to deserialize and use a binary model from a private
repo online. For a case like this, we can use a SSH URL instead (assuming the
repo. For a case like this, we can use an SSH URL instead (assuming the
[credentials are configured](https://help.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh)
locally):
@@ -132,7 +132,7 @@ with dvc.api.open(
# ... Use instantiated model
```
## Example: Use other versions of data or results
## Example: Use different versions of data

The `rev` argument lets you specify any Git commit to look for an artifact. This
way any previous version, or alternative experiment can be accessed
@@ -151,10 +151,10 @@ with dvc.api.open(
# ... Read clean data from version 1.1.0
```

Also, notice that in this case we didn't supply a `repo` argument in this
example. DVC will attempt to find a <abbr>DVC project</abbr> to use in the
current working directory tree, and look for the file contents of `clean.csv` in
its local <abbr>cache</abbr>; no download will happen if found. See the
Also, notice that we didn't supply a `repo` argument in this example. DVC will
attempt to find a <abbr>DVC project</abbr> to use in the current working
directory tree, and look for the file contents of `clean.csv` in its local
<abbr>cache</abbr>; no download will happen if found. See the
[Parameters](#parameters) section for more info.

Note: to specify the file encoding of a text file, use:
21 changes: 10 additions & 11 deletions public/static/docs/api-reference/read.md
@@ -1,14 +1,14 @@
# read()
# dvc.api.read()

Returns the contents of a tracked file.

```py
dvc.api.read(path: str,
repo: str = None,
rev: str = None,
remote: str = None,
mode: str = "r",
encoding: str = None)
def read(path: str,
repo: str = None,
rev: str = None,
remote: str = None,
mode: str = "r",
encoding: str = None)
```

#### Usage:
@@ -24,10 +24,9 @@ modelpkl = dvc.api.read(

## Description

This function wraps [`dvc.api.open()`](/doc/api-reference/open) for a simple and
direct way to return the complete contents of files tracked in <abbr>DVC
projects</abbr> (by DVC or Git). If the file cannot be found, a
`PathMissingError` is raised.
This function wraps [`dvc.api.open()`](/doc/api-reference/open), for a simple
way to return the complete contents of a file tracked in a <abbr>DVC
project</abbr>. The file can be tracked by DVC or by Git.
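
The wrapping relationship can be pictured with a minimal sketch. This is not DVC's actual implementation: `read_all` and `fake_open` are hypothetical names, and `fake_open` is an in-memory stand-in for `dvc.api.open()` so the snippet runs without DVC installed.

```py
import io
from contextlib import contextmanager


def read_all(opener, path, **kwargs):
    """Open `path` with `opener` (a context manager factory) and return everything."""
    with opener(path, **kwargs) as fd:
        return fd.read()


@contextmanager
def fake_open(path, mode="r", **kwargs):
    # Hypothetical stand-in for dvc.api.open(); serves fixed in-memory content.
    data = "<root/>"
    fd = io.BytesIO(data.encode()) if "b" in mode else io.StringIO(data)
    try:
        yield fd
    finally:
        fd.close()


text = read_all(fake_open, "data.xml")            # text mode gives a str
raw = read_all(fake_open, "data.xml", mode="rb")  # binary mode gives bytes
```

The point of the sketch is only that a "read" helper can be a thin layer over an "open" context manager, with the `mode` argument deciding whether text or binary contents come back.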

The returned contents can be a
[string](https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str)
