Skip to content

add optional arguments to add_str() #287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 8 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
121 changes: 117 additions & 4 deletions ipfshttpclient/client/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -240,7 +240,15 @@ def apply_workarounds(self):

@utils.return_field('Hash')
@base.returns_single_item(dict)
def add_bytes(self, data: bytes, **kwargs):
def add_bytes(self, data: bytes,
trickle: bool = False,
only_hash: bool = False,
chunker: ty.Optional[str] = None,
pin: bool = True,
raw_leaves: bool = None,
cid_version: ty.Optional[int] = None,
**kwargs: base.CommonArgs) -> str:

"""Adds a set of bytes as a file to IPFS.

.. code-block:: python
Expand All @@ -254,19 +262,57 @@ def add_bytes(self, data: bytes, **kwargs):
----------
data
Content to be added as a file
trickle
Use trickle-dag format (optimized for streaming) when generating
the dag; see `the old FAQ <https://github.com/ipfs/faq/issues/218>`_
for more information
only_hash
Only chunk and hash, but do not write to disk
chunker
The chunking algorithm to use
pin
Pin this object when adding
raw_leaves
Use raw blocks for leaf nodes. (experimental). (Default: ``True``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the rationale for enabling the experimental raw blocks feature when nocopy is enabled? Since it's experimental, can raw blocks be left disabled by default even when nocopy is being used?

Note - I'm not yet suggesting the behavior must change, but rather seeing to understand the choice first.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no rationale, I'm working by copy/pasting here. I was just trying to scratch an itch (e.g. why add() has only_hash=True but add_str() did not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we need @ntninja to provide some more background on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raw_leaves is a prerequisite for nocopy on the daemon side because of how no-copy blocks are stored there.

Also I already asked @ftyers to remove the nocopy argument because it does not make sense for content not added from local files or URLs. So all that is really needed here is:

  1. Remove any mentioning of nocopy in the docs (the entire Default: part can go)
  2. Make the argument default to False instead of None
  3. Change "raw-leaves": raw_leaves if raw_leaves is not None else nocopy, to just "raw-leaves": raw_leaves, below (that should also fix the test failure)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is raw blocks still an experimental feature? (ignoring that the entire IPFS is v0.x lol)

Is some adjustment needed to both this new docstring and the existing docstrings mentioning raw blocks / raw leaves?

when *nocopy* is True, or ``False`` otherwise)
cid_version
CID version. Default value is provided by IPFS daemon. (experimental)

Returns
-------
str
Hash of the added IPFS object
"""

opts: ty.Dict[str, ty.Union[str, bool]] = {
"trickle": trickle,
"only-hash": only_hash,
"pin": pin,
"raw-leaves": raw_leaves if raw_leaves is not None else nocopy,
}
for option_name, option_value in [
("chunker", chunker),
("cid-version", cid_version),
]:
if option_value is not None:
opts[option_name] = option_value
kwargs.setdefault("opts", {}).update(opts)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid mutating kwargs and copy into a new dict instead?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copied the implementation from files.py, so presumably it works. presumably what you ask can be done, I don't know how to do it. But if you provide the code I am happy to copy/paste.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It "works" because changes to the kwargs dict - when declared as **kwargs - don't mutate the values of the variables at the call site in the function that called the function declaring **kwargs - however if it was declared as kwargs (no asterisks) you would mutate the call site's dictionary.

It's bad style - mutating mutable arguments can cause sneaky defects in functions that call the function doing the mutating (unless the documentation or name of the function states or strongly suggests that's the expected behavior).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that if you refactor this into a new function that returns dict (as mentioned in another comment) the issue of kwargs being mutated disappears.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c0llab0rat0r: I don't believe that mutating the kwargs is a problem, since it is always a shallow copy of the caller's provided arguments.

I've seen plenty of code doing things like:

def func(a, *a, **kw):
    xtra_arg = kw.pop("xtra", "default")
    ...

This was/is a very common trick to emulate kw-only args with writing Python 2 compatible code. Until we dropped Python 2 support py-ipfs-api-client had similar code itself…

Note: kwargs is the exception here, not the rule. You are right that in general mutating arguments like this is indeed very bad unless the function/method is clearly meant to do so.


body, headers = multipart.stream_bytes(data, chunk_size=self.chunk_size)
return self._client.request('/add', decoder='json',
data=body, headers=headers, **kwargs)

@utils.return_field('Hash')
@base.returns_single_item(dict)
def add_str(self, string, **kwargs):
def add_str(self, string,
trickle: bool = False,
only_hash: bool = False,
chunker: ty.Optional[str] = None,
pin: bool = True,
raw_leaves: bool = None,
cid_version: ty.Optional[int] = None,
**kwargs: base.CommonArgs) -> str:

"""Adds a Python string as a file to IPFS.

.. code-block:: python
Expand All @@ -280,17 +326,54 @@ def add_str(self, string, **kwargs):
----------
string : str
Content to be added as a file
trickle
Use trickle-dag format (optimized for streaming) when generating
the dag; see `the old FAQ <https://github.com/ipfs/faq/issues/218>`_
for more information
only_hash
Only chunk and hash, but do not write to disk
chunker
The chunking algorithm to use
pin
Pin this object when adding
raw_leaves
Use raw blocks for leaf nodes. (experimental). (Default: ``True``
when *nocopy* is True, or ``False`` otherwise)
cid_version
CID version. Default value is provided by IPFS daemon. (experimental)

Returns
-------
str
Hash of the added IPFS object
"""

opts: ty.Dict[str, ty.Union[str, bool]] = {
"trickle": trickle,
"only-hash": only_hash,
"pin": pin,
"raw-leaves": raw_leaves if raw_leaves is not None else nocopy,
}
for option_name, option_value in [
("chunker", chunker),
("cid-version", cid_version),
]:
if option_value is not None:
opts[option_name] = option_value
kwargs.setdefault("opts", {}).update(opts)

body, headers = multipart.stream_text(string, chunk_size=self.chunk_size)
return self._client.request('/add', decoder='json',
data=body, headers=headers, **kwargs)

def add_json(self, json_obj, **kwargs):
def add_json(self, json_obj,
trickle: bool = False,
only_hash: bool = False,
chunker: ty.Optional[str] = None,
pin: bool = True,
raw_leaves: bool = None,
cid_version: ty.Optional[int] = None,
**kwargs: base.CommonArgs) -> str:
"""Adds a json-serializable Python dict as a json file to IPFS.

.. code-block:: python
Expand All @@ -302,12 +385,42 @@ def add_json(self, json_obj, **kwargs):
----------
json_obj : dict
A json-serializable Python dictionary
trickle
Use trickle-dag format (optimized for streaming) when generating
the dag; see `the old FAQ <https://github.com/ipfs/faq/issues/218>`_
for more information
only_hash
Only chunk and hash, but do not write to disk
chunker
The chunking algorithm to use
pin
Pin this object when adding
raw_leaves
Use raw blocks for leaf nodes. (experimental). (Default: ``True``
when *nocopy* is True, or ``False`` otherwise)
cid_version
CID version. Default value is provided by IPFS daemon. (experimental)


Returns
-------
str
Hash of the added IPFS object
"""
opts: ty.Dict[str, ty.Union[str, bool]] = {
"trickle": trickle,
"only-hash": only_hash,
"pin": pin,
"raw-leaves": raw_leaves if raw_leaves is not None else nocopy,
}
for option_name, option_value in [
("chunker", chunker),
("cid-version", cid_version),
]:
if option_value is not None:
opts[option_name] = option_value
kwargs.setdefault("opts", {}).update(opts)

return self.add_bytes(encoding.Json().encode(json_obj), **kwargs)


Expand All @@ -330,4 +443,4 @@ def get_json(self, cid, **kwargs):
object
Deserialized IPFS JSON object value
"""
return self.cat(cid, decoder='json', **kwargs)
return self.cat(cid, decoder='json', **kwargs)