Revert obspec dependency #367

Merged
merged 2 commits into from
Mar 17, 2025
4 changes: 4 additions & 0 deletions docs/api/attributes.md
@@ -0,0 +1,4 @@
# Attributes

::: obstore.Attribute
::: obstore.Attributes
3 changes: 3 additions & 0 deletions docs/api/get.md
@@ -6,6 +6,9 @@
::: obstore.get_range_async
::: obstore.get_ranges
::: obstore.get_ranges_async
::: obstore.GetOptions
::: obstore.GetResult
::: obstore.BytesStream
::: obstore.Bytes
::: obstore.OffsetRange
::: obstore.SuffixRange
4 changes: 4 additions & 0 deletions docs/api/list.md
@@ -3,3 +3,7 @@
::: obstore.list
::: obstore.list_with_delimiter
::: obstore.list_with_delimiter_async
::: obstore.ObjectMeta
::: obstore.ListResult
::: obstore.ListStream
::: obstore.ListChunkType
3 changes: 3 additions & 0 deletions docs/api/put.md
@@ -2,3 +2,6 @@

::: obstore.put
::: obstore.put_async
::: obstore.PutResult
::: obstore.UpdateVersion
::: obstore.PutMode
2 changes: 1 addition & 1 deletion docs/blog/posts/obstore-0.4.md
@@ -72,7 +72,7 @@ Obstore version 0.5 is expected to improve on extensible credentials by enabling

## Return Arrow data from `list_with_delimiter`

By default, the [`obstore.list`][] and [`obstore.list_with_delimiter`][] APIs [return standard Python `dict`s][obspec.ObjectMeta]. However, if you're listing a large bucket, the overhead of materializing all those Python objects can become significant.
By default, the [`obstore.list`][] and [`obstore.list_with_delimiter`][] APIs [return standard Python `dict`s][obstore.ObjectMeta]. However, if you're listing a large bucket, the overhead of materializing all those Python objects can become significant.

[`obstore.list`][] and [`obstore.list_with_delimiter`][] now both support a `return_arrow` keyword parameter. If set to `True`, an Arrow [`RecordBatch`][arro3.core.RecordBatch] or [`Table`][arro3.core.Table] will be returned, which is both faster and more memory efficient.

2 changes: 1 addition & 1 deletion mkdocs.yml
@@ -63,6 +63,7 @@ nav:
- api/put.md
- api/rename.md
- api/sign.md
- api/attributes.md
- api/exceptions.md
- api/file.md
- obstore.fsspec: api/fsspec.md
@@ -157,7 +158,6 @@ plugins:
- https://arrow.apache.org/docs/objects.inv
- https://boto3.amazonaws.com/v1/documentation/api/latest/objects.inv
- https://botocore.amazonaws.com/v1/documentation/api/latest/objects.inv
- https://developmentseed.org/obspec/latest/objects.inv
- https://docs.aiohttp.org/en/stable/objects.inv
- https://docs.pola.rs/api/python/stable/objects.inv
- https://docs.python.org/3/objects.inv
31 changes: 2 additions & 29 deletions obstore/python/obstore/__init__.py
@@ -1,35 +1,8 @@
from typing import TYPE_CHECKING

from . import store
from ._obstore import (
Bytes,
___version,
copy,
copy_async,
delete,
delete_async,
get,
get_async,
get_range,
get_range_async,
get_ranges,
get_ranges_async,
head,
head_async,
list, # noqa: A004
list_with_delimiter,
list_with_delimiter_async,
open_reader,
open_reader_async,
open_writer,
open_writer_async,
put,
put_async,
rename,
rename_async,
sign,
sign_async,
)
from ._obstore import *
from ._obstore import ___version

if TYPE_CHECKING:
from . import _store, exceptions
47 changes: 47 additions & 0 deletions obstore/python/obstore/_attributes.pyi
@@ -0,0 +1,47 @@
from typing import Literal, TypeAlias

Attribute: TypeAlias = (
Literal[
"Content-Disposition",
"Content-Encoding",
"Content-Language",
"Content-Type",
"Cache-Control",
]
| str
)
"""Additional object attribute types.

- `"Content-Disposition"`: Specifies how the object should be handled by a browser.

See [Content-Disposition](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition).

- `"Content-Encoding"`: Specifies the encodings applied to the object.

See [Content-Encoding](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Encoding).

- `"Content-Language"`: Specifies the language of the object.

See [Content-Language](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Language).

- `"Content-Type"`: Specifies the MIME type of the object.

This takes precedence over any client configuration.

See [Content-Type](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type).

- `"Cache-Control"`: Overrides cache control policy of the object.

See [Cache-Control](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control).

Any other string key specifies a user-defined metadata field for the object.
"""

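To make the split between well-known keys and user-defined metadata concrete, here is a minimal sketch. It assumes only the five keys listed in the docstring above; `WELL_KNOWN_ATTRIBUTES` and `split_attributes` are hypothetical names used for illustration and are not part of obstore.

```python
# Hypothetical illustration, not obstore API: the five well-known attribute
# keys named in the `Attribute` docstring.
WELL_KNOWN_ATTRIBUTES = frozenset(
    {
        "Content-Disposition",
        "Content-Encoding",
        "Content-Language",
        "Content-Type",
        "Cache-Control",
    }
)


def split_attributes(attributes: dict) -> tuple:
    """Separate well-known attribute keys from user-defined metadata keys."""
    well_known = {}
    user_defined = {}
    for key, value in attributes.items():
        # Any key outside the well-known set is treated as user metadata.
        target = well_known if key in WELL_KNOWN_ATTRIBUTES else user_defined
        target[key] = value
    return well_known, user_defined


attrs = {
    "Content-Type": "application/json",
    "Cache-Control": "max-age=3600",
    "owner": "data-team",
}
well_known, user_defined = split_attributes(attrs)
print(well_known)
print(user_defined)
```

A dictionary shaped like `attrs` is what the `Attributes` alias describes: string keys mapped to string values.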
Attributes: TypeAlias = dict[Attribute, str]
"""Additional attributes of an object.

Attributes can be specified in [`put`][obstore.put]/[`put_async`][obstore.put_async] and
retrieved from [`get`][obstore.get]/[`get_async`][obstore.get_async].

Unlike `ObjectMeta`, `Attributes` are not returned by listing APIs.
"""
4 changes: 1 addition & 3 deletions obstore/python/obstore/_buffered.pyi
@@ -2,9 +2,7 @@ import sys
from contextlib import AbstractAsyncContextManager, AbstractContextManager
from typing import Self

# TODO: fix import
from obspec._attributes import Attributes

from ._attributes import Attributes
from ._bytes import Bytes
from ._list import ObjectMeta
from ._store import ObjectStore
104 changes: 93 additions & 11 deletions obstore/python/obstore/_get.pyi
@@ -1,12 +1,100 @@
from collections.abc import Sequence
from datetime import datetime
from typing import TypedDict

# TODO: fix imports
from obspec._attributes import Attributes
from obspec._get import GetOptions

from ._attributes import Attributes
from ._bytes import Bytes
from ._list import ObjectMeta
from ._store import ObjectStore
from .store import ObjectStore

class OffsetRange(TypedDict):
"""Request all bytes starting from a given byte offset."""

offset: int
"""The byte offset for the offset range request."""

class SuffixRange(TypedDict):
"""Request up to the last `n` bytes."""

suffix: int
"""The number of bytes from the suffix to request."""

class GetOptions(TypedDict, total=False):
"""Options for a get request.

All options are optional.
"""

if_match: str | None
"""
Request will succeed if the `ObjectMeta::e_tag` matches
otherwise returning [`PreconditionError`][obstore.exceptions.PreconditionError].
See <https://datatracker.ietf.org/doc/html/rfc9110#name-if-match>
Examples:
```text
If-Match: "xyzzy"
If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-Match: *
```
"""

if_none_match: str | None
"""
Request will succeed if the `ObjectMeta::e_tag` does not match
otherwise returning [`NotModifiedError`][obstore.exceptions.NotModifiedError].
See <https://datatracker.ietf.org/doc/html/rfc9110#section-13.1.2>
Examples:
```text
If-None-Match: "xyzzy"
If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz"
If-None-Match: *
```
"""

if_modified_since: datetime | None
"""
Request will succeed if the object has been modified since
<https://datatracker.ietf.org/doc/html/rfc9110#section-13.1.3>
"""

if_unmodified_since: datetime | None
"""
Request will succeed if the object has not been modified since
otherwise returning [`PreconditionError`][obstore.exceptions.PreconditionError].
Some stores, such as S3, will only return `NotModified` for exact
timestamp matches, instead of for any timestamp greater than or equal.
<https://datatracker.ietf.org/doc/html/rfc9110#section-13.1.4>
"""

range: tuple[int, int] | list[int] | OffsetRange | SuffixRange
"""
Request transfer of only the specified range of bytes
otherwise returning [`NotModifiedError`][obstore.exceptions.NotModifiedError].
The semantics of this tuple are:
- `(int, int)`: Request a specific range of bytes `(start, end)`.
If the given range is zero-length or starts after the end of the object, an
error will be returned. Additionally, if the range ends after the end of the
object, the entire remainder of the object will be returned. Otherwise, the
exact requested range will be returned.
The `end` offset is _exclusive_.
- `{"offset": int}`: Request all bytes starting from a given byte offset.
This is equivalent to `bytes={int}-` as an HTTP header.
- `{"suffix": int}`: Request the last `int` bytes. Note that here, `int` is _the
size of the request_, not the byte offset. This is equivalent to `bytes=-{int}`
as an HTTP header.
<https://datatracker.ietf.org/doc/html/rfc9110#name-range>
"""

version: str | None
"""
Request a particular object version
"""

head: bool
"""
Request transfer of no content
<https://datatracker.ietf.org/doc/html/rfc9110#name-head>
"""

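The three `range` forms documented above map onto HTTP `Range` header values. The sketch below shows that mapping, assuming only the semantics stated in the docstring (exclusive `end`, `offset`, `suffix`). `range_to_http_header` is a hypothetical helper written for illustration; it is not an obstore function.

```python
def range_to_http_header(range_option) -> str:
    """Hypothetical helper: render a GetOptions-style range as a Range header value."""
    if isinstance(range_option, dict):
        if "offset" in range_option:
            # {"offset": n} -> everything from byte n to the end of the object.
            return f"bytes={range_option['offset']}-"
        if "suffix" in range_option:
            # {"suffix": n} -> the last n bytes; n is a size, not an offset.
            return f"bytes=-{range_option['suffix']}"
        raise ValueError("expected an 'offset' or 'suffix' key")

    start, end = range_option
    if end <= start:
        raise ValueError("range must be non-empty")
    # `end` is exclusive here, while HTTP byte ranges are inclusive on both
    # ends, so the last requested byte is end - 1.
    return f"bytes={start}-{end - 1}"


print(range_to_http_header((0, 100)))        # bytes=0-99
print(range_to_http_header({"offset": 100}))  # bytes=100-
print(range_to_http_header({"suffix": 16}))   # bytes=-16
```

The off-by-one in the tuple form is the detail most worth checking when comparing a `(start, end)` request against raw HTTP traffic.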
class GetResult:
"""Result for a get request.
@@ -36,9 +124,6 @@ class GetResult:

Note that after calling `bytes`, `bytes_async`, or `stream`, you will no longer be
able to call other methods on this object, such as the `meta` attribute.

This implements [`obspec.GetResult`][], but is redefined here to specialize the
exact instance of the `bytes` return type to be [`obstore.Bytes`][].
"""

@property
@@ -126,9 +211,6 @@ class BytesStream:

To fix this, set the `timeout` parameter in the
[`client_options`][obstore.store.ClientConfig] passed when creating the store.

This implements [`obspec.BufferStream`][], but is redefined here to specialize the
exact instance of the buffer return type to be [`obstore.Bytes`][].
"""

def __aiter__(self) -> BytesStream:
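The conditional fields of `GetOptions` (`if_match`, `if_none_match`, `if_modified_since`, `if_unmodified_since`) follow the precondition rules of RFC 9110, which the docstrings link to. The sketch below is a simplified model of that evaluation order, not obstore's implementation: `evaluate_conditions` and its string results are hypothetical, entity tags are compared as plain strings (weak validators are ignored), and the header is split naively on commas.

```python
from datetime import datetime, timezone


def _etag_matches(header_value: str, e_tag: str) -> bool:
    """Return True if e_tag appears in a comma-separated list, or the list is `*`."""
    if header_value.strip() == "*":
        return True
    return e_tag in (candidate.strip() for candidate in header_value.split(","))


def evaluate_conditions(
    e_tag,
    last_modified,
    *,
    if_match=None,
    if_none_match=None,
    if_modified_since=None,
    if_unmodified_since=None,
) -> str:
    """Hypothetical model of RFC 9110 precondition evaluation for a GET."""
    # If-Match takes precedence; If-Unmodified-Since is only consulted
    # when If-Match is absent.
    if if_match is not None:
        if not _etag_matches(if_match, e_tag):
            return "precondition_failed"
    elif if_unmodified_since is not None:
        if last_modified > if_unmodified_since:
            return "precondition_failed"

    # Likewise, If-Modified-Since is only consulted when If-None-Match is absent.
    if if_none_match is not None:
        if _etag_matches(if_none_match, e_tag):
            return "not_modified"
    elif if_modified_since is not None:
        if last_modified <= if_modified_since:
            return "not_modified"

    return "ok"


last_modified = datetime(2025, 3, 1, tzinfo=timezone.utc)
print(evaluate_conditions('"xyzzy"', last_modified, if_match='"xyzzy", "r2d2xxxx"'))
print(evaluate_conditions('"xyzzy"', last_modified, if_none_match="*"))
```

In obstore terms, the `"precondition_failed"` outcome corresponds to the documented `PreconditionError` and `"not_modified"` to `NotModifiedError`. As the docstring notes for S3, real stores may differ in how they compare timestamps.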
4 changes: 1 addition & 3 deletions obstore/python/obstore/_head.pyi
@@ -1,6 +1,4 @@
# TODO: fix import
from obspec._meta import ObjectMeta

from ._list import ObjectMeta
from .store import ObjectStore

def head(store: ObjectStore, path: str) -> ObjectMeta: