Commit 4ec63f0

Make special exception in referenceFS (#1120)
* Make special exception in referenceFS
  So that we distinguish between files not in the reference list and anything that goes wrong during the fetch.
* conda to mamba
* ditch tox
* yaml
* versions as strings
* try syntax
* dollars
* add deps to friends
* More dep parts
* env
1 parent 45d2301 commit 4ec63f0
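
The distinction described in the commit message can be sketched from the caller's side. The block below is a minimal, self-contained illustration and not fsspec's own code: `ReferenceNotReachable` is re-declared locally to mirror the class added in this commit, while `toy_cat_file`, `flaky_fetch`, `describe`, and the `mem://` targets are hypothetical names used only for this example.

```python
class ReferenceNotReachable(RuntimeError):
    """Local stand-in mirroring the exception added in this commit."""

    def __init__(self, reference, target, *args):
        super().__init__(*args)
        self.reference = reference
        self.target = target

    def __str__(self):
        return f'Reference "{self.reference}" failed to fetch target {self.target}'


def toy_cat_file(references, fetch, path):
    """Hypothetical lookup-then-fetch helper following the same two-step pattern."""
    try:
        target = references[path]
    except KeyError:
        # The path is not in the reference set at all.
        raise FileNotFoundError(path)
    try:
        return fetch(target)
    except Exception as e:
        # The path is known, but reading its target failed.
        raise ReferenceNotReachable(path, target) from e


def flaky_fetch(target):
    """Hypothetical backend: every target ending in '/b' fails."""
    if target.endswith("/b"):
        raise OSError("boom")
    return b"data"


def describe(references, path):
    """Show how a caller can tell the two failure modes apart."""
    try:
        return toy_cat_file(references, flaky_fetch, path)
    except FileNotFoundError:
        return "missing"
    except ReferenceNotReachable as e:
        return f"unreachable: {e.target} ({type(e.__cause__).__name__})"


references = {"a": "mem://bucket/a", "b": "mem://bucket/b"}
results = {p: describe(references, p) for p in ("a", "b", "zzz")}
```

Because `FileNotFoundError` derives from `OSError` and `ReferenceNotReachable` from `RuntimeError`, the two `except` clauses never shadow each other, and the original failure stays available through `__cause__`.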

11 files changed: +213 -165 lines changed
.github/workflows/main.yaml

Lines changed: 52 additions & 24 deletions
@@ -8,56 +8,47 @@ on:

 jobs:
   linux:
-    name: ${{ matrix.TOXENV }}-pytest
+    name: ${{ matrix.PY }}-pytest
     runs-on: ubuntu-latest
     strategy:
       fail-fast: false
       matrix:
-        TOXENV: [py38, py39, py310, s3fs, gcsfs]
+        PY: ["3.8", "3.9", "3.10"]

     env:
-      TOXENV: ${{ matrix.TOXENV }}
       CIRUN: true

     steps:
       - name: Checkout
         uses: actions/checkout@v2

-      - name: Setup Miniconda
-        uses: conda-incubator/setup-miniconda@v2
+      - name: Setup conda
+        uses: mamba-org/provision-with-micromamba@main
         with:
-          auto-update-conda: true
-          auto-activate-base: false
-          activate-environment: test_env
           environment-file: ci/environment-py38.yml
+          extra-specs: python=${{ matrix.PY }}

       - name: Run Tests
         shell: bash -l {0}
         run: |
-          tox -v
+          pytest -v

   win:
-    name: ${{ matrix.TOXENV }}-pytest-win
+    name: pytest-win
     runs-on: windows-2019
     strategy:
       fail-fast: false
-      matrix:
-        TOXENV: [py39]

     env:
-      TOXENV: ${{ matrix.TOXENV }}
       CIRUN: true

     steps:
       - name: Checkout
         uses: actions/checkout@v2

-      - name: Setup Miniconda
-        uses: conda-incubator/setup-miniconda@v2
+      - name: Setup conda
+        uses: mamba-org/provision-with-micromamba@main
         with:
-          auto-update-conda: true
-          auto-activate-base: false
-          activate-environment: test_env
           environment-file: ci/environment-win.yml

       - name: Run Tests
@@ -81,12 +72,9 @@ jobs:
       - name: Checkout
         uses: actions/checkout@v2

-      - name: Setup Miniconda
-        uses: conda-incubator/setup-miniconda@v2
+      - name: Setup conda
+        uses: mamba-org/provision-with-micromamba@main
         with:
-          auto-update-conda: true
-          auto-activate-base: false
-          activate-environment: test_env
           environment-file: ci/environment-downstream.yml

       - name: Local install
@@ -97,10 +85,13 @@ jobs:
           git tag -a 3000 -m "fake"
           pip install -e .

+      - name: Clone s3fs
+        shell: bash -l {0}
+        run: git clone https://github.com/fsspec/s3fs
+
       - name: Install s3fs
         shell: bash -l {0}
         run: |
-          git clone https://github.com/fsspec/s3fs
           pip install -e ./s3fs --no-deps

       - name: Run fsspec tests
@@ -117,3 +108,40 @@ jobs:
         shell: bash -l {0}
         run: |
           pytest -v dask/dask/bytes
+
+  fsspec_friends:
+    name: ${{ matrix.FRIEND }}-pytest
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        FRIEND: [gcsfs, s3fs]
+
+    env:
+      CIRUN: true
+      BOTO_CONFIG: /dev/null
+      AWS_ACCESS_KEY_ID: foobar_key
+      AWS_SECRET_ACCESS_KEY: foobar_secret
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v2
+
+      - name: Setup conda
+        uses: mamba-org/provision-with-micromamba@main
+        with:
+          environment-file: ci/environment-friends.yml
+
+      - name: Clone
+        shell: bash -l {0}
+        run: git clone https://github.com/fsspec/${{ matrix.FRIEND }}
+
+      - name: Install
+        shell: bash -l {0}
+        run: |
+          pip install -e . --no-deps
+          pip install -e ./${{ matrix.FRIEND }} --no-deps
+
+      - name: Test
+        shell: bash -l {0}
+        run: pytest -v ${{ matrix.FRIEND }}

README.md

Lines changed: 13 additions & 15 deletions
@@ -33,26 +33,25 @@ Please refer to [RTD](https://filesystem-spec.readthedocs.io/en/latest/?badge=la

 ## Develop

-fsspec uses [tox](https://tox.readthedocs.io/en/latest/) and
-[tox-conda](https://github.com/tox-dev/tox-conda) to manage dev and test
-environments. First, install conda with tox and tox-conda in a base environment
-(eg. ``conda install -c conda-forge tox tox-conda``). Calls to ``tox`` can then be
-used to configure a development environment and run tests.
-
-First, setup a development conda environment via ``tox -e {env}`` where ``env`` is one of ``{py38,py39,py310}``.
-This will install fsspec dependencies, test & dev tools, and install fsspec in develop
-mode. You may activate the dev environment under ``.tox/{env}`` via ``conda activate .tox/{env}``.
+fsspec uses GitHub Actions for CI. Environment files can be found
+in the "ci/" directory. Note that the main environment is called "py38",
+but it is expected that the version of python installed be adjustable at
+CI runtime. For local use, pick a version suitable for you.

 ### Testing

 Tests can be run in the dev environment, if activated, via ``pytest fsspec``.

-Alternatively, the full fsspec test suite can also be run via ``tox``, which will
-also build the appropriate environment (see above), with the environment specified
-by the TOXENV environment variable.
-
 The full fsspec suite requires a system-level docker, docker-compose, and fuse
-installation.
+installation. If only making changes to one backend implementation, it is
+not generally necessary to run all tests locally.
+
+It is expected that contributors ensure that any change to fsspec does not
+cause issues or regressions for either other fsspec-related packages such
+as gcsfs and s3fs, nor for downstream users of fsspec. The "downstream" CI
+run and corresponding environment file run a set of tests from the dask
+test suite, and very minimal tests against pandas and zarr from the test_downstream.py
+module in this repo.

 ### Code Formatting

@@ -62,7 +61,6 @@ Run ``black fsspec`` from the root of the filesystem_spec repository to
 auto-format your code. Additionally, many editors have plugins that will apply
 ``black`` as you edit files. ``black`` is included in the ``tox`` environments.

-
 Optionally, you may wish to setup [pre-commit hooks](https://pre-commit.com) to
 automatically run ``black`` when you make a git commit.
 Run ``pre-commit install --install-hooks`` from the root of the

ci/environment-downstream.yml

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
 name: test_env
 channels:
   - conda-forge
-  - defaults
 dependencies:
   - python=3.9
   - dask

ci/environment-friends.yml

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+name: test_env
+channels:
+  - conda-forge
+dependencies:
+  - python=3.9
+  - pytest
+  - pytest-asyncio
+  - pytest-benchmark
+  - pytest-cov
+  - pytest-mock
+  - pytest-vcr
+  - pip
+  - pytest
+  - ujson
+  - requests
+  - decorator
+  - google-auth
+  - aiohttp
+  - google-auth-oauthlib
+  - flake8
+  - black
+  - google-cloud-core
+  - google-api-core
+  - google-api-python-client
+  - httpretty
+  - aiobotocore
+  - "moto>=4"
+  - flask

ci/environment-py38.yml

Lines changed: 36 additions & 4 deletions
@@ -1,8 +1,40 @@
 name: test_env
 channels:
   - conda-forge
-  - defaults
 dependencies:
-  - python=3.8
-  - tox
-  - tox-conda
+  # - python=3.8 # set by env
+  - pip
+  - paramiko
+  - requests
+  - zstandard
+  - python-snappy
+  - aiohttp
+  - lz4
+  - distributed
+  - dask
+  - pyarrow
+  - panel
+  - notebook
+  - pygit2
+  - git
+  - s3fs
+  - pyftpdlib
+  - cloudpickle
+  - pytest
+  - pytest-asyncio
+  - pytest-benchmark
+  - pytest-cov
+  - pytest-mock
+  - pytest-vcr
+  - py
+  - fusepy
+  - tomli < 2
+  - msgpack-python
+  - python-libarchive-c
+  - numpy
+  - nomkl
+  - jinja2
+  - tqdm
+  - pip:
+    - hadoop-test-cluster
+    - smbprotocol

ci/environment-win.yml

Lines changed: 1 addition & 1 deletion
@@ -1,8 +1,8 @@
 name: test_env
 channels:
   - conda-forge
-  - defaults
 dependencies:
+  - python=3.9
   - aiohttp
   - pip
   - requests

fsspec/implementations/reference.py

Lines changed: 32 additions & 3 deletions
@@ -21,6 +21,16 @@
 logger = logging.getLogger("fsspec.reference")


+class ReferenceNotReachable(RuntimeError):
+    def __init__(self, reference, target, *args):
+        super().__init__(*args)
+        self.reference = reference
+        self.target = target
+
+    def __str__(self):
+        return f'Reference "{self.reference}" failed to fetch target {self.target}'
+
+
 def _first(d):
     return list(d.values())[0]

@@ -213,7 +223,10 @@ def loop(self):
     def _cat_common(self, path, start=None, end=None):
         path = self._strip_protocol(path)
         logger.debug(f"cat: {path}")
-        part = self.references[path]
+        try:
+            part = self.references[path]
+        except KeyError:
+            raise FileNotFoundError(path)
         if isinstance(part, str):
             part = part.encode()
         if isinstance(part, bytes):
@@ -254,15 +267,21 @@ async def _cat_file(self, path, start=None, end=None, **kwargs):
         if isinstance(part_or_url, bytes):
             return part_or_url[start:end]
         protocol, _ = split_protocol(part_or_url)
-        return await self.fss[protocol]._cat_file(part_or_url, start=start, end=end)
try:
271+
await self.fss[protocol]._cat_file(part_or_url, start=start, end=end)
272+
except Exception as e:
273+
raise ReferenceNotReachable(path, part_or_url) from e
258274

259275
def cat_file(self, path, start=None, end=None, **kwargs):
260276
part_or_url, start0, end0 = self._cat_common(path, start=start, end=end)
261277
if isinstance(part_or_url, bytes):
262278
return part_or_url[start:end]
263279
protocol, _ = split_protocol(part_or_url)
264280
# TODO: start and end should be passed to cat_file, not sliced
265-
return self.fss[protocol].cat_file(part_or_url, start=start0, end=end0)
281+
try:
282+
return self.fss[protocol].cat_file(part_or_url, start=start0, end=end0)
283+
except Exception as e:
284+
raise ReferenceNotReachable(path, part_or_url) from e
266285

267286
def pipe_file(self, path, value, **_):
268287
"""Temporarily add binary data or reference as a file"""
@@ -360,6 +379,16 @@ def cat(self, path, recursive=False, on_error="raise", **kwargs):
                     elif np == u and s >= ns and e <= ne:
                         out[p] = b[s - ns : (e - ne) or None]

+        for k, v in out.copy().items():
+            if isinstance(v, Exception):
+                ex = out[k]
+                new_ex = ReferenceNotReachable(k, self.references[k])
+                new_ex.__cause__ = ex
+                if on_error == "raise":
+                    raise new_ex
+                elif on_error != "omit":
+                    out[k] = new_ex
+
         if len(out) == 1 and isinstance(path, str) and "*" not in path:
             return _first(out)
         return out
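
The post-processing loop added to `cat` can be tried in isolation with the sketch below. It is an illustration under stated assumptions, not the library's implementation: `ReferenceNotReachable` is re-declared locally, `wrap_failures` is a hypothetical free function standing in for the loop, and the `"return"` mode name and `mem://` targets are chosen only for the example. As in the diff, an `on_error` value of `"omit"` leaves the mapping untouched here, on the assumption that failed entries were already dropped earlier in the call.

```python
class ReferenceNotReachable(RuntimeError):
    """Local stand-in mirroring the exception added in this commit."""

    def __init__(self, reference, target, *args):
        super().__init__(*args)
        self.reference = reference
        self.target = target

    def __str__(self):
        return f'Reference "{self.reference}" failed to fetch target {self.target}'


def wrap_failures(out, references, on_error="raise"):
    """Hypothetical helper: replace raw exceptions in `out` with wrapped ones."""
    # Iterate over a copy so that entries can be reassigned safely.
    for key, value in out.copy().items():
        if isinstance(value, Exception):
            new_ex = ReferenceNotReachable(key, references[key])
            new_ex.__cause__ = value  # keep the original failure attached
            if on_error == "raise":
                raise new_ex
            elif on_error != "omit":
                out[key] = new_ex
    return out


references = {"a": "mem://a", "b": "mem://b"}
fetched = {"a": b"data", "b": TimeoutError("slow")}

# Any mode other than "raise" or "omit" keeps the wrapped exception as the value.
returned = wrap_failures(dict(fetched), references, on_error="return")

try:
    wrap_failures(dict(fetched), references, on_error="raise")
    raised = None
except ReferenceNotReachable as e:
    raised = e
```

Wrapping at this point means bulk reads report which reference key failed and which target it pointed to, instead of surfacing only the backend's own error.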

fsspec/implementations/tests/test_cached.py

Lines changed: 6 additions & 4 deletions
@@ -884,7 +884,7 @@ def test_expiry():
     assert detail["time"] - start_time > 0.09


-def test_equality():
+def test_equality(tmpdir):
     """Test sane behaviour for equality and hashing.

     Make sure that different CachingFileSystem only test equal to each other
@@ -897,9 +897,11 @@ def test_equality():
     from fsspec.implementations.local import LocalFileSystem

     lfs = LocalFileSystem()
-    cfs1 = CachingFileSystem(fs=lfs, cache_storage="raspberry")
-    cfs2 = CachingFileSystem(fs=lfs, cache_storage="banana")
-    cfs3 = CachingFileSystem(fs=lfs, cache_storage="banana")
+    dir1 = f"{tmpdir}/raspberry"
+    dir2 = f"{tmpdir}/banana"
+    cfs1 = CachingFileSystem(fs=lfs, cache_storage=dir1)
+    cfs2 = CachingFileSystem(fs=lfs, cache_storage=dir2)
+    cfs3 = CachingFileSystem(fs=lfs, cache_storage=dir2)
     assert cfs1 == cfs1
     assert cfs1 != cfs2
     assert cfs1 != cfs3
