Commit
Add wget to mock out downloads for importers (#410)
* Use run_task for dataset importing
* Add a wget mock for downloads
Showing 13 changed files with 122 additions and 34 deletions.
Empty file.
@@ -0,0 +1,3 @@
# Corpus Samples

These are datasets that are cut down to only a few samples, which can be used for testing purposes.
Binary file not shown.
Binary file not shown.
@@ -0,0 +1,56 @@
#!/usr/bin/env python3
"""
This is a test fixture which mocks wget.
Set the MOCKED_DOWNLOADS environment variable to a JSON object mapping a URL to a file path.
"""

import argparse
import json
import os
import shutil

if not os.environ.get("MOCKED_DOWNLOADS"):
    raise Exception(
        "The mocked_wget utility expected the MOCKED_DOWNLOADS environment variable to be set."
    )


mocked_downloads = json.loads(os.environ.get("MOCKED_DOWNLOADS"))

if not isinstance(mocked_downloads, dict):
    raise Exception(
        "Expected the mocked downloads to be a json object mapping the URL to file path"
    )


parser = argparse.ArgumentParser()
parser.add_argument(
    "-O", "--output-document", dest="output", help="The output path", required=True
)
parser.add_argument("url", help="The url to download")

args = parser.parse_args()

print("[mocked wget]", args.url)

source_file = mocked_downloads.get(args.url)
if not source_file:
    print("[mocked wget] MOCKED_DOWNLOADS:", mocked_downloads)
    raise Exception(f"Received a URL that was not in MOCKED_DOWNLOADS {args.url}")

if source_file == "404":
    print("[mocked wget]: Mocking a 404")
    # 8 is what wget gives as an exit code in this case.
    os.sys.exit(8)

if not os.path.exists(source_file):
    raise Exception(f"The source file specified did not exist {source_file}")

print("[mocked wget] copying the file")
print(f"[mocked wget] from: {source_file}")
print(f"[mocked wget] to: {args.output}")

shutil.copyfile(source_file, args.output)

print("[mocked wget] Success")
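The fixture above reads its whole configuration from MOCKED_DOWNLOADS, so a test only needs to serialize a URL-to-path mapping into that variable before the importer runs. The following is a minimal sketch of that step, not code from this commit: the helper name build_mock_env, the example.com URLs, and the file paths are all hypothetical and used only for illustration.

```python
import json
import os


def build_mock_env(downloads, base_env=None):
    """Return a copy of the environment with MOCKED_DOWNLOADS set.

    `downloads` maps each URL to a local file path, or to the string "404"
    to make the mock exit with code 8 instead of copying a file.
    (Hypothetical helper, not part of the commit.)
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MOCKED_DOWNLOADS"] = json.dumps(downloads)
    return env


# Hypothetical URLs and paths, for illustration only.
env = build_mock_env(
    {
        "https://example.com/corpus.en.gz": "tests/data/corpus.en.gz",
        "https://example.com/missing.gz": "404",
    },
    base_env={},
)

# The mock decodes the variable back into a dict with json.loads.
decoded = json.loads(env["MOCKED_DOWNLOADS"])
print(decoded["https://example.com/missing.gz"])
```

A test could then pass this environment to the process that runs the importer, for example through the env argument of subprocess.run. How the real wget is replaced by the mock (for instance by placing the script earlier on PATH) is not shown in this part of the diff, so that part is left as an assumption.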