Skip to content

ENH: Added downoading a dictionary from a URL. #65

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dave3d
Copy link
Member

@dave3d dave3d commented Jun 30, 2025

No description provided.

@dave3d dave3d force-pushed the LoadDictFromURL branch from ee69dfe to e44ae2d Compare June 30, 2025 19:55
@dave3d dave3d requested a review from zivy June 30, 2025 20:05

# load the additional dictionaries
if not isinstance(dict_list, list):
return checker
if len(dict_list) > 0:
for d in dict_list:
logger.info("Loading additional dictionary from: %s", d)
checker.word_frequency.load_text_file(d)
if isinstance(d, pathlib.PosixPath):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here is a bit convoluted, there is no difference between the usage of a pathlib.Path and a string representing the path, but they are separated. This can be simplified:

if isinstance(d, str) and (d.startswith("http://") or d.startswith("https://")):
  response = requests.get(d)
  ... 
else:
  checker.word_frequency.load_text_file(d)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is what I did last time I needed to support both local files and URLs: https://github.com/niaid/rap_sitkCore/blob/b43d84054c116c64a94d2188a9fa066b4562a710/rap_sitkcore/read_dcm_headers.py#L133-L138

The python conventions is that it is easier to ask for forgiveness than permission (EAFP): https://realpython.com/python-lbyl-vs-eafp/

In this ca it means try it as a URL with requests, and if it fails that try it as a Path.

dpath = Path(d)
if dpath.exists():
dict_list.append(dpath)
dict_list.extend(args.dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is removing the explicit check with an implicit one. If string represents a URL, the raise_for_status will raise an exception if there is an issue and the load_text_file will raise an exception when it tries to open a non-existent file or a directory. Is there a try-except that deals with these exceptions? If not the program may "crash", exit in a very ugly manner. It is nicer to catch these, report the error and sys.exit(1).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, at that point we don't know if it's a file path or a URL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of this check from this location makes sense, but it wasn't replaced by a check in the location where we do know if this is a file or URL. In any case, there is a need for a try-catch so that the program exits gracefully when the URL or file don't exist and an exception is raised.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants