Split data dir, moving large files into examples/data #130
…e build, since we only generate them when doing an inplace build
…unclear on this, but the internet seems to imply that either include_package_data+MANIFEST.in or package_data should be used but not both
Co-authored-by: Christopher Harris <xixonia@gmail.com>
…s into david-cli-rel-paths
… convert to Git LFS
Everything looks good. I'm not sure what the `with_data_len.json` and `without_data_len.json` files are used for. Can you verify that they aren't used anywhere and remove them? Otherwise, looks good.
Removed unused
…us into david-split-data-dir
Need to revert the pinned Neo version
Looks good now.
@gpucibot merge
In PR #130 we moved the `data` directory into `morpheus/data` and installed it with the Python package. This required changing some of the default CLI arguments from relative paths like `data/labels_nlp.txt` to absolute paths like `morpheus.DATA_DIR/labels_nlp.txt`. To make it easy for users to see how to change the labels file, we restated the default argument value in the documentation (i.e. `--labels_file=data/labels_nlp.txt`). Now that this needs to be an absolute path, the command in the documentation does not work. Adding absolute paths to the documentation is not feasible, since the paths would be very long and would change from machine to machine.

Instead, if the user specifies a data file with a relative path, we first check whether the file exists relative to the current working directory. If it doesn't, we then check for the file relative to the current morpheus install. This allows commands from the documentation like:
`morpheus run pipeline-nlp --labels_file=data/labels_nlp.txt`
to find the correct path. We only choose the fallback value when no other file is found.

Related to PR #200

Authors:
- Michael Demoret (https://github.com/mdemoret-nv)

Approvers:
- David Gardner (https://github.com/dagardner-nv)

URL: #232
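The lookup order described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from PR #232: the function name `resolve_data_path` and the parameters `install_root` and `cwd` are hypothetical. Returning the path unchanged when neither location has the file is also an assumption.

```python
import os


def resolve_data_path(path, install_root, cwd=None):
    """Resolve a data file path (hypothetical helper, not the Morpheus API).

    Order: absolute paths are returned as-is; relative paths are tried
    against the current working directory first, then against the
    package install directory as a fallback.
    """
    if os.path.isabs(path):
        return path

    cwd = os.getcwd() if cwd is None else cwd

    # 1. Prefer a file relative to the current working directory.
    cwd_candidate = os.path.join(cwd, path)
    if os.path.exists(cwd_candidate):
        return cwd_candidate

    # 2. Fall back to a file relative to the install directory.
    install_candidate = os.path.join(install_root, path)
    if os.path.exists(install_candidate):
        return install_candidate

    # 3. Assumption: leave the path unchanged so the caller can report
    #    the missing file itself.
    return path
```

With this ordering, a file that the user has placed under their own working directory always wins over the copy shipped with the package, which matches the statement that the fallback is only chosen when no other file is found.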
Brings the directory size down from 270MB to 1.6MB
examples/data
examples/data to git-lfs
`email_with_addresses.jsonlines` needed for phishing detection developer guide

Depends on changes in #62
Fixes #120