-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate requirements.txt from Python spec #7289
Conversation
python/gen_requirements.py
Outdated
if __name__ == "__main__": | ||
main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed this script is used to generate the files under ./requirements
and also used inline within setup.py
. When a dependency is updated, it will immediately reflect in setup.py - this is good.
However, how do we make sure that when a new dependency is introduced/changed/updated/fixed, the txt
files are also correctly updated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes this is a great point. I sort of ran out of time before I posted the initial cut, so i figured i'd post and let people comment. I think the right thing to do is to move join_requirements
+ the requirements.txt writing into a top-level function and call that function from both gen_requirements.main
and setup.py
. however, that does mean that if tools like poetry
or others call setup.py egg_info
, requirements.txt
will be rewritten.
a developer might not expect that, though if it makes any appreciable change, they must be either syncing to main or modifying gen_requirements.py
. i think that's okay...what does everyone else think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in thinking about this further, i'm starting to lean towards not updating the requirements.txt when setup.py egg_info
is invoked. here's why:
- given the discussion about constraint types in the RFC, the functional constraints aren't themselves particularly useful in
pip install -r
. you need additional constraints to indicate which version of packages are used by the CI. - in order to generate those we need another RFC. presumably, once we have those, we can change the language about installing Python dependencies in the Install From Source tutorial to tell you to use the CI-tested
requirements.txt
. - The CI-tested requirements.txt needs to be generated with slightly different inputs (I.e. the requirements here + the output of
pip freeze
from various CI containers). I think we would create a slightly different tool or modify this one to include configuration on how to translatepip freeze
into looser requirements.txt usable for everyday work. - until we've completed all of this, I don't think we'd recommend people use these requirements. if they want them, it's pretty easy to generate by just running
python gen_requirements.txt
. I don't intend to actually check in the files inpython/requirements
to the repo (rather add .gitignore), so they wouldn't be out of date for most people.
python/gen_requirements.py
Outdated
"psutil", | ||
"scipy", | ||
"synr", | ||
]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need tornado and cloudpickle too. They are requirements for tuning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
thanks for the reviews so far everyone! one quick note: the actual list of dependencies (and their version constraints) are still WIP. so definitely feel free to comment on things that are missing, but I know the list is wrong and I plan to fix the list and add tests if everyone's good w/ the approach. i'll leave this open to collect reviews over the weekend and on Monday (holiday for me) and then i'll work to merge this mid-next-week if we are good w/ it. |
python/gen_requirements.py
Outdated
|
||
# Maps named TVM piece (see description above) to a list of names of Python packages. Please use | ||
# alphabetical order for each package list, and do not add version constraints here! | ||
REQUIREMENTS_BY_PIECE = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You might consider storing these lists in a .yaml file and loading it, which is easier to edit/maintain than directly editing the Python code itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, I generally prefer yaml for a task like this but since this is the deps script, I don't think it should have external dependencies itself.
python/gen_requirements.py
Outdated
SEMVER_REGEX = re.compile(r"^(?P<major>0|[1-9]\d*)\.(?P<minor>0|[1-9]\d*)\.(?P<patch>0|[1-9]\d*)(?:-(?P<prerelease>(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\.(?:0|[1-9]\d*|\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\+(?P<buildmetadata>[0-9a-zA-Z-]+(?:\.[0-9a-zA-Z-]+)*))?$") | ||
|
||
|
||
def validate_requirements_by_piece(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For new code I would recommend using Python type annotations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will add type annotations before submitting this PR
python/gen_requirements.py
Outdated
sys.exit(2) | ||
|
||
if args.lint: | ||
sys.exit(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recommend return
, not exit
, since this is a function that could be (and, in fact, is!) invoked by another Python library -- same for the error cases, raise an Exception, don't exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually in setup.py we call join_requirements
, so I think we are good here.
python/requirements/all.txt
Outdated
@@ -0,0 +1,16 @@ | |||
numpy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend adding comments at the top of each of these files to indicate what they are used for, since it may not always be obvious from the filename. PIP requirements files support #
comments.https://pip.pypa.io/en/latest/reference/pip_install/#requirements-file-format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a way to specify a comment to each piece and that's now placed at the top of each requirements.txt file
import gen_requirements | ||
sys.path.pop(0) | ||
|
||
requirements = gen_requirements.join_requirements() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to print out some details on what the requirements identified are, to help with debugging?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do you mean exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, just a suggestion that when this is being invoked, and the requirements being generated automatically, it might be useful to have them dumped to stdout during the invocation (perhaps optionally) so someone debugging issues with Python deps can readily see what they were. Not sure if they are exposed in an obvious way elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess, I don't typically expect to see the requirements printed when I run python setup.py
. are there cases where you do? I could add a comment saying: to produce requirements.txt files, run python gen_requirements.py
above this line, which is kind of the typical case i'd expect someone to be looking for them.
@comaniac @leandron @tqchen @mdw-octoml @tkonolige please take a look at the latest update on the RFC forum thread. |
@leandron @mdw-octoml @tkonolige @comaniac @tqchen please take another look when you have a minute |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the hard work @areusch!
This seems like lot of code just to manage dependencies, but I guess we have complicated requirements.
setattr(obj, prop_name, value) | ||
|
||
|
||
PROBLEM_REQUIREMENTS = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think this test would be easier to maintain if the expected problem string were part of this list, rather than hardcoded in the test case in the same order below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess then though, you're doing processing to construct the test stimulus, which you could screw up. plus, each line here could create multiple problem strings or contribute to composite problem strings such as the final one.
gen_requirements, | ||
REQUIREMENTS_BY_PIECE=TEST_REQUIREMENTS_BY_PIECE, | ||
CONSTRAINTS=( | ||
("unlisted", "~=3"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above, I think it would be better if the problems were kept along with the input constraints that are being tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same issue though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks everyone! I am going to merge it in tomorrow if there are no further comments :-) |
* Generate requirements.txt from Python spec. * add tests, collect actual requirements (first cut). * add tornado and cloudpickle * add xgboost * add xgboost version restriction * cleanup and prepare for merge * black format * add type annotations and docstrings * remove example requirements.txt * fix setup.py extras_require * use typing. classes for type annotations, python 2 compatible :) * fix python2 typing.Pattern * retrigger CI * address comaniac comments * retrigger ci
* Generate requirements.txt from Python spec. * add tests, collect actual requirements (first cut). * add tornado and cloudpickle * add xgboost * add xgboost version restriction * cleanup and prepare for merge * black format * add type annotations and docstrings * remove example requirements.txt * fix setup.py extras_require * use typing. classes for type annotations, python 2 compatible :) * fix python2 typing.Pattern * retrigger CI * address comaniac comments * retrigger ci
* Generate requirements.txt from Python spec. * add tests, collect actual requirements (first cut). * add tornado and cloudpickle * add xgboost * add xgboost version restriction * cleanup and prepare for merge * black format * add type annotations and docstrings * remove example requirements.txt * fix setup.py extras_require * use typing. classes for type annotations, python 2 compatible :) * fix python2 typing.Pattern * retrigger CI * address comaniac comments * retrigger ci
This PR implements the Consolidating Python Dependencies RFC.
As part of this RFC, we decided to try a solution halfway between burdening TVM with a full-blown dependency tool such as poetry and the approach we currently take (scattered Python deps). The initial patchset here contains the tool plus some examples of the dependencies. The remaining things needed to merge this PR:
I'm posting this now before I do any of the test work to get feedback on the approach now that there's something concrete to look at. I do think this tool is a bit more complex than it could be, and am concerned about maintaining it. In particular the logic needed to lint the dependencies list is complex. I don't see a solution that avoids joining a set of dependencies by feature against a constraints list as very workable given the current extras structure.
Another topic for discussion is the use of semver. Python does contain a compatibility operator we could use. We should probably debate between the two (let's use the forum for this), as it would remove some complexity here.
Comments would be quite welcomed!
@tqchen @tkonolige @mshawcroft @leandron @junrushao1994 @tmoreau89 @jwfromm @jroesch @mdw-octoml @zhiics