[runtime env][feature] introduce pip_check_enable and pip_version (ra…
Catch-Bull authored Mar 14, 2022
1 parent 8608b64 commit 39a6c05
Showing 13 changed files with 386 additions and 72 deletions.
2 changes: 1 addition & 1 deletion dashboard/modules/job/tests/test_http_job_server.py
@@ -189,7 +189,7 @@ def f():
 relative_filepath = "requirements.txt"
 pip_file = Path(relative_filepath)
 pip_file.write_text("\n".join(pip_list))
-runtime_env = {"pip": relative_filepath}
+runtime_env = {"pip": {"packages": relative_filepath, "pip_check": False}}
 yield {
     "runtime_env": runtime_env,
     "entrypoint": f"python -c '{driver_script}'",
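The test change above switches the job's `pip` field from a bare requirements path to the dict form with `pip_check` disabled. A minimal sketch of building such a value, assuming a hypothetical helper named `build_pip_runtime_env` (not part of Ray) and a throwaway requirements file:

```python
import tempfile
from pathlib import Path


def build_pip_runtime_env(requirements_path: str, pip_check: bool = False) -> dict:
    """Hypothetical helper: wrap a requirements file path in the dict-form pip field."""
    return {"pip": {"packages": requirements_path, "pip_check": pip_check}}


with tempfile.TemporaryDirectory() as tmpdir:
    # Write a small requirements file, as the test fixture does.
    pip_file = Path(tmpdir) / "requirements.txt"
    pip_file.write_text("\n".join(["requests", "aiohttp"]))
    runtime_env = build_pip_runtime_env(str(pip_file))
    written = pip_file.read_text().split("\n")
```

The resulting `runtime_env` has the same shape as the value the updated test yields; the package names here are placeholders.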
15 changes: 10 additions & 5 deletions doc/source/ray-core/handling-dependencies.rst
@@ -320,15 +320,20 @@ The ``runtime_env`` is a Python dictionary or a python class :class:`ray.runtime

   - Example: ``["my_file.txt", "path/to/dir", "*.log"]``

-- ``pip`` (List[str] | str): Either a list of pip `requirements specifiers <https://pip.pypa.io/en/stable/cli/pip_install/#requirement-specifiers>`_, or a string containing the path to a pip
-  `“requirements.txt” <https://pip.pypa.io/en/stable/user_guide/#requirements-files>`_ file.
+- ``pip`` (dict | List[str] | str): Either (1) a list of pip `requirements specifiers <https://pip.pypa.io/en/stable/cli/pip_install/#requirement-specifiers>`_, (2) a string containing the path to a pip
+  `“requirements.txt” <https://pip.pypa.io/en/stable/user_guide/#requirements-files>`_ file, or (3) a Python dictionary with three fields: (a) ``packages`` (required, List[str]): a list of pip packages;
+  (b) ``pip_check`` (optional, bool): whether to run `pip check <https://pip.pypa.io/en/stable/cli/pip_check/>`_ at the end of pip install, defaults to ``True``;
+  (c) ``pip_version`` (optional, str): the version of pip; Ray will prepend the package name "pip" to ``pip_version`` to form the final requirement string.
   The syntax of a requirement specifier is defined in full in `PEP 508 <https://www.python.org/dev/peps/pep-0508/>`_.
   This will be installed in the Ray workers at runtime. Packages in the preinstalled cluster environment will still be available.
   To use a library like Ray Serve or Ray Tune, you will need to include ``"ray[serve]"`` or ``"ray[tune]"`` here.

   - Example: ``["requests==1.0.0", "aiohttp", "ray[serve]"]``

   - Example: ``"./requirements.txt"``

+  - Example: ``{"packages":["tensorflow", "requests"], "pip_check": False, "pip_version": "==22.0.2;python_version=='3.8.11'"}``
+
   When specifying a ``requirements.txt`` file, referencing local files `within` that file is not supported (e.g. ``-r ./my-laptop/more-requirements.txt``, ``./my-pkg.whl``).

 - ``conda`` (dict | str): Either (1) a dict representing the conda environment YAML, (2) a string containing the path to a
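The three accepted shapes of the `pip` field described in this hunk can be written out side by side. This is only an illustrative sketch: `pip_requirement` is a hypothetical helper (not a Ray API) that mirrors how the commit forms the requirement string with `f"pip{pip_version}"`.

```python
# The three shapes the documentation describes for the "pip" field.
pip_as_list = ["requests==1.0.0", "aiohttp", "ray[serve]"]
pip_as_path = "./requirements.txt"
pip_as_dict = {
    "packages": ["tensorflow", "requests"],
    "pip_check": False,
    "pip_version": "==22.0.2;python_version=='3.8.11'",
}


def pip_requirement(pip_version: str) -> str:
    """Hypothetical helper: prepend the package name "pip" to the version spec."""
    return f"pip{pip_version}"


pinned = pip_requirement(pip_as_dict["pip_version"])
```

Because the package name is simply prepended, `pip_version` has to start with a version operator such as `==` and may carry an environment marker after `;`.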
Expand Down Expand Up @@ -628,18 +633,18 @@ Example log output:
(pid=runtime_env) seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/private/tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/virtualenv_app_data)
(pid=runtime_env) added seed packages: pip==22.0.3, setuptools==60.6.0, wheel==0.37.1
(pid=runtime_env) activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
(pid=runtime_env)
(pid=runtime_env)
(pid=runtime_env) 2022-02-28 14:12:34,268 INFO utils.py:76 -- Run cmd[2] ['/tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/virtualenv/bin/python', '-c', 'import ray; print(ray.__version__, ray.__path__[0])']
(pid=runtime_env) 2022-02-28 14:12:35,118 INFO utils.py:97 -- Output of cmd[2]: 2.0.0.dev0 /Users/user/ray/python/ray
(pid=runtime_env)
(pid=runtime_env)
(pid=runtime_env) 2022-02-28 14:12:35,120 INFO pip.py:236 -- Installing python requirements to /tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/virtualenv
(pid=runtime_env) 2022-02-28 14:12:35,122 INFO utils.py:76 -- Run cmd[3] ['/tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/virtualenv/bin/python', '-m', 'pip', 'install', '--disable-pip-version-check', '--no-cache-dir', '-r', '/tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/requirements.txt']
(pid=runtime_env) 2022-02-28 14:12:38,000 INFO utils.py:97 -- Output of cmd[3]: Requirement already satisfied: requests in /Users/user/anaconda3/envs/ray-py38/lib/python3.8/site-packages (from -r /tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/requirements.txt (line 1)) (2.26.0)
(pid=runtime_env) Requirement already satisfied: idna<4,>=2.5 in /Users/user/anaconda3/envs/ray-py38/lib/python3.8/site-packages (from requests->-r /tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/requirements.txt (line 1)) (3.2)
(pid=runtime_env) Requirement already satisfied: certifi>=2017.4.17 in /Users/user/anaconda3/envs/ray-py38/lib/python3.8/site-packages (from requests->-r /tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/requirements.txt (line 1)) (2021.10.8)
(pid=runtime_env) Requirement already satisfied: urllib3<1.27,>=1.21.1 in /Users/user/anaconda3/envs/ray-py38/lib/python3.8/site-packages (from requests->-r /tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/requirements.txt (line 1)) (1.26.7)
(pid=runtime_env) Requirement already satisfied: charset-normalizer~=2.0.0 in /Users/user/anaconda3/envs/ray-py38/lib/python3.8/site-packages (from requests->-r /tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/requirements.txt (line 1)) (2.0.6)
(pid=runtime_env)
(pid=runtime_env)
(pid=runtime_env) 2022-02-28 14:12:38,001 INFO utils.py:76 -- Run cmd[4] ['/tmp/ray/session_2022-02-28_14-12-29_909064_87908/runtime_resources/pip/0cc818a054853c3841171109300436cad4dcf594/virtualenv/bin/python', '-c', 'import ray; print(ray.__version__, ray.__path__[0])']
(pid=runtime_env) 2022-02-28 14:12:38,804 INFO utils.py:97 -- Output of cmd[4]: 2.0.0.dev0 /Users/user/ray/python/ray
100 changes: 88 additions & 12 deletions python/ray/_private/runtime_env/pip.py
@@ -19,8 +19,8 @@
 default_logger = logging.getLogger(__name__)


-def _get_pip_hash(pip_list: List[str]) -> str:
-    serialized_pip_spec = json.dumps(pip_list, sort_keys=True)
+def _get_pip_hash(pip_dict: Dict) -> str:
+    serialized_pip_spec = json.dumps(pip_dict, sort_keys=True)
     hash = hashlib.sha1(serialized_pip_spec.encode("utf-8")).hexdigest()
     return hash

@@ -29,12 +29,14 @@ def get_uri(runtime_env: Dict) -> Optional[str]:
     """Return `"pip://<hashed_dependencies>"`, or None if no GC required."""
     pip = runtime_env.get("pip")
     if pip is not None:
-        if isinstance(pip, list):
-            uri = "pip://" + _get_pip_hash(pip_list=pip)
+        if isinstance(pip, dict):
+            uri = "pip://" + _get_pip_hash(pip_dict=pip)
+        elif isinstance(pip, list):
+            uri = "pip://" + _get_pip_hash(pip_dict=dict(packages=pip))
         else:
             raise TypeError(
                 "pip field received by RuntimeEnvAgent must be "
-                f"list, not {type(pip).__name__}."
+                f"list or dict, not {type(pip).__name__}."
             )
     else:
         uri = None
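The URI scheme in this hunk can be tried out in isolation. The following is a standalone sketch that mirrors the hashing shown above; the names `pip_hash` and `pip_uri` are illustrative stand-ins for `_get_pip_hash` and `get_uri`, not the Ray functions themselves.

```python
import hashlib
import json
from typing import Dict, Optional


def pip_hash(pip_dict: Dict) -> str:
    """SHA-1 of the JSON-serialized pip spec; sort_keys makes key order irrelevant."""
    serialized = json.dumps(pip_dict, sort_keys=True)
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()


def pip_uri(runtime_env: Dict) -> Optional[str]:
    """Return "pip://<hash>" for a list or dict pip field, or None if it is absent."""
    pip = runtime_env.get("pip")
    if pip is None:
        return None
    if isinstance(pip, dict):
        return "pip://" + pip_hash(pip)
    if isinstance(pip, list):
        # A bare list is hashed as if it were {"packages": [...]}.
        return "pip://" + pip_hash(dict(packages=pip))
    raise TypeError(f"pip field must be list or dict, not {type(pip).__name__}.")


uri = pip_uri({"pip": ["requests", "aiohttp"]})
```

Since the whole dict is hashed, changing `pip_check` or `pip_version` yields a different URI even when the package list is unchanged.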
@@ -82,6 +84,10 @@ def __init__(
         self._runtime_env = runtime_env
         self._logger = logger

+        self._pip_config = self._runtime_env.pip_config()
+        self._pip_env = os.environ.copy()
+        self._pip_env.update(self._runtime_env.env_vars())
+
     @staticmethod
     def _is_in_virtualenv() -> bool:
         # virtualenv <= 16.7.9 sets the real_prefix,
@@ -92,6 +98,58 @@ def _is_in_virtualenv() -> bool:
             hasattr(sys, "base_prefix") and sys.base_prefix != sys.prefix
         )

+    @classmethod
+    async def _ensure_pip_version(
+        cls,
+        path: str,
+        pip_version: Optional[str],
+        cwd: str,
+        pip_env: Dict,
+        logger: logging.Logger,
+    ):
+        """Run the pip command to reinstall pip to the specified version."""
+        if not pip_version:
+            return
+
+        python = _PathHelper.get_virtualenv_python(path)
+        # Ensure pip version.
+        pip_reinstall_cmd = [
+            python,
+            "-m",
+            "pip",
+            "install",
+            "--disable-pip-version-check",
+            f"pip{pip_version}",
+        ]
+        logger.info("Installing pip with version %s", pip_version)
+
+        await check_output_cmd(pip_reinstall_cmd, logger=logger, cwd=cwd, env=pip_env)
+
+    async def _pip_check(
+        self,
+        path: str,
+        pip_check: bool,
+        cwd: str,
+        pip_env: Dict,
+        logger: logging.Logger,
+    ):
+        """Run the pip check command to detect Python dependency conflicts.
+        If conflicts exist, the pip check command exits with a non-zero code.
+        """
+        if not pip_check:
+            logger.info("Skip pip check.")
+            return
+        python = _PathHelper.get_virtualenv_python(path)
+
+        await check_output_cmd(
+            [python, "-m", "pip", "check", "--disable-pip-version-check"],
+            logger=logger,
+            cwd=cwd,
+            env=pip_env,
+        )
+
+        logger.info("Pip check on %s succeeded.", path)
+
     @staticmethod
     @asynccontextmanager
     async def _check_ray(python: str, cwd: str, logger: logging.Logger):
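The two new steps boil down to two pip command lines run through a helper that fails on a non-zero exit code. The sketch below reproduces the command lists from the hunk; `run_checked` is an assumed stand-in for `check_output_cmd` (whose implementation is not shown in this commit view), built on `asyncio.create_subprocess_exec`.

```python
import asyncio
import sys
from typing import Dict, List, Optional


def pip_reinstall_cmd(python: str, pip_version: str) -> List[str]:
    """Command that reinstalls pip at the requested version spec."""
    return [python, "-m", "pip", "install", "--disable-pip-version-check", f"pip{pip_version}"]


def pip_check_cmd(python: str) -> List[str]:
    """Command that verifies installed packages have compatible dependencies."""
    return [python, "-m", "pip", "check", "--disable-pip-version-check"]


async def run_checked(cmd: List[str], env: Optional[Dict[str, str]] = None) -> str:
    """Assumed stand-in for check_output_cmd: run cmd and raise on a non-zero exit code."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
        env=env,
    )
    stdout, _ = await proc.communicate()
    output = stdout.decode()
    if proc.returncode != 0:
        raise RuntimeError(f"{cmd} exited with code {proc.returncode}: {output}")
    return output


# Demonstrate the runner with a harmless command instead of invoking pip.
demo_output = asyncio.run(run_checked([sys.executable, "-c", "print('ok')"]))
```

Because `pip check` exits non-zero when it finds conflicts, a runner of this kind turns a conflict into an exception, which is what lets the caller abort the environment setup.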
@@ -135,7 +193,6 @@ async def _create_or_get_virtualenv(
         cls, path: str, cwd: str, logger: logging.Logger
     ):
         """Create or get a virtualenv from path."""
-
         python = sys.executable
         virtualenv_path = os.path.join(path, "virtualenv")
         virtualenv_app_data_path = os.path.join(path, "virtualenv_app_data")
@@ -204,7 +261,7 @@ async def _install_pip_packages(
         path: str,
         pip_packages: List[str],
         cwd: str,
-        env_vars: Dict,
+        pip_env: Dict,
         logger: logging.Logger,
     ):
         virtualenv_path = _PathHelper.get_virtualenv_path(path)
Expand Down Expand Up @@ -241,14 +298,13 @@ def _gen_requirements_txt():
pip_requirements_file,
]
logger.info("Installing python requirements to %s", virtualenv_path)
pip_env = os.environ.copy()
pip_env.update(env_vars)

await check_output_cmd(pip_install_cmd, logger=logger, cwd=cwd, env=pip_env)

async def _run(self):
path = self._target_dir
logger = self._logger
pip_packages = self._runtime_env.pip_packages()
pip_packages = self._pip_config["packages"]
# We create an empty directory for exec cmd so that the cmd will
# run more stable. e.g. if cwd has ray, then checking ray will
# look up ray in cwd instead of site packages.
@@ -258,10 +314,30 @@ async def _run(self):
             await self._create_or_get_virtualenv(path, exec_cwd, logger)
             python = _PathHelper.get_virtualenv_python(path)
             async with self._check_ray(python, exec_cwd, logger):
+                # Ensure pip version.
+                await self._ensure_pip_version(
+                    path,
+                    self._pip_config.get("pip_version", None),
+                    exec_cwd,
+                    self._pip_env,
+                    logger,
+                )
+                # Install pip packages.
                 await self._install_pip_packages(
-                    path, pip_packages, exec_cwd, self._runtime_env.env_vars(), logger
+                    path,
+                    pip_packages,
+                    exec_cwd,
+                    self._pip_env,
+                    logger,
                 )
+                # Check python environment for conflicts.
+                await self._pip_check(
+                    path,
+                    self._pip_config.get("pip_check", True),
+                    exec_cwd,
+                    self._pip_env,
+                    logger,
+                )
-                # TODO(fyrestone): pip check.
         except Exception:
             logger.info("Delete incomplete virtualenv: %s", path)
             shutil.rmtree(path, ignore_errors=True)
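The updated `_run` now performs three steps in order (pin pip, install packages, run pip check) and deletes the target directory if any of them fails. A rough sketch of that control flow with stubbed steps follows; `run_steps` and the stub coroutines are hypothetical, and re-raising after cleanup is an assumption, since the line following `shutil.rmtree` is not visible in this hunk.

```python
import asyncio
import os
import shutil
import tempfile
from typing import Awaitable, Callable, List

Step = Callable[[], Awaitable[None]]


async def run_steps(path: str, steps: List[Step]) -> None:
    """Run setup steps in order; on any failure, delete the incomplete directory."""
    try:
        for step in steps:
            await step()
    except Exception:
        shutil.rmtree(path, ignore_errors=True)
        raise  # Assumption: the error is propagated after cleanup.


calls: List[str] = []


async def ensure_pip_version() -> None:
    calls.append("ensure_pip_version")


async def install_pip_packages() -> None:
    calls.append("install_pip_packages")


async def failing_pip_check() -> None:
    calls.append("pip_check")
    raise RuntimeError("simulated dependency conflict")


target_dir = tempfile.mkdtemp()
failure = None
try:
    asyncio.run(run_steps(target_dir, [ensure_pip_version, install_pip_packages, failing_pip_check]))
except RuntimeError as exc:
    failure = str(exc)
```

Running the check last means a conflict discards an environment that was otherwise fully installed, so disabling `pip_check` trades that safety for tolerance of known conflicts.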
64 changes: 55 additions & 9 deletions python/ray/_private/runtime_env/validation.py
@@ -102,21 +102,35 @@ def parse_and_validate_conda(conda: Union[str, dict]) -> Union[str, dict]:
     return result


-def parse_and_validate_pip(pip: Union[str, List[str]]) -> Optional[List[str]]:
+def parse_and_validate_pip(pip: Union[str, List[str], Dict]) -> Optional[Dict]:
     """Parses and validates a user-provided 'pip' option.

     The value of the input 'pip' field can be one of three cases:
         1) A List[str] describing the requirements. This is passed through.
         2) A string pointing to a local requirements file. In this case, the
            file contents will be read and split into a list.
+        3) A python dictionary that has three fields:
+            a) packages (required, List[str]): a list of pip packages, same as 1).
+            b) pip_check (optional, bool): whether to enable pip check at the end
+               of pip install, default True.
+            c) pip_version (optional, str): the version of pip. Ray will prepend
+               the package name 'pip' to `pip_version` to form the final
+               requirement string. The syntax of a requirement specifier is
+               defined in full in PEP 508.

     The returned parsed value will be a dict whose 'packages' field is the
     list of pip packages. If a Ray library (e.g. "ray[serve]") is specified,
     it will be deleted and replaced by its dependencies (e.g. "uvicorn", "requests").
     """
     assert pip is not None

-    pip_list = None
+    def _handle_local_pip_requirement_file(pip_file: str):
+        pip_path = Path(pip_file)
+        if not pip_path.is_file():
+            raise ValueError(f"{pip_path} is not a valid file")
+        return pip_path.read_text().strip().split("\n")
+
+    result = None
     if sys.platform == "win32":
         raise NotImplementedError(
             "The 'pip' field in runtime_env "
@@ -125,12 +139,44 @@ def parse_and_validate_pip(pip: Union[str, List[str]]) -> Optional[List[str]]:
         )
     elif isinstance(pip, str):
         # We have been given a path to a requirements.txt file.
-        pip_file = Path(pip)
-        if not pip_file.is_file():
-            raise ValueError(f"{pip_file} is not a valid file")
-        pip_list = pip_file.read_text().strip().split("\n")
+        pip_list = _handle_local_pip_requirement_file(pip)
+        result = dict(packages=pip_list, pip_check=True)
     elif isinstance(pip, list) and all(isinstance(dep, str) for dep in pip):
-        pip_list = pip
+        result = dict(packages=pip, pip_check=True)
+    elif isinstance(pip, dict):
+        if set(pip.keys()) - {"packages", "pip_check", "pip_version"}:
+            raise ValueError(
+                "runtime_env['pip'] can only have these fields: "
+                "packages, pip_check and pip_version, but got: "
+                f"{list(pip.keys())}"
+            )
+
+        if "pip_check" in pip and not isinstance(pip["pip_check"], bool):
+            raise TypeError(
+                "runtime_env['pip']['pip_check'] must be of type bool, "
+                f"got {type(pip['pip_check'])}"
+            )
+        if "pip_version" in pip:
+            if not isinstance(pip["pip_version"], str):
+                raise TypeError(
+                    "runtime_env['pip']['pip_version'] must be of type str, "
+                    f"got {type(pip['pip_version'])}"
+                )
+        result = pip.copy()
+        result["pip_check"] = (
+            True if pip.get("pip_check") is None else pip.get("pip_check")
+        )
+        if "packages" not in pip:
+            raise ValueError(
+                f"runtime_env['pip'] must include field 'packages', but got {pip}"
+            )
+        elif isinstance(pip["packages"], str):
+            result["packages"] = _handle_local_pip_requirement_file(pip["packages"])
+        elif not isinstance(pip["packages"], list):
+            raise ValueError(
+                "runtime_env['pip']['packages'] must be of type str or list, "
+                f"got: {type(pip['packages'])}"
+            )
     else:
         raise TypeError(
             "runtime_env['pip'] must be of type str, List[str] or dict, " f"got {type(pip)}"
@@ -140,9 +186,9 @@ def parse_and_validate_pip(pip: Union[str, List[str]]) -> Optional[List[str]]:
     # OrderedDict to preserve the order of the list. This makes the output
     # deterministic and easier to debug, because pip install can have
     # different behavior depending on the order of the input.
-    result = list(OrderedDict.fromkeys(pip_list))
+    result["packages"] = list(OrderedDict.fromkeys(result["packages"]))

-    if len(result) == 0:
+    if len(result["packages"]) == 0:
         result = None

     logger.debug(f"Rewrote runtime_env `pip` field from {pip} to {result}.")
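The validation hunks above normalize all three input shapes into one dict. A simplified, standalone sketch of that normalization follows; `normalize_pip` is a hypothetical name, the Windows check and the Ray-library rewriting mentioned in the docstring are omitted, and plain `dict.fromkeys` is used for the order-preserving de-duplication in place of `OrderedDict`.

```python
from pathlib import Path
from typing import Dict, List, Optional, Union

ALLOWED_FIELDS = {"packages", "pip_check", "pip_version"}


def _read_requirements(pip_file: str) -> List[str]:
    """Read a local requirements file into a list of requirement lines."""
    pip_path = Path(pip_file)
    if not pip_path.is_file():
        raise ValueError(f"{pip_path} is not a valid file")
    return pip_path.read_text().strip().split("\n")


def normalize_pip(pip: Union[str, List[str], Dict]) -> Optional[Dict]:
    """Normalize a user-provided pip field into a dict, or None if no packages remain."""
    if isinstance(pip, str):
        result = {"packages": _read_requirements(pip), "pip_check": True}
    elif isinstance(pip, list) and all(isinstance(dep, str) for dep in pip):
        result = {"packages": pip, "pip_check": True}
    elif isinstance(pip, dict):
        unknown = set(pip) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"unexpected pip fields: {list(unknown)}")
        if "pip_check" in pip and not isinstance(pip["pip_check"], bool):
            raise TypeError(f"pip_check must be of type bool, got {type(pip['pip_check'])}")
        if "pip_version" in pip and not isinstance(pip["pip_version"], str):
            raise TypeError(f"pip_version must be of type str, got {type(pip['pip_version'])}")
        if "packages" not in pip:
            raise ValueError(f"pip must include field 'packages', but got {pip}")
        packages = pip["packages"]
        if isinstance(packages, str):
            packages = _read_requirements(packages)
        elif not isinstance(packages, list):
            raise ValueError(f"packages must be of type str or list, got {type(packages)}")
        result = dict(pip, packages=packages)
        result.setdefault("pip_check", True)
    else:
        raise TypeError(f"pip must be of type str, List[str] or dict, got {type(pip)}")

    # De-duplicate while preserving order; dicts keep insertion order in Python 3.7+.
    result["packages"] = list(dict.fromkeys(result["packages"]))
    return result if result["packages"] else None


normalized = normalize_pip(["requests", "aiohttp", "requests"])
```

Funnelling every shape into one dict is what lets the installer read `packages`, `pip_check`, and `pip_version` from a single config object regardless of how the user wrote the field.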