Releases: allegroai/clearml-agent
PyPI v1.9.2 - ClearML
New Features and Bug Fixes
- Handle OSError when checking for is_file (#215, thanks @materight!)
- Add support for pip legacy resolver for versions specified using the `agent.package_manager.pip_legacy_resolver` configuration option
- Add skip existing packages
- Fix report index not advancing in resource monitoring, causing more than one GPU not to be reported
- Fix use `req_token_expiration_sec` and not the default value when creating a task session
- Fix reload method is found in the config object causing periodic agent error printouts
PyPI v1.9.1 - ClearML
New Features and Bug Fixes
- Add default pip version support for Python 3.12
PyPI v1.9.0 - ClearML
New Features
- Add NO_DOCKER flag to clearml-agent-services entrypoint (#206, thanks @valentinschabschneider!)
- Use `venv` module if `virtualenv` is not supported
- Find the correct python version when using a pre-installed python environment
- Add `/bin/bash` support in the task's `script.binary` property
- Add support for `.ipynb` script entry files (install nbconvert in runtime, convert file to python and execute the python script). Includes `CLEARML_AGENT_FORCE_TASK_INIT` patching of `.ipynb` files (post python conversion)
- Add `CLEARML_MULTI_NODE_SINGLE_TASK` (values -1, 0, 1, 2) for easier multi-node single Task workloads
- Add default docker `agent.default_docker.match_rules` configuration option supported by enterprise servers (note: matching_rules are ignored if `--docker container` is passed in command line)
- Add `-m module args` in script entry now supports standalone script. Standalone script is placed in a file specified by the `working_dir` setting in the `<dir>:<target_file>` format (e.g. `:standalone.py`), or in untitled.py if not specified
- Add `K8S_GLUE_POD_USE_IMAGE_ENTRYPOINT` env var to allow running k8s pods without overriding the image entrypoint (useful for agents using prebuilt images in k8s)
- Add venv cache mount override for non-root containers (use: `agent.docker_internal_mounts.venvs_cache`)
- Add `/bin/bash -c "command"` support. Task `binary` should be set to `/bin/bash` and `entry_point` should be set to `-c command`
- Add queue priority info to CLI help (#211)
- Add support for tasks containing only bash script or python module command
- Add support for skipping container apt installs using `CLEARML_AGENT_SKIP_CONTAINER_APT` env var in k8s
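
The `<dir>:<target_file>` convention for standalone scripts can be illustrated with a small parser. This is a minimal sketch, not the agent's actual code: the helper name `split_working_dir` is hypothetical, and falling back to `.` when the directory part is empty is an assumption. Only the `untitled.py` fallback comes from the note above.

```python
from typing import Tuple

# Fallback file name mentioned in the release note
DEFAULT_TARGET = "untitled.py"


def split_working_dir(working_dir: str) -> Tuple[str, str]:
    """Split a '<dir>:<target_file>' value into (dir, target_file).

    Hypothetical helper: when there is no ':' or the target part is empty,
    the target falls back to DEFAULT_TARGET. An empty directory part is
    treated as the current directory (an assumption of this sketch).
    """
    directory, sep, target = (working_dir or "").partition(":")
    if not sep or not target:
        target = DEFAULT_TARGET
    return directory or ".", target


if __name__ == "__main__":
    for value in (":standalone.py", "src:run.py", "src", ""):
        print(repr(value), "->", split_working_dir(value))
```

The sketch splits on the first `:` only, so it does not handle Windows drive letters.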
Bug Fixes
- Fix git fetch did not update new tags (#209)
- Fix file mode should be optional in configuration `files` section
- Fix `-m module $env` to support parsing `$env` before launching
- Fix setting tasks that were just marked as `aborted` to `started` - only force task to `started` after dequeuing it otherwise do nothing
- Fix slurm multi-node rank detection
- Fix pass `--docker only` (i.e. no default container image) when using `--dynamic-gpus` feature
- Fix logger object was used even if `None`
- Fix a race condition where in rare conditions popping a task from a queue that was aborted did not set it to started before the watchdog killed it (not applicable in k8s/slurm)
- Fix multi-node support to only send pip freeze update, only set task as started and only update task status on exit for RANK 0
- Fix do not cache venv cache if venv/python skip env var was set
- Fix use same state transition in k8s if supported by the server (instead of stopping the task before re-enqueue)
- Fix failed Task in services mode logged as "User aborted" instead of failed, add Task state reason string
- Fix remove task from pending queue and set to failed in k8s when applying the pod template fails
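
The multi-node fix above restricts task-level side effects to RANK 0. A minimal sketch of that kind of gating follows. It assumes the rank is exposed through a `RANK` environment variable and that a missing variable means a single-node run. The helper name `is_rank_zero` is hypothetical and is not part of the agent.

```python
import os
from typing import Mapping, Optional


def is_rank_zero(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when this process should perform task-level updates.

    Assumption of this sketch: a missing RANK variable means a
    single-node run, which is treated as rank 0.
    """
    env = os.environ if environ is None else environ
    return int(env.get("RANK", "0")) == 0


if __name__ == "__main__":
    if is_rank_zero():
        print("rank 0: report status and package list")
    else:
        print("non-zero rank: skip task-level updates")
```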
PyPI v1.8.1 - ClearML
New Features and Bug Fixes
- Add option to set daemon polling interval (#197, thanks @ilouzl!)
- Add Python 3.12 support
- Fix git pulling on cached invalid git entry. On error, re-clone the entire repository again (enable using `agent.vcs_cache.clone_on_pull_fail: true`)
- Fix conda env should not be cached if installing into base conda or conda existing env exists
- Fix cached repositories were not passing user/token when pulling
- Fix when disabling vcs cache do not add vcs mount point to container
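
The "re-clone on pull failure" behaviour can be sketched as below. This is an illustration under stated assumptions, not the agent's implementation: `update_cached_repo` and its `run` parameter are hypothetical names, and the sketch simply shells out to `git pull` and `git clone`.

```python
import shutil
import subprocess
from pathlib import Path


def update_cached_repo(url, cache_dir, clone_on_pull_fail=True, run=subprocess.run):
    """Refresh a cached clone, re-cloning from scratch if the pull fails.

    `run` is injectable so the flow can be exercised without git installed.
    """
    cache = Path(cache_dir)
    if cache.exists():
        try:
            run(["git", "-C", str(cache), "pull"], check=True)
            return "pulled"
        except subprocess.CalledProcessError:
            if not clone_on_pull_fail:
                raise
            # The cached entry is considered invalid: drop it and start over
            shutil.rmtree(cache, ignore_errors=True)
    run(["git", "clone", url, str(cache)], check=True)
    return "cloned"
```

With `clone_on_pull_fail=False` the pull error propagates to the caller instead of triggering a fresh clone.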
PyPI v1.8.0 - ClearML
New Features
- Add `CLEARML_AGENT_FORCE_POETRY` environment variable to allow forcing poetry even when using pip requirements manager
- Add `CLEARML_AGENT_FORCE_TASK_INIT` environment variable to allow runtime patching of script even if no repository is specified and the code is running a preinstalled docker
- Improve venv cache handling:
  - Add `FileLock` readonly mode, default is write mode (i.e. exclusive lock, preserving behavior)
  - Add venv cache now uses readonly lock when copying folders from venv cache into target folder. This enables multiple read, single write operation
  - Do not lock the cache folder if we do not need to delete old entries
  - Add `agent.venvs_cache.lock_timeout` to control the venv cache folder lock timeout (in seconds, default 30)
- Add protection for `truncate()` call
- Move configuration sanitization settings to the default config file
- Add queue ID report before pulling task
- Add queue ID report before pulling task
- Improve GPU monitoring for MIGs
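
The "multiple read, single write" locking described for the venv cache can be sketched with POSIX advisory locks. This is not the agent's `FileLock` class. It is a stand-in built on the standard-library `fcntl` module (POSIX only), and the `locked` helper and its parameters are hypothetical. The 30-second default mirrors the `lock_timeout` default mentioned above.

```python
import fcntl
import time
from contextlib import contextmanager


@contextmanager
def locked(path, readonly=False, timeout=30.0, poll=0.05):
    """Hold a shared (readonly) or exclusive (write) lock on `path`.

    Many readers may hold the shared lock at once, while a writer needs
    the exclusive lock. Raises TimeoutError if the lock cannot be taken
    within `timeout` seconds.
    """
    mode = fcntl.LOCK_SH if readonly else fcntl.LOCK_EX
    deadline = time.monotonic() + timeout
    with open(path, "a+") as handle:
        while True:
            try:
                # Non-blocking attempt so the timeout can be enforced
                fcntl.flock(handle, mode | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                if time.monotonic() >= deadline:
                    raise TimeoutError(
                        "could not lock {} within {}s".format(path, timeout)
                    )
                time.sleep(poll)
        try:
            yield handle
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)
```

Copying from the cache would use `locked(path, readonly=True)`, while deleting old entries would use the default exclusive mode.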
Bug Fixes
- Use correct Python version in Poetry init (#179, thanks @nfzd!)
- Fix queue handling in `K8sIntegration` and `k8s_glue_example.py` (#183, thanks @FeU-aKlos!)
- Fix `FileNotFoundException` crash in `find_python_executable_for_version` (#192, thanks @ae-ae!)
- Fix delete temp console pipe log files after Task execution is completed (important for long lasting services agents to avoid collecting temp files on host machine)
- Fix `agent.enable_git_ask_pass` does not show in configuration dump
- Fix `pippip` is returned as a pip version if no value exists in `agent.package_manager.pip_version`
- Fix Python 3.12 support by removing `distutils` imports
- Fix `IOError` on file lock when using shared folder
- Fix torch resolver settings applied to `PytorchRequirement` instance are not used
- Fix comment lines (starting with `#`) are not ignored in docker startup bash script
- Fix dynamic GPU sometimes misses the initial print
PyPI v1.7.0 - ClearML
New Features
- Add `agent.docker_args_extra_precedes_task` and `agent.protected_docker_extra_args` configuration settings to prevent the same switch from being used by both `agent.extra_docker_args` and a Task's docker args
- Add `agent.resource_monitoring.disk_use_path` configuration option to allow monitoring a different volume than the one containing the home folder
- Change default `agent.enable_git_ask_pass` to true
- Add example and support for pre-built containers including services-mode support with overrides `CLEARML_AGENT_FORCE_CODE_DIR` and `CLEARML_AGENT_FORCE_EXEC_SCRIPT`
- Add `CLEARML_AGENT_SERVICE_TASK=1` environment variable in case we're running a service task
- Add `CLEARML_AGENT_TEMP_STDOUT_FILE_DIR` to allow specifying temp dir used for storing agent log files and temporary log files (daemon and execution)
- Update GPU stats and pynvml support
- Add git clone verbosity using `CLEARML_AGENT_GIT_CLONE_VERBOSE` environment variable
- k8s glue
  - Add status reason when aborting before moving to k8s_scheduler queue
  - When cleaning up pending pods, verify task is still aborted and pod is still pending before deleting the pod
  - Set worker ID in k8s pod execution
Bug Fixes
- Fix `agent.package_manager.poetry_install_extra_args` are used in all Poetry commands and not just in `install` (#173)
- Fix if process return code is SIGKILL (-9 or 137) and abort callback was called, do not mark as failed but as aborted
- Fix `agent.git_host` setting will cause git@domain URLs to not be replaced by SSH URLs since furl cannot parse them to obtain host
- Fix an environment variable that should be set with a numerical value of `0` (i.e. end up as `"0"` or `"0.0"`) is set to an empty string
- Fix `agent.package_manager.extra_index_url` URLs are not sanitized in configuration printout
- Fix recursion issue when deep-copying a session
- k8s glue
  - Fix k8s glue configuration might be contaminated when changed during apply
  - Fix `KeyError` if container does not contain the arguments field
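
The SIGKILL fix above hinges on two spellings of the same event: `-9` is how Python's `subprocess` reports a child killed by signal 9, and `137` (128 + 9) is how a shell or container runtime reports it. A minimal sketch of the resulting decision follows. The function name and the status strings are hypothetical, and treating every other non-zero code as failed is an assumption of this sketch.

```python
import signal

# -9 as reported by subprocess, 137 (128 + 9) as reported by a shell
SIGKILL_CODES = (-int(signal.SIGKILL), 128 + int(signal.SIGKILL))


def final_status(return_code: int, abort_requested: bool) -> str:
    """Map a process return code to a task status string."""
    if return_code == 0:
        return "completed"
    if return_code in SIGKILL_CODES and abort_requested:
        # The kill was the consequence of a requested abort, not a crash
        return "aborted"
    return "failed"
```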
PyPI v1.6.1 - ClearML
Bug Fixes
- Fix requests requirement lower constraint breaks backwards compatibility for Python 3.6
PyPI v1.6.0 - ClearML
New Features
- Upgrade requests library (#162, thanks @jday1!)
- Add support for controlling PyTorch resolving mode using the `CLEARML_AGENT_PACKAGE_PYTORCH_RESOLVE` environment variable and `agent.package_manager.pytorch_resolve` configuration setting with `none` (no resolving), `pip` (sets extra index based on cuda and lets pip resolve) or `direct` (the previous parsing algorithm that does the matching and downloading), default is `pip` (#152)
- Add backwards compatibility in standalone mode using the `CLEARML_AGENT_STANDALONE_CONFIG_BC` environment variable
- Add `CLEARML_AGENT_DOCKER_AGENT_REPO` alias for the `FORCE_CLEARML_AGENT_REPO` environment variable
- Show a better message for agent init when an existing `clearml.conf` is found
- Add support for task field injection into container docker name using the `agent.docker_container_name_format_fields` configuration setting
- Add support for adding additional labels to docker containers using the `CLEARML_AGENT_EXTRA_DOCKER_LABELS` environment variable
- Add support for setting file mode in files applied by the agent (using the `files` configuration option) using the `mode` property
- Add support for skipping agent pip upgrade in the default k8s pod container bash script using the `CLEARML_AGENT_NO_UPDATE` environment variable
- Add support for additional pip install flags when installing dependencies using the `CLEARML_EXTRA_PIP_INSTALL_FLAGS` environment variable and `agent.package_manager.extra_pip_install_flags` configuration option
- Add support for extra docker arguments referencing machines environment variables using the `agent.docker_allow_host_environ` configuration option, allowing users to use $ENV in the task docker arguments (e.g. `-e HOST_NAME=$HOST_NAME`)
- Add support for k8s jobs execution (as opposed to only pods)
- Update default docker image versions
- Add Python 3.11 support
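
The host-environment expansion for docker arguments can be sketched as follows. This is only an illustration of the idea: `expand_docker_args` is a hypothetical helper, and the sketch uses `string.Template` and `shlex` from the standard library rather than whatever the agent does internally.

```python
import os
import shlex
from string import Template
from typing import List, Mapping, Optional


def expand_docker_args(
    args: str,
    allow_host_environ: bool,
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Split docker arguments, optionally expanding $VAR from the host."""
    env = os.environ if environ is None else environ
    if allow_host_environ:
        # safe_substitute leaves unknown variables untouched
        args = Template(args).safe_substitute(env)
    return shlex.split(args)


if __name__ == "__main__":
    host_env = {"HOST_NAME": "gpu-01"}
    print(expand_docker_args("-e HOST_NAME=$HOST_NAME", True, host_env))
    print(expand_docker_args("-e HOST_NAME=$HOST_NAME", False, host_env))
```

Because substitution happens before splitting, a host value containing spaces would be split into several arguments in this sketch.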
Bug Fixes
- Fix `git+ssh://` links inside installed packages not being properly converted to authenticated `https://` and vice versa
- Fix pip version required in the "Installed Packages" is now preserved and reinstalled
- Fix various agent paths not loaded correctly if an empty string or null is used (should be disabled, not converted to `.`)
- Fix docker container backwards compatibility for API<2.13
- Fix default docker match rules resolver (used incorrect field "container" instead of "image")
- Fix task docker argument might be passed twice (might cause an error with flags such as `--network` and `--ipc`)
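
The `git+ssh://` to authenticated `https://` conversion mentioned above can be illustrated with the standard-library URL parser. This is a sketch under stated assumptions, not the agent's code: `ssh_to_https` is a hypothetical helper, the credentials are placeholders, and any port in the original URL is dropped.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def ssh_to_https(url: str, user: str, token: str) -> str:
    """Rewrite 'git+ssh://git@host/path' as 'git+https://user:token@host/path'.

    URLs that do not use the ssh scheme are returned unchanged.
    """
    prefix = "git+" if url.startswith("git+") else ""
    parts = urlsplit(url[len(prefix):])
    if parts.scheme != "ssh":
        return url
    # Percent-encode the credentials so characters like '@' stay unambiguous
    netloc = "{}:{}@{}".format(
        quote(user, safe=""), quote(token, safe=""), parts.hostname or ""
    )
    return prefix + urlunsplit(
        ("https", netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(ssh_to_https("git+ssh://git@github.com/org/repo.git", "me", "s3cret"))
```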
PyPI v1.5.2 - ClearML
New Features and Improvements
- Switch services agent entrypoint shell from `sh` to `bash` (#141, thanks @InCogNiTo124!)
- Improve poetry support
  - Add poetry cwd support (#142, thanks @nielstenboom!)
  - Add `agent.package_manager.poetry_install_extra_args` configuration option
- Do not allow request exceptions (keep retrying, throw error only on the initial login call)
Bug Fixes
- Fix agent update version (#132, thanks @achaiah!)
- Fix login uses GET with payload which breaks when trying to connect a server running in GCP
- Fix `clearml-agent build --docker` stuck on certain containers
- Fix build fails when target is relative path
- Fix pinging running task (change default to once a minute)
- Fix `_` is allowed in k8s label names
- Fix k8s glue does not delete pending pods if the tasks they represent were aborted
- Reintroduce `CLEARML_AGENT_SERVICES_DOCKER_RESTART` accidentally reverted by a previous merge
- Fix `git+ssh://` links inside installed packages not being converted properly to HTTPS-authenticated links
PyPI v1.5.1 - ClearML
New Features and Improvements
- Upgrade requirements for `attrs`, `jsonschema`, `pyparsing`, `six` and `pyjwt` (#129)
- Add default output URI selection to `clearml-agent init`
- Add `agent.disable_task_docker_override` configuration option to disable docker override specified in executing tasks
- Add `CLEARML_AGENT_FORCE_SYSTEM_SITE_PACKAGES` env var (default `true`) to allow overriding default `system_site_packages: true` behavior when running tasks in containers (docker mode and k8s-glue)
Bug Fixes
- Fix using deprecated types validator argument raises an error (deprecated even before `jsonschema` 3.0.0 and unsupported since 4.0.0)
- Fix pip support allowing multiple pip version constraints (by default, one for < Python 3.10 and one for >= Python 3.10)
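
Choosing between two pip constraints by interpreter version is easy to get wrong with string comparison, since `"3.10" < "3.9"` is true for strings. A minimal sketch using tuple comparison follows. The helper name is hypothetical and the constraint strings in the usage lines are placeholders, not the agent's defaults.

```python
import sys
from typing import Optional, Sequence


def pick_pip_constraint(
    below_310: str,
    from_310: str,
    version_info: Optional[Sequence[int]] = None,
) -> str:
    """Return the pip constraint matching the given Python version."""
    version = tuple(sys.version_info[:2] if version_info is None else version_info[:2])
    # Tuple comparison orders (3, 9) before (3, 10), unlike string comparison
    return below_310 if version < (3, 10) else from_310


if __name__ == "__main__":
    # Placeholder constraints, for illustration only
    print(pick_pip_constraint("<A", "<B", (3, 9)))
    print(pick_pip_constraint("<A", "<B", (3, 12)))
```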