Commit 1751e31

Merge pull request #198 from tornede/develop
2 parents 6f1b654 + 06d0884

20 files changed: +1011 −557 lines

.vscode/settings.json

Lines changed: 6 additions & 0 deletions

@@ -62,4 +62,10 @@
   "[python]": {
     "editor.defaultFormatter": "ms-python.black-formatter"
   },
+  "grammarly.selectors": [
+    {
+      "language": "restructuredtext",
+      "scheme": "file"
+    }
+  ],
 }

CHANGELOG.rst

Lines changed: 13 additions & 0 deletions

@@ -2,6 +2,19 @@
 Changelog
 =========
 
+
+v1.4.2 (12.06.2024)
+===================
+
+Feature
+-------
+
+- Added documentation about how to execute PyExperimenter on distributed machines.
+- Improved the usage and documentation of the SSH tunnel to be more flexible and user-friendly.
+- Added the add_experiment_and_execute method to PyExperimenter to add and execute an experiment in one step.
+- Added functionality to attach multiple processes to the same experiment, all of which can write to the database tables of that experiment.
+
 v1.4.1 (11.03.2024)
 ===================
docs/source/examples/example_general_usage.ipynb

Lines changed: 583 additions & 459 deletions
Large diffs are not rendered by default.

docs/source/examples/example_logtables.ipynb

Lines changed: 1 addition & 1 deletion

@@ -1390,7 +1390,7 @@
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
-  "version": "3.9.0"
+  "version": "3.9.19"
 },
 "orig_nbformat": 4,
 "vscode": {

docs/source/usage/database_credential_file.rst

Lines changed: 1 addition & 1 deletion

@@ -19,7 +19,7 @@ Below is an example of a database credential file, that connects to a server wit
   server: example.mysqlserver.com
 
 However, for security reasons, databases might only be accessible from a specific IP address. In these cases, one can use an SSH jump host. This means that ``PyExperimenter`` will first connect to the SSH server
-that has access to the database and then connect to the database server from there. This is done by adding an additional ``Ssh`` section to the database credential file.
+that has access to the database and then connect to the database server from there. This is done by adding an additional ``Ssh`` section to the database credential file, and it can be activated either by a ``PyExperimenter`` keyword argument or in the :ref:`experimenter configuration file <experiment_configuration_file>`.
 The following example shows how to connect to a database server using an SSH server with the address ``ssh_hostname`` and the port ``optional_ssh_port``.
 
 .. code-block:: yaml
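For illustration, a credential file extended with such an ``Ssh`` section could look roughly like the sketch below. The section and key names (``CREDENTIALS``, ``Connection``, ``Ssh``, ``server``, ``port``) are assumptions pieced together from the surrounding text, not the authoritative schema; check the rendered documentation for the exact format.

```yaml
CREDENTIALS:
  Database:
    user: example_user          # illustrative placeholder
    password: example_password  # illustrative placeholder
  Connection:
    Standard:
      server: example.mysqlserver.com
    Ssh:                        # assumed section name; enables the jump-host connection
      server: ssh_hostname
      port: optional_ssh_port
```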
docs/source/usage/distributed_execution.rst

Lines changed: 81 additions & 0 deletions

@@ -0,0 +1,81 @@
+.. _distributed_execution:
+
+=====================
+Distributed Execution
+=====================
+To distribute the execution of experiments across multiple machines, you can follow the standard :ref:`procedure of using PyExperimenter <execution>`, with the following additional considerations.
+
+--------------
+Database Setup
+--------------
+You need a shared database that is accessible to all machines and supports concurrent access. ``SQLite`` is therefore not a good choice for this purpose, which is why we recommend using a ``MySQL`` database instead.
+
+--------
+Workflow
+--------
+While it is theoretically possible for multiple jobs to create new experiments, this introduces the possibility of creating the same experiment multiple times. To prevent this, we recommend the following workflow, where a process is either the ``database handler``, i.e. responsible for creating/resetting experiments, or an ``experiment executor``, actually executing experiments.
+
+.. note::
+    Make sure to use the same :ref:`experiment configuration file <experiment_configuration_file>` and :ref:`database credential file <database_credential_file>` for both types of processes.
+
+
+Database Handling
+-----------------
+
+The ``database handler`` process creates/resets the experiments and stores them in the database once in advance.
+
+.. code-block:: python
+
+    from py_experimenter.experimenter import PyExperimenter
+
+    experimenter = PyExperimenter(
+        experiment_configuration_file_path = "path/to/file",
+        database_credential_file_path = "path/to/file"
+    )
+    experimenter.fill_table_from_config()
+
+
+Experiment Execution
+--------------------
+
+Multiple ``experiment executor`` processes execute the experiments in parallel on different machines, all using the same code. In a typical HPC context, each job starts a single ``experiment executor`` process on a different node.
+
+.. code-block:: python
+
+    from py_experimenter.experimenter import PyExperimenter
+
+    experimenter = PyExperimenter(
+        experiment_configuration_file_path = "path/to/file",
+        database_credential_file_path = "path/to/file"
+    )
+    experimenter.execute(experiment_function, max_experiments=1)
+
+Add Experiment and Execute
+--------------------------
+
+When executing jobs on clusters, one might want to use `hydra combined with submitit <hydra_submitit_>`_ or similar software that configures different jobs. In that case, it makes sense to create the database table once in advance
+
+.. code-block:: python
+
+    ...
+    experimenter = PyExperimenter(
+        experiment_configuration_file_path = "path/to/file",
+        database_credential_file_path = "path/to/file"
+    )
+    experimenter.create_table()
+
+and then add the configured experiments in the worker job, followed by an immediate execution.
+
+.. code-block:: python
+
+    def _experiment_function(keyfields: dict, result_processor: ResultProcessor, custom_fields: dict):
+        ...
+
+    ...
+    @hydra.main(config_path="config", config_name="hydra_configname", version_base="1.1")
+    def experiment_wrapper(config: Configuration):
+
+        ...
+        experimenter = PyExperimenter(
+            experiment_configuration_file_path = "some/value/from/config",
+            database_credential_file_path = "path/to/file"
+        )
+        experimenter.add_experiment_and_execute(keyfield_values_from_config, _experiment_function)
+
+.. _hydra_submitit: https://hydra.cc/docs/plugins/submitit_launcher/
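The reason the database setup above insists on concurrent access is that every ``experiment executor`` must atomically claim an open experiment, so that no experiment is executed twice. The following is a minimal, PyExperimenter-independent sketch of that claiming pattern, with an in-memory table and a lock standing in for a real MySQL database (all names are illustrative):

```python
import threading

# Hypothetical stand-in for the experiment table: id -> status.
experiments = {1: "open", 2: "open", 3: "open"}
lock = threading.Lock()
executed = []  # records which "node" ran which experiment

def claim_next_open():
    """Atomically flip one 'open' experiment to 'running' and return its id."""
    with lock:
        for exp_id, status in experiments.items():
            if status == "open":
                experiments[exp_id] = "running"
                return exp_id
    return None

def executor(name):
    # Each executor keeps claiming until no open experiments remain.
    while (exp_id := claim_next_open()) is not None:
        executed.append((name, exp_id))  # the actual experiment would run here
        experiments[exp_id] = "done"

workers = [threading.Thread(target=executor, args=(f"node{i}",)) for i in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Each experiment was claimed, and therefore executed, exactly once.
assert sorted(e for _, e in executed) == [1, 2, 3]
```

Because the claim happens under a single lock (in a real deployment, under the database's row locking), two executors can never pick up the same experiment.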

docs/source/usage/execution.rst

Lines changed: 76 additions & 1 deletion

@@ -109,6 +109,78 @@ An experiment can be executed easily with the following call:
 - ``max_experiments`` determines how many experiments will be executed by this ``PyExperimenter``. If set to ``-1``, it will execute experiments in a sequential fashion until no more open experiments are available.
 - ``random_order`` determines if the experiments will be executed in a random order. By default, the parameter is set to ``False``, meaning that experiments will be executed ordered by their ``id``.
 
+.. _add_experiment_and_execute:
+
+--------------------------
+Add Experiment and Execute
+--------------------------
+
+Instead of filling the database table with rows and then executing the experiments, it is also possible to add an experiment and execute it directly. This can be done with the following call:
+
+.. code-block:: python
+
+    experimenter.add_experiment_and_execute(
+        keyfields = {'dataset': 'new_data', 'cross_validation_splits': 4, 'seed': 42, 'kernel': 'poly'},
+        experiment_function = run_experiment
+    )
+
+This function may be useful in case of dependencies, where the result of one experiment is needed to configure the next one, or if the experiments are supposed to be configured with software such as `Hydra <hydra_>`_.
+
+.. _attach:
+
+----------------------------
+Attach to Running Experiment
+----------------------------
+
+In multiprocessing settings, where the ``experiment_function`` contains a main job that runs multiple additional workers in other processes (possibly on different machines), it is inconvenient to log all information through the main job. Therefore, we allow these workers to also attach to the database and log their information about the same experiment.
+
+First, a worker experiment function wrapper has to be defined, which handles the execution of a task in a separate process. The actual worker experiment function is defined inside the wrapper. The worker function is then attached to the experiment and logs its information on its own. In case more arguments are needed within the worker function, they can be passed to the wrapper function as keyword arguments.
+
+.. code-block:: python
+
+    def worker_experiment_function_wrapper(experiment_id: int, **kwargs):
+
+        def worker_experiment_function(result_processor: ResultProcessor):
+            # Worker Experiment Execution
+            result = do_something_else()
+
+            result_processor.process_logs(
+                # Some Logs
+            )
+            return result
+
+        return experimenter.attach(worker_experiment_function, experiment_id)
+
+
+.. note::
+
+    The ``experimenter.attach`` function returns the result of ``worker_experiment_function``.
+
+Second, the main experiment function has to be defined, calling the wrapper created above, which is provided with the ``experiment_id`` and started in a separate process:
+
+.. code-block:: python
+
+    def main_experiment_function(keyfields: dict, result_processor: ResultProcessor, custom_fields: dict):
+        # Main Experiment Execution
+        do_something()
+
+        # Start worker in a different process, and provide it with the experiment_id
+        result = worker_experiment_function_wrapper(result_processor.experiment_id)
+
+        # Compute Something
+        do_more()
+
+        result_processor.process_results(
+            # Results
+        )
+
+Afterwards, the main experiment function can be started as usual:
+
+.. code-block:: python
+
+    experimenter.execute(main_experiment_function, max_experiments=-1)
+
 .. _reset_experiments:
 
 -----------------
@@ -214,4 +286,7 @@ If an SSH tunnel was opened during the creation of the ``PyExperimenter``, it ha
 .. code-block:: python
 
     experimenter.execute(...)
-    experimenter.close_ssh_tunnel()
+    experimenter.close_ssh_tunnel()
+
+
+.. _hydra: https://hydra.cc/
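Stripped of PyExperimenter specifics, the attach pattern above boils down to a main job and a worker both writing to the same experiment's log under one ``experiment_id``. A hedged stdlib sketch of that shape (a thread stands in for the separate process, a list for the database log table; all names are illustrative, not the library's API):

```python
import threading

logs = []  # stands in for the experiment's database log table
log_lock = threading.Lock()

def process_logs(experiment_id, message):
    # Both main job and worker append under the same experiment id.
    with log_lock:
        logs.append((experiment_id, message))

def worker_experiment_function(experiment_id):
    # The "attached" worker logs on its own, not via the main job.
    process_logs(experiment_id, "worker: partial result")
    return 21

def main_experiment_function(experiment_id):
    process_logs(experiment_id, "main: started")
    result_holder = {}
    worker = threading.Thread(
        target=lambda: result_holder.update(worker=worker_experiment_function(experiment_id))
    )
    worker.start()
    worker.join()  # the worker's return value comes back, as with experimenter.attach
    process_logs(experiment_id, "main: finished")
    return result_holder["worker"] * 2

result = main_experiment_function(experiment_id=7)
```

After the run, `logs` holds one entry from the worker and two from the main job, all tagged with experiment id 7, mirroring how both processes write to the same experiment's tables.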

docs/source/usage/experiment_configuration_file.rst

Lines changed: 2 additions & 0 deletions

@@ -14,6 +14,7 @@ The experiment configuration file is primarily used to define the database backe
 Database:
     provider: sqlite
     database: py_experimenter
+    use_ssh_tunnel: False
     table:
         name: example_general_usage
         keyfields:
@@ -69,6 +70,7 @@ The ``Database`` section defines the database and its structure.
 
 - ``provider``: The provider of the database connection. Currently, ``sqlite`` and ``mysql`` are supported. In the case of ``mysql``, an additional :ref:`database credential file <database_credential_file>` has to be created.
 - ``database``: The name of the database to create or connect to.
+- ``use_ssh_tunnel``: Flag deciding whether the database is connected to via SSH, as defined in the :ref:`database credential file <database_credential_file>`. This is ignored if ``sqlite`` is chosen as the provider. Optional parameter; defaults to ``False``.
 - ``table``: Defines the structure and predefined values for the experiment table.
 
 - ``name``: The name of the experiment table to create or connect to.

docs/source/usage/index.rst

Lines changed: 1 addition & 0 deletions

@@ -38,3 +38,4 @@ The following steps are necessary to execute the ``PyExperimenter``.
    ./database_credential_file
    ./experiment_function
    ./execution
+   ./distributed_execution

py_experimenter/config.py

Lines changed: 12 additions & 5 deletions

@@ -9,11 +9,7 @@
 from omegaconf import DictConfig, ListConfig, OmegaConf
 
 from py_experimenter import utils
-from py_experimenter.exceptions import (
-    InvalidColumnError,
-    InvalidConfigError,
-    InvalidLogtableError,
-)
+from py_experimenter.exceptions import InvalidColumnError, InvalidConfigError, InvalidLogtableError
 
 
 class Cfg(ABC):
@@ -45,6 +41,7 @@ class DatabaseCfg(Cfg):
     def __init__(
         self,
         provider: str,
+        use_ssh_tunnel: bool,
         database_name: str,
         table_name: str,
         result_timestamps: bool,
@@ -58,6 +55,8 @@ def __init__(
 
         :param provider: Database Provider; either `sqlite` or `mysql`
         :type provider: str
+        :param use_ssh_tunnel: Whether to use an SSH tunnel to connect to the database
+        :type use_ssh_tunnel: bool
         :param database_name: Name of the database
         :type database_name: str
         :param table_name: Name of the table
@@ -71,6 +70,7 @@ def __init__(
         :type logtables: Dict[str, Dict[str,str]]
         """
         self.provider = provider
+        self.use_ssh_tunnel = use_ssh_tunnel
         self.database_name = database_name
         self.table_name = table_name
         self.result_timestamps = result_timestamps
@@ -85,6 +85,8 @@ def extract_config(config: OmegaConf, logger: logging.Logger) -> Tuple["Database
         database_config = config["PY_EXPERIMENTER"]["Database"]
         table_config = database_config["table"]
         provider = database_config["provider"]
+        # Optional use_ssh_tunnel; defaults to False when the key is absent
+        use_ssh_tunnel = database_config["use_ssh_tunnel"] if "use_ssh_tunnel" in database_config else False
         database_name = database_config["database"]
         table_name = database_config["table"]["name"]
 
@@ -97,6 +99,7 @@ def extract_config(config: OmegaConf, logger: logging.Logger) -> Tuple["Database
 
         return DatabaseCfg(
             provider,
+            use_ssh_tunnel,
             database_name,
             table_name,
             result_timestamps,
@@ -208,6 +211,9 @@ def valid(self) -> bool:
         if self.provider not in ["sqlite", "mysql"]:
             self.logger.error("Database provider must be either sqlite or mysql")
             return False
+        if self.use_ssh_tunnel not in [True, False]:
+            self.logger.error("Use SSH tunnel must be a boolean.")
+            return False
         if not isinstance(self.database_name, str):
             self.logger.error("Database name must be a string")
             return False
@@ -372,6 +378,7 @@ def extract_config(config_path: str, logger: logging.Logger) -> "PyExperimenterC
     def valid(self) -> bool:
         if not (isinstance(self.n_jobs, int) and self.n_jobs > 0):
             self.logger.error("n_jobs must be a positive integer")
+            return False
         if not (self.database_configuration.valid() and self.custom_configuration.valid() and self.codecarbon_configuration.valid()):
             self.logger.error("Database configuration invalid")
             return False
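The optional-key handling added to ``extract_config`` follows a common pattern: read the key if present, fall back to a default, and validate the type separately. A small stand-alone sketch of that pattern (not the library's actual code; the ``use_ssh_tunnel`` key name follows the documented configuration file):

```python
import logging

logger = logging.getLogger("config_sketch")

def extract_use_ssh_tunnel(database_config: dict) -> bool:
    # Optional key: absent means the SSH tunnel is disabled.
    return database_config.get("use_ssh_tunnel", False)

def valid_use_ssh_tunnel(value) -> bool:
    # Stricter than `value not in [True, False]`, which would accept 1 and 0.
    if not isinstance(value, bool):
        logger.error("Use SSH tunnel must be a boolean.")
        return False
    return True
```

For example, `extract_use_ssh_tunnel({"provider": "mysql"})` yields `False`, while an explicit `{"use_ssh_tunnel": True}` yields `True`; `valid_use_ssh_tunnel("yes")` rejects non-boolean values.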
