Web UI #634

Status: Open. Wants to merge 140 commits into base: main.

Commits (140)
040d446
added fastapi as web ui backend
VukW Jul 11, 2024
97e6bd7
Added cube + benchmark basic listing
VukW Jul 12, 2024
0382684
Adds navigation
VukW Jul 15, 2024
55fe60e
Aded mlcube detailed page
VukW Jul 19, 2024
fb1bca3
Improved mlcubes detailed layout
VukW Jul 19, 2024
64cf53e
Improved mlcube layout
VukW Jul 19, 2024
36611e1
yaml displaying
VukW Jul 19, 2024
56fa5c4
yaml: spinner
VukW Jul 19, 2024
8563887
yaml panel improvement
VukW Jul 19, 2024
07ce4ab
yaml panel layout improvement
VukW Jul 19, 2024
b260401
layout fixes
VukW Jul 19, 2024
b7980a8
Added benchmark detailed page
VukW Jul 19, 2024
ca356cc
added links to mlcube
VukW Jul 19, 2024
6efd724
benchmark page: added owner
VukW Jul 19, 2024
319b1bf
Colors refactoring
VukW Jul 19, 2024
58008f3
Dataset detailed page
VukW Jul 23, 2024
375d89e
Forgot to add js file
VukW Jul 23, 2024
c6d8a56
Unified data format for all data fields automatically
VukW Jul 23, 2024
74f7743
(mlcube-detailed) Display image tarball and additional files always
VukW Jul 24, 2024
b312882
Fixed scrolling and reinvented basic page layout
VukW Jul 24, 2024
0e282cb
Fix navbar is hiding
VukW Jul 24, 2024
6b28ebb
Make templates & static files independent of user's workdir
VukW Jul 29, 2024
881b281
Added error handling
VukW Jul 29, 2024
e28107b
Display invalid entities correctly
VukW Jul 30, 2024
5b718eb
Added invalid entities highlighting + badges
VukW Jul 30, 2024
0f95027
Added benchmark associations
VukW Aug 5, 2024
444786e
Improved association panel style
VukW Aug 5, 2024
e273577
Added association card
VukW Aug 6, 2024
eea1e77
Sorted associations by status / timestamp
VukW Aug 6, 2024
7b68911
Sorted mlcubes and datasets: mine first
VukW Aug 6, 2024
8251c42
Added associations to dataset page
VukW Aug 7, 2024
b669358
Added associations to mlcube page
VukW Aug 7, 2024
039f496
Refactored details page - extracted common styles to the base template
VukW Aug 10, 2024
c225a5e
Refactored association sorting to common util
VukW Aug 10, 2024
ad0451f
Display my benchmarks first
VukW Aug 10, 2024
12ffef2
Hid empty links
VukW Aug 12, 2024
cedad96
Mlcube-as-a-link unified view
VukW Aug 12, 2024
3ac8a74
resources.path cannot return a dir with subdirs for py3.9
VukW Aug 13, 2024
6170b53
Fixed resources path for templates also
VukW Aug 14, 2024
53b557b
linter fix
VukW Aug 14, 2024
2b73c4f
static local resources instead of remote ones
VukW Aug 26, 2024
75d6776
layout fix: align mlcubes vertically
VukW Aug 27, 2024
c47a751
bugfix: add some dependencies for isolated run
VukW Aug 27, 2024
d837837
Merge branch 'main' into web-ui
VukW Aug 27, 2024
c58efd8
Fixes after merging main
VukW Aug 28, 2024
f2f25c0
Dataset creation step 1
VukW Sep 10, 2024
4da2628
Dataset submission wizard
VukW Sep 11, 2024
8e73e54
MedperfSchema requires a name field
VukW Sep 17, 2024
a78ef8d
Linter fix
VukW Sep 17, 2024
64f26ff
Merge branch 'web-ui' into web-ui-dataset
VukW Sep 17, 2024
14f87a9
Linter fix
VukW Sep 17, 2024
812cd7e
Almost added dataset prepare
VukW Sep 23, 2024
7f86b1b
Added set-operational functionality
VukW Sep 25, 2024
cfcf9df
Handling set-op errors (unsuccessful)
VukW Sep 25, 2024
08f2ca7
Handling set-op errors
VukW Sep 30, 2024
04f8c11
Displaying preparation logs in a beauty way
VukW Oct 2, 2024
d617a04
refactored dataset routes
VukW Oct 3, 2024
1bd0926
Associate dataset with the benchmark
VukW Oct 6, 2024
f38a6ab
Association: choose benchmark
VukW Oct 6, 2024
f0769b2
Unified page name
VukW Oct 8, 2024
1384b21
Pass mlcube params instead of url
aristizabal95 Oct 8, 2024
64d8b3c
Pass mlcube parameters to fetch-yaml
aristizabal95 Oct 8, 2024
7d6f01a
Merge pull request #9 from aristizabal95/web-ui-fetch-yaml
VukW Oct 9, 2024
015354e
Merge remote-tracking branch 'personal/web-ui' into web-ui-dataset
VukW Oct 9, 2024
96362de
Added dataset report + refactored yaml panel styles
VukW Oct 9, 2024
b3c81c1
linter fix
VukW Oct 9, 2024
43d2b77
Backend for running bmk over dataset in background
VukW Oct 14, 2024
75f9c5c
Added FE for model run
VukW Oct 15, 2024
f2ddd62
bugfix: mark last stage as completed also
VukW Oct 15, 2024
df0e2c2
Redesigned dataset run page
VukW Oct 15, 2024
e082cec
bugfix
VukW Oct 15, 2024
4926f07
Restyled model list
VukW Oct 15, 2024
4b45f49
Updated models list layout
VukW Oct 15, 2024
35ded73
Restyled models list
VukW Oct 15, 2024
2ab585d
Restyled the run buttons
VukW Oct 16, 2024
a7fdd52
Added "Running" state
VukW Oct 16, 2024
d9d2932
"Run all" button
VukW Oct 16, 2024
397aed4
removed unused code
VukW Oct 16, 2024
7780bc0
minor bugfixes
VukW Oct 17, 2024
7d20e31
Result submission
VukW Oct 22, 2024
2a1d55d
bugfix: status was passed wrongly if result is submitted (as draft is…
VukW Oct 22, 2024
48e388e
Auth by security token
VukW Oct 24, 2024
23908f5
Restyled dataset pipeline buttons
VukW Oct 24, 2024
020da3e
Merge remote-tracking branch 'origin/main' into webui-dataset
mhmdk0 Dec 21, 2024
e59d877
Merge remote-tracking branch 'upstream/main' into web-ui
hasan7n Dec 21, 2024
88c5eb0
Merge remote-tracking branch 'upstream/web-ui' into webui-dataset
hasan7n Dec 21, 2024
e265382
temp
mhmdk0 Jan 6, 2025
d4e7151
temp 1
mhmdk0 Jan 9, 2025
e636cdd
fix dataset submission and preparation
mhmdk0 Jan 9, 2025
79e25e6
set operation temp
mhmdk0 Jan 9, 2025
72b2eef
refactoring dataset - temp
mhmdk0 Jan 11, 2025
2bdd9f4
temp
mhmdk0 Jan 12, 2025
638fc5e
finalize dataset
mhmdk0 Jan 18, 2025
37b5106
include bootstrap 5, update old ignored 'non commited' files
mhmdk0 Jan 18, 2025
bb7a5ae
Finalize Dataset
mhmdk0 Jan 24, 2025
a21dea0
finalize model owner
mhmdk0 Jan 28, 2025
e135f09
Finalize data owner, model owner, and benchmark owner functionalities
mhmdk0 Jan 30, 2025
4640730
modifications for demo video
mhmdk0 Jan 30, 2025
c6381d8
finalize running medperf tutorial
mhmdk0 Feb 23, 2025
dc9e6fd
finalize running medperf tutorial - 2
mhmdk0 Feb 23, 2025
32a4665
fixes missing route
mhmdk0 Feb 23, 2025
3211ca6
Medperf Login Implementation
mhmdk0 Feb 23, 2025
05b35cc
add profiles (activate/view)
mhmdk0 Feb 23, 2025
2dc1dfb
remove "login" name for token security check
hasan7n Feb 25, 2025
8c104ae
remove outdated code
hasan7n Feb 25, 2025
68aa2dc
rename security check button
hasan7n Feb 25, 2025
47eb79d
view benchmark associations only if owner
hasan7n Feb 25, 2025
65c6407
misc improvements
hasan7n Feb 25, 2025
4df5ca8
make parameters and additional files optional
hasan7n Feb 25, 2025
f63f4fd
fix result view and submission bug
hasan7n Feb 25, 2025
3377a3d
add script for admin benchmark approval
hasan7n Feb 25, 2025
1bca4cc
remove logins from tutorial setup scripts
hasan7n Feb 25, 2025
493e969
fix some fields names
hasan7n Feb 25, 2025
6144fa0
show mlcube ID in details
hasan7n Feb 25, 2025
b8c72f3
Update benchmark_detail.html (#636)
alexkarargyris Mar 12, 2025
2d5187e
Update benchmark_submit.html (#637)
alexkarargyris Mar 12, 2025
7be2bac
Update benchmarks.html (#638)
alexkarargyris Mar 12, 2025
18812cf
Update benchmarks.html (#639)
alexkarargyris Mar 12, 2025
c074993
Update workflow_test.html (#640)
alexkarargyris Mar 12, 2025
4acedd4
view results after benchmark workflow test and modal compatibility te…
mhmdk0 Mar 22, 2025
72099a2
fix association dropdown in mlcube and dataset pages.
mhmdk0 Mar 23, 2025
9795660
modify "show only my {entity}" position
mhmdk0 Mar 23, 2025
ac3d0b3
change MLCube -> Container, Demo -> Reference, Submit -> Register, an…
mhmdk0 Apr 3, 2025
cdd049f
Implement logs/notifications
mhmdk0 Apr 15, 2025
ed1d25f
front-end refactoring
mhmdk0 Apr 23, 2025
73df291
modals refactoring + modifications for maintainability
mhmdk0 Apr 25, 2025
6a26f74
Front end refactoring + Bug fixes
mhmdk0 Apr 29, 2025
8c8c6c3
add confirmation popup before running tasks
mhmdk0 May 1, 2025
db5e4f5
add security check help
mhmdk0 May 1, 2025
b8ee61b
add file/folder browsing
mhmdk0 May 2, 2025
8a86944
fix dataset/model association cancellation for web-ui
mhmdk0 May 2, 2025
15ea1c4
profile activation fix, enhancements
mhmdk0 May 8, 2025
b6fcda9
add notification for prompt / bug fixes
mhmdk0 May 8, 2025
4618ab3
design changes for associations in benchmark details
mhmdk0 May 8, 2025
227c509
Merge remote-tracking branch 'origin/main' into web-ui
mhmdk0 May 22, 2025
c65a0cf
update web-ui according to cli changes
mhmdk0 May 23, 2025
bc1f3ad
add import/export to web-ui
mhmdk0 May 24, 2025
d0394bf
prevent multiple tasks from running - web-ui backend
mhmdk0 May 24, 2025
47a8c33
change how web-ui display actions depending on entity owner
mhmdk0 May 27, 2025
f81a57d
improve dataset import and fix its tests
mhmdk0 May 27, 2025
5 changes: 4 additions & 1 deletion .gitignore
@@ -60,7 +60,7 @@ cover/
local_settings.py
db.sqlite3
db.sqlite3-journal
static/
server/static/
*.crt
*.key
*.pem
@@ -147,6 +147,9 @@ cython_debug/
# Dev Environment Specific
.vscode
.venv

# Medperf tutorial files
medperf_tutorial
server/keys

# exclude fl example
3 changes: 2 additions & 1 deletion cli/medperf/cli.py
@@ -21,7 +21,7 @@
import medperf.commands.ca.ca as ca
import medperf.commands.certificate.certificate as certificate
import medperf.commands.storage as storage

import medperf.web_ui.app as web_ui
from medperf.utils import check_for_updates
from medperf.logging.utils import log_machine_details

@@ -40,6 +40,7 @@
app.add_typer(aggregator.app, name="aggregator", help="Manage aggregators")
app.add_typer(ca.app, name="ca", help="Manage CAs")
app.add_typer(certificate.app, name="certificate", help="Manage certificates")
app.add_typer(web_ui.app, name="web-ui", help="local web UI to manage medperf entities")


@app.command("run")
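The cli.py hunk above mounts the web UI as a Typer sub-application via `app.add_typer`. A minimal, self-contained sketch of the same mounting pattern; the `run` command and port number here are illustrative stand-ins, not MedPerf's actual code:

```python
import typer

# Parent CLI application (plays the role of medperf's `app`)
app = typer.Typer()

# Sub-application, analogous to medperf.web_ui.app
web_ui = typer.Typer()


@web_ui.callback()
def web_ui_callback():
    """local web UI to manage medperf entities"""


@web_ui.command("run")
def run(port: int = 8100):
    """Start the local web UI (placeholder for the real server startup)."""
    typer.echo(f"Serving web UI on port {port}")


# Mounting exposes the command as `<prog> web-ui run`
app.add_typer(web_ui, name="web-ui", help="local web UI to manage medperf entities")

if __name__ == "__main__":
    app()
```

With this mounting, `web-ui run --port ...` dispatches into the sub-application, which is how the new web-ui command group sits alongside aggregator, ca, and certificate.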
6 changes: 4 additions & 2 deletions cli/medperf/commands/benchmark/submit.py
@@ -41,6 +41,8 @@ def run(
ui.print("Uploaded")
submission.to_permanent_path(updated_benchmark_body)
submission.write(updated_benchmark_body)
print(submission.bmk.id)
return submission.bmk.id

def __init__(
self,
@@ -103,5 +105,5 @@ def to_permanent_path(self, bmk_dict: dict):
os.rename(old_bmk_loc, new_bmk_loc)

def write(self, updated_body):
bmk = Benchmark(**updated_body)
bmk.write()
self.bmk = Benchmark(**updated_body)
self.bmk.write()
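The submit.py hunks above store the written Benchmark on `self` so that `run()` can hand the new benchmark id back to callers. A toy sketch of that pattern; `Benchmark` and `SubmitBenchmark` here are simplified stand-ins for the real classes, with ids assigned locally instead of by the server:

```python
class Benchmark:
    """Stand-in entity; write() assigns a server-style incremental id."""
    _next_id = 1

    def __init__(self, **body):
        self.body = body
        self.id = None

    def write(self):
        self.id = Benchmark._next_id
        Benchmark._next_id += 1


class SubmitBenchmark:
    def write(self, updated_body):
        # Keeping the instance on self (instead of a local variable)
        # lets callers read submission.bmk.id after run() finishes
        self.bmk = Benchmark(**updated_body)
        self.bmk.write()

    @classmethod
    def run(cls, updated_body):
        submission = cls()
        submission.write(updated_body)
        return submission.bmk.id
```

Returning the id from `run()` is what lets a caller such as a web backend redirect to the newly created entity without re-querying.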
6 changes: 4 additions & 2 deletions cli/medperf/commands/compatibility_test/run.py
@@ -7,6 +7,7 @@
from medperf.exceptions import InvalidArgumentError
from .validate_params import CompatibilityTestParamsValidator
from .utils import download_demo_data, prepare_cube, get_cube, create_test_dataset
import medperf.config as config


class CompatibilityTestExecution:
@@ -87,8 +88,9 @@ def run(
test_exec.validate()
test_exec.set_data_source()
test_exec.process_benchmark()
test_exec.prepare_cubes()
test_exec.prepare_dataset()
with config.ui.interactive():
test_exec.prepare_cubes()
test_exec.prepare_dataset()
test_exec.initialize_report()
results = test_exec.cached_results()
if results is None:
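This hunk wraps cube and dataset preparation in `config.ui.interactive()`. The implementation is not part of this diff; a plausible sketch, under the assumption that it is a context manager that enables live (spinner-style) output for the wrapped calls and restores the previous state afterwards, even on error:

```python
from contextlib import contextmanager


class UI:
    """Minimal stand-in for medperf's UI object."""

    def __init__(self):
        self.interactive_mode = False

    @contextmanager
    def interactive(self):
        # Enable live output only for the duration of the with-block,
        # restoring the previous state afterwards (even if an error is raised)
        prev = self.interactive_mode
        self.interactive_mode = True
        try:
            yield self
        finally:
            self.interactive_mode = prev
```

The `try/finally` is the important part: long-running preparation steps can fail, and the UI must still leave interactive mode.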
2 changes: 1 addition & 1 deletion cli/medperf/commands/dataset/dataset.py
@@ -238,7 +238,7 @@ def import_dataset(
raw_path: str = typer.Option(
None,
"--raw_dataset_path",
help="New path of the DEVELOPMENT dataset raw data to be saved.",
help="New path of the DEVELOPMENT dataset raw data to be saved. Directory should be empty or doesn't exist.",
),
):
"""Imports dataset files from specified tar.gz file."""
6 changes: 4 additions & 2 deletions cli/medperf/commands/dataset/import_dataset.py
@@ -24,17 +24,19 @@

def validate_input(self):
# The input archive file should exist and be a file
if not os.path.exists(self.input_path):

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
raise InvalidArgumentError(f"File {self.input_path} doesn't exist.")
if not os.path.isfile(self.input_path):

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
raise InvalidArgumentError(f"{self.input_path} is not a file.")

# raw_data_path should be provided if the imported dataset is in dev
if self.dataset.state == "DEVELOPMENT" and (
self.raw_data_path is None or os.path.exists(self.raw_data_path)
self.raw_data_path is None
or os.path.isfile(self.raw_data_path)

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]

Copilot Autofix (AI, 6 days ago):

To fix the issue, validate raw_data_path to ensure it is within a predefined safe root directory:

  1. Normalize raw_data_path using os.path.realpath or Path.resolve to remove any .. segments or symbolic links.
  2. Verify that the normalized path starts with a predefined safe root directory (e.g., a directory dedicated to storing raw data).
  3. Raise an exception if the validation fails.

The validation should be added in the validate_input method of the ImportDataset class, as this is where the input parameters are initially checked.

Suggested changeset (2 files). Patch for cli/medperf/commands/dataset/import_dataset.py:

diff --git a/cli/medperf/commands/dataset/import_dataset.py b/cli/medperf/commands/dataset/import_dataset.py
--- a/cli/medperf/commands/dataset/import_dataset.py
+++ b/cli/medperf/commands/dataset/import_dataset.py
@@ -32,10 +32,22 @@
         # raw_data_path should be provided if the imported dataset is in dev
-        if self.dataset.state == "DEVELOPMENT" and (
-            self.raw_data_path is None
-            or os.path.isfile(self.raw_data_path)
-            or (os.path.exists(self.raw_data_path) and os.listdir(self.raw_data_path))
-        ):
-            raise InvalidArgumentError(
-                "Output raw data path must be specified and, the directory should be empty or does not exist."
-            )
+        if self.dataset.state == "DEVELOPMENT":
+            if self.raw_data_path is None:
+                raise InvalidArgumentError(
+                    "Output raw data path must be specified."
+                )
+
+            # Normalize and validate raw_data_path
+            safe_root = config.raw_data_storage  # Define a safe root directory in the config
+            normalized_path = str(Path(self.raw_data_path).resolve())
+            if not normalized_path.startswith(str(Path(safe_root).resolve())):
+                raise InvalidArgumentError(
+                    f"Invalid raw data path: {self.raw_data_path}. Path must be within {safe_root}."
+                )
+
+            if os.path.isfile(normalized_path) or (
+                os.path.exists(normalized_path) and os.listdir(normalized_path)
+            ):
+                raise InvalidArgumentError(
+                    "Output raw data path must be an empty directory or not exist."
+                )

Patch for cli/medperf/web_ui/datasets/routes.py (outside changed files):

diff --git a/cli/medperf/web_ui/datasets/routes.py b/cli/medperf/web_ui/datasets/routes.py
--- a/cli/medperf/web_ui/datasets/routes.py
+++ b/cli/medperf/web_ui/datasets/routes.py
@@ -434,3 +434,3 @@
     try:
-        ImportDataset.run(dataset_id, input_path, raw_dataset_path)
+        ImportDataset.run(dataset_id, input_path, raw_dataset_path or config.raw_data_storage)
         return_response["status"] = "success"
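One caveat on the suggested containment test: checking `str(path).startswith(str(root))` compares raw strings, so a sibling directory such as `/opt/data/raw-evil` would pass a check against `/opt/data/raw`. A sketch of a component-wise containment check using `os.path.commonpath`; the directory names are made up for the example:

```python
import os
from pathlib import Path


def is_within(root: str, candidate: str) -> bool:
    """True if candidate resolves to a location inside root."""
    root_real = Path(root).resolve()
    cand_real = Path(candidate).resolve()
    # commonpath compares whole path components, unlike str.startswith,
    # so "/opt/data/raw-evil" is not treated as inside "/opt/data/raw"
    return os.path.commonpath([root_real, cand_real]) == str(root_real)
```

Because `Path.resolve` also collapses `..` segments, a traversal such as `raw/../other` fails the check as well.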
or (os.path.exists(self.raw_data_path) and os.listdir(self.raw_data_path))

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]

Copilot Autofix (AI, 6 days ago):

To fix the issue, validate raw_data_path to ensure it is within a safe root directory and does not allow directory traversal:

  1. Define a safe root directory for raw_data_path.
  2. Normalize the user-provided path using os.path.normpath or Path.resolve to eliminate any .. segments.
  3. Verify that the normalized path starts with the safe root directory.

The changes will be made in the validate_input method of the ImportDataset class in cli/medperf/commands/dataset/import_dataset.py. Additionally, the import_dataset function in cli/medperf/web_ui/datasets/routes.py will define a safe root directory for raw_data_path.

Suggested changeset (2 files). Patch for cli/medperf/commands/dataset/import_dataset.py:

diff --git a/cli/medperf/commands/dataset/import_dataset.py b/cli/medperf/commands/dataset/import_dataset.py
--- a/cli/medperf/commands/dataset/import_dataset.py
+++ b/cli/medperf/commands/dataset/import_dataset.py
@@ -32,10 +32,18 @@
         # raw_data_path should be provided if the imported dataset is in dev
-        if self.dataset.state == "DEVELOPMENT" and (
-            self.raw_data_path is None
-            or os.path.isfile(self.raw_data_path)
-            or (os.path.exists(self.raw_data_path) and os.listdir(self.raw_data_path))
-        ):
-            raise InvalidArgumentError(
-                "Output raw data path must be specified and, the directory should be empty or does not exist."
-            )
+        if self.dataset.state == "DEVELOPMENT":
+            if self.raw_data_path is None:
+                raise InvalidArgumentError("Output raw data path must be specified.")
+
+            # Normalize and validate the raw_data_path
+            safe_root = config.safe_root  # Safe root directory defined in config
+            normalized_path = Path(self.raw_data_path).resolve()
+            if not str(normalized_path).startswith(str(Path(safe_root).resolve())):
+                raise InvalidArgumentError(
+                    f"Invalid raw data path: {self.raw_data_path}. Path must be within {safe_root}."
+                )
+
+            if normalized_path.is_file() or (normalized_path.exists() and any(normalized_path.iterdir())):
+                raise InvalidArgumentError(
+                    "Output raw data path must be an empty directory or not exist."
+                )

Patch for cli/medperf/web_ui/datasets/routes.py (outside changed files):

diff --git a/cli/medperf/web_ui/datasets/routes.py b/cli/medperf/web_ui/datasets/routes.py
--- a/cli/medperf/web_ui/datasets/routes.py
+++ b/cli/medperf/web_ui/datasets/routes.py
@@ -434,3 +434,8 @@
     try:
-        ImportDataset.run(dataset_id, input_path, raw_dataset_path)
+        # Define a safe root directory for raw_dataset_path
+        safe_root = config.safe_root  # Safe root directory defined in config
+        if raw_dataset_path:
+            raw_dataset_path = str(Path(safe_root).joinpath(raw_dataset_path).resolve())
+
+        ImportDataset.run(dataset_id, input_path, raw_dataset_path)
         return_response["status"] = "success"

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]

Copilot Autofix (AI, 6 days ago):

To fix the issue, validate raw_data_path to ensure it is safe to use:

  1. Normalize the path using os.path.normpath or Path.resolve to remove any .. segments or symbolic links.
  2. Ensure the normalized path is contained within a predefined safe root directory (e.g., a specific directory for raw data).
  3. Raise an exception if the path is invalid or outside the allowed directory.

The changes will be made in the validate_input method of the ImportDataset class in cli/medperf/commands/dataset/import_dataset.py.

Suggested changeset (1 file). Patch for cli/medperf/commands/dataset/import_dataset.py:

diff --git a/cli/medperf/commands/dataset/import_dataset.py b/cli/medperf/commands/dataset/import_dataset.py
--- a/cli/medperf/commands/dataset/import_dataset.py
+++ b/cli/medperf/commands/dataset/import_dataset.py
@@ -32,10 +32,23 @@
         # raw_data_path should be provided if the imported dataset is in dev
-        if self.dataset.state == "DEVELOPMENT" and (
-            self.raw_data_path is None
-            or os.path.isfile(self.raw_data_path)
-            or (os.path.exists(self.raw_data_path) and os.listdir(self.raw_data_path))
-        ):
-            raise InvalidArgumentError(
-                "Output raw data path must be specified and, the directory should be empty or does not exist."
-            )
+        if self.dataset.state == "DEVELOPMENT":
+            if self.raw_data_path is None:
+                raise InvalidArgumentError(
+                    "Output raw data path must be specified."
+                )
+
+            # Normalize and validate the raw_data_path
+            safe_root = config.raw_data_storage  # Define a safe root directory
+            normalized_path = Path(self.raw_data_path).resolve()
+            if not str(normalized_path).startswith(str(Path(safe_root).resolve())):
+                raise InvalidArgumentError(
+                    f"Invalid raw data path: {self.raw_data_path}. Path must be within {safe_root}."
+                )
+
+            # Ensure the directory is empty or does not exist
+            if os.path.isfile(normalized_path) or (
+                os.path.exists(normalized_path) and os.listdir(normalized_path)
+            ):
+                raise InvalidArgumentError(
+                    "Output raw data path must be an empty directory or not exist."
+                )
):
raise InvalidArgumentError(
"Output raw data path must be specified and shouldn't exist."
"Output raw data path must be specified and, the directory should be empty or does not exist."
)

def untar_files(self):
@@ -83,11 +85,11 @@
archive_config = os.path.join(
root_archive_folder, config.archive_config_filename
)
if not os.path.exists(archive_config):

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
raise ExecutionError(
"Dataset archive is invalid, config file doesn't exist"
)
with open(archive_config) as f:

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
archive_config = yaml.safe_load(f)

# validate config
@@ -97,7 +99,7 @@
archive_prepared_dataset_path = os.path.join(
root_archive_folder, str(self.dataset_id)
)
if not os.path.exists(archive_prepared_dataset_path):

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
raise ExecutionError("No prepared dataset in archive")

if os.path.exists(self.dataset.data_path) or os.path.exists(
@@ -114,8 +116,8 @@
archive_raw_labels_path = os.path.join(
root_archive_folder, archive_config["raw_labels"]
)
if not os.path.exists(archive_raw_data_path) or not os.path.exists(

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
archive_raw_labels_path

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
):
raise ExecutionError("No raw data in archive")

@@ -137,8 +139,8 @@
return

# For development datasets, move raw data as well
os.makedirs(self.raw_data_path, exist_ok=True)

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
self.raw_data_path = str(Path(self.raw_data_path).resolve())

[CodeQL check failure (Code scanning): Uncontrolled data used in path expression (High). This path depends on a user-provided value.]
new_raw_data_path = os.path.join(
self.raw_data_path, os.path.basename(self.archive_raw_data_path)
)
@@ -147,7 +149,7 @@
)

same_raw_data_and_labels = os.path.samefile(
self.archive_raw_data_path, self.archive_raw_labels_path

[Two CodeQL check failures (Code scanning): Uncontrolled data used in path expression (High). Both paths depend on user-provided values.]
)
move_folder(self.archive_raw_data_path, new_raw_data_path)
if not same_raw_data_and_labels:
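The validate_input excerpt in this file requires the raw data output path to be either absent or an existing empty directory. That predicate, pulled out as a standalone helper for clarity (a sketch, not MedPerf's actual code):

```python
import os


def is_usable_output_dir(path: str) -> bool:
    """A path can receive output when it does not exist yet,
    or when it is an existing directory with no entries (never a file)."""
    if not os.path.exists(path):
        return True
    if os.path.isfile(path):
        return False
    # Existing directory: usable only when empty
    return len(os.listdir(path)) == 0
```

Checking `isfile` before `listdir` matters: calling `os.listdir` on a regular file raises `NotADirectoryError` instead of returning a useful answer.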
6 changes: 5 additions & 1 deletion cli/medperf/commands/dataset/prepare.py
@@ -89,7 +89,8 @@ def run(cls, dataset_id: int, approve_sending_reports: bool = False):
preparation.prompt_for_report_sending_approval()

if preparation.should_run_prepare():
preparation.run_prepare()
with preparation.ui.interactive():
preparation.run_prepare()

with preparation.ui.interactive():
preparation.run_sanity_check()
@@ -277,6 +278,8 @@ def __generate_report_dict(self):
with open(self.report_path, "r") as f:
report_dict = yaml.safe_load(f)

# TODO: this specific logic with status is very tuned to the RANO. Hope we'd
# make it more general once
report = pd.DataFrame(report_dict)
if "status" in report.keys():
report_status = report.status.value_counts() / len(report)
@@ -288,6 +291,7 @@

return report_status_dict

@staticmethod
def prompt_for_report_sending_approval(self):
example = {
"execution_status": "running",
2 changes: 2 additions & 0 deletions cli/medperf/commands/dataset/set_operational.py
@@ -50,6 +50,8 @@ def set_operational(self):
self.dataset.state = "OPERATION"

def update(self):
msg = "This is the information that is going to be transmitted to the medperf server"
config.ui.print_warning(msg)
body = self.todict()
dict_pretty_print(body)
msg = "Do you approve sending the presented data to MedPerf? [Y/n] "
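update() above warns the user, pretty-prints the payload, and then asks for [Y/n] approval before transmitting anything to the server. The approval_prompt helper itself is not shown in this diff; a minimal sketch, assuming empty input defaults to yes (the capital Y in the prompt):

```python
def approval_prompt(msg: str, default_yes: bool = True) -> bool:
    """Ask a [Y/n] question on stdin; empty input takes the default."""
    answer = input(msg).strip().lower()
    if answer == "":
        return default_yes
    # Only explicit yes variants count as approval
    return answer in ("y", "yes")
```

Gating every server-bound submission behind such a prompt is what makes the "Do you approve sending the presented data to MedPerf?" flow consistent across commands.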
35 changes: 20 additions & 15 deletions cli/medperf/commands/dataset/submit.py
@@ -43,11 +43,17 @@
submit_as_prepared,
for_test,
)
preparation.validate()
preparation.validate_prep_cube()
preparation.create_dataset_object()
if submit_as_prepared:
preparation.make_dataset_prepared()
submission_dict = preparation.prepare_dict(submit_as_prepared)
dict_pretty_print(submission_dict)

msg = "Do you approve the registration of the presented data to MedPerf? [Y/n] "
warning = (
"Upon submission, your email address will be visible to the Data Preparation"
+ " Owner for traceability and debugging purposes."
)
config.ui.print_warning(warning)
preparation.approved = preparation.approved or approval_prompt(msg)

updated_dataset_dict = preparation.upload()
preparation.to_permanent_path(updated_dataset_dict)
preparation.write(updated_dataset_dict)
Expand All @@ -69,8 +75,8 @@
for_test: bool,
):
self.ui = config.ui
self.data_path = str(Path(data_path).resolve())

Check failure (Code scanning / CodeQL): Uncontrolled data used in path expression (High). This path depends on a user-provided value.
self.labels_path = str(Path(labels_path).resolve())

Check failure (Code scanning / CodeQL): Uncontrolled data used in path expression (High). This path depends on a user-provided value.
self.metadata_path = metadata_path
self.name = name
self.description = description
Expand All @@ -82,9 +88,9 @@
self.for_test = for_test

def validate(self):
if not os.path.exists(self.data_path):

Check failure (Code scanning / CodeQL): Uncontrolled data used in path expression (High). This path depends on a user-provided value.
raise InvalidArgumentError("The provided data path doesn't exist")
if not os.path.exists(self.labels_path):

Check failure (Code scanning / CodeQL): Uncontrolled data used in path expression (High). This path depends on a user-provided value.
raise InvalidArgumentError("The provided labels path doesn't exist")

if not self.submit_as_prepared and self.metadata_path:
Expand Down Expand Up @@ -137,8 +143,8 @@
self.dataset = dataset

def make_dataset_prepared(self):
shutil.copytree(self.data_path, self.dataset.data_path)

Check failure (Code scanning / CodeQL): Uncontrolled data used in path expression (High). This path depends on a user-provided value.
shutil.copytree(self.labels_path, self.dataset.labels_path)

Check failure (Code scanning / CodeQL): Uncontrolled data used in path expression (High). This path depends on a user-provided value.
if self.metadata_path:
shutil.copytree(self.metadata_path, self.dataset.metadata_path)
else:
Expand All @@ -147,17 +153,16 @@
# have prepared datasets without the metadata information
os.makedirs(self.dataset.metadata_path, exist_ok=True)

def upload(self):
submission_dict = self.dataset.todict()
dict_pretty_print(submission_dict)
msg = "Do you approve the registration of the presented data to MedPerf? [Y/n] "
warning = (
"Upon submission, your email address will be visible to the Data Preparation"
+ " Owner for traceability and debugging purposes."
)
self.ui.print_warning(warning)
self.approved = self.approved or approval_prompt(msg)
def prepare_dict(self, submit_as_prepared: bool):
self.validate()
self.validate_prep_cube()
self.create_dataset_object()
if submit_as_prepared:
self.make_dataset_prepared()

return self.dataset.todict()

def upload(self):
if self.approved:
updated_body = self.dataset.upload()
return updated_body
Expand Down
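After this refactor, `run` drives the whole submission flow: `prepare_dict` performs validation and assembles the submission body, the approval prompt happens in `run`, and `upload` fires only once approval was granted. A condensed standalone sketch of that control flow (all names are stand-ins for the real medperf classes, validation and server I/O are elided, and the interactive prompt is simulated):

```python
class DataCreation:
    """Stand-in for the real DataCreation command class."""
    def __init__(self, approved=False):
        self.approved = approved
        self.uploaded = False

    def prepare_dict(self, submit_as_prepared):
        # In medperf this runs validate(), validate_prep_cube(),
        # create_dataset_object() and, when submit_as_prepared is set,
        # make_dataset_prepared() before returning the dataset as a dict.
        state = "OPERATION" if submit_as_prepared else "DEVELOPMENT"
        return {"name": "demo-dataset", "state": state}

    def upload(self):
        # The upload happens only after explicit user approval.
        if self.approved:
            self.uploaded = True
            return {"id": 1}

def approval_prompt(msg, simulated_answer="y"):
    # Stand-in for medperf's interactive [Y/n] prompt.
    return simulated_answer.lower() in ("y", "yes", "")

prep = DataCreation()
body = prep.prepare_dict(submit_as_prepared=True)
prep.approved = prep.approved or approval_prompt("Approve? [Y/n] ")
updated = prep.upload()
print(updated)  # {'id': 1}
```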
2 changes: 2 additions & 0 deletions cli/medperf/commands/list.py
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,8 @@ def run(
"""Lists all entities of the given class

Args:
entity_class: entity class to instantiate (Dataset, Model, etc.)
fields (list[str]): list of fields to display
unregistered (bool, optional): Display only local unregistered results. Defaults to False.
mine_only (bool, optional): Display all registered current-user results. Defaults to False.
kwargs (dict): Additional parameters for filtering entity lists.
Expand Down
3 changes: 2 additions & 1 deletion cli/medperf/commands/mlcube/associate.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from medperf import config
from medperf.entities.cube import Cube
from medperf.entities.benchmark import Benchmark
from medperf.exceptions import CleanExit
from medperf.utils import dict_pretty_print, approval_prompt
from medperf.commands.compatibility_test.run import CompatibilityTestExecution

Expand Down Expand Up @@ -42,4 +43,4 @@ def run(
metadata = {"test_result": results}
comms.associate_benchmark_model(cube_uid, benchmark_uid, metadata)
else:
ui.print("Model association operation cancelled")
raise CleanExit("Model association operation cancelled")
5 changes: 3 additions & 2 deletions cli/medperf/commands/mlcube/submit.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ def run(cls, submit_info: dict):
updated_cube_dict = submission.upload()
submission.to_permanent_path(updated_cube_dict)
submission.write(updated_cube_dict)
return submission.cube.id

def __init__(self, submit_info: dict):
self.comms = config.comms
Expand All @@ -49,5 +50,5 @@ def to_permanent_path(self, cube_dict):
os.rename(old_cube_loc, new_cube_loc)

def write(self, updated_cube_dict):
cube = Cube(**updated_cube_dict)
cube.write()
self.cube = Cube(**updated_cube_dict)
self.cube.write()
6 changes: 4 additions & 2 deletions cli/medperf/commands/profile.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ def activate(profile: str):
config_p = read_config()

if profile not in config_p:
raise InvalidArgumentError("The provided profile does not exists")
raise InvalidArgumentError("The provided profile does not exist")

config_p.activate(profile)
write_config(config_p)
Expand Down Expand Up @@ -81,6 +81,8 @@ def view(profile: str = typer.Argument(None)):
config_p = read_config()
profile_config = config_p.active_profile
if profile:
if profile not in config_p:
raise InvalidArgumentError("The provided profile does not exist")
profile_config = config_p[profile]

profile_config.pop(config.credentials_keyword, None)
Expand All @@ -99,7 +101,7 @@ def delete(profile: str):
"""
config_p = read_config()
if profile not in config_p.profiles:
raise InvalidArgumentError("The provided profile does not exists")
raise InvalidArgumentError("The provided profile does not exist")

if profile in [
config.default_profile_name,
Expand Down
7 changes: 4 additions & 3 deletions cli/medperf/commands/result/create.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ def run(
ignore_failed_experiments=False,
no_cache=False,
show_summary=False,
):
) -> list[Result]:
"""Benchmark execution flow.

Args:
Expand All @@ -48,7 +48,8 @@ def run(
ignore_model_errors,
ignore_failed_experiments,
)
execution.prepare()
with execution.ui.interactive():
execution.prepare()
execution.validate()
execution.prepare_models()
if not no_cache:
Expand Down Expand Up @@ -166,7 +167,7 @@ def __get_cube(self, uid: int, name: str) -> Cube:
self.ui.print(f"> Container '{name}' download complete")
return cube

def run_experiments(self):
def run_experiments(self) -> list[Result]:
for model_uid in self.models_uids:
if model_uid in self.cached_results:
self.experiments.append(
Expand Down
11 changes: 7 additions & 4 deletions cli/medperf/comms/auth/auth0.py
Original file line number Diff line number Diff line change
Expand Up @@ -39,11 +39,14 @@ def login(self, email):
interval = device_code_response["interval"]

config.ui.print(
"\nPlease go to the following link to complete your login request:\n"
f"\t{verification_uri_complete}\n\n"
"Make sure that you will be presented with the following code:\n"
f"\t{user_code}\n\n"
"\nPlease go to the following link to complete your login request:\n\t"
)
config.ui.print_url(verification_uri_complete)
config.ui.print(
"\n\nMake sure that you will be presented with the following code:\n\t"
)
config.ui.print_code(user_code)
config.ui.print("\n\n")
config.ui.print_warning(
"Keep this terminal open until you complete your login request. "
"The command will exit on its own once you complete the request. "
Expand Down
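The login message is split so that the verification URL and the user code go through dedicated `print_url`/`print_code` hooks, which lets a web UI render them as a link and a highlighted code instead of plain text. A toy illustration of why the split matters (the hook names follow the diff; the HTML rendering shown is an assumption about how a web UI might implement them):

```python
class CLIUI:
    """Stand-in CLI UI: every hook collects plain text."""
    def __init__(self):
        self.out = []

    def print(self, text):
        self.out.append(text)

    def print_url(self, url):
        self.print(url)

    def print_code(self, code):
        self.print(code)

class WebUI(CLIUI):
    """Hypothetical web renderer: URLs and codes now get markup of their own."""
    def print_url(self, url):
        self.print(f'<a href="{url}">{url}</a>')

    def print_code(self, code):
        self.print(f"<code>{code}</code>")

ui = WebUI()
ui.print("Please go to the following link to complete your login request:\n\t")
ui.print_url("https://auth.example.org/activate")
ui.print("\n\nMake sure that you will be presented with the following code:\n\t")
ui.print_code("ABCD-EFGH")
```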
14 changes: 8 additions & 6 deletions cli/medperf/comms/factory.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,11 @@
class CommsFactory:
@staticmethod
def create_comms(name: str, host: str) -> Comms:
name = name.lower()
if name == "rest":
return REST(host)
else:
msg = "the indicated communication interface doesn't exist"
raise InvalidArgumentError(msg)
if isinstance(name, str):
name = name.lower()
if name == "rest":
return REST(host)
else:
msg = "the indicated communication interface doesn't exist"
raise InvalidArgumentError(msg)
return REST(host)
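The updated factory matches a provided interface name case-insensitively and falls back to the REST backend when no name is given at all. A standalone sketch of that dispatch logic (with stand-in `REST` and `InvalidArgumentError` classes, since the real ones live in `medperf`):

```python
class InvalidArgumentError(ValueError):
    """Stand-in for medperf.exceptions.InvalidArgumentError."""

class REST:
    """Stand-in for medperf.comms.rest.REST."""
    def __init__(self, host):
        self.host = host

def create_comms(name, host):
    # A provided name is matched case-insensitively; anything unknown is
    # rejected. With no name at all, REST is the default backend.
    if isinstance(name, str):
        if name.lower() == "rest":
            return REST(host)
        raise InvalidArgumentError("the indicated communication interface doesn't exist")
    return REST(host)

comms = create_comms("ReST", "https://api.example.org")
print(comms.host)  # https://api.example.org
```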
1 change: 1 addition & 0 deletions cli/medperf/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -220,6 +220,7 @@
logs_backup_count = 100
cleanup = True
ui = "CLI"
webui = "WEBUI"

default_profile_name = "default"
testauth_profile_name = "testauth"
Expand Down
16 changes: 16 additions & 0 deletions cli/medperf/entities/association.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
from datetime import datetime
from typing import Optional

from medperf.entities.schemas import ApprovableSchema, MedperfSchema


class Association(MedperfSchema, ApprovableSchema):
id: int
metadata: dict
dataset: Optional[int]
model_mlcube: Optional[int]
benchmark: int
initiated_by: int
created_at: Optional[datetime]
modified_at: Optional[datetime]
name: str = "Association"  # The server data doesn't include a name, but MedperfSchema requires one
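In medperf this schema builds on pydantic (via `MedperfSchema`/`ApprovableSchema`), which validates and coerces the server payload. A rough stdlib-only approximation with a dataclass shows the shape of the data (fields are reordered so the defaulted ones come last, and the defaulted `name` papers over the fact that the server response has no such field):

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Association:
    id: int
    metadata: dict
    benchmark: int
    initiated_by: int
    dataset: Optional[int] = None
    model_mlcube: Optional[int] = None
    created_at: Optional[datetime] = None
    modified_at: Optional[datetime] = None
    # The server data carries no "name", but the base schema requires one
    name: str = "Association"

assoc = Association(id=7, metadata={"test_result": {}}, benchmark=1,
                    initiated_by=42, dataset=3)
print(assoc.name)  # Association
```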