-
Notifications
You must be signed in to change notification settings - Fork 36
[WIP] Web UI dataset #617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Web UI dataset #617
Conversation
MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅ |
full_path = os.path.join(BASE_DIR, path) | ||
os.path.join(BASE_DIR, path) | ||
|
||
if not os.path.exists(full_path) or not os.path.isdir(full_path): |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function lists all files and folders on the server side. So, yes, it is expected to use user's given path to list its content (though still may be dangerous? I don't know)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should handle all of these vulnerabilities as suggested, even if they're expected to be for local calls or not
full_path = os.path.join(BASE_DIR, path) | ||
os.path.join(BASE_DIR, path) | ||
|
||
if not os.path.exists(full_path) or not os.path.isdir(full_path): |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
|
||
# List directories inside the path | ||
folders = [] | ||
for item in os.listdir(full_path): |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
folders = [] | ||
for item in os.listdir(full_path): | ||
item_path = os.path.join(full_path, item) | ||
if os.path.isdir(item_path): |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
# Conflicts: # cli/medperf/web_ui/yaml_fetch/routes.py
# Check if user is already authenticated | ||
if token == security_token: | ||
# User is already authenticated, redirect to original URL | ||
return RedirectResponse(url=redirect_url, status_code=status.HTTP_302_FOUND) |
Check warning
Code scanning / CodeQL
URL redirection from remote source Medium
user-provided value
redirect_url: str = Form("/"), | ||
): | ||
if token == security_token: | ||
response = RedirectResponse(url=redirect_url, status_code=status.HTTP_302_FOUND) |
Check warning
Code scanning / CodeQL
URL redirection from remote source Medium
user-provided value
): | ||
if token == security_token: | ||
response = RedirectResponse(url=redirect_url, status_code=status.HTTP_302_FOUND) | ||
response.set_cookie(key=AUTH_COOKIE_NAME, value=token) |
Check warning
Code scanning / CodeQL
Failure to use secure cookies Medium
): | ||
if token == security_token: | ||
response = RedirectResponse(url=redirect_url, status_code=status.HTTP_302_FOUND) | ||
response.set_cookie(key=AUTH_COOKIE_NAME, value=token) |
Check warning
Code scanning / CodeQL
Construction of a cookie using user-supplied input Medium
user-supplied input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a ton of work! There are a few suggestions on maintaining style, and covering vulnerabilities with the suggestions generated by github would be ideal. Also, this work should prompt some discussion on the current design and future design, since this implementation could not be added without repeating code. Leaving as is could lead to several bugs if changes need to be done in the current workflow (as they would need to be implemented in both CLI and Web UI). This is some amazing work for a task that was very intimidating and difficult to define
+ " Owner for traceability and debugging purposes." | ||
) | ||
config.ui.print_warning(warning) | ||
preparation.approved = preparation.approved or approval_prompt(msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to instead convert this logic into a function, and keep the original flow with the added new function. Just so that we keep a consistent style where you can get an idea of the workflow of a command by looking at the run
function.
@@ -1,4 +1,5 @@ | |||
import os | |||
import time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This dependency doesn't appear to be needed
full_path = os.path.join(BASE_DIR, path) | ||
os.path.join(BASE_DIR, path) | ||
|
||
if not os.path.exists(full_path) or not os.path.isdir(full_path): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should handle all of these vulnerabilities as suggested, even if they're expected to be for local calls or not
""" | ||
dataset = Dataset.get(dataset_id) | ||
benchmark = Benchmark.get(benchmark_id) | ||
draft_id = str(uuid.uuid4()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't know of this. Nice!
depends on #606
Adds functionality for creating & managing datasets
Dataset submission
p1
list benchmarks as cards with detailed info (as on "benchmarks list" page), + highlight chosen onep3
allow to choose dataprep mlcube insteadp2
check length beforehand? DS submission would fail if description is > 20 characters ?p2
check length beforehand?p3
"Submit as prepared" flagp1
check how errors are displayed & handledp2
metadata path (what is it?) (is required if dataset is already prepared)p2
redesign path picking panelp2
"go back" button / navigationp1
check how errors are displayed & handledSubmitted dataset displaying (dataset details page)
p2
display paths (data path, labels path)p1
display dataset statisticsDataset preparation
p1
clean log messages from magic bytesp1
distinguish messages to headers + usual lines in the medperf codep1
display header messages properlyp1
display log messages without jsonp2
log lines highlighting?p1
spinner at text headers to underline process is runningp1
check how errors are displayed & handledp1
button "back to the dataset" renamep1
check how errors are displayed & handled (display exceptions in the log)Prepared dataset displaying (dataset details page)
p2
if report exists, "Allowed automatic report submission" flagp2
if report exists, link / path to the reportp1
if dataset is prepared, unlock next button "set operational" (locked if not prepared)Set operational
p0
"Set operational" button on dataset detail pagep0
Set operationalp0
Disable button if already operationalp1
check how errors are displayed & handledAssociate
p1
"Associate with the benchmark" button on the dataset detail page (no choice of benchmarks)p3
choice of benchmarks if dataset was created with dataprep mlcubep1
associatep1
check how errors are displayed & handledRun benchmark
p1
"Run benchmark" button on the dataset detail pagep1
running benchmark page with logsp1
check how errors are displayed & handledp2
"Run all" buttonp2
add "Running" status in addition to "Pending"p1
make all status (Run/Failed/Pending/Done) buttons as separate objectsp1
modify default (right-after-load) button to "Waiting for status.."p2
adjust logs panel height if there is no logsSubmit result
p1
"Submit result" button on the dataset detail pagep1
Submit resultp1
check how errors are displayed & handledp2
"Submit all" buttongeneral dataset UI
p2
design buttons prepare-operational-... navigation to one line?p1
hide buttons panel if you're not the dataset owner?p3
redesign state displaying (we'd have dev/op floating blocks in the header + set-op button in the footer)Technical refactoring
p1
split routes to different files:dataset/submission.py
,dataset/preparation.py