Skip to content

Conversation

@momchil-flex
Copy link
Collaborator

@momchil-flex momchil-flex commented Nov 8, 2025

Greptile Overview

Greptile Summary

This PR refactors the RF webapi to use a new simplified workflow for Component Modeler batch tasks. The changes migrate from legacy tidy3d/tasks/{id}/batch-* endpoints to new rf/task/{id}/* endpoints, removing ~400 lines of complex async split/validation logic.

Key Changes

  • Simplified batch workflow: Removed _upload_component_modeler_subtasks() function that handled async job splitting and multi-phase validation (~170 lines)
  • Unified batch handling: BatchTask now uses consistent rf/task/* endpoints for check, submit, detail, and statistics
  • Removed postprocessing logic: Eliminated separate postprocess phase and auto-trigger logic from Job and Batch classes
  • Endpoint migration:
    • Task creation: POST rf/task (new) vs POST tidy3d/projects/{folder}/tasks (old)
    • Batch detail: GET rf/task/{id}/statistics (new) vs GET tidy3d/tasks/{id}/batch-detail (old)
    • Batch submit: POST rf/task/{id}/submit (new) vs POST tidy3d/projects/{id}/batch-submit (old)
  • Validation deferral: Component modeler simulations now use validate=False in updated_copy() calls to defer validation until final assembly
  • Removed methods: BatchTask.wait_for_validate(), BatchTask.wait_for_run(), BatchTask.postprocess(), Job.postprocess_start(), Batch.postprocess_start()

Potential Issues

  • Critical: BatchTask.submit() accepts pay_type and priority parameters but doesn't send them to the API (line 846-856 in task_core.py). This could cause silent failures where users believe they've set priority/payment method but haven't.

Confidence Score: 3/5

  • This PR has one critical logic issue with silently ignored parameters that must be resolved before merging
  • Score of 3 reflects: (1) Critical logic bug where pay_type/priority parameters are accepted but not sent to API for batch tasks, potentially causing silent failures. (2) Large refactor (~400 lines removed) migrating to new endpoints with limited visibility into backend changes. (3) Validation deferral in component modelers changes error detection timing. However, the refactor itself appears well-structured with consistent patterns and removes complex async logic.
  • Pay close attention to tidy3d/web/core/task_core.py (BatchTask.submit method) - fix the ignored parameters issue

Important Files Changed

File Analysis

Filename Score Overview
tidy3d/web/api/webapi.py 4/5 Refactored RF webapi to use new RF endpoints, removed complex async split/validation logic (~200 lines), simplified batch handling
tidy3d/web/core/task_core.py 3/5 Migrated batch operations to rf/* endpoints, removed wait_for_validate/wait_for_run methods, pay_type/priority params not implemented in submit
tidy3d/web/api/container.py 5/5 Removed Component Modeler-specific postprocess logic and status handling (~100 lines cleanup)
tidy3d/plugins/smatrix/component_modelers/terminal.py 4/5 Added validate=False to simulation.updated_copy() calls to defer validation, moved grid size checks earlier in workflow

Sequence Diagram

sequenceDiagram
    participant User
    participant webapi
    participant SimulationTask
    participant BatchTask
    participant API as RF API (rf/task/*)

    User->>webapi: upload(simulation)
    webapi->>webapi: Check if RF task type
    
    alt RF Task (Component Modeler)
        webapi->>SimulationTask: create(task_type="RF")
        SimulationTask->>API: POST rf/task (create group)
        API-->>SimulationTask: batchId/groupId
        SimulationTask->>API: Upload simulation data to S3
        webapi->>webapi: estimate_cost(resource_id)
    else Regular Task
        webapi->>SimulationTask: create(task_type)
        SimulationTask->>API: POST tidy3d/projects/{folder}/tasks
        API-->>SimulationTask: taskId
        SimulationTask->>API: Upload simulation data to S3
    end

    User->>webapi: start(task_id)
    webapi->>webapi: Check if modeler batch
    
    alt Batch Task
        webapi->>BatchTask: new(batch_id)
        BatchTask->>API: POST rf/task/{id}/submit
        Note right of API: pay_type/priority<br/>NOT sent (known issue)
    else Regular Task
        webapi->>SimulationTask: get(task_id)
        SimulationTask->>API: POST tidy3d/tasks/{id}/submit
    end

    User->>webapi: monitor(task_id)
    
    loop Poll Status
        alt Batch Task
            webapi->>BatchTask: detail()
            BatchTask->>API: GET rf/task/{id}/statistics
            API-->>BatchTask: BatchDetail
        else Regular Task
            webapi->>SimulationTask: get_info()
            SimulationTask->>API: GET tidy3d/tasks/{id}/detail
            API-->>SimulationTask: TaskInfo
        end
    end

    User->>webapi: download(task_id)
    
    alt Batch Task
        webapi->>BatchTask: get_data_hdf5()
        BatchTask->>API: Download from S3
    else Regular Task
        webapi->>SimulationTask: get_sim_data_hdf5()
        SimulationTask->>API: Download from S3
    end
Loading

@momchil-flex momchil-flex changed the title Momchil/rf refactor SCRF-1630: Refactor RF webapi for new workflow Nov 8, 2025
Copy link
Contributor

@dmarek-flex dmarek-flex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All your changes to TCM seem good to me

@momchil-flex momchil-flex marked this pull request as ready for review November 11, 2025 15:37
Copilot AI review requested due to automatic review settings November 11, 2025 15:37
Copilot finished reviewing on behalf of momchil-flex November 11, 2025 15:41
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

14 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines +846 to 856
# TODO: add support for pay_type and priority arguments, currently ignored if provided
if protocol_version is None:
protocol_version = _get_protocol_version()
return http.post(
f"tidy3d/projects/{self.batch_id}/postprocess",
f"rf/task/{self.batch_id}/submit",
{
"batchType": batch_type,
"solverVersion": solver_version,
"protocolVersion": protocol_version,
"workerGroup": worker_group,
},
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: pay_type and priority parameters are accepted but not actually sent to the API (see line 849-856). However, webapi.start() at line 722-727 passes these parameters expecting them to work for batch tasks, which could lead to silent failures where users think they've set priority/payment but haven't.

Either implement support for these parameters in the API call or raise an error if they're provided for batch tasks.

Prompt To Fix With AI
This is a comment left during a code review.
Path: tidy3d/web/core/task_core.py
Line: 846:856

Comment:
**logic:** `pay_type` and `priority` parameters are accepted but not actually sent to the API (see line 849-856). However, `webapi.start()` at line 722-727 passes these parameters expecting them to work for batch tasks, which could lead to silent failures where users think they've set priority/payment but haven't.

Either implement support for these parameters in the API call or raise an error if they're provided for batch tasks.

How can I resolve this? If you propose a fix, please make it concise.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors the RF (component modeler) web API to align with a new backend workflow, simplifying the task lifecycle management and removing manual postprocessing steps.

  • Removes multi-phase task workflows (validate, run, postprocess) in favor of a unified flow
  • Updates RF task creation to use new rf/task endpoints instead of legacy tidy3d/projects endpoints
  • Eliminates manual postprocessing orchestration, now handled automatically by the backend

Reviewed Changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
tidy3d/web/core/task_info.py Removes TaskStatus and BatchStatus enums; adds taskType field to BatchDetail and removes deprecated totalStatus field
tidy3d/web/core/task_core.py Refactors task creation to use RF-specific endpoints; removes wait_for_validate, wait_for_run, and postprocess methods; adds abort method for batch tasks
tidy3d/web/core/http_util.py Temporarily hardcodes version string (should be reverted before production)
tidy3d/web/api/webapi.py Simplifies batch monitoring by removing manual postprocessing logic; updates status checking and error handling for new workflow; removes postprocess_start function
tidy3d/web/api/tidy3d_stub.py Generalizes validation to use else instead of explicit EMESimulation check
tidy3d/web/api/states.py Updates error state names to match new backend conventions (e.g., validate_error instead of validate_fail)
tidy3d/web/api/container.py Removes postprocess_status property and postprocess_start methods from Job and Batch classes
tidy3d/web/init.py Removes postprocess_start from public API exports
tidy3d/plugins/smatrix/component_modelers/terminal.py Adds validate=False to intermediate simulation copies for performance; moves grid checks to base simulation; adds validate_pre_upload method
tidy3d/plugins/smatrix/component_modelers/base.py Adds validate_pre_upload method to base class
tidy3d/config/profiles.py Updates dev profile endpoints to new infrastructure
tidy3d/components/mode/simulation.py Removes source_required parameter from validate_pre_upload signature
tidy3d/components/mode/mode_solver.py Removes source_required parameter from validate_pre_upload signature
docs/api/submit_simulations.rst Removes TaskStatus from documentation
Comments suppressed due to low confidence (1)

tidy3d/web/core/task_core.py:268

  • The payload structure for RF tasks differs from non-RF tasks. The RF task uses groupName and folderId directly while non-RF tasks use taskName. However, line 266 suggests that RF tasks should return a groupId or batchId as the taskId. Ensure that the response parsing logic on lines 264-268 correctly handles both response formats and that downstream code expecting a taskId can work with a groupId/batchId.
        if task_type == "RF":
            payload = {
                "groupName": task_name,
                "folderId": folder.folder_id,
                "fileType": file_type,
            }
            resp = http.post("rf/task", payload)
        else:
            payload = {
                "taskName": task_name,
                "taskType": task_type,
                "callbackUrl": callback_url,
                "simulationType": simulation_type,
                "parentTasks": parent_tasks,
                "fileType": file_type,
            }
            resp = http.post(f"{projects_endpoint}/{folder.folder_id}/tasks", payload)
        # RF group creation may return group-level info without 'taskId'.
        # Use 'groupId' (or 'batchId' as fallback) as the resource id for subsequent uploads.
        if "taskId" not in resp and task_type == "RF":
            # Prefer using 'batchId' as the resource id for uploads (S3 STS expects a task-like id).
            if "batchId" in resp:

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

detail = batch.detail()
status = detail.status.lower()
if status in ERROR_STATES:
_batch_detail_error(task_id)
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _batch_detail_error call on line 758 doesn't return its result. This means the error will be logged internally but not raised. The pattern should be raise _batch_detail_error(task_id) or the function should handle the error raising internally and this call should just trigger it.

Suggested change
_batch_detail_error(task_id)
raise _batch_detail_error(task_id)

Copilot uses AI. Check for mistakes.
return False

def detail(self, batch_type: str) -> BatchDetail:
def detail(self) -> BatchDetail:
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing return type annotation. The function should specify that it returns BatchDetail or None.

Copilot uses AI. Check for mistakes.
Comment on lines +126 to 127
status in ERROR_STATES
and hasattr(batch_detail, "validateErrors")
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling logic appears inconsistent. On line 117, batch_detail.status.lower() is called, but if status is a string already, this should work. However, the logic on line 126 checks if status in ERROR_STATES, which should be checking the actual status condition for raising errors. The current condition seems redundant with the check on line 122.

Suggested change
status in ERROR_STATES
and hasattr(batch_detail, "validateErrors")
hasattr(batch_detail, "validateErrors")

Copilot uses AI. Check for mistakes.
Comment on lines +1026 to +1027
# TODO: seems like the default should then be maybe set to None and defined somewhere else
# per task type?
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The TODO comment suggests the default path handling logic should be refactored to be task-type specific rather than hardcoded here. Consider extracting this logic into a task-type configuration or factory pattern for better maintainability.

Copilot uses AI. Check for mistakes.
Comment on lines +184 to 186
else:
self.simulation.validate_pre_upload()

Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The change from elif to else removes the explicit type check for EMESimulation. If other simulation types are added in the future, they will automatically call validate_pre_upload() without the source_required parameter. Consider whether this should remain an elif isinstance(self.simulation, EMESimulation): check for explicitness and to catch unexpected simulation types.

Suggested change
else:
self.simulation.validate_pre_upload()
elif isinstance(self.simulation, EMESimulation):
self.simulation.validate_pre_upload()
else:
raise TypeError(f"Unexpected simulation type: {type(self.simulation).__name__}")

Copilot uses AI. Check for mistakes.
return file

def abort(self) -> requests.Response:
"""Aborting current task from server."""
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error message has inconsistent grammar: "Aborting current task from server" should be "Aborting current task on server" or "Aborting task from server".

Suggested change
"""Aborting current task from server."""
"""Abort the current task on the server."""

Copilot uses AI. Check for mistakes.
return ("validating", 2)
if s in ("validate_success", "validate_warn"):
return ("validate", 3)
if s in ("running", "preprocess_success"):
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mapping for "preprocess_success" on line 1249 returns stage ("running", 4), but this seems inconsistent. If preprocessing succeeded, the task should be at the "preprocess" stage (stage 1) or moving to validation (stage 2), not "running". This mapping should be reviewed for correctness.

Suggested change
if s in ("running", "preprocess_success"):
if s == "preprocess_success":
return ("validating", 2)
if s == "running":

Copilot uses AI. Check for mistakes.
"workerGroup": worker_group,
},
)

Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wait_for_validate and wait_for_run methods have been removed from the BatchTask class. If any code in the codebase depends on these methods, this will cause AttributeError exceptions. Verify that all usages of these methods have been refactored or removed.

Copilot uses AI. Check for mistakes.
Comment on lines +752 to +759
def _get_batch_detail_handle_error_status(task_id: TaskId) -> str:
"""Get batch detail and raise error if status is in ERROR_STATES."""
batch = BatchTask(task_id)
detail = batch.detail()
status = detail.status.lower()
if status in ERROR_STATES:
_batch_detail_error(task_id)
return detail
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function _get_batch_detail_handle_error_status has a return type annotation of str, but it returns a BatchDetail object on line 759. This will cause type checking issues and the function is used on line 771 as if it returns a BatchDetail (accessing .status). The return type should be BatchDetail.

Copilot uses AI. Check for mistakes.
Comment on lines +44 to +45
# return core_config.get_version()
return "2.10.0rc2.1"
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hardcoded version string "2.10.0rc2.1" appears to be a temporary change for testing. This should be reverted to use core_config.get_version() before merging to production.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants