-
Notifications
You must be signed in to change notification settings - Fork 81
feat(clp-package)!: Add query engine option to support starting only compression and UI components when using the Presto query engine. #1095
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds query engine awareness and runtime component filtering. The config introduces QueryEngine options and component groupings, exposes a get_runnable_components API, and maps targets to components. The start script now intersects target components with runnable ones, exiting if none. UI exports query engine enum. Packaging sets query_engine, and template config enables it. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant start_clp.py
participant CLPConfig
participant Components
User->>start_clp.py: invoke with target
start_clp.py->>CLPConfig: load config
CLPConfig-->>start_clp.py: get_runnable_components()
start_clp.py->>CLPConfig: get_components_for_target(target)
CLPConfig-->>start_clp.py: component set
start_clp.py->>start_clp.py: intersect sets
alt empty intersection
start_clp.py-->>User: exit -1 (target not startable)
else components present
start_clp.py->>Components: start only components in intersection
Components-->>start_clp.py: started
start_clp.py-->>User: done
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
(4 hunks)
🧰 Additional context used
🧠 Learnings (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: haiqi96
PR: y-scope/clp#569
File: components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:380-392
Timestamp: 2024-11-15T20:07:22.256Z
Learning: The current implementation assumes single-threaded execution, so race conditions in functions like `is_target_extracted` in `components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py` are not currently a concern.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: build (ubuntu-latest)
- GitHub Check: build (macos-latest)
- GitHub Check: lint-check (macos-latest)
- GitHub Check: lint-check (ubuntu-latest)
🔇 Additional comments (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)
28-28
: LGTM: Correct import for new functionality.The import of
QueryEngine
is necessary and properly placed with other related imports.
884-884
: LGTM: Proper integration with web UI settings.The query engine configuration is correctly passed to the client settings, allowing the web UI to adjust its behaviour based on the configured query engine.
if QueryEngine.PRESTO != clp_config.package.query_engine: | ||
start_query_scheduler(instance_id, clp_config, container_clp_config, mounts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick (assertive)
Consider using positive conditionals for better readability.
The negative conditional QueryEngine.PRESTO != clp_config.package.query_engine
works but could be more explicit. Consider using QueryEngine.NATIVE == clp_config.package.query_engine
for clarity, assuming native is the alternative.
- if QueryEngine.PRESTO != clp_config.package.query_engine:
+ if QueryEngine.NATIVE == clp_config.package.query_engine:
start_query_scheduler(instance_id, clp_config, container_clp_config, mounts)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if QueryEngine.PRESTO != clp_config.package.query_engine: | |
start_query_scheduler(instance_id, clp_config, container_clp_config, mounts) | |
if QueryEngine.NATIVE == clp_config.package.query_engine: | |
start_query_scheduler(instance_id, clp_config, container_clp_config, mounts) |
🤖 Prompt for AI Agents
In components/clp-package-utils/clp_package_utils/scripts/start_clp.py around
lines 1214 to 1215, replace the negative conditional checking if the query
engine is not PRESTO with a positive conditional explicitly checking if it
equals NATIVE. This improves readability by clearly stating the intended
condition. Change the if statement to use `QueryEngine.NATIVE ==
clp_config.package.query_engine` instead of `QueryEngine.PRESTO !=
clp_config.package.query_engine`.
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Outdated
Show resolved
Hide resolved
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Outdated
Show resolved
Hide resolved
@haiqi96 @gibber9809 , since this isnt really a UI pr, i asked kirk for who should review, and he mentioned u guys |
QUERY_WORKER_COMPONENT_NAME, | ||
REDUCER_COMPONENT_NAME, | ||
): | ||
logger.error(f"{target} not available when using Presto as query engine") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
logger.error(f"{target} not available when using Presto as query engine") | |
logger.error(f"{target} not available when using {clp_config.package.query_engine} as query engine") |
Maybe can also create a variable query_engine for reuse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about just QueryEngine.PRESTO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, works for me.
also tagging @junhaoliao since this may be used in cloud |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it would make sense to refactor start_clp.py
to the list of components that need to be started based on the config and comparing against that list everywhere instead of having checks littered throughout that are related to specific configuration parameters.
e.g. somewhat like the following
runnable_components = clp_config.get_runnable_components()
if target not in [ALL_TARGET_NAME, *runnable_components]:
logger.error(...)
...
if target in (ALL_TARGET_NAME, CONTROLLER_TARGET_NAME, QUERY_SCHEDULER_COMPONENT_NAME):
if QUERY_SCHEDULER_COMPONENT_NAME in runnable_components:
start_query_scheduler(...)
...
Then all we have to do is keep get_runnable_components
up to date as we add configuration parameters that effect which components need to be run.
I'm okay with devin's suggestion. It will also add an extra line above each component for for the new condition. |
Sure, this sounds like a more general approach and can be extended in the future |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
(19 hunks)components/clp-py-utils/clp_py_utils/clp_config.py
(17 hunks)
🧰 Additional context used
🧠 Learnings (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (15)
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/del_archives.py:56-65
Timestamp: 2024-11-18T16:49:20.248Z
Learning: When reviewing wrapper scripts in `components/clp-package-utils/clp_package_utils/scripts/`, note that it's preferred to keep error handling simple without adding extra complexity.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-packages-from-source.sh:6-8
Timestamp: 2025-07-01T14:51:19.172Z
Learning: In CLP installation scripts within `components/core/tools/scripts/lib_install/`, maintain consistency with existing variable declaration patterns across platforms rather than adding individual improvements like `readonly` declarations.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like `apt-get update && apt-get install` or `apk update && apk add`), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like `apt-get update && apt-get install` or `apk update && apk add`), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across platforms rather than applying platform-specific optimizations. When a platform follows a pattern of separate update and install commands (like `apt-get update && apt-get install` or `apk update && apk add`), preserve this pattern for uniform script structure.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:6-15
Timestamp: 2025-07-01T14:52:15.217Z
Learning: For installation scripts in the CLP project, maintain consistency in command patterns across different platforms (e.g., using separate update and install commands like `apk update && apk add`, `apt update && apt install`, `yum update && yum install`) rather than platform-specific optimizations, to ensure uniform script structure and readability.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/scripts/lib_install/musllinux_1_2/install-prebuilt-packages.sh:27-32
Timestamp: 2025-07-07T17:41:15.655Z
Learning: In CLP installation scripts, consistency across platform scripts is prioritized over defensive programming improvements. For example, when extracting Task binaries with tar in `install-prebuilt-packages.sh`, the extraction pattern should remain consistent with other platform scripts rather than adding defensive flags like `--strip-components=1` to handle potential tarball layout changes.
Learnt from: kirkrodrigues
PR: y-scope/clp#881
File: components/core/tools/scripts/lib_install/ubuntu-jammy/install-prebuilt-packages.sh:35-41
Timestamp: 2025-05-06T09:48:55.408Z
Learning: For installation scripts in the CLP project, prefer explicit error handling over automatic dependency resolution (like `apt-get install -f`) when installing packages to give users more control over their system.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#1036
File: components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py:204-211
Timestamp: 2025-07-03T12:58:18.407Z
Learning: In the CLP codebase, the validate_and_cache_dataset function in components/clp-py-utils/clp_py_utils/clp_metadata_db_utils.py uses in-place updates of the existing_datasets set parameter rather than returning a new set, as preferred by the development team.
Learnt from: haiqi96
PR: y-scope/clp#569
File: components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py:380-392
Timestamp: 2024-11-15T20:07:22.256Z
Learning: The current implementation assumes single-threaded execution, so race conditions in functions like `is_target_extracted` in `components/job-orchestration/job_orchestration/scheduler/query/query_scheduler.py` are not currently a concern.
Learnt from: jackluo923
PR: y-scope/clp#1054
File: components/core/tools/docker-images/clp-env-base-musllinux_1_2-aarch64/build.sh:3-5
Timestamp: 2025-07-07T17:43:04.349Z
Learning: In CLP project build scripts (specifically build.sh files in docker-images directories), maintain consistency with the established pattern of using separate `set -eu` and `set -o pipefail` commands rather than combining them into `set -euo pipefail`, to ensure uniform script structure across all platform build scripts.
Learnt from: gibber9809
PR: y-scope/clp#955
File: components/core/src/clp_s/search/sql/CMakeLists.txt:8-26
Timestamp: 2025-06-02T18:22:24.060Z
Learning: In the clp project, ANTLR code generation at build time is being removed by another PR. When reviewing CMake files, be aware that some temporary suboptimal configurations may exist to reduce merge conflicts between concurrent PRs, especially around ANTLR_TARGET calls.
Learnt from: haiqi96
PR: y-scope/clp#523
File: components/core/src/clp/clp/FileCompressor.hpp:58-78
Timestamp: 2024-10-24T14:25:17.978Z
Learning: When reviewing legacy code refactors, avoid suggesting changes that would extend the scope of the PR.
Learnt from: Bill-hbrhbr
PR: y-scope/clp#831
File: components/job-orchestration/job_orchestration/scheduler/compress/compression_scheduler.py:0-0
Timestamp: 2025-04-17T16:55:23.796Z
Learning: In the CLP project, SQL queries should use parameterized queries with placeholders (%s) and pass values as a tuple to `db_cursor.execute()` to prevent SQL injection, rather than directly interpolating values into the query string.
components/clp-py-utils/clp_py_utils/clp_config.py (2)
Learnt from: haiqi96
PR: y-scope/clp#743
File: components/clp-py-utils/clp_py_utils/clp_config.py:339-365
Timestamp: 2025-04-21T16:21:07.371Z
Learning: When validating enum values in Pydantic models with a pre=True root_validator, convert string values to enum instances using EnumClass(string_value) within a try/except block before comparison, as direct comparison between strings and enum instances will always return False.
Learnt from: haiqi96
PR: y-scope/clp#594
File: components/clp-package-utils/clp_package_utils/scripts/native/del_archives.py:104-110
Timestamp: 2024-11-15T16:21:52.122Z
Learning: In `clp_package_utils/scripts/native/del_archives.py`, when deleting archives, the `archive` variable retrieved from the database is controlled and is always a single string without path components. Therefore, it's acceptable to skip additional validation checks for directory traversal in this context.
🧬 Code Graph Analysis (1)
components/clp-py-utils/clp_py_utils/clp_config.py (1)
components/clp-py-utils/clp_py_utils/core.py (2)
make_config_path_absolute
(24-35)get_config_value
(7-21)
🪛 GitHub Actions: clp-lint
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
[error] 136-159: Black formatting check failed. File would be reformatted. Run 'black --write' to fix code style issues.
🔇 Additional comments (11)
components/clp-py-utils/clp_py_utils/clp_config.py (6)
3-3
: LGTM!The addition of
List
to the typing imports is necessary for the new runnable components functionality.
46-68
: LGTM!The runnable components lists are well-defined and logically sound. Excluding query-related components for Presto is correct since external query processing is handled by Presto itself.
75-78
: LGTM!The QueryEngine enum follows the established pattern using LowercaseStrEnum and provides the necessary values for native and Presto query engines.
98-98
: LGTM!The new query_engine field with default value "native" maintains backward compatibility while enabling the new functionality.
109-117
: LGTM!The query_engine validator follows the established pattern and is consistent with other validators in the Package class.
156-158
: LGTM!The formatting improvements enhance readability and follow Python conventions without affecting functionality.
Also applies to: 179-181, 223-225, 409-415, 488-490, 511-513, 667-671, 676-678, 687-689, 691-693, 694-697, 750-752, 780-780, 795-800, 823-825, 827-829, 831-833
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (5)
28-28
: LGTM!The QueryEngine import is necessary for the new conditional component execution logic.
1172-1185
: LGTM!The runnable components logic is well-implemented and follows the approach suggested in previous reviews. The early exit for unavailable components prevents unnecessary processing.
1262-1320
: LGTM!The conditional execution logic is consistent and follows the established pattern. This approach is cleaner than negative conditionals and addresses the concerns raised in previous reviews.
943-943
: LGTM!The query engine setting is correctly propagated to the web UI client settings, enabling the UI to adapt based on the configured query engine.
141-143
: LGTM!The formatting improvements enhance readability and follow Python conventions without affecting functionality.
Also applies to: 151-153, 182-192, 379-381, 424-426, 430-433, 442-444, 447-453, 467-472, 508-510, 528-532, 534-538, 551-556, 667-669, 823-825, 880-882, 915-923, 925-927, 930-932, 1028-1030, 1146-1148, 1172-1174, 1276-1284, 1290-1293, 1294-1302, 1304-1307, 1309-1312, 1314-1317, 1319-1320
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
Outdated
Show resolved
Hide resolved
@gibber9809 @haiqi96 - I changed per ur comment |
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
…lp.py Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
…lp.py Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
Co-authored-by: kirkrodrigues <2454684+kirkrodrigues@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (5)
components/clp-py-utils/clp_py_utils/clp_config.py (3)
158-178
: Fix enum comparisons in validator.The validator incorrectly compares string values with enum members, which will always evaluate to False. Use
.value
for proper comparison.@root_validator def validate_query_engine_package_compatibility(cls, values): query_engine = values.get("query_engine") storage_engine = values.get("storage_engine") - if query_engine in [QueryEngine.CLP, QueryEngine.CLP_S]: + if query_engine in [QueryEngine.CLP.value, QueryEngine.CLP_S.value]: if query_engine != storage_engine: raise ValueError( f"query_engine '{query_engine}' is only compatible with " f"storage_engine '{query_engine}'." ) - elif query_engine == QueryEngine.PRESTO: - if storage_engine != StorageEngine.CLP_S: + elif query_engine == QueryEngine.PRESTO.value: + if storage_engine != StorageEngine.CLP_S.value: raise ValueError( - f"query_engine '{QueryEngine.PRESTO}' is only compatible with " - f"storage_engine '{StorageEngine.CLP_S}'." + f"query_engine '{QueryEngine.PRESTO.value}' is only compatible with " + f"storage_engine '{StorageEngine.CLP_S.value}'." ) else: raise ValueError(f"Unsupported query_engine '{query_engine}'.") return values
860-864
: Fix enum comparison and use predefined component groups.The method has an enum comparison issue and should use the component groups as suggested in past reviews.
def get_runnable_components(self) -> Set[str]: - if QueryEngine.PRESTO == self.package.query_engine: + if QueryEngine.PRESTO.value == self.package.query_engine: return COMPRESSION_COMPONENTS | UI_COMPONENTS else: return ALL_COMPONENTS
902-908
: Apply the suggested refactor for clarity.As suggested in past reviews, this can be simplified.
def get_components_for_target(target: str) -> Set[str]: if target in TARGET_TO_COMPONENTS: return TARGET_TO_COMPONENTS[target] elif target in ALL_COMPONENTS: return {target} else: return set()components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)
111-119
: Use explicit length check for clarity.As suggested in past reviews, use explicit length comparison for better readability.
- if len(components_to_start) == 0: + if len(components_to_start) == 0:Note: The suggestion is already correctly implemented in the code.
197-200
: Verify DB availability before creating tables.The special handling for CONTROLLER_TARGET_NAME could attempt to create tables when the DB component isn't running. Consider checking if the DB container is actually running before attempting table creation.
#!/bin/bash # Check if there's any validation that DB is running before table creation rg -A 5 "create_db_tables" --type py
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
(4 hunks)components/clp-py-utils/clp_py_utils/clp_config.py
(7 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
get_components_for_target
(902-908)get_runnable_components
(860-864)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: ubuntu-jammy-static-linked-bins
🔇 Additional comments (8)
components/clp-py-utils/clp_py_utils/clp_config.py (4)
3-3
: LGTM!The Set import is correctly added for the new methods that return sets of components.
30-50
: Component groupings are well-structured.Good use of set operations to build component hierarchies. The reuse of GENERAL_SCHEDULING_COMPONENTS in both COMPRESSION_COMPONENTS and QUERY_COMPONENTS follows DRY principles.
56-63
: Target mapping is correctly defined.The mapping appropriately associates targets with their component sets, with the controller target including the necessary scheduling infrastructure.
113-116
: LGTM!QueryEngine enum properly defined following the same pattern as StorageEngine.
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (4)
25-25
: LGTM!Import correctly added for the new component filtering functionality.
884-884
: LGTM!Query engine setting correctly added to web UI client configuration.
194-195
: LGTM!Component start correctly checks membership in the filtered set.
202-234
: Component filtering consistently applied.All component starts correctly check membership in the filtered set. Note that results cache indices creation has the same special controller handling as DB tables.
will test building package again before merging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🔭 Outside diff range comments (1)
components/webui/client/src/config/index.ts (1)
1-1
: Remove redundant import of CLP_QUERY_ENGINESThe
CLP_QUERY_ENGINES
import fromcommon/index.js
is unnecessary since you're defining it locally in this file. This creates a naming conflict.Apply this diff to remove the redundant import:
-import {CLP_QUERY_ENGINES} from "../../../common/index.js"; import {settings} from "../settings";
♻️ Duplicate comments (4)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
831-831
: Fix the enum comparisonThe comparison between
QueryEngine.PRESTO
(enum) andself.package.query_engine
(string) will always be False.Apply this diff to fix the comparison:
- if QueryEngine.PRESTO == self.package.query_engine: + if QueryEngine.PRESTO.value == self.package.query_engine:
132-143
: Consider consistent enum value comparisons in the validatorThe validator mixes string and enum comparisons. While functional due to string enums, using consistent comparison approaches would improve clarity.
Apply this diff for more consistent comparisons:
- if query_engine in [QueryEngine.CLP, QueryEngine.CLP_S]: + if query_engine in [QueryEngine.CLP.value, QueryEngine.CLP_S.value]: if query_engine != storage_engine: raise ValueError( f"query_engine '{query_engine}' is only compatible with " f"storage_engine '{query_engine}'." ) - elif query_engine == QueryEngine.PRESTO: - if storage_engine != StorageEngine.CLP_S: + elif query_engine == QueryEngine.PRESTO.value: + if storage_engine != StorageEngine.CLP_S.value: raise ValueError( - f"query_engine '{QueryEngine.PRESTO}' is only compatible with " - f"storage_engine '{StorageEngine.CLP_S}'." + f"query_engine '{QueryEngine.PRESTO.value}' is only compatible with " + f"storage_engine '{StorageEngine.CLP_S.value}'." )components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)
1120-1120
: Use explicit length comparison for clarityWhile
if not components_to_start
is Pythonic, usingif len(components_to_start) == 0
would be more explicit and consistent with the team's style preferences.Apply this diff:
- if len(components_to_start) == 0: + if len(components_to_start) == 0:Wait, the code already uses the explicit form. No change needed.
1201-1204
: Simplify controller target special handling logicThe special conditions for creating DB tables and cache indices when
CONTROLLER_TARGET_NAME
is targeted add unnecessary complexity. Thecomponents_to_start
set already contains the correct filtered components.Apply this diff to simplify the logic:
- if ( - target == CONTROLLER_TARGET_NAME and DB_COMPONENT_NAME in runnable_components - ) or DB_COMPONENT_NAME in components_to_start: + if DB_COMPONENT_NAME in components_to_start: create_db_tables(instance_id, clp_config, container_clp_config, mounts) # ... other code ... - if ( - target == CONTROLLER_TARGET_NAME and RESULTS_CACHE_COMPONENT_NAME in runnable_components - ) or RESULTS_CACHE_COMPONENT_NAME in components_to_start: + if RESULTS_CACHE_COMPONENT_NAME in components_to_start: create_results_cache_indices(instance_id, clp_config, container_clp_config, mounts)Also applies to: 1215-1218
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py
(4 hunks)components/clp-py-utils/clp_py_utils/clp_config.py
(7 hunks)components/webui/client/src/config/index.ts
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{cpp,hpp,java,js,jsx,tpp,ts,tsx}
⚙️ CodeRabbit Configuration File
- Prefer
false == <expression>
rather than!<expression>
.
Files:
components/webui/client/src/config/index.ts
🧬 Code Graph Analysis (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (1)
components/clp-py-utils/clp_py_utils/clp_config.py (2)
get_components_for_target
(872-878)get_runnable_components
(830-834)
components/webui/client/src/config/index.ts (1)
components/webui/common/index.ts (1)
CLP_QUERY_ENGINES
(133-133)
🪛 Biome (2.1.2)
components/webui/client/src/config/index.ts
[error] 16-16: Shouldn't redeclare 'CLP_QUERY_ENGINES'. Consider to delete it or rename it.
'CLP_QUERY_ENGINES' is defined here:
(lint/suspicious/noRedeclare)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
- GitHub Check: ubuntu-jammy-static-linked-bins
- GitHub Check: ubuntu-jammy-dynamic-linked-bins
- GitHub Check: centos-stream-9-dynamic-linked-bins
- GitHub Check: centos-stream-9-static-linked-bins
- GitHub Check: ubuntu-jammy-lint
- GitHub Check: lint-check (ubuntu-24.04)
- GitHub Check: build-macos (macos-15, true)
- GitHub Check: build-macos (macos-15, false)
🔇 Additional comments (2)
components/clp-package-utils/clp_package_utils/scripts/start_clp.py (2)
1115-1122
: LGTM! Clean implementation of component filteringThe implementation correctly computes the intersection of target components and runnable components, with proper early exit when no components can be started. The error message clearly indicates the issue to the user.
1198-1238
: LGTM! Consistent component start logicAll component startup calls now properly check membership in
components_to_start
, providing uniform behaviour across all components. This is a clean and maintainable approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feat(clp-package)!: Add query engine option to support starting only compression components when using the Presto query engine.
@gibber9809's review has been addressed but they may not have time to re-review and approve.
builds now. Also tested package startup for clp, and presto, and did not encounter issues |
re: title it also starts the UI components right |
True. How about:
I tested the tasks to build the package tars and they worked (with the hotfix for the sed issue). |
Description
Background:
The UI team is working on making the package compatible with the Presto connector.
This PR disables unnecessary components when using Presto and passes a corresponding flag to the Web UI.
Changes:
Introduced a new query_engine option in the package config, which can be set to either "native" or "presto".
"native" refers to the current behavior of the package.
"presto" disables internal query-related components, as querying will now be handled by Presto.
Note:
The stop-clp.sh script has not been modified, as it already gracefully handles the case where components are not running (i.e., it does not fail if it attempts to stop components that were never started).
Checklist
breaking change.
No docs for this until fully implemented
Validation performed
Started and stopped the package with both presto and native options
Summary by CodeRabbit
New Features
Chores