Add interface for NoSQL storage #214

Merged
merged 163 commits into from
Mar 9, 2025

Conversation

@mcopik mcopik commented Jul 26, 2024

We want to add NoSQL storage and a test benchmark implementing a simple CRUD API - emulating a shopping cart, like on Amazon.

  • Add Python CRUD benchmark (a sketch of the emulated semantics follows this list).
  • Don't create GCP/Azure envs for tables when not needed.
  • Cherry-pick workflow commits fixing env update on cold start.
  • Cherry-pick Azure fixes for envs from workflows branch.
  • Enforce cold start on Azure - is the current update sufficient?
  • Support DynamoDB on AWS.
  • Support CosmosDB on Azure.
  • Support Firestore/Datastore on Google.
  • Create a nice and consistent way of deploying both MinIO & ScyllaDB simultaneously.
  • Unify OpenWhisk/Local storage resources.
  • Support ScyllaDB on local deployment.
  • Verify caching of MinIO/ScyllaDB on local.
  • Verify it works with OpenWhisk.
  • Verify regression still works (CLI init).
  • Add documentation on NoSQL design and configuration.
  • Add documentation on the new storage CLI.
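
To make the benchmark's scope concrete, here is a minimal, self-contained sketch of the shopping-cart semantics that 130.crud-api emulates. The in-memory dict stands in for the NoSQL table, and every name here is illustrative rather than the actual SeBS wrapper API.

# Minimal sketch of the shopping-cart semantics the 130.crud-api benchmark
# emulates. The in-memory dict stands in for the NoSQL table; the real
# benchmark routes these operations through the provider's nosql wrapper.
from typing import Dict, Tuple

# (cart_id, product_id) -> item attributes, mirroring a primary/secondary key pair
TABLE: Dict[Tuple[str, str], dict] = {}

def add_product(cart_id: str, product_id: str, name: str, price: float, quantity: int) -> None:
    TABLE[(cart_id, product_id)] = {"name": name, "price": price, "quantity": quantity}

def get_cart(cart_id: str) -> list:
    # All items sharing the primary key - a partition-key query in DynamoDB terms.
    return [dict(item, product_id=pid) for (cid, pid), item in TABLE.items() if cid == cart_id]

def checkout(cart_id: str) -> float:
    items = get_cart(cart_id)
    for item in items:
        del TABLE[(cart_id, item["product_id"])]
    return sum(i["price"] * i["quantity"] for i in items)

add_product("cart-1", "prod-42", "keyboard", 49.99, 1)
assert checkout("cart-1") == 49.99 and not TABLE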

Summary by CodeRabbit

Updated Release Notes

  • New Features

    • Introduced enhanced multi-cloud support for storage and NoSQL operations to streamline deployments across AWS, Azure, GCP, and self-hosted environments.
    • Added new CLI options and configuration parameters for modular storage management.
    • Added support for ScyllaDB as a NoSQL storage option.
    • Added new classes and methods for managing configurations and interactions with various NoSQL databases.
    • Added new configuration files for MinIO and ScyllaDB, defining settings for storage services.
    • Introduced new benchmark entry "130.crud-api" to enhance performance testing capabilities.
    • Added new methods for handling NoSQL storage configurations and implementations.
  • Refactor

    • Unified resource and environment variable handling for storage and NoSQL services.
    • Streamlined the management of storage configurations and improved the structure of related classes.
    • Enhanced the input preparation process and storage retrieval logic across multiple classes.
  • Documentation

    • Expanded supported platforms section and updated permission guidelines.
    • Added new documentation for storage configuration and deployment.
  • Chores

    • Updated dependencies and Docker build arguments for improved architecture support.

coderabbitai bot commented Jul 26, 2024

Walkthrough

This pull request encompasses widespread refactoring and feature additions across the project. Major updates include enhancements to benchmark configurations (e.g., new “modules” keys and updated function signatures with an added nosql parameter), improvements to storage and NoSQL wrappers across multiple cloud providers, and revamped resource management in the SeBS core. New abstract classes, enums, and utilities have been introduced for more modular and explicit configuration handling. Additionally, Docker build scripts and documentation are updated to accommodate architecture-specific builds and improved dependency management.

Changes

Files/Paths Change Summary
.gitignore, .mypy.ini, benchmarks/.../config.json, benchmarks/.../input.py Appended new ignore entries; added "modules" keys in JSON configs; removed legacy functions; updated function signatures to include nosql_func across multiple benchmark webapps (dynamic-html, uploader, crud-api, thumbnailer, video-processing, compression, image-recognition, graph benchmarks, dna-visualisation).
benchmarks/wrappers/.../nosql.py, benchmarks/wrappers/.../handler.py Introduced new nosql classes for AWS, Azure, GCP, local and OpenWhisk; added conditional initialization logic based on environment variables; updated storage class constructors and instance retrieval methods.
config/storage.json, config/systems.json, dockerfiles/..., docs/platforms.md, requirements.*.txt Added new configuration file for MinIO and ScyllaDB; restructured system config for base images and deployments; introduced TARGET_ARCHITECTURE arg in Dockerfiles; updated documentation with supported platforms and added new dependency entries.
sebs.py, sebs/aws/..., sebs/azure/..., sebs/benchmark.py, sebs/cache.py, sebs/config.py, sebs/experiments/... Refactored core SeBS CLI to accept multiple storage configuration files; introduced system_resources property; added new classes (e.g., AWSSystemResources, DynamoDB, CosmosDBAccount) and methods for resource and environment variable management; updated benchmark input preparation to use system_resources instead of direct storage.
sebs/faas/..., sebs/gcp/..., sebs/local/..., sebs/openwhisk/... Added abstract base class NoSQLStorage and SystemResources; refactored cloud provider integrations (GCP, Azure, OpenWhisk) to use new resource managers; removed deprecated MinIO storage support from OpenWhisk and local modules; updated function deployment and update flows accordingly.
sebs/storage/..., sebs/types.py, sebs/utils.py Introduced new storage configuration abstraction via PersistentStorageConfig and NoSQLStorageConfig with new implementations such as ScyllaDBConfig; added new resource manager classes (SelfHostedResources, SelfHostedSystemResources); new enum classes BenchmarkModule and NoSQLStorage added; utility function append_nested_dict implemented for merging nested dictionaries.
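
The last row mentions append_nested_dict for merging nested dictionaries when combining storage configs. A minimal sketch of such a merge, assuming recursive dict-by-dict semantics; the actual implementation in sebs/utils.py may differ.

# Sketch of a recursive nested-dict merge like the append_nested_dict utility
# mentioned above; the actual implementation in sebs/utils.py may differ.
def merge_nested_dict(base: dict, update: dict) -> dict:
    for key, val in update.items():
        if isinstance(base.get(key), dict) and isinstance(val, dict):
            merge_nested_dict(base[key], val)  # merge nested dicts in place
        else:
            base[key] = val  # leaves and new keys overwrite/extend
    return base

cfg = {"storage": {"object": {"type": "minio"}}}
merge_nested_dict(cfg, {"storage": {"nosql": {"type": "scylladb"}}})
assert cfg["storage"] == {"object": {"type": "minio"}, "nosql": {"type": "scylladb"}}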

Sequence Diagram(s)

sequenceDiagram
    participant User as User/CLI
    participant SeBS as SeBS Core
    participant SysRes as SystemResources
    participant Storage as Persistent Storage
    participant NoSQL as NoSQL Storage

    User->>SeBS: Provide storage configuration(s) via CLI
    SeBS->>SysRes: Load and merge multiple storage configs
    SysRes->>Storage: get_storage(replace_existing)
    Storage-->>SysRes: Return storage instance
    SysRes->>NoSQL: get_nosql_storage()
    NoSQL-->>SysRes: Return NoSQL storage instance
    SysRes->>SeBS: Supply complete resource instances
    SeBS->>Function: Update function configuration (env vars, storage settings)
    Function-->>SeBS: Confirm update
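
The diagram implies a small resource-manager contract. A hedged sketch of what the abstract SystemResources interface could look like; the stand-in types and exact signatures are assumptions, not the actual sebs/faas/resources.py code.

# Hedged sketch of the SystemResources contract implied by the diagram;
# stand-in types and signatures are assumptions, not the actual SeBS code.
from abc import ABC, abstractmethod
from typing import Optional

class PersistentStorage(ABC):  # stand-in for the real SeBS type
    pass

class NoSQLStorage(ABC):  # stand-in for the real SeBS type
    pass

class SystemResources(ABC):
    @abstractmethod
    def get_storage(self, replace_existing: Optional[bool] = None) -> PersistentStorage:
        """Return (lazily creating) the object-storage client."""

    @abstractmethod
    def get_nosql_storage(self) -> NoSQLStorage:
        """Return (lazily creating) the NoSQL storage client."""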

Poem

Hippity-hop, I code all day,
Adding modules in a bouncy way.
Storage and NoSQL now take the stage,
Cloud and local join the page.
Configurations merge with a friendly cheer,
CodeRabbit sings—change is here! 🐇

@mcopik mcopik changed the title [system] Add basic interface for allocating NoSQL storage Add interface for allocating NoSQL storage Jul 26, 2024
@mcopik mcopik changed the title Add interface for allocating NoSQL storage Add interface for NoSQL storage Jul 26, 2024

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (2)
sebs/storage/scylladb.py (2)

293-296: ⚠️ Potential issue

Use the correct table name when waiting for creation.

You are passing name instead of the actual table_name to waiter.wait(TableName=...), which may cause the waiter to fail if the new table name differs from the logical name.

- waiter.wait(TableName=name)
+ waiter.wait(TableName=table_name)

305-310: ⚠️ Potential issue

Return the actual table name for consistency.

When the table already exists, the code logs the real table_name but returns the logical name instead. Returning the actual table name is more consistent and avoids downstream mismatches.

- return name
+ return table_name
🧹 Nitpick comments (22)
sebs/aws/resources.py (2)

42-49: Remove outdated references to “benchmark” and “buckets” in the docstring.

These parameters are no longer part of the method signature, causing confusion for anyone reading or using the code. Consider removing or updating these references to match the current implementation.

 """
- Create a client instance for cloud storage. When benchmark and buckets
- parameters are passed, then storage is initialized with required number
- of buckets. Buckets may be created or retrieved from cache.
+ Create a client instance for cloud storage.
 :param replace_existing: replace existing files in cached buckets?
 :return: storage client
 """

70-82: Set logging handlers for DynamoDB for better consistency.

In get_storage(), the S3 storage uses logging_handlers; however, the DynamoDB instance currently does not. Consistency in logging is helpful for debugging and operation visibility.

 def get_nosql_storage(self) -> NoSQLStorage:
     if not self._nosql_storage:
         assert self._session is not None
         self.logging.info("Initialize DynamoDB NoSQL instance.")
         self._nosql_storage = DynamoDB(
             self._session,
             self._cache_client,
             self.config.resources,
             self.config.region,
             access_key=self.config.credentials.access_key,
             secret_key=self.config.credentials.secret_key,
         )
+        self._nosql_storage.logging_handlers = self._logging_handlers
     return self._nosql_storage
sebs/azure/system_resources.py (2)

41-50: Remove or revise references to creating buckets in the docstring.

The method no longer generates buckets per se; it initializes a BlobStorage instance. Unused references to “buckets” or related parameters can confuse readers.

 """
     Create wrapper object for Azure blob storage.
     First ensure that storage account is created and connection string
-    is known. Then, create wrapper and create request number of buckets.
+    is known. Then, create BlobStorage wrapper instance.
 
     Requires Azure CLI instance in Docker to obtain storage account details.
 
     :param replace_existing: when true, replace existing files in input buckets
     :return: Azure storage instance
 """

76-105: Declare _cli_instance_stop in the constructor for clarity.

Introducing _cli_instance_stop midway through the code makes its usage less discoverable. Consider initializing it in the constructor to keep class attributes organized and maintain a clear, readable structure.

 def __init__(
     self,
     system_config: SeBSConfig,
     config: AzureConfig,
     cache_client: Cache,
     docker_client: docker.client,
     logger_handlers: LoggingHandlers,
 ):
     super().__init__(config, cache_client, docker_client)
     self._logging_handlers = logger_handlers
     self._storage: Optional[BlobStorage] = None
     self._nosql_storage: Optional[CosmosDB] = None
     self._cli_instance: Optional[AzureCLI] = None
+    self._cli_instance_stop: bool = False
     self._system_config = system_config

 def cli_instance(self) -> AzureCLI:
     if self._cli_instance is None:
         self._cli_instance = AzureCLI(self._system_config, self._docker_client)
-        self._cli_instance_stop = True
+        self._cli_instance_stop = True  # reuse the declared attribute
     ...
sebs/faas/nosql.py (1)

112-118: Clarify the return values of clear_table and remove_table.

Both methods declare a string return type but do not specify what the string represents (e.g., a success message, the table name, or an error code). Providing a brief description in the docstring or code comments will improve readability.
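
For example, a docstring sketch that pins down one possible convention - assuming the returned string is the actual (cloud-side) table name, which the PR does not confirm:

# Sketch of a docstring convention for these methods; that the returned
# string is the actual table name is an assumption, not confirmed by the PR.
from abc import ABC, abstractmethod

class NoSQLStorage(ABC):
    @abstractmethod
    def clear_table(self, name: str) -> str:
        """Delete all items in the table; return the actual (cloud-side) table name."""

    @abstractmethod
    def remove_table(self, name: str) -> str:
        """Delete the table itself; return the name of the removed table."""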

sebs/azure/cloud_resources.py (3)

6-10: Consider expanding the docstring for future Azure resources
Providing additional context here about other potential resources would address past feedback regarding this file’s generic name.


32-55: Add robust error handling for Azure CLI failures
If retrieving the URL or credentials fails, consider catching and converting the error into a more descriptive exception at this level for better troubleshooting.


66-77: Ensure secure handling of primary keys
Make sure the primary key is never logged or stored in plaintext beyond this scope, to avoid security or compliance issues.

sebs/gcp/datastore.py (5)

20-26: Validate config input in deserialize
Although straightforward, consider validating the presence of required keys in config to avoid potential KeyErrors.


47-53: get_tables returns an empty dict
For debugging or feature completeness, consider enumerating kinds in the Datastore for more transparency.


76-81: Ensure consistent cache client usage
Here, we see self._cache_client referenced. Other files use self.cache_client; maintaining a single reference pattern prevents confusion.


85-116: write_to_table asserts secondary key
Currently, a missing secondary key is treated as impossible. Consider raising an explicit exception or supporting the no-secondary-key case.
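
A sketch of the suggested explicit exception, written as a free function with parameter names assumed from the surrounding discussion; note that a bare assert also vanishes under python -O:

# Sketch of replacing the bare assert with a descriptive exception; parameter
# names are assumed from the discussion. Unlike assert, this survives python -O.
from typing import Optional, Tuple

def write_to_table(table: str, data: dict,
                   primary_key: Tuple[str, str],
                   secondary_key: Optional[Tuple[str, str]] = None) -> None:
    if secondary_key is None:
        raise NotImplementedError(
            f"Writing to '{table}' without a secondary key is not supported yet."
        )
    key = dict([primary_key, secondary_key])  # e.g. {"cart_id": ..., "product_id": ...}
    print(f"writing {data} to {table} under key {key}")

write_to_table("shopping_cart", {"quantity": 2}, ("cart_id", "c1"), ("product_id", "p7"))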


176-181: Unimplemented table methods
Add docstrings or TODOs explaining how these methods will be implemented in the future.

sebs/azure/cosmosdb.py (5)

22-28: Serialization
Simple and clear. Consider minimal input validation on config keys for extra safety.


48-54: get_tables returning empty dict
Similar to other providers, consider enumerating containers for debugging or feature completeness.


119-142: write_to_table requires secondary key
Asserting the presence of secondary_key could be replaced with a more descriptive exception or optional logic if the design eventually allows a single key.


143-195: Creating database and container
Implementation checks existence by reading the container and conditionally creating it. This is a suitable approach, though parallel creation attempts may need synchronization.


196-201: Stubs for clear_table and remove_table
Documenting intent or timeline for these features would help future contributors.

sebs/storage/scylladb.py (4)

108-121: Consider using Docker health checks for container readiness.

The manual loop with nodetool status works but a native Docker health check would be more idiomatic and potentially more reliable. You can define a health check command in the image (or via run parameters) and poll the container’s health status directly.
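
A sketch of that approach using docker-py's healthcheck parameter; the image name and the nodetool probe mirror the discussion and are assumptions:

# Sketch of a native Docker health check via docker-py; image name and the
# nodetool probe mirror the review discussion and are assumptions.
import time
import docker

client = docker.from_env()
container = client.containers.run(
    "scylladb/scylla",
    detach=True,
    healthcheck={
        "test": ["CMD-SHELL", "nodetool status | grep -q '^UN'"],
        "interval": 5 * 10**9,  # docker-py expects nanoseconds
        "timeout": 5 * 10**9,
        "retries": 30,
    },
)
while True:
    container.reload()
    status = container.attrs["State"].get("Health", {}).get("Status")
    if status == "healthy":
        break
    if status == "unhealthy":
        raise RuntimeError("ScyllaDB container failed its health check")
    time.sleep(1)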


110-110: Fix typos in log messages.

Several log messages include minor typographical errors such as “succesfully” and “ScyllaBD.” Correcting these will ensure clearer logs.

- self.logging.info("Started ScyllaDB succesfully!")
+ self.logging.info("Started ScyllaDB successfully!")

- self.logging.error("Failed to launch ScyllaBD!")
+ self.logging.error("Failed to launch ScyllaDB!")

- raise RuntimeError("Starting ScyllaDB storage unsuccesful")
+ raise RuntimeError("Starting ScyllaDB storage unsuccessful")

Also applies to: 117-117, 128-128


125-125: Raise exceptions using the raise ... from e pattern for clarity.

When re-raising exceptions within except clauses, attach the original exception object (e.g., raise RuntimeError("...") from e) to clearly preserve the exception chain. This improvement addresses static analysis concerns and clarifies error origins.

 except docker.errors.APIError as e:
     ...
-    raise RuntimeError("Starting ScyllaDB storage unsuccesful")
+    raise RuntimeError("Starting ScyllaDB storage unsuccessful") from e

Also applies to: 128-128, 199-199, 312-312

🧰 Tools
🪛 Ruff (0.8.2)

125-125: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


130-130: Remove code duplication with MinIO logic.

The comment indicates a “FIXME” for duplicated code from your MinIO storage implementation. Refactoring and centralizing shared functionality would simplify maintenance and reduce potential inconsistencies.
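
One possible shape for that refactoring - a shared base class owning the container startup path, with provider-specific readiness probes; class and method names are illustrative, not the actual SeBS classes:

# Sketch of centralizing the startup logic shared by the MinIO and ScyllaDB
# wrappers; names are illustrative, not the actual SeBS classes.
from abc import ABC, abstractmethod

class SelfHostedContainerStorage(ABC):
    @abstractmethod
    def docker_image(self) -> str:
        """Image to launch, e.g. 'minio/minio' or 'scylladb/scylla'."""

    @abstractmethod
    def wait_until_ready(self) -> None:
        """Provider-specific readiness probe (health endpoint, nodetool, ...)."""

    def start(self, docker_client) -> None:
        # Shared path: launch the container, then delegate readiness polling.
        self._container = docker_client.containers.run(self.docker_image(), detach=True)
        self.wait_until_ready()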

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 69e83aa and 6d8229f.

📒 Files selected for processing (12)
  • benchmarks/100.webapps/130.crud-api/input.py (1 hunks)
  • benchmarks/wrappers/gcp/python/handler.py (1 hunks)
  • sebs/aws/resources.py (1 hunks)
  • sebs/azure/cloud_resources.py (1 hunks)
  • sebs/azure/cosmosdb.py (1 hunks)
  • sebs/azure/system_resources.py (1 hunks)
  • sebs/benchmark.py (9 hunks)
  • sebs/faas/nosql.py (1 hunks)
  • sebs/faas/resources.py (1 hunks)
  • sebs/gcp/datastore.py (1 hunks)
  • sebs/regression.py (15 hunks)
  • sebs/storage/scylladb.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • benchmarks/wrappers/gcp/python/handler.py
  • sebs/faas/resources.py
  • benchmarks/100.webapps/130.crud-api/input.py
  • sebs/benchmark.py
  • sebs/regression.py
🧰 Additional context used
🪛 Ruff (0.8.2)
sebs/gcp/datastore.py

164-164: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

sebs/storage/scylladb.py

125-125: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


128-128: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


199-199: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


312-312: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (24)
sebs/faas/nosql.py (1)

61-78: Consider concurrency-safety for table creation.

If multiple threads attempt to create the same benchmark table concurrently, there may be a race condition. Recommend clarifying the concurrency model or adding synchronization to guarantee only one creation attempt.
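
A minimal sketch of such synchronization - a coarse lock around the create-if-missing path; all names are illustrative:

# Sketch of per-benchmark synchronization for table creation; a coarse
# class-level lock is the simplest fix. Names are illustrative.
import threading
from typing import Dict

class TableAllocator:
    _lock = threading.Lock()
    _tables: Dict[str, str] = {}

    def create_benchmark_table(self, benchmark: str, name: str) -> str:
        with self._lock:
            key = f"{benchmark}/{name}"
            if key not in self._tables:
                # Only one thread reaches the (slow) cloud-side creation call.
                self._tables[key] = self._create_table_remote(benchmark, name)
            return self._tables[key]

    def _create_table_remote(self, benchmark: str, name: str) -> str:
        return f"sebs-{benchmark}-{name}"  # stand-in for the provider API call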

sebs/azure/cloud_resources.py (4)

1-3: Imports are well-structured
No issues found; all imports appear necessary and relevant.


13-31: Class properties look solid
The read-only properties and constructor are straightforward and clear.


56-65: CosmosDB endpoint retrieval
Implementation for query_url is correct. The returned endpoint is properly extracted from CLI output.


78-87: Serialization logic
The serialization and deserialization methods are concise and effective.

sebs/gcp/datastore.py (8)

1-3: Core imports
Looks good; usage is consistent with the rest of the file.


12-19: BenchmarkResources dataclass
The field definitions are clear and well-aligned with Datastore usage.


28-46: Class Datastore initialization
Static methods and constructor strongly align with NoSQLStorage; no issues noted.


54-63: _get_table_name
Straightforward logic verifying existence in benchmark_resources; no issues found.


64-75: Cache retrieval flow
Implementation aligns well with the cache-based design patterns in this codebase.


82-84: benchmark_database
Simple database name accessor; appears consistent with the rest of the code.


117-161: Database creation logic
This section properly checks for Firestore existence and allocates a new database when needed. Error handling for "NOT_FOUND" is clear.


165-175: Benchmark resource registration
Resources and database clients are memoized correctly here after creation; flow is logical.

sebs/azure/cosmosdb.py (11)

1-3: Imports
No issues identified; all are relevant to Cosmos DB usage.


14-21: BenchmarkResources dataclass
Defines database and containers for Cosmos DB context. Straightforward approach.


30-38: CosmosDB class definition
Follows the NoSQLStorage contract with matching method signatures.


39-46: Constructor usage of cache client
Previous feedback on cache_client usage appears resolved; the references look consistent.


55-64: _get_table_name
Method checks membership in containers to confirm existence. No issues found.


65-76: retrieve_cache
Logic follows the established cache pattern; straightforward and effective.


77-82: update_cache
Consistent usage of self.cache_client to update NoSQL configuration.


83-96: Lazy initialization of Cosmos client
Prevents unnecessary connections until the client is needed. Implementation looks solid.


97-99: has_tables
Minimal check for a benchmark’s presence in _benchmark_resources. Fits the intended usage.


100-102: benchmark_database
Retrieves database name from _benchmark_resources. No concerns noted.


103-118: credentials
Ensures the CosmosDBAccount is fetched first; returning the triple (account name, URL, key) is straightforward.

Comment on lines +162 to +164
self.logging.error("Couldn't query Datastore instances!")
self.logging.error(e)
raise RuntimeError("Couldn't query Datastore instances!")

🛠️ Refactor suggestion

Use exception chaining for clarity
Per B904, attach the original error as context to preserve the traceback and clarify debugging:

-    raise RuntimeError("Couldn't query Datastore instances!")
+    raise RuntimeError("Couldn't query Datastore instances!") from e
🧰 Tools
🪛 Ruff (0.8.2)

164-164: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🧹 Nitpick comments (4)
sebs/aws/dynamodb.py (4)

97-97: Consider adding error handling for put_item operation.

The put_item operation may fail for various reasons (e.g., provisioned throughput exceeded, conditional check failed). Consider adding error handling to catch and properly handle these exceptions.

-        self.client.put_item(TableName=table_name, Item=serialized_data)
+        try:
+            self.client.put_item(TableName=table_name, Item=serialized_data)
+        except self.client.exceptions.ProvisionedThroughputExceededException as e:
+            self.logging.error(f"DynamoDB write capacity exceeded: {e}")
+            raise RuntimeError(f"Failed to write to DynamoDB table {table_name}: {e}") from e
+        except Exception as e:
+            self.logging.error(f"Error writing to DynamoDB table {table_name}: {e}")
+            raise RuntimeError(f"Failed to write to DynamoDB table {table_name}: {e}") from e

123-124: Type ignores should be fixed with proper typing.

Instead of using # type: ignore, consider properly typing these parameters or using a more specific type annotation.

-                AttributeDefinitions=definitions,  # type: ignore
-                KeySchema=key_schema,  # type: ignore
+                AttributeDefinitions=definitions,  # type: List[Dict[str, str]]
+                KeySchema=key_schema,  # type: List[Dict[str, str]]

You could also define proper type aliases at the top of the file to make the code more readable:

from typing import Dict, List, Optional, Tuple

AttributeDefinition = Dict[str, str]
KeySchemaElement = Dict[str, str]

169-169: Use exception chaining for better error traceability.

When raising a new exception from a caught exception, use raise ... from e to maintain the exception chain for better debugging.

-            raise RuntimeError(f"Creating DynamoDB failed, unknown reason! Error: {e}")
+            raise RuntimeError(f"Creating DynamoDB failed, unknown reason! Error: {e}") from e
🧰 Tools
🪛 Ruff (0.8.2)

169-169: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)


12-20: Missing docstring for class and static methods.

Adding class and method docstrings would improve code readability and maintainability.

 class DynamoDB(NoSQLStorage):
+    """
+    AWS DynamoDB NoSQL storage implementation for SeBS.
+    
+    This class provides methods to create, write to, and manage DynamoDB tables
+    for benchmark data storage.
+    """
     @staticmethod
     def typename() -> str:
+        """Return the type name for this storage provider."""
         return "AWS.DynamoDB"

     @staticmethod
     def deployment_name():
+        """Return the deployment name for this storage provider."""
         return "aws"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d8229f and 130c3d6.

📒 Files selected for processing (1)
  • sebs/aws/dynamodb.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
sebs/aws/dynamodb.py

169-169: Within an except clause, raise exceptions with raise ... from err or raise ... from None to distinguish them from errors in exception handling

(B904)

🔇 Additional comments (4)
sebs/aws/dynamodb.py (4)

34-36: AWS credentials are provided but might not be necessary.

The AWS credentials (access_key and secret_key) are passed explicitly to the client. Based on previous discussions in the PR, these might not be necessary if you're relying on the session's credentials. Consider whether explicitly passing them is required for your use case.

#!/bin/bash
# Check if boto3 client creation works without explicit credentials
# when using a session

python3 -c "
import boto3
# Test if creating a session and then a client works without explicit credentials
try:
    session = boto3.session.Session()
    # Creating a client from the session should inherit the session's credentials
    client = session.client('dynamodb', region_name='us-east-1')
    print('Client creation without explicit credentials works')
except Exception as e:
    print(f'Error: {e}')
"

171-175: Implement the missing table operations methods.

The clear_table and remove_table methods are not implemented. Based on previous discussions, these might be planned for a future PR (#242). Please ensure these are implemented or add a more descriptive comment about the future implementation.

I could help implement these methods if needed. For example, remove_table would use delete_table API, and clear_table would require scanning and deleting all items in the table.
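
For reference, a hedged sketch of both methods using boto3 calls that exist today (delete_table, scan pagination, batch_write_item); key handling is simplified and untested against this PR's code:

# Hedged sketch of the two missing methods using existing boto3 APIs;
# key handling is simplified (e.g., no reserved-word escaping).
import boto3

client = boto3.client("dynamodb", region_name="us-east-1")

def remove_table(table_name: str) -> str:
    client.delete_table(TableName=table_name)
    client.get_waiter("table_not_exists").wait(TableName=table_name)
    return table_name

def clear_table(table_name: str) -> str:
    # Determine the key attributes, then scan keys and delete in batches of 25.
    key_names = [k["AttributeName"] for k in
                 client.describe_table(TableName=table_name)["Table"]["KeySchema"]]
    paginator = client.get_paginator("scan")
    for page in paginator.paginate(
        TableName=table_name, ProjectionExpression=", ".join(key_names)
    ):
        items = page.get("Items", [])
        for i in range(0, len(items), 25):
            batch = [{"DeleteRequest": {"Key": item}} for item in items[i:i + 25]]
            client.batch_write_item(RequestItems={table_name: batch})
    return table_name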


39-40: Good explanation for the table mapping structure.

The comment explaining the mapping structure with an example is helpful for understanding the code. This is a good practice.


98-103: Good documentation of AWS-specific behavior.

The docstring explaining the difference in database object hierarchy between AWS, Azure, and GCP is valuable context that helps readers understand the implementation.

f"Using existing DynamoDB table {table_name} for benchmark {benchmark}"
)
self._tables[benchmark][name] = table_name
return name

⚠️ Potential issue

Return value inconsistency.

You're returning name here when the function generally returns the AWS table name (table_name). This could lead to unexpected behavior.

-                return name
+                return table_name


def update_cache(self, benchmark: str):

self._cache_client.update_nosql(

⚠️ Potential issue

Fix inconsistent cache client reference.

You're using self._cache_client here but it's defined as self.cache_client in the constructor and used as such elsewhere.

-        self._cache_client.update_nosql(
+        self.cache_client.update_nosql(

f"Using existing DynamoDB table {table_name} for benchmark {benchmark}"
)
self._tables[benchmark][name] = table_name
return name

⚠️ Potential issue

Return value inconsistency.

Similar to line 154, you're returning name here when the function should be returning table_name.

-                return name
+                return table_name

Comment on lines +156 to +168
if "being created" in e.response["Error"]["Message"]:

self.logging.info(f"Waiting for the existing table {table_name} to be created")
waiter = self.client.get_waiter("table_exists")
waiter.wait(TableName=table_name, WaiterConfig={"Delay": 1})
ret = self.client.describe_table(TableName=table_name)

self.logging.info(
f"Using existing DynamoDB table {table_name} for benchmark {benchmark}"
)
self._tables[benchmark][name] = table_name
return name

🛠️ Refactor suggestion

Eliminate code duplication in exception handling.

The code in this "being created" exception block is almost identical to the "already exists" block above. Consider consolidating these blocks to reduce duplication.

-            if "being created" in e.response["Error"]["Message"]:
-
-                self.logging.info(f"Waiting for the existing table {table_name} to be created")
-                waiter = self.client.get_waiter("table_exists")
-                waiter.wait(TableName=table_name, WaiterConfig={"Delay": 1})
-                ret = self.client.describe_table(TableName=table_name)
-
-                self.logging.info(
-                    f"Using existing DynamoDB table {table_name} for benchmark {benchmark}"
-                )
-                self._tables[benchmark][name] = table_name
-                return name
-
-            raise RuntimeError(f"Creating DynamoDB failed, unknown reason! Error: {e}")
+            if "already exists" in e.response["Error"]["Message"] or "being created" in e.response["Error"]["Message"]:
+                self.logging.info(f"Waiting for the existing table {table_name} to be created")
+                waiter = self.client.get_waiter("table_exists")
+                waiter.wait(TableName=table_name, WaiterConfig={"Delay": 1})
+                ret = self.client.describe_table(TableName=table_name)
+
+                self.logging.info(
+                    f"Using existing DynamoDB table {table_name} for benchmark {benchmark}"
+                )
+                self._tables[benchmark][name] = table_name
+                return table_name
+
+            raise RuntimeError(f"Creating DynamoDB failed, unknown reason! Error: {e}") from e

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (6)
docs/platforms.md (2)

15-19: Enhance the "Supported platforms" header formatting.
The new section clearly lists the supported platforms. For better visual hierarchy and consistency with the rest of the document (which uses markdown headings for other sections), consider converting this plain text line into a markdown header (e.g., using ## Supported platforms).

Proposed diff:

-Supported platforms
+## Supported platforms

144-145: Improve sentence clarity and punctuation for GCP permissions.
The combined sentence reading "You should have at least write access to Cloud Functions (Cloud Functions Admin) and Logging Furthermore, SeBS needs the permissions to create Firestore databases through..." can be enhanced by adding proper punctuation to separate the requirements. Splitting into two sentences or adding a period after Logging would improve readability.

Proposed diff:

-You should have at least write access to **Cloud Functions** (`Cloud Functions Admin`) and **Logging** Furthermore, SeBS needs the permissions to create Firestore databases through
+You should have at least write access to **Cloud Functions** (`Cloud Functions Admin`) and **Logging**. Furthermore, SeBS needs the permissions to create Firestore databases through
sebs/config.py (2)

29-35: Add error handling and documentation to the deployment_module_packages method.

The method looks good functionally, but it's missing two important elements:

  1. Error handling for when the "module_packages" key doesn't exist in the configuration
  2. A docstring explaining the method's purpose, parameters, and return value

Consider updating the implementation to safely handle missing keys:

def deployment_module_packages(
    self, deployment_name: str, language_name: str
) -> Dict[str, str]:
+    """
+    Retrieves module packages for a specific deployment and language.
+    
+    Args:
+        deployment_name: The name of the deployment system
+        language_name: The programming language name
+        
+    Returns:
+        Dictionary mapping module names to their package names
+    """
-    return self._system_config[deployment_name]["languages"][language_name]["deployment"][
-        "module_packages"
-    ]
+    try:
+        return self._system_config[deployment_name]["languages"][language_name]["deployment"][
+            "module_packages"
+        ]
+    except KeyError:
+        return {}

70-72: Avoid duplication and add documentation to the version method.

The method correctly uses .get() with a default value, but it duplicates logic already present in benchmark_image_tag (line 103). This could lead to maintenance issues if one implementation is updated but not the other.

Consider extracting the common logic to avoid duplication:

def version(self) -> str:
+    """
+    Returns the SeBS version from the configuration.
+    
+    Returns:
+        The version string or "unknown" if not specified
+    """
    return self._system_config["general"].get("SeBS_version", "unknown")

Then update benchmark_image_tag to use this method:

def benchmark_image_tag(
    self,
    system: str,
    benchmark: str,
    language_name: str,
    language_version: str,
    architecture: str,
) -> str:
    tag = f"function.{system}.{benchmark}.{language_name}-{language_version}-{architecture}"
    if self.image_tag_prefix:
        tag = f"{tag}-{self.image_tag_prefix}"
-    sebs_version = self._system_config["general"].get("SeBS_version", "unknown")
+    sebs_version = self.version()
    tag = f"{tag}-{sebs_version}"
    return tag
sebs/benchmark.py (1)

783-785: Consider documenting the NoSQL function parameters.

The signature for the nosql_func callback is complex, and a brief comment explaining the parameters would improve maintainability.

 nosql_func: Optional[
-            Callable[[str, str, dict, Tuple[str, str], Optional[Tuple[str, str]]], None]
+            Callable[[str, str, dict, Tuple[str, str], Optional[Tuple[str, str]]], None]  # (table_name, item_id, item_data, primary_key, secondary_key)
tools/build_docker_images.py (1)

68-68: Architecture-specific Docker builds introduced

The code now explicitly accesses the x64 key in the base images dictionary, indicating a shift toward architecture-specific Docker builds.

This change supports the NoSQL integration by ensuring the correct architecture-specific images are built. Consider the following recommendations:

  1. Ensure other architectures (if needed) are handled properly.
  2. Update documentation to reflect this architecture-specific build approach.
  3. Add error handling for when the "x64" key is missing.
-    if "base_images" in language_config:
-        for version, base_image in language_config["base_images"]["x64"].items():
+    if "base_images" in language_config:
+        arch = "x64"  # Default architecture
+        if arch not in language_config["base_images"]:
+            print(f"Warning: Architecture {arch} not found in base_images. Skipping.")
+            return
+        for version, base_image in language_config["base_images"][arch].items():
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5a6b01e and 084e018.

📒 Files selected for processing (5)
  • docs/platforms.md (3 hunks)
  • sebs/benchmark.py (10 hunks)
  • sebs/config.py (2 hunks)
  • sebs/regression.py (15 hunks)
  • tools/build_docker_images.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sebs/regression.py
🔇 Additional comments (7)
sebs/benchmark.py (6)

26-28: Well-structured implementation of module support.

The addition of modules parameter to BenchmarkConfig enables extension points for different storage types, aligning with the PR's NoSQL integration goals. The implementation properly handles the modules in initialization, property access, and deserialization.

Also applies to: 32-32, 54-56, 67-67
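
A sketch of how such a modules key could be parsed during deserialization; the enum values and field names are assumptions based on the walkthrough, not the actual sebs/benchmark.py code:

# Sketch of parsing a "modules" key from a benchmark config.json; enum values
# and field names are assumptions, not the actual sebs/benchmark.py code.
from enum import Enum
from typing import List

class BenchmarkModule(str, Enum):
    STORAGE = "storage"
    NOSQL = "nosql"

class BenchmarkConfig:
    def __init__(self, timeout: int, memory: int, modules: List[BenchmarkModule]):
        self.timeout = timeout
        self.memory = memory
        self.modules = modules

    @staticmethod
    def deserialize(obj: dict) -> "BenchmarkConfig":
        # A missing "modules" key defaults to an empty list for older configs.
        return BenchmarkConfig(
            obj["timeout"],
            obj["memory"],
            [BenchmarkModule(m) for m in obj.get("modules", [])],
        )

cfg = BenchmarkConfig.deserialize({"timeout": 60, "memory": 256, "modules": ["storage", "nosql"]})
assert BenchmarkModule.NOSQL in cfg.modules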


153-164: Good addition of tracking properties.

These properties provide a clear API for determining if a benchmark uses different storage types. This will be useful for conditional logic in deployment and test code.


235-242: Well-organized initialization of storage-related state.

The explicit initialization of storage state flags to False by default is good practice, making the behavior predictable and easy to understand.


362-378: Good modularization of deployment packages.

The refactored code now properly handles module-specific package installations, allowing dependencies to be included only when necessary for the specific storage types being used.


629-697: Excellent refactoring of the input preparation.

The prepare_input method refactoring shows good separation of concerns between object storage and NoSQL storage handling. The implementation:

  1. Takes a unified SystemResources parameter instead of a specific storage type
  2. Conditionally handles object storage and NoSQL based on module capabilities
  3. Updates tracking flags to reflect what storage types are actually used
  4. Properly calls cache update methods for both storage types

This design will scale well as more storage types are added in the future.


769-772: Well-defined module interface for NoSQL support.

Adding the abstract allocate_nosql method to the BenchmarkModuleInterface ensures that all implementing benchmarks will provide the necessary configuration for NoSQL tables in a consistent way.

tools/build_docker_images.py (1)

41-41:

❓ Verification inconclusive

Docker tag format changed from dot to hyphen

The separator between the image tag and the SeBS version has been changed from a dot (.) to a hyphen (-). This changes the format of Docker image tags from repository:type.system.language.version.sebs_version to repository:type.system.language.version-sebs_version.

Make sure this change is consistent with how images are referenced elsewhere in the codebase and in deployment scripts.


🏁 Script executed:

#!/bin/bash
# Check if there are any references to the old image tag format with dot-separated SeBS version
rg 'docker.*\..*SeBS_version|docker.*\.[^"]*unknown' --type py

Length of output: 64


Docker Tag Format Update Consistency Check

The update in tools/build_docker_images.py—changing the concatenation to use a hyphen (target += "-" + sebs_version)—appears to reflect the new Docker tag format (repository:type.system.language.version-sebs_version). Our initial search in Python files did not reveal any references still using the old dot separator. However, because the initial verification did not cover non-Python files (such as deployment scripts), please verify that all image references across the entire codebase adhere to this updated format.

  • File Updated: tools/build_docker_images.py now correctly uses the hyphen separator.
  • Action Required: Manually confirm that deployment scripts and any other Docker image references (potentially in non-Python files) have been updated accordingly.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
sebs/azure/azure.py (1)

296-296: 🛠️ Refactor suggestion

Avoid mutable default arguments.

Using {} as a default value for env_variables can lead to unexpected behavior if the same dictionary is modified across multiple calls.

-    def update_envs(self, function: Function, code_package: Benchmark, env_variables: dict = {}):
+    def update_envs(self, function: Function, code_package: Benchmark, env_variables: Optional[dict] = None):
+        if env_variables is None:
+            env_variables = {}
🧰 Tools
🪛 Ruff (0.8.2)

296-296: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🧹 Nitpick comments (6)
sebs/azure/system_resources.py (4)

41-50: Fix docstring placement.

The multi-line docstring for get_storage is incorrectly placed outside the method. According to PEP 257, docstrings should be placed immediately after the method definition.

-    """
-        Create wrapper object for Azure blob storage.
-        First ensure that storage account is created and connection string
-        is known. Then, create wrapper and create request number of buckets.
-
-        Requires Azure CLI instance in Docker to obtain storage account details.
-
-        :param replace_existing: when true, replace existing files in input buckets
-        :return: Azure storage instance
-    """

    def get_storage(self, replace_existing: Optional[bool] = None) -> BlobStorage:
+        """
+        Create wrapper object for Azure blob storage.
+        First ensure that storage account is created and connection string
+        is known. Then, create wrapper and create request number of buckets.
+
+        Requires Azure CLI instance in Docker to obtain storage account details.
+
+        :param replace_existing: when true, replace existing files in input buckets
+        :return: Azure storage instance
+        """

66-71: Add docstring for the NoSQL storage method.

The get_nosql_storage method is missing a docstring to explain its purpose and behavior.

    def get_nosql_storage(self) -> CosmosDB:
+        """
+        Create wrapper object for Azure Cosmos DB.
+        Initialize Cosmos DB instance with the CLI instance, cache client, and configuration.
+        
+        :return: CosmosDB instance for NoSQL storage operations
+        """
        if self._nosql_storage is None:
            self._nosql_storage = CosmosDB(
                self.cli_instance, self._cache_client, self.config.resources, self.config.region
            )
        return self._nosql_storage

83-90: Improve error messages with more specific guidance.

The current error messages provide basic information but could be enhanced with more specific guidance for the user.

        subscriptions = json.loads(output)
        if len(subscriptions) == 0:
-            raise RuntimeError("Didn't find any valid subscription on Azure!")
+            raise RuntimeError("No valid Azure subscriptions found. Please verify your Azure credentials and ensure you have at least one active subscription.")
        if len(subscriptions) > 1:
-            raise RuntimeError("Found more than one valid subscription on Azure - not supported!")
+            raise RuntimeError("Multiple Azure subscriptions detected. Please set a default subscription using 'az account set --subscription <subscription_id>' before running this operation.")

        self.config.credentials.subscription_id = subscriptions[0]["id"]

102-108: Document expected behavior of initialize_cli.

The initialize_cli method lacks documentation to clarify its relationship with the auto-initialized CLI from the cli_instance property.

    def initialize_cli(self, cli: AzureCLI, login: bool = False):
+        """
+        Set an existing CLI instance instead of auto-creating one.
+        This is useful when sharing a CLI instance across multiple components.
+        
+        :param cli: Existing AzureCLI instance to use
+        :param login: Whether to perform login after setting the CLI instance
+        """
        self._cli_instance = cli
        self._cli_instance_stop = False

        if login:
            self._login_cli()
sebs/azure/azure.py (2)

297-313: Consider refactoring NoSQL environment variable setup.

The current code has multiple environment variable sets with hardcoded prefixes. This could be refactored to be more maintainable.

-        envs = {}
+        envs = env_variables.copy()
         if code_package.uses_nosql:
 
             nosql_storage = cast(CosmosDB, self._system_resources.get_nosql_storage())
 
             # If we use NoSQL, then the handle must be allocated
             _, url, creds = nosql_storage.credentials()
             db = nosql_storage.benchmark_database(code_package.benchmark)
-            envs["NOSQL_STORAGE_DATABASE"] = db
-            envs["NOSQL_STORAGE_URL"] = url
-            envs["NOSQL_STORAGE_CREDS"] = creds
+            # Group related NoSQL environment variables
+            nosql_env_vars = {
+                "NOSQL_STORAGE_DATABASE": db,
+                "NOSQL_STORAGE_URL": url,
+                "NOSQL_STORAGE_CREDS": creds
+            }
+            envs.update(nosql_env_vars)
 
             for original_name, actual_name in nosql_storage.get_tables(
                 code_package.benchmark
             ).items():
                 envs[f"NOSQL_STORAGE_TABLE_{original_name}"] = actual_name

369-374: Remove commented-out code.

The commented-out code about sleeping should be removed if it's no longer needed, or converted to a clearly marked commented section if it's kept for reference.

-                # if we don't do that, next invocation might still see old values
-                # Disabled since we swapped the order - we first update envs, then we publish.
-                #self.logging.info(
-                #    "Sleeping for 10 seconds - Azure needs more time to propagate changes. "
-                #    "Otherwise, functions might not see new variables and fail unexpectedly."
-                #)
+                # Note: Environment variable propagation is now handled by updating envs first, then publishing
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6887ae2 and bd8d6fb.

📒 Files selected for processing (4)
  • config/systems.json (8 hunks)
  • sebs/azure/azure.py (8 hunks)
  • sebs/azure/system_resources.py (1 hunks)
  • sebs/regression.py (15 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • config/systems.json
  • sebs/regression.py
🧰 Additional context used
🪛 Ruff (0.8.2)
sebs/azure/azure.py

296-296: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

🔇 Additional comments (3)
sebs/azure/system_resources.py (1)

35-40:

❓ Verification inconclusive

Add error handling for initialization.

The properties are initialized without checking for potential initialization errors that might occur later. Consider adding validation or error handling.


🏁 Script executed:

#!/bin/bash
# Check if there are any error handling patterns in similar classes
rg -A 3 -B 3 "def __init__.*SystemResources" --type py

Length of output: 56


Action Required: Incorporate Error Handling in SystemResources Initializer

The initialization of properties in sebs/azure/system_resources.py (lines 35–40) currently does not include any error handling or input validation. Given that similar classes in the repository don’t appear to implement this pattern, it’s not a consistency issue—but proactively validating or catching initialization errors can prevent unexpected runtime exceptions. Please consider:

  • Wrapping assignments (e.g., setting _logging_handlers or _system_config) within try/except blocks to catch and log unexpected errors.
  • Validating inputs (such as ensuring system_config is not None or malformed) before assignment.
  • Aligning any new error handling logic with established patterns in other parts of the repository (manual verification is recommended to confirm consistency).
sebs/azure/azure.py (2)

339-349:

❓ Verification inconclusive

Refine Azure system variable detection logic.

The current approach to detecting Azure system variables relies on hardcoded prefixes, which could break if Azure changes their naming conventions.


🏁 Script executed:

#!/bin/bash
# Check if there are any other places in the codebase that filter Azure environment variables
rg "FUNCTIONS_|WEBSITE_|APPINSIGHTS_|Azure" --type py

Length of output: 17040


Improve Azure Environment Variable Detection Logic

The current logic relies on a hardcoded list of prefixes (i.e., "FUNCTIONS_", "WEBSITE_", "APPINSIGHTS_", "Azure") to identify Azure system variables. This approach is brittle since any change in Azure's naming conventions could break it. Consider refactoring to mitigate future issues:

  • Centralize or Externalize Prefixes: Move these hardcoded prefixes to a dedicated constants file or configuration setting to ease future updates.
  • Consult Official Documentation: Explore any available Azure documentation or API endpoints that provide an authoritative list of system variable prefixes.
  • Maintain Backward Compatibility: Ensure that any refinements do not inadvertently overwrite manually provided variable values.

Please review this refactoring approach for suitability against the broader usage across the codebase; a sketch of the first option follows.
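
# Sketch of the centralization suggested above: one module-level constant and
# a small predicate instead of inline hardcoded prefixes. The prefix list
# mirrors the review comment and remains an assumption about Azure's naming.
AZURE_SYSTEM_ENV_PREFIXES = ("FUNCTIONS_", "WEBSITE_", "APPINSIGHTS_", "Azure")

def is_azure_system_variable(name: str) -> bool:
    return name.startswith(AZURE_SYSTEM_ENV_PREFIXES)

user_envs = {k: v for k, v in {"WEBSITE_RUN": "1", "NOSQL_STORAGE_URL": "x"}.items()
             if not is_azure_system_variable(k)}
assert user_envs == {"NOSQL_STORAGE_URL": "x"}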


589-592:

❓ Verification inconclusive

Clarify the cold start enforcement mechanism.

The commented-out line and FIXME comment suggest uncertainty about whether the current approach is sufficient for enforcing cold starts.


🏁 Script executed:

#!/bin/bash
# Let's look for other examples of cold start enforcement in the codebase
rg -A 5 "enforce_cold_start" --type py

Length of output: 5816


Action Required: Clarify Azure Cold Start Enforcement

The current implementation in _enforce_cold_start updates the function’s environment variables (via update_envs) using the ForceColdStart counter, then pauses execution for 20 seconds. However, the commented-out call to update_function and the accompanying FIXME note raise uncertainty about whether this approach fully enforces a cold start. For comparison, other providers (e.g., AWS and GCP) update function configurations explicitly to trigger cold starts. Could you please clarify whether updating the environment is sufficient in the Azure context or if the alternative update method should be reconsidered? This clarification or additional documentation will help ensure consistency in our cold start enforcement strategy.

Comment on lines +357 to +366
env_string = ""
for k, v in envs.items():
env_string += f" {k}={v}"

self.logging.info(f"Exporting environment variables for function {function.name}")
self.cli_instance.execute(
f"az functionapp config appsettings set --name {function.name} "
f" --resource-group {resource_group} "
f" --settings {env_string} "
)

🛠️ Refactor suggestion

Improve environment variable command construction.

The current approach of concatenating environment variables into a string can have issues with escaping special characters or spaces in values.

-                env_string = ""
-                for k, v in envs.items():
-                    env_string += f" {k}={v}"
+                # Use Azure CLI's argument format for settings to handle escaping properly
+                settings_args = " ".join([f"{k}='{v}'" for k, v in envs.items()])

                 self.logging.info(f"Exporting environment variables for function {function.name}")
                 self.cli_instance.execute(
                     f"az functionapp config appsettings set --name {function.name} "
                     f" --resource-group {resource_group} "
-                    f" --settings {env_string} "
+                    f" --settings {settings_args} "
                 )
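One caveat on the suggested fix itself: wrapping values in single quotes still breaks when a value contains a single quote. A more robust variant, sketched below, delegates quoting to Python's standard library (the sample envs dict is made up):

import shlex

envs = {"NOSQL_STORAGE_DATABASE": "bench db", "TOKEN": "it's-a-secret"}
# shlex.quote produces a shell-safe token for any value, including spaces and quotes.
settings_args = " ".join(f"{k}={shlex.quote(str(v))}" for k, v in envs.items())
print(settings_args)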

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (3)
sebs/gcp/gcp.py (1)

450-450: ⚠️ Potential issue

Avoid using a mutable default argument for env_variables.

Using a mutable default argument can lead to unexpected behavior if the function is called multiple times.

-    def update_function_configuration(
-        self, function: Function, code_package: Benchmark, env_variables: dict = {}
-    ):
+    def update_function_configuration(
+        self, function: Function, code_package: Benchmark, env_variables: Optional[dict] = None
+    ):
+        if env_variables is None:
+            env_variables = {}
🧰 Tools
🪛 Ruff (0.8.2)

450-450: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)
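For readers unfamiliar with the pitfall, a minimal self-contained demonstration (not SeBS code): default values are evaluated once, at function definition time, so every call shares the same object.

def remember(key, value, store={}):  # BUG: one dict is shared across all calls
    store[key] = value
    return store

print(remember("a", 1))  # {'a': 1}
print(remember("b", 2))  # {'a': 1, 'b': 2}: state leaked from the first call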

sebs/azure/azure.py (2)

296-296: ⚠️ Potential issue

Avoid mutable default argument in update_envs.

Using a mutable object like an empty dictionary as a default argument can lead to unexpected behavior.

-def update_envs(self, function: Function, code_package: Benchmark, env_variables: dict = {}):
+def update_envs(self, function: Function, code_package: Benchmark, env_variables: Optional[dict] = None):
+    if env_variables is None:
+        env_variables = {}
🧰 Tools
🪛 Ruff (0.8.2)

296-296: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)


357-366: 🛠️ Refactor suggestion

Improve environment variable command construction.

The current string concatenation approach for environment variables can have issues with escaping special characters or values containing spaces.

-                env_string = ""
-                for k, v in envs.items():
-                    env_string += f" {k}={v}"
+                # Use Azure CLI's argument format for settings to handle escaping properly
+                settings_args = " ".join([f"{k}='{v}'" for k, v in envs.items()])

                 self.logging.info(f"Exporting environment variables for function {function.name}")
                 self.cli_instance.execute(
                     f"az functionapp config appsettings set --name {function.name} "
                     f" --resource-group {resource_group} "
-                    f" --settings {env_string} "
+                    f" --settings {settings_args} "
                 )
🧹 Nitpick comments (8)
sebs/benchmark.py (5)

67-67: Consider validating module values

The deserialization directly converts JSON strings to BenchmarkModule objects without validation. Consider adding validation to ensure that only supported module types are accepted.
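A sketch of such validation, assuming BenchmarkModule is an Enum whose constructor raises ValueError for unknown values; the member names below are guesses based on this PR's storage and NoSQL modules:

from enum import Enum
from typing import List

class BenchmarkModule(Enum):
    STORAGE = "storage"
    NOSQL = "nosql"

def parse_modules(raw_modules: List[str]) -> List[BenchmarkModule]:
    modules = []
    for name in raw_modules:
        try:
            modules.append(BenchmarkModule(name))
        except ValueError:
            supported = [m.value for m in BenchmarkModule]
            raise ValueError(f"Unsupported benchmark module '{name}'; expected one of {supported}")
    return modules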


363-365: Remove extraneous f prefix

The f-string on line 365 contains no placeholders, so the f prefix is unnecessary. (The neighboring f-string with {self._language_version} does interpolate and should keep its prefix.)

- destination_file = f"requirements.txt"
+ destination_file = "requirements.txt"
🧰 Tools
🪛 Ruff (0.8.2)

365-365: f-string without any placeholders

Remove extraneous f prefix

(F541)


391-393: Remove extraneous f prefix

Similar to the previous comment, the f-string on line 393 contains no interpolation. (Again, the variant with {self._language_version} is a legitimate f-string and should stay as-is.)

- package_config = os.path.join(output_dir, f"package.json")
+ package_config = os.path.join(output_dir, "package.json")
🧰 Tools
🪛 Ruff (0.8.2)

393-393: f-string without any placeholders

Remove extraneous f prefix

(F541)


777-780: Clarify if allocate_nosql is optional

Since this is a new abstract method in the BenchmarkModuleInterface, it's not clear if all existing benchmark modules need to implement this method. Consider adding a docstring to explain the purpose and usage.

    @staticmethod
    @abstractmethod
    def allocate_nosql() -> dict:
-        pass
+        """
+        Optional method to define NoSQL tables needed by the benchmark.
+        
+        Returns:
+            dict: A dictionary where keys are table names and values are dictionaries 
+                  with 'primary_key' and optional 'secondary_key' fields.
+        """
+        pass
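To make the contract concrete, a hypothetical implementation for the shopping-cart scenario could look as follows; the class, table, and key names are invented for illustration and are not taken from the benchmark code:

class ShoppingCartInput:
    # Hypothetical input module for the 130.crud-api benchmark.
    @staticmethod
    def allocate_nosql() -> dict:
        return {
            "shopping_cart": {
                "primary_key": "cart_id",       # partition key: one cart per user
                "secondary_key": "product_id",  # sort key: items within the cart
            }
        }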

783-794: Consider simplifying the type annotation for nosql_func

The type annotation for the nosql_func parameter is complex and might be hard for developers to understand. Consider creating a TypeAlias for this function type.

# Add at the top of the file with other imports
+ from typing import TypeAlias
+ 
+ # Define a type alias for the NoSQL function type
+ NoSQLWriteFunction: TypeAlias = Callable[[str, str, dict, Tuple[str, str], Optional[Tuple[str, str]]], None]

# Then update the method signature
    @staticmethod
    @abstractmethod
    def generate_input(
        data_dir: str,
        size: str,
        benchmarks_bucket: Optional[str],
        input_paths: List[str],
        output_paths: List[str],
        upload_func: Optional[Callable[[int, str, str], None]],
-        nosql_func: Optional[
-            Callable[[str, str, dict, Tuple[str, str], Optional[Tuple[str, str]]], None]
-        ],
+        nosql_func: Optional[NoSQLWriteFunction],
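One caveat: typing.TypeAlias was only added in Python 3.10; on older interpreters it must be imported from typing_extensions instead.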
sebs/gcp/gcp.py (2)

435-448: Consider designing for future environment variable needs.

The implementation effectively handles NoSQL database variables, but as mentioned in previous discussions, other services like Redis might need similar handling in the future.

Consider extending this method to accommodate future environment variable types, either through a more generic approach or by adding a comment noting planned extensions:

def _generate_function_envs(self, code_package: Benchmark) -> dict:

    envs = {}
+   # Handle NoSQL-related environment variables
    if code_package.uses_nosql:

        db = (
            cast(GCPSystemResources, self._system_resources)
            .get_nosql_storage()
            .benchmark_database(code_package.benchmark)
        )
        envs["NOSQL_STORAGE_DATABASE"] = db

+   # TODO: Add support for Redis and other future services here

    return envs

266-285: Improve error handling in IAM policy retry mechanism.

The current implementation catches all HttpError instances without examining specific error types, which might mask unexpected errors.

Consider enhancing the error handling to log the specific error and only retry for specific error codes:

while counter < MAX_RETRIES:
    try:
        allow_unauthenticated_req.execute()
        break
-    except HttpError:
+    except HttpError as e:
+        # Only retry for specific error codes (e.g., 429 Too Many Requests, 503 Service Unavailable)
+        if e.resp.status in [429, 503]:
+            self.logging.info(
+                f"Sleeping for 5 seconds because of HTTP error {e.resp.status}: {str(e)}"
+            )
+            time.sleep(5)
+            counter += 1
+        else:
+            # For other errors, we should not retry
+            raise RuntimeError(
+                f"Failed to configure function {full_func_name} "
+                f"for unauthenticated invocations: {str(e)}"
+            )

-        self.logging.info(
-            "Sleeping for 5 seconds because the created functions is not yet available!"
-        )
-        time.sleep(5)
-        counter += 1
sebs/azure/azure.py (1)

370-374: Remove commented-out code.

The commented sleep code is no longer necessary since you've changed the order of operations (updating environment variables before publishing). Keeping commented code can lead to confusion.

-                # if we don't do that, next invocation might still see old values
-                # Disabled since we swapped the order - we first update envs, then we publish.
-                # self.logging.info(
-                #    "Sleeping for 10 seconds - Azure needs more time to propagate changes. "
-                #    "Otherwise, functions might not see new variables and fail unexpectedly."
-                # )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bd8d6fb and d0f98c5.

📒 Files selected for processing (3)
  • sebs/azure/azure.py (8 hunks)
  • sebs/benchmark.py (11 hunks)
  • sebs/gcp/gcp.py (13 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
sebs/azure/azure.py

296-296: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

sebs/gcp/gcp.py

450-450: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

sebs/benchmark.py

365-365: f-string without any placeholders

Remove extraneous f prefix

(F541)


393-393: f-string without any placeholders

Remove extraneous f prefix

(F541)

🔇 Additional comments (6)
sebs/benchmark.py (5)

26-32: Great enhancement to support NoSQL modules

The addition of the modules parameter to BenchmarkConfig constructor aligns well with the PR objective of adding NoSQL storage capabilities.


54-56: LGTM - Clean property implementation

The property implementation for modules follows the same pattern as the other properties in this class.


153-164: Good addition of state tracking properties

These properties provide clear access to the benchmark's state regarding input processing and storage usage, which is essential for the NoSQL functionality.


637-705: Well-structured implementation of prepare_input for NoSQL

The refactored prepare_input method now handles both object storage and NoSQL storage in a clean and structured way. The conditional checks for the presence of methods in the input module are a good approach to maintain backward compatibility.

However, consider adding more error handling for edge cases:

        """
            Handle key-value storage.
            This part is optional - only selected benchmarks implement this.
        """
        if hasattr(self._benchmark_input_module, "allocate_nosql"):

-            nosql_storage = system_resources.get_nosql_storage()
+            try:
+                nosql_storage = system_resources.get_nosql_storage()
+            except Exception as e:
+                self.logging.error(f"Failed to initialize NoSQL storage: {str(e)}")
+                raise RuntimeError(f"NoSQL storage initialization failed: {str(e)}")

376-382: 🛠️ Refactor suggestion

Add safeguard for missing module packages

The loop already guards the lookup with an `in` membership check, so a KeyError cannot occur here; the suggested change below merely adds a debug log for modules that ship no extra packages.

 for bench_module in self._benchmark_config.modules:
     if bench_module.value in module_packages:
         for package in module_packages[bench_module.value]:
             out.write(package)
+    else:
+        self.logging.debug("No packages found for module %s", bench_module.value)

Likely an incorrect or invalid review comment.

sebs/azure/azure.py (1)

589-593:

❓ Verification inconclusive

Consider adding a comment about cold start implementation.

The commented line suggests uncertainty about whether the current implementation is sufficient for enforcing cold starts.

    self.update_envs(function, code_package, {"ForceColdStart": str(self.cold_start_counter)})

-    # FIXME: is this sufficient to enforce cold starts?
-    # self.update_function(function, code_package, False, "")
+    # We enforce cold starts by updating an environment variable that changes on each cold start request.
+    # This causes the function container to be redeployed, triggering a cold start.

It would be beneficial to verify if this approach consistently forces cold starts on Azure:


🏁 Script executed:

#!/bin/bash
# Verify the cold start enforcement on Azure functions

# Check the function's environment variables before and after enforcing cold start
echo "Fetching environment variables for the function before cold start..."
az functionapp config appsettings list --name YOUR_FUNCTION_APP --resource-group YOUR_RESOURCE_GROUP

# Measure initial response time
echo "Invoking function to measure warm response time..."
time curl -s YOUR_FUNCTION_URL

# Force a cold start
echo "Enforcing cold start..."
# (Script would execute the enforce_cold_start function here)

# Measure response time after cold start
echo "Invoking function to verify cold start..."
time curl -s YOUR_FUNCTION_URL

Length of output: 743


Review Update: Clarify Cold Start Mechanism and Recommend Manual Verification

The update to the comment now clearly explains that cold starts are triggered by updating an environment variable that changes with each request. However, the test script output was inconclusive because the environment lacked the Azure CLI ("az" command not found). Please ensure that manual verification is performed in an environment with the necessary tooling (i.e., Azure CLI configured) to confirm that this approach consistently forces cold starts.

Required actions:

  • Update the comment at sebs/azure/azure.py (lines 589–593) as follows:
        self.update_envs(function, code_package, {"ForceColdStart": str(self.cold_start_counter)})
    
    -    # FIXME: is this sufficient to enforce cold starts?
    -    # self.update_function(function, code_package, False, "")
    +    # We enforce cold starts by updating an environment variable that changes on each cold start request.
    +    # This causes the function container to be redeployed, triggering a cold start.
  • Manually test the cold start behavior on an environment where the Azure CLI is available to ensure the mechanism works as expected.

Comment on lines +235 to +242

self._benchmark_data_path = find_benchmark(self._benchmark, "benchmarks-data")
self._benchmark_input_module = load_benchmark_input(self._benchmark_path)

# Check if input has been processed
self._input_processed: bool = False
self._uses_storage: bool = False
self._uses_nosql: bool = False

🛠️ Refactor suggestion

Add error handling for benchmark input loading

The code loads the benchmark input module but doesn't have error handling if the module is not found or fails to load.

- self._benchmark_input_module = load_benchmark_input(self._benchmark_path)
+ try:
+     self._benchmark_input_module = load_benchmark_input(self._benchmark_path)
+ except (ImportError, FileNotFoundError) as e:
+     raise RuntimeError(f"Failed to load input module for benchmark {self._benchmark}: {str(e)}")

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
docs/benchmarks.md (1)

45-49: Improve Grammatical Clarity in CRUD API Description

In the CRUD API description, the phrase “add new item (PUT)” would read more clearly if it were “add a new item (PUT)”. This minor adjustment will enhance the readability of the documentation.

-It offers three basic methods: add new item (`PUT`), get an item (`GET`), and query all items in a cart.
+It offers three basic methods: add a new item (`PUT`), get an item (`GET`), and query all items in a cart.
🧰 Tools
🪛 LanguageTool

[uncategorized] ~47-~47: You might be missing the article “a” here.
Context: ...art. It offers three basic methods: add new item (PUT), get an item (GET), and ...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 73ec8e1 and dcf7fc1.

📒 Files selected for processing (1)
  • docs/benchmarks.md (2 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/benchmarks.md

[uncategorized] ~47-~47: You might be missing the article “a” here.
Context: ...art. It offers three basic methods: add new item (PUT), get an item (GET), and ...

(AI_EN_LECTOR_MISSING_DETERMINER_A)

🔇 Additional comments (1)
docs/benchmarks.md (1)

8-8: New Benchmark Row Added Successfully

The new table row for 130.crud-api is well formatted and clearly conveys the intended benchmark details, including the type, supported language, architectures, and a concise description. This aligns nicely with the PR's objective of integrating NoSQL storage capabilities.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
docs/storage.md (1)

81-82: Consider using "inaccessible" for better readability.

The phrases "not accessible" could be replaced with "inaccessible" to improve readability.

-By default, the container runs on the Docker bridge network with an address (e.g., `172.17.0.2`) that is not accessible from outside the host.
-Even when deploying both OpenWhisk and storage on the same system, the local bridge network is not accessible from the Kubernetes cluster. 
+By default, the container runs on the Docker bridge network with an address (e.g., `172.17.0.2`) that is inaccessible from outside the host.
+Even when deploying both OpenWhisk and storage on the same system, the local bridge network is inaccessible from the Kubernetes cluster. 
🧰 Tools
🪛 LanguageTool

[style] ~81-~81: Consider using “inaccessible” to avoid wordiness.
Context: ...an address (e.g., 172.17.0.2) that is not accessible from outside the host. Even when deploy...

(NOT_ABLE_PREMIUM)


[style] ~82-~82: Consider using “inaccessible” to avoid wordiness.
Context: ...ame system, the local bridge network is not accessible from the Kubernetes cluster. To make i...

(NOT_ABLE_PREMIUM)

sebs/storage/config.py (1)

83-87: Dictionary key lookup can be optimized.

The code iterates over .keys() of the dictionary, which is redundant: iterating over a dict directly already yields its keys.

-        for key in ScyllaDBConfig.__dataclass_fields__.keys():
+        for key in ScyllaDBConfig.__dataclass_fields__:
            cache.update_config(val=getattr(self, key), keys=[*path, key])
🧰 Tools
🪛 Ruff (0.8.2)

85-85: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

sebs.py (2)

163-169: Use context manager for file operations.

When opening files, it's better to use context managers to ensure proper resource cleanup.

-    if storage_configuration is not None:
-
-        for cfg_f in storage_configuration:
-            sebs_client.logging.info(f"Loading storage configuration from {cfg_f}")
-
-            cfg = json.load(open(cfg_f, 'r'))
-            append_nested_dict(config_obj, ["deployment", deployment, "storage"], cfg)
+    if storage_configuration is not None:
+
+        for cfg_f in storage_configuration:
+            sebs_client.logging.info(f"Loading storage configuration from {cfg_f}")
+
+            with open(cfg_f, 'r') as f:
+                cfg = json.load(f)
+            append_nested_dict(config_obj, ["deployment", deployment, "storage"], cfg)
🧰 Tools
🪛 Ruff (0.8.2)

168-168: Use a context manager for opening files

(SIM115)


397-399: Use context manager for file operations.

When opening files, it's better to use context managers to ensure proper resource cleanup.

-    user_storage_config = json.load(open(config, 'r'))
+    with open(config, 'r') as f:
+        user_storage_config = json.load(f)
🧰 Tools
🪛 Ruff (0.8.2)

398-398: Use a context manager for opening files

(SIM115)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dcf7fc1 and 664d03e.

📒 Files selected for processing (5)
  • docs/platforms.md (5 hunks)
  • docs/storage.md (1 hunks)
  • sebs.py (12 hunks)
  • sebs/storage/config.py (2 hunks)
  • sebs/storage/minio.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sebs/storage/minio.py
🧰 Additional context used
🪛 LanguageTool
docs/storage.md

[style] ~81-~81: Consider using “inaccessible” to avoid wordiness.
Context: ...an address (e.g., 172.17.0.2) that is not accessible from outside the host. Even when deploy...

(NOT_ABLE_PREMIUM)


[style] ~82-~82: Consider using “inaccessible” to avoid wordiness.
Context: ...ame system, the local bridge network is not accessible from the Kubernetes cluster. To make i...

(NOT_ABLE_PREMIUM)

🪛 Ruff (0.8.2)
sebs.py

168-168: Use a context manager for opening files

(SIM115)


398-398: Use a context manager for opening files

(SIM115)

sebs/storage/config.py

85-85: Use key in dict instead of key in dict.keys()

Remove .keys()

(SIM118)

🔇 Additional comments (25)
docs/storage.md (5)

1-15: Well-structured introduction to storage configuration.

The introduction clearly explains the purpose of storage in SeBS benchmarks and introduces Minio and ScyllaDB as the solutions for object storage and NoSQL database, respectively. This helps users understand the overall storage architecture.


16-32: Clear CLI command examples for starting storage services.

The documentation provides explicit examples for starting object storage, NoSQL storage, or both, which makes it easy for users to understand the different options available.


34-68: Comprehensive configuration example.

The JSON configuration example is detailed and includes all necessary parameters for both storage types. This helps users understand the expected configuration format.


93-108: Good practice: HTTP validation example.

Including a concrete example with curl to validate the Minio configuration is excellent as it provides users with a practical way to verify their setup is working correctly.


111-123: Clear lifecycle management instructions.

The section on lifecycle management clearly explains how storage containers are retained and how to stop them when needed, completing the operational understanding of the storage services.

sebs/storage/config.py (6)

10-19: Well-defined abstract base class with clear contract.

The PersistentStorageConfig abstract base class defines a clear contract with the serialize and envs abstract methods that concrete implementations must provide.
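For orientation, a sketch of what this contract amounts to; only the class and method names come from the comment above, while the signatures and docstrings are assumptions:

from abc import ABC, abstractmethod
from typing import Dict

class PersistentStorageConfig(ABC):
    @abstractmethod
    def serialize(self) -> dict:
        """Return a JSON-serializable view of the configuration."""
        ...

    @abstractmethod
    def envs(self) -> Dict[str, str]:
        """Return the environment variables a function needs to reach the storage."""
        ...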


22-32: Updated MinioConfig with additional attributes.

The MinioConfig class is now properly inheriting from PersistentStorageConfig and includes the new version and data_volume attributes which are necessary for managing storage instances.


51-59: Implementation of abstract methods in MinioConfig.

The serialize method now simply returns self.__dict__ and the envs method provides the necessary environment variables for Minio.


62-67: NoSQL abstract class definition.

The abstract class for NoSQL storage configurations defines the contract for serialization, providing a consistent approach across different types of storage.


69-82: Comprehensive ScyllaDB configuration class.

The ScyllaDBConfig class includes all necessary attributes for ScyllaDB management, including address, ports, credentials, and resource settings.


88-99: Well-implemented ScyllaDB serialization methods.

The deserialize and serialize methods in the ScyllaDBConfig class follow a consistent pattern compared to MinioConfig, making the code more maintainable.
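The shared pattern can be pictured with a small sketch; the fields shown are illustrative, not the actual ScyllaDBConfig attributes:

from dataclasses import dataclass, fields

@dataclass
class ConfigSketch:
    address: str = ""
    alternator_port: int = 8000

    @staticmethod
    def deserialize(data: dict) -> "ConfigSketch":
        # Accept only keys that correspond to declared dataclass fields.
        known = {f.name for f in fields(ConfigSketch)}
        return ConfigSketch(**{k: v for k, v in data.items() if k in known})

    def serialize(self) -> dict:
        return dict(self.__dict__)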

docs/platforms.md (4)

15-20: Helpful platform summary with navigation links.

Adding a concise list of supported platforms with links to their respective sections improves navigation within the document.


21-23: Clear reference to storage documentation.

The new "Storage Configuration" section provides a clear reference to the separate storage documentation, ensuring users are aware of the importance of storage configuration for their deployments.


148-149: Updated GCP permissions requirements.

The enhanced documentation now clearly specifies the required permissions for Google Cloud Functions, including the Firestore Service Agent role needed for creating Firestore databases.


185-185: Helpful cross-reference to storage documentation.

Adding a reference to the storage documentation in the OpenWhisk section ensures users understand the specific storage requirements for OpenWhisk deployments.

sebs.py (10)

9-9: Updated import to include NoSQL storage types.

The import has been updated to include the NoSQLStorage types, which is necessary for the new NoSQL storage functionality.


138-138: Updated type annotation for storage configuration.

The type annotation for storage_configuration has been changed from Optional[str] to Optional[List[str]] to support multiple storage configuration files.


233-233: Updated CLI option to support multiple storage configurations.

The --storage-configuration CLI option now supports multiple values, allowing users to specify multiple storage configuration files.
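Assuming the CLI is built with click, accepting repeated values is typically a one-flag change (multiple=True); the snippet below is a sketch, not the actual sebs.py code:

import click

@click.command()
@click.option(
    "--storage-configuration",
    multiple=True,  # allows the flag to be passed several times
    type=click.Path(exists=True),
    help="One or more JSON files with storage configuration.",
)
def benchmark(storage_configuration):
    # click collects repeated flags into a tuple.
    for cfg in storage_configuration:
        click.echo(f"Loading storage configuration from {cfg}")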


390-392: Enhanced storage command with more options.

The storage start command now accepts different storage types ("object", "nosql", or "all") and reads configuration from a JSON file, providing more flexibility.


400-416: Well-structured handling of object storage.

The code for starting object storage is well-structured and includes proper initialization, configuration, and serialization of the storage instance.


418-434: Well-structured handling of NoSQL storage.

The code for starting NoSQL storage follows the same pattern as object storage, maintaining a consistent approach that makes the code more maintainable.


446-447: Consistent interface for stopping storage.

The storage stop command follows the same pattern as storage start, accepting the storage type and input JSON path.


454-464: Proper shutdown of object storage.

The code for stopping object storage properly deserializes the configuration, shuts down the storage instance, and logs the process.


466-476: Proper shutdown of NoSQL storage.

The code for stopping NoSQL storage follows the same pattern as object storage, maintaining a consistent approach.


707-708: Improved error handling for unsupported resources.

The code now raises a NotImplementedError for unsupported resource types, providing better error feedback to users.

@mcopik mcopik merged commit 3266d2d into master Mar 9, 2025
1 check passed