41526 Storage [Backend] Create a file service #53
base: master
Conversation
…ion API - Add FileAttachment.access_type field with account/restricted options - Add FileAttachment.file_id field for unique file identification - Create FileAttachmentPermission model for user-specific file access - Implement AttachmentService.check_user_permission method - Add AttachmentsViewSet with check-permission endpoint - Add Redis caching for public template authentication - Add response_forbidden method to BaseResponseMixin - Include comprehensive test coverage for permission checking This enables fine-grained file access control with two access types: - account: accessible by all users in the same account - restricted: accessible only by users with explicit permissions
…integration - Add FastAPI-based file upload and download endpoints with streaming support - Implement Clean Architecture with domain entities, use cases, and repositories - Add authentication middleware with JWT token validation and Redis caching - Integrate Google Cloud Storage S3-compatible API for file storage - Add comprehensive error handling with custom exceptions and HTTP status codes - Implement file access permissions validation through external HTTP service - Add database models and Alembic migrations for file metadata storage - Include Docker containerization with docker-compose for local development - Add comprehensive test suite with unit, integration, and e2e tests - Configure pre-commit hooks with ruff, mypy, and pytest for code quality
…e_access_rights' into 41526__сreate_file_service # Conflicts: # backend/src/processes/enums.py # backend/src/processes/models/__init__.py # backend/src/processes/models/workflows/attachment.py # backend/src/processes/services/attachments.py # backend/src/processes/tests/test_services/test_attachments.py
… and implement caching for template data
)

# Upload file
await s3.put_object(
Both StorageService.upload_file (https://github.com/pneumaticapp/pneumaticworkflow/pull/53/files#diff-8073b0f56943dafd3f23e27584a1233a203192e84be5f7a0685233c6ceccc7f1R1) and the put_object call in storage/src/infra/repositories/storage_service.py:124 forward a content_type that can be None directly to aioboto3 as ContentType=None, causing a ParamValidationError instead of a ClientError and bypassing StorageError wrapping. Consider conditionally including ContentType only when content_type is not None or defaulting it to a safe MIME type (e.g., application/octet-stream).
- await s3.put_object(
- Bucket=bucket_name,
- Key=file_path,
- Body=file_content,
- ContentType=content_type,
- )
+ params = {
+ 'Bucket': bucket_name,
+ 'Key': file_path,
+ 'Body': file_content,
+ }
+ if content_type is not None:
+ params['ContentType'] = content_type
+ await s3.put_object(**params)
| """Authenticate public token""" | ||
| try: | ||
| token = PublicAuthService.get_token(raw_token) | ||
| if token: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
authenticate_public_token passes the raw token into PublicAuthService.get_token(...), but get_token expects a header-like value in the form "Token <value>". This causes valid cookie/header values containing just the token to fail parsing and the method to silently return None.
Consider constructing the header-formatted string using PublicAuthService.HEADER_PREFIX before calling get_token, or alternatively consider extending get_token to accept raw tokens as well and document the accepted formats.
- token = PublicAuthService.get_token(raw_token)
+ token = PublicAuthService.get_token(f"{PublicAuthService.HEADER_PREFIX} {raw_token}")
error_code = e.response['Error']['Code']
if error_code == 'NoSuchKey':
    raise StorageError(
        f'File {file_path} not found in bucket {bucket_name}'
StorageError is being initialized with formatted or plain message strings as the error_code_key in multiple methods—create_bucket, upload_file, and download_file. Since the first positional argument is used to look up DOMAIN_ERROR_CODES, this will raise a KeyError and hide the original error. Use the dedicated classmethods (e.g., StorageError.bucket_create_failed, StorageError.upload_failed, StorageError.file_not_found_in_storage) or pass a valid error code key with the message in details to preserve context.
- raise StorageError(
- f'File {file_path} not found in bucket {bucket_name}'
- )
+ raise StorageError.file_not_found_in_storage(file_path, bucket_name)
….toml to dedicated config files - Move mypy configuration from pyproject.toml to mypy.ini for better separation of concerns - Simplify ruff.toml configuration by removing extensive rule selections and using "ALL" selector - Update ruff target version from py37 to py311 to match project Python version - Remove redundant ruff configuration from pyproject.toml to avoid duplication - Apply code formatting fixes across entire codebase - Standardize import statements and code style according to new linting rules - Update test files to comply with new formatting standards
),
'STORAGE_OPERATION_FAILED': ErrorCode(
    code='STORAGE_001',
    message=MSG_STORAGE_001,
DOMAIN_ERROR_CODES['STORAGE_OPERATION_FAILED'] uses MSG_STORAGE_001 (upload-specific), which misleads for non-upload failures. Consider using a neutral message (e.g., a generic string or a new constant) so the fallback isn’t tied to uploads.
- message=MSG_STORAGE_001,
+ message='Storage operation failed',
Issue in backend/src/processes/services/attachments.py:69:
check_user_permission() depends on file_id, but attachments are created without one, so queries with None can match unrelated rows or deny valid access. Consider generating and assigning a unique, non-null file_id on create/clone, and short-circuit when file_id is None.
- def _create_attachment(
- self,
- name: str,
- url: str,
- size: int,
- thumbnail_url: Optional[str] = None,
- ) -> FileAttachment:
+ def _create_attachment(
+ self,
+ name: str,
+ url: str,
+ size: int,
+ file_id: str,
+ thumbnail_url: Optional[str] = None,
+ ) -> FileAttachment:
+ file_id=file_id,
+ file_id = get_salt(30)
+ file_id=file_id,
+ file_id=get_salt(30),
+ if file_id is None:
+ return False
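For readability, the intent of the scattered hunks above is roughly the following. This is a hedged sketch only: get_salt, FileAttachment, and AttachmentService come from this PR, while self.account and the exact create/clone call sites are assumptions.

from typing import Optional

class AttachmentService:
    def create(self, name: str, url: str, size: int) -> 'FileAttachment':
        # Generate the unique id up front so it is never None
        return self._create_attachment(
            name=name, url=url, size=size, file_id=get_salt(30),
        )

    def _create_attachment(
        self,
        name: str,
        url: str,
        size: int,
        file_id: str,
        thumbnail_url: Optional[str] = None,
    ) -> 'FileAttachment':
        return FileAttachment.objects.create(
            account=self.account,  # assumed to be set on the service
            name=name,
            url=url,
            size=size,
            thumbnail_url=thumbnail_url,
            file_id=file_id,
        )

    def check_user_permission(self, file_id: Optional[str], user_id: int) -> bool:
        # Rows created before this change may have no file_id;
        # never let a None value match unrelated attachments
        if file_id is None:
            return False
        ...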
    exc: ValidationError,
) -> JSONResponse:
    """Handle Pydantic validation errors"""
    error_code = VALIDATION_ERROR_CODES['MISSING_REQUIRED_FIELD']
validation_exception_handler always returns MISSING_REQUIRED_FIELD for any pydantic.ValidationError, which misreports common cases (e.g., wrong type, extra fields). Consider deriving the code from exc.errors() (e.g., use MISSING_REQUIRED_FIELD only when all errors are missing, otherwise a generic VALIDATION_FAILED).
- error_code = VALIDATION_ERROR_CODES['MISSING_REQUIRED_FIELD']
- details = str(exc.errors())
+ errors = exc.errors()
+ missing_only = all(err.get('type') == 'missing' for err in errors)
+ error_code = VALIDATION_ERROR_CODES['MISSING_REQUIRED_FIELD'] if missing_only else VALIDATION_ERROR_CODES['VALIDATION_FAILED']
+ details = str(errors)
…ore rule in ruff configuration - Update docstrings across various modules to ensure consistency and clarity. - Remove unused "D" rule from ruff.toml configuration. - Enhance readability and maintainability of the codebase.
| """ | ||
| # Read file content | ||
| file_content = await file.read() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
upload_file reads the whole payload into RAM before checking MAX_FILE_SIZE, which can exhaust memory on oversized uploads. Consider streaming in chunks and abort once the limit is exceeded, then assemble the bytes only if within the limit.
- file_content = await file.read()
- file_size = len(file_content)
-
- if file_size > settings.MAX_FILE_SIZE:
- raise FileSizeExceededError(file_size, settings.MAX_FILE_SIZE)
-
+ total_size = 0
+ chunks: list[bytes] = []
+ while True:
+ chunk = await file.read(settings.CHUNK_SIZE)
+ if not chunk:
+ break
+ total_size += len(chunk)
+ if total_size > settings.MAX_FILE_SIZE:
+ raise FileSizeExceededError(total_size, settings.MAX_FILE_SIZE)
+ chunks.append(chunk)
+ file_content = b"".join(chunks)
+ file_size = total_size
from collections.abc import AsyncGenerator
from typing import Annotated

from fastapi import (
filename (and content_type) can be None/empty in both upload_file and entity_to_orm, which forwards null into non-nullable DB columns and triggers an IntegrityError on flush. Consider validating and rejecting requests with missing metadata (e.g., return 400) or applying safe defaults before constructing the UploadFileCommand or mapping to the ORM so you never insert null into non-nullable columns.
- from fastapi import (
- APIRouter,
- Depends,
- File,
- UploadFile,
- )
+ from fastapi import (
+ APIRouter,
+ Depends,
+ File,
+ UploadFile,
+ HTTPException,
+ )
+ if not file.filename or not file.filename.strip():
+ raise HTTPException(status_code=400, detail='filename is required')
…ling for consistency - Adjust import paths in test files to ensure they reference the correct locations. - Replace instances of FileNotFoundError with DomainFileNotFoundError for better clarity in exception handling. - Streamline fixture definitions and improve code readability across various test modules.
… configuration - Update docstrings across various test files for consistency and clarity. - Add new linting rules in ruff.toml for improved code quality. - Enhance readability and maintainability of the codebase by refining fixture definitions and mock implementations.
…line permission handling - Refactor the AuthenticationMiddleware to enhance error handling and response formatting. - Update permission classes to use direct Request type hints instead of string annotations. - Consolidate permission checks into FastAPI dependency wrappers for better clarity and usability. - Remove unused exception classes and error messages to clean up the codebase. - Adjust test cases to reflect changes in authentication and permission handling.
async for chunk in file_stream:
    yield chunk

return StreamingResponse(
Content-Disposition uses file_record.filename unescaped, so quotes or CR/LF in uploads can break the header or enable header injection. Consider sanitizing the filename (e.g., strip CR/LF and quotes, or use RFC 6266 filename* encoding) before building the header.
+ safe_name = (file_record.filename or '').replace('"', '').replace('\r', '').replace('\n', '')
- 'Content-Disposition': (
- f'attachment; filename="{file_record.filename}"'
- ),
+ 'Content-Disposition': f'attachment; filename="{safe_name}"',
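If non-ASCII filenames also need to survive, the RFC 6266 route mentioned above can be sketched with the standard library. The header shape mirrors the StreamingResponse call in this PR; the helper name and its integration point are assumptions.

from urllib.parse import quote

def content_disposition(filename: str | None) -> str:
    """Build a Content-Disposition value safe for quotes, CR/LF and non-ASCII."""
    name = filename or 'download'
    # ASCII fallback with dangerous characters stripped
    fallback = (
        name.encode('ascii', 'ignore').decode()
        .replace('"', '').replace('\r', '').replace('\n', '')
    ) or 'download'
    # RFC 6266 / RFC 5987 encoded form carries the full UTF-8 name
    encoded = quote(name, safe='')
    return f'attachment; filename="{fallback}"; filename*=UTF-8\'\'{encoded}'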
…exception tests - Remove unused infrastructure error codes from error_codes.py to streamline the codebase. - Update the AuthenticationMiddleware constructor to use direct FastAPI type hints for clarity. - Add new tests for validation exceptions, including file size and storage errors, to improve coverage and ensure accurate error handling.
…achment model and create FileAttachmentPermission model
| """ | ||
| self.auth_type = auth_user.auth_type | ||
| self.user_id = auth_user.user_id | ||
| self.account_id: int = auth_user.account_id # type: ignore[assignment] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# type: ignore[assignment] hides a possible None for account_id, so AuthenticatedUser.account_id: int can be None at runtime. Consider validating in __init__ and raising AuthenticationError if account_id is None instead of ignoring the type.
- self.account_id: int = auth_user.account_id # type: ignore[assignment]
+ account_id = auth_user.account_id
+ if account_id is None:
+ raise AuthenticationError(details='User not authenticated')
+ self.account_id = account_id
except httpx.RequestError as e:
    raise HttpClientError(
        url=self.base_url,
        details=MSG_EXT_014.format(details=str(e)),
details is already formatted with MSG_EXT_014 before passing to HttpClientError, which formats its own message again. This doubles the text (e.g., "... failed: HTTP request failed: ..."). Consider passing the raw exception string so it’s formatted once.
- details=MSG_EXT_014.format(details=str(e)),
+ details=str(e),
    token: Optional token string.
"""
if token:
if token: treats "" as falsy, generating a new token instead of validating length. Consider if token is not None: so empty strings are validated and raise ValueError.
- if token:
+ if token is not None:
# Convert all kwargs values to strings for ErrorResponse
str_kwargs = {
    key: str(value) if value is not None else None
    for key, value in kwargs.items()
}
to_response ignores self.kwargs, so context captured at construction (e.g., file_id, user_id) never reaches the response. Consider merging self.kwargs with method kwargs (with method args overriding) before building ErrorResponse.
- # Convert all kwargs values to strings for ErrorResponse
- str_kwargs = {
- key: str(value) if value is not None else None
- for key, value in kwargs.items()
- }
+ # Convert all kwargs values to strings for ErrorResponse
+ merged_kwargs = {**self.kwargs, **kwargs}
+ str_kwargs = {
+ key: str(value) if value is not None else None
+ for key, value in merged_kwargs.items()
+ }
def __getitem__(self, key: str) -> str | int | None:
    """Get token payload item."""
    return self.payload[key]
__getitem__’s return type doesn’t match payload’s dict[str, Any], and it raises KeyError for missing keys. Consider either documenting the dict-like KeyError behavior, or returning Optional[Any] via payload.get(key) to align the type and avoid surprises.
- def __getitem__(self, key: str) -> str | int | None:
-     """Get token payload item."""
-     return self.payload[key]
+ def __getitem__(self, key: str) -> Any | None:
+     """Get token payload item."""
+     return self.payload.get(key)
    AccountBaseMixin,
):
    class Meta:
        unique_together = ('user', 'attachment')
Soft delete conflicts with unique_together on ('user', 'attachment'). Consider including is_deleted, or use a conditional UniqueConstraint for non-deleted rows, so recreating permissions doesn’t raise IntegrityError.
- unique_together = ('user', 'attachment')
+ unique_together = ('user', 'attachment', 'is_deleted')
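The conditional UniqueConstraint alternative mentioned above looks roughly like this in Django. Field names are taken from the snippet; the base classes and the constraint name are assumptions.

from django.db import models
from django.db.models import Q

class FileAttachmentPermission(SoftDeleteModel, AccountBaseMixin):
    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=('user', 'attachment'),
                condition=Q(is_deleted=False),
                name='unique_active_attachment_permission',
                # Only live rows are unique, so re-creating a permission
                # after a soft delete no longer raises IntegrityError.
            ),
        ]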
def client(self) -> httpx.AsyncClient:
    """Lazy client initialization."""
    if self._client is None:
        self._client = httpx.AsyncClient()
Timeout mismatch. HttpTimeoutError reports 30.0s, but the httpx.AsyncClient has no timeout set. Consider configuring the client with the same timeout (or derive it from the exception/client) so the message stays accurate.
- self._client = httpx.AsyncClient()
+ self._client = httpx.AsyncClient(timeout=30.0)
return FileRecordORM(
    file_id=entity.file_id,
    filename=entity.filename,
    content_type=entity.content_type,
    size=entity.size,
    user_id=entity.user_id,
    account_id=entity.account_id,
    created_at=entity.created_at,
)
entity_to_orm() can pass None for content_type/filename into non-nullable columns, causing a NOT NULL violation on persist. Consider failing fast here (e.g., raise if either is None) or document why None is acceptable and align the schema.
- return FileRecordORM(
- file_id=entity.file_id,
- filename=entity.filename,
- content_type=entity.content_type,
- size=entity.size,
- user_id=entity.user_id,
- account_id=entity.account_id,
- created_at=entity.created_at,
- )
+ if entity.content_type is None or entity.filename is None:
+ raise ValueError("`content_type` and `filename` must not be None when persisting")
+ return FileRecordORM(
+ file_id=entity.file_id,
+ filename=entity.filename,
+ content_type=entity.content_type,
+ size=entity.size,
+ user_id=entity.user_id,
+ account_id=entity.account_id,
+ created_at=entity.created_at,
+ )
…h SeaweedFS integration in Docker configuration
…02:8002 to 8002:8000 in Docker configurations
…ame and clean up GCS S3 environment variable assignments
…l-storage' in Docker Compose files
… to Docker Compose files
…nt management, including models, serializers, views, and synchronization commands
Gets cookie domain from settings.
Extracts domain from FRONTEND_URL.
"""
frontend_url = settings.FRONTEND_URL
Accessing settings.FRONTEND_URL will raise if it’s missing. Consider using getattr(settings, 'FRONTEND_URL', None) so absent settings don’t crash this middleware.
- frontend_url = settings.FRONTEND_URL
+ frontend_url = getattr(settings, 'FRONTEND_URL', None)
Attachment.objects.get_or_create(
    file_id=old_attachment.file_id,
    defaults={
        'account': old_attachment.account,
        'access_type': old_attachment.access_type,
        'source_type': source_type,
        'task': self._get_task(old_attachment),
        'workflow': old_attachment.workflow,
        'template': None,
    },
)
stats['created'] += 1
In file_sync, only increment stats['created'] when Attachment.objects.get_or_create(...) returns created=True to avoid inflated counts.
- Attachment.objects.get_or_create(
+ _, created = Attachment.objects.get_or_create(
- stats['created'] += 1
+ stats['created'] += int(created)
# Save record to database
async with self._unit_of_work:
    await self._file_repository.create(file_record)
Integrity errors aren’t caught here. session.add() doesn’t raise; constraints fire on flush()/commit. Consider catching IntegrityError around await self._unit_of_work.commit() or calling flush() inside FileRecordRepository.create() so its try/except wraps the violation.
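A minimal sketch of the second option, flushing inside the repository so the constraint violation surfaces where the existing try/except lives. Class and exception names follow this PR; the exact wiring of entity_to_orm and the session is assumed.

from sqlalchemy.exc import IntegrityError

class FileRecordRepository:
    def __init__(self, session):
        self._session = session  # AsyncSession provided by the unit of work

    async def create(self, file_record):
        self._session.add(entity_to_orm(file_record))
        try:
            # flush() issues the INSERT now, so constraint violations
            # surface here instead of at the later commit()
            await self._session.flush()
        except IntegrityError as e:
            raise DatabaseConstraintError(details=str(e)) from e
        return file_record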
# Get main domain (e.g., pneumatic.app)
parts = hostname.split('.')
if len(parts) >= 2:
    return f'.{parts[-2]}.{parts[-1]}'
The domain reduction picks only the last two labels, which breaks on multi‑part TLDs (e.g., example.co.uk) and can target a public suffix, so the cookie is rejected. Consider returning hostname directly (or a configured domain) instead of collapsing to two labels.
- return f'.{parts[-2]}.{parts[-1]}'
+ return hostname
await self._unit_of_work.commit()

# Generate public download URL
public_url = f'{self._fastapi_base_url}/{file_record.file_id}'
public_url can contain a double slash if self._fastapi_base_url ends with /. Consider stripping the trailing slash before concatenation.
- public_url = f'{self._fastapi_base_url}/{file_record.file_id}'
+ public_url = f'{self._fastapi_base_url.rstrip("/")}/{file_record.file_id}'
user_id=int(guest_token['user_id'])
if guest_token['user_id']
else None,
Using truthiness for guest_token['user_id'] can drop a valid 0. Consider checking is not None before casting to keep 0 intact.
- user_id=int(guest_token['user_id'])
- if guest_token['user_id']
- else None,
+ user_id=int(guest_token['user_id'])
+ if guest_token['user_id'] is not None
+ else None,
return

# Assign permissions to template owners
template_owners = template.template_owners.all()
In _assign_template_permissions, use template.owners and treat owners as UserModels; iterate template.owners.all() (or template.get_owners()) and assign perms directly to each owner to avoid AttributeError.
- template_owners = template.template_owners.all()
+ template_owners = template.owners.all()
else:
    # Parameters for Google Cloud Storage via S3 API
    self._client_params = {
        'service_name': 's3',
        'endpoint_url': self._settings.GCS_S3_ENDPOINT,
        'aws_access_key_id': self._settings.GCS_S3_ACCESS_KEY,
        'aws_secret_access_key': self._settings.GCS_S3_SECRET_KEY,
        'region_name': self._settings.GCS_S3_REGION,
        'config': self._config,
    }
else applies GCS params for any non-'local' type. Consider checking explicitly for 'google' and failing for unknown types to avoid misconfig/connection errors.
- else:
-     # Parameters for Google Cloud Storage via S3 API
-     self._client_params = {
-         'service_name': 's3',
-         'endpoint_url': self._settings.GCS_S3_ENDPOINT,
-         'aws_access_key_id': self._settings.GCS_S3_ACCESS_KEY,
-         'aws_secret_access_key': self._settings.GCS_S3_SECRET_KEY,
-         'region_name': self._settings.GCS_S3_REGION,
-         'config': self._config,
-     }
+ elif self._storage_type == 'google':
+     # Parameters for Google Cloud Storage via S3 API
+     self._client_params = {
+         'service_name': 's3',
+         'endpoint_url': self._settings.GCS_S3_ENDPOINT,
+         'aws_access_key_id': self._settings.GCS_S3_ACCESS_KEY,
+         'aws_secret_access_key': self._settings.GCS_S3_SECRET_KEY,
+         'region_name': self._settings.GCS_S3_REGION,
+         'config': self._config,
+     }
+ else:
+     raise StorageError(f"Unsupported STORAGE_TYPE: {self._storage_type}")
if (auth_header.startswith('Bearer ') or
        auth_header.startswith('Token ')):
    return auth_header.split(' ')[1]
Line 54 can produce an empty token when the Authorization header is just 'Bearer ' or 'Token ' with no value after the prefix. Consider returning only a non-empty token (else None).
- return auth_header.split(' ')[1]
+ token = auth_header.split(' ', 1)[1]
+ if token.strip():
+     return token.strip()
+ return None
…t management, including model, serializer, and service tests
users_set = set()

# Get task performers via intermediate model
task_performers = task.taskperformer_set.all()
Permissions include deleted performers. Suggest filtering with task.taskperformer_set.exclude_directly_deleted() instead of .all() so soft-deleted users/groups don’t get access.
- task_performers = task.taskperformer_set.all()
+ task_performers = task.taskperformer_set.exclude_directly_deleted()
def _generate_file_id(self) -> str:
    """Generates unique file_id."""
    return get_salt(32)
_generate_file_id returns a random ID without checking collisions. With unique=True, a rare collision would raise IntegrityError on save and abort the sync. Consider generating an unused ID (retry a few times, then fail) to avoid unexpected failures.
- def _generate_file_id(self) -> str:
-     """Generates unique file_id."""
-     return get_salt(32)
+ def _generate_file_id(self) -> str:
+     """Generates unique file_id."""
+     for _ in range(5):
+         candidate = get_salt(32)
+         if not FileAttachment.objects.filter(file_id=candidate).exists():
+             return candidate
+     raise ValueError("Could not generate unique file_id")
return

# Reload workflow from DB to get latest ManyToMany fields
workflow = Workflow.objects.prefetch_related(
Refetching models via default managers excludes soft-deleted rows and can raise DoesNotExist. Suggest using _base_manager when soft-deleted records should be included, or guard with a null-safe lookup and early return.
- workflow = Workflow.objects.prefetch_related(
+ workflow = Workflow._base_manager.prefetch_related(
…ice and update tests for event and output properties
for user in users_set:
    assign_perm(
        'storage.view_file_attachment',
        user,
        self.instance,
    )
Group performers aren’t granted permissions in _assign_workflow_permissions. Only users get access, so group-only performers can’t view restricted workflow attachments. Consider also assigning storage.view_file_attachment to each performer.group, similar to _assign_task_permissions.
  for user in users_set:
      assign_perm(
          'storage.view_file_attachment',
          user,
          self.instance,
      )
+ # Assign permissions to groups
+ for task in tasks:
+     for performer in task.taskperformer_set.all():
+         if performer.group:
+             assign_perm(
+                 'storage.view_file_attachment',
+                 performer.group,
+                 self.instance,
+             )
Create a FastAPI file service and add Django storage app with attachment endpoints, permission checks, auth middleware, and object-level permissions, exposing ports 8001 and 8002
Add a new FastAPI service with upload/download APIs, S3 storage integration, auth middleware, centralized exceptions, DI, and Alembic migrations; extend Django with a src.storage app, models, serializers, views, object permissions, migrations, and endpoints for listing and checking attachment access; update docker-compose to run the file service stack and proxy via Nginx; add extensive unit, integration, and E2E tests; note two defects: a missing VALIDATION_ERROR_CODES['MISSING_REQUIRED_FIELD'] and a bad key in StorageError.bucket_create_failed.
📍 Where to Start
Start with the FastAPI app setup in src.main in storage/src/main.py, then review the API endpoints in src.presentation.api.files in storage/src/presentation/api/files.py, and the Django attachment viewset in backend/src/storage/views.py.
📊 Macroscope summarized a040982. 42 files reviewed, 61 issues evaluated, 58 issues filtered, 1 comment posted
🗂️ Filtered Issues
backend/src/processes/models/workflows/attachment.py — 0 comments posted, 1 evaluated, 1 filtered
The unique_together constraint on ('user', 'attachment') will cause IntegrityError when trying to create a new permission for the same user/attachment pair after soft-deleting the original. Since SoftDeleteModel sets is_deleted=True rather than removing the row, the database constraint still sees the old record and blocks the insert. [Already posted]
backend/src/processes/services/attachments.py — 0 comments posted, 1 evaluated, 1 filtered
The check_user_permission method uses prefetch_related('permissions') and accesses attachment.permissions.filter(user_id=user_id), but the FileAttachment model has no permissions relation defined. This will raise an AttributeError at runtime when attempting to access .permissions on a FileAttachment instance with access_type == FileAttachmentAccessType.RESTRICTED. [Low confidence]
backend/src/processes/views/attachments.py — 0 comments posted, 1 evaluated, 1 filtered
The AttachmentService() instantiation without any parameters in the check_permission action uses src.processes.services.attachments.AttachmentService, which has a check_user_permission method that queries the FileAttachment model with a file_id field. However, the serializer FileAttachmentCheckPermissionSerializer has max_length=255 while the Attachment model in storage/models.py has file_id with max_length=64. If the same file_id values are expected to be used across both systems, there's a potential mismatch that could cause truncation or lookup failures. [Low confidence]
backend/src/storage/middleware.py — 0 comments posted, 3 evaluated, 3 filtered
auth_header.split(' ')[1] will return an empty string if the header is exactly 'Bearer ' or 'Token ' (with trailing space but no token value), causing an empty auth token to be set in the cookie. Consider validating the token is non-empty before using it. [Already posted]
settings.FRONTEND_URL on line 69 will raise AttributeError if FRONTEND_URL is not defined in Django settings. The check on line 71 (if not frontend_url:) only handles None or empty string values, but not a missing attribute. Should use getattr(settings, 'FRONTEND_URL', None) instead. [Already posted]
The domain reduction (f'.{parts[-2]}.{parts[-1]}') incorrectly handles multi-part TLDs like example.co.uk or app.example.com.br, returning .co.uk or .com.br (public suffixes) instead of the intended domain. This would cause the cookie to be set on a public suffix domain, which browsers will reject, resulting in the cookie not being set. [Already posted]
backend/src/storage/migrations/0002_migrate_file_attachments.py — 0 comments posted, 3 evaluated, 3 filtered
The migration iterates over FileAttachment records using a queryset in a for loop (line 21) without using .iterator(). For databases with large numbers of records, this loads all objects into memory at once, potentially causing MemoryError or excessive memory consumption that could crash the migration process. [Low confidence]
There is an inconsistency between determine_source_type and get_task_from_attachment for the output case: when attachment.output exists but attachment.output.task is None or the task attribute doesn't exist, determine_source_type returns 'Task' (line 60-61 in referenced code) but get_task_from_attachment returns None. This creates Attachment records with source_type='Task' but task=None, which may violate data integrity constraints or cause issues in code that assumes task-sourced attachments have a valid task reference. [Already posted]
ignore_conflicts=True on line 47 silently discards any records that conflict on unique constraints. If the migration is run multiple times or if there are unexpected duplicates, data loss could occur without any warning or logging, making debugging migration issues difficult. [Low confidence]
backend/src/storage/models.py — 0 comments posted, 1 evaluated, 1 filtered
The Attachment model has no validation ensuring that the source_type field matches the non-null foreign key. For example, setting source_type=SourceType.TASK while task=None and workflow is set would create an inconsistent record that could cause unexpected behavior when querying or processing attachments based on source_type. [Low confidence]
backend/src/storage/services/attachments.py — 1 comment posted, 8 evaluated, 7 filtered
Task.objects.get(id=task.id) uses BaseSoftDeleteManager which filters is_deleted=False. If the task is soft-deleted between the attachment creation and this permission assignment call (race condition), this will raise a Task.DoesNotExist exception that is not handled, causing the permission assignment to fail entirely. [Already posted]
The code calls task.taskperformer_set.all() without filtering out performers with directly_status=DirectlyStatus.DELETED. The Task model has an exclude_directly_deleted_taskperformer_set() method specifically for this purpose. This could grant file access permissions to users or groups that were explicitly removed as performers. [Already posted]
The code accesses template.template_owners. The TemplateOwner model defines related_name='owners' on its template ForeignKey, so the correct accessor is template.owners, not template.template_owners. This will raise AttributeError. [Already posted]
Workflow.objects.get(id=workflow.id) uses BaseSoftDeleteManager which filters is_deleted=False. If the workflow is soft-deleted between the attachment creation and this permission assignment call (race condition), this will raise a Workflow.DoesNotExist exception that is not handled, causing the permission assignment to fail entirely. [Already posted]
In _assign_workflow_permissions, workflow.tasks.all() is called at line 141 but taskperformer_set is not prefetched for those tasks. Then line 143 calls task.taskperformer_set.all() for each task, causing N+1 database queries. While not a crash, this could cause significant performance degradation with many tasks. [Low confidence]
bulk_create with ignore_conflicts=True does not populate primary keys on returned objects in many database backends. When iterating over created_attachments at line 228, the attachment objects may have id=None, causing _create_related() to potentially fail when assigning permissions to an attachment without a valid database ID. [Already posted]
user.account in get_user_attachments will raise AttributeError if the user object doesn't have an account attribute, or could return incorrect results if user.account is None, matching attachments with account=NULL. [Low confidence]
backend/src/storage/services/file_sync.py — 0 comments posted, 4 evaluated, 3 filtered
The _generate_file_id method generates a random 32-character string using get_salt(32), but the file_id field on FileAttachment has unique=True. If a collision occurs (unlikely but possible), attachment.save(update_fields=['file_id']) in sync_all_files will raise an IntegrityError that is not caught, causing the sync to fail unexpectedly. [Already posted]
The Attachment.objects.get_or_create() call on line 144 returns a tuple (instance, created_bool), but the return value is ignored. The code unconditionally increments stats['created'] on line 155 even when the record already existed and was just retrieved (i.e., when created_bool is False). This causes incorrect statistics reporting. [Already posted]
In sync_to_new_attachment_model, the stats['created'] counter is incremented regardless of whether get_or_create actually created a new record or returned an existing one. This results in inaccurate statistics since existing records will be counted as created. [Already posted]
backend/src/storage/utils.py — 0 comments posted, 3 evaluated, 3 filtered
The code accesses source.description on the Task model. The Task model has description_template and clear_description fields but no description attribute, causing AttributeError at runtime. [Already posted]
The code accesses source.description on the Workflow model. The Workflow model does not have a description attribute, causing AttributeError at runtime. [Already posted]
The code accesses source.description on the Template model. The Template model does not have a description attribute, causing AttributeError at runtime. [Already posted]
backend/src/storage/views.py — 0 comments posted, 2 evaluated, 2 filtered
In get_queryset, AttachmentService is instantiated with user=self.request.user, but the AttachmentService class in storage/services/attachments.py inherits from BaseModelService and does not define __init__ to accept a user parameter. If BaseModelService does not handle arbitrary keyword arguments, this will raise a TypeError. Even if it silently ignores the kwarg, self.user won't be set on the service instance. [Low confidence]
In check_permission, AttachmentService is instantiated with user=request.user, but similar to the issue above, if BaseModelService doesn't properly set this attribute, the check_user_permission method's optimization path (hasattr(self, 'user') and self.user and self.user.id == user_id) will fail to work correctly, falling back to an extra database query unnecessarily, or worse, self.user may not be properly initialized. [Already posted]
storage/src/application/use_cases/file_upload.py — 0 comments posted, 3 evaluated, 3 filtered
In UploadFileUseCase.execute(), the file is uploaded to S3 first (lines 68-73), then the database record is created (lines 76-78). If the database commit fails (e.g., constraint violation, connection error), the file remains orphaned in S3 with no cleanup. On retry, a new file_id is generated via uuid4(), leaving the original orphan file permanently in storage with no record pointing to it (see the sketch after this file's entries). [Already posted]
The FileRecordRepository.create() method wraps session.add() in try-except for IntegrityError, but session.add() is a synchronous operation that doesn't immediately flush to the database. Constraint violations are only raised during flush() or commit(). Since UploadFileUseCase calls await self._unit_of_work.commit() outside the repository's try-except block, any IntegrityError will bubble up as a raw SQLAlchemy exception rather than being caught and wrapped in DatabaseConstraintError. [Already posted]
If self._fastapi_base_url ends with a trailing slash, the generated public_url at line 81 will contain a double slash (e.g., https://example.com//file-id). While many HTTP clients handle this, it produces non-canonical URLs that could cause issues with URL matching or caching. [Already posted]
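A hedged sketch of the compensating cleanup mentioned in the first item above. The use-case and unit-of-work names follow this PR; the command fields, bucket attribute, _build_record helper, and the storage service's delete_file method are assumptions.

class UploadFileUseCase:
    async def execute(self, command):
        file_record = self._build_record(command)  # hypothetical helper

        # Upload the object first
        await self._storage_service.upload_file(
            bucket_name=self._bucket,
            file_path=file_record.file_id,
            file_content=command.content,
            content_type=command.content_type,
        )

        # Persist metadata; on failure, remove the just-uploaded object
        try:
            async with self._unit_of_work:
                await self._file_repository.create(file_record)
                await self._unit_of_work.commit()
        except Exception:
            # Best-effort compensation so no orphan is left in the bucket
            await self._storage_service.delete_file(  # assumed to exist
                bucket_name=self._bucket,
                file_path=file_record.file_id,
            )
            raise
        return file_record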
storage/src/infra/http_client.py — 0 comments posted, 3 evaluated, 3 filtered
httpx.AsyncClient() is created without a timeout configuration, meaning HTTP requests can hang indefinitely. When httpx.TimeoutException is caught, the code reports timeout=30.0 but this is misleading since no timeout is actually configured on the client. The exception would only be raised if the server itself closes the connection. [Already posted]
When user.auth_type is UserType.ANONYMOUS, no authorization headers are set and the request proceeds silently. If the backend requires some form of identification for anonymous users, this will result in permission check failures that may be difficult to debug since no explicit handling exists for this auth type. [Low confidence]
When user.auth_type is UserType.AUTHENTICATED but user.token is None or an empty string, the condition on line 49 evaluates to False and no Authorization header is sent. This could cause silent authentication failures where an authenticated user's request is treated as unauthenticated. [Already posted]
storage/src/infra/repositories/file_record_repository.py — 0 comments posted, 1 evaluated, 1 filtered
The error handling in create() is ineffective: session.add() only stages the ORM object in the session; it does not perform any database I/O. The IntegrityError, OperationalError, and SQLAlchemyDatabaseError exceptions will only be raised when session.flush() or session.commit() is called (outside this method), so these except blocks will never catch database errors. [Already posted]
storage/src/infra/repositories/storage_service.py — 0 comments posted, 8 evaluated, 8 filtered
When _storage_type is 'google', the code correctly sets signature_version='s3' on line 30, but the else branch (lines 51-60) is used for any non-'local' storage type, not just 'google'. If a different storage type value is configured, GCS-specific parameters (GCS_S3_ENDPOINT, etc.) will be used incorrectly, potentially causing connection failures. [Already posted]
StorageError is instantiated incorrectly on line 119-121. The constructor expects an error_code_key (a key from the DOMAIN_ERROR_CODES dictionary), but a formatted message string is passed instead. This will cause a KeyError when DOMAIN_ERROR_CODES[error_code_key] is executed. Should use StorageError.bucket_create_failed(details=str(e)) instead. [Already posted]
content_type can be None (line 92), but it's passed directly to s3.put_object(ContentType=content_type) on line 128. Some S3-compatible backends may reject None for ContentType, causing a ClientError or TypeError. Consider omitting the parameter when content_type is None or defaulting to 'application/octet-stream'. [Already posted]
StorageError is instantiated incorrectly on line 131-133. The constructor expects an error_code_key, but MSG_STORAGE_010.format(details=str(e)) passes a formatted message string. This will raise a KeyError at runtime. Should use StorageError.upload_failed(details=str(e)) instead. [Already posted]
StorageError is instantiated incorrectly on line 174. A plain error message string is passed instead of an error_code_key. This will cause a KeyError. Should use StorageError.file_not_found_in_storage(file_path=file_path, bucket_name=bucket_name) instead. [Already posted]
The StorageError constructor expects an error_code_key (e.g., 'STORAGE_FILE_NOT_FOUND') as the first argument, but a formatted message string is passed instead. This will cause a KeyError when DOMAIN_ERROR_CODES[error_code_key] is accessed. Should use StorageError.file_not_found_in_storage(file_path, bucket_name) instead. [Already posted]
StorageError is instantiated incorrectly on line 175-176. MSG_STORAGE_011.format(details=str(e)) passes a formatted message string instead of an error_code_key. This will raise a KeyError. Should use StorageError.download_failed(details=str(e)) instead. [Already posted]
The StorageError constructor expects an error_code_key as the first argument, but MSG_STORAGE_011.format(details=str(e)) passes a formatted message string. This will cause a KeyError when looking up the key in DOMAIN_ERROR_CODES. Should use StorageError.download_failed(str(e)) instead. [Already posted]
storage/src/presentation/api/files.py — 0 comments posted, 2 evaluated, 2 filtered
The media_type parameter is set to file_record.content_type which can be None per the FileRecord entity. While StreamingResponse may handle None gracefully, this could result in responses without a proper Content-Type header, potentially causing client-side issues when processing the file. [Already posted]
The Content-Disposition header is constructed using file_record.filename which can be None according to the FileRecord entity definition. This will result in a malformed header value attachment; filename="None" being sent to clients. [Already posted]
storage/src/shared_kernel/auth/guest_token.py — 0 comments posted, 1 evaluated, 1 filtered
The __getitem__ method return type annotation is str | int | None, but self.payload is typed as dict[str, Any], so the actual return value could be any type. Additionally, accessing a non-existent key will raise KeyError with no indication in the signature, which may surprise callers expecting None for missing keys. [Already posted]
storage/src/shared_kernel/auth/public_token.py — 0 comments posted, 1 evaluated, 1 filtered
if token: on line 22 treats an empty string "" as falsy, causing a new token to be generated instead of validating it. If a caller passes an empty string, they likely expect a ValueError for incorrect length (0 != expected), not silent token generation. The condition should be if token is not None: to match the type hint str | None and properly validate empty strings. [Already posted]
storage/src/shared_kernel/auth/redis_client.py — 0 comments posted, 2 evaluated, 2 filtered
The declared return type dict[str, Any] | None may not match what pickle.loads(value) actually returns. If Redis contains pickled data that isn't a dict (e.g., a list, string, or custom object), the method will return a value that doesn't match its declared type, potentially causing AttributeError or TypeError in callers that expect a dict. [Low confidence]
pickle.loads() can raise many exceptions beyond pickle.UnpicklingError, including AttributeError, ImportError, ModuleNotFoundError, EOFError, and TypeError (e.g., when the pickled object's class is unavailable or the data is corrupted). These exceptions will propagate uncaught instead of being wrapped in RedisOperationError, causing unexpected crashes. The except block on line 50 should catch a broader set of exceptions, such as Exception or at minimum add the common deserialization failures. [Already posted]
storage/src/shared_kernel/auth/services.py — 0 comments posted, 1 evaluated, 1 filtered
The authenticate_public_token method receives the raw cookie value from request.cookies.get('public-token') and passes it directly to PublicAuthService.get_token(). However, get_token() expects a header format like "Token <actual_token>" (checking for parts[0] != cls.HEADER_PREFIX on line 29). Cookie values don't have this prefix, so the check len(parts) != cls.HEADER_PARTS or parts[0] != cls.HEADER_PREFIX will always return None for cookie-based authentication. [Already posted]
storage/src/shared_kernel/auth/token_auth.py — 0 comments posted, 1 evaluated, 1 filtered
The encrypt method is declared as async but contains no await expressions - it only performs synchronous hashlib.pbkdf2_hmac() operations. While this works, it unnecessarily wraps synchronous CPU-bound work in a coroutine, which could cause performance issues under load as it won't yield to the event loop during the potentially expensive PBKDF2 computation. [Low confidence]
storage/src/shared_kernel/database/models.py — 0 comments posted, 2 evaluated, 2 filtered
FileRecordORM.content_type is defined with nullable=False but FileRecord.content_type is typed as str | None. When entity_to_orm maps an entity with content_type=None, the database will raise an IntegrityError on flush/commit. [Already posted]
FileRecordORM.filename is defined with nullable=False but FileRecord.filename is typed as str | None. When entity_to_orm maps an entity with filename=None, the database will raise an IntegrityError on flush/commit. [Already posted]
storage/src/shared_kernel/exceptions/base_exceptions.py — 0 comments posted, 1 evaluated, 1 filtered
The to_response method accepts arbitrary **kwargs and passes them to ErrorResponse, but ErrorResponse is a dataclass that only accepts specific fields (timestamp, request_id). Passing any other keyword argument will raise a TypeError at runtime (e.g., to_response(user_id=123) would fail with TypeError: __init__() got an unexpected keyword argument 'user_id'). [Already posted]
storage/src/shared_kernel/exceptions/exception_handler.py — 0 comments posted, 1 evaluated, 1 filtered
validation_exception_handler always uses VALIDATION_ERROR_CODES['MISSING_REQUIRED_FIELD'] regardless of the actual validation error type. Pydantic ValidationError can contain many different error types (type errors, value errors, constraint violations, etc.), so the error code and message returned may be misleading or incorrect for non-missing-field validation failures. [Already posted]
storage/src/shared_kernel/middleware/auth_middleware.py — 0 comments posted, 4 evaluated, 4 filtered
In authenticate_guest_token, accessing guest_token['user_id'] and guest_token['account_id'] (lines 122-127) will raise KeyError if these keys are missing from the JWT payload, since GuestToken.__getitem__ directly accesses self.payload[key]. While this is caught by the except block, it means any JWT without these specific claims will silently fail authentication even if it's otherwise valid. [Low confidence]
if guest_token['user_id'] (line 123) uses truthiness to decide whether to convert to int. If user_id in the JWT payload is a falsy but valid value like 0 or "0" (empty string after stripping), the user_id will incorrectly be set to None instead of the actual value. [Already posted]
The authenticate_public_token method passes raw_token directly to PublicAuthService.get_token(), but get_token() expects a header string in the format "Token <token>" (it splits the input and checks parts[0] != cls.HEADER_PREFIX). When authenticating via the 'public-token' cookie (line 87), the cookie value is just the raw token without the "Token " prefix, causing get_token() to always return None and public token cookie authentication to silently fail. [Already posted]
The to_dict() method is called on the ErrorResponse object returned by exc.to_response(), but ErrorResponse is not defined in the visible code and may not have a to_dict() method. This would cause an AttributeError at runtime when authentication fails. [Low confidence]
storage/src/shared_kernel/permissions.py — 0 comments posted, 1 evaluated, 1 filtered
CombinedPermissions.get_error_message() always returns a generic 'Access denied' message, but when a permission check fails, the specific permission that failed has its own get_error_message() which is more informative. The __call__ method in BasePermission uses self.get_error_message(), so when CombinedPermissions is called and fails, the user gets a generic message rather than the specific error from the failing permission (e.g., 'Authentication required' from IsAuthenticated). [Low confidence]
storage/src/shared_kernel/uow/unit_of_work.py — 0 comments posted, 1 evaluated, 1 filtered
In UnitOfWork.commit(), calling await self._session.commit() while inside the session.begin() context manager commits the transaction. The subsequent __aexit__ call from the context manager may behave unexpectedly if the session's transaction state tracking doesn't align with SQLAlchemy's expected lifecycle. This could cause issues in error scenarios where the explicit commit fails but __aexit__ still runs. [Already posted]