Query Optimizations #482

JSv4 · 2025-09-29T03:58:18Z

Substantial Changes:

- Built an annotation query optimizer to reduce load times for annotations.
- Dropped per-object permissions for annotations, as these were (a) not used anywhere and (b) adding significant overhead with no benefit.
- Revamped permission-handling functions to follow the new annotation permission conventions.
- Improved documentation of permission conventions and pathways.

- Added new environment variables for document processing services in `.envs/.test/.django`.
- Removed unused `.claude/settings.local.json` file.
- Updated `.gitignore` to include additional files.
- Enhanced `README.md` with more detailed project description.
- Updated various GitHub Actions workflows to use the latest `actions/checkout` version.
- Introduced new GraphQL resolvers and optimized file handling in the backend.
- Added performance optimizations and new test cases for document handling.

- Updated `docker-compose.yml` to improve service dependencies with health checks for `postgres` and `redis`.
- Removed `celeryworker` from dependencies to prevent circular dependency issues.
- Enhanced GraphQL resolvers for optimized document annotations, adding filter checks and improving permission handling.
- Deleted unused progressive types and related GraphQL fields to streamline the schema.

- Introduced `created_by_analysis` and `created_by_extract` fields in the Annotation model to enforce privacy for annotations created by analyses and extracts.
- Implemented validation to ensure an annotation cannot be created by both an analysis and an extract.
- Enhanced the AnnotationQueryOptimizer to filter annotations based on user permissions, ensuring private annotations are only visible to users with the appropriate access.
- Updated GraphQL resolvers to respect the new permission model, requiring both object and corpus permissions for visibility.
- Added comprehensive tests to validate the new privacy model and permission checks for analyses and extracts.
- Updated documentation to reflect changes in the permissioning system and the new annotation privacy features.
- Created a debug script for testing permission checks in a controlled environment.
- Optimized existing queries to improve performance and reduce unnecessary database hits.
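
The mutual-exclusion rule on the two privacy fields can be illustrated with a small sketch. This is not the project's model code: `Annotation` here is a plain dataclass standing in for the Django model, `ValidationError` is a local stand-in for Django's exception, and the `is_private` helper is a hypothetical name introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


class ValidationError(ValueError):
    """Local stand-in for django.core.exceptions.ValidationError."""


@dataclass
class Annotation:
    # Hypothetical, simplified stand-in for the real Django model.
    raw_text: str
    created_by_analysis: Optional[int] = None  # id of the source analysis, if any
    created_by_extract: Optional[int] = None   # id of the source extract, if any

    def clean(self) -> None:
        # An annotation may come from an analysis or an extract, never both.
        if self.created_by_analysis is not None and self.created_by_extract is not None:
            raise ValidationError(
                "An annotation cannot be created by both an analysis and an extract."
            )

    @property
    def is_private(self) -> bool:
        # Assumption: any annotation with a source analysis or extract is private.
        return self.created_by_analysis is not None or self.created_by_extract is not None
```

In a Django model the same check would normally live in `clean()` and be backed by a database constraint, so that invalid rows are rejected even when validation is bypassed.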

- Added detailed permission checks for `RemoveAnnotation`, `RejectAnnotation`, `ApproveAnnotation`, and `AddRelationship` mutations to ensure users have the appropriate access rights based on the privacy model.
- Introduced informative messages in mutation responses to clarify permission issues and operation outcomes.
- Implemented comprehensive tests for annotation permission handling, ensuring that privacy rules are respected across different user roles.
- Updated the `user_has_permission_for_obj` function to include special handling for annotations with privacy fields, enhancing permission validation logic.

- Replaced the deprecated `resolve_oc_model_queryset` function with a more secure and maintainable `visible_to_user` method across various models.
- Updated GraphQL resolvers to utilize the new permission logic, ensuring users can only access objects they are permitted to see.
- Enhanced the `BaseVisibilityManager` to provide consistent permission filtering for all models.
- Removed legacy resolver code and improved test coverage for visibility checks, ensuring robust permission handling.
- Updated documentation to reflect the changes in permissioning logic and the deprecation of the old resolver function.
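
A rough sketch of the kind of filtering a `visible_to_user` method performs, written against plain Python objects rather than a Django queryset. The `User` and `Record` classes, the `readers` set, and the exact rules (anonymous users see public objects, superusers see everything, other users see public, owned, or explicitly shared objects) are assumptions made for illustration and may differ from the actual `BaseVisibilityManager`.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class User:
    id: int
    is_superuser: bool = False


@dataclass
class Record:
    id: int
    creator_id: int
    is_public: bool = False
    # Ids of users granted explicit read access (hypothetical stand-in for
    # per-object permission rows).
    readers: Set[int] = field(default_factory=set)


def visible_to_user(records: Iterable[Record], user: Optional[User]) -> List[Record]:
    """Return only the records the given user may see."""
    records = list(records)
    if user is None:  # anonymous: public objects only
        return [r for r in records if r.is_public]
    if user.is_superuser:  # superusers see everything
        return records
    return [
        r for r in records
        if r.is_public or r.creator_id == user.id or user.id in r.readers
    ]
```

Centralizing the rule in one method means every resolver applies the same filter instead of re-implementing it.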

- Removed IMAGE_PREFIX environment variable and replaced it with a step to convert the repository owner to lowercase.
- Updated image tagging to use the lowercase repository owner, ensuring consistent naming conventions for built images.
- Enhanced workflow clarity by streamlining the environment variable usage.

…ermission handling
- Added a custom resolver for the annotations field in the CorpusType class to compute permissions accurately using AnnotationQueryOptimizer.
- Introduced a new migration to modify constraints on the Annotation model, ensuring annotations can only be created by one source at a time.
- Updated UserFeedbackManager to delegate visibility checks to the queryset's method.
- Enhanced UserFeedbackQuerySet to refine visibility logic for user feedback based on creator and public status.
- Added comprehensive tests for permission filtering across corpuses, documents, annotations, and labels, ensuring robust access control.

…reuse
- Updated PooledS3Boto3Storage to improve performance by reusing boto3 S3 clients within threads and configuring connection pool size for concurrent operations.
- Added retry logic for resilience during S3 operations.
- Removed the redundant OptimizedS3Boto3Storage class and its related functionality to streamline the codebase.
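
The per-thread client reuse and retry behavior described above can be sketched without any S3 dependency. `ThreadLocalClientPool` and `with_retries` are hypothetical names, not the API of `PooledS3Boto3Storage`. In a real storage backend the factory would presumably build a boto3 S3 client, with the pool size set through `botocore.config.Config(max_pool_connections=...)`.

```python
import threading
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThreadLocalClientPool:
    """Hand each thread its own lazily created client and reuse it."""

    def __init__(self, factory: Callable[[], object]) -> None:
        self._factory = factory
        self._local = threading.local()

    def get(self) -> object:
        # Create the client on first use in this thread, then reuse it.
        client = getattr(self._local, "client", None)
        if client is None:
            client = self._factory()
            self._local.client = client
        return client


def with_retries(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.0) -> T:
    """Call operation, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")
```

Keeping one client per thread avoids the cost of rebuilding a client for every file operation without sharing a client object across threads.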

codecov · 2025-09-30T04:51:35Z

Codecov Report

❌ Patch coverage is 93.57457% with 190 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| opencontractserver/annotations/query_optimizer.py | 77.31% | 66 Missing ⚠️ |
| ...erver/tests/performance_optimizations/test_base.py | 38.70% | 38 Missing ⚠️ |
| ...encontractserver/tests/test_visibility_managers.py | 78.01% | 31 Missing ⚠️ |
| opencontractserver/utils/storage_warming.py | 36.84% | 24 Missing ⚠️ |
| opencontractserver/shared/Managers.py | 83.60% | 10 Missing ⚠️ |
| opencontractserver/annotations/models.py | 40.00% | 6 Missing ⚠️ |
| opencontractserver/tests/test_resolvers.py | 91.11% | 4 Missing ⚠️ |
| ...s/migrations/0039_add_annotation_privacy_fields.py | 76.92% | 3 Missing ⚠️ |
| opencontractserver/analyzer/signals.py | 60.00% | 2 Missing ⚠️ |
| opencontractserver/utils/permissioning.py | 95.55% | 2 Missing ⚠️ |
... and 4 more


…k mutations permission checks
- Introduced a new COMMENT permission type, allowing users to comment on annotations and relationships based on document and corpus permissions.
- Updated RejectAnnotation and ApproveAnnotation mutations to enforce COMMENT permission checks, ensuring only authorized users can provide feedback.
- Enhanced permission handling in the AnnotationQueryOptimizer to support the new COMMENT permission logic, including a special mode for open commenting.
- Added comprehensive tests to validate the new COMMENT permission system and its integration with existing feedback functionalities.
- Updated documentation to reflect changes in permissioning logic and the introduction of the COMMENT permission.
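
One plausible shape for a COMMENT check is sketched below. The commit message does not spell out the exact rule, so everything here is an assumption: that commenting requires the permission on both the document and the corpus, and that the "open commenting" mode lets any user with read access on both comment. `Grants` and `can_comment` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Grants:
    # Permission names a user holds on the document and on its corpus.
    document: FrozenSet[str]
    corpus: FrozenSet[str]


def can_comment(grants: Grants, open_commenting: bool = False) -> bool:
    """Decide whether a user may approve, reject, or comment on an annotation."""
    # Assumption: a user who cannot read both objects can never comment.
    can_read = "read" in grants.document and "read" in grants.corpus
    if not can_read:
        return False
    # Assumption: open commenting waives the explicit COMMENT grant.
    if open_commenting:
        return True
    return "comment" in grants.document and "comment" in grants.corpus
```

A mutation such as `ApproveAnnotation` would call a check like this first and return an explanatory message when it fails.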

- Added model-specific optimizations in the PermissionedTreeQuerySet for the "corpus" model, utilizing select_related and prefetch_related to enhance database query efficiency.
- Introduced new tests to validate the optimizations applied in the Corpus and Document models, ensuring that the expected queryset behaviors are maintained.
- Enhanced existing test coverage for visibility checks, confirming that optimizations do not affect permission logic.

- Introduced a new COMMENT permission for the Analysis model, allowing users to comment on analyses.
- Created a migration to update the model options and include the new permission in the permissions list.
- Ensured consistency with existing permission structures in the application.

…g migrations
- Introduced COMMENT permissions for multiple models including GremlinEngine, Analyzer, AnnotationLabel, LabelSet, Note, Conversation, ChatMessage, CorpusQuery, CorpusAction, DocumentAnalysisRow, DocumentRelationship, Column, Datacell, Extract, Fieldset, UserFeedback, Assignment, UserExport, and UserImport.
- Created migrations to update model options and include the new COMMENT permissions in the permissions list for each model.
- Ensured consistency with existing permission structures throughout the application.

- Introduced a new test method to validate various annotation permission types including CREATE, CRUD, ALL, and unsupported permissions.
- Ensured that permissions are correctly enforced for users with and without access, enhancing coverage of the permissioning logic.

…ive tests
- Updated comments in BaseVisibilityManager to clarify the purpose of the top-level permission logic.
- Introduced a new test suite for BaseVisibilityManager, ensuring full coverage of the visible_to_user method across various user scenarios and permissions.
- Enhanced test cases to validate edge cases, including handling of anonymous users, superusers, and authenticated users with specific permissions.

…iltering
- Introduced a new test suite for Query Optimizer methods, covering AnnotationQueryOptimizer, RelationshipQueryOptimizer, and ExtractQueryOptimizer.
- Validated permission-based access control for various user roles, including owners, collaborators, strangers, and superusers.
- Ensured thorough testing of extract visibility, relationship retrieval, and annotation summaries based on user permissions and document access.
- Enhanced test coverage for edge cases, including handling of non-existent extracts and structural relationships.

Signed-off-by: JSIV <5049984+JSv4@users.noreply.github.com>

- Added time freezing to ensure consistent rate limit testing across corpuses, documents, and labelsets queries.
- Improved assertions to include the number of requests made before hitting the rate limit, providing clearer feedback on test outcomes.
- Refactored permission setting in query optimizer tests for better readability and consistency.

- Added failure messages to ensure clarity when expected rate limits are not hit.
- Implemented time freezing in tests to maintain consistent request timing.
- Enhanced assertions to validate that both heavy and light queries hit their respective rate limits within the expected window.
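
Why freezing time matters for these tests can be shown with a toy limiter. The sketch below uses `unittest.mock.patch` from the standard library to pin `time.time`; the PR's tests may use a different freezing tool. `FixedWindowLimiter` and `requests_until_limited` are hypothetical helpers, not the project's rate-limiting code.

```python
import time
from unittest import mock


class FixedWindowLimiter:
    """Allow at most `limit` calls per `window_seconds` wall-clock window."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._window = None
        self._count = 0

    def allow(self) -> bool:
        window = int(time.time() // self.window_seconds)
        if window != self._window:  # a new window resets the counter
            self._window, self._count = window, 0
        self._count += 1
        return self._count <= self.limit


def requests_until_limited(limiter: FixedWindowLimiter, max_requests: int) -> int:
    """Return the number of requests allowed before the first rejection."""
    for made in range(max_requests):
        if not limiter.allow():
            return made
    return max_requests


def test_limit_hit_within_window() -> None:
    # With the clock pinned, every request falls inside the same window.
    with mock.patch("time.time", return_value=1_700_000_000.0):
        limiter = FixedWindowLimiter(limit=5, window_seconds=60)
        made = requests_until_limited(limiter, max_requests=20)
    assert made == 5, f"expected the limit after 5 requests, got {made}"
```

Without the pinned clock, a test that happens to straddle a window boundary would see the counter reset and fail intermittently. Including the request count in the failure message makes such failures easier to diagnose.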

JSv4 added 10 commits September 26, 2025 22:14

Cleanup. Removed stray test script. Cleaned up code.

abc6788

Merge branch 'main' into JSv4/query-optimizations

ff9cd7c

JSv4 added 4 commits September 30, 2025 01:31

JSv4 linked an issue Oct 1, 2025 that may be closed by this pull request

[FEATURE] - PDF Caching #452

Closed

JSv4 and others added 6 commits September 30, 2025 23:10

Merge branch 'main' into JSv4/query-optimizations

2d9f057

Signed-off-by: JSIV <5049984+JSv4@users.noreply.github.com>

JSv4 merged commit 62fc4e5 into main Oct 3, 2025
12 checks passed

JSv4 deleted the JSv4/query-optimizations branch October 3, 2025 12:53
