Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(agents-api): fix list docs #1094

Merged
merged 2 commits into from
Jan 27, 2025
Merged

fix(agents-api): fix list docs #1094

merged 2 commits into from
Jan 27, 2025

Conversation

Ahmad-mtos
Copy link
Contributor

@Ahmad-mtos Ahmad-mtos commented Jan 25, 2025

PR Type

Bug fix


Description

  • Fixed a join condition in list_docs query.

  • Improved query logic for better data accuracy.


Changes walkthrough 📝

Relevant files
Bug fix
list_docs.py
Fix join condition in `list_docs` query                                   

agents-api/agents_api/queries/docs/list_docs.py

  • Added an additional condition to the join with docs_embeddings.
  • Ensures the index field is matched in the join.
  • +1/-0     

    Need help?
  • Type /help how to ... in the comments thread for any questions about Qodo Merge usage.
  • Check out the documentation for more information.

  • Important

    Fix join condition in list_docs.py to ensure accurate data retrieval by matching index field.

    • Bug Fix:
      • Fix join condition in list_docs.py by adding d.index = e.index to the query.
      • Ensures accurate data retrieval by matching index field in join with docs_embeddings.

    This description was created by Ellipsis for ba4ec16. It will automatically update as commits are pushed.

    Copy link
    Contributor

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 1 🔵⚪⚪⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Data Consistency

    The new join condition with index field could potentially filter out valid records if the indices don't match exactly between docs and docs_embeddings tables. Verify this is the intended behavior.

    AND d.index = e.index

    Copy link
    Contributor

    @ellipsis-dev ellipsis-dev bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    👍 Looks good to me! Reviewed everything up to 8367d6a in 8 seconds

    More details
    • Looked at 12 lines of code in 1 files
    • Skipped 0 files when reviewing.
    • Skipped posting 1 drafted comments based on config settings.
    1. agents-api/agents_api/queries/docs/list_docs.py:38
    • Draft comment:
      Ensure that the added condition AND d.index = e.index in the SQL join does not introduce logical errors or performance issues. Verify that index is a valid field for joining and that it aligns with the intended logic of the query.
    • Reason this comment was not posted:
      Confidence changes required: 50%
      The PR adds a condition to the SQL query to join on both doc_id and index. This change seems to be intended to ensure that the join is more precise by matching both fields. However, it's important to ensure that this change does not introduce any logical errors or performance issues.

    Workflow ID: wflow_mYihH7nWVqyP4wFo


    You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

    Copy link
    Contributor

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    CategorySuggestion                                                                                                                                    Score
    Security
    Prevent data leakage across developers

    Add a condition to ensure the join with docs_embeddings also matches on developer_id
    to prevent potential data leakage across different developers.

    agents-api/agents_api/queries/docs/list_docs.py [36-38]

     LEFT JOIN docs_embeddings e
         ON d.doc_id = e.doc_id
         AND d.index = e.index
    +    AND d.developer_id = e.developer_id
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: The suggestion addresses a critical security concern by preventing potential data leakage between different developers through proper join conditions, ensuring each developer can only access their own document embeddings.

    9

    Copy link
    Contributor

    qodo-merge-pro-for-open-source bot commented Jan 25, 2025

    CI Feedback 🧐

    (Feedback updated until commit ba4ec16)

    A test triggered by this PR failed. Here is an AI-generated analysis of the failure:

    Action: Typecheck

    Failed stage: Typecheck [❌]

    Failed test name: agents_api.common.protocol.models

    Failure summary:

    The pytype static type checker failed due to an attribute error in the file
    agents_api/common/protocol/models.py:

  • On line 47, the code attempts to call .load() on an object of type dict[str, Any], but dictionaries
    don't have a load method
  • The error occurs in the load_arguments function where self.arguments.load() is called
  • The type checker expected self.arguments to be either a RemoteObject or dict[str, Any], but trying
    to call load() on the dict type caused the failure

  • Relevant error logs:
    1:  ##[group]Operating System
    2:  Ubuntu
    ...
    
    1213:  [13/299] check agents_api.autogen.Entries
    1214:  [14/299] check agents_api.autogen.Tasks
    1215:  [15/299] check agents_api.autogen.Agents
    1216:  [16/299] check agents_api.common.utils.datetime
    1217:  [17/299] check agents_api.common.protocol.developers
    1218:  [18/299] check agents_api.common.utils.db_exceptions
    1219:  [19/299] check agents_api.exceptions
    1220:  [20/299] check agents_api.queries.utils
    1221:  ERROR:pytype.matcher Invalid type: <class 'pytype.abstract.function.ParamSpecMatch'>
    ...
    
    1232:  [31/299] check agents_api.worker.codec
    1233:  [32/299] check agents_api.queries.docs.search_docs_hybrid
    1234:  [33/299] check agents_api.queries.docs.search_docs_by_text
    1235:  [34/299] check agents_api.queries.docs.search_docs_by_embedding
    1236:  [35/299] check agents_api.queries.docs.mmr
    1237:  [36/299] check agents_api.dependencies.developer_id
    1238:  [37/299] check agents_api.queries.docs.create_doc
    1239:  [38/299] check agents_api.common.protocol.models
    1240:  FAILED: /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/common/protocol/models.pyi 
    1241:  /home/runner/work/julep/julep/agents-api/.venv/bin/python -m pytype.main --disable pyi-error --imports_info /home/runner/work/julep/julep/agents-api/.pytype/imports/agents_api.common.protocol.models.imports --module-name agents_api.common.protocol.models --platform linux -V 3.12 -o /home/runner/work/julep/julep/agents-api/.pytype/pyi/agents_api/common/protocol/models.pyi --analyze-annotated --nofail --none-is-not-bool --quick --strict-none-binding /home/runner/work/julep/julep/agents-api/agents_api/common/protocol/models.py
    1242:  /home/runner/work/julep/julep/agents-api/agents_api/common/protocol/models.py:47:30: error: in load_arguments: No attribute 'load' on dict[str, Any] [attribute-error]
    1243:  In Union[agents_api.worker.codec.RemoteObject, dict[str, Any]]
    1244:  self.arguments = self.arguments.load()
    1245:  ~~~~~~~~~~~~~~~~~~~
    1246:  For more details, see https://google.github.io/pytype/errors.html#attribute-error
    ...
    
    1410:  [202/299] check tests.sample_tasks.test_find_selector
    1411:  [203/299] check tests.test_messages_truncation
    1412:  [204/299] check agents_api.worker.__init__
    1413:  [205/299] check agents_api.rec_sum.entities
    1414:  [206/299] check agents_api.clients.__init__
    1415:  [207/299] check agents_api.activities.logger
    1416:  [208/299] check agents_api.metrics.__init__
    1417:  [209/299] check agents_api.workflows.__init__
    1418:  ninja: build stopped: cannot make progress due to previous errors.
    1419:  Computing dependencies
    1420:  Generated API key since not set in the environment: 59034420533241633964465082380108
    1421:  Sentry DSN not found. Sentry will not be enabled.
    1422:  Analyzing 289 sources with 0 local dependencies
    1423:  Leaving directory '.pytype'
    1424:  ##[error]Process completed with exit code 1.
    

    @creatorrr
    Copy link
    Contributor

    add a test for this please @Ahmad-mtos

    Copy link
    Contributor

    @ellipsis-dev ellipsis-dev bot left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    👍 Looks good to me! Incremental review on ba4ec16 in 10 seconds

    More details
    • Looked at 12 lines of code in 1 files
    • Skipped 0 files when reviewing.
    • Skipped posting 1 drafted comments based on config settings.
    1. agents-api/agents_api/queries/docs/get_doc.py:32
    • Draft comment:
      Ensure that all non-aggregated columns are included in the GROUP BY clause to prevent unexpected results. The current GROUP BY clause seems correct, but verify that the aggregation logic aligns with the intended data structure.
    • Reason this comment was not posted:
      Confidence changes required: 50%
      The query uses array_agg to aggregate content, indices, and embeddings. However, the GROUP BY clause should include all non-aggregated columns to ensure correct aggregation. The current GROUP BY clause seems correct, but it's important to ensure that the aggregation logic aligns with the intended data structure.

    Workflow ID: wflow_IUEMLInPIPnPVyCF


    You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

    @Ahmad-mtos Ahmad-mtos merged commit 1575f87 into dev Jan 27, 2025
    10 of 15 checks passed
    @Ahmad-mtos Ahmad-mtos deleted the x/list-docs branch January 27, 2025 11:02
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    2 participants