Skip to content

Conversation

@pomegranited
Copy link
Contributor

@pomegranited pomegranited commented Apr 16, 2025

Description

Adds child usage keys to the Container search document, so that basic info about Container children (count, usage key, block type) can be retrieved from the search index.

This change is a performance improvement for Library Authors, but does not alter their user experience.

Supporting information

Part of: openedx/frontend-app-authoring#1778
Addresses this comment: https://github.com/openedx/edx-platform/pull/36477/files#r2033737978
Private-ref: FAL-4139

Testing instructions

See openedx/frontend-app-authoring#1820

Deadline

ASAP

@openedx-webhooks
Copy link

openedx-webhooks commented Apr 16, 2025

Thanks for the pull request, @pomegranited!

This repository is currently maintained by @openedx/wg-maintenance-edx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.


Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 16, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in Contributions Apr 16, 2025
@mphilbrick211 mphilbrick211 added the FC Relates to an Axim Funded Contribution project label Apr 23, 2025
@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Apr 23, 2025
@pomegranited pomegranited force-pushed the jill/content-search-unit-blocks branch from 826a8f3 to 3aaea1a Compare April 29, 2025 07:12
to get_container + get_container_children API methods, to avoid
re-fetching this data during search indexing.
Stores the draft children + published children (if applicable)

* lib_api.get_container does not take a "user" arg
* fetch_customizable_fields_from_container does not need a "user" arg
because anything that appears in a library may be tagged.
@pomegranited pomegranited force-pushed the jill/content-search-unit-blocks branch from 3aaea1a to 0bc1a86 Compare April 29, 2025 07:43
@pomegranited pomegranited marked this pull request as ready for review April 30, 2025 04:48
@pomegranited pomegranited requested a review from rpenido April 30, 2025 04:48
Copy link
Contributor

@rpenido rpenido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍
Thank you for your work, @pomegranited!

  • I tested this using the instructions from the PR
  • I read through the code
  • I checked for accessibility issues
  • Includes documentation

Copy link
Contributor

@navinkarkera navinkarkera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pomegranited Nice work! 👍

  • I tested this: (Verified children usage_keys in index)
  • I read through the code
  • I checked for accessibility issues
  • Includes documentation


try:
container = lib_api.get_container(container_key)
container_obj = lib_api.get_container_from_key(container_key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for get_container_from_key says it's an "Internal method", but in #36504 we added it to __all__ so it's now also part of the public API.

The thing is, it doesn't make sense as a public API since both get_container and get_container_from_key take a container key as their first argument. I think we should rename it to _get_container_from_key, and keep it as internal (remove it from __all__) and create a separate public function called get_container_obj which wraps _get_container_from_key. This would be much cleaner and clearer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up being able to remove get_container_from_key from the public API without having to add a get_container_obj method. I had to add container_pk to ContainerMetadata (since ContainerLink needed it), but this seems cleaner to me too.

Basically, this sets the value of "upstream_display_name" on the downstream block.
"""
upstream = lib_api.get_container(LibraryContainerLocator.from_string(downstream.upstream), user)
upstream = lib_api.get_container(LibraryContainerLocator.from_string(downstream.upstream))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this even working before?! Thanks for fixing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank mypy :) It caught it when I changed get_container to require named parameters after the key.

Comment on lines 195 to 206
def get_container(
container_key: LibraryContainerLocator,
*,
include_collections=False,
container: Container | None = None,
) -> ContainerMetadata:
"""
Get a container (a Section, Subsection, or Unit).
"""
container = get_container_from_key(container_key)
if not container:
container = get_container_from_key(container_key)
assert container.key == container_key.container_id
Copy link
Contributor

@bradenmacdonald bradenmacdonald Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine to leave as-is, but I'm personally not a huge fan of this API style of taking an ID and optionally a pre-loaded object.

I would suggest instead:

def get_container(
    container_or_key: LibraryContainerLocator | Container,
    /, *,
    include_collections=False,
) -> ContainerMetadata:
    """
    Get metadata about a container (a Section, Subsection, or Unit).
    """
    container = container_or_key if isinstance(container_or_key, Container) else _get_container_from_key(container_or_key)

Otherwise, I don't think the messiness it adds to the code from the API consumer side is worth the two tiny database queries it saves.

The /, * will help ensure better compatibility with similar changes to the API in the future. And I think it's better to make this change now to bake it into the new Teak API contract, if others agree. Though all the containers APIs are considered unstable and subject to change anyways.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted this, cf 49557f5

API users must use get_container and ContainerMetadata.

Related changes:

* made set_library_item_collections take an entity_key string instead of
  a PublishableEntity instance, so we don't need to fetch a Container
  object to call it.
* added `entity_key_from_usage_key` to the xblock API to support ^
* added container_id
* changed ContainerLink.update_or_create to take the container PK
  instead of a Container object, since this is enough.
* added container_pk to ContainerMetadata to support ^
* reverted previous change that added an optional Container param to
  get_container and get_container_children
container.key,
)
container = lib_api.get_container(object_id, include_collections=True)
collections = container.collections
Copy link
Contributor

@bradenmacdonald bradenmacdonald May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a change for this PR: but in the future, I think we should just refactor collections to be an independent API like tagging. Right now we have get_container(object_id, include_collections=True) and container.collections and set_container_collections(...) etc. as well as similar things for components. But why not just get_collections(entity), add_to_collection(entity), etc. that works with any library item? I think that would be cleaner and simpler.

Edit: never mind, this just seems to be a problem with the REST API. I don't understand why we have LibraryBlockCollectionsView and LibraryContainerCollectionsView for updating specific item's collections when we have LibraryCollectionsView.update_items which does the same thing generically.

def update_or_create(
cls,
upstream_container: Container | None,
upstream_container: int | None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this working? I would think that below on line 322 you need to set upstream_container_pk to this value then, not upstream_container. But maybe the Django ORM is sorting it out?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah sorry, I had that on my list to fix before pushing, but forgot.
I've fixed it, but I would have expected tests to fail without this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember seeing a similar thing before where assigning an ID to an object field still worked silently. In any case while I'm testing this I'll make sure the ContainerLink is getting created.

component = api.get_component_from_usage_key(UsageKey.from_string(self.lib2_problem_block["id"]))

component_key = UsageKey.from_string(self.lib2_problem_block["id"])
entity_key = xblock_api.entity_key_from_usage_key(component_key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I kinda prefer this as it was, to keep the logic you copied into entity_key_from_usage_key contained within openedx-learning. Can't you leave it as it was and just use:

        component = api.get_component_from_usage_key(UsageKey.from_string(self.lib2_problem_block["id"]))
        entity_key = component.publishable_entity.key

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, I was uneasy about that too.. don't like duplicating critical logic like that. Reverted with 2e7be75.

Use get_component_from_usage_key, and get it from the component instead.

Also fixes nit re ContainerLink.update_or_create
Copy link
Contributor

@bradenmacdonald bradenmacdonald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good! I did some quick tests with the final version to confirm that the search index seems to be getting updated properly and the MFE is using the search index only.

@pomegranited pomegranited merged commit 5b3caa9 into openedx:master May 2, 2025
49 checks passed
@pomegranited pomegranited deleted the jill/content-search-unit-blocks branch May 2, 2025 01:17
@github-project-automation github-project-automation bot moved this from Waiting on Author to Done in Contributions May 2, 2025
pomegranited added a commit to open-craft/edx-platform that referenced this pull request May 2, 2025
…-0083] (openedx#36528)

* feat: store content.child_usage_keys in Container search document
  Stores the draft children + published children (if applicable)

Related fixes:
* fix: lib_api.get_container does not take a "user" arg
* refactor: fetch_customizable_fields_from_container does not need a "user" arg
* refactor: moves tags_count into LibraryItem
   because anything that appears in a library may be tagged.
* refactor: remove get_container_from_key from public API
   API users must use get_container and ContainerMetadata.
* refactor: made set_library_item_collections take an entity_key string instead of
  a PublishableEntity instance, so we don't need to fetch a Container object to call it.
* refactor: changed ContainerLink.update_or_create to take the container PK
  instead of a Container object, since this is enough.
  Added container_pk to ContainerMetadata to support this.

(cherry picked from commit 5b3caa9)
@edx-pipeline-bot
Copy link
Contributor

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

@edx-pipeline-bot
Copy link
Contributor

2U Release Notice: This PR has been deployed to the edX production environment.

navinkarkera pushed a commit that referenced this pull request May 2, 2025
…-0083] (#36528)

* feat: store content.child_usage_keys in Container search document
  Stores the draft children + published children (if applicable)

Related fixes:
* fix: lib_api.get_container does not take a "user" arg
* refactor: fetch_customizable_fields_from_container does not need a "user" arg
* refactor: moves tags_count into LibraryItem
   because anything that appears in a library may be tagged.
* refactor: remove get_container_from_key from public API
   API users must use get_container and ContainerMetadata.
* refactor: made set_library_item_collections take an entity_key string instead of
  a PublishableEntity instance, so we don't need to fetch a Container object to call it.
* refactor: changed ContainerLink.update_or_create to take the container PK
  instead of a Container object, since this is enough.
  Added container_pk to ContainerMetadata to support this.

(cherry picked from commit 5b3caa9)
UsamaSadiq pushed a commit that referenced this pull request May 14, 2025
…-0083] (#36528)

* feat: store content.child_usage_keys in Container search document
  Stores the draft children + published children (if applicable)

Related fixes:
* fix: lib_api.get_container does not take a "user" arg
* refactor: fetch_customizable_fields_from_container does not need a "user" arg
* refactor: moves tags_count into LibraryItem
   because anything that appears in a library may be tagged.
* refactor: remove get_container_from_key from public API
   API users must use get_container and ContainerMetadata.
* refactor: made set_library_item_collections take an entity_key string instead of
  a PublishableEntity instance, so we don't need to fetch a Container object to call it.
* refactor: changed ContainerLink.update_or_create to take the container PK
  instead of a Container object, since this is enough.
  Added container_pk to ContainerMetadata to support this.
marlonkeating pushed a commit that referenced this pull request Jul 15, 2025
…-0083] (#36528)

* feat: store content.child_usage_keys in Container search document
  Stores the draft children + published children (if applicable)

Related fixes:
* fix: lib_api.get_container does not take a "user" arg
* refactor: fetch_customizable_fields_from_container does not need a "user" arg
* refactor: moves tags_count into LibraryItem
   because anything that appears in a library may be tagged.
* refactor: remove get_container_from_key from public API
   API users must use get_container and ContainerMetadata.
* refactor: made set_library_item_collections take an entity_key string instead of
  a PublishableEntity instance, so we don't need to fetch a Container object to call it.
* refactor: changed ContainerLink.update_or_create to take the container PK
  instead of a Container object, since this is enough.
  Added container_pk to ContainerMetadata to support this.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

FC Relates to an Axim Funded Contribution project open-source-contribution PR author is not from Axim or 2U

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

7 participants