feat: Import from modulestore APIs #36540

NiedielnitsevIvan · 2025-04-17T11:49:55Z

Description

This PR is addressed at adding the functionality to import modulestore-based content to the learning-core based learning package.

Testing instructions

So far, these are only python interfaces for creating content imports, so it can only be tested in the cms shell.

Create a course in the CMS and fill it with content.
Create a Content Library.
Go to the CMS shell.
Import the main APIs (create_import and import_course_staged_content_to_library)
Pass your course ID and user ID to create_import.
Check that a new entry has been created in the Import table in the admin panel (admin/import_from_modulestore/import/)
Check that a new entry has been created in the Staged Content table in the admin panel (/admin/content_staging/stagedcontent/), the number of created entries should be equal to the number of sections in the course.
Pass all the necessary data to import_course_staged_content_to_library.

The list of Usage Keys to be imported can be obtained from the Staged Content records. IMPORTANT: currently, import_course_staged_content_to_library accepts ONLY Usage Keys of the chapter level.

After completing the code, check that all the blocks from the specified sections have been created in the Content Library.

Deadline

ASAP

openedx-webhooks · 2025-04-17T11:49:59Z

Thanks for the pull request, @NiedielnitsevIvan!

This repository is currently maintained by @openedx/wg-maintenance-edx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
- This process (including the steps you'll need to take) is documented here.
If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

Dependencies

This PR must be merged before / after / at the same time as ...
Blockers

This PR is waiting for OEP-1234 to be accepted.
Timeline information

This PR must be merged by XX date because ...
Partner information

This is for a course on edx.org.
Supporting documentation
Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

The size and impact of the changes that it introduces
The need for product review
Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

kdmccormick

Still reviewing, but here are some comments in the meantime

cms/djangoapps/import_from_modulestore/api.py

cms/djangoapps/import_from_modulestore/constants.py

kdmccormick · 2025-04-21T20:31:18Z

cms/djangoapps/import_from_modulestore/data.py

+    VERTICAL = 'vertical'
+    XBLOCK = 'xblock'
+    COMPLICATED_LEVELS = [CHAPTER, SEQUENTIAL, VERTICAL]
+    FLAT_LEVELS = [XBLOCK]


Suggested change

FLAT_LEVELS = [XBLOCK]

COMPONENT = 'component'

UNIT = ContainerType.Unit.value

# Subsection- and Section-level composition to be added in the future as part of

# https://github.com/openedx/frontend-app-authoring/issues/1602

# SUBSECTION = ContainerType.Subsection.value

# SECTION = ContainerType.Section.value

COMPLEX_LEVELS = [

UNIT,

# SUBSECTION,

# SECTION,

]

A few suggestions here:

The container_types in learning core use the product names (component, unit, subsection, section) rather than the xblock tag names (..., vertical, sequential, chapter).

These levels are already defined in the ContainerType enum (just Unit, for now), so let's reference those names in order to keep them consistent.

Let's reverse the order so that, when compared using >, SECTION > SUBSECTION > UNIT > COMPONENT

Complex is often used to describe structures, whereas complicated usually implies "confusing".

Rather than include the levels that aren't supported yet, we can put them behind comments until they're ready to be used.

It is safe to assume that COMPONENT will always be the one and only flat level, so no need to define a FLAT_LEVELS list.

Thanks, I fixed the name COMPLICATED_LEVELS and removed the FLAT_LEVELS list.

But I can't agree with you about the other part, because these legacy names are tied to the functional part. They are necessary to check that the block we are importing is really vertical/sequential, etc. because this is the information stored in Staged Content OLX.
Otherwise, we need to do vertical -> unit, sequential -> subsection mapping to preserve functionality.

Also, I think that at the CompositionLevel level, we should not comment out CHAPTER and SEQUENTIAL, because it will require a lot of refactoring as a temporary thing. It seems to me more logical to comment out the corresponding mapping in ImportClient until Sections and Subsections are merged.

But I can't agree with you about the other part, because these legacy names are tied to the functional part. They are necessary to check that the block we are importing is really vertical/sequential, etc. because this is the information stored in Staged Content OLX.

OK, that's a good point. However, I think the API should use the new names rather than the legacy names, so we need some way of mapping between the old and new names. I have an idea that I've proposed to Dave and Braden in #content-libraries-relaunch-dev.

OK, we have added logic to openedx.core.content_libraries.api.containers.ContainerType: #36580

Please have the API use container_types (unit, subsection, section) and use the ContainerType interface to translate to/from the old OLX block types.

Thank you very much for your help with this!
Refactored to ContainerType usage.

kdmccormick

(still reviewing)

cms/djangoapps/import_from_modulestore/signals.py

openedx/core/djangoapps/content_libraries/tests/factories.py

kdmccormick · 2025-04-22T15:56:45Z

cms/djangoapps/import_from_modulestore/tasks.py

+    try:
+        import_event = Import.objects.get(uuid=import_uuid, status=ImportStatus.STAGED, user_id=user_id)
+    except Import.DoesNotExist:
+        log.info('Ready import from modulestore not found')


The error handling here is just hiding information from the developer/operator. If the import is missing, then the developer/operator will want to know that, and they will want to know what the uuid of the missing import is. The exception carries all of that information, but the log statement does not.

In the future we could do some more thoughtful error handling here, but we don't really have time, so please just remove the try-excepts. If the Import does not exist, then an Import.DoesNotExist exception will be raised, and that is a perfectly reasonable outcome for now.

Practically speaking, the REST API and Django Admin will each want to handle the not-found scenarios differently. The REST API probably should raise a 404/400, whereas the Django Admin should display an error banner. By allowing DoesNotExist to be raised, we make it very easy for the REST API and Django Admin to handle these situations using except Import.DoesNotExist--or, if we want to handle all not-found scenarios together, using except ObjectDoesNotExist. On the other hand, by hiding this exception and logging it, we make it impossible for the REST API or Django Admin to display a useful error message.

Fixed, in the future I'll avoid the error handling where it's not needed. Thank you!

kdmccormick · 2025-04-22T16:05:39Z

cms/djangoapps/import_from_modulestore/tasks.py

+    try:
+        target_learning_package = LearningPackage.objects.get(id=learning_package_id)
+    except Import.LearningPackage:
+        log.info('Target learning package not found')
+        return


Same comment here-- remove the try-except wrapper, and let the exception escape.

kdmccormick · 2025-04-22T16:08:39Z

cms/djangoapps/import_from_modulestore/tasks.py

+        except Exception as exc:  # pylint: disable=broad-except
+            import_event.set_status(ImportStatus.IMPORTING_FAILED)
+            raise exc from exc  # TODO: retest raise and describe change_log commitment on Exception


Suggested change

except Exception as exc: # pylint: disable=broad-except

import_event.set_status(ImportStatus.IMPORTING_FAILED)

raise exc from exc # TODO: retest raise and describe change_log commitment on Exception

except: # pylint: disable=broad-except

import_event.set_status(ImportStatus.IMPORTING_FAILED)

raise

You can use bare raise to re-raise the same exception that you caught

raise exc from exc does not make sense-- that would be saying that "exc caused itself"

Can you explain TODO: retest raise and describe change_log commitment on Exception? The bulk_draft_changes_for block will ensure that all DB changes are rolled back when an exception escapes. Is that what you wanted to make sure of?

Refactored.

Is that what you wanted to make sure of?

Yes, sorry, I didn't realize I hadn't removed this comment.
It was a comment for myself to explore the behavior of the new bulk_draft_changes_for context manager when Exception raises.

kdmccormick · 2025-04-22T16:12:33Z

cms/djangoapps/import_from_modulestore/tasks.py

+@shared_task
+@set_code_owner_attribute
+def import_staged_content_to_library_task(
+    usage_keys_string: list[str],


Suggested change

usage_keys_string: list[str],

usage_key_strings: list[str],

this is a list of strings, each of which is a usage key

usage_keys_string implies a single string made of many usage keys

kdmccormick · 2025-04-22T16:18:48Z

cms/djangoapps/import_from_modulestore/helpers.py

+parser = etree.XMLParser(strip_cdata=False)
+
+
+class ImportClient:


Since we're close to the deadline I will not ask you to make this change, but as feedback for future work: When a class's sole purpose is to have a single method executed, for example:

import_client = ImportClient( import_event, usage_key_string, target_learning_package, staged_content_item, composition_level, override, ) import_client.import_from_staged_content()

then the class does not need to exist; it could instead be a function:

import_from_staged_content( import_event, usage_key_string, target_learning_package, staged_content_item, composition_level, override, )

and the class's private methods (_process_import, etc) could instead be regular private helper functions.

The function is easier for other developers to understand because it has simpler control flow and it does not contain internal state. I know that in some languages like Java, the object-oriented style would be preferred, but in Python (especially in Open edX) the procederal style is preferred.

No problem, I moved it to a separate function.
Thank you for the detailed explanation.

kdmccormick · 2025-04-22T16:26:49Z

cms/djangoapps/import_from_modulestore/tasks.py

+    with authoring_api.bulk_draft_changes_for(learning_package_id=learning_package_id) as change_log:
+        try:
+            for usage_key_string in usage_keys_string:
+                if staged_content_item := import_event.get_staged_content_by_source_usage_key(usage_key_string):


Again, this is swallowing an exception, which is not a good thing. If the staged content is missing, then it is helpful for us to let StagedContentForImport.DoesNotExist be raised.

Suggested change

if staged_content_item := import_event.get_staged_content_by_source_usage_key(usage_key_string):

staged_content_item = import_event.staged_content_for_import.get(source_usage_key=usage_key_string):

This also means that we can delete def get_staged_content_by_source_usage_key.

get_staged_content_by_source_usage_key removed

kdmccormick · 2025-04-22T19:58:26Z

cms/djangoapps/import_from_modulestore/api.py

@@ -0,0 +1,43 @@
+"""


Thanks for adding so many type annotations! Please add cms/djangoapps/import_from_modulestore to mypy.ini so that they will be checked when mypy is executed.

Added, and type annotations have been improved a bit more.

kdmccormick

A few more comments. Still looking through helpers.py, but looking good otherwise.

kdmccormick · 2025-04-22T20:39:30Z

cms/djangoapps/import_from_modulestore/helpers.py

+def get_block_to_import(node, usage_key):
+    """
+    Get the block to import from a node.
+    """


Let's make it more obvious what this returns. Before I read the definition, I thought it returned an XBlock.

Suggested change

def get_block_to_import(node, usage_key):

"""

Get the block to import from a node.

"""

def get_node_for_usage_key(node: etree._Element, usage_key: UsageKey) -> etree._Element:

"""

Get the node in an XML tree which matches to the usage key.

"""

kdmccormick · 2025-04-22T20:41:35Z

cms/djangoapps/import_from_modulestore/helpers.py

+    Get the block to import from a node.
+    """
+
+    if node.get('url_name') == usage_key.block_id:


Suggested change

if node.get('url_name') == usage_key.block_id:

if node.tag == usage_key.block_type and node.get('url_name') == usage_key.block_id:

A block_id (a.k.a. url_name) exists within the namespace of a block_type. For example, the same course could have type@html+block@123 and type@problem+block@123. So, to properly identify a block in a course, check both the type and the ID.

Good point, thanks.

kdmccormick · 2025-04-22T21:14:42Z

cms/djangoapps/import_from_modulestore/data.py

+    VERTICAL = 'vertical'
+    XBLOCK = 'xblock'
+    COMPLICATED_LEVELS = [CHAPTER, SEQUENTIAL, VERTICAL]
+    FLAT_LEVELS = [XBLOCK]


But I can't agree with you about the other part, because these legacy names are tied to the functional part. They are necessary to check that the block we are importing is really vertical/sequential, etc. because this is the information stored in Staged Content OLX.

OK, that's a good point. However, I think the API should use the new names rather than the legacy names, so we need some way of mapping between the old and new names. I have an idea that I've proposed to Dave and Braden in #content-libraries-relaunch-dev.

…tore-apis

kdmccormick

Thank you for the hard work on this.

edx-pipeline-bot · 2025-04-23T19:43:57Z

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

edx-pipeline-bot · 2025-04-23T23:50:41Z

2U Release Notice: This PR has been deployed to the edX production environment.

edx-pipeline-bot · 2025-04-23T23:50:41Z

2U Release Notice: This PR has been deployed to the edX production environment.

This PR is addressed at adding the functionality to import modulestore-based content to the learning-core based learning package. Partof: openedx/frontend-app-authoring#1681

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 17, 2025

openedx-webhooks added this to Contributions Apr 17, 2025

github-project-automation bot moved this to Needs Triage in Contributions Apr 17, 2025

NiedielnitsevIvan changed the title ~~feat: Import from modulestore APIs~~ Draft: feat: Import from modulestore APIs Apr 17, 2025

NiedielnitsevIvan marked this pull request as draft April 17, 2025 11:59

NiedielnitsevIvan force-pushed the NiedielnitsevIvan/feat/import-from-modulestore-apis branch 3 times, most recently from d06b3bc to fbf93c8 Compare April 18, 2025 16:10

NiedielnitsevIvan marked this pull request as ready for review April 18, 2025 16:11

NiedielnitsevIvan changed the title ~~Draft: feat: Import from modulestore APIs~~ feat: Import from modulestore APIs Apr 18, 2025

NiedielnitsevIvan force-pushed the NiedielnitsevIvan/feat/import-from-modulestore-apis branch from fbf93c8 to 2f0b019 Compare April 21, 2025 08:19

kdmccormick mentioned this pull request Apr 21, 2025

Epic 11.5 - Import APIs #36389

Closed

NiedielnitsevIvan mentioned this pull request Apr 21, 2025

feat: add course to library Django admin action #36562

Closed

NiedielnitsevIvan force-pushed the NiedielnitsevIvan/feat/import-from-modulestore-apis branch from 2f0b019 to 4113467 Compare April 21, 2025 19:09

kdmccormick requested changes Apr 21, 2025

View reviewed changes

cms/djangoapps/import_from_modulestore/signals.py Outdated Show resolved Hide resolved

openedx/core/djangoapps/content_libraries/tests/factories.py Outdated Show resolved Hide resolved

NiedielnitsevIvan requested a review from kdmccormick April 22, 2025 15:31

kdmccormick requested changes Apr 22, 2025

View reviewed changes

kdmccormick reviewed Apr 22, 2025

View reviewed changes

kdmccormick requested changes Apr 22, 2025

View reviewed changes

This was referenced Apr 23, 2025

Draft: feat: REST APIs for import functionality #36541

Closed

[demonstration] Use django-user-tasks in import_from_modulestore #36582

Closed

NiedielnitsevIvan added 7 commits April 23, 2025 10:09

feat: create import client

958b903

feat: create celery tasks for import functionality

4213966

feat: add python APIs to create and process import

c03dcd8

style: fix pylint issues

a3ea142

test: add tests

9acd085

refactor: fix after rebase

ff2903d

refactor: fix after changed relations in the import models

cbfe82d

NiedielnitsevIvan and others added 11 commits April 23, 2025 10:09

style: remove extra imports

ee3f843

refactor: signal logic moved to the seprate util

3e31032

refactor: rename purpose

359b2bc

refactor: rename APIs

9abe4fc

test: avoid using factories for content_library (#2635)

3aa550e

refactor: refactor CompositionLevel enum

15fe9e5

fix: fix logic for not yet supported composition levels

5ff0620

fix: comment tests for not yet supported composition levels

8cce09b

refactor: fix issues after review

9112822

feat: adding support for importing sections and subsections

5f6a1e5

test: fix tests after refactoring

852a243

NiedielnitsevIvan force-pushed the NiedielnitsevIvan/feat/import-from-modulestore-apis branch from 7bd1ec5 to 852a243 Compare April 23, 2025 08:08

NiedielnitsevIvan added 3 commits April 23, 2025 11:23

style: fix naming and add type hints

9b96363

chore: add cms/djangoapps/import_from_modulestore to mypy.ini

227230b

refactor: refactor Composition Levels

2761d16

NiedielnitsevIvan requested a review from kdmccormick April 23, 2025 09:10

Andrii and others added 5 commits April 23, 2025 15:31

refactor: improve static typization to fix mypy errors

a5595f2

refactor: fix StagedContent import to prevent lint error

51658aa

refactor: fix pylint error for error raising

be631da

Merge branch 'master' into NiedielnitsevIvan/feat/import-from-modules…

9a881d8

…tore-apis

refactor: improve Import client container mapps

c49cdae

mphilbrick211 moved this from Needs Triage to In Eng Review in Contributions Apr 23, 2025

kdmccormick approved these changes Apr 23, 2025

View reviewed changes

kdmccormick merged commit 3f67f3c into openedx:master Apr 23, 2025
49 checks passed

github-project-automation bot moved this from In Eng Review to Done in Contributions Apr 23, 2025

-    FLAT_LEVELS = [XBLOCK]
+    COMPONENT = 'component'
+    UNIT = ContainerType.Unit.value
+    # Subsection- and Section-level composition to be added in the future as part of
+    # https://github.com/openedx/frontend-app-authoring/issues/1602
+    # SUBSECTION = ContainerType.Subsection.value
+    # SECTION = ContainerType.Section.value
+    COMPLEX_LEVELS = [
+        UNIT,
+        # SUBSECTION,
+        # SECTION,
+    ]

		parser = etree.XMLParser(strip_cdata=False)


		class ImportClient:

	if staged_content_item := import_event.get_staged_content_by_source_usage_key(usage_key_string):
	staged_content_item = import_event.staged_content_for_import.get(source_usage_key=usage_key_string):

	if node.get('url_name') == usage_key.block_id:
	if node.tag == usage_key.block_type and node.get('url_name') == usage_key.block_id:

feat: Import from modulestore APIs #36540

feat: Import from modulestore APIs #36540

Uh oh!

Conversation

NiedielnitsevIvan commented Apr 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Testing instructions

Deadline

Uh oh!

openedx-webhooks commented Apr 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kdmccormick left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kdmccormick left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kdmccormick left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kdmccormick left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

edx-pipeline-bot commented Apr 23, 2025

NiedielnitsevIvan commented Apr 17, 2025 •

edited

Loading

openedx-webhooks commented Apr 17, 2025 •

edited

Loading