-
Notifications
You must be signed in to change notification settings - Fork 4.2k
feat: Import from modulestore APIs #36540
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Import from modulestore APIs #36540
Conversation
|
Thanks for the pull request, @NiedielnitsevIvan! This repository is currently maintained by Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review. 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. Where can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources: When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
d06b3bc to
fbf93c8
Compare
fbf93c8 to
2f0b019
Compare
2f0b019 to
4113467
Compare
kdmccormick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still reviewing, but here are some comments in the meantime
| VERTICAL = 'vertical' | ||
| XBLOCK = 'xblock' | ||
| COMPLICATED_LEVELS = [CHAPTER, SEQUENTIAL, VERTICAL] | ||
| FLAT_LEVELS = [XBLOCK] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| FLAT_LEVELS = [XBLOCK] | |
| COMPONENT = 'component' | |
| UNIT = ContainerType.Unit.value | |
| # Subsection- and Section-level composition to be added in the future as part of | |
| # https://github.com/openedx/frontend-app-authoring/issues/1602 | |
| # SUBSECTION = ContainerType.Subsection.value | |
| # SECTION = ContainerType.Section.value | |
| COMPLEX_LEVELS = [ | |
| UNIT, | |
| # SUBSECTION, | |
| # SECTION, | |
| ] |
A few suggestions here:
- The container_types in learning core use the product names (component, unit, subsection, section) rather than the xblock tag names (..., vertical, sequential, chapter).
- These levels are already defined in the ContainerType enum (just Unit, for now), so let's reference those names in order to keep them consistent.
- Let's reverse the order so that, when compared using
>, SECTION > SUBSECTION > UNIT > COMPONENT - Complex is often used to describe structures, whereas complicated usually implies "confusing".
- Rather than include the levels that aren't supported yet, we can put them behind comments until they're ready to be used.
- It is safe to assume that COMPONENT will always be the one and only flat level, so no need to define a FLAT_LEVELS list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I fixed the name COMPLICATED_LEVELS and removed the FLAT_LEVELS list.
But I can't agree with you about the other part, because these legacy names are tied to the functional part. They are necessary to check that the block we are importing is really vertical/sequential, etc. because this is the information stored in Staged Content OLX.
Otherwise, we need to do vertical -> unit, sequential -> subsection mapping to preserve functionality.
Also, I think that at the CompositionLevel level, we should not comment out CHAPTER and SEQUENTIAL, because it will require a lot of refactoring as a temporary thing. It seems to me more logical to comment out the corresponding mapping in ImportClient until Sections and Subsections are merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I can't agree with you about the other part, because these legacy names are tied to the functional part. They are necessary to check that the block we are importing is really vertical/sequential, etc. because this is the information stored in Staged Content OLX.
OK, that's a good point. However, I think the API should use the new names rather than the legacy names, so we need some way of mapping between the old and new names. I have an idea that I've proposed to Dave and Braden in #content-libraries-relaunch-dev.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, we have added logic to openedx.core.content_libraries.api.containers.ContainerType: #36580
Please have the API use container_types (unit, subsection, section) and use the ContainerType interface to translate to/from the old OLX block types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for your help with this!
Refactored to ContainerType usage.
kdmccormick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(still reviewing)
| try: | ||
| import_event = Import.objects.get(uuid=import_uuid, status=ImportStatus.STAGED, user_id=user_id) | ||
| except Import.DoesNotExist: | ||
| log.info('Ready import from modulestore not found') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error handling here is just hiding information from the developer/operator. If the import is missing, then the developer/operator will want to know that, and they will want to know what the uuid of the missing import is. The exception carries all of that information, but the log statement does not.
In the future we could do some more thoughtful error handling here, but we don't really have time, so please just remove the try-excepts. If the Import does not exist, then an Import.DoesNotExist exception will be raised, and that is a perfectly reasonable outcome for now.
Practically speaking, the REST API and Django Admin will each want to handle the not-found scenarios differently. The REST API probably should raise a 404/400, whereas the Django Admin should display an error banner. By allowing DoesNotExist to be raised, we make it very easy for the REST API and Django Admin to handle these situations using except Import.DoesNotExist--or, if we want to handle all not-found scenarios together, using except ObjectDoesNotExist. On the other hand, by hiding this exception and logging it, we make it impossible for the REST API or Django Admin to display a useful error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, in the future I'll avoid the error handling where it's not needed. Thank you!
| try: | ||
| target_learning_package = LearningPackage.objects.get(id=learning_package_id) | ||
| except Import.LearningPackage: | ||
| log.info('Target learning package not found') | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here-- remove the try-except wrapper, and let the exception escape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
| except Exception as exc: # pylint: disable=broad-except | ||
| import_event.set_status(ImportStatus.IMPORTING_FAILED) | ||
| raise exc from exc # TODO: retest raise and describe change_log commitment on Exception |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| except Exception as exc: # pylint: disable=broad-except | |
| import_event.set_status(ImportStatus.IMPORTING_FAILED) | |
| raise exc from exc # TODO: retest raise and describe change_log commitment on Exception | |
| except: # pylint: disable=broad-except | |
| import_event.set_status(ImportStatus.IMPORTING_FAILED) | |
| raise |
- You can use bare
raiseto re-raise the same exception that you caught raise exc from excdoes not make sense-- that would be saying that "exc caused itself"- Can you explain
TODO: retest raise and describe change_log commitment on Exception? Thebulk_draft_changes_forblock will ensure that all DB changes are rolled back when an exception escapes. Is that what you wanted to make sure of?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored.
Is that what you wanted to make sure of?
Yes, sorry, I didn't realize I hadn't removed this comment.
It was a comment for myself to explore the behavior of the new bulk_draft_changes_for context manager when Exception raises.
| @shared_task | ||
| @set_code_owner_attribute | ||
| def import_staged_content_to_library_task( | ||
| usage_keys_string: list[str], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| usage_keys_string: list[str], | |
| usage_key_strings: list[str], |
this is a list of strings, each of which is a usage key
usage_keys_string implies a single string made of many usage keys
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
| parser = etree.XMLParser(strip_cdata=False) | ||
|
|
||
|
|
||
| class ImportClient: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we're close to the deadline I will not ask you to make this change, but as feedback for future work: When a class's sole purpose is to have a single method executed, for example:
import_client = ImportClient(
import_event,
usage_key_string,
target_learning_package,
staged_content_item,
composition_level,
override,
)
import_client.import_from_staged_content()then the class does not need to exist; it could instead be a function:
import_from_staged_content(
import_event,
usage_key_string,
target_learning_package,
staged_content_item,
composition_level,
override,
)and the class's private methods (_process_import, etc) could instead be regular private helper functions.
The function is easier for other developers to understand because it has simpler control flow and it does not contain internal state. I know that in some languages like Java, the object-oriented style would be preferred, but in Python (especially in Open edX) the procederal style is preferred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem, I moved it to a separate function.
Thank you for the detailed explanation.
| with authoring_api.bulk_draft_changes_for(learning_package_id=learning_package_id) as change_log: | ||
| try: | ||
| for usage_key_string in usage_keys_string: | ||
| if staged_content_item := import_event.get_staged_content_by_source_usage_key(usage_key_string): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, this is swallowing an exception, which is not a good thing. If the staged content is missing, then it is helpful for us to let StagedContentForImport.DoesNotExist be raised.
| if staged_content_item := import_event.get_staged_content_by_source_usage_key(usage_key_string): | |
| staged_content_item = import_event.staged_content_for_import.get(source_usage_key=usage_key_string): |
This also means that we can delete def get_staged_content_by_source_usage_key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_staged_content_by_source_usage_key removed
| @@ -0,0 +1,43 @@ | |||
| """ | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding so many type annotations! Please add cms/djangoapps/import_from_modulestore to mypy.ini so that they will be checked when mypy is executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added, and type annotations have been improved a bit more.
kdmccormick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments. Still looking through helpers.py, but looking good otherwise.
| def get_block_to_import(node, usage_key): | ||
| """ | ||
| Get the block to import from a node. | ||
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make it more obvious what this returns. Before I read the definition, I thought it returned an XBlock.
| def get_block_to_import(node, usage_key): | |
| """ | |
| Get the block to import from a node. | |
| """ | |
| def get_node_for_usage_key(node: etree._Element, usage_key: UsageKey) -> etree._Element: | |
| """ | |
| Get the node in an XML tree which matches to the usage key. | |
| """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
| Get the block to import from a node. | ||
| """ | ||
|
|
||
| if node.get('url_name') == usage_key.block_id: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if node.get('url_name') == usage_key.block_id: | |
| if node.tag == usage_key.block_type and node.get('url_name') == usage_key.block_id: |
A block_id (a.k.a. url_name) exists within the namespace of a block_type. For example, the same course could have type@html+block@123 and type@problem+block@123. So, to properly identify a block in a course, check both the type and the ID.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, thanks.
| VERTICAL = 'vertical' | ||
| XBLOCK = 'xblock' | ||
| COMPLICATED_LEVELS = [CHAPTER, SEQUENTIAL, VERTICAL] | ||
| FLAT_LEVELS = [XBLOCK] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I can't agree with you about the other part, because these legacy names are tied to the functional part. They are necessary to check that the block we are importing is really vertical/sequential, etc. because this is the information stored in Staged Content OLX.
OK, that's a good point. However, I think the API should use the new names rather than the legacy names, so we need some way of mapping between the old and new names. I have an idea that I've proposed to Dave and Braden in #content-libraries-relaunch-dev.
7bd1ec5 to
852a243
Compare
kdmccormick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the hard work on this.
|
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
|
2U Release Notice: This PR has been deployed to the edX production environment. |
1 similar comment
|
2U Release Notice: This PR has been deployed to the edX production environment. |
This PR is addressed at adding the functionality to import modulestore-based content to the learning-core based learning package. Partof: openedx/frontend-app-authoring#1681
This PR is addressed at adding the functionality to import modulestore-based content to the learning-core based learning package. Partof: openedx/frontend-app-authoring#1681
Description
This PR is addressed at adding the functionality to import modulestore-based content to the learning-core based learning package.
Testing instructions
So far, these are only python interfaces for creating content imports, so it can only be tested in the cms shell.
create_importandimport_course_staged_content_to_library)import_course_staged_content_to_library.import_course_staged_content_to_libraryaccepts ONLY Usage Keys of the chapter level.Deadline
ASAP