Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read options from a manifest file in importcontent #9467

Merged

Conversation

dylanmccall
Copy link
Contributor

@dylanmccall dylanmccall commented May 31, 2022

Summary

This adds a --manifest option to the importcontent management command.

With this change, one can export content from one Kolibri instance…

kolibri manage exportcontent c9d7f950ab6b5a1199e3d6c10d7f0103 my_export_dir

And then import it in another Kolibri instance…

kolbri manage importchannel network c9d7f950ab6b5a1199e3d6c10d7f0103
kolibri manage importcontent --manifest=my_export_dir/content/manifest.json disk c9d7f950ab6b5a1199e3d6c10d7f0103

Or…

kolbri manage importchannel network c9d7f950ab6b5a1199e3d6c10d7f0103
kolibri manage importcontent disk c9d7f950ab6b5a1199e3d6c10d7f0103 my_export_dir

That second option changes some existing behaviour: it automatically detects if there is a file named manifest.json in the path we are importing from, and uses that for the default node_ids and exclude_node_ids (instead of the previous behaviour where it would attempt to import everything). This should be the same in practice, as manifest.json should select all of the content which has been exported. I add a --no_detect_manifest option to preserve the previous behaviour.

References

This is a counterpart to my changes in #9460, adding support for exporting a manifest file.

Reviewer guidance

Testing checklist

  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Critical and brittle code paths are covered by unit tests

PR process

  • PR has the correct target branch and milestone
  • PR has 'needs review' or 'work-in-progress' label
  • If PR is ready for review, a reviewer has been added. (Don't use 'Assignees')
  • If this is an important user-facing change, PR or related issue has a 'changelog' label
  • If this includes an internal dependency change, a link to the diff is provided

Reviewer checklist

  • Automated test coverage is satisfactory
  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

@github-actions
Copy link
Contributor

github-actions bot commented May 31, 2022

@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from 3d82ce4 to fb1a9e2 Compare June 1, 2022 00:19
@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from fb1a9e2 to 32a0e82 Compare June 23, 2022 22:01
@@ -185,7 +185,7 @@ def get_import_export_nodes( # noqa: C901
nodes_query = nodes_query.filter(renderable_contentnodes_q_filter)

# filter down the query to remove files associated with nodes we've specifically been asked to exclude
if exclude_node_ids:
if exclude_node_ids is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --node_ids and --exclude_node_ids defaults to [] in the exportcontent.py and importcontent.py commands, so it will never be None when calling from there. I don't know if I'm missing something, but when I test the export command with exclude_node_ids option, I don't get anything, because the nodes_query is empty, because in the previous if, the filter is done with an empty node_ids list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, this commit seems rather suspect. Thanks for pointing it out! If we want to select a whole channel, exportcontent should just set node_ids to the channel ID (the root node), so this case where node_ids could be None doesn't make sense and it's better if we treat None and [] as equivalent :)

I'll revert it, and add some unit tests to make sure things work as needed.

Copy link
Contributor Author

@dylanmccall dylanmccall Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never mind, this made sense in the end. I did need to change some tests, but this ended up a useful distinction for the case where a manifest file needs to indicate that there is no content. Just needed a bit more documentation :)

So what's happening here is, if you run importcontent without specifying any node_ids, this function gets called with node_ids=None. That does the right thing and matches with all nodes in the channel. There was some funny behaviour before, where if you do importcontent --node_ids='', this function ends up being called with node_ids=[''], which also does the right thing (it matches with no nodes in the channel), but in the wrong way (it only matches because after an exhaustive search we have concluded there's no node with ID ''). But that's great, because it means cleaning this up so None and [] are different doesn't change the behaviour!

So in the end I didn't revert this, but I added an explanatory comment in get_import_export_nodes, and I added some unit tests for the specific cases where node_ids=[] and node_ids=None.

Comment on lines 633 to 636
if not options["directory"] and not options["manifest"]:
raise CommandError(
"Either a directory or a manifest file must be provided."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes impossible to import from network without a manifest. And there's an exception because options["directory"] is not an existing option when importing from network

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spotting! That's fixed in the newest push.

@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch 7 times, most recently from d75f539 to 64f6e0b Compare July 5, 2022 21:14
@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch 3 times, most recently from 7a76392 to 12ffeee Compare July 13, 2022 02:46
If node_ids is set to an empty string, read it as an empty list,
instead of a list with a single empty string.
@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from 12ffeee to 34f0b2e Compare August 15, 2022 17:39
@dylanmccall dylanmccall marked this pull request as ready for review August 15, 2022 17:39
Copy link
Member

@rtibbles rtibbles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any issues on the face of the code, but it would be good to add some test coverage of this new functionality.

kolibri/core/content/management/commands/importcontent.py Outdated Show resolved Hide resolved
@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from 2d52226 to 3772108 Compare September 1, 2022 23:57
@dylanmccall
Copy link
Contributor Author

Okay, I added some unit tests here, both for importcontent remote and importcontent disk with manifest files. I think they cover all the cases I've added here, including the --no-detect-manifest option and manifest files with mismatched channel versions.

@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from 8467a32 to a7a647b Compare September 2, 2022 01:30
This resolves a compatibility issue with Python 2.
@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from a7a647b to 90cff47 Compare September 2, 2022 16:07
Node IDs turn into unicode strings when read from the database, so
we need to explicitly check for unicode strings in assertions from
get_import_export_mock.
@dylanmccall dylanmccall requested a review from rtibbles September 6, 2022 16:07
# Split the comma separated string we get, into a list of strings
type=lambda x: x.split(","),
# Split the comma separated string we get, into a set of strings
type=lambda x: set(x.split(",")) if x else set(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we do this coersion to a set inside the handle_async method instead of here? Currently this introduces a different in behaviour between command line invocations of the management command and calls to call_command - as you can see below in the tests you didn't modify, node_ids and exclude_node_ids are still being set as lists. If the content manifest behaviour in particular is sensitive to this, then this may cause issues.

Copy link
Contributor Author

@dylanmccall dylanmccall Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, sure thing! In the newest commit, the argument is parsed into a list (an empty list if it is given an empty string), and we coerce it into a set inside _transfer(). I'm not 100% sure about doing it there, but it kind of felt right since we clean up some other inputs there as well.

@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from e5b7251 to 063c993 Compare September 7, 2022 22:24
This makes the code easier to test, and better reflects that the order
of node_ids is not significant. Related code in content_manifest already
deals with node_ids as a set.
@dylanmccall dylanmccall force-pushed the importcontent-manifest-file branch from 063c993 to 1ca7065 Compare September 7, 2022 22:46
@rtibbles rtibbles merged commit 4b3f31c into learningequality:develop Sep 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tracking of content resource visibility in exported channels
3 participants