adamziel
Contributor

🚧 Work in progress 🚧

This PR migrates the initial author detection from a batch operation, which requires loading the entire WXR file into memory, to a streaming operation that consumes the data one chunk at a time using the streaming importer class.

The next step is to make the rest of the import pipeline work with streaming, which would enable pausing and resuming and avoid backtracking.
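
To illustrate the idea of chunked author detection, here is a minimal Python sketch. It is not the code in this PR and does not use the streaming importer class. The `detect_authors` function, the `SAMPLE` document, and the chunk size are hypothetical names and values chosen for illustration, and the sketch assumes the WXR 1.2 namespace and that each `wp:author` element carries a `wp:author_login` child.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the export uses the WXR 1.2 namespace.
WP_NS = "http://wordpress.org/export/1.2/"


def detect_authors(stream, chunk_size=8192):
    """Yield author logins from a WXR byte stream, one chunk at a time.

    Hypothetical helper for illustration; only `chunk_size` bytes are
    read from the stream per iteration.
    """
    parser = ET.XMLPullParser(events=("end",))
    author_tag = f"{{{WP_NS}}}author"
    login_tag = f"{{{WP_NS}}}author_login"

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Feed the next chunk; the parser keeps its state between calls,
        # so elements split across chunk boundaries are handled.
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == author_tag:
                login = elem.findtext(login_tag)
                if login:
                    yield login
                # Drop the author's children once they have been read.
                elem.clear()


# Tiny made-up document standing in for a real export file.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:wp="http://wordpress.org/export/1.2/">
<channel>
<wp:author><wp:author_login>alice</wp:author_login></wp:author>
<wp:author><wp:author_login>bob</wp:author_login></wp:author>
<item><title>Hello</title></item>
</channel>
</rss>"""

if __name__ == "__main__":
    # A deliberately small chunk size forces elements to span chunks.
    for login in detect_authors(io.BytesIO(SAMPLE), chunk_size=16):
        print(login)
```

Because the function is a generator, a caller can stop iterating at any point and keep the stream position, which is the property that pausing and resuming would build on.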
