Skip to content

Add api endpoint to duplicate document #570

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 24, 2025

Conversation

sampaccoud
Copy link
Member

@sampaccoud sampaccoud commented Jan 21, 2025

Purpose

We want users to be able to duplicate a document.

resolves #335 and #567

Proposal

  • Refactor the way access control is done on attachment files to play well with document duplication
  • Add a "duplicate" action to the document API endpoint.
  • Extract attachment keys from content when a document is saved, and add them to the list of readable attachments on the current document but only if they were already readable on another document by the editing user...

While doing this work I was brought to refactor the way the media-auth and collaboration-auth actions were factorizing code and add missing tests for these actions.

@sampaccoud sampaccoud self-assigned this Jan 21, 2025
@sampaccoud sampaccoud force-pushed the add-api-endpoint-to-duplicate-document branch from a138321 to 95241cd Compare January 28, 2025 07:44
@sampaccoud sampaccoud requested a review from lunika January 28, 2025 07:44
@sampaccoud
Copy link
Member Author

@lunika done implementing your feedbacks 🙏

Copy link
Collaborator

@AntoLC AntoLC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have existing data the last migration does not succeed.

I got a bit confused about the with_accesses as a query string instead of from the payload.
Except that works well! I used it frontend side, I challenged it a bit, works like excepted.

from core import enums


def base64_yjs_to_xml(base64_string):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see it used ^^

@sampaccoud sampaccoud force-pushed the add-api-endpoint-to-duplicate-document branch from bddfd20 to 88c1cbb Compare February 28, 2025 08:18
@AntoLC AntoLC self-requested a review February 28, 2025 09:14
Copy link
Collaborator

@AntoLC AntoLC left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't retry all the process, but the migration works fine now.

"""Extract text from base64 yjs document."""

blocknote_structure = base64_yjs_to_xml(base64_string)
soup = BeautifulSoup(blocknote_structure, "html.parser")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look if the lxml parser can be used. It seems faster.

@lunika lunika self-assigned this Mar 18, 2025
@sampaccoud sampaccoud force-pushed the add-api-endpoint-to-duplicate-document branch from 88c1cbb to 5fa6784 Compare March 23, 2025 21:50
@sampaccoud sampaccoud requested a review from lunika March 23, 2025 21:56
@sampaccoud
Copy link
Member Author

@lunika ready for final call.

@lunika lunika force-pushed the add-api-endpoint-to-duplicate-document branch from 5fa6784 to d4b5552 Compare March 24, 2025 09:27
@lunika lunika force-pushed the add-api-endpoint-to-duplicate-document branch from d4b5552 to 8fd1ec4 Compare March 24, 2025 09:31
Documents content is stored in the Ydoc format. We need a util
to extract it as xml/text.
These 2 actions had factorized code but a few iterations lead to
spaghetti code where factorized code includes "if" clauses.

Refactor abstractions so that code factorization really works.
Tests were forgotten. While writing the tests, I fixed
a few edge cases like the possibility to connect to the
collaboration server for an anonymous user.
These methods were involved in a bug that was fixed without first
evidencing the error in a test:
#556

Fixes #567
Migration tests should not import and use factories or models
directly from the code because they would not be in sync with
the database in the state that each state needs to test it.

Instead the migrator object passed as argument allows us to
retrieve a minimal version of the models in sync with the state
of the database that we are testing. What we get is a minimal
model and we need to simulate all the methods that we could have
on the real model and that are needed for testing.
We took this opportunity to refactor the way access is controlled on
media attachments. We now add the media key to a list on the document
instance each time a media is uploaded to a document. This list is
passed along when a document is duplicated, allowing us to grant
access to readers on the new document, even if they don't have or
lost access to the original document.

We also propose an option to reproduce the same access rights on the
duplicate document as what was in place on the original document.
This can be requested by passing the "with_accesses=true" option in
the query string.

The tricky point is that we need to extract attachment keys from the
existing documents and set them on the new "attachments" field that is
now used to track access rights on media files.
We can't prevent document editors from copy/pasting content to from one
document to another. The problem is that copying content, will copy the
urls pointing to attachments but if we don't do anything, the reader of
the document to which the content is being pasted, may not be allowed to
access the attachment files from the original document.

Using the work from the previous commit, we can grant access to the readers
of the target document by extracting the attachment keys from the content and
adding themto the target document's "attachments" field. Before doing this,
we check that the current user can indeed access the attachment files extracted
from the content and that they are allowed to edit the current document.
The idea behind wrapping choices in `lazy` function was to allow
overriding the list of languages in tests with `override_settings`.
This was causin makemigrations to keep on including the field in
migrations when it is not needed. Since we finally don't override
the LANGUAGES setting in tests, we can remove it to fix the problem.
@lunika lunika force-pushed the add-api-endpoint-to-duplicate-document branch from 8fd1ec4 to ff8b7e4 Compare March 24, 2025 09:32
@lunika lunika enabled auto-merge (rebase) March 24, 2025 09:32
@lunika lunika merged commit c882f13 into main Mar 24, 2025
25 of 26 checks passed
@lunika lunika deleted the add-api-endpoint-to-duplicate-document branch March 24, 2025 09:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duplicate docs
4 participants