Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync version fails when two version end up with the same slug #4846

Open
humitos opened this issue Nov 1, 2018 · 15 comments
Open

Sync version fails when two version end up with the same slug #4846

humitos opened this issue Nov 1, 2018 · 15 comments
Labels
Accepted Accepted issue on our roadmap Bug A bug Needed: replication Bug replication is required

Comments

@humitos
Copy link
Member

humitos commented Nov 1, 2018

When sync_versions is called if there are two verbose_name that endup with the same slug this code breaks:

https://github.com/rtfd/readthedocs.org/blob/ba7623d2ff83b2c88cecbbe55800317f46116d0d/readthedocs/restapi/utils.py#L76-L82

As example,

In [11]: VersionSlugField(populate_from='').slugify('example/')                                                                                                                               
Out[11]: 'example-'

In [12]: VersionSlugField(populate_from='').slugify('/example/')                                                                                                                              
Out[12]: 'example-'

In [13]: VersionSlugField(populate_from='').slugify('example-')                                                                                                                               
Out[13]: 'example-'

Sentry: https://sentry.io/read-the-docs/readthedocscom/issues/746103998/

@humitos humitos added the Bug A bug label Nov 1, 2018
@humitos humitos added this to the Build stability milestone Nov 1, 2018
@dojutsu-user
Copy link
Member

@humitos
I would like to work on this issue.
My suggestion:
We can modify the slugify method in VersionSlugField in a way so that it generates a unique hash for each slug and append it in last of the generated slug.

These lines

https://github.com/rtfd/readthedocs.org/blob/104c3a625978f31971a11766023516fc2de2aa8f/readthedocs/builds/version_slug.py#L84-L94

can be modified in this way

def slugify(self, content):
        if not content:
            return ''

        hashed = abs(hash(content)) % (10 ** 8)
        slugified = content.lower()
        slugified = self.invalid_chars_re.sub(self.placeholder, slugified)
        slugified = self.leading_punctuation_re.sub('', slugified)
        slugified += f'-{hashed}'

        if not slugified:
            return self.fallback_slug
        return slugified

Result:

>>> VersionSlugField(populate_from='').slugify('example/')
'example--81377426'
>>> VersionSlugField(populate_from='').slugify('/example/')
'example--24251688'
>>> VersionSlugField(populate_from='').slugify('example-')
'example--14556457'
>>> VersionSlugField(populate_from='').slugify('abcd')
'abcd-19060746'

@stsewd
Copy link
Member

stsewd commented Nov 1, 2018

@dojutsu-user
Copy link
Member

@stsewd When i test the test cases manually, they give the wrong result (as described in the issue)

>>> VersionSlugField(populate_from='').slugify('1!0')
'1-0'
>>> VersionSlugField(populate_from='').slugify('1%0')
'1-0'
>>> VersionSlugField(populate_from='').slugify('1?0')
'1-0'

@stsewd
Copy link
Member

stsewd commented Nov 1, 2018

@humitos
Copy link
Member Author

humitos commented Nov 6, 2018

VersionSlugField(populate_from='').slugify('abcd')
'abcd-19060746'

We can't generate slugs like this ones because all the URLs will become very ugly. Example, https://example.readthedocs.io/en/abcd-19060746/ (version) or https://abcd-19060746.readthedocs.io/en/latest/ (project).

@stsewd stsewd added the Needed: replication Bug replication is required label Nov 26, 2018
@stsewd
Copy link
Member

stsewd commented Nov 26, 2018

I can't replicate even with the same verbose name from sentry

        version = Version.objects.create(
            verbose_name='raven/fix-start',
            project=self.pip
        )
        self.assertEqual(version.slug, 'raven-fix-start')

        version = Version.objects.create(
            verbose_name='raven/fix-start',
            project=self.pip
        )
        self.assertEqual(version.slug, 'raven-fix-start_a')

@stsewd
Copy link
Member

stsewd commented Nov 26, 2018

duplicate key value violates unique constraint "builds_version_project_id_xxx_uniq"
DETAIL:  Key (project_id, slug)=(xxx, raven-fix-start) already exists.

I don't think the problem is in the sluggify step

@stsewd
Copy link
Member

stsewd commented Nov 26, 2018

Here is the real problem

https://github.com/rtfd/readthedocs.org/blob/c37b00995c1bbc5ee51d3552ef176546373bb912/readthedocs/builds/models.py#L118-L118

Investigating

But it doesn't make sense anyway, slug should be unique in each project :/

@stale
Copy link

stale bot commented Jan 10, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: stale Issue will be considered inactive soon label Jan 10, 2019
@stsewd
Copy link
Member

stsewd commented Jan 10, 2019

We still need to investigate more here, bot. Replication is needed.

@stale stale bot removed the Status: stale Issue will be considered inactive soon label Jan 10, 2019
@stale
Copy link

stale bot commented Feb 24, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: stale Issue will be considered inactive soon label Feb 24, 2019
@stsewd
Copy link
Member

stsewd commented Feb 25, 2019

Still valid, maybe a race condition in some part of the code

@stale stale bot removed the Status: stale Issue will be considered inactive soon label Feb 25, 2019
@stale
Copy link

stale bot commented Apr 11, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: stale Issue will be considered inactive soon label Apr 11, 2019
@dojutsu-user
Copy link
Member

Still valid bot.

@stale stale bot removed the Status: stale Issue will be considered inactive soon label Apr 11, 2019
@humitos
Copy link
Member Author

humitos commented Apr 27, 2019

We actually call to create_slug which does the uniquifying

@stsewd If I understand correctly, we only call create_slug when bool(slug) == False. That's probably the issue: the uniquification does not happen when creating Version object and passing a slug.

@humitos humitos added the Accepted Accepted issue on our roadmap label Apr 27, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Accepted Accepted issue on our roadmap Bug A bug Needed: replication Bug replication is required
Projects
None yet
Development

No branches or pull requests

3 participants