Skip to content

Replace SplitStringAtByteN with EllipsisStringWholeWord #31822

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

KN4CK3R
Copy link
Member

@KN4CK3R KN4CK3R commented Aug 12, 2024

Fixes #31780

The fix is in services/feed/action.go, the other changed places are SplitStringAtByteN call which discard the remainer.

This does only fix new texts. Existing action feed entries are not altered.

@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Aug 12, 2024
@github-actions github-actions bot added modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code labels Aug 12, 2024
steps[i] = &ActionTaskStep{
Name: name,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above

runner := &actions_model.ActionRunner{
UUID: gouuid.New().String(),
Name: name,
Name: base.EllipsisStringWholeWord(req.Msg.Name, 255),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we changing an API response in the first place?
Let's leave it to the caller to handle that case.

}
}
act.Content = fmt.Sprintf("%d|%s", issue.Index, truncatedContent)
act.Content = fmt.Sprintf("%d|%s", issue.Index, base.EllipsisStringWholeWord(comment.Content, 200))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above

@wolfogre
Copy link
Member

Now I see the whole story.

If this PR is to fix #31780, I agree with @delvh, it should let the frontend code to handle it.

And some SplitStringAtByteN are used for protecting the database, it's not related to #31780, so it doesn't really need a better formatted string, SplitStringAtByteN is good enough.

assert.Equal(t, "foo bar…", EllipsisStringWholeWord("foo bar foo", 10))
assert.Equal(t, "foo bar foo", EllipsisStringWholeWord("foo bar foo", 11))

assert.Equal(t, "测试文本…", EllipsisStringWholeWord("测试文本一二三四", 4))
Copy link
Contributor

@wxiaoguang wxiaoguang Aug 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The truncation should be done at "byte" level, but not "rune" level.

The problem of "rune" level is that CJK chars will be too much longer than expected.

assert.Equal(t, "foo…", EllipsisStringWholeWord("foo bar", 4))
assert.Equal(t, "测试文本…", EllipsisStringWholeWord("测试文本一二三四", 4))

See the difference. That's why the old function is "SplitStringAtByteN"

(History: #17928 (review) )

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it was intended to be at rune level because EllipsisString works on rune level too. Didn't thought about the database restriction here because I would expect a hard cut.

Copy link
Contributor

@wxiaoguang wxiaoguang Aug 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there is usually no "database restriction", because VARCHAR(10) could store 10 CJK runes or emoji runes (say, 30 - 40bytes) if its charset/collation is UTF-8 based.

I think the real problem is that CJK chars render much longer than ASCII chars with the same rune length on the UI ......... for example, assume one ASCII char's width is about W.

  • 9 CJK runes: about W * 9 * 2 = W * 18
  • 9 bytes CJK chars: W * 3 * 2 = W * 6
  • 9 bytes ASCII chars: W * 9

So "9 CJK runes" exceeds the width limitation too much on the UI, while usually "9 bytes CJK chars" won't exceed the expected "9 bytes ASCII" width.


(I don't mean blocking for this change, just share the backgrounds, maybe it's worth to document the details no matter what method is finally chosen)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> Refactor "string truncate" #32984

@wxiaoguang
Copy link
Contributor

-> Do not render truncated links in markdown #32980

@wxiaoguang
Copy link
Contributor

@wxiaoguang wxiaoguang closed this Dec 25, 2024
@go-gitea go-gitea locked as resolved and limited conversation to collaborators Mar 25, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. modifies/api This PR adds API routes or modifies them modifies/go Pull requests that update Go code type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Links on dashboard summary are malformed
5 participants