Commit 560890f

fix: limit chapter title length to 256 characters in pdf_split_handle.py
--bug=1054363 --user=刘瑞斌 [Knowledge base] When importing a PDF document, an overly long segment title is not automatically truncated https://www.tapd.cn/57709429/s/1681044
1 parent 675adee commit 560890f

1 file changed, +4 -3 lines changed

apps/common/handle/impl/pdf_split_handle.py

Lines changed: 4 additions & 3 deletions
@@ -173,14 +173,15 @@ def handle_toc(doc, limit):
 
     # Null characters are not allowed.
     chapter_text = chapter_text.replace('\0', '')
-
+    # Limit the title length
+    real_chapter_title = chapter_title[:256]
     # Limit the chapter content length
     if 0 < limit < len(chapter_text):
         split_text = PdfSplitHandle.split_text(chapter_text, limit)
         for text in split_text:
-            chapters.append({"title": chapter_title, "content": text})
+            chapters.append({"title": real_chapter_title, "content": text})
     else:
-        chapters.append({"title": chapter_title, "content": chapter_text if chapter_text else chapter_title})
+        chapters.append({"title": real_chapter_title, "content": chapter_text if chapter_text else real_chapter_title})
     # Save the chapter content and chapter title
 return chapters
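
The effect of the change can be checked in isolation with a minimal sketch. It assumes a simplified fixed-width `split_text` helper standing in for `PdfSplitHandle.split_text`, whose real implementation is not part of this diff; `build_chapters` and `MAX_TITLE_LENGTH` are hypothetical names used only for illustration.

```python
MAX_TITLE_LENGTH = 256  # the cap applied by chapter_title[:256] in this commit


def split_text(text, limit):
    # Hypothetical stand-in for PdfSplitHandle.split_text (not shown in the diff):
    # cut the text into consecutive pieces of at most `limit` characters.
    return [text[i:i + limit] for i in range(0, len(text), limit)]


def build_chapters(chapter_title, chapter_text, limit):
    # Truncate the title once, then reuse it in every branch, as the patch does.
    real_chapter_title = chapter_title[:MAX_TITLE_LENGTH]
    chapters = []
    if 0 < limit < len(chapter_text):
        # Long chapter: every piece carries the same truncated title.
        for text in split_text(chapter_text, limit):
            chapters.append({"title": real_chapter_title, "content": text})
    else:
        # Empty chapter text falls back to the truncated title as content.
        content = chapter_text if chapter_text else real_chapter_title
        chapters.append({"title": real_chapter_title, "content": content})
    return chapters


long_title = "T" * 300
chunks = build_chapters(long_title, "abcdefghij", limit=4)
empty = build_chapters(long_title, "", limit=4)
```

Note that slicing a Python 3 `str` counts Unicode code points, not bytes, so a 256-character title containing multi-byte characters can still exceed 256 bytes when encoded. The fallback branch also uses the truncated title as content, so an empty chapter never stores the over-long original.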