Add whitespace between text from different pages
pcheng17 committed Mar 27, 2023
1 parent ea32c2a commit 53943ef
Showing 1 changed file with 1 addition and 3 deletions.
4 changes: 1 addition & 3 deletions services/file.py
@@ -45,9 +45,7 @@ def extract_text_from_file(file: BufferedReader, mimetype: str) -> str:
     if mimetype == "application/pdf":
         # Extract text from pdf using PyPDF2
         reader = PdfReader(file)
-        extracted_text = ""
-        for page in reader.pages:
-            extracted_text += page.extract_text()
+        extracted_text = " ".join([page.extract_text() for page in reader.pages])
     elif mimetype == "text/plain" or mimetype == "text/markdown":
         # Read text from plain text file
         extracted_text = file.read().decode("utf-8")
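
The effect of the change can be illustrated without a real PDF. The sketch below is not part of the commit: `FakePage` is a hypothetical stand-in that mimics only the `extract_text()` method of a PyPDF2 page, and the page texts are made up. It compares the old concatenation loop with the new `" ".join(...)` call.

```python
class FakePage:
    """Hypothetical stand-in for a PyPDF2 page; only extract_text() is mimicked."""

    def __init__(self, text: str) -> None:
        self._text = text

    def extract_text(self) -> str:
        return self._text


# Made-up page contents, for illustration only.
pages = [FakePage("end of page one"), FakePage("start of page two")]

# Old behaviour: plain concatenation fuses the last word of one page
# with the first word of the next.
concatenated = ""
for page in pages:
    concatenated += page.extract_text()

# New behaviour: a single space separates text from different pages.
joined = " ".join(page.extract_text() for page in pages)

print(concatenated)  # end of page onestart of page two
print(joined)        # end of page one start of page two
```

One consequence worth noting: if a page yields an empty string, the join still emits its separator, so consecutive spaces can appear in the result. Whether that matters depends on how the text is processed downstream.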
