Does it support inserting Chinese text? #329
Comments
Yes it does! And it works for insertText() as well as for insertTextbox().
For a recent discussion (an example using a Thai font) see #319.
PyMuPDF comes with built-in fonts for traditional and simplified Chinese. Use:
fontname="china-s" or fontname="china-ss" for simplified Chinese
fontname="china-t" or fontname="china-ts" for traditional Chinese
Using these means your PDF will not need or contain extra fonts or font files.
If you want to use a special font, however, you can also do this. You must then choose a fontname different from all of the above and also specify the filename of a font file on your system. |
Hi, first, thank you for your reply! But it does not work. As the following picture shows, the red circle marks where my Chinese text should be, but it is missing.
|
There was no picture ... Please let me see your script. Here is my example:
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>> import fitz
>>> doc = fitz.open()
>>> page = doc.newPage()
>>> text = "你好!hello!Hallo! 我很喜欢德国!德国是个好地方!"
>>> page.insertText((100,100), text, fontname="china-ss")
1
>>> doc.save("test.pdf")
>>>
It leads to this PDF: test.pdf |
My code is as follows:
'''
import datetime
import uuid

import fitz
import requests

# get_qiniu_url() and TEMP_DIR are defined elsewhere in my project
def write_text_to_pdf(pdf_key, position_x, position_y, page_num, insert_text=""):
    pdf_url = get_qiniu_url(pdf_key)
    pdf_content = requests.get(pdf_url).content
    doc = fitz.open("pdf", pdf_content)  # open the PDF from memory
    page = doc[page_num - 1]  # page_num is 1-based
    page_height = page.rect.height
    p = fitz.Point(position_x, page_height - position_y)  # start point of 1st line
    if not insert_text:
        insert_text = datetime.date.today().strftime("%Y-%m-%d")
    page.insertText(p,
                    insert_text,
                    fontname="china-ss",
                    fontsize=14,
                    rotate=0,
                    )
    tem_name = str(uuid.uuid4()).replace('-', '') + ".pdf"
    pdf_path = TEMP_DIR + tem_name
    doc.save(pdf_path)
'''
When insert_text="2018年11月12日", the Chinese characters (年 月 日) are missing, as the attachment shows.
|
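The `page_height - position_y` step in the function above converts a y coordinate measured from the bottom of the page into PyMuPDF's coordinates, where the origin is the top-left corner and y grows downward. A minimal sketch of that conversion as a standalone helper; the name `to_top_left` is hypothetical and not part of PyMuPDF:

```python
def to_top_left(x, y_from_bottom, page_height):
    """Convert a point measured from the page's bottom-left corner
    into one measured from the top-left corner (y grows downward)."""
    return (x, page_height - y_from_bottom)


# An A4 page is 842 points high; take a point 100 pt above the bottom edge.
print(to_top_left(72, 100, 842))
```

If the text lands in the wrong vertical position, checking this conversion first is usually cheaper than debugging the font.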
You forgot to attach the PDF again. The code looks okay so far.
|
The file you sent was just an image, not a PDF, so I am unable to track down what happened. I hope you have tried the script I sent you. What were the results? |
I have tried your script, as follows: |
Finally we are making progress!
# -*- coding: utf-8 -*-
import fitz
doc = fitz.open() # new PDF
text = u"你好!hello!Hallo! 我很喜欢德国!德国是个好地方"
pnt = fitz.Point(50, 72) # start point of text insertion
page = doc.newPage() # create a page
page.insertText(pnt, text, fontname="china-t")
doc.save(__file__ + ".pdf")
The resulting PDF was correct. |
good to see it works now. |
Could you give some hints about a nicer font for mixed text?
But there is no 'cjk' font in the latest version of PyMuPDF now. What kind of font file did you use for mixed text? For example, a mix of Japanese and English. |
Why do you say that? Of course there is! It is
>>> import fitz
>>> font=fitz.Font("cjk")
>>> font.name
'Droid Sans Fallback Regular'
>>> |
oh, sorry for that. |
These methods
font=fitz.Font("cjk")
page.insert_font(fontname="F0", fontbuffer=font.buffer)
page.insert_text(..., fontname="F0",...) |
Thank you very much! This saved my day! |
ok, I'll see what I can do. |
Hello! Is there a way to insert Chinese text using
worked, file was Ok, but I want to use textbox to ensure the right positioning: So I did:
(I downloaded a fontfile from here: https://fonts.google.com/specimen/ZCOOL+XiaoWei?subset=chinese-simplified#standard-styles)
Also I tried
It doesn't give me any errors, but the output file is blank. |
If you would try
font = fitz.Font(fontfile="...")
page=...
tw = fitz.TextWriter(page.rect)
tw.fill_textbox(...)
tw.write_text(page)
For a TTF font, this should deliver the best results. |
Thank you for the snippet. I followed your advice and had a problem with this line
ValueError: Text must start in rectangle. |
Actually, the mistake was that I tried to insert text with a fontsize bigger than the rect.
It works perfectly when I provide a smaller fontsize. |
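The fix described here, choosing a fontsize that fits the rectangle, can be guarded with a rough pre-check before filling the textbox. This is a sketch only: the helper names and the line-height factor of 1.2 are assumptions made for illustration, and the thread does not state the exact rule PyMuPDF applies.

```python
def fits_first_line(rect_height, fontsize, line_factor=1.2):
    """Rough check that one line of text at `fontsize` fits into a
    rectangle `rect_height` points high. `line_factor` is an assumed
    ratio of line height to fontsize, not a PyMuPDF constant."""
    return fontsize * line_factor <= rect_height


def shrink_to_fit(rect_height, fontsize, line_factor=1.2):
    """Return `fontsize`, reduced if a single line would not fit."""
    return min(fontsize, rect_height / line_factor)


print(fits_first_line(20, 40))   # a 40 pt font in a 20 pt high box
print(shrink_to_fit(24, 40))     # largest size that still passes the check
```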
My bad, should have looked up the right call pattern again first: |
This method |
I have another question about extracting a font's buffer to a file, such as example.ttf. |
The
fontfile = open(f"myfont.{ext}", "wb")
fontfile.write(buffer)
fontfile.close()
But please note that in many (if not most) PDFs, not the complete font is embedded, but only those characters of the font that are actually used inside the PDF. |
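The write-to-file step can be wrapped in a small helper. A sketch, assuming `doc.extract_font(xref)` returns a `(name, ext, type, buffer)` tuple and that fonts which cannot be extracted come back with an empty buffer or the extension "n/a"; the helper name `save_font` and the demo tuples are hypothetical stand-ins, not real font data.

```python
import os
import tempfile


def save_font(font_info, out_dir):
    """Write an extracted font to `out_dir`.

    `font_info` is assumed to be a (name, ext, type, buffer) tuple.
    Returns the path written, or None if there is nothing to write."""
    name, ext, _type, buffer = font_info
    if not buffer or ext == "n/a":
        return None  # font is not embedded or not extractable
    # Keep the file name simple: replace "+" (subset prefix) and "/"
    safe_name = name.replace("+", "_").replace("/", "_")
    path = os.path.join(out_dir, f"{safe_name}.{ext}")
    with open(path, "wb") as fontfile:
        fontfile.write(buffer)
    return path


# Demo with stand-in tuples (two dummy bytes, not a real font)
out_dir = tempfile.mkdtemp()
demo_path = save_font(("BCDEEE+SimSun", "ttf", "TrueType", b"\x00\x01"), out_dir)
skipped = save_font(("Helvetica", "n/a", "Type1", b""), out_dir)
```

Using `with open(...)` closes the file even if the write fails, which the explicit `close()` call does not guarantee.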
@JorjMcKie Thanks a lot. I use doc.extract_font to get all font families of a PDF as follows:
But I don't know the meaning of 'BCDEEE\BCDFEE\BCDGEE\BCDHEE\BCDIEE'. I get the font info from the span dictionaries of page.get_text('dict'), which only show SimSun, ArialMT, Calibri. |
A prefix of 6 uppercase ASCII letters followed by a "+" indicates a font subset: this is not the complete "SimSun.ttf", for example. |
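To tell subset fonts apart from complete ones in code, the prefix can be split off with a regular expression. A minimal sketch; the function name `split_subset_prefix` is hypothetical and not part of PyMuPDF.

```python
import re

# Six uppercase ASCII letters, a "+", then the actual font name
SUBSET_RE = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_prefix(fontname):
    """Return (prefix, basename); prefix is None for non-subset names."""
    match = SUBSET_RE.match(fontname)
    if match:
        return match.group(1), match.group(2)
    return None, fontname


print(split_subset_prefix("BCDEEE+SimSun"))
print(split_subset_prefix("ArialMT"))
```

Names with the same base name but different prefixes, such as BCDEEE+SimSun and BCDFEE+SimSun, are separate subsets of the same font.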
If it is supported, how should I set the 'encoding' when I use the 'page.insertText()' method? Or is any other method recommended? Thank you!