pdf #1

xuze1993 · 2019-07-02T06:03:18Z

I've pulled a website with WebHTTrack that is a mix of PDF and HTML files. It seems that localgoogle can only index HTML files. Is there any way to solve this?

kodejuice · 2019-07-02T23:54:22Z

The code could be modified to read PDF documents (with a PDF library) while crawling and index them, but a copy of each file would need to be kept so the user can open it from the search result page. I don't think that's good.

kodejuice · 2019-07-02T23:56:46Z

Alternatively, we could just use the original PDF link in the search results, but if the original file is no longer available, you won't be able to open it.
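
A rough sketch of the two options, not localgoogle's actual code: every name here (`IndexEntry`, `classify`, `build_entry`, the `extractors` mapping) is hypothetical, and the PDF text extractor is left as a plug-in callable because no particular PDF library has been chosen. The search result links to the original URL unless a kept local copy is passed in.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Callable, Dict, Optional
from urllib.parse import urlparse


@dataclass
class IndexEntry:
    link: str  # what the search result page opens
    kind: str  # "html" or "pdf"
    text: str  # extracted text handed to the indexer


def classify(url: str) -> Optional[str]:
    """Guess the document type from the extension of the URL path."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    if suffix in (".html", ".htm"):
        return "html"
    if suffix == ".pdf":
        return "pdf"
    return None


def build_entry(
    url: str,
    raw: bytes,
    extractors: Dict[str, Callable[[bytes], str]],
    local_copy: Optional[str] = None,
) -> Optional[IndexEntry]:
    """Build an index entry for a crawled file, or None if it is unsupported.

    `extractors` maps a kind ("html", "pdf") to a function that turns the raw
    bytes into text; the PDF one would wrap whatever PDF library is used.
    `local_copy` is the path of a kept copy, if the crawler stored one.
    """
    kind = classify(url)
    if kind is None or kind not in extractors:
        return None
    # Option 1: link to the kept copy. Option 2: link to the original URL,
    # which stops working if the original file goes away.
    link = local_copy if local_copy is not None else url
    return IndexEntry(link=link, kind=kind, text=extractors[kind](raw))
```

Keeping the extractor separate from the linking decision means either option can be picked later without touching the crawl loop.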

xuze1993 · 2019-07-03T01:35:46Z

Gotcha, nice work anyway.
Sadly, fewer static sites are left on the web.
