nitinkumar30/imageFileToPdf

 
 

JPG to PDF converter

A small framework for converting an image file to a PDF file.
It uses just four Python packages.

Working

  1. First, it searches for text inside the image file (typically a .jpg or .PNG file).
  2. Then, it extracts the text and stores it in a variable.
  3. Next, it writes the text into a docx file (so that if the wrong text is extracted, you can manually type the correct text).
  4. Finally, it converts the resulting Word file into a PDF file.
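
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `plan_outputs` and `convert_image` are hypothetical, and writing the .docx and .pdf files next to the image is an assumption. It relies on `pytesseract.image_to_string`, python-docx's `Document`, and `docx2pdf.convert`.

```python
from pathlib import Path


def plan_outputs(image_path):
    """Return the .docx and .pdf paths that sit next to the given image."""
    image_path = Path(image_path)
    return image_path.with_suffix(".docx"), image_path.with_suffix(".pdf")


def convert_image(image_path):
    """Run the four steps for one image: OCR, store text, write docx, convert."""
    # Third-party imports are kept local so plan_outputs works without them.
    from PIL import Image
    import pytesseract
    from docx import Document
    from docx2pdf import convert

    docx_path, pdf_path = plan_outputs(image_path)

    # Steps 1 and 2: find the text in the image and keep it in a variable.
    text = pytesseract.image_to_string(Image.open(image_path))
    # Tesseract may append a form-feed character, which Word XML rejects.
    text = text.replace("\x0c", "")

    # Step 3: write the text to a Word file that can be corrected by hand.
    document = Document()
    document.add_paragraph(text)
    document.save(str(docx_path))

    # Step 4: convert the Word file to PDF.
    convert(str(docx_path), str(pdf_path))
    return pdf_path
```

Calling `convert_image("imgFiles/img1.PNG")` would then be expected to leave `img1.docx` and `img1.pdf` in the same directory, provided Tesseract and the packages are installed.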

Requirements

  1. Download and install Tesseract.
  2. Install the following Python packages:
    1. PIL (installed as Pillow)
    2. pytesseract
    3. docx (installed as python-docx)
    4. docx2pdf
  3. An IDE (PyCharm Community Edition preferable)
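
Before running the converter, it can help to confirm that everything listed above is actually available. The check below is a small sketch using only the standard library; the helper name `missing_requirements` is hypothetical, and it assumes the Tesseract executable is reachable on PATH under the name `tesseract`.

```python
import importlib.util
import shutil

# Import names listed in the requirements above.
REQUIRED_MODULES = ("PIL", "pytesseract", "docx", "docx2pdf")


def missing_requirements(modules=REQUIRED_MODULES, executable="tesseract"):
    """Return the names of modules and executables that cannot be found."""
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(executable) is None:
        missing.append(executable)
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("Missing:", ", ".join(problems) if problems else "nothing")
```

An empty list means all four packages can be imported and Tesseract was found on PATH.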

How to run

  1. Open the main file.
  2. Check the following lines in the code:
text = extractTextFromImg(i) # extract text from the image file
appendDataToWord(j, text) # append the text to the Word file
convertDocxToPdf(j, k) # convert the Word file to PDF
  3. You don't need to change anything in this file. All changes go in the variables file.
  4. Change the value of the pattern variable in variables:
    1. pattern = str(i)
  5. Set the noOfFiles variable to the total number of files inside the imgFiles directory:
    1. noOfFiles = 3 # Put the total number of files present.
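
To show how noOfFiles and pattern might drive the main loop, here is a rough sketch. It is an illustration under assumptions, not the repository's code: the helper `build_jobs` is hypothetical, numbering is assumed to start at 1, the `.PNG` extension and `img` prefix are example values, and the output files are assumed to sit next to each image.

```python
from pathlib import Path

noOfFiles = 3  # total number of image files in imgFiles
IMG_DIR = Path("imgFiles")


def build_jobs(no_of_files, prefix="img", extension=".PNG", img_dir=IMG_DIR):
    """Return (image, docx, pdf) path triples for files numbered 1..no_of_files."""
    jobs = []
    for i in range(1, no_of_files + 1):
        pattern = prefix + str(i)  # e.g. 'img1'
        image = img_dir / (pattern + extension)
        jobs.append((image, image.with_suffix(".docx"), image.with_suffix(".pdf")))
    return jobs


for image, docx_file, pdf_file in build_jobs(noOfFiles):
    # In the main file, paths like these would be passed to
    # extractTextFromImg, appendDataToWord and convertDocxToPdf.
    print(image, "->", docx_file, "->", pdf_file)
```

Keeping the path construction in one function means that only noOfFiles and the pattern pieces need to change when the image set changes.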

How to change the pattern

  1. pattern describes how the image files are named in the directory.
  2. Think of it as the naming sequence the filenames follow.
  3. For example, if the filename is img1.PNG, put pattern = 'img' +str(i)+
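
A quick way to confirm that a pattern matches the files on disk is to generate the expected names and see which are missing. The sketch below assumes names of the form prefix, then index, then extension (for example img1.PNG) with indices starting at 1; the helper `missing_images` is hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_images(img_dir, no_of_files, prefix="img", extension=".PNG"):
    """Return the expected file names that are absent from img_dir."""
    img_dir = Path(img_dir)
    expected = [prefix + str(i) + extension for i in range(1, no_of_files + 1)]
    return [name for name in expected if not (img_dir / name).is_file()]


if __name__ == "__main__":
    # Lists any names the pattern expects but the directory does not contain.
    print(missing_images("imgFiles", 3))
```

If the returned list is not empty, either the pattern or the noOfFiles value likely does not match the contents of the directory.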

Author

Nitin Kumar

About

Created framework to convert an image file into word then to pdf file.
