Commit 0734dbd

Update and rename Readme.md to README.md
1 parent 04ceda0 commit 0734dbd

1 file changed (+12, -4 lines changed)
  • BasicPythonScripts/Text Extractor from PDF

BasicPythonScripts/Text Extractor from PDF/Readme.md renamed to BasicPythonScripts/Text Extractor from PDF/README.md

@@ -3,15 +3,18 @@
 ## Modules Required
 - PyPDF2 - It is used in Python for PDF related operations
 
-## How it Works?
+## AIM
+To build a Python Script using the PyPDF2 Module which can extract text from a PDF file.
+
+## COMPILATION STEPS
 * Import PyPDF2 Module to:-
 * Read the pdf into the program to further manipulate it
 * Count the number of pages in the PDF
 * Extract the text from a single PDF page
-* Initialize an empty string
-* A for loop parses through each page
+* Initialize an empty string which will store the text being extracted from the PDF file
+* A for loop is made to parse through each page
 * The extractText() function is used to extract text from the parsed PDF page
-* The extracted text is added to the emptry string initialized
+* The extracted text is added to the empty string initialized using simple string concatenation
 * After parsing is done, the string in which the extracted text is stored is written in a new file named **extracted_text.txt** using basic File Handling in Python
 
 ## PDF FILE WITH TEXT
@@ -32,3 +35,8 @@ To know more: [PyPDF2 Docs](https://pythonhosted.org/PyPDF2/)
 ## File handling
 Python has some inbuilt methods to handle files and perform operations like reading and writing.
 Read about them: [File Handling Docs](https://www.geeksforgeeks.org/reading-writing-text-files-python/)
+
+## Author
+
+- [@Sakalya Mitra](https://github.com/Sakalya100)
+

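The compilation steps in the README diff above can be sketched roughly as follows. This is a minimal illustration, not the script from the repository. It assumes PyPDF2 2.0 or later, where the reader class is `PdfReader` and the page method is `extract_text()`; the README refers to the older `extractText()` name. The function names `join_page_text` and `extract_pdf_to_file` and the input file `sample.pdf` are hypothetical.

```python
def join_page_text(pages):
    """Concatenate the text of every page into a single string."""
    text = ""  # empty string that accumulates the extracted text
    for page in pages:
        # extract_text() can return None or "" for pages without a text
        # layer (for example scanned images), so fall back to "".
        text += page.extract_text() or ""
    return text


def extract_pdf_to_file(pdf_path, out_path="extracted_text.txt"):
    """Read a PDF, extract the text of all pages, and write it to out_path."""
    # Imported inside the function so join_page_text stays usable
    # even when PyPDF2 is not installed. Assumes PyPDF2 >= 2.0.
    from PyPDF2 import PdfReader

    reader = PdfReader(pdf_path)
    print(f"Number of pages: {len(reader.pages)}")

    text = join_page_text(reader.pages)

    # Basic file handling: the context manager closes the file on exit.
    with open(out_path, "w", encoding="utf-8") as out_file:
        out_file.write(text)
    return text


# Hypothetical usage, assuming a file named sample.pdf exists:
# extract_pdf_to_file("sample.pdf")
```

Repeated string concatenation matches the approach the README describes. For very large PDFs, collecting the page texts in a list and calling `"".join(...)` once may be a reasonable alternative.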