Skip to content

adricu/openai-api-test

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

 Basic PDF Summarization and Query System

Create a basic script that extracts text from a PDF by chapter, summarizes each chapter, stores these summaries in an SQLite database, and answers questions about these chapters using OpenAI's API.

Running the script

You just need to create an .env file and place there the OPENAPI api key.

To run execute the script:

python -m task

Approach

  • I did the task in the simplest way possible by covering the functionality asked in the assignment
  • Used PyMuPDF library to read and process the PDF
  • Used openai async client
  • Used Chat completions API to generate text summaries and ask questions about them

Assumptions

  • There is an "easy" way to split the pdf file in chapters by executing a regex expression here. It can be tweaked depending of the PDF structure.
  • The PDF to work with is called test.pdf. It can be changed here

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages