Pdf parser #49

Open: wants to merge 28 commits into base: main

Commits (28):
52f1777 Initial write up of the project objectives (sarmakdvsr, Oct 20, 2021)
c336438 update readme (Nov 9, 2021)
37a5115 Updated proposal (sarmakdvsr, Nov 10, 2021)
eeed1b5 Merge pull request #1 from dkrovi2/update_readme (dkrovi2, Nov 10, 2021)
6091d2d architecture diagram (sarmakdvsr, Nov 10, 2021)
4c28cfb architecture diagram (sarmakdvsr, Nov 10, 2021)
45c5f46 basic parsing-engine codebase setup (sarmakdvsr, Nov 10, 2021)
82fd6e7 progress-report first cut (sarmakdvsr, Nov 13, 2021)
afcf520 progress report (sarmakdvsr, Nov 13, 2021)
2326afe move docs into doc director (sarmakdvsr, Nov 13, 2021)
3b62e75 Intermediate commit (Nov 13, 2021)
e009a59 Move progress report to top-level (sarmakdvsr, Nov 14, 2021)
fbf8de3 Basic search implementation (Nov 15, 2021)
6d5793c sample input change (Nov 19, 2021)
49fac21 Merge branch 'main' into Doc_Scoring (Nov 19, 2021)
1a87078 word parsing (Nov 19, 2021)
6e3c26f Merge pull request #3 from dkrovi2/word_parsing (sidmeister, Nov 19, 2021)
5c71d61 ignore out directory (sarmakdvsr, Nov 20, 2021)
03bfcb3 ignore build artifact dirs. Add sample inputs (sarmakdvsr, Nov 20, 2021)
42eca39 PDF parsing added and some code refactoring (sarmakdvsr, Nov 21, 2021)
19eb9e1 Scoring implementation (Nov 27, 2021)
45df681 Merge branch 'main' into Doc_Scoring (Nov 27, 2021)
f71b97a Refined scoring implementation (Nov 27, 2021)
7b92d95 Merge pull request #2 from dkrovi2/Doc_Scoring (saxenaj, Nov 27, 2021)
a5eef23 Sample json file (Nov 27, 2021)
3c06bdb Merge pull request #4 from dkrovi2/Doc_Scoring (saxenaj, Nov 27, 2021)
64fb859 Merge remote-tracking branch 'origin/main' into pdf-parser (sarmakdvsr, Nov 27, 2021)
9ff7c38 pdf and doc parsing -> extract skills (sarmakdvsr, Nov 28, 2021)

progress report
sarmakdvsr committed Nov 13, 2021
commit afcf520910a7fbe8179430a94cf7467b8eb462fd
20 changes: 3 additions & 17 deletions progress-report.md
@@ -8,22 +8,7 @@

## High-level Tasks

-The following are the key milestones identified for this project:
-
-| Task | Time needed | ETA |
-|:--------------------------------------------------------------|-------------:|----------------:|
-| Gather representative data set for training and evaluation | 8 hours | Nov 8 |
-| Parsing engine to parse resumes and job descriptions | 20 hours | Nov 15 |
-| Progress report | 2 hours | Nov 15 |
-| Analysis engine to analyze resumes | 30 hours | Nov 22 |
-| Scoring engine to match resumes to provided job description | 30 hours | Nov 29 |
-| Basic UI to search for resumes matching a job description | 24 hours | Dec 5 |
-| Software documentation | 8 hours | Dec 9 |
-| **Total** |**122 hours** | |
-
-Please use the the following architecture diagram as a reference for the discussion below:
-
-![architecture-diagram](doc/architecture-diagram.png)
+Please refer to the high-level tasks identified for this project in the [proposal document](https://github.com/dkrovi2/CourseProject/blob/main/proposal.md#please-justify-that-the-workload-of-your-topic-is-at-least-20--n-hours-n-being-the-total-number-of-students-in-your-team-you-may-list-the-main-tasks-to-be-completed-and-the-estimated-time-cost-for-each-task).

## Tasks Completed

@@ -57,5 +42,6 @@ The following tasks are pending start:

## Current Challenges

-* A thorough analysis on how to give more weight to a skill that occurs less no of times in document, but the candidate worked on those skills for multiple years.
+* A thorough analysis on how to give more weight to a skill that occurs less number of times in a resume, but the candidate worked on those skills for multiple years.
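
One possible way to frame this challenge, offered as a rough sketch and not as the project's scoring code: dampen the raw mention count with a logarithm, then multiply it by a boost that grows with years of experience. The function name `skill_weight`, the `alpha` parameter, and the sample resume data are all hypothetical and do not come from the repository.

```python
import math


def skill_weight(mentions: int, years: float, alpha: float = 1.0) -> float:
    """Hypothetical weight for one skill in one resume.

    mentions: how many times the skill appears in the resume text
    years:    years of experience with the skill (assumed to be extractable)
    alpha:    assumed tuning knob for how strongly experience counts
    """
    if mentions <= 0:
        return 0.0
    # Sublinear term frequency: the 5th mention adds less than the 1st.
    tf = 1.0 + math.log(mentions)
    # Experience boost: 1.0 at zero years, growing slowly afterwards.
    boost = 1.0 + alpha * math.log1p(max(years, 0.0))
    return tf * boost


# Made-up data for illustration: skill -> (mentions, years of experience)
resume = {"java": (2, 6.0), "python": (5, 0.5)}
weights = {skill: skill_weight(m, y) for skill, (m, y) in resume.items()}
```

With these made-up numbers, `java` (2 mentions, 6 years) ends up with a higher weight than `python` (5 mentions, half a year), which is the behavior the challenge asks for. Whether a logarithmic boost is the right shape, and what value of `alpha` is sensible, would need to be checked against real resumes.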


Binary file added progress-report.pdf
Binary file not shown.