Skip to content

Commit

Permalink
Short description from the article.
Browse files Browse the repository at this point in the history
  • Loading branch information
srogatch authored Sep 1, 2018
1 parent c38b6f9 commit ade5a16
Showing 1 changed file with 23 additions and 1 deletion.
24 changes: 23 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,24 @@
# TextMatching
Match a text file against a repository of text files
Match a text file against a repository of text files.

Here is a program for finding the most similar file (to a sample file) in a repository of candidate files.

Below is the output for sample "FAIRY TALES By The Brothers Grimm"
```
0.0320757 Repo\THE ADVENTURES OF TOM SAWYER.txt
0.0363329 Repo\A TALE OF TWO CITIES - A STORY OF THE FRENCH REVOLUTION.txt
0.0388528 Repo\ALICEТS ADVENTURES IN WONDERLAND.txt
0.0440605 Repo\MOBY-DICK or, THE WHALE.txt
0.046679 Repo\THE ADVENTURES OF SHERLOCK HOLMES.txt
0.0472574 Repo\The Iliad of Homer.txt
0.0511793 Repo\The Romance of Lust.txt
0.053746 Repo\PRIDE AND PREJUDICE.txt
0.0543531 Repo\BEOWULF - AN ANGLO-SAXON EPIC POEM.txt
0.0557194 Repo\Frankenstein; or, the Modern Prometheus.txt
```

Here the whole repository is listed, starting from most similar texts down to least similar. As you can see, fairy-tales come first and a horror book comes last.

Commercially this can be used for matching the current page a user is viewing against a repository of advertisement pages, so to find the most relevant advertisement.

Another application is for matching the job description or a resume of a relevant but unwilling to join candidate, against a repository of resumes so to find a similar candidate.

0 comments on commit ade5a16

Please sign in to comment.