Showing 1 changed file with 23 additions and 1 deletion.
@@ -1,2 +1,24 @@
# TextMatching
Match a text file against a repository of text files.

This program finds the file in a repository of candidate files that is most similar to a sample file.

Below is the output for the sample "FAIRY TALES By The Brothers Grimm":
```
0.0320757 Repo\THE ADVENTURES OF TOM SAWYER.txt
0.0363329 Repo\A TALE OF TWO CITIES - A STORY OF THE FRENCH REVOLUTION.txt
0.0388528 Repo\ALICE’S ADVENTURES IN WONDERLAND.txt
0.0440605 Repo\MOBY-DICK or, THE WHALE.txt
0.046679 Repo\THE ADVENTURES OF SHERLOCK HOLMES.txt
0.0472574 Repo\The Iliad of Homer.txt
0.0511793 Repo\The Romance of Lust.txt
0.053746 Repo\PRIDE AND PREJUDICE.txt
0.0543531 Repo\BEOWULF - AN ANGLO-SAXON EPIC POEM.txt
0.0557194 Repo\Frankenstein; or, the Modern Prometheus.txt
```

The whole repository is listed here, from the most similar text down to the least similar, so a lower score means a closer match. As you can see, fairy tales come first and a horror book comes last.
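
The README does not say how the score is computed, so the following is only a minimal sketch of one way to produce a ranking of this shape. It assumes the score is a cosine distance between word-count vectors; that choice, and the names `word_counts`, `cosine_distance` and `rank_by_distance`, are illustrative assumptions and not taken from this program.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count its words (hypothetical tokenizer)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two word-count vectors.

    Assumed metric, not necessarily the one this program uses.
    0 means the same word proportions; 1 means no shared words.
    """
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty text as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def rank_by_distance(sample_text, repo):
    """Rank repo entries (name -> text) from most to least similar."""
    sample = word_counts(sample_text)
    scored = [
        (cosine_distance(sample, word_counts(text)), name)
        for name, text in repo.items()
    ]
    return sorted(scored)  # ascending distance: most similar first


if __name__ == "__main__":
    # Tiny in-memory stand-in for a folder of text files.
    repo = {
        "castle.txt": "the king and the princess lived in a castle",
        "market.txt": "stock prices fell sharply on monday",
    }
    sample = "a princess and a king in the castle"
    for distance, name in rank_by_distance(sample, repo):
        print(f"{distance:.7f} {name}")
```

As in the listing above, the entries are printed in ascending score order, so the closest match appears on the first line.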

Commercially, this can be used to match the page a user is currently viewing against a repository of advertisement pages, to find the most relevant advertisement.

Another application is matching a job description, or the resume of a suitable candidate who is unwilling to join, against a repository of resumes to find a similar candidate.