Detection of Duplicates Among Non-structured Data From Different Data Sources

long

The "long" directory contains the long version of my master's thesis presentation. Link to a video of the presentation.

short

The "short" directory contains the short version of the presentation, prepared for a workshop.