
Multithreaded Java tool that pulls JPG images from an S3 bucket and pushes the extracted EXIF data to a PostgreSQL database.


gerosa/exif-data-extractor


EXIF Data Extractor

Requirements

  • Java 8
  • PostgreSQL 9.3

Restrictions

  1. Only JPEG files are supported. If a different file type is detected, a warning is logged.
  2. Updated files will not be re-processed.
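
The first restriction can be illustrated with a minimal sketch of one possible way to detect a non-JPEG file: checking the leading bytes of the object against the JPEG start-of-image signature (FF D8 FF). This is an assumption for illustration only. The class JpegSniffer and its methods are hypothetical names, and the project may detect file types differently (for example by file extension or through its EXIF library).

```java
import java.util.logging.Logger;

/** Hypothetical helper, not taken from the project: detects JPEG data by its signature. */
final class JpegSniffer {
    private static final Logger LOG = Logger.getLogger(JpegSniffer.class.getName());

    private JpegSniffer() {}

    /** Returns true if the bytes start with the JPEG start-of-image marker (FF D8) followed by FF. */
    static boolean isJpeg(byte[] firstBytes) {
        return firstBytes != null
                && firstBytes.length >= 3
                && (firstBytes[0] & 0xFF) == 0xFF
                && (firstBytes[1] & 0xFF) == 0xD8
                && (firstBytes[2] & 0xFF) == 0xFF;
    }

    /** Returns true for JPEG data; otherwise logs a warning naming the key and returns false. */
    static boolean acceptOrWarn(String key, byte[] firstBytes) {
        boolean jpeg = isJpeg(firstBytes);
        if (!jpeg) {
            LOG.warning("Skipping unsupported (non-JPEG) file: " + key);
        }
        return jpeg;
    }
}
```

A caller would read the first few bytes of each downloaded object, pass them to acceptOrWarn together with the object key, and skip the file when the method returns false.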

How to run:

This solution requires a number of environment variables for runtime configuration.

$ export WALDO_PHOTOS_JDBC_URL="jdbc:postgresql://server/database"
$ export WALDO_PHOTOS_JDBC_USER="username"
$ export WALDO_PHOTOS_JDBC_PASSWORD="password"
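
As a rough sketch of how these variables might be consumed, the class below reads the three values from a map and fails fast when one is missing. Only the variable names come from this README. The class JdbcSettings and its fields are hypothetical, and the project's actual configuration code may look different.

```java
import java.util.Map;

/** Hypothetical holder for the JDBC settings described above; not taken from the project. */
final class JdbcSettings {
    final String url;
    final String user;
    final String password;

    private JdbcSettings(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    /** Builds the settings from an environment map such as the one returned by System.getenv(). */
    static JdbcSettings fromEnvironment(Map<String, String> env) {
        return new JdbcSettings(
                require(env, "WALDO_PHOTOS_JDBC_URL"),
                require(env, "WALDO_PHOTOS_JDBC_USER"),
                require(env, "WALDO_PHOTOS_JDBC_PASSWORD"));
    }

    /** Returns the value of a variable, or throws if it is unset or blank. */
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value;
    }
}
```

In an application, JdbcSettings.fromEnvironment(System.getenv()) would supply the arguments for java.sql.DriverManager.getConnection(url, user, password). Taking the map as a parameter keeps the lookup easy to exercise without touching the real environment.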

Build the project using Maven 3.3:

mvn clean install 

Create the database schema:

mvn flyway:migrate 

Finally, run the project:

mvn exec:java

Future Improvements:

  1. Improve performance by changing the naming convention of the file keys to include a timestamp prefix (e.g. YYYYMMDD-hhmmss), so that only new files need to be retrieved.
  2. Implement unit and integration tests.
  3. Integrate with Netflix's Archaius to allow dynamic configuration.
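
The first improvement could be sketched as follows. Because a fixed-width timestamp prefix sorts lexicographically in chronological order, keys written after the last processed key can be selected with a plain string comparison. The class TimestampedKeys, the underscore separator, and the method names are all assumptions made for illustration; nothing here exists in the project yet.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the proposed key naming convention; not part of the project. */
final class TimestampedKeys {
    // Fixed-width pattern, so string order matches chronological order.
    private static final DateTimeFormatter PREFIX = DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    private TimestampedKeys() {}

    /** Builds a key such as "20170305-140709_photo.jpg" (the separator is an assumption). */
    static String keyFor(LocalDateTime uploadedAt, String fileName) {
        return PREFIX.format(uploadedAt) + "_" + fileName;
    }

    /** Returns the keys that sort strictly after the last processed key, or all keys if it is null. */
    static List<String> newerThan(List<String> keys, String lastProcessedKey) {
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            if (lastProcessedKey == null || key.compareTo(lastProcessedKey) > 0) {
                result.add(key);
            }
        }
        return result;
    }
}
```

With keys named this way, the tool would only need to remember the last key it processed. The filtering shown here runs on the client side; it could in principle be pushed to the object listing request so that older keys are never returned, which is where the performance gain would come from.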
