The tool is designed for organizing long-term image datasets and building them up gradually with minimal manual effort, relying extensively on the image metadata.
Available features:
- organizes the given images by date and sensor
- can extract frames from video files
- can use GPX track files
- maintains a database of different image sensors
- identifies image sequences
- can assign multiple tag types to images and sequences
  - day-time (day, night, dawn, dusk)
  - named locations defined by GeoJSON polygons
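As a rough illustration of the location tagging listed above, the sketch below checks whether an image position falls inside a named GeoJSON polygon. It is not the tool's actual implementation: the function names, the sample polygon, and the use of a `name` property for the location label are assumptions made for this example. It uses a plain ray-casting test from the standard library only; since Shapely is among the tool's dependencies, the tool itself presumably relies on Shapely for this kind of test.

```python
import json

# Hypothetical GeoJSON with one named polygon; the "name" property is an assumption.
GEOJSON = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "campus"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [[[16.0, 49.0], [16.1, 49.0], [16.1, 49.1],
                         [16.0, 49.1], [16.0, 49.0]]]
      }
    }
  ]
}
"""


def point_in_ring(lon, lat, ring):
    """Ray-casting test: is (lon, lat) inside the closed ring of [lon, lat] pairs?"""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point towards +longitude.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def location_tags(lon, lat, geojson_text):
    """Return the names of all polygons whose outer ring contains the point."""
    tags = []
    for feature in json.loads(geojson_text)["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # other geometry types and polygon holes are ignored here
        outer_ring = geometry["coordinates"][0]
        if point_in_ring(lon, lat, outer_ring):
            tags.append(feature["properties"]["name"])
    return tags


print(location_tags(16.05, 49.05, GEOJSON))  # point inside the sample polygon
print(location_tags(17.00, 49.05, GEOJSON))  # point outside the sample polygon
```

Note that GeoJSON stores coordinates in longitude, latitude order, which is the opposite of how positions are often written.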
Only a few Python packages are required: PyExifTool, PyYAML, NumPy, Shapely, and OpenCV.
All of them can be installed using pip:
pip install pyexiftool pyyaml numpy shapely opencv-python
PyExifTool requires ExifTool to be installed on the system. Installation instructions can be found on the ExifTool website. ExifTool is also available in the package repositories of most Linux distributions:
# Ubuntu
sudo apt install exiftool
# Arch Linux
sudo pacman -S perl-image-exiftool
# Fedora
sudo dnf install perl-Image-ExifTool
The tool works best if run on original images containing full EXIF from a camera or a smartphone. The processing can be done gradually in batches, building the dataset over time.
Basic usage example:
python3 process.py <path_to_input_image_dir> <path_to_dataset_root>
For all the available options, use the --help flag:
python3 process.py --help
- The named locations (used for location tagging) are defined by polygons in a GeoJSON file. The map with polygons can be created, e.g., using uMap.
- The tool creates a specific dataset directory structure:
<dataset_root>
├── <day_directories>
│   ├── <sensor_directories>
│   │   ├── <image_files>
│   │   ├── <video_files>
│   │   ├── metadata.yaml
├── sensors.yaml
  - the <day_directories> are named by the date in the format YYYY-MM-DD
  - the <sensor_directories> are named by the sensor name constructed from the EXIF metadata
  - the metadata.yaml file contains the definition of sequences and tags for the given day and sensor
  - the sensors.yaml file contains the definition of all the sensors used in the dataset
- The locations of images are extracted from the EXIF metadata (if available) or interpolated using a GPX track file.
- Video files cannot directly contain a location in their metadata, so the locations for the extracted frames are interpolated from a GPX track file and saved in the frame EXIF.
- The tool matches the images to the GPX track file solely by the time of capture. Therefore, it cannot handle multiple GPX files from different sensors that were capturing at the same time when they are all processed at once. To work around this, process the corresponding image data and GPX files sequentially (first process the images and GPX files from sensor A, then process the images and GPX files from sensor B, etc.).
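The time-based matching described in the notes above can be sketched as a linear interpolation between the two GPX track points that bracket the capture time. This is only an illustration under stated assumptions, not the tool's own code: the helper names `load_track` and `interpolate`, the sample track, and the choice to return `None` outside the track's time range are all made up for this example. Only the GPX 1.1 element names and namespace are taken from the GPX format itself.

```python
import bisect
import xml.etree.ElementTree as ET
from datetime import datetime

GPX_NS = {"gpx": "http://www.topografix.com/GPX/1/1"}

# Hypothetical two-point track used only for this example.
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="49.0000" lon="16.0000"><time>2023-05-01T10:00:00Z</time></trkpt>
    <trkpt lat="49.0010" lon="16.0020"><time>2023-05-01T10:00:10Z</time></trkpt>
  </trkseg></trk>
</gpx>"""


def parse_time(text):
    """Parse an ISO 8601 UTC timestamp such as 2023-05-01T10:00:00Z."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def load_track(gpx_text):
    """Return a time-sorted list of (time, lat, lon) tuples from GPX text."""
    root = ET.fromstring(gpx_text)
    points = []
    for pt in root.iterfind(".//gpx:trkpt", GPX_NS):
        time = parse_time(pt.find("gpx:time", GPX_NS).text)
        points.append((time, float(pt.get("lat")), float(pt.get("lon"))))
    points.sort()
    return points


def interpolate(track, when):
    """Linearly interpolate (lat, lon) at `when`; None if outside the track."""
    times = [p[0] for p in track]
    if not track or when < times[0] or when > times[-1]:
        return None
    i = bisect.bisect_left(times, when)
    if times[i] == when:
        return track[i][1], track[i][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    fraction = (when - t0) / (t1 - t0)  # timedelta / timedelta gives a float
    return lat0 + fraction * (lat1 - lat0), lon0 + fraction * (lon1 - lon0)


track = load_track(SAMPLE_GPX)
capture_time = parse_time("2023-05-01T10:00:05Z")
print(interpolate(track, capture_time))
```

Because the lookup uses nothing but timestamps, two tracks recorded over the same time span would be indistinguishable to a function like this, which is the reason for the sequential-processing advice above. The sketch also assumes the camera clock and the GPX timestamps refer to the same time zone.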
- image resizing
- global image database
- image retrieval (based on locations, tags, sequences, etc.)
- automatic image-based tagging (weather, foliage, etc.)