At the Bronze layer:
- Ingestion is divided into 3 DAGs that collect data from the sources.
- Each DAG is responsible for collecting raw data from Parquet files and user files (including images and metadata) at the source and loading it into the MongoDB and MinIO aggregate stores.
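
As a rough illustration of what one Bronze ingestion task might do, the sketch below copies raw items from a single source into two stores without changing them. It is not the project's actual DAG code: `BronzeStores`, `ingest_source`, the `bronze/<source>/<file>` key layout, and the split of metadata into the document store and binary payloads into the object store are all assumptions made for the example. In-memory containers stand in for MongoDB and MinIO so the snippet runs on its own.

```python
from dataclasses import dataclass, field


@dataclass
class BronzeStores:
    """In-memory stand-ins: `documents` for MongoDB, `objects` for MinIO."""
    documents: list = field(default_factory=list)
    objects: dict = field(default_factory=dict)


def ingest_source(source_name, items, stores):
    """Copy raw items from one source into the Bronze stores, unmodified.

    Each item is a (file_name, raw_bytes, metadata_dict) tuple.
    Returns the number of items ingested.
    """
    count = 0
    for file_name, raw_bytes, metadata in items:
        # Hypothetical key layout, chosen only for this example.
        object_key = f"bronze/{source_name}/{file_name}"
        # Binary payload goes to the object store (MinIO stand-in).
        stores.objects[object_key] = raw_bytes
        # Raw metadata goes to the document store (MongoDB stand-in),
        # with a pointer back to the stored object.
        stores.documents.append(
            {"source": source_name, "object_key": object_key, **metadata}
        )
        count += 1
    return count
```

In a real deployment each of the 3 DAGs would presumably wrap a function like this in a scheduled task, one per source, and the stand-in containers would be replaced by actual MongoDB and MinIO clients.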
At the Silver and Gold layers:
- The Silver layer refines the raw metadata from Bronze, producing the refined metadata for the Catalog layer of the Data Lake.
- The Gold layer extracts image features from the sources and saves them in MinIO.
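
The two steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the refinement rules (trim strings, drop empty values, lowercase keys), the function names `refine_metadata` and `extract_features`, and the JSON payload format are all hypothetical. The `byte_stats` extractor is a deliberate placeholder and not a real image model; an actual Gold layer would plug in whatever feature extractor the project uses.

```python
import json


def refine_metadata(raw_doc):
    """Silver step (assumed rules): trim strings, drop empty values,
    and normalise keys to lowercase."""
    refined = {}
    for key, value in raw_doc.items():
        if isinstance(value, str):
            value = value.strip()
        if value in ("", None):
            continue  # skip fields that carry no information
        refined[key.strip().lower()] = value
    return refined


def extract_features(image_bytes, extractor):
    """Gold step: run a caller-supplied extractor and serialise the
    result to bytes, ready to be written to an object store."""
    vector = [float(x) for x in extractor(image_bytes)]
    return json.dumps({"dim": len(vector), "vector": vector}).encode("utf-8")


def byte_stats(image_bytes):
    # Placeholder extractor, NOT a real image model:
    # returns the mean and the maximum byte value.
    return [sum(image_bytes) / len(image_bytes), max(image_bytes)]
```

For example, `refine_metadata({" Caption ": "  A cat ", "tags": ""})` keeps only the trimmed caption under the key `caption`, and `extract_features(image_bytes, byte_stats)` returns a bytes payload that could be uploaded to MinIO as a single object.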