A collection of links to various datasets.
- Google Dataset Search
- Hugging Face Datasets
- PapersWithCode, ML Datasets
- UC Irvine Machine Learning Repository
- Das Datenportal für Deutschland (the data portal for Germany)
- Open Data Portal (Oldenburg)
- Open Data of the City of Zürich
- A really impressive collection of datasets about the city
- https://data.stadt-zuerich.ch/
- Deutsche Bahn (Open Data Portal)
- GENESIS-Online Database (Statistisches Bundesamt)
- Data from the Handelsregister (the German commercial register)
- Awesome Public Datasets (GitHub)
- Datasets on Kaggle
- Collections - high quality data and datasets organized by topic
- RELATIONAL DATASET REPOSITORY
- Links to 1 111 (and counting) interesting datasets (shared via Google Spreadsheet)
- Open Data on AWS
- KDnuggets datasets
- 70+ Machine Learning Datasets – Gain real-world experience with Data Science projects
- Descriptions are given in Russian, but the datasets are in various languages
- Data from Publications
- Microsoft Research Open Data
- GeoLife GPS Trajectories [298.66 MB]
- T-Drive trajectory data sample [~140 MB]
- Microsoft Data Science Initiative
- LETOR: Learning to Rank for Information Retrieval
- Solar Measurement Grid Oahu, Hawaii by NREL
- UK Domestic Appliance-Level Electricity (UK-DALE) dataset
- GitHub repository - https://github.com/JackKelly/UK-DALE_metadata
- Berkeley Lab's Tracking the Sun (Open PV Dataset)
- Energy efficiency Data Set from UCI
- PLAID: the Plug Load Appliance Identification Dataset
- Indian Dataset for Ambient Water and Energy (iAWE)
- SMARD.DE Data
- Non-Intrusive Load Monitoring Toolkit (nilmtk)
- Global Power Plant Database
- Some data analysis results can be found here - https://github.com/wri/global-power-plant-database
- Erneuerbare-Energien-Gesetz aggregated data
- BLOND, a building-level office environment dataset of typical electrical appliances
- publication - https://www.nature.com/articles/sdata201848
- dataset - https://mediatum.ub.tum.de/1375836
- SmartH2O Project
- truthful_qa - Datasets at Hugging Face
- wino_bias - Datasets at Hugging Face
- bold - Datasets at Hugging Face
- CKAN - the world’s leading open-source data portal platform
- Datasette - A tool for exploring and publishing data
- Amundsen - data discovery and metadata engine
- OpenRefine - a powerful tool for working with messy data