
dexter-mh-lee/datahub

DataHub

Introduction

DataHub is LinkedIn's generalized metadata search & discovery tool. To learn more about DataHub, check out our LinkedIn blog post and Strata presentation. This repository contains the complete source code needed to build DataHub's frontend & backend services.

Quickstart

  1. Install docker and docker-compose.
  2. Clone this repo and make sure you are on the datahub branch.
  3. Run the command below to download and run all Docker containers locally:
cd docker/quickstart && docker-compose pull && docker-compose up --build
  4. Once all Docker containers are running on your machine, run the command below from the repository root to ingest the provided sample data into DataHub:
./gradlew :metadata-events:mxe-schemas:build && cd metadata-ingestion/mce-cli && pip install --user -r requirements.txt && python mce_cli.py produce -d bootstrap_mce.dat

Note: Make sure you are using Java 8; the build has a strict dependency on Java 8.

  5. Finally, open DataHub by entering http://localhost:9001 in your browser. You can sign in with datahub as both username and password.
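
The Java 8 requirement for the Gradle build can be checked before running the ingestion step. The following is a minimal sketch, not part of this repository: the function names `java_major_version` and `is_java_8` are hypothetical, and it assumes that `java -version` prints a banner containing `version "1.8.0_..."` for Java 8 (releases from Java 9 onward report `"9"`, `"11.0.6"`, and so on).

```python
import re
import subprocess
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the Java major version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier identify themselves as "1.x" (for example "1.8.0_242"),
    # so the major version is the second number in that case.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def is_java_8(version_output: str) -> bool:
    """Return True if the banner text describes a Java 8 runtime."""
    return java_major_version(version_output) == 8


if __name__ == "__main__":
    try:
        # `java -version` writes its banner to stderr rather than stdout.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
        banner = result.stderr or result.stdout
    except FileNotFoundError:
        banner = ""

    if is_java_8(banner):
        print("Java 8 detected.")
    else:
        print("Java 8 not detected. `java -version` reported:")
        print(banner.strip() or "(java executable not found on PATH)")
```

Run it with the same shell environment you intend to use for `./gradlew`, since the check only reflects whichever `java` executable is first on your PATH.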

Quicklinks

Roadmap

  1. Add user profile page
  2. Deploy DataHub to Azure Cloud
