capdevc/PMTokenize

Tokenizes PubMed abstracts stored in MongoDB, using Epic for tokenization and Spark to run the job.

JSON parsing uses Rapture and json4s.

Requires the mongo-hadoop connector: https://github.com/mongodb/mongo-hadoop/

To run and test it, invoke SBT: 'sbt run'
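As a rough illustration of how these pieces can fit together, the sketch below reads documents from MongoDB through the mongo-hadoop connector's `MongoInputFormat` and tokenizes one text field with Epic's `TreebankTokenizer`. It is not the repository's actual code. The connection URI, the database and collection names (`pubmed.abstracts`), the field name `abstract`, and the object name `TokenizeSketch` are all assumptions made for the example, and it reads the field directly from the BSON document instead of going through Rapture or json4s. The exact return type of the tokenizer may differ between Epic versions.

```scala
import com.mongodb.hadoop.MongoInputFormat
import epic.preprocess.TreebankTokenizer
import org.apache.hadoop.conf.Configuration
import org.apache.spark.{SparkConf, SparkContext}
import org.bson.BSONObject

// Hypothetical entry point; the real project layout may differ.
object TokenizeSketch {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf()
      .setAppName("PMTokenizeSketch")
      .setMaster("local[*]") // local mode, for experimentation only
    val sc = new SparkContext(sparkConf)

    // Assumed URI: local MongoDB, database "pubmed", collection "abstracts".
    val mongoConf = new Configuration()
    mongoConf.set("mongo.input.uri", "mongodb://localhost:27017/pubmed.abstracts")

    // Each record is a (document id, BSON document) pair.
    val docs = sc.newAPIHadoopRDD(
      mongoConf,
      classOf[MongoInputFormat],
      classOf[Object],
      classOf[BSONObject])

    val tokenized = docs
      // "abstract" is an assumed field name; documents without it are skipped.
      .flatMap { case (_, doc) => Option(doc.get("abstract")).map(_.toString) }
      // Build one tokenizer per partition rather than one per document.
      .mapPartitions { texts =>
        val tokenizer = new TreebankTokenizer()
        texts.map(text => tokenizer(text).toVector)
      }

    // Print a handful of tokenized abstracts as a sanity check.
    tokenized.take(5).foreach(tokens => println(tokens.mkString(" ")))

    sc.stop()
  }
}
```

Running this requires a reachable MongoDB instance and the Spark, mongo-hadoop, and Epic libraries on the classpath.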

About

PubMed Tokenizer with MongoDB, Spark and Epic
