charsiu is an Apache Pig UDF library for transforming log data into a wide-table format. The project was started for clinical epidemiological studies using data from the Japanese Diagnosis Procedure Combination (DPC) database.
Requirements: Hadoop >= 1.0.0, Pig >= 0.9.2
Download: https://raw.github.com/wiki/hiromasah/charsiu/releases/charsiu-udf-1.1.jar
Add "register /path/to/charsiu-udf-1.1.jar;" to your Pig script.
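A minimal sketch of where the register statement sits in a script is shown here. The input path, field names, and aliases are hypothetical placeholders, and the script uses only Pig built-ins (PigStorage, COUNT) rather than any charsiu UDF, since the UDF signatures are not described in this document.

```pig
-- Register the charsiu UDF jar; adjust the path to where the jar was saved.
register /path/to/charsiu-udf-1.1.jar;

-- Hypothetical tab-separated log input: one row per (patient, item, value).
logs = LOAD 'input/log.tsv' USING PigStorage('\t')
       AS (patient_id:chararray, item:chararray, value:chararray);

-- Group the log rows by patient and count them, using built-ins only.
by_patient = GROUP logs BY patient_id;
counts = FOREACH by_patient GENERATE group AS patient_id, COUNT(logs) AS n_rows;

DUMP counts;
```

The register line should come before any statement that refers to a class in the jar, so placing it at the top of the script is the simplest choice.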
- Project leader
  - Hiromasa Horiguchi (The University of Tokyo)
- Contributors
  - Tatsuya Nakamura (Kurusugawa Computer, Inc.)
  - Taisuke Sato (Kurusugawa Computer, Inc.)
  - Toru Nishikawa (Preferred Infrastructure)
Apache License, Version 2.0
- fixed the file name of 2012/ff1.schema
- added the DPC schema for 2012
- changed the DPC data input method to an index-file method for S3
- added the UDFs MulticastEvaluate and LoadDataWithSchema
- modified the StoreDataWithSchema UDF to allow any encoding
- changed how the file system for DPC data is chosen
- added the DPC schema for 2011
- several bug fixes
- added license text
- added settings for the Maven report
initial release