ETL4Profiling is a Kettle extension that profiles the database given as input to its plugins, especially DBpedia, its main case study.
To use the project, download the newest version of the plugins and extract the .tar.gz file into the `plugins/` folder of your Kettle installation.
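The extraction step can be sketched as a small shell function. This is only an illustration: the function name `install_plugins`, the archive name and the Kettle directory in the example are assumptions, not names taken from the project.

```shell
# Sketch: extract a downloaded plugin archive into Kettle's plugins/ folder.
# Both arguments are placeholders chosen for illustration.
install_plugins() {
    archive="$1"       # path to the downloaded .tar.gz
    kettle_home="$2"   # directory where Kettle is installed
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    # Create plugins/ if it is missing, then unpack the archive into it.
    mkdir -p "$kettle_home/plugins"
    tar -xzf "$archive" -C "$kettle_home/plugins"
}

# Example (hypothetical file and directory names):
# install_plugins ./etl4profiling-plugins.tar.gz "$HOME/data-integration"
```

Restart Kettle after extracting so that it rescans the `plugins/` folder.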
ETL4Profiling currently provides the following plugins:
- DBpediaTriplification
- GetDBpediaData
- InnerProfiling
- MergeProfiling
- PropertyAnalyzer
- ResourceInputAnalyzer
- ResourcePropertiesAnalyzer
- TemplatePropertyAnalyzer
- TemplateResourceAnalyzer
- TemplateResourceInputAnalyzer
Developing the plugins requires:
- Java 8 for development.
- Maven to manage dependencies.
- Kettle 8.2 to test and deploy.
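A quick way to confirm that the command-line tools are available is sketched below. The helper `check_tools` is hypothetical and only checks that each command is on the `PATH`; it does not verify versions.

```shell
# Sketch: report which of the given commands can be found on the PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_tools java mvn
```

To check the versions as well, run `java -version` and `mvn -v` by hand.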
This project was developed using the Eclipse IDE, but it can be used with any other IDE you prefer.
- Download and install Kettle (Pentaho Data Integration, pdi-ce-8.2.0.0-342 or a later version).
- Download, install and set up Maven.
- Download the newest version of the plugins, change the variable `pdi.home` in the pom of the root project, `plugins`, to the place where your Kettle is installed, and then run `mvn clean install` in your `plugins` folder.
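The build step can be sketched as follows. Instead of editing the pom, this version passes `pdi.home` on the command line, which works only under the assumption that `pdi.home` is declared as an ordinary Maven property in the root pom (user properties given with `-D` take precedence over pom properties). The function name `build_plugins` and the paths in the example are hypothetical.

```shell
# Sketch: build the plugins against a given Kettle installation.
# Assumes pdi.home is a regular <properties> entry in plugins/pom.xml.
build_plugins() {
    pdi_home="$1"      # directory where Kettle is installed
    plugins_dir="$2"   # ETL4Profiling plugins/ folder with the root pom.xml
    if [ ! -d "$pdi_home" ]; then
        echo "Kettle directory not found: $pdi_home" >&2
        return 1
    fi
    if [ ! -f "$plugins_dir/pom.xml" ]; then
        echo "no pom.xml in: $plugins_dir" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$plugins_dir" && mvn clean install "-Dpdi.home=$pdi_home" )
}

# Example (hypothetical paths):
# build_plugins "$HOME/data-integration" ./ETL4Profiling/plugins
```

If the override has no effect in your checkout, edit `pdi.home` in the pom directly as described above.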
To create a new plugin, see the documentation in HOW TO CREATE A KETTLE PLUGIN by John Curcio.
To create a plugin for ETL4Profiling, the only change is in the Maven archetype command: the groupId should start with `br.ufrj.dcc.kettle` instead of `br.ufrj.ppgi.greco.kettle`.
$ cd ETL4Profiling/plugins
$ mvn archetype:generate -DgroupId=br.ufrj.dcc.kettle.NomeDoPlugin -DartifactId=NomeDoPlugin -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false
This project uses the MIT license. For more details, read LICENSE.md.