http://www.manning.com/dimidukkhurana
The code is built with Maven. Be sure to install Maven on your platform
before running these commands. Also be aware that HBase is not yet
supported on the OpenJDK platform, the default JVM installed on most
modern Linux distributions. You'll want to install the Oracle (Sun)
Java 6 runtime and make sure it's configured on your $PATH before
you continue. On Ubuntu, you may find the oab-java6 utility useful.
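If you are unsure which JVM is on your $PATH, a small check like the one below can help. This is only a sketch: the class name JvmCheck is hypothetical and not part of this project, and it assumes that OpenJDK builds report "OpenJDK" in the java.vm.name system property, which is a common convention rather than a guarantee.

```java
// Minimal sketch (not part of TwitBase): report which JVM is running.
class JvmCheck {
    // Heuristic assumed by this sketch: OpenJDK builds usually
    // include "OpenJDK" in the java.vm.name system property.
    static boolean isOpenJdk(String vmName) {
        return vmName != null && vmName.contains("OpenJDK");
    }

    public static void main(String[] args) {
        String vmName = System.getProperty("java.vm.name");
        String version = System.getProperty("java.version");
        System.out.println("JVM: " + vmName + " (Java " + version + ")");
        if (isOpenJdk(vmName)) {
            System.out.println("This looks like OpenJDK; consider the Oracle (Sun) runtime instead.");
        }
    }
}
```

Compile and run it with the same java binary you plan to use for TwitBase, so the result reflects what is actually configured on your $PATH.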
To build a self-contained jar:
$ mvn package
By default, the resulting jar lets you interact with HBase running in
standalone mode on your local machine. To interact with a remote
(possibly fully distributed) HBase deployment, put your hbase-site.xml
file in the src/main/resources directory before building the jar.
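This works because Maven copies everything under src/main/resources into the root of the jar, and the HBase client looks for hbase-site.xml on the classpath. As a rough way to confirm the file was picked up, something like the following can be run with the built jar on the classpath. The class name SiteConfigCheck is hypothetical and not part of this project; the sketch uses only the standard library and does not contact HBase.

```java
import java.net.URL;

// Minimal sketch (not part of TwitBase): check whether a resource
// such as hbase-site.xml is visible on the classpath.
class SiteConfigCheck {
    // Returns the resource's location, or null if it is not on the classpath.
    static URL locate(String resourceName) {
        return SiteConfigCheck.class.getClassLoader().getResource(resourceName);
    }

    public static void main(String[] args) {
        URL site = locate("hbase-site.xml");
        if (site == null) {
            System.out.println("hbase-site.xml not found on the classpath; default settings will apply.");
        } else {
            System.out.println("hbase-site.xml found at: " + site);
        }
    }
}
```

If the file is reported as missing, check that it was in src/main/resources before you ran mvn package.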
A launcher script is provided to run TwitBase and the other utilities that come with the HBaseIA project.
$ bin/launcher
Just run the launcher without any arguments and it'll print out the usage information.
TwitBase applications can also be run using java directly:
$ java -cp target/twitbase-1.0.0.jar <app> [options...]
Utilities for interacting with TwitBase include:
HBaseIA.TwitBase.InitTables: create TwitBase tables
HBaseIA.TwitBase.TwitsTool: tool for managing Twits
HBaseIA.TwitBase.UsersTool: tool for managing Users
HBaseIA.TwitBase.LoadUsers: tool for loading random Users
HBaseIA.TwitBase.LoadTwits: tool for loading random Twits
The following MapReduce jobs can be launched the same way:
HBaseIA.TwitBase.mapreduce.TimeSpent: run TimeSpent log processing MR job
HBaseIA.TwitBase.mapreduce.CountShakespeare: run Shakespearean counter MR job
HBaseIA.TwitBase.mapreduce.HamletTagger: run hamlet-tagging MR job
The following utilities are available for you to play with:
utils.TablePreSplitter: create pre-split table
Copyright (C) 2012 Nick Dimiduk, Amandeep Khurana
Distributed under the Apache License, version 2.0, the same as HBase.