HDFS
Apache Hadoop includes a distributed file system called "HDFS" which we plan to use in some incarnation in Grappa.
For now, we have the latest stable version of Hadoop downloaded from hadoop.apache.org, installed at:
/sampa/share/hadoop-1.0.3
For a variety of reasons, our HDFS stuff dies sometimes or needs kicking to get it working again. Here's the list of commands I typically run to reboot it:
# from 'n71.sampa'
/sampa/share/hadoop-1.0.3/bin/stop-dfs.sh
/sampa/share/hadoop-1.0.3/bin/start-dfs.sh
ssh n69
# from 'n69'
/sampa/share/polysh/polysh.py `sinfo -p grappa -o '%n' -h`
# from 'polysh' prompt:
ready (12)> sudo /sampa/share/hadoop-1.0.3/bin/stop-fuse-dfs.sh
ready (12)> :hide_password
<enter password>
ready (12)> sudo /sampa/share/hadoop-1.0.3/bin/start-fuse-dfs.sh
Environment needed to build and run programs against libhdfs (HDFS's C API):
JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64
HADOOP_HOME=/sampa/share/hadoop-1.0.3
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server:$HADOOP_HOME/c++/Linux-amd64-64/lib:$HADOOP_HOME/lib/native/Linux-amd64-64
CLASSPATH=$(echo $HADOOP_HOME/*.jar | tr ' ' ':'):$(echo $HADOOP_HOME/lib/*.jar | tr ' ' ':'):$HADOOP_HOME/conf
Include/library flags:
-I$(HADOOP_HOME)/c++/Linux-amd64-64/include
-I$(HADOOP_HOME)/src/c++/libhdfs
-I$(JAVA_HOME)/include
-L$(HADOOP_HOME)/c++/Linux-amd64-64/lib
-L$(JAVA_HOME)/jre/lib/amd64/server
-lhdfs
-ljvm
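As an example, a small program against the libhdfs C API could be compiled and linked with those flags roughly like this (a sketch: hdfs_test.c is a hypothetical source file, and the environment variables above are assumed to be exported):
# compile & link a hypothetical hdfs_test.c that includes hdfs.h
gcc hdfs_test.c -o hdfs_test \
    -I$HADOOP_HOME/c++/Linux-amd64-64/include \
    -I$HADOOP_HOME/src/c++/libhdfs \
    -I$JAVA_HOME/include \
    -L$HADOOP_HOME/c++/Linux-amd64-64/lib \
    -L$JAVA_HOME/jre/lib/amd64/server \
    -lhdfs -ljvm
# at run time libhdfs spins up a JVM, so LD_LIBRARY_PATH and CLASSPATH
# (set as above) must be present in the environment
./hdfs_test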
Configuration files for HDFS are in $(HADOOP_HOME)/conf:
- masters: n71.sampa
- slaves: [grappa nodes]?
- core-site.xml, mapred-site.xml, hdfs-site.xml: configure various things (a rough sketch follows below), like:
  - where daemons run (n71 & all Grappa nodes)
  - block size, amount of memory for caching, etc.
  - amount of replication (1)
  - where hadoop files go on each of the 'slaves' (/scratch/hadoop.{name,data})
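For reference, here is a rough sketch of what the hdfs-site.xml portion of that might look like. The property names are the standard Hadoop 1.x ones; the values are illustrative, taken from the points above rather than copied from the actual config:
<!-- sketch of conf/hdfs-site.xml; illustrative values only -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/scratch/hadoop.name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/scratch/hadoop.data</value>
  </property>
</configuration>
(core-site.xml would similarly point fs.default.name at the NameNode, e.g. hdfs://n71.sampa:8020/.)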
> cd $(HADOOP_HOME)
# (note: if things are going wrong, must first physically delete HDFS data on all nodes)
# start shell on all grappa nodes (assuming that's where HDFS's data lives)
> clush -bw `sinfo -p grappa -o '%N' -h`
> rm -rf /scratch/hadoop.data
> quit
# set up & format HDFS (need to do this the first time)
> bin/hadoop namenode -format
# ssh to master node
> ssh n71.sampa
# start dfs daemons (should see nameservers & dataservers start up)
# note: this must be called from the master node or else the nameserver will be running in the wrong place
> bin/start-dfs.sh
# shutdown
> bin/stop-dfs.sh
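Once the daemons are up (i.e., after start-dfs.sh), one way to confirm the DataNodes actually registered with the NameNode is the standard dfsadmin report; the NameNode's status page (HTTP port 50070 by default) shows the same information:
# should report one live datanode per Grappa node
> bin/hadoop dfsadmin -report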
You can't interact with HDFS directly as a normal filesystem (short of the fuse-dfs mount described below), so you have to go through the hadoop executable. Note: it seems to work best to give an "absolute" path for HDFS destinations ("/" refers to the root of HDFS's filesystem).
# 'ls'
> $(HADOOP_HOME)/bin/hadoop dfs -ls /grappa_ckpts
# Copy files into HDFS, they should get distributed across
> $(HADOOP_HOME)/bin/hadoop dfs -put <localfile> <dst>
# List the rest of the available filesystem commands
> $(HADOOP_HOME)/bin/hadoop dfs
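A few more commonly used filesystem commands, for reference (the paths here are just examples):
# copy a file back out of HDFS to the local filesystem
> $(HADOOP_HOME)/bin/hadoop dfs -get /grappa_ckpts/<file> <localdir>
# make a directory / delete a file
> $(HADOOP_HOME)/bin/hadoop dfs -mkdir /example_dir
> $(HADOOP_HOME)/bin/hadoop dfs -rm /example_dir/<file>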
WebHDFS
- Allows access to HDFS over HTTP
- Built into Hadoop v1.0.3 and integrated with the HDFS NameNode and DataNodes, so no extra servers need to be fired up
- To enable, add the following to conf/hdfs-site.xml (and restart the DFS daemons):
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
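Once enabled, the WebHDFS REST API is served from the NameNode's HTTP port (50070 by default), so you can hit it with curl. The hostname, port, and paths below are just examples pulled from elsewhere on this page:
# list a directory over HTTP
> curl -i "http://n71.sampa:50070/webhdfs/v1/grappa_ckpts?op=LISTSTATUS"
# read a file (follow the redirect to the datanode that serves the data)
> curl -i -L "http://n71.sampa:50070/webhdfs/v1/grappa_ckpts/<file>?op=OPEN"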
For Hadoop v1.0.3, the code for fuse-dfs can be found in $HADOOP_HOME/src/contrib/fuse-dfs.
# in $HADOOP_HOME/src/contrib/fuse-dfs
# make sure `$JAVA_HOME` and `$HADOOP_HOME` env. variables are set correctly
> ./configure LDFLAGS="-L$HADOOP_HOME/c++/Linux-amd64-64/lib -L$JAVA_HOME/jre/lib/amd64/server" CFLAGS="-I$HADOOP_HOME/src/c++/libhdfs"
> make PERMS=1
# executable `fuse_dfs` should be built in fuse-dfs/src
- Find fuse_dfs_wrapper.sh in fuse-dfs/src and edit the paths in there to reflect your system.
- Find out which port the NameNode is listening on. I think the default for v1.0.3 is 8020, but you can also check the NameNode's log file by searching for this line:
2012-09-07 11:09:28,854 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode up at: n71.sampa/10.1.2.71:8020
> grep -R 'Namenode up at' $HADOOP_HOME/logs/
- Test out your configuration:
# in $HADOOP_HOME/src/contrib/fuse-dfs/src
> sudo ./fuse_dfs_wrapper.sh dfs://<namenode-hostname>:<namenode-port> <mount-point>
# (for example)
> mkdir /scratch/hdfs
> sudo ./fuse_dfs_wrapper.sh dfs://n71.sampa:8020 /scratch/hdfs
# ignore the warning 'fuse-dfs didn't recognize /scratch/hdfs,-2', it apparently says that no matter what
# check that it's working:
> ls /scratch/hdfs
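# another quick sanity check: the mount should show up as a fuse filesystem, and
# files copied in through it should be visible via the hadoop tool
# ('fuse_test' is just an example name; writes may need appropriate HDFS permissions)
> df -h /scratch/hdfs
> cp /etc/hostname /scratch/hdfs/fuse_test
> $(HADOOP_HOME)/bin/hadoop dfs -ls /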
# I have made some simple scripts to start and stop fuse-dfs nodes when they go down.
# You'll know they've gone down if they say "Transport endpoint is not connected."
# To restart, just ssh to the node and run:
> sudo /sampa/share/hadoop-1.0.3/bin/stop-fuse-dfs.sh
> sudo /sampa/share/hadoop-1.0.3/bin/start-fuse-dfs.sh
Troubleshooting:
- Fuse logs things in /var/log/messages, so check there for messages.
  - ERROR: could not connect to n71.sampa:50070 fuse_impls_getattr.c:37 meant that I had the wrong port.
- Input/output error (ls: cannot access /scratch/hdfs: Input/output error):
  - Might have the wrong port. Check the NameNode log for the port (see above).
  - Kill the ./fuse_dfs process.
  - Clean up the mounted fs: sudo umount -l /scratch/hdfs (if you don't, you'll get errors that say "Transport endpoint is not connected").
You should be able to add the following to /etc/fstab, if the wrapper script is on your path and named fuse_dfs:
fuse_dfs#dfs://<namenode>:<port> /mountpoint fuse usertrash,rw 0 0
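With the NameNode and mount point used elsewhere on this page, that would look something like this (untested sketch):
# /etc/fstab entry
fuse_dfs#dfs://n71.sampa:8020 /scratch/hdfs fuse usertrash,rw 0 0
# then mount it
> sudo mount /scratch/hdfs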