Track disk utilization and failed volumes on each of your HDFS DataNodes. This Agent check collects metrics for both, as well as block- and cache-related metrics.
Use this check (hdfs_datanode) and its counterpart check (hdfs_namenode), not the older two-in-one check (hdfs), which is deprecated.
The HDFS DataNode check is included in the Datadog Agent package, so you don't need to install anything else on your DataNodes.
The Agent collects metrics from the DataNode's JMX remote interface. The interface is disabled by default, so enable it by setting the following option in hadoop-env.sh (usually found in $HADOOP_HOME/conf):
```shell
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote
  -Dcom.sun.management.jmxremote.authenticate=false
  -Dcom.sun.management.jmxremote.ssl=false
  -Dcom.sun.management.jmxremote.port=50075 $HADOOP_DATANODE_OPTS"
```
Restart the DataNode process to enable the JMX interface.
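Before configuring the Agent, you can verify that the interface is reachable by querying the DataNode's JMX JSON endpoint directly. A quick sanity check, assuming the default port of 50075 on localhost:

```shell
# Fetch the DataNode's JMX metrics as JSON; a large JSON document
# listing MBeans indicates the interface is up and reachable.
curl http://localhost:50075/jmx
```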
Edit the hdfs_datanode.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample hdfs_datanode.d/conf.yaml for all available configuration options:
```yaml
init_config:

instances:
  - hdfs_datanode_jmx_uri: http://localhost:50075
```
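For example, to distinguish DataNodes from different clusters, you can add per-instance tags. A minimal sketch, assuming the standard tags option supported by Agent checks; the cluster:my-hdfs-cluster value is a placeholder:

```yaml
init_config:

instances:
  - hdfs_datanode_jmx_uri: http://localhost:50075
    # Optional: custom tags applied to every metric from this instance.
    # "cluster:my-hdfs-cluster" is a placeholder value.
    tags:
      - cluster:my-hdfs-cluster
```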
Restart the Agent to begin sending DataNode metrics to Datadog.
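For example, on a systemd-based Linux host (the exact command varies by platform and Agent version):

```shell
# Restart the Datadog Agent so it picks up the new hdfs_datanode configuration.
sudo systemctl restart datadog-agent
```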
Run the Agent's status subcommand and look for hdfs_datanode under the Checks section.
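For example, with Agent v6 or later:

```shell
# Print the Agent's status report; hdfs_datanode should appear
# under the Checks section with no errors.
sudo datadog-agent status
```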
See metadata.csv for a list of metrics provided by this integration.
The HDFS DataNode check does not include any events at this time.
hdfs.datanode.jmx.can_connect: Returns CRITICAL if the Agent cannot connect to the DataNode's JMX interface for any reason (e.g., wrong port provided, timeout, unparseable JSON response); returns OK otherwise.
Need help? Contact Datadog Support.