Monitoring

Monitoring is an important part of system administration. In this section, we look at the monitoring facilities in Hadoop and how they can hook into external monitoring systems.

The purpose of monitoring is to detect when the cluster is not providing the expected level of service. The master daemons are the most important to monitor: the namenodes (primary and secondary) and the jobtracker. Failure of datanodes and tasktrackers is to be expected, particularly on larger clusters, so you should provide extra capacity so that the cluster can tolerate having a small percentage of dead nodes at any time.

In addition to the facilities described next, some administrators run test jobs on a periodic basis as a test of the cluster’s health.

There is a lot of work going on to add more monitoring capabilities to Hadoop, which is not covered here. For example, Chukwa is a data collection and monitoring system built on HDFS and MapReduce that excels at mining log data to find large-scale trends.

Logging

All Hadoop daemons produce logfiles that can be very useful for finding out what is happening in the system. “System logfiles” explains how to configure these files.

Setting log levels

When debugging a problem, it is very convenient to be able to change the log level temporarily for a particular component in the system. Hadoop daemons have a web page for changing the log level for any log4j log name, which can be found at /logLevel in the daemon’s web UI. By convention, log names in Hadoop correspond to the classname doing the logging, although there are exceptions to this rule, so you should consult the source code to find log names.

For example, to enable debug logging for the JobTracker class, we would visit the jobtracker’s web UI at http://jobtracker-host:50030/logLevel and set the log name org.apache.hadoop.mapred.JobTracker to level DEBUG. The same thing can be achieved from the command line as follows:
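
(This uses Hadoop’s daemonlog command; the host and port are the same ones the web UI listens on.)

% hadoop daemonlog -setlevel jobtracker-host:50030 \
    org.apache.hadoop.mapred.JobTracker DEBUG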

Log levels changed in this way are reset when the daemon restarts, which is usually what you want. However, to make a persistent change to a log level, simply change the log4j.properties file in the configuration directory. In this case, the line to add is:
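
(Standard log4j syntax; the logger name is the log name used above.)

log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG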

Getting stack traces

Hadoop daemons expose a web page (/stacks in the web UI) that produces a thread dump for all running threads in the daemon’s JVM. For example, you can get a thread dump for a jobtracker from http://jobtracker-host:50030/stacks.

Metrics

The HDFS and MapReduce daemons collect information about events and measurements that are collectively known as metrics. For example, datanodes collect the following metrics (and many more): the number of bytes written, the number of blocks replicated, and the number of read requests from clients (both local and remote).

Metrics belong to a context, and Hadoop currently uses “dfs”, “mapred”, “rpc”, and “jvm” contexts. Hadoop daemons usually collect metrics under several contexts. For example, datanodes collect metrics for the “dfs”, “rpc”, and “jvm” contexts.

How Do Metrics Differ from Counters?

The main difference is their scope: metrics are collected by Hadoop daemons, whereas counters (see “Counters” ) are collected for MapReduce tasks and aggregated for the whole job. They have different audiences, too: broadly speaking, metrics are for administrators, and counters are for MapReduce users.

The way they are collected and aggregated is also different. Counters are a MapReduce feature, and the MapReduce system ensures that counter values are propagated from the tasktrackers where they are produced, back to the jobtracker, and finally back to the client running the MapReduce job. (Counters are propagated via RPC heartbeats; see “Progress and Status Updates” .) Both the tasktrackers and the jobtracker perform aggregation.

The collection mechanism for metrics is decoupled from the component that receives the updates, and there are various pluggable outputs, including local files, Ganglia, and JMX. The daemon collecting the metrics performs aggregation on them before they are sent to the output.

A context defines the unit of publication; you can choose to publish the “dfs” context, but not the “jvm” context, for instance. Metrics are configured in the conf/hadoop-metrics.properties file, and, by default, all contexts are configured so they do not publish their metrics. This is the contents of the default configuration file (minus the comments):
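
(Reconstructed to match the description that follows; each of the four contexts is mapped to NullContext.)

dfs.class=org.apache.hadoop.metrics.spi.NullContext
mapred.class=org.apache.hadoop.metrics.spi.NullContext
jvm.class=org.apache.hadoop.metrics.spi.NullContext
rpc.class=org.apache.hadoop.metrics.spi.NullContext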

Each line in this file configures a different context and specifies the class that handles the metrics for that context. The class must be an implementation of the MetricsContext interface; as the name suggests, the NullContext class neither publishes nor updates metrics. The other implementations of MetricsContext are covered in the following sections.

FileContext

FileContext writes metrics to a local file. It exposes two configuration properties: fileName, which specifies the absolute name of the file to write to, and period, for the time interval (in seconds) between file updates. Both properties are optional; if not set, the metrics will be written to standard output every five seconds.

The term “context” is (perhaps unfortunately) overloaded here, since it can refer to either a collection of metrics (the “dfs” context, for example) or the class that publishes metrics (the NullContext, for example).


Configuration properties apply to a context name and are specified by appending the property name to the context name (separated by a dot). For example, to dump the “jvm” context to a file, we alter its configuration to be the following:

jvm.class=org.apache.hadoop.metrics.file.FileContext
jvm.fileName=/tmp/jvm_metrics.log

In the first line, we have changed the “jvm” context to use a FileContext, and in the second, we have set the “jvm” context’s fileName property to be a temporary file. Here are two lines of output from the logfile, split over several lines to fit the page:
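
(The field names below are from the “jvm” context; the host name and values are illustrative, and trailing fields are elided.)

jvm.metrics: hostName=host1.example.com, processName=NameNode, sessionId=,
    gcCount=46, gcTimeMillis=394, logError=0, logFatal=0, logInfo=59, logWarn=1,
    memHeapCommittedM=4.9375, memHeapUsedM=2.5322647, ...
jvm.metrics: hostName=host1.example.com, processName=SecondaryNameNode, sessionId=,
    gcCount=36, gcTimeMillis=261, logError=0, logFatal=0, logInfo=48, logWarn=0, ...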

FileContext can be useful on a local system for debugging purposes, but is unsuitable on a larger cluster since the output files are spread across the cluster, which makes analyzing them difficult.

GangliaContext

Ganglia (http://ganglia.info/) is an open source distributed monitoring system for very large clusters. It is designed to impose very low resource overheads on each node in the cluster. Ganglia itself collects metrics, such as CPU and memory usage; by using GangliaContext, you can inject Hadoop metrics into Ganglia.

GangliaContext has one required property, servers, which takes a space- and/or comma-separated list of Ganglia server host-port pairs. Further details on configuring this context can be found on the Hadoop wiki.
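
For example, a minimal sketch that sends the “dfs” context to a single Ganglia server (ganglia-host:8649 is a placeholder; 8649 is Ganglia’s customary gmond port):

dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.period=10
dfs.servers=ganglia-host:8649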

For a flavor of the kind of information you can get out of Ganglia, one useful chart is the number of tasks in the jobtracker’s queue as it varies over time.

NullContextWithUpdateThread

Both FileContext and GangliaContext push metrics to an external system. However, some monitoring systems, notably JMX, need to pull metrics from Hadoop. NullContextWithUpdateThread is designed for this. Like NullContext, it doesn’t publish any metrics, but in addition it runs a timer that periodically updates the metrics stored in memory. This ensures that the metrics are up-to-date when they are fetched by another system.
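
A minimal sketch for the “dfs” context (period is in seconds):

dfs.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
dfs.period=5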

All implementations of MetricsContext, except NullContext, perform this updating function (and they all expose a period property that defaults to five seconds), so you need to use NullContextWithUpdateThread only if you are not collecting metrics using another output. If you were using GangliaContext, for example, then it would ensure the metrics are updated, so you would be able to use JMX in addition with no further configuration of the metrics system. JMX is discussed in more detail shortly.

CompositeContext

CompositeContext allows you to output the same set of metrics to multiple contexts, such as a FileContext and a GangliaContext. The configuration is slightly tricky and is best shown by an example:
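
Here the “jvm” context is sent both to a file and to Ganglia (ganglia-host:8649 is a placeholder server):

jvm.class=org.apache.hadoop.metrics.spi.CompositeContext
jvm.arity=2
jvm.sub1.class=org.apache.hadoop.metrics.file.FileContext
jvm.fileName=/tmp/jvm_metrics.log
jvm.sub2.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.servers=ganglia-host:8649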

The arity property is used to specify the number of subcontexts; in this case, there are two. The property names for each subcontext are modified to have a part specifying the subcontext number, hence jvm.sub1.class and jvm.sub2.class.

Java Management Extensions

Java Management Extensions (JMX) is a standard Java API for monitoring and managing applications. Hadoop includes several managed beans (MBeans), which expose Hadoop metrics to JMX-aware applications. There are MBeans that expose the metrics in the “dfs” and “rpc” contexts, but none for the “mapred” context (at the time of this writing) or the “jvm” context (as the JVM itself exposes a richer set of JVM metrics).

The JDK comes with a tool called JConsole for viewing MBeans in a running JVM. It’s useful for browsing Hadoop metrics.


Although you can see Hadoop metrics via JMX using the default metrics configuration, they will not be updated unless you change the MetricsContext implementation to something other than NullContext.

For example, NullContextWithUpdateThread is appropriate if JMX is the only way you will be monitoring metrics.

Many third-party monitoring and alerting systems (such as Nagios or Hyperic) can query MBeans, making JMX the natural way to monitor your Hadoop cluster from an existing monitoring system. You will need to enable remote access to JMX, however, and choose a level of security that is appropriate for your cluster. The options here include password authentication, SSL connections, and SSL client authentication. See the official Java documentation for an in-depth guide on configuring these options.

All the options for enabling remote access to JMX involve setting Java system properties, which we do for Hadoop by editing the conf/hadoop-env.sh file. The following configuration settings show how to enable password-authenticated remote access to JMX on the namenode (with SSL disabled). The process is very similar for other Hadoop daemons:
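
A sketch for conf/hadoop-env.sh, assuming the password file lives in the Hadoop configuration directory (the port matches the example that follows):

export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote \
 -Dcom.sun.management.jmxremote.ssl=false \
 -Dcom.sun.management.jmxremote.password.file=$HADOOP_CONF_DIR/jmxremote.password \
 -Dcom.sun.management.jmxremote.port=8004 $HADOOP_NAMENODE_OPTS"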

The jmxremote.password file lists the usernames and their passwords in plain text; the JMX documentation has further details on the format of this file. With this configuration, we can use JConsole to browse MBeans on a remote namenode.

Alternatively, we can use one of the many JMX tools to retrieve MBean attribute values. Here is an example of using the “jmxquery” command-line tool (and Nagios plug-in, available from http://code.google.com/p/jmxquery/) to retrieve the number of under-replicated blocks:
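
A sketch of the invocation; the warning/critical thresholds and the monitorRole/secret credentials are placeholders, and the output line is illustrative:

% ./check_jmx -U service:jmx:rmi:///jndi/rmi://namenode-host:8004/jmxrmi \
 -O hadoop:service=NameNode,name=FSNamesystemState -A UnderReplicatedBlocks \
 -w 100 -c 1000 -username monitorRole -password secret
JMX OK - UnderReplicatedBlocks is 0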

This command establishes a JMX RMI connection to the host namenode-host on port 8004 and authenticates using the given username and password. It reads the attribute UnderReplicatedBlocks of the object named hadoop:service=NameNode,name=FSNamesystemState and prints out its value on the console. The -w and -c options specify warning and critical levels for the value: the appropriate values of these are normally determined after operating a cluster for a while.

It’s common to use Ganglia in conjunction with an alerting system like Nagios for monitoring a Hadoop cluster. Ganglia is good for efficiently collecting a large number of metrics and graphing them, whereas Nagios and similar systems are good at sending alerts when a critical threshold is reached in any of a smaller set of metrics.

It’s convenient to use JConsole to find the object names of the MBeans that you want to monitor. Note that MBeans for datanode metrics contain a random identifier in Hadoop 0.20, which makes it difficult to monitor them in anything but an ad hoc way. This was fixed in Hadoop 0.21.0.

