An HDFS plugin for Logstash. This plugin is provided as an external plugin (see Usage below) and is not part of the Logstash project.
Run Logstash with the `--pluginpath` (`-p`) command-line argument so Logstash knows where the plugin is. You also need to let Java know where your Hadoop JARs are, so set the `CLASSPATH` variable accordingly.
For example:

```shell
CLASSPATH=$(find /path/to/hadoop -name '*.jar' | tr '\n' ':'):/etc/hadoop/conf:/path/to/logstash-1.1.7-monolithic.jar java logstash.runner agent -f conf/hdfs-output.conf -p /path/to/cloned/logstash-hdfs
```
Note that Logstash is not executed with `java -jar`, because executable JARs ignore any external classpath. Instead, the Logstash JAR is placed on the classpath and the runner class is invoked directly.
Important: the Hadoop configuration directory containing `hdfs-site.xml` must be on the classpath.
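As a sanity check before launching, a small Ruby snippet (hypothetical, not part of the plugin) can confirm that some classpath entry actually contains `hdfs-site.xml`:

```ruby
# Hypothetical helper (not part of the plugin): returns true if any
# colon-separated classpath entry is a directory containing hdfs-site.xml.
def hdfs_conf_on_classpath?(classpath)
  classpath.split(':').any? do |entry|
    File.file?(File.join(entry, 'hdfs-site.xml'))
  end
end

# Example: check the CLASSPATH exported by the shell before starting Logstash.
puts hdfs_conf_on_classpath?(ENV['CLASSPATH'].to_s)
```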
Config options are basically the same as those of the file output, but have a look at the `doc/` directory for specifics.
By default, the plugin uses Hadoop's default configuration location. However, a configuration option named `hadoop_config_resources` lets you pass one or more classpath locations of configuration files to override that default:
```
output {
  hdfs {
    path => "/path/to/output_file.log"
    hadoop_config_resources => ['path/to/configuration/on/classpath/hdfs-site.xml']
  }
}
```
Please note that HDFS versions prior to 2.x do not properly support append; see HADOOP-8230 for reference. To enable append on HDFS, set `dfs.support.append` in `hdfs-site.xml` (2.x) or `dfs.support.broken.append` (1.x), and use the `enable_append` config option:
```
output {
  hdfs {
    path => "/path/to/output_file.log"
    enable_append => true
  }
}
```
If append is not supported and the file already exists, the plugin will cowardly refuse to reopen the file for writing unless `enable_reopen` is set to true. This is probably a very bad idea; you have been warned!
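If you really do want that behavior, `enable_reopen` is set like any other option (use with care):

```
output {
  hdfs {
    path => "/path/to/output_file.log"
    enable_reopen => true
  }
}
```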
Flush and sync don't actually work as promised on HDFS (see HDFS-536). In Hadoop 2.x, `hflush` provides flush-like functionality, and the plugin will use `hflush` if it is available. Nevertheless, the flushing code has been left in the plugin in case `flush` and `sync` work on some HDFS implementation.
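The fallback just described can be sketched in Ruby as follows (a minimal illustration with hypothetical names; the real plugin works through the Hadoop Java API):

```ruby
# Sketch: prefer hflush when the underlying stream exposes it (Hadoop 2.x),
# otherwise fall back to plain flush, which may be a no-op on older HDFS
# builds (see HDFS-536). Returns a symbol naming what was attempted.
def best_effort_flush(stream)
  if stream.respond_to?(:hflush)
    stream.hflush   # Hadoop 2.x: flushes buffered data out to the datanodes
    :hflush
  elsif stream.respond_to?(:flush)
    stream.flush    # legacy path; not guaranteed to reach stable storage
    :flush
  else
    :none
  end
end
```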
The plugin is released under the LGPL v3.