
Decoders

Apache Access Log Decoder

New in version 0.6.

Parses the Apache access logs based on the Apache ‘LogFormat’ configuration directive. The Apache format specifiers are mapped onto the Nginx variable names where applicable, e.g. %a -> remote_addr. This allows generic web filters and outputs to work with input from any HTTP server.

Config:

  • log_format (string)

The ‘LogFormat’ configuration directive from the apache2.conf. %t variables are converted to the number of nanoseconds since the Unix epoch and used to set the Timestamp on the message. http://httpd.apache.org/docs/2.4/mod/mod_log_config.html

  • type (string, optional, default nil):

    Sets the message ‘Type’ header to the specified value

  • user_agent_transform (bool, optional, default false)

    Transform the http_user_agent into user_agent_browser, user_agent_version, user_agent_os.

  • user_agent_keep (bool, optional, default false)

    Always preserve the http_user_agent value if transform is enabled.

  • user_agent_conditional (bool, optional, default false)

    Only preserve the http_user_agent value if transform is enabled and fails.

  • payload_keep (bool, optional, default false)

    Always preserve the original log line in the message payload.

Example Heka Configuration

[TestWebserver]
type = "LogstreamerInput"
log_directory = "/var/log/apache"
file_match = 'access\.log'
decoder = "CombinedLogDecoder"

[CombinedLogDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/apache_access.lua"

[CombinedLogDecoder.config]
type = "combined"
user_agent_transform = true
# combined log format
log_format = '%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"'

# common log format
# log_format = '%h %l %u %t \"%r\" %>s %O'

# vhost_combined log format
# log_format = '%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"'

# referer log format
# log_format = '%{Referer}i -> %U'
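
The user_agent_* options can be combined. As an illustrative variant of the config above (not taken from the original page), this keeps the raw http_user_agent string whenever the transform fails, and preserves the original log line:

```toml
# Variant of the decoder config above: transform the user agent, but
# fall back to keeping the raw http_user_agent when parsing fails.
[CombinedLogDecoder.config]
type = "combined"
user_agent_transform = true
user_agent_conditional = true
payload_keep = true
log_format = '%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"'
```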

Example Heka Message

Timestamp:2014-01-10 07:04:56 -0800 PST
Type:combined
Hostname:test.example.com
Pid:0
UUID:8e414f01-9d7f-4a48-a5e1-ae92e5954df5
Logger:TestWebserver
Payload:
EnvVersion:
Severity:7
Fields:
name:"remote_user" value_string:"-"
name:"http_x_forwarded_for" value_string:"-"
name:"http_referer" value_string:"-"
name:"body_bytes_sent" value_type:DOUBLE representation:"B" value_double:82
name:"remote_addr" value_string:"62.195.113.219" representation:"ipv4"
name:"status" value_type:DOUBLE value_double:200
name:"request" value_string:"GET /v1/recovery_email/status HTTP/1.1"
name:"user_agent_os" value_string:"FirefoxOS"
name:"user_agent_browser" value_string:"Firefox"
name:"user_agent_version" value_type:DOUBLE value_double:29

New in version 0.6.

GeoIpDecoder

Decoder plugin that generates GeoIP data based on the IP address in a specified field. It uses the Go project https://github.com/abh/geoip as a wrapper around MaxMind’s geoip-api-c library. This decoder assumes you have downloaded and installed the geoip-api-c library from MaxMind’s website. Currently, only the GeoLiteCity database is supported; you must also download and install it yourself into a location referenced by the db_file config option. By default the database file is opened using “GEOIP_MEMORY_CACHE” mode. This setting is hard-coded into the wrapper’s geoip.go file; you will need to manually override that code if you want to specify one of the other modes the geoip-api-c library supports.

Note

If you are using this with the ES output you will likely need to specify the raw_bytes_field option for the target_field specified. This is required to preserve the formatting of the JSON object.
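
As a hedged sketch of what that wiring might look like (the encoder name and the plural option spelling here are assumptions, not taken from this page; check your Heka version's documentation):

```toml
# Hypothetical Elasticsearch wiring; "raw_bytes_fields" on the JSON
# encoder keeps the geoip JSON object from being escaped as a string.
[ESJsonEncoder]
raw_bytes_fields = ["geoip"]

[ElasticSearchOutput]
message_matcher = "Type == 'combined'"
encoder = "ESJsonEncoder"
```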

Config:

  • db_file:

The location of the GeoLiteCity.dat database. Defaults to "/var/cache/hekad/GeoLiteCity.dat".

  • source_ip_field:

    The name of the field containing the IP address you want to derive the location for.

  • target_field:

    The name of the new field created by the decoder. The decoder will output a JSON object with the following elements:

    • latitude: string,
    • longitude: string,
    • location: [ float64, float64 ],
    • coordinates: [ string, string ],
    • countrycode: string,
    • countrycode3: string,
    • region: string,
    • city: string,
    • postalcode: string,
    • areacode: int,
    • charset: int,
    • continentalcode: string

[apache_geoip_decoder]
type = "GeoIpDecoder"
db_file="/etc/geoip/GeoLiteCity.dat"
source_ip_field="remote_host"
target_field="geoip"
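
Because the decoder reads an existing message field, it is typically chained after a decoder that produces one. A hedged sketch using a MultiDecoder (the chain's section name is illustrative; the sub-decoders reuse names from elsewhere on this page):

```toml
# Hypothetical chain: parse the access log first, then geo-locate the
# remote_addr field that the log parser produced.
[geo-chain]
type = "MultiDecoder"
subs = ["CombinedLogDecoder", "apache_geoip_decoder"]
cascade_strategy = "all"
```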

MultiDecoder

This decoder plugin allows you to specify an ordered list of delegate decoders. The MultiDecoder will pass the PipelinePack to be decoded to each of the delegate decoders in turn until decode succeeds. In the case of failure to decode, MultiDecoder will return an error and recycle the message.

Config:

  • subs ([]string):

    An ordered list of subdecoders to which the MultiDecoder will delegate. Each item in the list should specify another decoder configuration section by section name. Must contain at least one entry.

  • log_sub_errors (bool):

    If true, the DecoderRunner will log the errors returned whenever a delegate decoder fails to decode a message. Defaults to false.

  • cascade_strategy (string):

    Specifies behavior the MultiDecoder should exhibit with regard to cascading through the listed decoders. Supports only two valid values: “first-wins” and “all”. With “first-wins”, each decoder will be tried in turn until there is a successful decoding, after which decoding will be stopped. With “all”, all listed decoders will be applied whether or not they succeed. In each case, decoding will only be considered to have failed if none of the sub-decoders succeed.
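
For contrast with the "all" example below, a "first-wins" setup tries each decoder in order and stops at the first success. A hedged sketch (the sub-decoder names here are hypothetical, reusing decoders defined elsewhere on this page):

```toml
# Hypothetical "first-wins" cascade: try the access-log decoder first,
# fall back to the error-log decoder if the first fails to match.
[fallback-decoder]
type = "MultiDecoder"
subs = ["nginx-access-decoder", "NginxErrorDecoder"]
cascade_strategy = "first-wins"
log_sub_errors = true
```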

Here is a slightly contrived example where we have protocol buffer encoded messages coming in over a TCP connection, with each message containing a single nginx log line. Our MultiDecoder will run each message through two decoders, the first to deserialize the protocol buffer and the second to parse the log text:

[TcpInput]
address = ":5565"
parser_type = "message.proto"
decoder = "shipped-nginx-decoder"

[shipped-nginx-decoder]
type = "MultiDecoder"
subs = ['ProtobufDecoder', 'nginx-access-decoder']
cascade_strategy = "all"
log_sub_errors = true

[ProtobufDecoder]

[nginx-access-decoder]
type = "SandboxDecoder"
filename = "lua_decoders/nginx_access.lua"

    [nginx-access-decoder.config]
    type = "combined"
    user_agent_transform = true
    log_format = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"'

Linux Cpu Stats Decoder

New in version 0.7.

Parses a payload containing the contents of a /proc/loadavg file into a Heka message.

Config:

  • payload_keep (bool, optional, default false)

    Always preserve the original log line in the message payload.

Example Heka Configuration

[CpuStats]
type = "FilePollingInput"
ticker_interval = 1
file_path = "/proc/loadavg"
decoder = "CpuStatsDecoder"

[CpuStatsDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/cpustats.lua"

Example Heka Message

Timestamp:2014-01-10 07:04:56 -0800 PST
Type:stats.cpustats
Hostname:test.example.com
Pid:0
UUID:8e414f01-9d7f-4a48-a5e1-ae92e5954df5
Payload:
EnvVersion:
Severity:7
Fields:
name:"1MinAvg" value_type:DOUBLE value_double:3.05
name:"5MinAvg" value_type:DOUBLE value_double:1.21
name:"15MinAvg" value_type:DOUBLE value_double:0.44
name:"NumProcesses" value_type:DOUBLE value_double:11
name:"FilePath" value_string:"/proc/loadavg"

Linux Disk Stats Decoder

New in version 0.7.

Parses a payload containing the contents of a /sys/block/$DISK/stat file (where $DISK is a disk identifier such as sda) into a Heka message struct. It also tries to obtain the TickerInterval of the input it received the data from, by extracting it from a message field named TickerInterval.

Config:

  • payload_keep (bool, optional, default false)

    Always preserve the original log line in the message payload.

Example Heka Configuration

[DiskStats]
type = "FilePollingInput"
ticker_interval = 1
file_path = "/sys/block/sda1/stat"
decoder = "DiskStatsDecoder"

[DiskStatsDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/diskstats.lua"

Example Heka Message

Timestamp:2014-01-10 07:04:56 -0800 PST
Type:stats.diskstats
Hostname:test.example.com
Pid:0
UUID:8e414f01-9d7f-4a48-a5e1-ae92e5954df5
Payload:
EnvVersion:
Severity:7
Fields:
name:"ReadsCompleted" value_type:DOUBLE value_double:20123
name:"ReadsMerged" value_type:DOUBLE value_double:11267
name:"SectorsRead" value_type:DOUBLE value_double:1.094968e+06
name:"TimeReading" value_type:DOUBLE value_double:45148
name:"WritesCompleted" value_type:DOUBLE value_double:1278
name:"WritesMerged" value_type:DOUBLE value_double:1278
name:"SectorsWritten" value_type:DOUBLE value_double:206504
name:"TimeWriting" value_type:DOUBLE value_double:3348
name:"TimeDoingIO" value_type:DOUBLE value_double:4876
name:"WeightedTimeDoingIO" value_type:DOUBLE value_double:48356
name:"NumIOInProgress" value_type:DOUBLE value_double:3
name:"TickerInterval" value_type:DOUBLE value_double:2
name:"FilePath" value_string:"/sys/block/sda/stat"

Nginx Access Log Decoder

New in version 0.5.

Parses the Nginx access logs based on the Nginx ‘log_format’ configuration directive.

Config:

  • log_format (string)

The ‘log_format’ configuration directive from the nginx.conf. The $time_local or $time_iso8601 variable is converted to the number of nanoseconds since the Unix epoch and used to set the Timestamp on the message. http://nginx.org/en/docs/http/ngx_http_log_module.html

  • type (string, optional, default nil):

    Sets the message ‘Type’ header to the specified value

  • user_agent_transform (bool, optional, default false)

    Transform the http_user_agent into user_agent_browser, user_agent_version, user_agent_os.

  • user_agent_keep (bool, optional, default false)

    Always preserve the http_user_agent value if transform is enabled.

  • user_agent_conditional (bool, optional, default false)

    Only preserve the http_user_agent value if transform is enabled and fails.

  • payload_keep (bool, optional, default false)

    Always preserve the original log line in the message payload.

Example Heka Configuration

[TestWebserver]
type = "LogstreamerInput"
log_directory = "/var/log/nginx"
file_match = 'access\.log'
decoder = "CombinedLogDecoder"

[CombinedLogDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/nginx_access.lua"

[CombinedLogDecoder.config]
type = "combined"
user_agent_transform = true
# combined log format
log_format = '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"'

Example Heka Message

Timestamp:2014-01-10 07:04:56 -0800 PST
Type:combined
Hostname:test.example.com
Pid:0
UUID:8e414f01-9d7f-4a48-a5e1-ae92e5954df5
Logger:TestWebserver
Payload:
EnvVersion:
Severity:7
Fields:
name:"remote_user" value_string:"-"
name:"http_x_forwarded_for" value_string:"-"
name:"http_referer" value_string:"-"
name:"body_bytes_sent" value_type:DOUBLE representation:"B" value_double:82
name:"remote_addr" value_string:"62.195.113.219" representation:"ipv4"
name:"status" value_type:DOUBLE value_double:200
name:"request" value_string:"GET /v1/recovery_email/status HTTP/1.1"
name:"user_agent_os" value_string:"FirefoxOS"
name:"user_agent_browser" value_string:"Firefox"
name:"user_agent_version" value_type:DOUBLE value_double:29

Linux Memory Stats Decoder

New in version 0.7.

Parses a payload containing the contents of a /proc/meminfo file into a Heka message.

Config:

  • payload_keep (bool, optional, default false)

    Always preserve the original log line in the message payload.

Example Heka Configuration

[MemStats]
type = "FilePollingInput"
ticker_interval = 1
file_path = "/proc/meminfo"
decoder = "MemStatsDecoder"

[MemStatsDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/memstats.lua"

Example Heka Message

Timestamp:2014-01-10 07:04:56 -0800 PST
Type:stats.memstats
Hostname:test.example.com
Pid:0
UUID:8e414f01-9d7f-4a48-a5e1-ae92e5954df5
Payload:
EnvVersion:
Severity:7
Fields:
name:"MemTotal" value_type:DOUBLE representation:"kB" value_double:4047616
name:"MemFree" value_type:DOUBLE representation:"kB" value_double:3432216
name:"Buffers" value_type:DOUBLE representation:"kB" value_double:82028
name:"Cached" value_type:DOUBLE representation:"kB" value_double:368636
name:"FilePath" value_string:"/proc/meminfo"

The total available fields can be found in man procfs. All fields are of type double, and the representation is in kB (except for the HugePages fields). Here is a full list of fields available:

MemTotal, MemFree, Buffers, Cached, SwapCached, Active, Inactive, Active(anon), Inactive(anon), Active(file), Inactive(file), Unevictable, Mlocked, SwapTotal, SwapFree, Dirty, Writeback, AnonPages, Mapped, Shmem, Slab, SReclaimable, SUnreclaim, KernelStack, PageTables, NFS_Unstable, Bounce, WritebackTmp, CommitLimit, Committed_AS, VmallocTotal, VmallocUsed, VmallocChunk, HardwareCorrupted, AnonHugePages, HugePages_Total, HugePages_Free, HugePages_Rsvd, HugePages_Surp, Hugepagesize, DirectMap4k, DirectMap2M, DirectMap1G.

Note that the available fields may vary slightly depending on the system’s kernel version.

MySQL Slow Query Log Decoder

New in version 0.6.

Parses and transforms the MySQL slow query logs. Use mariadb_slow_query.lua to parse the MariaDB variant of the MySQL slow query logs.

Config:

  • truncate_sql (int, optional, default nil)

Truncates the SQL payload to the specified number of bytes (not UTF-8 aware) and appends "...". If the value is nil, no truncation is performed. A negative value will truncate the specified number of bytes from the end.
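
For instance, a hedged sketch of the negative form (reusing the decoder section name from the example below):

```toml
# Hypothetical variant: a negative truncate_sql drops the trailing 64
# bytes of the SQL payload rather than keeping only the leading bytes.
[MySqlSlowQueryDecoder.config]
truncate_sql = -64
```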

Example Heka Configuration

[Sync-1_5-SlowQuery]
type = "LogstreamerInput"
log_directory = "/var/log/mysql"
file_match = 'mysql-slow\.log'
parser_type = "regexp"
delimiter = "\n(# User@Host:)"
delimiter_location = "start"
decoder = "MySqlSlowQueryDecoder"

[MySqlSlowQueryDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/mysql_slow_query.lua"

    [MySqlSlowQueryDecoder.config]
    truncate_sql = 64

Example Heka Message

Timestamp:2014-05-07 15:51:28 -0700 PDT
Type:mysql.slow-query
Hostname:127.0.0.1
Pid:0
UUID:5324dd93-47df-485b-a88e-429f0fcd57d6
Logger:Sync-1_5-SlowQuery
Payload:/* [queryName=FIND_ITEMS] */ SELECT bso.userid, bso.collection, ...
EnvVersion:
Severity:7
Fields:
name:"Rows_examined" value_type:DOUBLE value_double:16458
name:"Query_time" value_type:DOUBLE representation:"s" value_double:7.24966
name:"Rows_sent" value_type:DOUBLE value_double:5001
name:"Lock_time" value_type:DOUBLE representation:"s" value_double:0.047038

Nginx Error Log Decoder

New in version 0.6.

Parses the Nginx error logs based on the Nginx hard coded internal format.

Config:

  • tz (string, optional, defaults to UTC)

The timezone of the log timestamps, if not UTC. The conversion actually happens on the Go side, since there isn’t good TZ support in the Lua sandbox.

Example Heka Configuration

[TestWebserverError]
type = "LogstreamerInput"
log_directory = "/var/log/nginx"
file_match = 'error\.log'
decoder = "NginxErrorDecoder"

[NginxErrorDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/nginx_error.lua"

[NginxErrorDecoder.config]
tz = "America/Los_Angeles"

Example Heka Message

Timestamp:2014-01-10 07:04:56 -0800 PST
Type:nginx.error
Hostname:trink-x230
Pid:16842
UUID:8e414f01-9d7f-4a48-a5e1-ae92e5954df5
Logger:TestWebserverError
Payload:using inherited sockets from "6;"
EnvVersion:
Severity:5
Fields:
name:"tid" value_type:DOUBLE value_double:0
name:"connection" value_type:DOUBLE value_double:8878

PayloadRegexDecoder

Decoder plugin that accepts messages of a specified form and generates new outgoing messages from extracted data, effectively transforming one message format into another.

Note

The Go regular expression tester is an invaluable tool for constructing and debugging regular expressions to be used for parsing your input data.

Config:

  • match_regex:

    Regular expression that must match for the decoder to process the message.

  • severity_map:

    Subsection defining severity strings and the numerical value they should be translated to. hekad uses numerical severity codes, so a severity of WARNING can be translated to 3 by settings in this section. See Heka Message.

  • message_fields:

Subsection defining message fields to populate and the interpolated values that should be used. Valid interpolated values are any captured in a regex in the message_matcher, and any other field that exists in the message. In the event that a captured name overlaps with a message field, the captured name’s value will be used. Optional representation metadata can be added at the end of the field name using a pipe delimiter, i.e. ResponseSize|B = "%ResponseSize%" will create Fields[ResponseSize] representing the number of bytes. Adding a representation string to a standard message header name will cause it to be added as a user defined field, i.e. Payload|json will create Fields[Payload] with a json representation (see Field Variables).

    Interpolated values should be surrounded with % signs, for example:

    [my_decoder.message_fields]
    Type = "%Type%Decoded"
    

This will result in the new message’s Type being set to the old message’s Type with "Decoded" appended.

  • timestamp_layout (string):

    A formatting string instructing hekad how to turn a time string into the actual time representation used internally. Example timestamp layouts can be seen in Go’s time documentation.

  • timestamp_location (string):

    Time zone in which the timestamps in the text are presumed to be in. Should be a location name corresponding to a file in the IANA Time Zone database (e.g. “America/Los_Angeles”), as parsed by Go’s time.LoadLocation() function (see http://golang.org/pkg/time/#LoadLocation). Defaults to “UTC”. Not required if valid time zone info is embedded in every parsed timestamp, since those can be parsed as specified in the timestamp_layout.

  • log_errors (bool):

    New in version 0.5.

If set to false, payloads that cannot be matched against the regex will not be logged as errors. Defaults to true.

Example (Parsing Apache Combined Log Format):

[apache_transform_decoder]
type = "PayloadRegexDecoder"
match_regex = '^(?P<RemoteIP>\S+) \S+ \S+ \[(?P<Timestamp>[^\]]+)\] "(?P<Method>[A-Z]+) (?P<Url>[^\s]+)[^"]*" (?P<StatusCode>\d+) (?P<RequestSize>\d+) "(?P<Referer>[^"]*)" "(?P<Browser>[^"]*)"'
timestamp_layout = "02/Jan/2006:15:04:05 -0700"

# severities in this case would work only if a (?P<Severity>...) matching
# group was present in the regex, and the log file contained this information.
[apache_transform_decoder.severity_map]
DEBUG = 7
INFO = 6
WARNING = 4

[apache_transform_decoder.message_fields]
Type = "ApacheLogfile"
Logger = "apache"
Url|uri = "%Url%"
Method = "%Method%"
Status = "%StatusCode%"
RequestSize|B = "%RequestSize%"
Referer = "%Referer%"
Browser = "%Browser%"
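
Building on the comment in the example above, a variant that actually captures a severity token could look like the following. This is a hedged sketch: the regex assumes log lines that begin with an upper-case severity word, which the standard combined format does not contain.

```toml
# Hypothetical variant: a (?P<Severity>...) capture group feeds the
# severity_map; only useful if the log lines carry such a token.
[severity_transform_decoder]
type = "PayloadRegexDecoder"
match_regex = '^(?P<Severity>[A-Z]+) (?P<RemoteIP>\S+) (?P<Message>.*)'

[severity_transform_decoder.severity_map]
DEBUG = 7
INFO = 6
WARNING = 4
```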

PayloadXmlDecoder

This decoder plugin accepts XML blobs in the message payload and allows you to map parts of the XML into Field attributes of the pipeline pack message, using XPath syntax as implemented by the xmlpath library.

Config:

  • xpath_map:

A subsection defining a capture name that maps to an XPath expression. Each expression can fetch a single value; if the expression does not resolve to a valid node in the XML blob, the capture group will be assigned an empty string value.

  • severity_map:

    Subsection defining severity strings and the numerical value they should be translated to. hekad uses numerical severity codes, so a severity of WARNING can be translated to 3 by settings in this section. See Heka Message.

  • message_fields:

Subsection defining message fields to populate and the interpolated values that should be used. Valid interpolated values are any captured in an XPath in the message_matcher, and any other field that exists in the message. In the event that a captured name overlaps with a message field, the captured name’s value will be used. Optional representation metadata can be added at the end of the field name using a pipe delimiter, i.e. ResponseSize|B = "%ResponseSize%" will create Fields[ResponseSize] representing the number of bytes. Adding a representation string to a standard message header name will cause it to be added as a user defined field, i.e. Payload|json will create Fields[Payload] with a json representation (see Field Variables).

    Interpolated values should be surrounded with % signs, for example:

    [my_decoder.message_fields]
    Type = "%Type%Decoded"
    

This will result in the new message’s Type being set to the old message’s Type with "Decoded" appended.

  • timestamp_layout (string):

A formatting string instructing hekad how to turn a time string into the actual time representation used internally. Example timestamp layouts can be seen in Go’s time documentation. The default layout is ISO 8601, the same as JavaScript.

  • timestamp_location (string):

    Time zone in which the timestamps in the text are presumed to be in. Should be a location name corresponding to a file in the IANA Time Zone database (e.g. “America/Los_Angeles”), as parsed by Go’s time.LoadLocation() function (see http://golang.org/pkg/time/#LoadLocation). Defaults to “UTC”. Not required if valid time zone info is embedded in every parsed timestamp, since those can be parsed as specified in the timestamp_layout.

Example:

[myxml_decoder]
type = "PayloadXmlDecoder"

[myxml_decoder.xpath_map]
Count = "/some/path/count"
Name = "/some/path/name"
Pid = "//pid"
Timestamp = "//timestamp"
Severity = "//severity"

[myxml_decoder.severity_map]
DEBUG = 7
INFO = 6
WARNING = 4

[myxml_decoder.message_fields]
Pid = "%Pid%"
StatCount = "%Count%"
StatName =  "%Name%"
Timestamp = "%Timestamp%"
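
If the //timestamp expression above yields a string in a non-default format, a layout and zone can be supplied. The layout below is an assumption about the input, shown only to illustrate the Go reference-time convention:

```toml
# Hypothetical addition to the example above, assuming the XML carries
# times like "2014-01-10 07:04:56" in Pacific time.
[myxml_decoder]
type = "PayloadXmlDecoder"
timestamp_layout = "2006-01-02 15:04:05"
timestamp_location = "America/Los_Angeles"
```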

PayloadXmlDecoder’s xpath_map config subsection supports XPath as implemented by the xmlpath library.

  • All axes are supported (“child”, “following-sibling”, etc)
  • All abbreviated forms are supported (”.”, “//”, etc)
  • All node types except for namespace are supported
  • Predicates are restricted to [N], [path], and [path=literal] forms
  • Only a single predicate is supported per path step
  • Richer expressions and namespaces are not supported

ProtobufDecoder

The ProtobufDecoder is used for Heka message objects that have been serialized into protocol buffers format. This is the format that Heka uses to communicate with other Heka instances, so one will always be included in your Heka configuration whether specified or not. The ProtobufDecoder has no configuration options.

The hekad protocol buffers message schema is defined in the message.proto file in the message package.

Example:

[ProtobufDecoder]

Rsyslog Decoder

New in version 0.5.

Parses rsyslog output using the string-based configuration template.

Config:

  • template (string)

    The ‘template’ configuration string from rsyslog.conf. http://rsyslog-5-8-6-doc.neocities.org/rsyslog_conf_templates.html

  • tz (string, optional, defaults to UTC)

    If your rsyslog timestamp field in the template does not carry zone offset information, you may set an offset to be applied to your events here. Typically this would be used with the “Traditional” rsyslog formats.

    Parsing is done by Go, supports values of “UTC”, “Local”, or a location name corresponding to a file in the IANA Time Zone database, e.g. “America/New_York”.

Example Heka Configuration

[RsyslogDecoder]
type = "SandboxDecoder"
filename = "lua_decoders/rsyslog.lua"

[RsyslogDecoder.config]
type = "RSYSLOG_TraditionalFileFormat"
template = '%TIMESTAMP% %HOSTNAME% %syslogtag%%msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%\n'
tz = "America/Los_Angeles"

Example Heka Message

Timestamp:2014-02-10 12:58:58 -0800 PST
Type:RSYSLOG_TraditionalFileFormat
Hostname:trink-x230
Pid:0
UUID:e0eef205-0b64-41e8-a307-5772b05e16c1
Logger:RsyslogInput
Payload:"imklog 5.8.6, log source = /proc/kmsg started."
EnvVersion:
Severity:7
Fields:
name:"programname" value_string:"kernel"

SandboxDecoder

The SandboxDecoder provides an isolated execution environment for data parsing and complex transformations without the need to recompile Heka. See Sandbox.

Config:

Example

[sql_decoder]
type = "SandboxDecoder"
filename = "sql_decoder.lua"

ScribbleDecoder

New in version 0.5.

The ScribbleDecoder is a trivial decoder that makes it possible to set one or more static field values on every decoded message. It is often used in conjunction with another decoder (i.e. in a MultiDecoder w/ cascade_strategy set to “all”) to, for example, set the message type of every message to a specific custom value after the messages have been decoded from Protocol Buffers format. Note that this only supports setting the exact same value on every message. If any dynamic computation is required to determine what the value should be, or whether it should be applied to a specific message, a SandboxDecoder using the provided write_message API call should be used instead.

Config:

  • message_fields:

Subsection defining message fields to populate. Optional representation metadata can be added at the end of the field name using a pipe delimiter, i.e. host|ipv4 = "192.168.55.55" will create Fields[Host] containing an IPv4 address. Adding a representation string to a standard message header name will cause it to be added as a user defined field, i.e. Payload|json will create Fields[Payload] with a json representation (see Field Variables). Does not support Timestamp or Uuid.

Example (in MultiDecoder context)

[mytypedecoder]
type = "MultiDecoder"
subs = ["ProtobufDecoder", "mytype"]
cascade_strategy = "all"
log_sub_errors = true

[ProtobufDecoder]

[mytype]
type = "ScribbleDecoder"

    [mytype.message_fields]
    Type = "MyType"

StatsToFieldsDecoder

New in version 0.4.

The StatsToFieldsDecoder will parse time series statistics data in the graphite message format and encode the data into the message fields, in the same format produced by a StatAccumInput plugin with the emit_in_fields value set to true. This is useful if you have externally generated graphite string data flowing through Heka that you’d like to process without having to roll your own string parsing code.

This decoder has no configuration options. It simply expects to be passed messages with statsd string data in the payload. Incorrect or malformed content will cause a decoding error, dropping the message.

The fields format only contains a single “timestamp” field, so any payloads containing multiple timestamps will end up generating a separate message for each timestamp. Extra messages will be a copy of the original message except a) the payload will be empty and b) the unique timestamp and related stats will be the only message fields.
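
A minimal wiring sketch (the input type, section name, and address here are assumptions; any input that delivers graphite text in the message payload will do):

```toml
# Hypothetical input feeding graphite-format payloads such as
# "stats.counters.hits 320 1410824406" to the decoder.
[GraphiteTcpInput]
type = "TcpInput"
address = ":2003"
decoder = "StatsToFieldsDecoder"

[StatsToFieldsDecoder]
```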

Example:

[StatsToFieldsDecoder]