
Heka Message

Message Variables

  • uuid (required, []byte) - 16-byte array containing a type 4 (random) UUID.
  • timestamp (required, int64) - Number of nanoseconds since the UNIX epoch.
  • type (optional, string) - Type of message, e.g. “WebLog”.
  • logger (optional, string) - Data source, e.g. “Apache”, “TCPInput”, “/var/log/test.log”.
  • severity (optional, int32) - Syslog severity level.
  • payload (optional, string) - Textual data, e.g. log line, filename.
  • env_version (optional, string) - Envelope version. Semantic version of the message content (see http://semver.org/), although in most cases it is just the major version.
  • pid (optional, int32) - Process ID that generated the message.
  • hostname (optional, string) - Hostname that generated the message.
  • fields (optional, Field) - Array of Field structures.
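
As a rough illustration, the variables above could be modeled as follows. This is only a sketch: the names `HekaMessage` and `new_message` are hypothetical and not part of Heka. It mainly shows one way to populate the two required variables.

```python
import time
import uuid as uuidlib
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HekaMessage:
    """Hypothetical container mirroring the message variables listed above."""
    uuid: bytes                          # required: 16 raw bytes of a type 4 UUID
    timestamp: int                       # required: nanoseconds since the UNIX epoch
    type: Optional[str] = None
    logger: Optional[str] = None
    severity: Optional[int] = None
    payload: Optional[str] = None
    env_version: Optional[str] = None
    pid: Optional[int] = None
    hostname: Optional[str] = None
    fields: List[object] = field(default_factory=list)


def new_message(**optional) -> HekaMessage:
    """Fill in the two required variables and pass any optional ones through."""
    return HekaMessage(
        uuid=uuidlib.uuid4().bytes,      # uuid4() generates a random (type 4) UUID
        timestamp=time.time_ns(),        # integer nanoseconds, no float rounding
        **optional,
    )
```

For example, `new_message(type="WebLog", logger="Apache")` yields a message whose `uuid` is 16 bytes long and whose optional variables other than `type` and `logger` are left unset.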

Field Variables

  • name (required, string) - Name of the field (key).

  • value_type (optional, int32) - Type of the value stored in this field.
    • STRING = 0 (default)
    • BYTES = 1
    • INTEGER = 2
    • DOUBLE = 3
    • BOOL = 4
  • representation (optional, string) - Freeform metadata string where you can describe what the data in this field represents. This information might provide cues to assist with processing, labeling, or rendering of the data performed by downstream plugins or UI elements.

  • value_* (optional, value_type) - Array of values; only one type is active at a time.
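
The field variables can be sketched in the same spirit. The enum values below come from the list above; the `Field` class, the `make_field` helper, and the mapping from Python types to `value_type` are assumptions made for illustration only.

```python
from dataclasses import dataclass, field as dc_field
from enum import IntEnum
from typing import List, Optional


class ValueType(IntEnum):
    STRING = 0   # default
    BYTES = 1
    INTEGER = 2
    DOUBLE = 3
    BOOL = 4


@dataclass
class Field:
    """Hypothetical container mirroring the field variables listed above."""
    name: str
    value_type: ValueType = ValueType.STRING
    representation: Optional[str] = None
    values: List[object] = dc_field(default_factory=list)


# ASSUMPTION: this Python-type mapping is illustrative, not defined by Heka.
_PY_TYPES = {
    str: ValueType.STRING,
    bytes: ValueType.BYTES,
    int: ValueType.INTEGER,
    float: ValueType.DOUBLE,
    bool: ValueType.BOOL,
}


def make_field(name, values, representation=None) -> Field:
    """Build a Field, enforcing that only one value type is active."""
    kinds = {type(v) for v in values}
    if len(kinds) != 1:
        raise ValueError("a field must hold one or more values of a single type")
    (kind,) = kinds
    if kind not in _PY_TYPES:
        raise TypeError(f"unsupported value type: {kind.__name__}")
    return Field(name, _PY_TYPES[kind], representation, list(values))
```

Mixing types, as in `make_field("mixed", [1, "a"])`, raises `ValueError`, which reflects the rule that only one value type is active at a time.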

Stream Framing

Heka defines a custom framing format that can be used to delimit records in a stream of binary data. The structure encapsulating a single message consists of a one-byte record separator, one byte giving the header length, a protobuf-encoded message header, a one-byte unit separator, and the binary record content (usually a protobuf-encoded Heka message). This structure is shown in the following diagram:

digraph header {
	node [shape=record, fontsize=10];

	struct1 [label="Record Separator\n(byte=0x1E) \
	|Header Length\n(byte) \
	|Header\n(protocol buffer) \
	|Unit Separator\n(byte=0x1F) \
	|Message"];
}
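
The layout in the diagram can be sketched as a small framing function. To stay self-contained, the sketch hand-encodes a header holding only the message length instead of using a protobuf library. It assumes the length is protobuf field number 1, which this document does not state, and the names `encode_varint` and `frame` are hypothetical.

```python
RECORD_SEPARATOR = 0x1E
UNIT_SEPARATOR = 0x1F


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if n < 0:
        raise ValueError("varint sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)   # more bytes follow
        else:
            out.append(low7)          # last byte, continuation bit clear
            return bytes(out)


def frame(message: bytes) -> bytes:
    """Wrap already-serialized message bytes in the framing shown above."""
    # ASSUMPTION: the length is field number 1 with varint wire type,
    # giving the tag byte (1 << 3) | 0 == 0x08.
    header = b"\x08" + encode_varint(len(message))
    if len(header) > 255:
        raise ValueError("header does not fit in a one-byte length")
    return (
        bytes([RECORD_SEPARATOR, len(header)])
        + header
        + bytes([UNIT_SEPARATOR])
        + message
    )
```

With these assumptions, `frame(b"hello")` produces `b"\x1e\x02\x08\x05\x1fhello"`: separator, header length 2, the two header bytes, the unit separator, and then the message itself. A real client would normally build the header with a protobuf library.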

The header schema is as follows:

  • message_length (required, uint32) - length in bytes of the serialized message data
  • hmac_hash_function (optional, int32) - enum indicating the hash function used to sign the message: 0 for MD5, 1 for SHA1
  • hmac_signer (optional, string) - string token identifying HMAC signer
  • hmac_key_version (optional, uint32) - version number of the provided HMAC key
  • hmac (optional, []byte) - binary representation of provided HMAC key
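
The signing-related header variables might be used along the following lines. This sketch assumes that the signature is an HMAC digest computed over the serialized message bytes with a shared secret key; the document does not spell out the exact input to the HMAC, and the names `sign` and `verify` are hypothetical.

```python
import hashlib
import hmac

# Enum values taken from the hmac_hash_function description above.
HASH_FUNCTIONS = {0: hashlib.md5, 1: hashlib.sha1}


def sign(message: bytes, key: bytes, hmac_hash_function: int = 0) -> bytes:
    """Return the raw HMAC digest of the message bytes.

    ASSUMPTION: the digest covers the serialized message bytes only.
    """
    digestmod = HASH_FUNCTIONS[hmac_hash_function]
    return hmac.new(key, message, digestmod).digest()


def verify(message: bytes, key: bytes, digest: bytes,
           hmac_hash_function: int = 0) -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = sign(message, key, hmac_hash_function)
    return hmac.compare_digest(expected, digest)
```

Using `hmac.compare_digest` rather than `==` avoids leaking timing information when checking a signature.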

Clients decoding a Heka stream need to read the header length byte to determine the length of the header, extract the encoded header data, and decode it into a Header structure using an appropriate protobuf library. The decoded header gives the length of the encoded message data, which can then be extracted from the data stream and processed or decoded as needed.
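
The decoding steps just described can be sketched as follows. To stay self-contained, the sketch parses the header by hand rather than with a protobuf library, and it assumes that message_length is protobuf field number 1, which this document does not state. The function names are hypothetical.

```python
RECORD_SEPARATOR = 0x1E
UNIT_SEPARATOR = 0x1F


def decode_varint(buf: bytes, pos: int):
    """Decode a protobuf varint at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_message_length(header: bytes) -> int:
    """Scan the encoded header for message_length (ASSUMED field number 1)."""
    pos = 0
    while pos < len(header):
        tag, pos = decode_varint(header, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:                      # varint field
            value, pos = decode_varint(header, pos)
            if field_number == 1:
                return value
        elif wire_type == 2:                    # length-delimited: skip it
            length, pos = decode_varint(header, pos)
            pos += length
        else:
            raise ValueError(f"unexpected wire type {wire_type}")
    raise ValueError("header has no message_length")


def split_frame(stream: bytes, offset: int = 0):
    """Return (message bytes, offset of the next frame)."""
    if stream[offset] != RECORD_SEPARATOR:
        raise ValueError("missing record separator")
    header_length = stream[offset + 1]
    header_start = offset + 2
    header_end = header_start + header_length
    if stream[header_end] != UNIT_SEPARATOR:
        raise ValueError("missing unit separator")
    message_length = read_message_length(stream[header_start:header_end])
    message_start = header_end + 1
    message_end = message_start + message_length
    if message_end > len(stream):
        raise ValueError("truncated message")
    return stream[message_start:message_end], message_end


# Two hand-built frames back to back, each with a two-byte header.
stream = b"\x1e\x02\x08\x05\x1fhello" + b"\x1e\x02\x08\x02\x1fhi"
first, next_offset = split_frame(stream)
second, _ = split_frame(stream, next_offset)
```

Here `first` is `b"hello"` and `second` is `b"hi"`. The returned offset lets a caller walk through a buffer frame by frame.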