Fwd: [lttng-dev] [CTF2-SPEC-2.0] Announcing CTF 2, the next generation of the Common Trace Format

From: Mathieu Desnoyers
Date: Tue Mar 26 2024 - 15:09:53 EST


Hi,

After 8 years of proposals and RC iterations, here is the final version
of the Common Trace Format specification version 2.

Feedback is welcome!

Mathieu

-------- Forwarded Message --------
Subject: [lttng-dev] [CTF2-SPEC-2.0] Announcing CTF 2, the next generation of the Common Trace Format
Date: Tue, 26 Mar 2024 19:00:37 +0000
From: Philippe Proulx via lttng-dev <lttng-dev@xxxxxxxxxxxxxxx>
Reply-To: Philippe Proulx <pproulx@xxxxxxxxxxxx>
To: diamon-discuss@xxxxxxxxxxxxxxxxxxxxxxxxx <diamon-discuss@xxxxxxxxxxxxxxxxxxxxxxxxx>
CC: lttng-dev@xxxxxxxxxxxxxxx <lttng-dev@xxxxxxxxxxxxxxx>, tracecompass-dev@xxxxxxxxxxx <tracecompass-dev@xxxxxxxxxxx>

To all tracing enthusiasts,

The Diagnostic and Monitoring Workgroup (DiaMon) is thrilled to announce
the launch of version 2 of the Common Trace Format (CTF), a binary trace
format designed to be fast to write without compromising great
flexibility.

The first version of CTF [1] has been widely used and tested in the
industry for almost fifteen years now, by different producers and
consumers.

CTF 2 is a major revision of CTF 1, bringing many improvements, such as:

‣ Using JSON text sequences [2] for the metadata stream.

‣ Simplifying the metadata stream.

‣ Adding new field classes (bit array, bit map, boolean, LEB128, BLOB,
and optional) and improving existing ones.

‣ Supporting UTF-16 and UTF-32 string fields.

‣ Using roles instead of reserved structure member names to identify
meaningful fields.

‣ Adding the attribute and extension features to extend and customize
the format.

SPECIFICATION LINKS
━━━━━━━━━━━━━━━━━━━
The initial revision of CTF2-SPEC-2.0 is available here:

<https://diamon.org/ctf/CTF2-SPEC-2.0.html>

Its AsciiDoc source is <https://diamon.org/ctf/CTF2-SPEC-2.0.adoc>.

The latest revision of CTF 2 is always available at
<https://diamon.org/ctf/>.

The latest revision of CTF 1 is available at
<https://diamon.org/ctf/v1.8.3/>.

FROM LEGACY TO FUTURE
━━━━━━━━━━━━━━━━━━━━━
The initial version of CTF [1] has been in widespread use and undergone
rigorous testing across various sectors by a range of users for close to
fifteen years.

Today, CTF 2 comes into existence to address significant limitations of
CTF 1 that make it challenging to implement a consumer and nearly
impossible to extend.

Developed over the course of eight years, CTF 2 is the culmination of
two initial proposals and nine release candidates, each one adding what
we now consider precious features that have incrementally shaped it into
the robust and versatile tracing format it is today.

We're confident that CTF 2 meets our important original design goals,
namely:

‣ CTF 2 data streams must be backward compatible with CTF 1 [1] data
streams.

CTF 2 only specifies metadata stream changes.

‣ The CTF 2 data streams must be highly efficient for a tracer to
produce.

The addition of features such as the configurable bit order of a
fixed-length bit array field, variable-length integer fields, UTF-16
and UTF-32 string field encodings, and BLOB fields makes this truer
than ever.

‣ A CTF 2 metadata stream must be extensible by users of the
specification.

The namespaced attribute and extension mechanisms of CTF 2 metadata
streams enable limitless extensibility.

‣ A CTF 2 trace should be easy to consume.

A CTF 2 metadata stream is a JSON text sequence [2], removing all the
complexity of parsing the custom DSL brought by CTF 1.

FEATURE HIGHLIGHTS
━━━━━━━━━━━━━━━━━━
CTF 2 brings many improvements over CTF 1, the most notable ones being:

JSON text sequence metadata stream:
Whereas a CTF 1 [1] metadata stream is written in TSDL, a somewhat
complex DSL inspired by the C language, a CTF 2 metadata stream is a
JSON text sequence [2].

A JSON text sequence is simply a sequence of JSON values, each one
beginning with the "record separator (RS)" (U+001E) codepoint and
ending with a "new line (NL)" (U+000A) codepoint.

Using a JSON text sequence instead of a JSON array, for example,
makes it easier to stream the objects of a CTF 2 metadata stream and
allows the tracer to add metadata information while tracing occurs.
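
To illustrate the framing described above, here is a minimal Python
sketch of a reader which splits a JSON text sequence into its JSON
values. The function name and the two-fragment sample stream are
made up for this example, and the sketch assumes well-formed input: it
doesn't handle the truncated elements which RFC 7464 [2] asks a robust
parser to tolerate.

```python
import json

RS = "\x1e"  # "record separator (RS)" codepoint, U+001E


def parse_json_text_sequence(text):
    """Return the JSON values of a JSON text sequence, in order.

    Hypothetical helper: assumes every element is complete and valid.
    """
    values = []
    # Each element starts with RS, so the chunk before the first RS is empty.
    for chunk in text.split(RS):
        if chunk.strip():
            # json.loads() accepts the trailing U+000A of each element.
            values.append(json.loads(chunk))
    return values


# Made-up metadata stream containing two JSON objects.
stream = '\x1e{"type": "preamble", "version": 2}\n\x1e{"type": "trace-class"}\n'
fragments = parse_json_text_sequence(stream)
```

Because each value is delimited on its own, a consumer can process the
values one at a time as the tracer appends them, without waiting for a
closing bracket as it would with a JSON array.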

Attributes:
Any trace producer may use attributes to add information to specific
metadata stream objects, for example:

{
  "type": "null-terminated-string",
  "encoding": "utf-16le",
  "attributes": {
    "meow-tracer": {
      "confidentiality-level": 4
    }
  }
}

A tracer may add namespaced attributes to trace class, data stream
class, clock class, event record class, and field class objects.

A general trace consumer may safely ignore attributes: you don't
need them to decode a data stream.

Extensions:
The purpose of an extension is to add core features to CTF 2 or to
modify existing core features. In other words, an extension may
alter the format itself. For example:

{
  "type": "preamble",
  "version": 2,
  "extensions": {
    "meow-tracer": {
      "variable-clock-frequency": true
    }
  }
}

While a general trace consumer may safely ignore attributes, it must
not ignore extensions.
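
As a rough illustration of that rule, the following Python sketch
lists the extensions declared by a preamble which a consumer doesn't
implement. The function name and the shape of the `supported` mapping
(namespace to set of extension names) are assumptions of this example,
not something the specification defines. A consumer would typically
refuse to decode the data streams when the returned list isn't empty.

```python
def unsupported_extensions(preamble, supported):
    """Return the (namespace, name) pairs declared by `preamble` which
    are missing from `supported`.

    `supported` is a hypothetical mapping of namespace to the set of
    extension names this consumer implements.
    """
    missing = []
    for namespace, extensions in preamble.get("extensions", {}).items():
        known = supported.get(namespace, set())
        for name in extensions:
            if name not in known:
                missing.append((namespace, name))
    return missing


# Preamble from the example above, as a Python dictionary.
preamble = {
    "type": "preamble",
    "version": 2,
    "extensions": {"meow-tracer": {"variable-clock-frequency": True}},
}

print(unsupported_extensions(preamble, {}))
print(unsupported_extensions(preamble, {"meow-tracer": {"variable-clock-frequency"}}))
```

No equivalent check is needed for attributes: a consumer ignores them
simply by never reading the `attributes` property.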

Field roles:
The name of a structure field member is now decoupled from any
special meaning of said field.

For example, you may name a packet header integer field `myosotis`
and make it contain the current data stream ID:

{
  "name": "myosotis",
  "field-class": {
    "type": "fixed-length-unsigned-integer",
    "byte-order": "little-endian",
    "length": 16,
    "roles": ["data-stream-id"]
  }
}
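
To make the decoupling concrete, here is a small Python sketch in
which a consumer finds the data stream ID field by its role instead of
by its name. The helper name and the `padding` member are hypothetical,
and the sketch only looks at the direct members of one structure field
class.

```python
def find_member_by_role(members, role):
    """Return the name of the first member whose field class has `role`,
    or None if there's no such member.

    Hypothetical helper: doesn't descend into nested structures.
    """
    for member in members:
        if role in member["field-class"].get("roles", []):
            return member["name"]
    return None


# Members of a made-up packet header structure field class.
packet_header_members = [
    {
        "name": "padding",
        "field-class": {
            "type": "fixed-length-unsigned-integer",
            "byte-order": "little-endian",
            "length": 16,
        },
    },
    {
        "name": "myosotis",
        "field-class": {
            "type": "fixed-length-unsigned-integer",
            "byte-order": "little-endian",
            "length": 16,
            "roles": ["data-stream-id"],
        },
    },
]

print(find_member_by_role(packet_header_members, "data-stream-id"))
```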

New field classes:
CTF 2 brings new field classes to fill some important gaps of CTF 1:

Fixed-length bit array field:
A compact sequence of bits without integral semantics.

Any fixed-length scalar field conceptually is a fixed-length bit
array field.

Fixed-length bit map field:
A fixed-length bit array field with flags associating bit index
ranges to names.

Fixed-length boolean field:
A fixed-length bit array field of which the decoded value is
either true or false.

Variable-length integer field:
An integral value encoded with the LEB128 [3] format.

BLOB field:
A compact, byte-aligned sequence of bytes with an associated
media type:

{
  "type": "static-length-blob",
  "length": 511267,
  "media-type": "image/tiff"
}

The length (number of bytes) of a BLOB field may be static or
dynamic (provided by an anterior integer field).

Optional field:
A field which is either another field or nothing (zero bits):

{
  "type": "optional",
  "selector-field-location": {
    "path": ["config", "has-debug-info"]
  },
  "selector-field-ranges": [[1, 255]],
  "field-class": {
    "type": "null-terminated-string"
  }
}

This is similar to what you could achieve with a variant field
having an empty option in CTF 1.
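
For the variable-length integer field class listed above, the
following Python sketch shows how LEB128 [3] values can be decoded. It
follows the generic LEB128 algorithm (seven value bits per byte, the
high bit marking continuation); the function names are made up, and the
sketch doesn't validate truncated input or enforce any maximum length.

```python
def decode_uleb128(data, offset=0):
    """Decode an unsigned LEB128 integer from `data` starting at `offset`.

    Return a (value, next_offset) tuple.
    """
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift  # low 7 bits carry the value
        shift += 7
        if (byte & 0x80) == 0:  # high bit clear: last byte
            return value, offset


def decode_sleb128(data, offset=0):
    """Decode a signed LEB128 integer from `data` starting at `offset`.

    Return a (value, next_offset) tuple.
    """
    value = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            if byte & 0x40:  # sign bit of the last byte is set
                value -= 1 << shift
            return value, offset


print(decode_uleb128(bytes([0xE5, 0x8E, 0x26])))  # (624485, 3)
print(decode_sleb128(bytes([0xC0, 0xBB, 0x78])))  # (-123456, 3)
```

Small values take a single byte, which is what makes this encoding
attractive for integer fields which are usually small but occasionally
large.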

All UTF string field encodings:
A null-terminated, static-length, or dynamic-length string field may
have the UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, or UTF-32LE encoding:

{
  "type": "static-length-string",
  "length": 32,
  "encoding": "utf-32be"
}
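
As an illustration, here is a Python sketch which decodes the bytes of
a static-length string field for each of the five encodings. It rests
on a few assumptions which you should check against the specification:
the encoding names are the lowercase forms of the ones listed above,
the length is a number of bytes, and the string ends at the first null
code unit (or at the end of the field if there's none). The table and
function names are hypothetical.

```python
# Assumed CTF 2 encoding name -> (Python codec name, code unit size in bytes).
CODECS = {
    "utf-8": ("utf-8", 1),
    "utf-16be": ("utf-16-be", 2),
    "utf-16le": ("utf-16-le", 2),
    "utf-32be": ("utf-32-be", 4),
    "utf-32le": ("utf-32-le", 4),
}


def decode_static_length_string(raw, encoding="utf-8"):
    """Decode the bytes `raw` of a static-length string field.

    Stops at the first null code unit, scanning on code unit boundaries
    so that a zero byte inside a wider code unit isn't mistaken for a
    terminator.
    """
    codec, unit = CODECS[encoding]
    end = len(raw) - len(raw) % unit  # ignore a trailing partial code unit
    for i in range(0, end, unit):
        if raw[i:i + unit] == b"\x00" * unit:
            end = i
            break
    return raw[:end].decode(codec)


# A 32-byte field holding "meow" in UTF-32BE, padded with null bytes.
field_bytes = "meow".encode("utf-32-be").ljust(32, b"\x00")
print(decode_static_length_string(field_bytes, "utf-32be"))
```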

JOIN THE MOVEMENT
━━━━━━━━━━━━━━━━━
We invite you to explore the CTF 2 specification and see firsthand the
impact it can have on your tracing projects.

Even though the CTF 2 specification is now published, your feedback
remains incredibly important to us so that we can be confident that the
document is flawless and as easy to understand as possible.

As far as DiaMon projects are concerned, what we know today about CTF 2
adoption is:

Babeltrace 2 [4]:
Planned in the next minor release, Babeltrace 2.1.

LTTng [5]:
Planned in LTTng 2.15.

Trace Compass [6]:
Partial support currently; awaiting more CTF 2 trace samples to
complete the development.

If you're considering adopting CTF 2 for your own project, please share
the news with us!

REFERENCES
━━━━━━━━━━
[1]: https://diamon.org/ctf/v1.8.3/
[2]: https://datatracker.ietf.org/doc/html/rfc7464
[3]: https://en.wikipedia.org/wiki/LEB128
[4]: https://babeltrace.org/
[5]: https://lttng.org/
[6]: https://eclipse.dev/tracecompass/