A Guide to W3C Trace Context

Earlier this year the W3C Trace Context Recommendation was finally published. A standard way of passing distributed trace correlation has been needed for a long time, so it is good to see one finally in place, and many vendors have already moved to adopt it.

The Recommendation defines what a distributed trace is:

A distributed trace is a set of events, triggered as a result of a single logical operation, consolidated across various components of an application. A distributed trace contains events that cross process, network and security boundaries. A distributed trace may be initiated when someone presses a button to start an action on a website – in this example, the trace will represent calls made between the downstream services that handled the chain of requests initiated by this button being pressed.

What constitutes a single logical operation depends on the system. In the example above it is a single button press on a website, whereas in a batch processing system it might be the processing of each item, and in a complex UI it might consist of both a button press and a subsequent confirmation dialog.

The W3C Trace Context Recommendation describes how the correlation information (an identifier for the operation, and the parent-child relationships between components) is passed in service calls. It doesn’t cover what to do with that information beyond how to pass it to the next component.
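
To make the passing of correlation information concrete, here is a minimal Python sketch of the Recommendation's `traceparent` header, which carries a version, a trace-id (the operation identifier), a parent-id (the calling component's span), and trace flags. The function names `parse_traceparent` and `child_traceparent` are hypothetical, chosen for illustration only. The sketch handles only the version 00 layout and ignores the companion `tracestate` header, so treat it as an outline rather than a complete implementation.

```python
import re
import secrets

# Version 00 layout: version "-" trace-id "-" parent-id "-" trace-flags,
# all lowercase hex (2, 32, 16 and 2 characters respectively).
TRACEPARENT_RE = re.compile(
    r"(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<parent_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})"
)


def parse_traceparent(header):
    """Return the header fields as a dict, or None if the header is invalid."""
    match = TRACEPARENT_RE.fullmatch(header.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Version ff is forbidden; all-zero trace-id or parent-id is invalid.
    if fields["version"] == "ff":
        return None
    if fields["trace_id"] == "0" * 32 or fields["parent_id"] == "0" * 16:
        return None
    return fields


def child_traceparent(incoming):
    """Build the header for an outgoing call.

    The trace-id is kept so the whole operation stays correlated; the
    parent-id is replaced with a new id for the current component.
    """
    fields = parse_traceparent(incoming) if incoming else None
    if fields is None:
        # No valid incoming context: start a new trace.
        trace_id = secrets.token_hex(16)
        flags = "00"
    else:
        trace_id = fields["trace_id"]
        flags = fields["flags"]
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-{flags}"
```

Given the example header from the Recommendation, `00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01`, `child_traceparent` returns a header with the same trace-id and a freshly generated parent-id, which is how the parent-child relationship is recorded at each hop.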

This guide is mostly about how to use Trace Context for logging, although it also applies to metrics and other telemetry.
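
As a rough illustration of what using Trace Context for logging can look like, the sketch below uses Python's standard `logging` module to stamp every log line with the trace and span identifiers. The class `TraceContextFilter`, the helper `make_logger`, and the field names `trace_id` and `span_id` are assumptions made for this example; the Recommendation itself does not prescribe a log format.

```python
import io
import logging


class TraceContextFilter(logging.Filter):
    """Attach fixed trace and span ids to every record passing through."""

    def __init__(self, trace_id, span_id):
        super().__init__()
        self.trace_id = trace_id
        self.span_id = span_id

    def filter(self, record):
        # Add the ids as record attributes so the formatter can use them.
        record.trace_id = self.trace_id
        record.span_id = self.span_id
        return True


def make_logger(name, stream, trace_id, span_id):
    """Create a logger that writes trace-correlated lines to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(levelname)s trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
    ))
    handler.addFilter(TraceContextFilter(trace_id, span_id))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = make_logger(
    "demo.trace", buffer,
    "4bf92f3577b34da6a3ce929d0e0e4736", "00f067aa0ba902b7",
)
log.info("order received")
```

In a real service the identifiers would come from the incoming request and change per operation, for example by storing them in a context variable that the filter reads, rather than being fixed when the logger is created.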
