Logs, Traces, Metrics, Oh My

Introduction

Hello and welcome to our OpenTelemetry workshop. Today you will be setting up an OTel Collector, a small proxy service for moving telemetry data, sometimes called signals, from place to place. This workshop is mostly self-guided, outside of the short tutorial in this README and the short slideshow found here. We want to encourage you to experiment, break things, and ask questions! Don't be shy about talking to your neighbors as well. Learning is better when it's done together :)

What Is Telemetry

Telemetry is derived from the Greek roots tele, meaning "far off", and metron, meaning "measure".

Telemetry is information collected about a remote system. Some examples of telemetry data:

A reading from a mass spectrometer
The speed of a car being measured by its speedometer
That text you sent your boss to tell them you are running late for the 3rd time this week
- But only in the most abstract sense :p

In our case, it is information about running software: memory usage, network packets sent, details of HTTP requests, and so on.

Why Is Telemetry

Telemetry allows you to peer into remote systems and retrieve data about that system. There are tons of great uses for telemetry data, including:

Optimizing a web server to reduce request latency
Reading logs from a piece of software to see why its behavior doesn't match your expectations
Tracing an error in a distributed system to find the specific service that failed and why

When Is Telemetry

Ideally, any time your software is running, it should be recording information about itself.

Who Is Telemetry

Whatever is being measured. For our use case, this is software.

Where Is Telemetry

This is the focus of this workshop!

What We Are Not Covering

How Docker and similar container runtimes work
How to scale these systems in production
- We are happy to chat about that though, it's just out of scope for this tutorial
The specifics of the OTLP spec
The internals of the storage backends

Requirements

A container runtime
- Docker Desktop
- Podman Desktop
  - If you use Podman you will need to set the CONTAINER_RUNTIME environment variable to podman
Make
- Generally preinstalled on Unix-like systems
- Make for Windows
  - I don't have a Windows machine so YMMV
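A quick way to confirm the requirements above before starting is sketched here. It assumes the Makefile falls back to docker when CONTAINER_RUNTIME is unset, which this README implies but does not state; check_tool is a hypothetical helper name, not part of the workshop repository.

```shell
# Runtime the Makefile is ASSUMED to use: CONTAINER_RUNTIME if set, else docker.
runtime="${CONTAINER_RUNTIME:-docker}"

# Print whether a command is available on PATH; return non-zero if it is not.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Check both workshop requirements without stopping at the first failure.
for tool in "$runtime" make; do
  check_tool "$tool" || echo "install $tool before continuing"
done
```

Podman users would run export CONTAINER_RUNTIME=podman first so that the check looks for podman instead of docker.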

Tutorial

Head on over here to start setting up the OTel Collector!

Overview

This is a simple service-to-service flow that demonstrates the capabilities of OpenTelemetry. We have an Auth service to create, validate, and get users and a Profile service to create, update, and get profile data. Both send their telemetry to our OpenTelemetry Collector, which takes all our signals and forwards them to their respective storage systems. Speaking of telemetry storage systems, we are using the LGTM stack:

Loki for logs
Grafana for our UI
Tempo for traces
Mimir for metrics (we are using Prometheus as an equivalent)

That sure looks good to me...

Diagram

The diagram in system-diagram.png shows how data flows between these services.

Auth Service

An authentication service with the following endpoints:

POST /signup

curl -X POST http://localhost:8000/signup \
  -H "Content-Type: application/json" \
  -d '{"email": "user@example.com", "password": "Password123"}'

POST /signin

curl -X POST http://localhost:8000/signin \
  -H "Content-Type: application/json" \
  -d '{"email": "user@example.com", "password": "Password123"}'

GET /user

Requires the token returned by signin or signup.

curl http://localhost:8000/user \
  -H "Authorization: bearer $TOKEN"
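The commands above and below expect a $TOKEN variable. One possible way to fill it is sketched here. It ASSUMES the signin response is JSON containing a "token" field, which this README does not confirm; extract_token and sign_in are hypothetical helper names, so inspect the real response and adjust the field name if it differs.

```shell
# Pull the value of a "token" field out of a JSON response read from stdin.
# ASSUMPTION: the Auth service replies with JSON like {"token": "..."}.
extract_token() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sign in and print only the token. Arguments: email, password.
sign_in() {
  curl -s -X POST http://localhost:8000/signin \
    -H "Content-Type: application/json" \
    -d "{\"email\": \"$1\", \"password\": \"$2\"}" | extract_token
}
```

With the services running, TOKEN=$(sign_in user@example.com Password123) would store the token for later requests. The sed pattern is deliberately simple; a JSON-aware tool such as jq is sturdier if you have it installed.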

Profile Service

A user profile service that authenticates against the Auth service. It has the following endpoints:

GET /

curl http://localhost:8080/ \
  -H "Authorization: bearer $TOKEN"

PUT /

NOTE: this will create a profile if one does not exist.

curl -X PUT http://localhost:8080/ \
  -H "Authorization: bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"username": "johndoe", "bio": "Hello world", "location": "New York, NY, 10001, 123 Main St"}'

OTel Collector

Proxy that receives, processes, and exports data to Loki, Tempo, and Prometheus backends.

Runs on the following ports:

Transport	Use	Port
gRPC	OTLP	`4317`
HTTP	OTLP	`4318`
HTTP	Health Check	`13133`
HTTP	Debug	`55679`
HTTP	Metrics	`8888`

Tempo

Data store for traces

Runs on http://localhost:3200/

Receives OTLP data over gRPC on port 4317. This port isn't exposed on the host because it would conflict with the Collector's own port 4317.

Redpanda

Tempo uses Redpanda, a Kafka-compatible queue, for handling workloads.

Loki

Data store for logs

Runs on http://localhost:3100/

Receives OTLP data over HTTP

Prometheus

Data store for metrics

Runs on http://localhost:9090
Scrapes data from the Collector on port 9090 (This port isn't exposed on the collector container because it would conflict with the Prometheus container)
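To see whether Prometheus is actually scraping the Collector, you can ask it directly through its standard HTTP query API. This is a minimal sketch; prom_query is a hypothetical helper name, and the host and port are the ones listed above.

```shell
# Run an instant PromQL query against the workshop's Prometheus container.
# --get sends the data as a query string; --data-urlencode escapes special characters.
prom_query() {
  curl -s --get --data-urlencode "query=$1" http://localhost:9090/api/v1/query
}
```

For example, prom_query up returns one series per scrape target, with a value of 1 when the last scrape succeeded and 0 when it failed.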

Grafana

Grafana is our visualization tool. It pulls in data from our sources to let us create graphs and dashboards.

The Grafana container exposes port 3000. Just load up http://localhost:3000.

Starting the Services

First we want to build our local images by running: make

Then to bring up the services defined here you can run make up. This will start the services in the current terminal. If you want to start them detached you can use make ARGS="-d" up.
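When the services are started detached, it can be handy to wait until the backends respond before moving on. The sketch below polls each backend on the ports listed in this README. The paths are each project's usual readiness or health endpoints and are an assumption about this particular setup; backend_urls and wait_for_http are hypothetical helper names.

```shell
# Readiness URLs for the backends, using the ports listed in this README.
# ASSUMPTION: each container serves its project's standard readiness path.
backend_urls() {
  cat <<'EOF'
http://localhost:3100/ready
http://localhost:3200/ready
http://localhost:9090/-/ready
http://localhost:3000/api/health
EOF
}

# Poll a URL until it stops returning an error. Arguments: URL, max attempts (default 30).
wait_for_http() {
  url=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    # -f makes curl exit non-zero on HTTP error responses such as 503.
    if curl -sf -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  echo "timed out: $url"
  return 1
}
```

After make ARGS="-d" up, running backend_urls | while read -r u; do wait_for_http "$u"; done checks Loki, Tempo, Prometheus, and Grafana in turn.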

Gen Traffic

Once all the services are up and running you can run our gentraffic script. This is a pretty simple script that just generates dummy data for services so we can actually visualize something. It will create a bunch of fake users, then create profiles for them as well. Run the following:

make trafficgen

This will spin up a Docker container and create 100 fake users. If you want to change the number of users you can use the USER_COUNT environment variable:

make trafficgen USER_COUNT=1000 would generate 1000 fake users.

Generating Signals

Once you have the collector running you can run telemetrygen, a helpful CLI tool from OpenTelemetry to generate dummy signals to test our collector setup. You can run this inside of Docker using make telemetrygen. This will default to sending 3 traces to the collector. If you want to test other signals in other quantities you can use these make variables:

SIGNAL
- The signal to send. Needs to be one of traces, logs, or metrics
COUNT
- The number of signals to send

For example, we could use the following to send 5 logs: make SIGNAL=logs COUNT=5 telemetrygen.
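To exercise all three pipelines in one go, the same make target can be called once per signal type. This is a small sketch built only from the SIGNAL and COUNT variables described above; send_all_signals is a hypothetical helper name.

```shell
# Send the same number of traces, logs, and metrics through the collector.
# Argument: how many of each signal to send (default 3).
send_all_signals() {
  count=${1:-3}
  for signal in traces logs metrics; do
    # Stop at the first failing make call so errors are not buried.
    make SIGNAL="$signal" COUNT="$count" telemetrygen || return 1
  done
}
```

Running send_all_signals 5 from the repository root would send 5 traces, then 5 logs, then 5 metrics.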

Links

Here are links to all the technologies we used:
