Hello and welcome to our OpenTelemetry workshop. Today you will be setting up an OTel Collector, a small proxy service for moving telemetry data, sometimes called signals, from place to place. This workshop is mostly self-guided, outside of the short tutorial in this README and the short slideshow found here. We want to encourage you to experiment, break things, and ask questions! Don't be shy about talking to your neighbors as well. Learning is better when it's done together :)
Telemetry is derived from the Greek roots tele, meaning "far off" and metron, "measure".
Telemetry is any information collected about a remote system. Some examples of telemetry data:
- A reading from a mass spectrometer
- The speed of a car being measured by its speedometer
- That text you sent your boss to tell them you are running late for the 3rd time this week
  - But only in the most abstract sense :p
In our case, it is our information about running software. This can be things like memory usage, network packets sent, information about HTTP requests, etc.
Telemetry allows you to peer into remote systems and retrieve data about that system. There are tons of great uses for telemetry data, including:
- Optimizing a web-server to reduce request latency
- Reading logs from a piece of software to see why its behavior doesn't match your expectation
- Tracing an error in a distributed system to find the specific service that failed and why
Ideally any time your software is running it should be keeping information about itself.
Whatever is being measured. For our use case, this is software.
This is the focus of this workshop!
- How Docker and similar container runtimes work
- How to scale these systems in production
- We are happy to chat about that though, it's just out of scope for this tutorial
- The specifics of the OTLP spec
- The internals of the storage backends
- A container runtime
  - Docker Desktop
  - Podman Desktop
    - If you use Podman you will need to set the `CONTAINER_RUNTIME` environment variable to `podman`
- Make
  - Generally preinstalled on UNIX-like systems
  - Make for Windows
    - I don't have a Windows machine so YMMV
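
For the Podman note above, here is a minimal sketch of selecting the runtime for a shell session. `selected_runtime` and `require_runtime` are hypothetical helpers, not part of the workshop repo, and the fallback to Docker when the variable is unset is an assumption about the Makefile.

```shell
# Hypothetical helper: report which container runtime is selected.
# Assumption: the Makefile falls back to Docker when CONTAINER_RUNTIME is unset.
selected_runtime() {
  printf '%s\n' "${CONTAINER_RUNTIME:-docker}"
}

# Hypothetical helper: fail early if the selected runtime is not on PATH.
require_runtime() {
  command -v "$(selected_runtime)" >/dev/null 2>&1 || {
    echo "$(selected_runtime) not found on PATH" >&2
    return 1
  }
}
```

Podman users would run `export CONTAINER_RUNTIME=podman` once per session, then something like `require_runtime && make`.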
Head on over here to start setting up the OTel Collector!
This is a simple service to service flow to demonstrate the capabilities of OpenTelemetry. We have an Auth service to create, validate and get users and a Profile service to create, update and get profile data. This flows into our OpenTelemetry Collector, which takes all our signals and forwards them to their respective storage systems. Speaking of telemetry storage systems, we are using the LGTM stack:
- Loki for logs
- Grafana for our UI
- Tempo for traces
- Mimir (we are using an equivalent called Prometheus) for metrics
That sure looks good to me...
Below we have a diagram explaining how data flows between these services
An authentication service with the following endpoints:
POST /signUp

curl -X POST http://localhost:8000/signup \
  -H "Content-Type: application/json" \
  -d '{"email": "user@example.com", "password": "Password123"}'

POST /signIn

curl -X POST http://localhost:8000/signin \
  -H "Content-Type: application/json" \
  -d '{"email": "user@example.com", "password": "Password123"}'

GET /user
requires token from signin/signup
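
The `/user` request and the Profile service requests read the token from `$TOKEN`. Here is a minimal sketch of populating it, assuming the sign-in response is a JSON object with a string field named `token`. That field name is a guess, and `extract_token` and `sign_in` are hypothetical helpers, so check the actual response body first.

```shell
# Hypothetical helper: pull a string field called "token" out of a JSON body on stdin.
# Assumption: the Auth service returns something like {"token": "..."}.
extract_token() {
  sed -n 's/.*"token"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Hypothetical helper: sign in with an email and password, then print the token.
# Note: the arguments are pasted into the JSON body without escaping.
sign_in() {
  curl -s -X POST http://localhost:8000/signin \
    -H "Content-Type: application/json" \
    -d "{\"email\": \"$1\", \"password\": \"$2\"}" | extract_token
}
```

With the services running, `TOKEN=$(sign_in user@example.com Password123)` would then set the variable for the requests that follow.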
curl http://localhost:8000/user \
  -H "Authorization: bearer $TOKEN"

A user profile service that authenticates against the Auth service. It has the following endpoints:

GET /

curl http://localhost:8080/ \
  -H "Authorization: bearer $TOKEN"

PUT /

NOTE: this will create a profile if one does not exist.

curl -X PUT http://localhost:8080/ \
  -H "Authorization: bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"username": "johndoe", "bio": "Hello world", "location": "New York, NY, 10001, 123 Main St"}'

Proxy that receives, processes, and exports data to Loki, Tempo, and Prometheus backends.
Runs on the following ports:
| Transport | Use | Port |
|---|---|---|
| gRPC | OTLP | 4317 |
| HTTP | OTLP | 4318 |
| HTTP | Health Check | 13133 |
| HTTP | Debug | 55679 |
| HTTP | Metrics | 8888 |
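
To poke at these ports from your host, a small sketch follows. It assumes the ports are published on `localhost`, that the health check is served by the Collector's `health_check` extension on its default port 13133, and that the Debug port is the `zpages` extension. `collector_url` is a hypothetical helper.

```shell
# Hypothetical helper: map a use from the ports table to a localhost URL.
# Assumptions: ports are published on localhost; the health check uses the
# health_check extension default (13133); Debug is the zpages extension.
collector_url() {
  case "$1" in
    otlp-http) echo "http://localhost:4318" ;;
    health)    echo "http://localhost:13133/" ;;
    debug)     echo "http://localhost:55679/debug/servicez" ;;
    metrics)   echo "http://localhost:8888/metrics" ;;
    *)         echo "unknown use: $1" >&2; return 1 ;;
  esac
}
```

Once the Collector is up, `curl -fsS "$(collector_url health)"` should succeed, and `curl -s "$(collector_url metrics)"` should print the Collector's own metrics in Prometheus text format.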
Data store for traces
Runs on http://localhost:3200/
- Receives OTLP data over gRPC on port `4317` (this port isn't exposed, as it would conflict with our collector)
Tempo utilizes a Kafka compatible queue for handling workloads.
Data store for logs
Runs on http://localhost:3100/
- Receives OTLP data over HTTP
Data store for metrics
- Runs on http://localhost:9090
- Scrapes data from the Collector on port `9090` (this port isn't exposed on the collector container because it would conflict with the Prometheus container)
Grafana is our visualization tool. It pulls in data from our sources to let us create graphs and dashboards.
The Grafana container exposes port 3000. Just load up http://localhost:3000.
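
Before opening Grafana it can help to confirm that each backend answers. Here is a minimal sketch using the readiness endpoints these projects document (`/ready` for Tempo and Loki, `/-/ready` for Prometheus, `/api/health` for Grafana). `backend_ready_url` and `wait_ready` are hypothetical helpers, and the retry count and delay are arbitrary.

```shell
# Hypothetical helper: readiness URL for each backend, using the ports listed above.
backend_ready_url() {
  case "$1" in
    tempo)      echo "http://localhost:3200/ready" ;;
    loki)       echo "http://localhost:3100/ready" ;;
    prometheus) echo "http://localhost:9090/-/ready" ;;
    grafana)    echo "http://localhost:3000/api/health" ;;
    *)          echo "unknown backend: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: poll a backend until curl gets a non-error response,
# giving up after 30 failed attempts (roughly a minute).
wait_ready() {
  url=$(backend_ready_url "$1") || return 1
  tries=0
  until curl -fsS -o /dev/null "$url"; do
    tries=$((tries + 1))
    [ "$tries" -ge 30 ] && return 1
    sleep 2
  done
}
```

For example: `for b in tempo loki prometheus grafana; do wait_ready "$b" && echo "$b ready"; done`.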
First we want to build our local images by running: make
Then to bring up the services defined here you can run make up. This will start
the services in the current terminal. If you want to start them detached you can
use make ARGS="-d" up.
Once all the services are up and running you can run our gentraffic script. This
is a pretty simple script that just generates dummy data for services so we can
actually visualize something. It will create a bunch of fake users, then create
profiles for them as well. Run the following:
make trafficgen
This will spin up a Docker container and create 100 fake users. If you want to
change the number of users you can use the USER_COUNT environment variable:
make trafficgen USER_COUNT=1000 would generate 1000 fake users.
Once you have the collector running you can run
telemetrygen,
a helpful CLI tool from OpenTelemetry to generate dummy signals to test our
collector setup. You can run this inside of Docker using make telemetrygen.
This will default to sending 3 traces to the collector. If you want to test other
signals in other quantities you can use these make variables:
- `SIGNAL` - The signal to send. Needs to be one of `traces`, `logs`, or `metrics`
- `COUNT` - The number of signals to send
For example, we could use the following to send 5 logs: make SIGNAL=logs COUNT=5 telemetrygen.
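
To exercise all three signal types in one go, the same variables can be looped over. This is a minimal sketch: `telemetrygen_cmd` is a hypothetical helper that only prints the `make` invocation, so it is safe as a dry run. Falling back to a count of 3 for every signal is an assumption modeled on the default trace count.

```shell
# Hypothetical helper: print the make command for one signal and count.
# Assumption: a count of 3 is a sensible fallback for any signal.
telemetrygen_cmd() {
  case "$1" in
    traces|logs|metrics) ;;
    *) echo "SIGNAL must be traces, logs, or metrics" >&2; return 1 ;;
  esac
  printf 'make SIGNAL=%s COUNT=%s telemetrygen\n' "$1" "${2:-3}"
}

# Dry run: print one command per signal.
for signal in traces logs metrics; do
  telemetrygen_cmd "$signal" 5
done
```

Piping the loop's output to `sh` would actually run the commands.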
Here are links to all the technologies we used:
