Arch Forum 2025-04-17

Participants: Backend devs & Victor

Agenda

  • Observability 2.0, part 2
    * Log levels discussion
    * Observability improvements prioritization

Summary

Log levels

The log level hierarchy (INFO, WARN, ERR and TEKM) was discussed. It became clear that we need a stricter definition of each level. Log levels also go hand in hand with alerting.

Observability improvements prioritization

We walked through the issues identified in the previous Arch Forum and then voted on which issues the devs think are worth spending time on (3 votes per dev).

The outcome can be seen below.

Area      Topic                                               Votes
Logs      Raw logs of external req/resp                       4
Logs      Ruid handling for bigger jobs                       3
Metrics   Ops and Warning Slack channels spammy and unclear   3
General   OpenTelemetry for everything                        3
New devs  TEKM is not a standard log level                    3
Query     Kibana limited/unfamiliar query language            2
New devs  Log level is two concepts in one (level and type)   2
Logs      Log sanitizing                                      1
Logs      Serilog vs MS ILogger                               1
Logs      Don't use string interpolation for logs             1
Logs      Increased log retention                             1
Metrics   Custom metrics guidelines                           1
Metrics   Datadog dashboards and monitors are messy           1
Logs      Span/Parentspans are cumbersome to use
Logs      Fields not indexed in Elastic
Query     DW/BigQuery for troubleshooting/analysis
Metrics   Trial Elastic/Kibana for more alerts
General   Service map / application map for system overview
General   Real tracing

Original sheet used: https://docs.google.com/spreadsheets/d/1aFs_Ic1yI2lYD0Wk6hsOExOC0WAqbNeNv50JdKPg0-Y/edit?gid=0#gid=0