Kubernetes Engineering Observability: What to Track in 2026

· 7 min read
Artur Pan
CTO & Co-Founder at PanDev

A platform team running 11 production Kubernetes clusters has 94,000 metrics scraped every 15 seconds, 2.4 TB of logs per day in Loki, and a Grafana instance with 340 dashboards. When their VP of Engineering asked "are our teams shipping reliably on K8s?", nobody could answer in under an hour. They had cluster observability. They had zero engineering observability.

These are two different problems. Cluster observability tells you whether pods are healthy. Engineering observability tells you whether engineering on top of those clusters is healthy — whether deployments are fast, whether rollbacks are rare, whether developers are waiting on infrastructure or fighting with it. Most K8s shops have solved the first and ignored the second. The 2024 CNCF annual survey reported that 68% of enterprise K8s users struggle with "making observability actionable", which is a polite way of saying they have metrics but no decisions come out of them.
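To make the distinction concrete, here is a minimal sketch of two engineering-level signals, rollback rate and median merge-to-deploy lead time, computed from deployment records. The `Deployment` record, its field names, and the sample data are hypothetical and chosen for illustration only; they do not reflect any particular tool's schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    # Hypothetical record shape; a real pipeline would likely build this
    # from CI/CD events rather than from cluster metrics.
    service: str
    merged_at: datetime
    deployed_at: datetime
    rolled_back: bool


def rollback_rate(deploys: list[Deployment]) -> float:
    """Fraction of deployments that were rolled back (0.0 for no data)."""
    if not deploys:
        return 0.0
    return sum(d.rolled_back for d in deploys) / len(deploys)


def median_lead_time_minutes(deploys: list[Deployment]) -> float:
    """Median time from merge to production deploy, in minutes."""
    return median(
        (d.deployed_at - d.merged_at).total_seconds() / 60 for d in deploys
    )


# Illustrative sample: four deploys with lead times of 30, 60, 90 and
# 120 minutes, one of which was rolled back.
base = datetime(2026, 1, 5, 9, 0)
sample = [
    Deployment("checkout", base, base + timedelta(minutes=30), False),
    Deployment("checkout", base, base + timedelta(minutes=60), True),
    Deployment("search", base, base + timedelta(minutes=90), False),
    Deployment("search", base, base + timedelta(minutes=120), False),
]

print(rollback_rate(sample))             # 0.25
print(median_lead_time_minutes(sample))  # 75.0
```

Neither number comes from pod-level metrics. Both need deployment events joined to source-control events, which is one reason cluster dashboards alone struggle to answer the VP's question.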

HR + Engineering: Collaboration Playbook for Growing Teams

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

In 2024, LinkedIn's Workforce Report flagged "HR-Engineering misalignment" as the #2 reason scaling tech teams lose senior engineers, right behind compensation. The usual failure mode: HR designs job ladders from a generic template, Engineering runs calibration as an undocumented side channel, and two months later the best senior engineer leaves because their title never caught up with their responsibilities.

This is not an HR problem, and not an Engineering problem. It's a collaboration problem that surfaces every 6-12 months during promotion and compensation cycles. Here's a playbook for making the partnership actually work — who owns what, when, and which data gets shared.

Junior to Senior: Promotion Criteria Backed by Data

· 9 min read
Artur Pan
CTO & Co-Founder at PanDev

An engineer with 3.5 years of experience at a 120-person scaleup I worked with last year was "obviously senior" by everyone's intuition. Her Git and IDE data told a different story: she was shipping more features than any senior on the team, but she wasn't reviewing PRs from people outside her squad, had never owned a system-design proposal end-to-end, and her commits clustered in a narrow two-component surface area. Her manager's gut said senior. The behavioral evidence said: ready in 6-9 months, not today. The 6-month data revisit confirmed it: she got there, and the promotion landed stronger than an intuition-based one would have.

Promotion decisions fail in two directions. Promote-too-early produces under-supported seniors who quietly under-perform and sometimes leave. Promote-too-late loses your best engineers to competitors who saw the readiness first. A 2023 First Round Review study on engineering careers found the single largest driver of senior-engineer regret was "promoted without being ready," cited by 41% of respondents. Data-backed criteria reduce both errors.

Travel and Hospitality Engineering: Booking Platform Teams

· 10 min read
Artur Pan
CTO & Co-Founder at PanDev

A former Expedia engineer told me the quote that should be pinned above every travel-engineering team's desk: "We don't ship software — we ship promises about the future availability of physical objects." An Amadeus GDS query returns inventory that's simultaneously being consumed by 50+ competing distribution channels. Your code has to reconcile that in under 400ms or the user gives up.

Phocuswright's 2024 travel-technology report pegs the global online-travel industry at $1.06 trillion in gross bookings, with roughly 38% flowing through technology platforms that sit between travelers and suppliers. Amazon Web Services' travel-vertical analysis documents that peak-season traffic on booking engines routinely exceeds 15× the yearly baseline, more extreme than any other e-commerce vertical except Black Friday retail. Engineering teams built on "just scale horizontally" assumptions discover, in their first December, that search-cache misses against an unreachable GDS generate cascading failures 90 seconds deep.

AdTech Engineering: Data-Heavy Teams and Productivity

· 7 min read
Artur Pan
CTO & Co-Founder at PanDev

In our IDE dataset of 100+ B2B companies, engineers on AdTech platforms ship 38% fewer pull requests per month than engineers in SaaS tooling, yet produce more customer revenue per head. Meanwhile, The Trade Desk has disclosed that it processes over 13 million ad requests per second. Scale like that reshapes what "productive" means. A PR count that would look alarming in a consumer app is perfectly normal when a single configuration line ships to a system serving 10 million QPS.

AdTech engineering is different, and measuring it with generic DORA-only dashboards misses the point. This article lays out what data-heavy teams actually spend time on, what the numbers look like across the 14 AdTech companies in our dataset, and which productivity signals matter more than throughput for real-time bidding, attribution, and ad-server work.

Staff Engineer: Career Framework with Real Metrics

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

Will Larson's 2021 survey of 14 staff engineers at large tech companies produced a finding most ladders still ignore: only one in three senior engineers wants the Staff title, and of those, fewer than half make it in five years. The promotion is not a natural continuation of Senior. It's a role change — different work, different signals, different failure modes. Engineering ladders that treat it as "Senior+" produce stalled careers and a pile of ICs who quit for an EM job at another company.

This framework covers what actually predicts readiness. It draws on Larson's research, Tanya Reilly's The Staff Engineer's Path, and the patterns we see in delivery data across 100+ B2B engineering organizations.

Top Expenses Report: Monthly Reviews That End in Decisions

· 9 min read
Artur Pan
CTO & Co-Founder at PanDev

The standing monthly engineering cost review at the 80-person org we worked with in March 2026 ran 90 minutes. Six dashboards. Four department leads each defending their numbers. The output: a Slack message saying "let's dig in next month." Same message in February. Same in January. The dashboards were excellent. The decisions were zero.

The problem is not data scarcity. Asana's 2024 Anatomy of Work report found knowledge workers spend 58% of the day on "work about work" (meetings, status updates, and dashboard reviews), and that the modal review meeting produces no concrete next action. Engineering cost reviews are a textbook case. Too many numbers, no forcing function for a decision.

Cost Heatmap: Spot the Most Expensive Project in 30 Seconds

· 12 min read
Artur Pan
CTO & Co-Founder at PanDev

Open the Finances page for an organization with 38 active projects. The default view is a sortable table: project name, cost last 30 days, cost all-time, owner, status. The CFO's monthly cost review starts here. 38 rows, 8 minutes of scrolling, and a 60% chance the most-expensive project is on row 17 where nobody actually looks. Edward Tufte made the case in The Visual Display of Quantitative Information (1983, 2nd ed. 2001) that humans process color and size before they process numbers. A heatmap of the same 38 projects surfaces the dark-red square in under a second. Stephen Few's Information Dashboard Design (2006, 2nd ed. 2013) reaches the same conclusion in industry research: when monitoring requires "find the outlier," tabular data is the wrong primary view. PanDev Metrics' Projects Heatmap widget runs both modes side by side. This post is about why the mosaic should be the default and the list the cross-check.
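As a rough illustration of why a heatmap surfaces the outlier so quickly, the sketch below maps each project's cost to a discrete color level using linear min-max bucketing. It is not how the Projects Heatmap widget is implemented. The function name `heat_levels`, the five-level scale, and the sample costs are all assumptions made for this example.

```python
def heat_levels(costs: dict[str, float], levels: int = 5) -> dict[str, int]:
    """Map each project's cost to a level from 0 (coolest) to levels - 1 (hottest).

    Assumes linear min-max scaling; a real widget might use a log or
    quantile scale instead.
    """
    lo, hi = min(costs.values()), max(costs.values())
    span = hi - lo
    result = {}
    for name, cost in costs.items():
        if span == 0:
            # All projects cost the same: nothing stands out.
            result[name] = 0
        else:
            # Clamp so the maximum cost lands in the top level, not past it.
            result[name] = min(levels - 1, int((cost - lo) / span * levels))
    return result


# Hypothetical 30-day costs for four projects.
costs = {"billing": 12000, "search": 48000, "mobile": 9000, "infra": 21000}

levels = heat_levels(costs)
hottest = max(costs, key=costs.get)

print(levels)   # {'billing': 0, 'search': 4, 'mobile': 0, 'infra': 1}
print(hottest)  # search
```

With a linear scale, one very expensive project pushes most others into the coolest levels, which is exactly the "one dark-red square" effect. The trade-off is that differences among the cheaper projects become hard to see, which is where the sortable list remains useful as a cross-check.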

Media and Streaming Engineering: Building for Peak Load

· 9 min read
Artur Pan
CTO & Co-Founder at PanDev

When Super Bowl LVIII streamed on CBS in 2024, peak concurrent viewers hit 123 million. That number isn't a KPI; it's a physics problem. Disney+'s Ahsoka finale generated 14 million account logins in a 15-minute window. Netflix's Tyson-Paul fight in late 2024 failed in full view of Twitter because the streaming stack buckled at ~60 million concurrent streams. Media engineering is not optimizing for average throughput. It's optimizing for the one hour per quarter where your graphs go vertical.

The companies that do this well share a specific team shape, a specific release cadence, and a specific set of measurement habits that don't apply to most B2B SaaS. Pulling DORA metrics off a streaming platform and comparing them to a CRM is apples and typhoons. This is a field guide for the engineering leaders who run — or are about to run — a media platform through peak.

Principal Engineer: How to Measure Your Real Impact

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

A principal engineer at a 200-person fintech spent Q3 writing 180 lines of code. Her team shipped 340,000 lines in the same period. When her CTO looked at coding-time dashboards for a performance review, the principal was almost flagged as underperforming. What actually happened in Q3: she rewrote the payment reconciliation spec that unblocked two teams, mentored three senior engineers into tech-lead roles, and killed a six-month project that would have shipped something the market didn't want. Her measurable output was tiny. Her impact was the largest of any engineer in the company that quarter.

This is the principal engineer measurement paradox. Every staff-plus framework (Will Larson's, Tanya Reilly's The Staff Engineer's Path, the Google internal engineering ladder) acknowledges it: principal engineers are paid for judgment and force multiplication, not throughput. But most engineering orgs measure them like senior engineers with a bigger title. This article covers how to measure principal impact honestly, and how a principal should measure their own impact when the review conversation comes.