Replay with Paul Osman: A deep dive into OpenTelemetry and Kubernetes

February 15, 2023

with Kenneth Rose

In this replay of a previous episode, unlock the secrets of OpenTelemetry, Lambda as a scalable CPU, and building a platform team. We revisit our past episode with Paul Osman, a Senior Staff Software Engineer at LinkedIn. At the time of the recording, Paul was a staff Platform Engineer at Honeycomb. In this episode, we dive into OpenTelemetry, architecture, and how to build long-term Ops strategies.

Episode details

Join us as we discuss:

What OpenTelemetry is and what it can do
Gaining better observability of data
What to tackle first when developing a platform team

The capabilities of OpenTelemetry

OpenTelemetry is an open-source observability framework made up of several tools, APIs and SDKs. With OpenTelemetry, you can generate, collect and export telemetry data.

According to Paul, the OpenTelemetry project produces two main categories of work. One is a specification that describes how telemetry data should look; the other is a set of open-source APIs that make it easy to generate that telemetry from your systems. The project itself is rooted in the union of two communities, OpenTracing and OpenCensus.

“OpenTelemetry is one of those great stories in open source or open standards. Two communities recognize that they're serving the same audience and then decide to actually combine their efforts.” — Paul Osman

OpenTelemetry also offers a variety of components and tools to meet different business needs. For example, with a component called the OpenTelemetry Collector, you can fork telemetry data off to more than one destination. Rather than changing the instrumentation inside your applications, you change configuration to evaluate multiple vendors.
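The idea of forking telemetry by configuration can be sketched in a few lines of plain Python. This is a toy model, not the OpenTelemetry Collector or its API: the `FanOutPipeline` class, the `make_memory_exporter` helper and the `vendor_a` / `vendor_b` destinations are all hypothetical names invented for illustration. The point is only that the application emits a record once, and a list of enabled exporters (standing in for configuration) decides where it goes.

```python
from typing import Callable, Dict, List

# A telemetry record is modeled as a plain dict; real OpenTelemetry data is richer.
Record = Dict[str, object]
Exporter = Callable[[Record], None]


def make_memory_exporter(sink: List[Record]) -> Exporter:
    """Return a hypothetical exporter that appends a copy of each record to `sink`."""
    def export(record: Record) -> None:
        sink.append(dict(record))
    return export


class FanOutPipeline:
    """Toy stand-in for a collector pipeline: one input, many exporters."""

    def __init__(self, registry: Dict[str, Exporter], enabled: List[str]) -> None:
        # `enabled` plays the role of configuration: it names which
        # exporters receive data, without touching application code.
        self._exporters = [registry[name] for name in enabled]

    def receive(self, record: Record) -> None:
        # Send the same record to every enabled exporter.
        for export in self._exporters:
            export(record)


# Two hypothetical vendor destinations, kept in memory for the example.
vendor_a: List[Record] = []
vendor_b: List[Record] = []
registry = {
    "vendor_a": make_memory_exporter(vendor_a),
    "vendor_b": make_memory_exporter(vendor_b),
}

# The application emits the record once...
record = {"name": "GET /checkout", "duration_ms": 42, "status": 200}

# ...and the enabled list decides which vendors see it.
pipeline = FanOutPipeline(registry, enabled=["vendor_a", "vendor_b"])
pipeline.receive(record)
```

Evaluating a different vendor in this sketch means building the pipeline with a different `enabled` list; the code that produces `record` does not change.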

Gaining better observability of data

Reinstrumenting code is frequently a non-starter for people. Often, they would rather stick with their existing systems or build something from scratch. Instrumentation and reinstrumentation are work that most customers do not necessarily care about. For those with hundreds of services, instrumentation is something they want to think about once and never again.

The services and systems provided through Honeycomb and OpenTelemetry offer just this: customers no longer have to reinstrument their code each time they evaluate or switch vendors.

“I think one thing that we've learned in this world is that instrumenting your code is dead.” — Paul Osman

Allowing flexibility in vendors, and changing configuration rather than code, opens up new opportunities for observability.

Traditional observability relies on three types of telemetry: logs, metrics and traces. But Honeycomb breaks that mold, relying on data beyond these three categories.

“At its heart, Honeycomb is an ultra-wide event store,” Paul says. “We just accept keys and values. You can embed those and have as many of them as possible to make an ultra-wide table that represents your data.”

“How well data represents the internal state of your system is really the degree to which you have observability.” — Paul Osman

Rather than relying only on logs, metrics and traces, Honeycomb lets you treat any data point as something you can trace and observe. Ultimately, this allows organizations to build massive datasets shaped by their own needs and services.
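To make the "ultra-wide event" idea concrete, here is a minimal sketch in plain Python. It is not Honeycomb's API or storage format: the field names (`user_plan`, `cache_hit`, `retry_count`) and the `mean_by` helper are assumptions made up for this example. Each unit of work is one event holding as many keys and values as you care to attach, and any key can later be used to slice the data.

```python
from collections import defaultdict
from statistics import mean
from typing import Any, Dict, List

Event = Dict[str, Any]

# One wide event per request. Events do not need to share the same keys.
events: List[Event] = [
    {"service": "checkout", "endpoint": "/pay", "duration_ms": 120,
     "user_plan": "free", "cache_hit": False},
    {"service": "checkout", "endpoint": "/pay", "duration_ms": 40,
     "user_plan": "pro", "cache_hit": True},
    {"service": "checkout", "endpoint": "/pay", "duration_ms": 200,
     "user_plan": "free", "cache_hit": False, "retry_count": 2},
]


def mean_by(rows: List[Event], group_key: str, value_key: str) -> Dict[Any, float]:
    """Average `value_key` for each distinct value of `group_key`.

    Events missing either key are skipped, since wide events are sparse.
    """
    groups: Dict[Any, List[float]] = defaultdict(list)
    for row in rows:
        if group_key in row and value_key in row:
            groups[row[group_key]].append(row[value_key])
    return {key: mean(values) for key, values in groups.items()}


# Any attached key can become a grouping dimension after the fact.
latency_by_plan = mean_by(events, "user_plan", "duration_ms")
latency_by_cache = mean_by(events, "cache_hit", "duration_ms")
```

Because the grouping key is chosen at query time, adding a new field to future events is enough to make it available as a new dimension; nothing else in this sketch has to change.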

What to tackle first when developing a platform team

According to Paul, the big problem areas that new platform teams face vary greatly depending on where, how and by whom their products are used. But regardless of industry or use, there are a few necessities for developing a platform team.

“Whether you’re serving customers or other engineers on your team, your job is to ultimately accelerate value delivery,” Paul says.

To do that, you have to understand in depth the problems your users are facing. In some cases, engineers may have difficulty with reliability; in others, engineers may take a long time to get their code into production.

In the end, it’s the responsibility of the platform or internal systems teams to investigate existing problems and help combat them.

“There are no best practices in our industry,” Paul says. “There are only sets of guidelines you can use to look for and evaluate problems with value in mind.”

Establishing this understanding before building a platform team is essential. While some organizations thrive with a specialized team, others work cross-functionally. But each environment faces a unique set of challenges and therefore requires an equally unique approach.

Want to learn more about moving away from monolithic software, empowering your teams and the idea of ‘build and rent’? Listen on Apple Music, Spotify or wherever you find your podcasts.

Meet your host

Kenneth Rose

Kenneth (Ken) Rose is the CTO and Co-Founder of OpsLevel. Ken has spent over 15 years scaling engineering teams as an early engineer at PagerDuty and Shopify. His in-the-trenches experience gives him a unique perspective on how some of the best teams are built and scaled, and he brings this viewpoint to building products for OpsLevel, a service ownership platform built to turn chaos into consistency for engineering leaders.
