Pyroscope and OTel Integration

Title: Profiling and Performance Analysis in Go Applications with Pyroscope and Tempo

Introduction

In modern software development, understanding and optimizing application performance is crucial for delivering high-quality, efficient software. Profiling tools play a vital role in identifying performance bottlenecks and optimizing code execution. Pyroscope, an open-source continuous profiling platform, provides developers with valuable insights into the runtime behavior of their code. Additionally, Tempo, an open-source distributed tracing system, enables the monitoring and visualization of request flows in complex distributed architectures.

This study aims to explore the integration of Pyroscope and Tempo into a simple Go application with an HTTP server. By incorporating both Pyroscope’s profiling capabilities and Tempo’s distributed tracing features, we can gain a comprehensive understanding of the application’s performance characteristics and how requests propagate through the system.

This study will be divided into three parts:

  • The first part will focus solely on the integration of OTel and Pyroscope;
  • The second will focus on deploying the app to Kubernetes to explore Pyroscope's service discovery;
  • The third will focus on labeling profiling data and adding profile types beyond CPU.

In summary, this first part will consist of:

  • Implementing a simple Go HTTP server

  • Instrumenting the Go application with Pyroscope and Tempo

  • Generating and collecting profiling data

By combining Pyroscope’s profiling capabilities, Tempo’s distributed tracing features, and our Go application, this study aims to showcase the powerful insights and performance optimization potential that can be achieved. The results obtained from this study will contribute to a better understanding of profiling and distributed tracing techniques and their application in real-world scenarios, fostering the development of efficient and performant Go applications in distributed environments.

Continued advancements in profiling and distributed tracing techniques are essential for improving the performance and resource efficiency of modern software systems. Through this study, we hope to demonstrate the value of Pyroscope and Tempo as reliable tools for profiling, tracing, and optimizing Go applications, ultimately leading to improved performance, enhanced user experiences, and more efficient distributed architectures.

Understanding Profiling and Distributed Tracing

The Pyroscope blog gives the following explanation of profiling and continuous profiling: “Profiling is a dynamic method of analyzing the complexity of a program, such as CPU utilization or the frequency and duration of function calls. With profiling, you can locate exactly which parts of your application are consuming the most resources. Continuous profiling is a more powerful version of profiling that adds the dimension of time. By understanding your system’s resources over time, you can then locate, debug, and fix issues related to performance.”

Some techniques that can be used to provide insight into an application are listed below (a minimal Go sketch follows the list):

  • Execution time profiling: This technique measures the time taken by different parts of the code to execute. It helps identify functions or code blocks that are taking a significant amount of time to run, allowing developers to focus their optimization efforts on those areas.

  • CPU profiling: CPU profiling provides information about the CPU usage by the program. It helps identify functions or code paths that consume excessive CPU resources, enabling developers to optimize those sections to reduce the overall CPU usage.

  • Memory profiling: Memory profiling helps identify memory usage patterns and potential memory leaks. It provides insights into memory allocation and deallocation, allowing developers to identify areas where excessive memory is being used or where memory leaks may be occurring.

  • I/O profiling: This technique focuses on measuring the input/output operations of the program. It helps identify areas where the program may be spending a significant amount of time waiting for I/O operations to complete, such as disk reads or network requests.
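
To make these techniques concrete, here is a minimal sketch, not part of the study's application, showing how Go's standard runtime/pprof package can capture a CPU profile and a heap snapshot around a block of work; the file names and the doWork function are purely illustrative:

package main

import (
	"log"
	"os"
	"runtime"
	"runtime/pprof"
)

func main() {
	// Execution time / CPU profiling: sample everything between Start and Stop.
	cpuFile, err := os.Create("cpu.pprof") // illustrative file name
	if err != nil {
		log.Fatal(err)
	}
	defer cpuFile.Close()
	if err := pprof.StartCPUProfile(cpuFile); err != nil {
		log.Fatal(err)
	}
	defer pprof.StopCPUProfile()

	doWork() // placeholder workload

	// Memory profiling: write a heap snapshot after the workload has run.
	memFile, err := os.Create("heap.pprof")
	if err != nil {
		log.Fatal(err)
	}
	defer memFile.Close()
	runtime.GC() // flush recent allocations into the heap statistics
	if err := pprof.WriteHeapProfile(memFile); err != nil {
		log.Fatal(err)
	}
}

func doWork() {
	// Placeholder for the code being profiled.
	s := make([]int, 0)
	for i := 0; i < 1_000_000; i++ {
		s = append(s, i)
	}
	_ = len(s)
}

Continuous profilers such as Pyroscope automate this collection and ship the profiles to a server on a fixed interval, which is what the client configuration later in this article does.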

Distributed tracing is a technique used to monitor and analyze the flow of requests through distributed systems. It provides insights into how requests are processed as they traverse multiple services, allowing developers to understand the interactions and performance characteristics of different components within the system.

In a distributed system, a single user request often triggers a cascade of interactions across multiple services or microservices. Each service performs specific tasks and may communicate with other services to complete the request. Distributed tracing helps visualize and trace the path of a request as it propagates through these services, providing a detailed picture of its journey.
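
As a purely illustrative sketch (the study's own tracing setup appears later; the service name and URL below are placeholders), the OpenTelemetry otelhttp instrumentation shows the mechanics: the server side starts a span for each incoming request, and the client side injects the trace context into outgoing headers so the downstream service can continue the same trace. No exporter is configured here, so the global no-op tracer provider is used.

package main

import (
	"context"
	"net/http"

	"go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"
)

// callDownstream makes an HTTP request whose trace context is injected into
// the outgoing headers, so the downstream service can join the same trace.
func callDownstream(ctx context.Context) error {
	client := http.Client{
		// otelhttp.NewTransport creates a client span per request and
		// propagates the trace context.
		Transport: otelhttp.NewTransport(http.DefaultTransport),
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "http://downstream:8080/work", nil) // placeholder URL
	if err != nil {
		return err
	}
	resp, err := client.Do(req)
	if err != nil {
		return err
	}
	return resp.Body.Close()
}

func main() {
	// otelhttp.NewHandler starts a server span per incoming request and
	// extracts any trace context sent by the caller.
	handler := otelhttp.NewHandler(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		_ = callDownstream(r.Context())
	}), "incoming-request")

	_ = http.ListenAndServe(":8080", handler)
}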

When combined, distributed tracing and continuous profiling allow for an extensive understanding of how each request flows through a system. With distributed tracing, one can comprehend the latency and the path of a request as it triggers each service. When both instrumentation techniques are implemented, it becomes easier to understand the reasons behind specific latency issues and to identify the use cases in the system that consume more resources.

This comprehensive approach leads to a more accurate understanding of bottlenecks, enabling better, faster, and simpler optimization opportunities. It also facilitates easier root-cause analysis in cases where something went wrong.

The Study: Building a Go Application

The Go application used in the study is deliberately simple: an HTTP server with a single endpoint, /metrics, since the objective is to integrate OTel telemetry with Pyroscope profiling data. The app is built with Uber's Fx framework, which manages dependency injection and controls the application lifecycle (a minimal sketch of this wiring follows the directory tree). Its architecture is based on the onion architecture, with the directories organized as follows:
.
├── api
│   ├── api.go
│   ├── api_health_check.go
│   ├── routes.go
│   └── tracing_middleware.go
├── app
│   └── app.go
├── core
│   ├── config
│   │   └── config.go
│   ├── core.go
│   └── telemetry
│       └── telemetry.go
├── infra
│   ├── config
│   │   └── config.go
│   └── pyroscope
│       └── pyroscope.go
└── main.go
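
The exact wiring is not reproduced here, but a minimal sketch of how an Fx application of this shape might be assembled looks roughly as follows; the constructors below are stand-ins suggested by the directory layout above, not the repository's actual code:

package main

import (
	"context"

	"go.uber.org/fx"
)

// Stand-in types and constructors; only the wiring pattern matters here.
type Config struct{}
type Telemetry struct{}
type Pyroscope struct{}
type API struct{}

func NewConfig() *Config                         { return &Config{} }
func NewTelemetry(c *Config) (*Telemetry, error) { return &Telemetry{}, nil }
func NewPyroscope(c *Config) *Pyroscope          { return &Pyroscope{} }
func NewAPI(c *Config, t *Telemetry) *API        { return &API{} }
func (a *API) Start() error                      { return nil }

func main() {
	fx.New(
		// Provide registers constructors; Fx resolves their dependencies.
		fx.Provide(NewConfig, NewTelemetry, NewPyroscope, NewAPI),
		// Invoke forces construction and ties the server to the app lifecycle.
		fx.Invoke(func(lc fx.Lifecycle, api *API, _ *Pyroscope) {
			lc.Append(fx.Hook{
				OnStart: func(ctx context.Context) error {
					go func() { _ = api.Start() }() // start the HTTP server
					return nil
				},
				OnStop: func(ctx context.Context) error { return nil },
			})
		}),
	).Run()
}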

Integrating Pyroscope and Tempo

Collecting profiling data from the Go app is straightforward. First, we need to configure the client to send data to the Pyroscope server. The code responsible for this is located at infra/pyroscope:

package pyroscope

import (
	"log"

	"github.com/pyroscope-io/client/pyroscope"
	"github.com/rafael-polakiewicz/profiling-prom-jaeger/core/config"
)

type Pyroscope struct {
	profiler *pyroscope.Profiler
}

func New(config *config.Config) *Pyroscope {
	profiler, err := pyroscope.Start(pyroscope.Config{
		ApplicationName: config.App.Name,
		ServerAddress:   config.Profiler.Server,
		Logger:          pyroscope.StandardLogger,
		Tags:            map[string]string{"environment": config.App.Environment},

		ProfileTypes: []pyroscope.ProfileType{
			// these profile types are enabled by default:
			pyroscope.ProfileCPU,
			pyroscope.ProfileAllocObjects,
			pyroscope.ProfileAllocSpace,
			pyroscope.ProfileInuseObjects,
			pyroscope.ProfileInuseSpace,

			// these profile types are optional:
			// pyroscope.ProfileGoroutines,
			// pyroscope.ProfileMutexCount,
			// pyroscope.ProfileMutexDuration,
			// pyroscope.ProfileBlockCount,
			// pyroscope.ProfileBlockDuration,
		},
	})
	if err != nil {
		log.Fatalf("new profiler: %v", err)
	}

	log.Default().Println("pyroscope profiler started")
	return &Pyroscope{
		profiler: profiler,
	}
}
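
One detail not shown in the excerpt above is flushing the profiler on shutdown. The client's Profiler type exposes a Stop method for this, so a small helper (hypothetical, not part of the repository code) could be added and called from the application's shutdown hook, for example an Fx OnStop:

// Stop flushes any buffered profiling data and stops the background profiler.
// Hypothetical helper; it would be called from the app's shutdown hook.
func (p *Pyroscope) Stop() error {
	return p.profiler.Stop()
}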

The second step is to register routes that serve the profiling data over HTTP. For CPU profiling, one popular option is pprof. Thanks to the community, the echo-contrib repository provides github.com/labstack/echo-contrib/pprof, which wires up pprof and makes registration as easy as calling pprof.Register(api.server). This code is located at ./api/api.go:

func (api *API) Start() error {
	api.registerTracer()       // attach the tracing middleware
	api.registerRoutes()       // register the application routes
	pprof.Register(api.server) // expose the pprof endpoints under /debug/pprof

	return api.server.Start(":" + api.config.Server.Port)
}

And that's it: if you have a Pyroscope server running and start the app, CPU profiling will be up and running.

The integration of profiling data with tracing can be achieved with the library github.com/pyroscope-io/otel-profiling-go. With the tracing provider already configured, one only has to wrap it with the library's NewTracerProvider function and register the result as the global OTel tracer provider. The code that sets up the tracer is located at ./core/telemetry:

import (
	...
	otelpyroscope "github.com/pyroscope-io/otel-profiling-go"
)

func NewTelemetry(config *config.Config) (*Telemetry, error) {
	// set otel resource and exporter

	provider := sdktrace.NewTracerProvider(
		sdktrace.WithResource(resources),
		sdktrace.WithSampler(sdktrace.AlwaysSample()),
		sdktrace.WithSpanProcessor(
			sdktrace.NewBatchSpanProcessor(exporter),
		),
	)

	otel.SetTracerProvider(
		otelpyroscope.NewTracerProvider(
			provider,
			otelpyroscope.WithAppName(config.App.Name),
			otelpyroscope.WithPyroscopeURL("http://localhost:4040"),
			otelpyroscope.WithRootSpanOnly(true),
			otelpyroscope.WithAddSpanName(true),
			otelpyroscope.WithProfileURL(true),
			otelpyroscope.WithProfileBaselineURL(true),
		),
	)

	return &Telemetry{tracerProvider: provider}, nil
}

Note that several options are passed when creating the Pyroscope tracer provider; they will be explained in more detail further on.
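
The registerTracer call in api.Start and the tracing_middleware.go file are not shown in this article. As a hedged sketch only (the repository's middleware may be implemented differently), a per-request span can be obtained with the community otelecho middleware, which picks up the globally registered, Pyroscope-wrapped tracer provider:

package api

import (
	"github.com/labstack/echo/v4"
	"go.opentelemetry.io/contrib/instrumentation/github.com/labstack/echo/otelecho"
)

// registerTracer attaches a middleware that starts a span for every incoming
// request, using whatever global tracer provider has been registered.
func registerTracer(e *echo.Echo, serviceName string) {
	e.Use(otelecho.Middleware(serviceName))
}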

Correlating Profiling Data and Spans

  • Explain the process of correlating the profiling data from Pyroscope with the request spans from Tempo.
  • Discuss how this correlation helped in gaining a deeper understanding of the performance characteristics of different code segments.

Conclusion and Key Takeaways

  • Summarize the key findings, insights, and optimizations achieved through the study.
  • Emphasize the importance of profiling and distributed tracing in optimizing Go applications.
  • Discuss the broader implications and applications of the study’s results.

Future Directions and Recommendations

  • Provide suggestions for further research or improvements based on the study’s outcomes.
  • Share recommendations for developers interested in integrating profiling and distributed tracing tools into their own applications.

Wrapping Up

  • Conclude the article with a final thought or call to action, encouraging readers to explore profiling and distributed tracing in their own projects.