Introduction to Performance Profiling Best Practices and Fundamentals

September 22, 2018 

What is Performance Profiling?

Performance profiling measures where a program spends its CPU time and memory. When profiling Go code, the result is stored in a file, typically with a .out or .pprof extension (any file name can be used when generating a benchmark result). To read the result back from the file in which the profiling information is stored, follow the commands mentioned in this document.


How Does Performance Profiling Work?

Generating profiling results without a benchmark-related command -

  • Write the code to be profiled.
  • Go to the specific location where the code file is placed.
  • Execute the following command to get the profiling information:
  • go build -o app/test && time ./app/test > filename.pprof
  • The profiling information for the code is stored in the named file (filename.pprof).
  • Execute go tool pprof filename.pprof to read the information from the profiling file.

Generating profiling results with the benchmark command -

  • Write the code in the form of functions, according to your requirements.
  • Write a benchmark for each function in a file whose name ends with "_test.go", e.g. "filename_test.go".
  • Run all the benchmarks present in the file with the following command:

go test -run=. -bench=. -cpuprofile=cpu.out -benchmem -memprofile=mem.out

  • After execution of the above command, the CPU profiling result is stored in cpu.out and the memory profiling result in mem.out.
  • The following commands give access to the contents of the cpu.out and mem.out files.
  • Execute go tool pprof cpu.out for CPU profiling and go tool pprof mem.out for memory profiling.

Benefits of Enabling Performance Profiling

  • Helps to identify bottlenecks in the code.
  • Once the bottlenecks are identified, the pieces of code creating them can easily be changed.

Why Does Profiling Matter?

  • It is required to find the execution time and resource usage inside a specific function.
  • Various commands report command execution time and CPU resource usage.
  • Some parameters need to be understood; once the parameters appearing on the console are understandable, the output is easily analyzed.

Parameters to analyze the result


Profiling KPIs

  • Flat - the time spent executing the function itself.
  • Flat% - the percentage of total CPU time spent executing the function itself.
  • Cum - stands for cumulative. It represents the time spent executing a specific function, including the time spent in all other functions called from that function.
  • Cum% - the percentage of total CPU time spent executing a particular function, including all other functions called from it.

How to Adopt Performance Profiling?


Memory Profiling commands

The commands that help with CPU and memory profiling are explained below -


go tool pprof -cum --inuse_objects mem.out
go tool pprof -cum --alloc_space mem.out
go tool pprof -cum --inuse_space mem.out
go tool pprof -cum --alloc_objects mem.out

CPU Profiling commands

To inspect the cpu.out file, in which the CPU profiling information is stored, use the following command -


go tool pprof cpu.out 

CPU Profiling on a specific function

To profile a specific function, the following command can be used inside the pprof console -


list name_of_function

Profiling for top CPU processes


Top command

This command lists entries in order of resource usage: the function consuming the most CPU is listed first, and the one consuming the least is listed last.

top 4

Specifying a number of nodes along with the top command defines how many nodes the user wants to see in the result.

Profiling on the main method


list main\.

This command shows the resource usage for functions in the main package (names matching main\.).

Profiling the main method also lists the resource usage of other components, as follows -


runtime.memclrNoHeapPointers
runtime._ExternalCode
main.main
runtime.(*mheap).alloc
runtime._GC
runtime.heapBits.initSpan
runtime.largeAlloc
runtime.main
runtime.makeslice
runtime.mallocgc
runtime.mallocgc.func1
runtime.systemstack

Visualize the Result in a .PDF File

To generate the result as a .pdf file from mem.out or cpu.out, use the following commands -

Go to the location of the project where the file with the .out extension is present, for example: /usr/local/src/Projects/Doc_demo/main$ ls

After that execute the following command to generate a pdf file.

go tool pprof --pdf "location of .test file" "location of the file whose result is converted into a pdf file" > filename.pdf


go tool pprof --pdf cpu.out > cpu0.pdf

After execution of the above command, the cpu.out or mem.out result will be available in a file with a .pdf extension.

Listing all the files afterwards, the file with the .pdf extension will be present.

A generated PDF file containing a profiling result can be opened using the following command.


xdg-open cpu1.pdf

Finally, after accessing the PDF file from the console, it opens as follows.

The above procedure stores the cpu.out result in a PDF file. The same procedure can be followed to save the mem.out results in a file with a .pdf extension.

Memory profiling on API endpoints

To profile API-based benchmarks, follow the steps specified below -

  • First of all, go to the location where the Go code is present.
  • Run the application at the specific point where you want to profile it.
  • Then open a new console at the same path (see the step above) and, from there, run a benchmark for the application using the command written below -

go test -run=xxx -bench=. -cpuprofile profile_cpu.out

Concluding Performance Profiling

In a nutshell, profiling is required to check how much time a specific function takes to execute. In the same way, profiling API endpoints tells how much time an endpoint takes to fulfil a request. Profiling traces each line of code and helps to identify how much time each line takes to execute. Mainly, it helps with code optimization: it identifies the hot areas of the code, the pieces acting as a bottleneck for the whole program.