The Profiler Performance Engineering #26
@@ -0,0 +1,189 @@
---
title: "The Profiler Performance Engineering"
I think we should try to find a better title. Perhaps "The Profiler, a story of what it does and doesn't show"? :)
Modern software systems are increasingly complex, and ensuring their performance under real-world conditions is critical
to delivering reliable and scalable applications. Traditional performance testing often focuses on high-level metrics such
as average response time or throughput. While these metrics are useful, they can obscure deeper system inefficiencies and
bottlenecks. To uncover these hidden issues, a more granular and methodical approach is required: one that examines the
system at the level of its individual resources.
change required-one to "required. - One"?
This document introduces a performance engineering workflow that integrates profiling techniques with the
https://www.brendangregg.com/usemethod.html[USE Method] (Utilization, Saturation, and Errors) to diagnose and resolve
performance issues. By combining performance testing tools like Hyperfoil with low-level profilers such as
https://github.com/async-profiler/async-profiler[async-profiler] and https://man7.org/linux/man-pages/man1/perf-stat.1.html[perf],
we demonstrate how to identify CPU stalls, cache misses, and poor memory access patterns. Through a real-world benchmarking scenario,
we show how profiling data can guide code optimizations and system configuration changes that lead to measurable
improvements in Instructions Per Cycle (IPC) and overall system responsiveness.
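
As a concrete preview, IPC is a headline number in "perf stat" output. A minimal sketch, assuming a hypothetical
PID and measurement window (not the exact commands used in the benchmark below):

[source,shell]
----
# Attach to a running process (hypothetical PID 12345) for 30 seconds;
# the "insn per cycle" figure in the report is the IPC.
perf stat -p 12345 -- sleep 30
----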
## Software Development Life Cycle

A *software developer* implements features based on defined requirements, such as creating multiple endpoints to solve a
specific business problem. Once development is complete, the *performance engineering* team gathers SLAs from stakeholders
and designs performance tests that reflect real business use cases. These tests typically measure metrics like average
response time. For each release that affects the business logic, the performance tests are rerun to detect any regressions.
If a regression is found, the team receives feedback to address it.
There is nothing wrong with this approach, but we can go even further.

### Personas

*Software Developer*: A professional who designs, builds, tests, and maintains software applications or systems.

*Performance Engineering*: Ensures that a software system meets performance requirements under expected workloads. This
involves creating and maintaining performance tests, using tools like Hyperfoil and web-based scenarios, to simulate
real-world usage. The results provide valuable feedback to the team. If the system passes these tests, the product is
considered ready for General Availability (GA).

*Profiler Performance Engineering*: Analyzes performance test results by profiling the source code to uncover system
bottlenecks. The process typically begins by identifying which resource (CPU, memory, disk, or network) the team has
chosen to stress, guiding the analysis toward the root cause of any performance degradation.

### Java Developer Persona Belt

* Software Developer: Eclipse IDE, IBM Semeru JDK
* Performance Engineering: Hyperfoil and a web-based application
I do not think this section adds much value to the message we should focus on in this blog. @franz1981 ?

I would like to show what we do and how we can fit under other teams. Like a "benchmarking-post".
* Profiler Performance Engineering: async-profiler, jfrconv, perf, sar
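
To make the belt concrete, here is roughly how each profiling tool might be invoked. This is a sketch only: the
PID, durations, and file names are placeholders; the profiler.sh launcher is the async-profiler 2.x entry point
(newer releases ship an asprof binary instead), and the jfrconv arguments in particular may vary by release:

[source,shell]
----
PID=12345   # placeholder: the JVM process to profile

# async-profiler: 30 seconds of CPU sampling rendered as a flame graph
./profiler.sh -d 30 -f /tmp/flamegraph.html "$PID"

# jfrconv: convert an existing JFR recording to a flame graph
jfrconv recording.jfr /tmp/recording.html

# perf: hardware counter statistics (IPC, cache misses, ...) over 30 seconds
perf stat -p "$PID" -- sleep 30

# sar: system-wide CPU utilization, sampled every second, 30 samples
sar -u 1 30
----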
## The USE Method

According to Brendan Gregg, the **U**tilization, **S**aturation, and **E**rrors (USE) Method is a methodology for analyzing the
performance of any system. It directs the construction of a checklist, which for server analysis can be used for
quickly identifying resource bottlenecks or errors. It begins by posing questions, and then seeks answers, instead of
beginning with given metrics (partial answers) and trying to work backwards.

### Terminology definitions

* *resource*: all physical server functional components (CPUs, disks, busses, ...)
* *utilization*: the average time that the resource was busy servicing work
* *saturation*: the degree to which the resource has extra work which it can't service, often queued
* *errors*: the count of error events

The metrics are usually expressed in the following terms:

* *utilization*: as a percent over a time interval, e.g., "one disk is running at 90% utilization"
* *saturation*: as a queue length, e.g., "the CPUs have an average run queue length of four"
* *errors*: scalar counts, e.g., "this network interface has had fifty late collisions"
I do not think this section adds much value to the message we should focus on in this blog. @franz1981 ?

Most of those involved in performance may not be aware of this. This section aims to provide an introduction to the topic without the user opening a new tab and reading the entire USE Method article.
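
Each of the three signals can be read with stock Linux tools. A sketch of such a checklist, mirroring the three
examples above (sampling intervals and counts are arbitrary):

[source,shell]
----
# Utilization: per-device disk busy percentage (the %util column)
iostat -x 1 5

# Saturation: CPU run-queue length (the "r" column)
vmstat 1 5

# Errors: per-interface network error counters (rxerr/s, txerr/s)
sar -n EDEV 1 5
----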
### SUT CPU analysis

We can start by looking at the "perf stat" output for the SUT's (System Under Test) application PID. "perf stat" is a
powerful Linux command-line tool that gathers performance counter statistics from a running process or from the system
as a whole.
change to: for the SUT's application PID?
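
A minimal sketch of how such a measurement might be taken, whether against the SUT or the load driver (the PID
variable and the 60-second window are placeholders):

[source,shell]
----
# PID is a placeholder for the process under observation. In the report,
# the "task-clock ... CPUs utilized" line is the utilization metric
# discussed below.
perf stat -p "$PID" -- sleep 60
----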
This metric indicates that 5.2 CPU cores are being utilized. For this test, we have a constraint of only 15 CPU cores
available. Therefore, 5.2 ÷ 15 equals approximately 34%, meaning the CPU is busy 34% of the time. This suggests that
the loader is not a highly utilized resource, so we could experiment by increasing its injection rate to raise the
system load and observe the impact. However, this is not guaranteed to succeed, as other bottlenecks might limit the
outcome. In our case, the loader can sustain that increase in the injection rate, and now the perf stat output for the SUT is:
I think we need to inform about at what rate the load driver was pushing when we saw the 34% utilization and what we changed it to to drive higher load.
## Software Development Life Cycle
Change to === for proper asciidoc formatting