This week, I got the chance to work with an experienced developer while we investigated whether our code was degrading performance on a client’s site. This was totally new to me, so I mostly watched as he quickly collected data and traced function calls. Many of us have probably seen a chart like this before:
I wanted to better understand how to use this tool and how to read the results. So of course the next logical step is…to blog about it!
Generating a Performance Sample
Before going any further, I should point out that Google has done a great job of documenting this tool themselves (of course), so feel free to check it out if anything I write doesn’t make sense or seems incomplete. I relied on it heavily while writing this post.
Let’s assume we’re using Chrome DevTools, though a lot of this functionality works the same way in other browsers. Open an incognito window, head to the site in question, and open DevTools (cmd + option + J on a Mac, ctrl + shift + J otherwise). DevTools has several interface/tab options, but we want Performance:
I’m using weather.com as a relatively generic, straightforward, and inoffensive example. Here’s a screenshot of our default Performance tab:
On the very top bar, we see our element selector, display type (desktop vs. mobile), options for selecting an interface, Errors, Warnings, Issues, and some customization options for settings and preferences. Below that is a bar with Performance-specific features. From left to right, we’ve got:
- Record: click this to start collecting data on site performance. The resulting recording is called a Profile.
- Start profiling and reload page: this will produce a full performance profile after a page reload. We’ll use it to get an idea of what functions are being called and how long they take when our page starts up.
- Clear: clears the current performance profile.
- Load Profile: We can use this to view a previously saved profile.
- Save Profile: If we’ve got a profile open in the panel, we can save it!
- Show recent timeline sessions: Switch between multiple recordings.
- Capture screenshots: Checking this box means that we’ll be able to see exactly how the page looked at the moment of a certain function call. This can help with figuring out whether certain scripts can be made asynchronous, delayed, or even removed completely. A big part of performance management is determining the balance between load speed and site experience.
- Capture settings: This gear toggles the next bar.
If we have Capture settings on, we’ll see a few more options:
- Enable advanced paint instrumentation (slow): Paint instrumentation lets us analyze animations and rendering in detail. It’s not super relevant to overall site performance, but it’s worth exploring. Here’s a short video from Google about it.
- Network: Throttling lets us simulate a slower connection speed so that we can test our site’s performance in different situations. Loading content at a non-optimal speed gives us a much better idea of what loads when on our site.
- CPU: We can also throttle our CPU, simulating a slower device rather than a slower network.
To get our first profile, click the Start profiling and reload page button. The page will reload and we’ll see a Status popup in DevTools. It took me about 30 seconds to get through the recording and compilation, so don’t worry if the process seems to be taking a while. For best results, I suggest running the refresh shortly after opening the window. For whatever reason, having the window open for a long time before I clicked that button created a lot of whitespace in my profile.
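Saved profiles, by the way, are plain JSON files in Chrome’s Trace Event format: an array of events, each with a name, a timestamp, and usually a duration in microseconds. As a rough sketch of what’s inside one, here’s how we might total up time per event name in Node — the tiny trace object below is a hand-made stand-in for illustration, not real data from a recording:

```javascript
// Minimal stand-in for a saved DevTools profile (Trace Event format).
// Real traces contain thousands of events; "dur" is in microseconds.
const trace = {
  traceEvents: [
    { name: "EvaluateScript", ph: "X", ts: 0, dur: 42000 },
    { name: "FunctionCall", ph: "X", ts: 5000, dur: 12000 },
    { name: "EvaluateScript", ph: "X", ts: 60000, dur: 8000 },
  ],
};

// Total up duration per event name, longest first.
// "X" marks a complete event, i.e. one with a duration attached.
function totalsByName(events) {
  const totals = new Map();
  for (const e of events) {
    if (e.ph !== "X" || typeof e.dur !== "number") continue;
    totals.set(e.name, (totals.get(e.name) || 0) + e.dur);
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}

console.log(totalsByName(trace.traceEvents));
// [["EvaluateScript", 50000], ["FunctionCall", 12000]]
```

This is also handy when a profile is too big to eyeball: a few lines of scripting can answer “what ate the most time?” before we ever open the flame chart.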
What does it mean?
Let’s break these results into three sections. The first two sections are divided by a millisecond timeline. Here’s the top section: the Overview.
If we put too much work on the browser’s main thread, we risk dropping frames. This means that a user might experience choppy transitions or scrolling as they navigate our site during times of high stress. On weather.com, we can see some instances where FPS is high, but in general the performance is steady.
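The FPS chart makes more sense with the frame budget in mind: at a 60 FPS target, each frame gets roughly 16.7 ms of work before it’s dropped. A quick sketch of that arithmetic (the frame times are made up for illustration):

```javascript
// At a target of 60 frames per second, each frame has a fixed time budget.
const TARGET_FPS = 60;
const frameBudgetMs = 1000 / TARGET_FPS; // ≈ 16.7 ms

// Frames whose work exceeds the budget get dropped, which reads as jank.
function droppedFrames(frameTimesMs) {
  return frameTimesMs.filter((t) => t > frameBudgetMs).length;
}

console.log(frameBudgetMs.toFixed(1)); // "16.7"
console.log(droppedFrames([12, 18, 15, 30])); // 2 (the 18 ms and 30 ms frames)
```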
At the very top, we see some pink bars with red highlights. Those highlights flag long tasks, bottlenecks, or other issues we might want to investigate. Finally, at the very bottom, we see a filmstrip of screenshots showing how the page looked at each point in the timeline.
There’s a ton of data in the Overview, and it’s compressed such that it’s impossible to actually see the details. That’s why we have the second section, the Main section. Here, we can see details about our Main thread (and a lot of other processes as well):
If we open the Main tab, we’ll be able to see all the JS processes and call stacks going on at a specific moment of the page load. They are organized in a “flame chart,” so named because they create columns that look like flames (ours is upside down). In this chart, rows of the same color are related, and the top-most bar is the call that causes the bars beneath it to run. The length of a bar shows how long that process took to complete. Bars get shorter as the chart gets deeper because a child call can never take longer than its parent; as each child completes, control returns to its parent until the original function call finally resolves.
We don’t get full transparency, especially since the JS has likely been minified, but if we’re code authors we should be able to recognize some of our tasks and see whether or not they are, for example, causing long load times. We can even see where generic events like DOMContentLoaded occur.
Clicking on any of these bars will provide additional information in the lowest section of our DevTools display: the Summary section. Here we’ll see the time it took for a function to run, where the function was called, and a breakdown of the runtime. In our original screenshot, we saw a summary breakdown of processes for the entire page load.
If we want to see the data on our JS processes in a different format, we can use the Bottom-Up tab. This shows a list of the functions from any process in order from longest to shortest so that we can identify parts of our site that might be causing latency. Call Tree displays the functions as parents and children, showing who called what. Event Log shows all the processes in the order they occurred.
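Under the hood, the Bottom-Up view is just an aggregation: charge each sample’s “self time” to the function at the top of its call stack, then sort. A sketch of that logic, using made-up stack samples (the function names here are hypothetical):

```javascript
// Each sample is a call stack (outermost call first) plus the time it
// represents. These stacks are invented for illustration.
const samples = [
  { stack: ["main", "render", "layout"], ms: 4 },
  { stack: ["main", "render", "layout"], ms: 4 },
  { stack: ["main", "fetchData"], ms: 6 },
  { stack: ["main"], ms: 1 },
];

// Bottom-Up: attribute each sample's time to the innermost frame
// (its self time), then sort functions from most to least self time.
function bottomUp(samples) {
  const self = new Map();
  for (const { stack, ms } of samples) {
    const leaf = stack[stack.length - 1];
    self.set(leaf, (self.get(leaf) || 0) + ms);
  }
  return [...self.entries()].sort((a, b) => b[1] - a[1]);
}

console.log(bottomUp(samples));
// [["layout", 8], ["fetchData", 6], ["main", 1]]
```

A Call Tree is the same data walked top-down instead (who called whom), and the Event Log keeps the raw chronological order.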
So now that we know what everything means, how might we go about using it? I searched through the weather.com results for anything that wasn’t minified and found this:
It looks like, around 3.5 seconds after our page started loading, we called a function called startCMTagMain. Since most other function calls I’d seen on this page were anonymous, I assumed this was coming from a third party script. I checked the HTML and indeed, there’s a script that mentioned cmTag:
A bit more research taught me that Taboola is a marketing service that serves ads on websites. Sure enough, there were a few of those on weather.com! That makes sense considering we’re seeing some appendChild calls made close to the bottom of our flame chart.
If I were actually working on this site, I likely would have already known all this without having to do any detective work, but I might still examine this part of the load process to see if Taboola’s script is creating any latency on my site. It looks like the entire task that calls startCMTagMain takes about 60 ms. Anything longer than 50 ms qualifies as a “long task,” but according to the web.dev documentation on long tasks, the idea behind that threshold is to prevent perceptible lag on user input. If this process is simply appending ads to the bottom of the webpage, I have a feeling that an extra 10 ms wouldn’t cause any issues.
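We can also catch long tasks programmatically rather than hunting for them in the flame chart: the same 50 ms threshold is exposed in the browser through a PerformanceObserver watching for “longtask” entries. The observer part below is browser-only (guarded so the script still runs in Node); the threshold check itself runs anywhere:

```javascript
// 50 ms is the long-task threshold: anything longer risks making
// the page feel unresponsive to user input.
const LONG_TASK_MS = 50;

function isLongTask(durationMs) {
  return durationMs > LONG_TASK_MS;
}

console.log(isLongTask(60)); // true — like our ~60 ms startCMTagMain task
console.log(isLongTask(40)); // false

// Browser-only: get notified of long tasks as they happen.
// "longtask" is a standard entry type from the Long Tasks API.
if (typeof window !== "undefined" && typeof PerformanceObserver !== "undefined") {
  new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      console.log(`long task: ${entry.duration.toFixed(0)} ms`);
    }
  }).observe({ type: "longtask", buffered: true });
}
```

Dropping something like this into the console (or a monitoring script) is a nice complement to one-off profiling, since it catches long tasks on real user interactions, not just page load.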
Making use of the Performance feature can require a lot of knowledge about the code we’re reviewing. Without that, we’re examining a bunch of anonymous functions and painting processes. Still, it’s helpful to learn a new tool and be prepared to wield it for future debugging. It’ll be exciting when we can finally use it to solve a problem of our own!