Session Replay: a playback of every user’s interactions with your software
Capturing and playing back a user’s interactions involves specific software design choices for web and mobile.
Seeing how users interact with your site and applications benefits both the business and customer support. There’s no better way to understand how an issue arose than to see it directly, as descriptions from memory can be inaccurate or incomplete. It can also help you improve the overall usability of your app or website.
There are different ways to understand how users interact with your sites and applications. At Dynatrace, we believe the best one is to see the user’s actual behavior, and that’s why we came up with Session Replay.
Session Replay lets you record customer interactions with your web application and replay each click and action in a movie-like experience.
It may sound easy, but building this feature involves multiple decisions, technologies, and stakeholders. In this article, I’ll explain how we built Session Replay and the thinking behind our design choices.
Capturing a session
A session is what a single user does during a finite time period. Once the user closes the tab, we close the session. If a session goes on for a long time (the hard limit in the cluster is around 8h), we close the current one and start a new one. If no interaction has taken place for an extended period, the session ends (idle timeout). A session can contain actions and events, such as page loads, user identification, button clicks, or errors. But to build a playback, we need more than that.
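The three ways a session can end can be sketched as a small decision function. This is only an illustration, not our actual implementation: the `Session` class, the `close_reason` function, and the 30-minute idle timeout are assumptions made for the example, and the 8-hour value is a rounded version of the "around 8h" hard limit.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values: the hard limit is only described as "around 8h", and the
# idle timeout length is not stated, so 30 minutes is a placeholder.
MAX_SESSION_SECONDS = 8 * 60 * 60
IDLE_TIMEOUT_SECONDS = 30 * 60


@dataclass
class Session:
    started_at: float        # seconds since some reference point
    last_activity_at: float  # time of the most recent action or event


def close_reason(session: Session, now: float, tab_closed: bool = False) -> Optional[str]:
    """Return why the session should end, or None if it stays open."""
    if tab_closed:
        return "tab_closed"
    if now - session.started_at >= MAX_SESSION_SECONDS:
        return "max_duration"  # close this session and start a new one
    if now - session.last_activity_at >= IDLE_TIMEOUT_SECONDS:
        return "idle_timeout"
    return None
```

For example, `close_reason(Session(0, 100), now=200)` returns `None`, while the same session checked 30 minutes after its last activity returns `"idle_timeout"`.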
Once you install OneAgent on your server or in your application, all actions and events are sent to our servers. We send all the required data in separate beacons while keeping the amount transferred as small as possible. Masking rules are applied to this data to ensure it doesn’t contain personally identifiable information.
Because mobile and web differ in their architecture and in how data can be transferred, we use a different process for each.
The DOM of a webpage helps us recreate what was happening on that page at a given time. With Dynatrace, we observe the whole DOM and send the entire structure for every page load, or only a partial snapshot (the changes the DOM has gone through) when something, like a modal, pops up. All these snapshots are sent in beacons together with the associated events, like scrolls, keystrokes, or changes in CSS, which help us rebuild the precise state of the page at each moment of the user’s session.
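To make the difference between a full and a partial snapshot concrete, here is a minimal sketch. It models the DOM as a flat dictionary from node id to serialized node, which is a strong simplification chosen for the example; `make_snapshot`, `apply_snapshot`, and the snapshot layout are hypothetical names, not the real beacon format.

```python
from typing import Dict, Optional

Dom = Dict[str, str]  # hypothetical model: node id -> serialized node


def make_snapshot(previous: Optional[Dom], current: Dom) -> dict:
    """Capture side: full snapshot on page load, partial snapshot afterwards."""
    if previous is None:
        return {"type": "full", "nodes": dict(current)}
    # Only nodes that are new or whose content differs go into the partial.
    changed = {k: v for k, v in current.items() if previous.get(k) != v}
    removed = sorted(k for k in previous if k not in current)
    return {"type": "partial", "changed": changed, "removed": removed}


def apply_snapshot(dom: Dom, snapshot: dict) -> Dom:
    """Replay side: rebuild the DOM state after applying one snapshot."""
    if snapshot["type"] == "full":
        return dict(snapshot["nodes"])
    rebuilt = dict(dom)
    rebuilt.update(snapshot["changed"])
    for node_id in snapshot["removed"]:
        rebuilt.pop(node_id, None)
    return rebuilt
```

When a modal pops up on an already captured page, `make_snapshot(page, page_with_modal)` contains only the modal node, and applying the full snapshot followed by the partial one reproduces the page with the modal open.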
We also took this one step further by offering resource capturing to make sure the applied CSS is the one shown during the user’s session, no matter what later changes the client might have made to their site.
For mobile apps, we take a different approach. We send a screenshot of every loaded view, complete or partial (a screenshot of only the part that has changed) and register scrolls, page rotations, or keystrokes. Every image is optimized, so it’s small enough for fast transfer without compromising readability.
On Android and iOS, we currently support Session Replay for crashes only, but extending it to full sessions is on our roadmap.
Collecting the data
All these data chunks are sent to our server through different beacons. Some carry events, like load or select. Others are binary and carry DOM mutations or content resources, like CSS, images, or fonts. We then sort the beacons into plain session data and replay data, as the two are saved in different storages. During this step, we check whether we have snapshots and view events and, if so, mark the plain session as having replay data.
Some time ago, we started improving our storage by reducing latency, capturing more actions, like rage clicks, and redistributing the data. These improvements are now complete, but we keep working on further ones.
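The sorting step can be pictured roughly as follows. This is a sketch under assumptions: the beacon dictionaries, the `kind` field, the kind names in `REPLAY_KINDS`, and `route_beacons` are all invented for illustration and handle the beacons of a single session only.

```python
from typing import List

# Hypothetical beacon kinds that count as replay data in this example.
REPLAY_KINDS = {"snapshot", "dom_mutation", "resource", "view_event"}


def route_beacons(beacons: List[dict]) -> dict:
    """Split one session's beacons into two stores and flag replay data."""
    session_store: List[dict] = []
    replay_store: List[dict] = []
    for beacon in beacons:
        if beacon["kind"] in REPLAY_KINDS:
            replay_store.append(beacon)
        else:
            session_store.append(beacon)
    # The session is marked as replayable only if both snapshots and
    # view events arrived.
    kinds = {b["kind"] for b in replay_store}
    has_replay = "snapshot" in kinds and "view_event" in kinds
    return {
        "session_store": session_store,
        "replay_store": replay_store,
        "has_replay": has_replay,
    }
```

A session that only produced a click beacon ends up with `has_replay` set to `False`, so the player would not be offered for it.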
As a limitation, we split user sessions after a few hours or after 200 actions. Replay data is stored separately on our servers, however. Imagine a session lasting 24h: for us, this would mean three sessions. Thanks to an internal mechanism, if we want to display the third session, we can show the replay data for only this time range while still being able to fetch the DOM transformations from the previous ones.
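The 24h example can be worked through in a few lines. The sketch below assumes an 8-hour split (a rounded value, not an exact figure), ignores the 200-action limit, and uses hypothetical helpers `split_session` and `replay_for_chunk`; the internal mechanism itself is not described here, so this only illustrates the idea of using earlier DOM transformations as state while showing one time range.

```python
from typing import Dict, List, Tuple

MAX_SESSION_HOURS = 8  # assumed, based on the "around 8h" hard limit

Mutation = Tuple[float, str, str]  # (time in hours, node id, new content)


def split_session(start_h: float, end_h: float) -> List[Tuple[float, float]]:
    """Split a long session into consecutive chunks of at most 8 hours."""
    chunks = []
    cursor = start_h
    while cursor < end_h:
        chunk_end = min(cursor + MAX_SESSION_HOURS, end_h)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


def replay_for_chunk(
    mutations: List[Mutation], chunk: Tuple[float, float]
) -> Tuple[Dict[str, str], List[Mutation]]:
    """Return the DOM state at chunk start and the mutations to play back.

    `mutations` must be sorted by time. Earlier mutations only build up the
    starting state; they are not shown in the playback itself.
    """
    start, end = chunk
    dom: Dict[str, str] = {}
    visible: List[Mutation] = []
    for time_h, node_id, content in mutations:
        if time_h < start:
            dom[node_id] = content
        elif time_h < end:
            visible.append((time_h, node_id, content))
    return dom, visible
```

Here `split_session(0, 24)` yields three chunks, and replaying the third one starts from the DOM as it stood after the first 16 hours.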
Rendering the data
As you can imagine, the amount of data stored is not tiny, and it must be deleted at some point to make space for new data. If an operator tries to access a session that’s already been deleted, they will receive an error message informing them why Dynatrace can’t serve it.
As with capturing, there are differences between web and mobile.
The whole DOM is rebuilt based on the data we already have. We can then fetch captured resources, like images or CSS, and we reapply the masking in case something that was not masked at recording time is covered by the masking rules now.
Sadly, we can’t reapply the masking for mobile, as the content is already baked into images and we no longer know what’s in there. In this case, we take image after image and rebuild the whole movie by showing the gestures on top of them. These gestures help you understand what the user did, and even though we don’t capture all of them yet, we keep improving.
GDPR and other privacy concerns
You may wonder: doesn’t recording users go against privacy laws?
We have solved this issue by letting the end user choose whether to allow recording. If they don’t accept the cookies displayed as they enter the website, Dynatrace doesn’t record any of their actions. In any case, our default settings mask everything, and it’s then up to our clients to fine-tune the rules.
Furthermore, our customers can apply a specific ruleset to mask whatever they believe is sensitive data, and Dynatrace won’t capture it. The masking rules for the web are reapplied on replay in case the client has detected something new. This way, whatever was masked at recording time stays hidden, and if the customer later enforces stricter rules, Dynatrace applies them too.
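The effect of masking twice, once on record and once on replay, can be illustrated with a small sketch. The node-id based rules, the `***` placeholder, and the `apply_masking` function are assumptions made for this example and do not reflect the real rule format.

```python
from typing import Dict, Set

MASK = "***"  # hypothetical placeholder for masked content


def apply_masking(
    nodes: Dict[str, str], masked_ids: Set[str], mask_all: bool = False
) -> Dict[str, str]:
    """Replace the text of every node covered by the rules with a placeholder."""
    return {
        node_id: (MASK if mask_all or node_id in masked_ids else text)
        for node_id, text in nodes.items()
    }


# On record: the rules in force at that time mask the email field, so the
# original value is never captured.
page = {"name": "Ada", "email": "ada@example.com", "title": "Checkout"}
recorded = apply_masking(page, {"email"})

# On replay: the customer has since added a stricter rule for the name field.
# The email stays masked because its content was never stored.
replayed = apply_masking(recorded, {"name"})
```

After both passes, `replayed` shows the placeholder for `name` and `email` and the original text only for `title`. Passing `mask_all=True` corresponds to the mask-everything default.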
Session Replay is one of the most interesting features on the Dynatrace platform, giving users real insight into how customers interact with their product. Building this feature hasn’t come without its challenges, and I’m sure we will encounter more as we keep developing it further.
Stay tuned for future updates!
Thanks to Giulia Di Pietro, Jordi Masramon Solans, and Stefan Eberl for your feedback on this post.
Session Replay: a playback of every user’s interactions with your software was originally published in Dynatrace Engineering on Medium, where people are continuing the conversation by highlighting and responding to this story.