Data interpretation guide
The results we provide are based on three measurements:
- eye tracking
- emotion tracking
- heart rate monitoring
The data we collect are rich in information, but to turn them into actionable insights they need to be interpreted. With a few pointers this is not difficult, and that is what this cheat sheet is for. It briefly explains how we visualize the results and what types of insights can be gained.
Eye Tracking
Our eye tracking algorithms detect all eye movements of a test person on the screen. When processing the eye movement data, we identify the most important type: fixations. Fixations are the periods during which the eye stands still. They matter so much because the visual information taken in during a fixation is the only thing we truly see; everything else we think we see is an illusion constructed by our brain. That is why eye tracking is so important.
Fixations have two properties we take into account: their number and their duration. Both are represented in the heat map we create.
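To make the idea of fixations concrete, here is a minimal sketch of a standard dispersion-based fixation detector (the classic I-DT approach). It is an illustration of the general technique, not our production algorithm; the thresholds are illustrative assumptions.

```python
# Dispersion-based fixation detection (I-DT), illustrative sketch only.
# A fixation = a stretch of gaze samples that stays within a small spatial
# window for at least a minimum duration.

def detect_fixations(samples, max_dispersion=25.0, min_duration=0.1):
    """samples: list of (t, x, y) gaze samples, t in seconds, x/y in pixels.
    Returns a list of (start_t, end_t, center_x, center_y) fixations."""
    fixations = []
    i, n = 0, len(samples)
    while i < n:
        j = i
        # Grow the window while gaze stays within max_dispersion pixels.
        while j + 1 < n:
            window = samples[i:j + 2]
            xs = [p[1] for p in window]
            ys = [p[2] for p in window]
            if (max(xs) - min(xs)) + (max(ys) - min(ys)) > max_dispersion:
                break
            j += 1
        duration = samples[j][0] - samples[i][0]
        if duration >= min_duration:
            xs = [p[1] for p in samples[i:j + 1]]
            ys = [p[2] for p in samples[i:j + 1]]
            fixations.append((samples[i][0], samples[j][0],
                              sum(xs) / len(xs), sum(ys) / len(ys)))
            i = j + 1
        else:
            i += 1  # too short to count as a fixation; skip one sample
    return fixations
```

The two properties the text mentions fall straight out of the result: the number of fixations is the list length, and each tuple carries its own duration.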
Heat map
From the positions, the number, and the durations of all testers' fixations we create a heat map. The heat map visualizes this information with an intensity scale ranging from blue (very low) to red (highest). Blue indicates that an area has been seen, but most likely only in peripheral vision. Red areas, on the other hand, have definitely been seen.
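The construction can be sketched as follows: each fixation adds a Gaussian "blob" weighted by its duration, and the accumulated intensity is what gets mapped onto the blue-to-red color scale. This is a minimal illustration, not our production pipeline; grid size and kernel width are assumptions.

```python
# Illustrative heat map accumulation: duration-weighted Gaussian kernels.
import math

def heat_map(fixations, width, height, sigma=30.0):
    """fixations: list of (x, y, duration); returns a height x width grid of
    intensities normalized to 0..1 (0 = blue end, 1 = red end)."""
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy, dur in fixations:
        r = int(3 * sigma)  # only evaluate the kernel in a local window
        for y in range(max(0, int(fy) - r), min(height, int(fy) + r + 1)):
            for x in range(max(0, int(fx) - r), min(width, int(fx) + r + 1)):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += dur * math.exp(-d2 / (2 * sigma ** 2))
    peak = max(max(row) for row in grid)
    if peak > 0:  # normalize so the hottest (reddest) point is 1.0
        grid = [[v / peak for v in row] for row in grid]
    return grid
```

Areas with many long fixations accumulate high intensity (red); areas touched only by the tails of distant blobs stay near zero (blue), which matches the "seen only in the periphery" reading above.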
Screenshots versus Videos
The analysis we provide is the same regardless of the type of content, but the pattern of eye movements differs greatly between screenshots and videos. To give an example, conventional wisdom says that faces attract attention, and eye tracking studies have shown that a face in an image receives many fixations. That is generally true, but it holds for screenshots. In a video, viewers do not fixate only on the faces: sometimes a face attracts fixations just once, and afterwards the eyes are more interested in the action of the video than in the faces. An individual frame of a video therefore cannot be directly compared to a static image in terms of its fixation pattern. In videos, additional processes come into play as well, for example predictive eye movements. Imagine watching a video of a person throwing a ball: your brain already knows where the ball is going to be, and fixations land on those points even though the ball is not there yet.
Presentation length
Most website visits last no more than about 5 seconds before visitors leave. Yet we decided to present screenshots of websites to our testers for 10–15 seconds. Why 10–15 seconds if 5 would be enough?
The reason is simple: more data. We want to be sure to collect enough data to see whether important parts of the screenshot are missed within the first 5 seconds, or missed entirely. This tells us whether the screenshot needs timing optimization, for example re-ordering specific parts so they are seen within 5 seconds, or whether a complete re-design is necessary because the CTA (Call to Action) is not seen at all.
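The 5-second check described above can be sketched as a small classification over an area of interest. The AOI rectangle, the function name, and the verdict labels are made-up examples for illustration:

```python
# Sketch of the 5-second check: was an area of interest (e.g. the CTA)
# fixated early, late, or never? Labels and thresholds are illustrative.

def aoi_verdict(fixations, aoi, early_cutoff=5.0):
    """fixations: list of (t, x, y) fixation onsets in seconds/pixels.
    aoi: (left, top, right, bottom) rectangle in pixels."""
    left, top, right, bottom = aoi
    hits = [t for t, x, y in fixations
            if left <= x <= right and top <= y <= bottom]
    if not hits:
        return "never seen"       # a complete re-design may be necessary
    if min(hits) <= early_cutoff:
        return "seen within 5 s"  # no timing problem
    return "seen too late"        # consider re-ordering the layout
```

Because testers view each screenshot for 10–15 seconds, the data can distinguish "seen too late" from "never seen", which a 5-second presentation could not.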
Emotion Tracking
Our algorithms detect six basic emotions by identifying facial expressions.
This means that while we are doing eye tracking, we also measure how people feel about what they are looking at, at exactly that point in time! There are a few caveats to take into account, though, when interpreting the emotion data.
#1 Some facial expressions are very similar to each other
A facial expression like smiling, which indicates happiness, is very distinct and is rarely confused with another expression.
Our algorithm reports a percentage for each detected facial expression. It is important to remember, however, that some facial expressions resemble each other, for example sadness and anger:
These two expressions are not always expressed to the fullest, as shown in the images, but more often just by changes in the eyebrows, which look almost identical for both expressions. It is therefore important to take this into account.
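One simple way to handle this ambiguity in practice is to merge the scores of easily-confused expressions when neither clearly dominates, instead of forcing a single winner. This is a hypothetical heuristic on top of the reported percentages, not part of our algorithm; the margin value is an assumption.

```python
# Sketch: if sadness and anger scores are close, report them as one
# combined "sadness/anger" reading rather than picking a winner.

def merge_similar(scores, pair=("sadness", "anger"), margin=0.15):
    """scores: dict of expression -> fraction (0..1)."""
    a, b = pair
    if abs(scores.get(a, 0.0) - scores.get(b, 0.0)) < margin:
        merged = dict(scores)
        # Combine the two ambiguous scores under a joint label.
        merged[f"{a}/{b}"] = merged.pop(a, 0.0) + merged.pop(b, 0.0)
        return merged
    return scores
```

When one of the two scores clearly dominates, the dictionary is returned unchanged and the distinct expression stands on its own.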
#2 Anger isn’t always anger
As with caveat #1, some facial expressions resemble each other, so the context of what is being looked at matters. People rarely get angry at commercials, websites, or graphic designs. What they do often get is confused, and confusion is another facial expression very similar to sadness and anger. The bottom line is that the context of what people look at matters for how you interpret the emotion. If testers have to watch a provocative political speech, they are most likely angry, not confused. The same holds for reading a provocative text on a website. But if such triggers are absent, it is more likely that people are simply confused.
Excitement Tracking
The third component we measure is heart rate, which is indicative of excitement. Heart rate is very often correlated with the expression of an emotion: for example, when someone starts to smile and feels happy, the heart rate goes up. The same happens when someone is angry; excitement goes up as well.
Excitement can also go down, for example while seeing something really relaxing. For websites, especially web shops, it is not a good sign if excitement fades: the visitor should be excited about the shopping experience, not bored. For consulting websites, however, a reduction in excitement can be good. Consulting websites try to express professionalism and, most importantly, trust. If a consulting website manages to convey this, the viewer's excitement decreases, and the website does exactly what it should do!
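The up/down reading described above amounts to comparing heart rate during viewing against a resting baseline. The function below is a minimal sketch of that idea; the 5% threshold is an illustrative assumption, not a calibrated value.

```python
# Sketch: classify heart-rate change relative to a resting baseline as
# rising, falling, or flat excitement. Threshold is an assumption.

def excitement_change(baseline_bpm, viewing_bpm, threshold=0.05):
    """Return 'up', 'down', or 'flat' for relative change vs. baseline."""
    change = (viewing_bpm - baseline_bpm) / baseline_bpm
    if change > threshold:
        return "up"    # e.g. an engaging web shop experience
    if change < -threshold:
        return "down"  # e.g. a calm, trust-building consulting site
    return "flat"
```

Whether "up" or "down" is the desirable outcome depends on the kind of site being tested, as discussed above.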
Combination is Key
The most important aspect of interpreting the data is to use all measures at the same time. Each measure by itself has limitations, but these limitations can be overcome by looking at another measure. For example, suppose the emotion tracking shows anger or sadness in viewers, but there is nothing to be angry or sad about. A quick look at the eye tracking pattern shows that the viewers are looking all over the place, without a clear structure. This means the viewers are not angry or sad, they are confused, something that could not be determined by looking at the emotions alone.
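That combination rule can be sketched as a small decision: an anger/sadness reading plus a scattered gaze pattern is re-interpreted as confusion. The gaze-dispersion measure and its threshold here are illustrative assumptions, not our actual criteria.

```python
# Sketch of the combination rule: anger/sadness + scattered gaze -> confusion.
import statistics

def interpret(dominant_emotion, fixation_points, scatter_threshold=150.0):
    """fixation_points: list of (x, y) fixation centers for one screenshot."""
    xs = [p[0] for p in fixation_points]
    ys = [p[1] for p in fixation_points]
    # High spread on both axes = gaze wandering "all over the place".
    scattered = (statistics.pstdev(xs) > scatter_threshold
                 and statistics.pstdev(ys) > scatter_threshold)
    if dominant_emotion in ("anger", "sadness") and scattered:
        return "confusion"
    return dominant_emotion
```

A structured gaze pattern leaves the emotion reading as-is; only the combination of a negative expression with disorganized scanning triggers the re-interpretation.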
The power of our system is that these three measures combined can answer almost any question while leaving little room for doubt.