A self-experiment on experience sampling

June 1, 2024

Updated: June 6, 2024

14 min read

3305 words

For the past three months, I’ve consistently tracked my emotions multiple times a day. My aim was to explore a method known as experience sampling, which focuses on real-time, real-life data collection of mental states. In this post, I give a short introduction to experience sampling and its merits, discuss considerations in choosing a tool and a theoretical model, and talk in depth about some fascinating difficulties and challenges I encountered along the way.

Introduction to Experience Sampling

Experience sampling is a common method in psychological research for collecting real-time information as people go about their daily lives. In practical terms, you may be asked to enter some information into an app several times a day about what’s going on inside your head at that particular moment. It may not seem like much, but experience sampling is surprisingly powerful: even with super cool techniques for looking at brains, like functional MRI, you can (most likely) never avoid asking people about what is happening in their mind’s eye.

In mental health care, we often collect information using surveys that ask people how they felt over a certain period. “How often did you feel depressed in the last two weeks?” is a question we may ask to gauge someone’s mood. But if they’re feeling especially down on the day they complete the survey, they may unintentionally paint a bleaker picture of their mood than was true for that period. In addition, we don’t learn anything about how their mood developed over the course of that period: whether they usually feel depressed in the mornings, whether it has been getting worse, or whether there is a pattern that repeats itself.

With experience sampling, you ask someone how they are feeling several times a day over a longer period of time. Instead of just one data point, you now have dozens of them. The resolution has just been 20x’ed. Such detailed data can reveal valuable trends and patterns that can teach us much more about an individual’s state than a single measurement can.
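To make the point about patterns a bit more tangible, here is a minimal Python sketch of how repeated samples can expose a time-of-day pattern that a single retrospective answer would hide. The timestamps and mood scores are made up purely for illustration, and the cut-offs for morning, afternoon and evening are my own assumption.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical samples: (timestamp, mood score). Invented values, not real data.
samples = [
    (datetime(2024, 3, 4, 8, 30), -2),
    (datetime(2024, 3, 4, 13, 15), 0),
    (datetime(2024, 3, 4, 20, 45), 1),
    (datetime(2024, 3, 5, 9, 10), -1),
    (datetime(2024, 3, 5, 14, 0), 1),
    (datetime(2024, 3, 5, 21, 30), 2),
]


def part_of_day(ts):
    """Bucket a timestamp into a coarse part of the day (assumed cut-offs)."""
    if ts.hour < 12:
        return "morning"
    if ts.hour < 18:
        return "afternoon"
    return "evening"


def mean_mood_by_part_of_day(samples):
    """Average the mood scores within each part of the day."""
    buckets = defaultdict(list)
    for ts, mood in samples:
        buckets[part_of_day(ts)].append(mood)
    return {part: mean(scores) for part, scores in buckets.items()}


by_part = mean_mood_by_part_of_day(samples)
overall = mean(mood for _, mood in samples)

print(by_part)   # one average per part of the day
print(overall)   # the single number a retrospective summary would resemble
```

In this toy data the overall average is close to neutral, while the per-part averages show low mornings and better evenings, which is exactly the kind of pattern a single measurement cannot show.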

Experience sampling does not only hold promise in mental health care and research. There is also a lot to be learned by individuals about themselves. One such group is the Quantified Self community, who are dedicated to self-tracking to improve their lives. I have come across them on various internet dwellings, and again recently while doing a literature search for a review article I am writing. If I am to do research involving experience sampling, it seems natural for me to do some self-experimentation. So, I decided to pick my own emotions as a target, because a) I have them all the time, and b) it is generally useful to be more aware of them. Before I could start, I had to decide on two important prerequisites: a good app, and a good theoretical model for emotions.

Choosing an approach

Choosing an app

In the not so distant past, experience sampling was done using paper diaries. Later, Personal Digital Assistants like the Palm Tungsten were used, and nowadays, we use mobile apps. The landscape for experience sampling, or mood tracking, as such apps are usually called when intended for consumers, is filled with a myriad of tools that, unfortunately, often fall short in important ways. Most apps (on iOS) for mood tracking are too opinionated (like those focussed on gratitude journalling), too restricted, cluttered with useless features, costly (with monthly subscription fees), or careless with sensitive personal data. So, in the spirit of free and open-source software, I scoured this GitHub list of open-source iOS apps and found rTracker: “A generic, customizable personal data tracker”. Right up my alley. rTracker promised me ultimate freedom to design my experience sampling survey exactly as I want, and it delivered (this is not an ad!). A big shoutout to the app’s author Rob Miller, who has kindly and swiftly responded to my request for more data sharing options.

Choosing a model

The next question is: how do I capture my emotions? Which questions do I want to ask myself? Answering these questions requires a model. A good model must be valid, meaning it must capture my true emotional state accurately and precisely. It must also be fast. I felt that capturing my emotional state should not take more than a minute, ideally less. It must also be interpretable.

Models for emotions can generally be divided into two groups: those biased towards categories and those biased towards dimensions. Categorical models recognise certain core emotions - like happiness, sadness, fear and anger - and require people to determine which of these best matches how they’re feeling. Dimensional models locate emotions as points in an n-dimensional space. Those dimensions represent important properties that all possible emotional states share. Together, those dimensions are theorised to define, and distinguish between, most emotional states.

A common dimensional model is the circumplex model of affect. It proposes that emotions lie in a two-dimensional space defined by arousal (high or low) and valence (positive or negative). The arousal axis ranges from being asleep, calm, or drained to being awake, stimulated, or energised. The valence axis captures whether an emotional state is positive or negative, i.e., good or bad. For example, excitement is high arousal and positive valence, while sadness is low arousal and negative valence. The circumplex model has been around a long time (since 1980) and stands out for its simplicity 1. As a requirement, a good dimensional model should be able to accommodate and differentiate between the emotions used in categorical models. The circumplex model seems capable of doing this, with most core emotions occupying distinct spots around the valence-arousal circle (Figure 1).

Figure 1. A graphical representation of the circumplex model of affect

The Positive and Negative Affect Schedule, or PANAS, is another emotion model that measures 20 emotions: 10 positive and 10 negative. A big advantage of the PANAS over the circumplex model is that it recognises that positive and negative emotions can co-exist independently of one another, instead of sitting on a single valence axis. However, having to fill out the 20 items of the PANAS made me opt for the circumplex model, mostly for the sake of simplicity and speed.
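As a rough illustration of what "emotions as points in a valence-arousal space" means in practice, here is a small Python sketch. Only the signs of the two example emotions come from the text above (excitement: high arousal, positive valence; sadness: low arousal, negative valence). The class name, the function name and the exact magnitudes are my own assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affect:
    """A point in the circumplex model (hypothetical representation)."""
    valence: int  # negative = unpleasant, positive = pleasant
    arousal: int  # negative = low energy, positive = high energy


def quadrant(affect):
    """Describe which quadrant of the valence-arousal plane a point falls in."""
    if affect.valence == 0 or affect.arousal == 0:
        return "on an axis (neutral on at least one dimension)"
    arousal = "high arousal" if affect.arousal > 0 else "low arousal"
    valence = "positive valence" if affect.valence > 0 else "negative valence"
    return f"{arousal}, {valence}"


# Signs follow the examples in the text; the magnitudes are arbitrary.
EXAMPLES = {
    "excitement": Affect(valence=2, arousal=2),
    "sadness": Affect(valence=-2, arousal=-2),
}

for name, affect in EXAMPLES.items():
    print(name, "->", quadrant(affect))
```

The sketch only distinguishes quadrants. Telling apart two emotions within the same quadrant depends on the magnitudes, which is where things get harder.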

Setting up rTracker

A final important design consideration is which scale to use for the valence-arousal axes. I opted for a 7-point discrete scale ranging from -3 to 3 (including 0) for both axes. I added two extra items: influences, and comments (Figure 2). Influences are meant to represent factors I think influence my mood. Comments can be anything, from further explaining my emotions to giving feedback on the experience sampling process. I set rTracker to remind me three times during the day between 08:00 and 22:00, with at least 1 hour between reminders.
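For readers who want to reproduce a similar setup outside rTracker, here is a minimal Python sketch of the survey and the reminder constraints described above. This is not how rTracker works internally (I don't know its scheduling logic). The names `Entry`, `draw_reminders` and `as_clock` are hypothetical, and drawing times uniformly at random with retries is an assumption on my part.

```python
import random
from dataclasses import dataclass, field

SCALE = range(-3, 4)  # 7-point discrete scale: -3, -2, -1, 0, 1, 2, 3


@dataclass
class Entry:
    """One tracked moment: two scale items plus two free-form items."""
    valence: int
    arousal: int
    influences: list = field(default_factory=list)
    comment: str = ""

    def __post_init__(self):
        # Reject values that fall outside the 7-point scale.
        for name in ("valence", "arousal"):
            value = getattr(self, name)
            if value not in SCALE:
                raise ValueError(f"{name} must be an integer from -3 to 3, got {value!r}")


def draw_reminders(rng, count=3, start=8 * 60, end=22 * 60, min_gap=60):
    """Draw `count` reminder times (minutes since midnight) between start and end.

    Uses simple rejection sampling: redraw until all gaps are >= min_gap.
    """
    while True:
        times = sorted(rng.randint(start, end) for _ in range(count))
        if all(later - earlier >= min_gap for earlier, later in zip(times, times[1:])):
            return times


def as_clock(minutes):
    """Format minutes since midnight as HH:MM."""
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


rng = random.Random(0)  # fixed seed so the example is repeatable
print([as_clock(t) for t in draw_reminders(rng)])
print(Entry(valence=1, arousal=-2, influences=["sleep"], comment="slow morning"))
```

Validating the scale on entry keeps later analysis simple, because every stored value is guaranteed to be one of the seven allowed points.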

Figure 2. My mood tracker as set up in rTracker

Observations

I started tracking in March. It is now June, roughly three months later, and I feel like I’ve had enough exposure to write about it. I won’t show any actual data, although I’m thinking of ways to present it that I feel comfortable with. Rather, I want to use the occasion to write about the things I noticed about this particular experience sampling approach. Some things are specific to the circumplex model, some things generalise to experience sampling or self-tracking at large.

Which is the emotion?

One of the first things I noticed was how difficult it is to know which part of conscious experience is the emotion. Both the valence and the arousal axis appear problematic, but in different ways.

For the arousal axis, I have noticed that basic bodily states like general energy level, hunger, and sleep deprivation interfere heavily with it. If I feel restless, it may be that I’m mostly hungry. Should that count as high arousal? What if I feel tired and anxious at the same time? Which one takes precedence? In a similar vein, feeling emotionally drained and feeling tired are two phenomena that are easily conflated.

For the valence axis, thoughts superimposed on the experienced emotion often complicate its interpretation. For example, I may feel deep satisfaction after an intense workout despite serious physical discomfort (e.g., feeling nauseated). At the same time, I may feel some anxiety in the back of my mind because of an unrelated problem at work or in the family.

It seems like several separate systems (e.g., cortical, limbic, subcortical) together manifest themselves in conscious experience, each in its own way resembling an experienced emotion. The degree to which these can be disentangled by conscious thought appears to differ depending on the system from which the interference arises. For the valence axis, interference appears to arise mostly from cortical systems. My theory is that, because this interference already comes in the shape of thoughts, which are more interpretable, it is easier to disentangle from the experienced emotion. For the arousal axis, the problem seems more persistent. Interference comes in the shape of sensations felt in the body or in the head, which are less cognitive and thus harder to disentangle from feelings of arousal, which are highly similar in their presentation.

A second problem with the arousal axis is that it not only appears to have a value, but also a location in the body. Experienced emotion is embodied. Regardless of the theory of emotion you prefer, it seems hard to deny that emotions coincide with bodily sensations of tension or stress (whatever you want to call it). Intense arousal in one part of the body may reflect a different emotion than intense arousal in another part of the body.

The next section will expand on the effect of observation (i.e., reflection), a prerequisite for tracking, on the nature of the emotion itself.

Schrödinger’s Cat

The act of observing can change the nature of the emotion. For example, I’ve had moments where I’d feel energised and alert, but as soon as I started thinking about what to track, that very act seemed to tip the scale towards a negative valence feeling, perhaps best described as tense or anxious. On other occasions where I’d feel contented, becoming aware could increase the intensity of the feeling. Which factors appear to determine this effect? Well, it seems that emotions with high absolute arousal values often have their valence negatively impacted by observation. Observation can change slightly pleasant, high absolute arousal states to slightly unpleasant. For low absolute arousal states, it seems that observation is more likely to make valence move in either direction.

If observation impacts the emotion, it raises the question: should we log the emotion we felt before or during the tracking? If we log the emotion from before the tracking, we may introduce noise in the form of latency (i.e., time between the emotion and the tracking). Logging the emotion we feel during the tracking, on the other hand, reduces its accuracy (i.e., validity), because the act of observation can change the emotion in a way that is a) unpredictable, and b) not what we are looking to measure. In research, people often use the terms random error and systematic error. Random error refers to variation in the data that arises from small, random fluctuations, such as the noise introduced by the reduced immediacy of an emotion when logging it after it has changed to something different. Systematic error, on the other hand, refers to consistent inaccuracies introduced by factors like the act of observation changing the emotion. Which do we minimise at the expense of the other here? I am inclined to minimise systematic error: its effect on any single entry is hard to predict and correct for, and it does not shrink with more measurements, whereas random error is likely to average out over time because we measure very often 2.
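The difference between the two kinds of error can be illustrated with a small simulation. This is a toy sketch under strong assumptions: a constant "true" valence, Gaussian noise for the recalled (pre-tracking) emotion, and a fixed negative shift for the emotion logged during observation. All numbers and names are invented, and values are treated as continuous rather than as points on the 7-point scale.

```python
import random
from statistics import mean


def simulate_logs(true_valence=1.0, n=270, recall_noise_sd=1.0,
                  observation_bias=-0.5, observation_noise_sd=0.3, seed=42):
    """Return the average logged valence under two logging strategies.

    recalled: log the emotion from before tracking (random error only)
    observed: log the emotion during tracking (systematic shift plus a little noise)
    """
    rng = random.Random(seed)
    recalled = [true_valence + rng.gauss(0, recall_noise_sd) for _ in range(n)]
    observed = [true_valence + observation_bias + rng.gauss(0, observation_noise_sd)
                for _ in range(n)]
    return mean(recalled), mean(observed)


recalled_mean, observed_mean = simulate_logs()
print(f"true valence:        {1.0:.2f}")
print(f"mean of recalled:    {recalled_mean:.2f}")  # close to the true value
print(f"mean of observed:    {observed_mean:.2f}")  # stays shifted by roughly the bias
```

With roughly three entries a day for three months (about 270 entries), the noisy-but-unbiased average lands near the true value, while the biased average stays off by about the size of the shift no matter how many entries are added. The caveat in footnote 2 still applies: if recall is itself biased, the first strategy is not purely random error either.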

Fooling Yourself

Sometimes I’d get a notification while someone was looking over my shoulder. The watchful eye of someone whose emotions you’re mindful of is likely to distort the answers you give, consciously or unconsciously. I distinctly remember one occasion where I felt quite drained after a long day at work. My partner was watching over my shoulder, and I realised I was downplaying how low an arousal state I was actually in. Perhaps partly to not disrupt her mood or create concern, but also because I was unconsciously replacing the difficult question of how I was feeling that day with the easier question of how I usually feel in her presence. It is easy to draw an analogy here to the duality of Kahneman’s system 1 and system 2 3. System 1, our fast, automatic, and intuitive mode, often overrides system 2, which is slower, more deliberate, and analytical. In this case, in the presence of my partner, system 1 facilitated a quick, simplified response by defaulting to a familiar pattern of behaviour. Instead of engaging system 2 to consciously assess my true emotional state, I substituted the more complex question of my actual feelings with the easier, habitual response dictated by system 1. This substitution, though less accurate, offered a socially acceptable answer that avoided potential discomfort or concern. It is easy to see how this distortion could be much larger when there is dissatisfaction in a relationship.

I’ve noticed similar effects when alone. Especially in a high absolute arousal / low absolute valence state, which, as I described, appears to be more unstable, I started to anticipate the apparent negative effect of observation and logged the emotion in a more favourable way to prevent tipping the scale towards negative.

Only after reading a study by Ida-Marie Arendt and colleagues, who call these metacognitive beliefs, did I become able to put this phenomenon into words. They write the following:

“Self-report risks capturing metacognitive beliefs about the symptom rather than the phenomenon itself” 4

Instead of answering the question of how I was feeling at that moment in time (the phenomenon itself), I would answer a different question (the answer to which is the metacognitive belief).

Some examples of questions that ask for metacognitive beliefs instead of the phenomenon itself:

  • How do I want to feel right now?
  • What should I log to prevent feeling worse?
  • How do I usually feel in a similar situation?
  • How am I supposed to feel right now?

The last one, “how am I supposed to feel right now?”, seems particularly hard to spot, especially when pressed for time. I do believe that you can get better at spotting obvious metacognitive beliefs, but solving the issue seems pretty slippery. How do we know that we aren’t replacing a metacognitive belief with a less obvious one? Do we ever stop fooling ourselves?

Figure 3. The meme known as 'Wait, It's All Ohio? Always Has Been', which I believe describes the slippery nature of identifying metacognitive beliefs pretty accurately.

Although it is easy to fall prey to this issue of apparent infinite regress, it may still be worth our time to find ways to reduce the impact of metacognitive beliefs on the phenomenon of interest. The impact of each subsequent self-assessment may not be equal, leading to some sort of convergent infinite regress. Regardless, the possibility of self-knowledge is an interesting topic. I’ve enjoyed reading through this Stanford overview on phenomenological approaches to self-consciousness. The thing that caught my attention is the position that reflection (the interpretation of lived experience within yourself) involves both a gain and a loss. The interpretation alters and distorts the lived experience, but it may also add valuable information to it.
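One way to make the idea of a convergent regress concrete is a geometric series. This is purely an illustrative assumption on my part, not something taken from the literature: suppose the first layer of self-assessment distorts the answer by an amount d_0, and every further layer distorts it by a fixed fraction r of the layer before it.

```latex
\[
  D_{\text{total}} \;=\; \sum_{k=0}^{\infty} d_0\, r^{k} \;=\; \frac{d_0}{1 - r},
  \qquad 0 \le r < 1,
\]
\[
  \frac{\sum_{k=0}^{n-1} d_0\, r^{k}}{D_{\text{total}}} \;=\; 1 - r^{n}.
\]
```

Under that assumption the total distortion is finite even though the layers never end, and the first few layers carry most of it. With r = 1/2, for example, the first three layers account for 1 - (1/2)^3 = 87.5% of the total. Whether real self-assessment behaves anything like this is an open question.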

Models be modelling

The circumplex model is intuitive to understand, and as I describe above, it appears to accommodate many of the core emotions quite well (see Figure 1). But that does not mean that locating specific experienced emotions within the model is straightforward. An accurate model does not need to be easy to use. In fact, my version of the circumplex model turned out to be quite difficult to use for several reasons. Firstly, with scales ranging from -3 to 3, the space in which emotions can be localised is small and has low resolution. Distinguishing between one emotion and an entirely different one can be as subtle as choosing between a 1 or a 2 on a single sub-scale. Secondly, a previous entry will often linger in your head, leading you to ask: is my current emotion higher or lower in valence or arousal? If you want to remain consistent with previous entries, answering this question can significantly restrict the usable part of the answer space. Thirdly, the ease of use of this model, which may be used by anyone, depends on its intuitiveness. Most people, I suspect (myself included), conceptualise emotions as categories, not as points on the 2D valence-arousal grid. This is best seen in the way we speak about emotions. We categorise them. We say: “I am angry”, not “I am heavily aroused in a negative way”. If mapping the categories to the axes is not obvious, the data can become confusing and inaccurate.

Summary & Conclusion

In summary, there are several challenges with sampling experienced emotions using the circumplex model of affect. There is the problem of disentangling apparently cortical and subcortical influences on experienced emotion, which make both the valence and the arousal axis hard to interpret, each in its own way. An additional problem for the arousal axis is the fact that arousal is usually felt somewhere in the body, something that the circumplex model does not account for. Then there is the unpredictable influence of the act of observation, or reflection, on the experienced emotion. Finally, self-deception for social or ego reasons can interfere with our ability to know our own emotions.

What’s next

The next self-experiment (one that is within arm’s reach) is probably going to involve using a slower, but properly validated, model for tracking experienced emotions. I’m curious how much of a difference such a model will make. The I-PANAS-SF (International PANAS Short Form) seems to be a promising candidate for this 5. It has 10 items, 5 positive and 5 negative in valence. I suspect that, although this seems 5x as long, it isn’t going to take 5x as long to fill in. In time, I’ll learn to recognise items that are relevant to my current state faster and faster.

It may be that trying to tackle self-deception in reflection is akin to peeling an unending onion, where, as we zoom in, problems just proliferate endlessly like some fractal tree structure. Still, if the importance of layers shows a decreasing trend, where the impact of earlier layers proves larger than that of subsequent layers, the process can still lead to meaningful improvements in real life.

Needless to say, I am by no means an expert in this field and it is to be expected that parts of this post are naive or uninformed. If you know of any sources I might find interesting to read, please let me know!

Bibliography & Footnotes


  1. Posner, J., Russell, J. A., & Peterson, B. S. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 17(3), 715–734. https://doi.org/10.1017/S0954579405050340
  2. Admittedly, looking back at the pre-tracking emotion could just as well be influenced by memory bias that favours the current emotion, so the error may not be entirely random.
  3. Kahneman, D. (2011). Thinking, fast and slow. Farrar, Straus and Giroux.
  4. Arendt, I.-M. T. P., Riisager, L. H. G., Larsen, J. E., Christiansen, T. B., & Moeller, S. B. (2021). Distinguishing between rumination and intrusive memories in PTSD using a wearable self-tracking instrument: A proof-of-concept case study. The Cognitive Behaviour Therapist, 14, e15. https://doi.org/10.1017/S1754470X2100012X
  5. Thompson, E. R. (2007). Development and Validation of an Internationally Reliable Short-Form of the Positive and Negative Affect Schedule (PANAS). Journal of Cross-Cultural Psychology, 38(2), 227–242. https://doi.org/10.1177/0022022106297301