Test Explanation
This test evaluates whether several different AI models can accurately classify a selection of emotions and states of mind within an original piece of creative writing. This piece is framed as a narrative excerpt from a short story about a character named Charlie, who experiences a complex and intertwined array of specific emotions and states of mind, several at a time, across different stages of the story.
The models tested here include OpenAI’s o1, GPT-4o, and o3-mini; Anthropic’s Claude 3.5 Sonnet; and DeepSeek’s DeepThink-R1. We use the same prompt for each model, which is intentionally designed to be difficult, subtle, and somewhat ambiguous, pushing the models to make uncertain inferences regarding the main character’s emotional experience throughout the narrative. We have adopted this approach because it roughly approximates the real-world dynamics of emotional intelligence: as humans, we often have intuitions, based on limited information, about what someone else may be feeling and experiencing, though we rarely know for certain whether our intuitions are true.
However, in our prompt, we do provide a list that illustrates each emotion and state of mind Charlie experiences. Since humans typically don’t make sense of others’ emotional experiences by listing out the possible characteristics of their experience and then selecting the “best” possible options, readers may interpret this as giving a “hint” to the model. We chose this approach for two reasons: 1) it allows us to more coherently and consistently compare task performance across each model (i.e., the task is constrained by a specific set of parameters), and 2) it enables us to isolate the reasoning process behind each attribution of a specific emotion or state of mind, effectively forcing the model into passing an emotional judgment and justifying it.
The short story excerpt in our prompt is divided into three paragraphs, each of which is designed to represent a series of emotions and states of mind. Below, we list the emotions and states of mind each paragraph captures in the order in which they appear in the given paragraph. In this list, we also include the specific phrases that denote where these emotions and states of mind should be attributed.
Paragraph 1: Confusion, discomfort, insecurity, and intimidation.
- Confusion → “he couldn’t help but wonder” and “His brows furrowed as the corners of his mouth twitched unpredictably”
- Discomfort → “he noticed how difficult it was to maintain her gaze” and “It wasn’t that she couldn’t hold eye contact—precisely the opposite”
- Insecurity → “but now, they looked through him, beyond his own, centering on foreign objects that only ran further away”
- Intimidation → “His seat felt lower than hers”
Paragraph 2: Acceptance, anxiety, nostalgia, and disbelief.
- Acceptance → “He had known her for long enough to understand that she would continue talking if he didn’t interject”
- Anxiety → “he began picking at the skin above his fingernails”
- Nostalgia → “chastising him as if she were there”
- Disbelief → “The thought of her impossible presence”
Paragraph 3: Melancholy, contempt, irony, and cynicism.
- Melancholy → “an estranged sense of peace” and “Leaving the memories of his mother behind like fallen leaves”
- Contempt → “or more aptly, her monologue” and “Despite the bird-like cadence of her voice”
- Irony → “questioning whether she would be able to learn from him in spite of her brilliance”
- Cynicism → “Once, he was called a smart man by an even smarter man—all he could do was laugh.”
With this particular test, we did not expect any of the models we evaluated to classify each emotion and state of mind with perfect or even moderate accuracy—we are, after all, operating within a rather subjective realm of interpretation, and we couldn’t expect humans to perform this task with perfect accuracy either. We are, nonetheless, interested in a) whether models are able to classify emotions and states of mind in the right order in each individual paragraph, and b) to what extent the explanations for each individual classification correspond with the explanatory “evidence” (i.e., the specific phrase that denotes an emotion or state of mind) we provide above.
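To make criterion (a) concrete, here is a minimal Python sketch. The answer key is transcribed from the paragraph lists above, and the example attributions are Claude 3.5 Sonnet's, as reproduced in the Outputs section. The function name `score_paragraph`, and the choice to judge order only over correctly placed labels, are illustrative assumptions rather than the procedure we followed when reviewing the outputs by hand.

```python
# Answer key transcribed from the paragraph lists above, in intended order.
ANSWER_KEY = {
    1: ["confusion", "discomfort", "insecurity", "intimidation"],
    2: ["acceptance", "anxiety", "nostalgia", "disbelief"],
    3: ["melancholy", "contempt", "irony", "cynicism"],
}


def score_paragraph(expected, predicted):
    """Return (number of correctly placed labels, whether they follow key order).

    Assumption: order is judged only over labels that belong in the paragraph,
    so a misplaced label lowers the count but does not affect the order flag.
    """
    predicted = [label.lower() for label in predicted]
    hits = [label for label in predicted if label in expected]
    key_order = [label for label in expected if label in hits]
    return len(hits), hits == key_order


# Claude 3.5 Sonnet's attributions, as reproduced in the Outputs section.
claude = {
    1: ["confusion", "intimidation", "discomfort", "insecurity"],
    2: ["acceptance", "anxiety", "nostalgia", "disbelief"],
    3: ["contempt", "cynicism", "irony", "melancholy"],
}

for paragraph, expected in ANSWER_KEY.items():
    correct, in_order = score_paragraph(expected, claude[paragraph])
    print(f"Paragraph {paragraph}: {correct}/{len(expected)} placed, "
          f"order matches key: {in_order}")
```

Criterion (b), the match between a model's explanation and the key phrases, is harder to automate and is left to human judgment in this sketch.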
In the section below, we provide our prompt (highlighted in red) in the exact format in which it was inputted, followed by each model response (highlighted in blue) in the exact format it was outputted (notwithstanding font type). We’ll conclude with a short reflection on the results we obtained.
Prompt
The same for all models
Below, is an excerpt of a short story written about a fictional character named “Charlie”. For each paragraph of this story, I want you to state the states of mind and/or emotion(s) that you think Charlie is experiencing.
The list below includes each emotion and state of mind that Charlie experiences, however, you will have to figure out in which paragraphs these emotions and states of mind are represented.
List:
- Confusion
- Acceptance
- Intimidation
- Disbelief
- Anxiety
- Discomfort
- Irony
- Nostalgia
- Insecurity
- Cynicism
- Melancholy
- Contempt
For example, if you have reason to think that Charlie is experiencing nostalgia, cynicism, and confusion in paragraph three, your statement should read, “In paragraph three, Charlie is experiencing nostalgia, cynicism, and confusion.”
You are not required to limit the emotions and states of mind you attribute to a given paragraph to any specific number, but you must ensure that each emotion and state of mind included in the list is attributed to one of the three paragraphs.
Take your time and think carefully about the answers you provide. Explain your reasoning using a detailed chain-of-thought.
Short Story Excerpt:
“As Charlie listened to her, he couldn’t help but wonder why she had waited so long to tell him. His brows furrowed as the corners of his mouth twitched unpredictably, and in a brief moment of self-awareness, he noticed how difficult it was to maintain her gaze. It wasn’t that she couldn’t hold eye contact—precisely the opposite—her coffee-cream irises were always a welcome appearance from afar, but now, they looked through him, beyond his own, centering on foreign objects that only ran further away. His seat felt lower than hers, even though he knew it wasn’t.
He had known her for long enough to understand that she would continue talking if he didn’t interject, and he had learned that this was all the better, for he would feel the need to speak as soon as he had something worthwhile to say, and even more so, something to make sense of. As more questions and fewer answers presented themselves to him, he began picking at the skin above his fingernails, hearing his mother’s nagging voice whispering over his shoulder, chastising him as if she were there. The thought of her impossible presence—she died many years ago—brought him an estranged sense of peace.
Leaving the memories of his mother behind like fallen leaves, he returned to their conversation, or more aptly, her monologue. Despite the bird-like cadence of her voice, he found it soothing—he imagined that if they were in the sky together, perhaps he’d be able to show her something she didn’t know already. He reveled in this thought, questioning whether she would be able to learn from him in spite of her brilliance while knowing full well that he had always had much more to teach than he let on. Once, he was called a smart man by an even smarter man—all he could do was laugh.“
Outputs
o1 (thought for 2m 1s)
Below is a concise mapping of each paragraph to the emotions/states of mind Charlie seems to experience, followed by a detailed explanation of how one might arrive at these conclusions from the except:
Paragraph One Charlie is experiencing confusion, disbelief, intimidation, discomfort, and insecurity.
Paragraph Two Charlie is experiencing anxiety, nostalgia, and melancholy.
Paragraph Three Charlie is experiencing acceptance, irony, cynicism, and contempt.
Detailed Reasoning (Chain-of-Thought)
Paragraph One
- Confusion and Disbelief
Charlie wonders why she waited so long to tell him. The phrasing (“he couldn’t help but wonder why…”) and the furrowed brows suggest genuine puzzlement. This pondering over her delay implies he’s not only perplexed but also finding it hard to accept that she truly held back such information until now.
- Intimidation and Insecurity
Charlie feels as though he’s literally seated “lower” than she is, though he knows the chairs are of equal height. This exaggerated sense of being “beneath” her, coupled with difficulty maintaining eye contact, signals he feels overshadowed or outmatched—classic signs of intimidation and insecurity.
- Discomfort
Physical cues (furrowed brows, twitching mouth) and the awkwardness of sustaining her gaze point to a state of unease. He is not at ease in this conversation, further shown by the sense that her gaze goes “beyond” him.
Paragraph Two
- Anxiety
Charlie starts picking at the skin above his fingernails, a common nervous habit. He does this in response to having “more questions and fewer answers,” revealing inward tension and worry.
- Nostalgia
He hears his deceased mother’s “nagging voice” as though she were right behind him. This recollection, particularly of a loved one who is gone, indicates a longing or wistful affection for the past.
- Melancholy
Remembering his mother (who died years ago) brings a “strange sense of peace,” but it’s tinged with sorrow, as the memory underscores her absence. This mixture of calm and sadness is characteristic of melancholy.
Paragraph Three
- Acceptance
Charlie transitions away from memories of his mother, leaving them “behind like fallen leaves,” returning his focus to the present. The image of gently letting go suggests an acceptance of her loss and of the current situation.
- Irony
He recalls being called “a smart man by an even smarter man,” and all he could do was laugh. There is an ironic tinge to this moment—being praised by someone supposedly more knowledgeable both flatters and pokes fun at his self-image.
- Cynicism
Charlie wonders if “she would be able to learn from him” despite her brilliance, while knowing he has “much more to teach than he let on.” There’s a slight cynical edge in suspecting that people might not be open to what he has to offer or that knowledge exchange might be one-sided.
- Contempt
Embedded in that cynicism is a hint of dismissiveness or contempt—he is both amused by the idea that others call him smart (as though such labels might be superficial) and subtly looks down on the notion that this brilliant woman may or may not be able to learn from him.
By distributing the listed emotions and states of mind across these three paragraphs and drawing on textual details—Charlie’s bodily reactions, inner thoughts, and references to his past—we can see how all twelve items in the list manifest at some point in the excerpt.
GPT-4o
Here is my analysis of Charlie’s states of mind and emotions throughout each paragraph:
Paragraph One
Confusion, Intimidation, Discomfort
- Charlie’s confusion is evident as he wonders why she had waited so long to tell him. His body language—furrowed brows and an inability to hold her gaze—suggests that he is struggling to process the information she is sharing.
- The sense of intimidation comes from the description of her gaze. Rather than simply meeting his eyes, her stare moves beyond him, making him feel unseen or insignificant. The perception that his seat is lower than hers, despite knowing that it isn’t, signals a psychological power imbalance, reinforcing the intimidation.
- Discomfort arises from both his reaction to her and the difficulty he has maintaining eye contact, suggesting unease in the interaction.
Paragraph Two
Disbelief, Anxiety, Nostalgia, Insecurity
- Disbelief emerges as more questions form in Charlie’s mind without clear answers. His need to make sense of the situation highlights his struggle to accept what he is hearing.
- Anxiety is evident in his compulsive habit of picking at the skin above his fingernails, which is further reinforced by the intrusive memory of his mother’s voice chastising him.
- Nostalgia comes through in his brief reflection on his mother, as he momentarily immerses himself in a memory that feels real despite her passing many years ago.
- Insecurity is present in the way he waits to speak only when he has something worthwhile to say, suggesting self-doubt about his ability to contribute meaningfully to the conversation.
Paragraph Three
Melancholy, Irony, Cynicism, Acceptance, Contempt
- Melancholy is present in the poetic imagery of fallen leaves as he leaves his memories of his mother behind. His thoughts carry a wistful, reflective quality.
- Irony is embedded in his imagined scenario of teaching her something, despite knowing that he rarely reveals his own intelligence. This is further emphasized in the last line, where he was once called a smart man by someone even smarter—only to laugh at the notion.
- Cynicism emerges in his self-awareness about withholding his intelligence, possibly as a means of protecting himself or avoiding engagement.
- Acceptance is seen in his decision to return to their conversation, letting her monologue continue rather than pushing for understanding. He seems to surrender to the moment, accepting the dynamic as it is.
- Contempt is subtly woven into his internal questioning of whether she could learn from him, despite recognizing her brilliance. The way he relishes this thought suggests an underlying frustration or superiority complex.
Final Summary
Charlie undergoes a complex emotional journey throughout the passage. He begins with confusion, intimidation, and discomfort in the first paragraph, struggling to process the situation. As the tension builds, disbelief, anxiety, nostalgia, and insecurity take hold, revealing his inner turmoil and self-doubt. In the final paragraph, his emotions shift toward melancholy, irony, cynicism, acceptance, and contempt, reflecting a mix of resignation, reflection, and quiet defiance.
o3-mini (thought for 20s)
Below is my analysis and final attribution of the emotions and states of mind across the three paragraphs.
Paragraph One:
“As Charlie listened to her, he couldn’t help but wonder why she had waited so long to tell him. His brows furrowed as the corners of his mouth twitched unpredictably, and in a brief moment of self-awareness, he noticed how difficult it was to maintain her gaze. It wasn’t that she couldn’t hold eye contact—precisely the opposite—her coffee-cream irises were always a welcome appearance from afar, but now, they looked through him, beyond his own, centering on foreign objects that only ran further away. His seat felt lower than hers, even though he knew it wasn’t.”
In paragraph one, Charlie is experiencing confusion, insecurity, intimidation, and discomfort.
Chain-of-thought:
- Confusion: He wonders why she waited so long to tell him, and her shifting, almost disconcerting gaze leaves him puzzled about the situation.
- Insecurity: His self-conscious physical cues—the furrowed brows, twitching mouth, and even the sensation of his seat feeling lower—signal a lack of confidence in his position relative to her.
- Intimidation: Her intense, almost piercing gaze (described as looking “through him”) suggests that he feels overwhelmed or diminished in her presence.
- Discomfort: The overall unsettling experience, from the odd physical sensations to the feeling of being overlooked, creates an underlying discomfort.
Paragraph Two:
“He had known her for long enough to understand that she would continue talking if he didn’t interject, and he had learned that this was all the better, for he would feel the need to speak as soon as he had something worthwhile to say, and even more so, something to make sense of. As more questions and fewer answers presented themselves to him, he began picking at the skin above his fingernails, hearing his mother’s nagging voice whispering over his shoulder, chastising him as if she were there. The thought of her impossible presence—she died many years ago—brought him an estranged sense of peace.”
In paragraph two, Charlie is experiencing anxiety, nostalgia, and acceptance.
Chain-of-thought:
- Anxiety: His nervous habit of picking at the skin above his fingernails and the pressure to interject at just the right moment reveal a jittery, anxious state.
- Nostalgia: The sudden, almost ghostly return of his mother’s nagging voice evokes memories of the past, stirring a bittersweet reminiscence for someone long gone.
- Acceptance: Despite the tension and the emotional intrusion of his mother’s memory, he finds an “estranged sense of peace” in it—a quiet, resigned acceptance of this inner presence.
Paragraph Three:
“Leaving the memories of his mother behind like fallen leaves, he returned to their conversation, or more aptly, her monologue. Despite the bird-like cadence of her voice, he found it soothing—he imagined that if they were in the sky together, perhaps he’d be able to show her something she didn’t know already. He reveled in this thought, questioning whether she would be able to learn from him in spite of her brilliance while knowing full well that he had always had much more to teach than he let on. Once, he was called a smart man by an even smarter man—all he could do was laugh.”
In paragraph three, Charlie is experiencing disbelief, irony, cynicism, melancholy, and contempt.
Chain-of-thought:
- Disbelief: His inner questioning about whether she could learn anything from him, despite her brilliance, hints at an underlying incredulity about the dynamics of their interaction.
- Irony: The very idea that he might “teach” her—coupled with the memory of being acknowledged by an even smarter man—carries an ironic twist, as he contrasts past compliments with the present conversation.
- Cynicism: The reduction of the dialogue to a “monologue” and his self-assured musings about his own undisclosed knowledge suggest a jaded, cynical view of the exchange.
- Melancholy: The act of “leaving the memories of his mother behind like fallen leaves” carries a subtle sadness—a letting go tinged with regret or wistfulness—even as he engages in the present conversation.
- Contempt: His amusement and dismissive laughter at the notion of being labeled smart by someone smarter reveal a certain contempt—not only for that past judgment but also for the current imbalance he perceives in the conversation.
This distribution ensures that each emotion from the list appears in one of the three paragraphs, with the reasoning rooted in details from the text.
Claude 3.5 Sonnet
Let me analyze each paragraph carefully, thinking through the emotional states and mindsets that Charlie exhibits:
Paragraph 1: Let me think through this step by step:
- Charlie’s furrowed brows and twitching mouth suggest internal turmoil
- His difficulty maintaining eye contact indicates discomfort
- The perception that his seat feels lower suggests feeling intimidated
- The questioning of timing (“why she had waited so long”) indicates confusion
- Her eyes looking “through him” and “beyond his own” suggests he feels insecure
Therefore: In paragraph 1, Charlie is experiencing confusion, intimidation, discomfort, and insecurity.
Paragraph 2: Let’s analyze:
- His learned behavior of letting her talk shows acceptance of their dynamic
- Picking at his skin suggests anxiety
- The memory of his mother’s voice and finding peace in her impossible presence indicates nostalgia
- The phrase “more questions and fewer answers” suggests disbelief about the situation
Therefore: In paragraph 2, Charlie is experiencing acceptance, anxiety, nostalgia, and disbelief.
Paragraph 3: Breaking this down:
- The “bird-like cadence” description followed by imagining teaching her shows contempt masked as admiration
- “Knowing full-well that he had always had much more to teach” suggests cynicism about their dynamic
- His response of laughing when called smart by a smarter man indicates irony
- The phrase “leaving the memories of his mother behind like fallen leaves” suggests melancholy
Therefore: In paragraph 3, Charlie is experiencing contempt, cynicism, irony, and melancholy.
This analysis accounts for all twelve emotional states/mindsets from the provided list, with each being attributed to a specific paragraph based on textual evidence and psychological interpretation.
DeepThink-R1
In paragraph one, Charlie is experiencing confusion, intimidation, and discomfort.
- Confusion: He wonders why she waited so long to tell him, indicating uncertainty.
- Intimidation: Her penetrating gaze makes him feel scrutinized, and he perceives his seat as “lower than hers,” symbolizing inferiority.
- Discomfort: Physical reactions (twitching mouth, difficulty maintaining eye contact) and the strained dynamic highlight unease.
In paragraph two, Charlie is experiencing anxiety, nostalgia, melancholy, and disbelief.
- Anxiety: Nervous habits like picking at his skin and hearing his mother’s voice reflect stress.
- Nostalgia: The memory of his mother’s voice evokes longing for the past.
- Melancholy: The “estranged sense of peace” from her absence carries bittersweet sadness.
- Disbelief: The “impossible presence” of his deceased mother introduces a fleeting refusal to accept reality (stretching the term slightly, as this borders on surrealism).
In paragraph three, Charlie is experiencing irony, cynicism, insecurity, acceptance, and contempt.
- Irony: He imagines teaching someone brilliant, yet laughs when called “smart” by a “smarter man,” highlighting a paradoxical self-view.
- Cynicism: Doubts she could learn from him despite his hidden knowledge suggest distrust in mutual growth.
- Insecurity: Questioning his value (“would she learn from me?") reveals self-doubt.
- Acceptance: His resigned laughter signals acknowledgment of his perceived inadequacy.
- Contempt: The laugh could imply disdain for the superficial label of “smart” or the hierarchy implied by the “smarter man” (a stretch, but necessary to include all terms).
Note: Disbelief and contempt are weaker fits here, but included to satisfy the requirement that all listed emotions be attributed. The story more naturally aligns with the other emotions.
Reflection
The table below shows how accurately each model identified the emotions and states of mind captured within each individual paragraph of the short story.
Each of the five models we tested here (o1, GPT-4o, o3-mini, Claude 3.5 Sonnet, and DeepThink-R1) performed far better than expected. There are, however, several takeaways worth considering:
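The paragraph placements can also be tallied directly from the model responses reproduced in the Outputs section. The Python sketch below assumes the simplest possible metric: how many of the twelve labels each model placed in the intended paragraph. This may differ from the metric behind the table, and the names `OUTPUTS` and `correct_placements` are illustrative.

```python
# Answer key from the Test Explanation section (order ignored here).
ANSWER_KEY = {
    1: {"confusion", "discomfort", "insecurity", "intimidation"},
    2: {"acceptance", "anxiety", "nostalgia", "disbelief"},
    3: {"melancholy", "contempt", "irony", "cynicism"},
}

# Paragraph attributions transcribed from each model's response above.
OUTPUTS = {
    "o1": {
        1: {"confusion", "disbelief", "intimidation", "discomfort", "insecurity"},
        2: {"anxiety", "nostalgia", "melancholy"},
        3: {"acceptance", "irony", "cynicism", "contempt"},
    },
    "GPT-4o": {
        1: {"confusion", "intimidation", "discomfort"},
        2: {"disbelief", "anxiety", "nostalgia", "insecurity"},
        3: {"melancholy", "irony", "cynicism", "acceptance", "contempt"},
    },
    "o3-mini": {
        1: {"confusion", "insecurity", "intimidation", "discomfort"},
        2: {"anxiety", "nostalgia", "acceptance"},
        3: {"disbelief", "irony", "cynicism", "melancholy", "contempt"},
    },
    "Claude 3.5 Sonnet": {
        1: {"confusion", "intimidation", "discomfort", "insecurity"},
        2: {"acceptance", "anxiety", "nostalgia", "disbelief"},
        3: {"contempt", "cynicism", "irony", "melancholy"},
    },
    "DeepThink-R1": {
        1: {"confusion", "intimidation", "discomfort"},
        2: {"anxiety", "nostalgia", "melancholy", "disbelief"},
        3: {"irony", "cynicism", "insecurity", "acceptance", "contempt"},
    },
}


def correct_placements(attributions):
    """Count labels that a model placed in the paragraph the key assigns them to."""
    return sum(len(ANSWER_KEY[p] & labels) for p, labels in attributions.items())


for model, attributions in OUTPUTS.items():
    print(f"{model}: {correct_placements(attributions)}/12 placed as intended")
```

A count like this captures placement only; it says nothing about the quality of the reasoning behind each attribution, which is what most of the takeaways below focus on.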
- Claude 3.5 Sonnet was the only model that was able to classify each emotion and state of mind correctly in each individual paragraph. While it didn’t get the order of emotions and states of mind quite right, except in paragraph 2, it did offer reasonably convincing evidence for its attributions, particularly for contempt, irony, and disbelief, showcasing the degree of subjectivity often involved in emotional intelligence. At a deeper level, however, the explanations it gave for its classifications were overly simplistic and relied too heavily on evidence as opposed to logic.
- o3-mini slightly outperformed both GPT-4o and o1, offering the most thoughtful and literary explanations for each attribution it made and invoking textual evidence appropriately despite occasionally drawing incorrect conclusions. Interestingly, all three models struggled with classifying contempt and disbelief, both of which are distinctly subtle emotional experiences that can easily be confused with closely related ones: hatred or envy in the case of contempt, and doubt or denial in the case of disbelief.
- It could be argued that o1 marginally outperformed GPT-4o, providing deeper and more substantive explanations with better logic, whereas GPT-4o’s explanations were occasionally vague. If true, this might be considered surprising, seeing as GPT-4o tends to perform better on language-based tasks than o1, whose strengths extend across STEM-based complex reasoning tasks. That being said, the better performance of o1 and o3-mini (relative to GPT-4o) might be influenced by the explicit instruction in the prompt to explain the reasoning process for each emotional attribution using chain-of-thought, a mechanism integral to the way these models are designed to “think”.
- While DeepThink-R1 didn’t make any major errors in its emotion and state of mind classifications, despite a few misattributions, the explanations it offered were the most rudimentary. This is odd considering that the evidence it provided for its classifications was among the best of all models tested: R1 was the only model that tied disbelief to the right phrase (“The thought of her impossible presence”), despite its failure to invoke the correct evidence for its identification of contempt, a mistake shared with o3-mini and o1 (all three of which are designed for complex reasoning). R1 was also the only model to question whether certain emotions, namely disbelief and contempt, were appropriate attributions at all, adopting an apparently critical perspective in its output even though no such instruction was provided.
Overall, we think this experiment sheds light on one of the most fascinating and daring questions we could ask regarding the evolution of AI’s emotional intelligence: can we reasonably expect frontier AI models to develop some form of empathy and/or theory of mind, and what might these characteristics look like when applied to humans, especially if AI can’t or doesn’t “experience” and “perceive” the world as we do? If future AI systems possessed a degree of emotional intelligence, even if it’s marginal, why would they care about us, and more importantly, would they favor the “well-being” of their AI counterparts over that of humans?
Moreover, are there good reasons to expect models designed for complex reasoning will develop more sophisticated emotional intelligence than their predominantly language-centric counterparts? To what degree is complex reasoning relevant to or necessary for emotional intelligence, and what can the evolution of emotional intelligence in AI teach us about the relationship between reasoning and emotions in human cognition?
For readers interested in tracking our weekly frontier AI capabilities tests, we invite you to check out our AI experiments page, nested under the Resources section of Lumenova’s website. We also encourage readers to test and modify our prompts with different models in their own time—you never know what you may discover!