Computational Narratology: The Science of Stories

Stories as Structures

In 1928, Vladimir Propp analyzed 100 Russian folktales and made a startling discovery: despite their surface variety, they all followed the same underlying structure. Different characters, settings, and details—but identical narrative morphology.

This insight launched a scientific study of narrative. If stories have structure, that structure can be formalized, quantified, and computed. Today, computational narratology uses algorithms to analyze, generate, and understand stories at scale.

Narratology Defined

Narratology is the systematic study of narrative structure. It distinguishes "story" (the events that occurred) from "discourse" (how those events are told). Computational narratology applies formal methods—mathematics, algorithms, machine learning—to this analysis.
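The story/discourse distinction can be made concrete with a small data structure. The sketch below is illustrative only: the `Event` class, the `story_time` field, and the sample detective plot are all hypothetical names and data introduced here, not part of any standard narratology toolkit.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    story_time: int    # position in the chronology of what happened
    description: str


# Discourse: the order in which a (made-up) narration presents the events.
# This telling opens with the discovery and then flashes back.
discourse = [
    Event(3, "The detective finds the empty safe."),
    Event(1, "A week earlier, the clerk copies the key."),
    Event(2, "The clerk opens the safe at night."),
    Event(4, "The detective confronts the clerk."),
]


def reconstruct_story(told: List[Event]) -> List[Event]:
    """Recover the chronological story from a discourse ordering."""
    return sorted(told, key=lambda e: e.story_time)


def is_chronological(told: List[Event]) -> bool:
    """True when the telling presents events in the order they occurred."""
    times = [e.story_time for e in told]
    return times == sorted(times)


story = reconstruct_story(discourse)
print([e.story_time for e in discourse])  # order of telling
print([e.story_time for e in story])      # order of occurrence
print(is_chronological(discourse))
```

The same set of events supports many discourses; only the ordering (and, in a fuller model, pacing and point of view) changes.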

Propp's Morphology

Propp identified 31 narrative functions—story events that always occur in the same order when present:

Propp's Functions (Selection)

1.  ABSENTATION: Family member leaves home
2.  INTERDICTION: Hero receives warning/prohibition
3.  VIOLATION: Interdiction is violated
4.  RECONNAISSANCE: Villain seeks information
5.  DELIVERY: Villain gains information
6.  TRICKERY: Villain attempts deception
7.  COMPLICITY: Victim is deceived
8.  VILLAINY: Villain causes harm
...
17. BRANDING: Hero is marked
18. VICTORY: Villain is defeated
19. LIQUIDATION: Initial misfortune resolved
20. RETURN: Hero heads home
...
31. WEDDING: Hero is rewarded

Not all functions appear in every tale, but those present always appear in this order.
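The ordering rule above is easy to check mechanically once a tale has been annotated with function labels. The following is a minimal sketch, assuming each function occurs at most once per tale and using only the functions listed in the selection above; the names `PROPP_ORDER`, `follows_propp_order`, and `first_violation` are hypothetical helpers, and the two sample sequences are invented for illustration.

```python
# Function numbers follow the numbering in the selection above.
PROPP_ORDER = {
    "ABSENTATION": 1, "INTERDICTION": 2, "VIOLATION": 3, "RECONNAISSANCE": 4,
    "DELIVERY": 5, "TRICKERY": 6, "COMPLICITY": 7, "VILLAINY": 8,
    "BRANDING": 17, "VICTORY": 18, "LIQUIDATION": 19, "RETURN": 20,
    "WEDDING": 31,
}


def follows_propp_order(functions):
    """True if the annotated functions appear in strictly increasing order."""
    numbers = [PROPP_ORDER[name] for name in functions]
    return all(a < b for a, b in zip(numbers, numbers[1:]))


def first_violation(functions):
    """Return the first adjacent pair that breaks the ordering, or None."""
    for earlier, later in zip(functions, functions[1:]):
        if PROPP_ORDER[earlier] >= PROPP_ORDER[later]:
            return (earlier, later)
    return None


# Invented annotations: functions may be skipped, but not reordered.
tale = ["INTERDICTION", "VIOLATION", "VILLAINY", "VICTORY", "RETURN", "WEDDING"]
scrambled = ["VILLAINY", "INTERDICTION", "VICTORY"]

print(follows_propp_order(tale))
print(first_violation(scrambled))
```

Skipping functions is allowed by the check; only a later-numbered function appearing before an earlier-numbered one counts as a violation.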

This morphology is surprisingly general. Star Wars, Harry Potter, and The Matrix can all be read against Propp's pattern, as can countless other modern stories. Joseph Campbell's "Hero's Journey" describes a closely related pattern for mythological narratives.

Emotional Arcs: The Shapes of Stories

In 2016, researchers at the Computational Story Lab (University of Vermont) analyzed sentiment trajectories in 1,327 stories from Project Gutenberg. Using word-level sentiment analysis, they mapped how emotional valence rises and falls through each narrative.

Their finding: most stories cluster into just six basic emotional arcs:

Rags to Riches: Steady rise (Alice in Wonderland)
Riches to Rags: Steady fall (Romeo and Juliet)
Man in a Hole: Fall then rise (most popular)
Icarus: Rise then fall
Cinderella: Rise, fall, rise (second most popular)
Oedipus: Fall, rise, fall
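To give a feel for how a sentiment trajectory turns into one of these shapes, here is a deliberately simplified sketch. It is not the method used in the study: the tiny `VALENCE` lexicon is made up, and the labeling rule (compare the mean valence of four consecutive segments and read off the rise/fall pattern) is a stand-in chosen for brevity. All function and variable names are assumptions.

```python
import re

# Hypothetical toy lexicon; real analyses use large rated word lists.
VALENCE = {"happy": 1.0, "joy": 1.0, "love": 1.0, "hope": 0.5, "win": 1.0,
           "sad": -1.0, "fear": -1.0, "loss": -1.0, "dark": -0.5, "die": -1.0}

ARC_NAMES = {
    ("rise",): "Rags to Riches",
    ("fall",): "Riches to Rags",
    ("fall", "rise"): "Man in a Hole",
    ("rise", "fall"): "Icarus",
    ("rise", "fall", "rise"): "Cinderella",
    ("fall", "rise", "fall"): "Oedipus",
}


def valence_trajectory(text, window=4, step=2):
    """Mean lexicon valence over a sliding window of tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = [VALENCE.get(t, 0.0) for t in tokens]
    last_start = max(len(scores) - window, 0)
    return [sum(scores[i:i + window]) / window
            for i in range(0, last_start + 1, step)]


def label_arc(trajectory, segments=4):
    """Label a trajectory by the rise/fall pattern of its segment means."""
    if len(trajectory) < segments:
        raise ValueError("trajectory needs at least one point per segment")
    size = len(trajectory) / segments
    means = []
    for k in range(segments):
        chunk = trajectory[int(k * size):int((k + 1) * size)]
        means.append(sum(chunk) / len(chunk))
    directions = []
    for a, b in zip(means, means[1:]):
        if a == b:
            continue  # flat step: no change of direction
        step = "rise" if b > a else "fall"
        if not directions or directions[-1] != step:
            directions.append(step)  # collapse repeated directions
    return ARC_NAMES.get(tuple(directions), "unclassified")


print(valence_trajectory("happy joy love hope sad fear loss dark"))
print(label_arc([0.5, 0.4, -0.6, -0.8, -0.2, 0.1, 0.6, 0.9]))  # Man in a Hole
```

With four segments there are at most three changes of direction, which is exactly enough to distinguish the six shapes listed above; anything flat falls through to "unclassified".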

What Readers Prefer

Download data from Project Gutenberg showed that "Man in a Hole" and "Cinderella" arcs were most popular. Stories with clear emotional trajectories outperformed flat emotional profiles. Readers want shape to their emotional experience.

Stylometry: Authorial Fingerprints

Stylometry quantifies writing style through statistical features: word frequency, sentence length, vocabulary richness, function word usage. These features form an "authorial fingerprint" that is hard for a writer to disguise consistently.

Applications include:

Authorship attribution: Who wrote anonymous texts? In 2013, stylometric analysis supported the identification of J.K. Rowling as the author writing under the pseudonym Robert Galbraith
Plagiarism detection: Style shifts reveal inserted passages
Forensic linguistics: Court evidence from ransom notes, threatening letters
Historical debates: Did Shakespeare write Shakespeare? (Stylometric studies generally say yes)

Function words (the, of, and, to) are particularly diagnostic. Content words vary with topic; function words reveal unconscious syntactic habits. A typical stylometric analysis uses 100+ features across thousands of words.
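As a rough illustration of what such features look like in practice, the sketch below extracts a handful of them from raw text. It assumes a naive regex tokenizer and sentence splitter, and the eight-word `FUNCTION_WORDS` list is an illustrative subset rather than a standard list; a real analysis would use far more features and far more text.

```python
import re
from collections import Counter

# Illustrative subset only; practical studies use much longer lists.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it"]


def stylometric_features(text):
    """Return function-word frequencies plus two simple style measures."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens or not sentences:
        raise ValueError("text contains no words")
    counts = Counter(tokens)
    total = len(tokens)
    # Relative frequency of each function word (0.0 if it never occurs).
    features = {f"freq_{w}": counts[w] / total for w in FUNCTION_WORDS}
    features["mean_sentence_length"] = total / len(sentences)
    features["type_token_ratio"] = len(counts) / total  # vocabulary richness
    return features


sample = "The cat sat on the mat. It was the end of the day."
features = stylometric_features(sample)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

Relative frequencies rather than raw counts are used so that texts of different lengths can be compared.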

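Burrows' Delta: Δ(A,B) = (1/n) Σᵢ |zᵢ(A) - zᵢ(B)|

Stylometric distance: zᵢ(A) is the z-score of the i-th most frequent word's relative frequency in text A, standardized across the comparison corpus, and n is the number of words compared. A smaller Δ means more similar styles.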
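The Delta formula above can be sketched in a few lines. The frequency profiles below are invented numbers for three hypothetical texts, and the choice of population standard deviation (`pstdev`) for the z-scores is an assumption made here for simplicity; implementations differ on that detail.

```python
from statistics import mean, pstdev


def z_scores(profiles):
    """Standardize each word's relative frequency across all texts."""
    words = list(next(iter(profiles.values())).keys())
    stats = {}
    for w in words:
        values = [p[w] for p in profiles.values()]
        stats[w] = (mean(values), pstdev(values))
    return {
        name: {
            # A word with zero spread carries no information: score it 0.
            w: (p[w] - stats[w][0]) / stats[w][1] if stats[w][1] else 0.0
            for w in words
        }
        for name, p in profiles.items()
    }


def burrows_delta(za, zb):
    """Mean absolute difference of z-scores between two texts."""
    return sum(abs(za[w] - zb[w]) for w in za) / len(za)


# Made-up relative frequencies of three function words in three texts.
profiles = {
    "author_a": {"the": 0.060, "of": 0.030, "and": 0.028},
    "author_b": {"the": 0.045, "of": 0.040, "and": 0.020},
    "disputed": {"the": 0.058, "of": 0.031, "and": 0.027},
}

z = z_scores(profiles)
delta_a = burrows_delta(z["disputed"], z["author_a"])
delta_b = burrows_delta(z["disputed"], z["author_b"])
print(f"delta to author_a: {delta_a:.3f}")
print(f"delta to author_b: {delta_b:.3f}")
```

In an attribution setting, the disputed text is assigned to the candidate with the smallest Delta; with these invented profiles that is `author_a`.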

Computational Story Generation

Can computers write stories? Early systems used templates and rules. TALE-SPIN (1977) generated fables by simulating character goals and plans. Brutus (1998) produced short stories about betrayal using explicit narrative models.

Modern approaches use neural networks trained on massive text corpora. GPT-style models generate grammatical, locally coherent text. But challenges remain:

Long-range coherence: Maintaining plot threads over thousands of words
Character consistency: Tracking beliefs, knowledge, personality across scenes
Novelty vs. coherence: Creative outputs that still make narrative sense
Intentionality: Writing toward a goal rather than just predicting next words

Hybrid systems show promise: they constrain neural generation with explicit story structure. Propp's functions, for example, can serve as scaffolding while a neural network fills in the prose.
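The division of labor in such a hybrid can be sketched as a planner that walks an ordered scaffold and a pluggable "realizer" that produces prose for each step. Everything here is hypothetical: `template_realizer` is a canned stand-in for a neural text generator, and the scaffold and sentences are invented. A real system would replace the realizer with a call to a language model, passing the function label and the story so far as context.

```python
from typing import Callable, List

# Ordered scaffold drawn from the Propp functions listed earlier.
SCAFFOLD = ["INTERDICTION", "VIOLATION", "VILLAINY", "VICTORY", "RETURN", "WEDDING"]

# Canned sentences standing in for generated prose (invented for this sketch).
TEMPLATES = {
    "INTERDICTION": "Her mother warned her never to open the cellar door.",
    "VIOLATION": "One night, she opened it anyway.",
    "VILLAINY": "The thing in the cellar took her brother.",
    "VICTORY": "She tricked it into the daylight, and it crumbled.",
    "RETURN": "She carried her brother home.",
    "WEDDING": "The village made her its guardian.",
}


def template_realizer(function: str, story_so_far: List[str]) -> str:
    """Hypothetical stand-in for a neural generator; ignores the context."""
    return TEMPLATES[function]


def generate_story(scaffold: List[str],
                   realize: Callable[[str, List[str]], str]) -> List[str]:
    """Walk the scaffold in order, asking the realizer for prose at each step."""
    passages: List[str] = []
    for function in scaffold:
        # The realizer sees a copy of everything written so far as context.
        passages.append(realize(function, list(passages)))
    return passages


story = generate_story(SCAFFOLD, template_realizer)
print("\n".join(story))
```

The structure guarantees plot order regardless of what the realizer writes, which is the point of the hybrid design: coherence comes from the scaffold, surface variety from the generator.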

The Future of Literary AI

Computational narratology isn't replacing human creativity—it's extending our understanding of what stories are and how they work. Key questions for the field:

Universality: Are story structures cultural or cognitive? Do all human minds parse narrative similarly? Cross-linguistic computational analysis is beginning to answer this.

Impact: How do narrative structures affect readers? Neuroimaging during story comprehension reveals that readers mentally simulate story events. Computational models of this simulation could predict emotional impact.

The universe is made of stories, not of atoms.
Muriel Rukeyser

Stories are technologies—ancient software for transmitting culture, building empathy, and making sense of existence. Understanding their structure computationally doesn't diminish their magic. It reveals how deep that magic runs—wired into the architecture of human cognition itself.