The Great Gender Divergence

What can we learn from 100,000 novels, over 300 years?

Alice Evans
Jun 16, 2024

Ted Underwood, David Bamman, and Sabrina Lee have done something incredible: they’ve analysed 100,000 novels published over the past 300 years. By analysing both authors and fictional characters, they provide key insights into historical patriarchy.

Their sample comprises English-language texts in the HathiTrust Digital Library and the Chicago Text Lab. This reflects the book-buying practices of academic libraries (with additional contributions from the Library of Congress and New York Public Library). While not a complete picture of all books ever written, it probably does reflect demand.

Methodologically, Underwood and colleagues use a pipeline called ‘BookNLP’, which identifies character names and then clusters those names (‘Elizabeth’ and ‘Elizabeth Bennet’ become a single person). Descriptive words are then connected to each character - her actions, and the adjectives and nouns associated with her. BookNLP also appears very accurate in identifying gender: women are identified with 95% precision.
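To make the name clustering and the precision figure concrete, here is a toy sketch in Python. It is not BookNLP's code or API: the function names `cluster_names` and `precision`, the token-overlap rule, and the sample data are all assumptions made for illustration. The real pipeline is far more sophisticated.

```python
def cluster_names(mentions):
    """Toy clustering: a shorter mention joins the first longer name
    that contains all of its tokens (hypothetical rule, not BookNLP's)."""
    clusters = {}
    # Visit longer names first so that full names become the cluster keys.
    for name in sorted(set(mentions), key=lambda n: (-len(n.split()), n)):
        tokens = set(name.split())
        for full_name in clusters:
            if tokens <= set(full_name.split()):
                clusters[full_name].add(name)
                break
        else:
            # No existing cluster contains this name, so start a new one.
            clusters[name] = {name}
    return clusters


def precision(predicted, actual, label):
    """Of the characters predicted as `label`, the share whose
    hand-checked label agrees."""
    flagged = [name for name, guess in predicted.items() if guess == label]
    if not flagged:
        return 0.0
    correct = sum(1 for name in flagged if actual[name] == label)
    return correct / len(flagged)


# Made-up example data, for illustration only.
mentions = ["Elizabeth", "Elizabeth Bennet", "Darcy", "Mr. Darcy", "Elizabeth"]
print(cluster_names(mentions))

predicted = {"a": "F", "b": "F", "c": "M", "d": "F"}
actual = {"a": "F", "b": "M", "c": "M", "d": "F"}
print(precision(predicted, actual, "F"))
```

Read this way, 95% precision means that of every 100 characters the pipeline labels as women, about 95 really are women. It says nothing about how many women the pipeline misses. The toy rule would also mishandle an ambiguous mention such as a bare surname shared by several characters, since it simply takes the first match.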

Harnessing this natural language processing pipeline, Underwood and colleagues explore how the characters of popular stories have changed over the past three centuries. This is a fantastic technological advance, as it prevents the cherry-picking that earlier, smaller samples allowed.

So what do they find?
