Mapping Inventions in the Space of Ideas, 1836–2022: Representation, Measurement, and Validation

Event details

Date 01.03.2024
Hour 11:45-13:00
Speaker Vitaly Meursault - Federal Reserve Bank of Philadelphia
Location UniL Campus, Room Extra 126
Category Conferences - Seminars
Event Language English

Ina Ganguli, Jeffrey Lin, Vitaly Meursault, Nicholas Reynolds

How well can different methods meaningfully represent inventions in the “space of ideas”? We evaluate four leading natural language processing (NLP) models, each of which produces a different numerical representation of patent text. We design three novel, domain-specific validation tasks to select among these representations. Sentence-BERT (S-BERT) significantly outperforms other widely used NLP models, creating metrics better aligned with both expert and non-expert human judgment about patent similarity. The choice of representation matters significantly for economic measurement. According to S-BERT, contemporaneous patents have declined in similarity over more than a century, as inventions have “spread out” on an expanding knowledge frontier. Other representations report ambiguous or diverging patterns. We reproduce the S-BERT result using newly digitized records of historical interferences, which show secular declines in the rate of multiple invention. Our results highlight the importance of validation and model selection as an essential step in constructing and using measures derived from patent text.