Innovation researchers currently use various patent classification schemas, which are hard to replicate. Using machine learning techniques, we construct a transparent, replicable and adaptable patent taxonomy, and a new automated methodology for classifying patents. We contrast our new schema with existing ones using a long-run historical patent dataset. We find that quantitative analyses of patent characteristics are sensitive to the choice of classification; our interpretation of regression coefficients is schema-dependent. We suggest that much of the innovation literature should be carefully interpreted in light of our findings.
Series
Industrial and Corporate Change INCC 2021, 30(3), 678-705