Jour Fixe - Text-Based Industry Classification of Young and Innovative Companies
Abstract: For more than 70 years, investors, researchers and governments have classified companies into industries. Yet, despite their common use in research and practice, existing industry classification schemes (such as NAICS or SIC codes) are ill-suited to classify young and innovative companies: By categorizing companies according to their means of production rather than their product, these existing schemes fail to reflect the novel business models of many of today’s young and innovative companies.
We propose a new approach to measure industry structure and product similarity for young and innovative companies based on textual analysis of firms' business descriptions. Drawing on publicly available data from job portals and professional social networks, we obtain business descriptions for 726 private companies that went public within the observation period from 2003 to 2018. We then compare our new text-based approach against existing schemes as well as human judgement. Our preliminary results suggest that our new approach can accurately identify similarity in product offerings and yields more reasonable industry classifications than existing schemes.
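
To illustrate what a text-based similarity measure could look like, here is a minimal Python sketch that scores pairs of business descriptions by the cosine similarity of their word counts. The abstract does not say which text-analysis technique the authors use, so bag-of-words cosine similarity is an assumption made purely for illustration. The company names, the descriptions, and the helper functions `tokenize`, `cosine_similarity` and `most_similar` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy descriptions; these are NOT data from the study.
descriptions = {
    "AlphaPay": "mobile payment platform for online merchants",
    "BetaPay": "online payment platform for mobile merchants and shops",
    "GammaBio": "biotechnology company developing cancer therapies",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the word-count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    # The dot product only needs the words that occur in both texts.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty description is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(name):
    """Return the other company whose description is closest to `name`."""
    others = (n for n in descriptions if n != name)
    return max(
        others,
        key=lambda n: cosine_similarity(descriptions[name], descriptions[n]),
    )


if __name__ == "__main__":
    for company in descriptions:
        print(company, "->", most_similar(company))
```

In this toy setting the two payment companies score as each other's closest peer, while the biotechnology company shares no vocabulary with either. Grouping companies by such pairwise scores is one conceivable way to derive industry classes from text. A realistic pipeline would likely also need stop-word removal and term weighting.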