Predicting company innovativeness by analysing the website data of firms: a comparison across different types of innovation

dc.contributor.author: Sõna, Sander
dc.contributor.author: Masso, Jaan
dc.contributor.author: Sharma, Shakshi
dc.contributor.author: Vahter, Priit
dc.contributor.author: Sharma, Rajesh
dc.date.accessioned: 2022-07-29T10:46:21Z
dc.date.available: 2022-07-29T10:46:21Z
dc.date.issued: 2022
dc.description.abstract: This paper investigates which of the core types of innovation can be best predicted based on the website data of firms. In particular, we focus on four distinct key standard types of innovation – product, process, organisational, and marketing innovation in firms. Web-mining of textual data on the websites of firms from Estonia combined with the application of artificial intelligence (AI) methods turned out to be a suitable approach to predict firm-level innovation indicators. The key novel addition to the existing literature is the finding that web-mining is more applicable to predicting marketing innovation than predicting the other three core types of innovation. As AI based models are often black-box in nature, for transparency, we use an explainable AI approach (SHAP - SHapley Additive exPlanations), where we look at the most important words predicting a particular type of innovation. Our models confirm that the marketing innovation indicator from survey data was clearly related to marketing-related terms on the firms' websites. In contrast, the results on the relevant words on websites for other innovation indicators were much less clear. Our analysis concludes that the effectiveness of web-scraping and web-text-based AI approaches in predicting cost-effective, granular and timely firm-level innovation indicators varies according to the type of innovation considered. [en]
dc.identifier.uri: http://hdl.handle.net/10062/83377
dc.language.iso: eng [en]
dc.relation: info:eu-repo/grantAgreement/EC/H2020/822781///GROWINPRO [en]
dc.rights: info:eu-repo/semantics/openAccess [en]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: innovation [en]
dc.subject: marketing innovation [en]
dc.subject: community innovation survey (CIS) [en]
dc.subject: machine learning [en]
dc.subject: neural network [en]
dc.subject: explainable AI [en]
dc.subject: SHAP [en]
dc.title: Predicting company innovativeness by analysing the website data of firms: a comparison across different types of innovation [en]
dc.type: info:eu-repo/semantics/article [en]

Files

Original bundle
Name: sona_masso_sharma-s_vahter_sharma-r.pdf
Size: 1.54 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.67 KB
Format: Item-specific license agreed upon to submission