Insolvency administrators have to find relevant information in the vast document collections of insolvent companies. So far, they have relied on keyword lists. The problem is that they do not know in advance what the data might contain. Moreover, keywords can be ambiguous and do not capture paraphrases. We perform a topic model analysis on the large-scale document set, which reveals the topics contained in the corpus. The analysis is unsupervised and requires no manual labelling. As a result, insolvency administrators get an easy overview of what the data contains, can choose the topics relevant to them, and can quickly find the documents that cover those topics.
Target Audience: Innovation managers, decision makers, data scientists
Prerequisites: Basic understanding of statistics and natural language processing
Level: Advanced
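
The workflow described in the abstract (fit an unsupervised topic model, inspect the topics, then retrieve the documents for a chosen topic) can be illustrated with a minimal sketch. The abstract does not name the topic model used; the sketch assumes latent Dirichlet allocation trained with collapsed Gibbs sampling, written in plain Python. The toy documents, the function names fit_lda, top_words, and docs_for_topic, and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized documents (assumed model)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    return vocab, doc_topic, topic_word


def top_words(topic_word, vocab, topic, n=5):
    """Most frequent words of one topic, for a quick overview of the corpus."""
    ranked = sorted(range(len(vocab)),
                    key=lambda v: topic_word[topic][v], reverse=True)
    return [vocab[v] for v in ranked[:n]]


def docs_for_topic(doc_topic, topic, min_share=0.5):
    """Indices of documents whose share of `topic` is at least `min_share`."""
    hits = []
    for d, counts in enumerate(doc_topic):
        total = sum(counts)
        if total and counts[topic] / total >= min_share:
            hits.append((counts[topic] / total, d))
    return [d for _, d in sorted(hits, reverse=True)]


# Hypothetical toy corpus; real input would need proper tokenization
# and stop-word removal, which this sketch omits.
raw_docs = [
    "invoice payment supplier invoice overdue payment",
    "loan bank credit interest loan repayment",
    "supplier invoice payment reminder overdue",
    "bank credit loan collateral interest",
]
docs = [text.lower().split() for text in raw_docs]

vocab, doc_topic, topic_word = fit_lda(docs, n_topics=2)
for k in range(2):
    print(f"topic {k}: {top_words(topic_word, vocab, k)}")
    print(f"  documents: {docs_for_topic(doc_topic, k)}")
```

In such a setup, a reviewer would read the top words of each topic, pick the topics of interest, and call docs_for_topic to list the matching documents. The number of topics and the min_share threshold are tuning choices that depend on the corpus.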