BibRank: Automatic Keyphrase Extraction Platform Using Metadata

Date

2021

Publisher

Tartu Ülikool

Abstract

Automatic keyphrase extraction is the process of automatically identifying the essential phrases in a document. Keyphrases support tasks such as document classification, clustering, recommendation, indexing, search, and summarization. This thesis introduces BibRank, a new semi-supervised automatic keyphrase extraction method that exploits an information-rich dataset collected by parsing bibliographic data in BibTeX format. BibRank combines a novel weighting technique for the bibliographic data with positional, statistical, and word co-occurrence information. We benchmarked BibRank and state-of-the-art techniques on this dataset. The evaluation indicates that BibRank is more stable and performs better than the state-of-the-art methods it was compared against.
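
To illustrate how positional and word co-occurrence information can be combined with an external per-word weight, the following is a minimal sketch of a generic position-biased graph ranking. It is not the BibRank algorithm itself, which the abstract does not specify. The function names, the window size, and the `prior_weights` argument (a hypothetical stand-in for weights derived from bibliographic metadata) are all assumptions made for illustration.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Undirected weighted graph linking words that occur within `window` tokens."""
    graph = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != word:
                graph[word][other] += 1.0
                graph[other][word] += 1.0
    return graph


def position_bias(tokens):
    """Earlier occurrences count more: each occurrence adds 1 / (index + 1)."""
    bias = defaultdict(float)
    for i, word in enumerate(tokens):
        bias[word] += 1.0 / (i + 1)
    return bias


def rank_words(tokens, prior_weights=None, damping=0.85, iterations=50):
    """Biased PageRank over the co-occurrence graph.

    `prior_weights` is a hypothetical per-word multiplier (default 1.0),
    standing in for any externally derived weight such as one computed
    from metadata. It is an assumption of this sketch.
    """
    prior_weights = prior_weights or {}
    graph = build_cooccurrence_graph(tokens)
    bias = position_bias(tokens)

    # Restart distribution: position bias scaled by the external weight.
    raw = {w: b * prior_weights.get(w, 1.0) for w, b in bias.items()}
    total = sum(raw.values())
    restart = {w: v / total for w, v in raw.items()}

    scores = dict(restart)
    for _ in range(iterations):
        updated = {}
        for word in restart:
            # Each neighbour passes on its score in proportion to edge weight.
            incoming = sum(
                scores[n] * graph[n][word] / sum(graph[n].values())
                for n in graph[word]
            )
            updated[word] = (1 - damping) * restart[word] + damping * incoming
        scores = updated
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    text = ("keyphrase extraction identifies essential phrases "
            "keyphrase extraction uses metadata")
    for word, score in rank_words(text.split(), {"metadata": 5.0}):
        print(f"{word}\t{score:.4f}")
```

In this sketch, raising a word's entry in `prior_weights` shifts restart probability toward it, so the word's final score rises. A full extractor would additionally filter candidates (for example by part of speech) and merge adjacent high-scoring words into phrases.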

Keywords

Keyphrase Extraction, Metadata, Natural Language Processing
