Detecting semantically equivalent issue reports using transformer models

Moeini, Behrad

Files

moeini_computerscience_2021.pdf (459.61 KB)

Date

2021

Authors

Moeini, Behrad

Publisher

Tartu Ülikool

Abstract

Developers support their software development by creating issue reports that describe bugs, feature requests, or change requests. As a project grows, so does the number of issue reports, and some issues are reported multiple times by different users. To avoid this problem, several automated approaches have been proposed for retrieving duplicate issue reports. These approaches have been based mainly on information-retrieval techniques. This thesis explores recent advances in detecting semantically equivalent text and applies them to identifying duplicate issue reports. Since several articles have been published on this topic, the main challenge of the thesis is to replicate the existing approaches and compare their performance with the proposed solution. Part of the work is to extract and curate data from sources such as issue trackers. The thesis treats the task as a natural language processing problem and applies advanced techniques to classify whether question pairs are duplicates or not. We use an open-source dataset from GitHub that many earlier projects have also used, which makes it easy to compare our results with theirs. We build models to detect whether two questions are semantically the same, beginning with simple models and moving to more complex ones step by step. After applying each model to the dataset, we compare the models' performance.

Keywords

GitHub, Duplicated question, Natural language processing, Transformer model, Neural network

URI

https://hdl.handle.net/10062/92313

Collections

LTAT Master's theses
