Browse by Author "Vunk, Sandor"
Item: Causal Information Extraction Using Large Language Models (Tartu Ülikool, 2025)
Vunk, Sandor; Magnifico, Giacomo, supervisor; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut

This thesis investigates the ability of Large Language Models (LLMs) to perform causal information extraction, an important task for high-level natural language comprehension. In a controlled experiment with eight flagship models from leading AI organizations (including OpenAI's GPT-o3, Anthropic's Claude 3.7 Sonnet, xAI's Grok-3, and others), this study examines both their ability to extract cause-effect pairs from text and their performance at evaluating such extractions. A purpose-designed multi-domain dataset was generated to serve this end, with controlled causal relations embedded in contexts of varying complexity, covering the economics, environmental science, and technology domains. The dataset incorporates a number of difficult variations produced by cue masking and pair shuffling. Using a zero-shot approach with standardized prompting, the study employs a twin evaluation framework that combines traditional human evaluation with a model-based semantic scoring system, in which LLMs score other LLMs' extractions. This provides a more informative evaluation of model performance. Results revealed strong causal extraction capabilities across all models, with the leading models outperforming smaller ones. Especially notable were OpenAI's GPT-o3, Anthropic's Claude 3.7 Sonnet, and xAI's Grok-3, which outperformed their counterparts. Overall, models demonstrated semantic understanding beyond reliance on explicit linguistic markers, though pair shuffling showed some dependence on pre-trained associations. This research illuminates the capabilities of state-of-the-art LLMs in causal information extraction, establishing a foundation for enhanced causal reasoning systems across diverse domains.