Sirvi Autor "Karagjaur, Mihhail" järgi

Nüüd näidatakse 1 - 1 1

Asset-Oriented Threat Analysis for Large Language Model Systems
(Tartu Ülikool, 2025) Karagjaur, Mihhail; Matulevičius, Raimundas, juhendaja; Tartu Ülikool. Loodus- ja täppisteaduste valdkond; Tartu Ülikool. Arvutiteaduse instituut
Large language model (LLM) deployments continue to proliferate across enterprises without systematic guidance on risk analysis of the LLM-based systems. Addressing this gap, the present study designs and validates an asset-oriented threat model, tailored to LLM systems. The research follows a design-science research paradigm. The research method incorporates (1) a systematic literature review of 45 peer-reviewed and grey sources, which led to the definition of 13 parent attack classes, a total of 24 threat variants. (2) A design of a threat model, which formalized the LLM business and system assets, their security criteria, mapped threats, security requirements, and countermeasures. (3) Two validation procedures, comprising a feasibility analysis of the threat model’s applicability and an empirical test of a jailbreak attack. The feasibility analysis determined that the proposed threat model, mapped to the Mistral Small 3.1, achieved a completeness score of 0.93 out of 1.00. Thus, indicating all but one of the seven system assets were fully represented in the real-world system. To further substantiate the applicability of the threat model, a jailbreak attack (prompt-injection) was executed with 100 prompts from the JailbreakV-28K benchmark open dataset. Without an official safety measure enabled, 78% of applicable prompts resulted in harmful output. With the safety measure enabled, the rate of harmful output was reduced to 70%. Indicating partial but insufficient mitigation. The main artifact of the thesis is an asset-oriented threat model for LLMs. The artifact consists of the following components: 1. High-level UML class and state, and BPMN process diagrams, depicting an LLM system, and mapping elicited threats to the system’s assets. 2. An interactive web page, which allows practitioners to traverse the produced threat model and to acquire information about the elicited assets, threats, and proposed countermeasures. 3. Code of the interactive web page, empirical tests, and datasets, supporting local use of the threat model and reproducibility of the jailbreak empirical test. Findings conclude that the LLM system possesses a wide attack surface while adding unique vectors such as jailbreak and embedding inversion. The thesis provides security and AI engineers with a systematic approach to risk analysis and countermeasure selection. Although the threat model was validated on a single open-weight model, the baseline methodology is model-agnostic and extensible. Future studies could validate the threat model against a wide set of LLM systems and automate control recommendations in the scope of DevSecOps.