
Browsing by Author "Stomakhin, Fedor"

Now showing 1 - 2 of 2
  • Framework for Privacy-Preserving Synthesis of Textual Data
    (University of Tartu, 2025) Stomakhin, Fedor; Laur, Sven, supervisor; Kamm, Liina, supervisor; University of Tartu. Faculty of Science and Technology; University of Tartu. Institute of Computer Science
    To safeguard patient privacy, sharing medical record data for research must adhere to various privacy regulations. To facilitate data sharing, various data protection techniques have been proposed, such as pseudonymization, anonymization and the use of synthetic data. The aim of synthetic data generation is to produce, from an original dataset, a new dataset that preserves the statistical relationships within the original data without exposing any identifying or sensitive information about the data subjects. Synthetically generated data can nevertheless fall short on privacy preservation. To address this, approaches rooted in differential privacy (DP) have been proposed. However, DP typically relies on worst-case assumptions about attackers' knowledge, potentially leading to overly conservative measures. Applying DP principles to free-form text, such as medical epicrises, is further complicated by the high dimensionality and complexity of such text, as the same information can be conveyed in many different ways. In this work, motivated by the challenges of sharing textual health data, we propose and apply a general framework for evaluating privacy risks in text generated by large language models (LLMs). Considering a journalist attack model, we adapt differential privacy principles, quantifying privacy loss (ε, δ) based on the outputs of specific attack functions rather than relying on the worst-case assumptions of DP. We demonstrate the framework by establishing baseline privacy characteristics via direct n-gram sampling analysis on both medical and social media texts and by exploring membership inference signals using surprisal analysis on LLMs fine-tuned with social media texts. While assessing synthetic data from standard LLMs highlighted methodological challenges, the framework provides a methodology for evaluating the privacy properties of text generation models and their outputs, informing decisions on sharing such data for research purposes.
  • Web-based Toolbox for Interactive 3D Visualization of Neural Recordings
    (University of Tartu, 2021) Stomakhin, Fedor; Kuzovkin, Ilya, supervisor; Zafra, Raul Vicente, supervisor; University of Tartu. Faculty of Science and Technology; University of Tartu. Institute of Computer Science
    The visualization of brain activity helps neuroscience researchers and medical professionals explore the data they work with. In particular, 3D visualization of brain activity is used when the spatial positions of data points in the brain are important. Numerous tools have been developed for the analysis and editing of various forms of brain activity. In this thesis, a web-based toolbox for interactive 3D visualization of neural recordings was implemented. The use cases of the toolbox were demonstrated by adapting it to visualize intracortical LFP recordings from 100 human subjects.
