
Browsing by Author "Injarabian, David Avedis"

Now showing 1 - 1 of 1
    Text-Driven Weakly Supervised Medical Image Segmentation
    (Tartu Ülikool, 2025) Injarabian, David Avedis; Fishman, Dmytro, supervisor; Ariva, Joonas, supervisor; Tartu Ülikool. Faculty of Science and Technology; Tartu Ülikool. Institute of Computer Science
    Medical image analysis has become an essential tool for clinical diagnosis, enabling specialists to detect, segment, and monitor various pathologies. Convolutional neural networks have traditionally dominated this field, achieving success in classification and segmentation tasks by relying solely on visual patterns. However, due to their inherent architectural limitations, they cannot effectively incorporate complementary information such as the textual reports written by medical professionals. Recently, multimodal models, particularly Transformer-based vision-language architectures, have demonstrated promising results in general image recognition and generation tasks by effectively integrating text and visual data. Despite these advances, the potential of multimodal approaches in medical imaging, especially for complex 3D volumetric data such as computed tomography scans, remains largely unexplored. This thesis investigates whether textual context provided by radiology reports can implicitly guide multimodal models to learn spatial locality in the corresponding medical images, potentially leading to emergent segmentation capabilities without explicit segmentation supervision. Such an approach could address the chronic shortage of manually annotated segmentation data, as obtaining these labels is expensive and labor-intensive. By examining how multimodal models trained on paired 3D computed tomography scans and radiology reports respond to textual prompts, the thesis seeks to understand whether these models inherently learn meaningful spatial relationships. If multimodal models demonstrate implicit segmentation capabilities, they could serve as a source of synthetic weakly supervised segmentation masks, reducing the need for costly manual annotation and supporting radiologists in clinical interpretation and triage workflows.
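
The abstract sketches a mechanism worth making concrete: probing a vision-language model with a textual prompt and reading spatial localization out of the text-image alignment, with no segmentation labels involved. The thesis's actual architecture is not described on this page, so the PyTorch snippet below is only a minimal illustration of one common realization of the idea, a CLIP-style setup in which the cosine similarity between a prompt embedding and per-patch volume embeddings is upsampled and thresholded into a weak pseudo-mask. Every name and number in it (ToyVolumeEncoder, ToyTextEncoder, EMBED_DIM, the 0.2 threshold) is hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

EMBED_DIM = 64  # shared text-image embedding width (illustrative)

class ToyVolumeEncoder(nn.Module):
    """Encodes a 3D CT volume into a grid of patch embeddings (stand-in for a 3D ViT)."""
    def __init__(self):
        super().__init__()
        # A single strided 3D convolution plays the role of patch embedding.
        self.patch_embed = nn.Conv3d(1, EMBED_DIM, kernel_size=8, stride=8)

    def forward(self, volume):                        # volume: (B, 1, D, H, W)
        patches = self.patch_embed(volume)            # (B, C, D/8, H/8, W/8)
        return F.normalize(patches, dim=1)            # unit-norm per-patch features

class ToyTextEncoder(nn.Module):
    """Maps a tokenized prompt to a single embedding (stand-in for a text Transformer)."""
    def __init__(self, vocab_size=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_DIM)

    def forward(self, token_ids):                     # token_ids: (B, T)
        sentence = self.embed(token_ids).mean(dim=1)  # mean-pool tokens: (B, C)
        return F.normalize(sentence, dim=1)

@torch.no_grad()
def text_driven_mask(volume, token_ids, vis_enc, txt_enc, threshold=0.2):
    """Cosine similarity between the prompt and every patch, upsampled and thresholded."""
    patch_feats = vis_enc(volume)                     # (B, C, d, h, w)
    text_feat = txt_enc(token_ids)                    # (B, C)
    sim = torch.einsum("bcdhw,bc->bdhw", patch_feats, text_feat)
    sim = F.interpolate(sim.unsqueeze(1), size=volume.shape[2:],
                        mode="trilinear", align_corners=False)
    return (sim > threshold).float()                  # weak pseudo-mask: (B, 1, D, H, W)

# Shape-level usage with random weights and data; the mask itself is meaningless
# here, since only a model trained on paired scans and reports would concentrate
# high similarity where the prompted finding actually appears.
vol = torch.randn(1, 1, 64, 64, 64)                   # fake CT volume
prompt = torch.randint(0, 1000, (1, 12))              # fake tokenized report phrase
mask = text_driven_mask(vol, prompt, ToyVolumeEncoder(), ToyTextEncoder())
print(mask.shape)                                     # torch.Size([1, 1, 64, 64, 64])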
