Profiling Bias in LLMs: Stereotype Dimensions in Contextual Word Embeddings

Schuster, Carolin M.; Roman, Maria-Alexandra; Ghatiwala, Shashwat; Groh, Georg

Files

2025_nodalida_1_65.pdf (368.17 KB)

Date

2025-03

Authors

Schuster, Carolin M.

Roman, Maria-Alexandra

Ghatiwala, Shashwat

Groh, Georg

Publisher

University of Tartu Library

Abstract

Large language models (LLMs) are the foundation of the current successes of artificial intelligence (AI); however, they are unavoidably biased. To effectively communicate the risks and encourage mitigation efforts, these models need adequate and intuitive descriptions of their discriminatory properties, appropriate for all audiences of AI. We suggest bias profiles with respect to stereotype dimensions based on dictionaries from social psychology research. Along these dimensions, we investigate gender bias in contextual embeddings, across contexts and layers, and generate stereotype profiles for twelve different LLMs, demonstrating their intuitiveness and their usefulness for exposing and visualizing bias.

URI

https://hdl.handle.net/10062/107258

Collections

Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
