Margins in Contrastive Learning: Evaluating Multi-task Retrieval for Sentence Embeddings

Jørgensen, Tollef Emil; Breitung, Jens

Files

2025_nodalida_1_28.pdf (331.31 KB)

Date

2025-03

Authors

Jørgensen, Tollef Emil

Breitung, Jens

Publisher

University of Tartu Library

Abstract

This paper explores retrieval with sentence embeddings by fine-tuning sentence-transformer models for classification while preserving their ability to capture semantic similarity. To evaluate this balance, we introduce two opposing metrics – polarity score and semantic similarity score – that measure the model's capacity to separate classes and retain semantic relationships between sentences. We propose a system that augments supervised datasets with contrastive pairs and triplets, training models under various configurations and evaluating their performance on top-$k$ sentence retrieval. Experiments on two binary classification tasks demonstrate that reducing the margin parameter of loss functions greatly mitigates the trade-off between the metrics. These findings suggest that a single fine-tuned model can effectively handle joint classification and retrieval tasks, particularly in low-resource settings, without relying on multiple specialized models.
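
To illustrate the role of the margin parameter mentioned in the abstract, the following is a minimal sketch of a generic triplet margin loss over cosine distance. It does not reproduce the paper's training setup: the function names, the toy 2-dimensional embeddings, and the margin values are all assumptions chosen for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin):
    """Generic triplet margin loss: max(0, d(a, p) - d(a, n) + margin)."""
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical toy embeddings, not taken from the paper.
anchor = [1.0, 0.0]
positive = [0.8, 0.6]  # same class as the anchor
negative = [0.0, 1.0]  # opposite class

# A large margin keeps penalizing the triplet even though the negative
# is already farther from the anchor than the positive is.
large = triplet_loss(anchor, positive, negative, margin=1.0)

# A small margin treats the same triplet as already satisfied (zero loss).
small = triplet_loss(anchor, positive, negative, margin=0.1)

print(large, small)
```

In this toy case the larger margin still produces a positive loss, so training would keep pushing the classes apart, while the smaller margin yields zero loss and leaves the existing geometry of the embeddings alone. That is one plausible intuition for why a reduced margin might interfere less with semantic similarity, though the abstract itself does not spell out the mechanism.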

URI

https://hdl.handle.net/10062/107219

Collections

Proceedings of the Joint 25th Nordic Conference on Computational Linguistics and 11th Baltic Conference on Human Language Technologies (NoDaLiDa/Baltic-HLT 2025)
