Universiteit Leiden


PhD defence

Evaluation of Bias and Robustness in Search and Conversational Systems

  • A Abolghasemi
Date
Friday 6 March 2026
Time
Address
Academy Building
Rapenburg 73
2311 GJ Leiden

Supervisor(s)

Summary

Today, many of us rely on search and conversational systems to look for information, answer questions, and help us complete tasks. These systems are powered by advanced AI models that not only retrieve information but also generate new answers. While this makes them powerful and convenient, it also creates new challenges: these systems can make confident mistakes, behave unpredictably, or show hidden biases. This thesis examines how reliable and unbiased these modern AI systems really are when used in realistic situations.

One chapter of the research studies search systems and asks whether ranking models still perform well when search queries are long and complex, for example, when a full document is used to find similar documents. The findings show that traditional keyword-based methods are still very competitive compared to semantic methods.

Another chapter investigates the evaluation of conversational systems and whether they can correctly detect when a user is satisfied or dissatisfied. The findings show that when the evaluation data is adjusted to include more dissatisfied users (bringing it closer to real-world conditions), large language models appear to be more stable estimators of user satisfaction. This also demonstrates that how we test AI systems strongly influences how reliable they appear. The thesis further examines social bias in search results, such as whether ranked lists of documents unintentionally favor one gender over another. It introduces a new approach that measures this kind of bias more accurately, which can even change which systems are considered unbiased.

Finally, the research explores how LLMs decide which sources to cite when generating answers. It finds that simply labeling a document as human-written or AI-generated can strongly influence which sources the model chooses to reference. This reveals an unexpected bias and raises concerns about how easily such systems could be influenced.

Overall, this work helps us better understand where the models behind modern AI systems are reliable, and sheds light on how we can evaluate and improve them to make them more robust, fair, and trustworthy in real-world use.

PhD dissertations

Approximately one week after the defence, PhD dissertations by Leiden PhD students are available digitally through the Leiden Repository, which offers free access to these dissertations. Please note that in some cases a dissertation may temporarily be under embargo, in which case access to the full-text version will only be granted later.

Press enquiries (journalists only)

+31 (0)71 527 1521
nieuws@leidenuniv.nl

General information

Beadle's Office
pedel@bb.leidenuniv.nl
+31 71 527 7211
