Universiteit Leiden

How the rise of AI is creating new opportunities for computational linguists

With the rise of AI, interest in computational linguistics and language models has taken flight. But machines are far from being able to go it alone. In her inaugural lecture, Professor Carole Tiberius will stress the importance of research on word combinations. ‘We know a great deal but there is a great deal that we still don’t know.’

‘If you look at how complicated a language is, it is amazing that we can understand each other so well’, says Tiberius. This fascination is a theme running through her career. ‘I started out as a translator because I found it so extraordinary that people can also understand each other in other languages. I soon thought: it would be great if a computer could do that work but what would it need to do so? That’s how I got into computational linguistics.’

Research opportunities

Computational linguistics is a field at the intersection of computer science and linguistics. It can be used to test linguistic theories, but this is no longer the emphasis in 2024. ‘There has been a shift in recent decades, and definitely in recent years, toward using computer systems that can analyse and generate natural language’, says Tiberius. ‘ChatGPT, for example.’

But that does not mean that computational linguists are twiddling their thumbs. On the contrary, Tiberius will say in her inaugural lecture. Technological advances have actually created opportunities for research and greater understanding. ‘We often used to have to speculate because we didn’t have large text collections available, but now we do have them and automatic ways to search them quickly. The more reliably we can do so, the more insight we will gain into how language works and how people use language.’

Language is always changing

‘We know a great deal, but there is a great deal that we still don’t know’, says Tiberius. ‘If someone comes up with and says something creative today it can catch on and end up in the language. The use of “gate” for a scandal, for instance. That originated with Watergate and was then used for political scandals and eventually for other scandals too, such as Nipplegate.’

No sooner said than done

Tiberius’s interest is in phraseology: the way individual words can be combined into phrases. ‘Individual words are often ambiguous but when they are placed in context their meaning soon becomes clear. Take “het komt voor de bakker” (a saying that means ‘no sooner said than done’, Ed.) or “ergens geen kaas van gegeten hebben” (a saying that means ‘to have no clue about’, Ed.). The literature shows that the big language models still have difficulty with this kind of metaphorical and idiomatic language use. It’s my dream to create a dictionary with the normal usage patterns of words and their meaning’, says Tiberius. ‘If we can code that, then we will take another step in the right direction. It will never be finished but bit by bit we are gaining a better understanding of our language.’

Text: Julie de Graaf
Banner photo: Pexels
