Universiteit Leiden


Vulnerability in open-source code has been floating around for 15 years: ‘This shows how complex security really is’

Researchers at LIACS have found a vulnerability in open-source code that’s been used around the world for over 15 years. They’ve also developed an AI-based tool to fix the problem automatically. ‘You really can’t afford to sit back.’

PhD candidate Jafar Akhoundali came across the code on GitHub – a platform where developers create, store and share software. The first version he found dated back to 2010. It was part of a file server, and at first glance, nothing seemed wrong. But it turned out to contain a serious vulnerability: it allowed attackers to gain access to files and folders they shouldn’t be able to reach. Password files, for example.
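
To make this class of bug concrete, here is a minimal Python sketch. It is not the code the researchers found: the article does not show the vulnerable lines or name the language, so the function names `resolve_unsafe` and `resolve_safe` and the idea of a single served directory are assumptions made purely for illustration. The first function joins a requested path onto the served directory without checking the result, which lets `../` segments climb out of it. The second resolves the path first and refuses anything that ends up outside the served directory.

```python
from pathlib import Path


def resolve_unsafe(root: str, requested: str) -> Path:
    """Hypothetical vulnerable lookup: joins the paths and trusts the result."""
    # "../" segments (or an absolute path) in `requested` can leave `root`.
    return (Path(root) / requested).resolve()


def resolve_safe(root: str, requested: str) -> Path:
    """Hypothetical hardened lookup: rejects paths outside the served directory."""
    root_path = Path(root).resolve()
    candidate = (root_path / requested).resolve()
    # Accept only the served directory itself or something beneath it.
    if candidate != root_path and root_path not in candidate.parents:
        raise PermissionError(f"path escapes served directory: {requested!r}")
    return candidate
```

With a served directory such as `public/`, a request for `../secret.txt` passes straight through `resolve_unsafe` and points at a file next to `public/`, while `resolve_safe` raises `PermissionError` for the same request and still accepts an ordinary name like `index.html`. A real file server would need more than this one check, but it shows how small the difference between the vulnerable and the safe version can be.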

Vibe coding

The software turned out to be widely used. The same bit of code had made its way to StackOverflow, YouTube tutorials, well-known software companies and even university projects. ‘No one wants to reinvent the wheel, so code gets reused all the time,’ Akhoundali explains. ‘People often assume open-source software is safe, because anyone can check it. But that’s not always true.’

A recent trend makes things even riskier: vibe coding. That’s when developers use AI tools to help write code. It’s quick – but it comes with risks.

AI models like GPT, Claude and Copilot are trained on publicly available code. And that includes insecure code. So they end up reproducing the same mistakes.

‘It’s a case of: garbage in, garbage out,’ Akhoundali says. ‘We saw that ourselves when we asked different models to generate this code. Even when you explicitly ask for secure code, they don’t always deliver. You never really know where the answer comes from, and the model presents it as if it’s reliable. But many large language models are trained on unsafe examples – and they treat them as correct.’

AI as part of the solution

At the same time, AI can help fix these kinds of problems. The team developed an automated data pipeline that detects and repairs vulnerable code in seven steps. ‘There are tens of thousands of bits of potentially unsafe code out there,’ Akhoundali explains. ‘It’s impossible to go through them by hand. Our pipeline handles the entire process: it searches for the code, runs tests, flags the risky projects, estimates the impact, and sends out fixes – all automatically.’

And that’s where AI gets to show its helpful side. Using GPT-4, the system writes and applies the necessary patches. Of the 1,756 vulnerable projects identified on GitHub, around 1,600 have already been patched. Doing all that manually would have taken far too long.

Raising awareness: don’t cut corners on software security

In August, Akhoundali and his colleagues – Hamidreza Hamidi, Kristian Rietveld and Olga Gadyatskaya – will make their tool available to other developers. ‘We hope people will use it to improve their software. Of course, there’s always a risk someone might try to misuse it, but we’ve already patched a lot of the most popular projects. So even if someone tried, it shouldn’t be possible,’ says Akhoundali.

But for him, the most important goal is awareness. ‘People often postpone improving their security until after they’ve been hacked. But cybersecurity is complicated. It takes time, money and experience. This bug had been around for 15 years. And it’s just one example of how tiny details can lead to major risks.’
