OpenEuroLLM: the open-source AI project through which Europe challenges Big Tech

The European Union has 24 official languages—a linguistic diversity that, until recently, may have seemed like a limitation for the development of Artificial Intelligence in Europe. But thanks to the OpenEuroLLM project, this diversity is turning into a powerful resource.

Through an unprecedented collaboration between leading AI companies and European research institutions, OpenEuroLLM aims to develop next-generation language models based on a vision of a digitally sovereign, linguistically inclusive, and technologically advanced Europe.

If successful, the project could mark a turning point in the history of European AI and offer an alternative model of development that values openness, collaboration, and democratic principles—while reducing Big Tech’s influence over European citizens’ data.

What Is a Language Model?

OpenEuroLLM is designed to build open-source language models that adapt to the specific context in which they are used, speaking the appropriate language and becoming valuable digital tools for public and private organizations.

A language model is an AI system trained on vast amounts of text to understand, process, and generate human language. At its core, it works by predicting the most likely next word in a sequence. Repeated word after word, this simple mechanism lets the model produce coherent and contextually appropriate text, answer questions, and support tasks such as translation, summarization, and creative writing.
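
To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is a toy illustration only, not how OpenEuroLLM or any real LLM is built: it simply counts which word follows which in a tiny made-up text and returns the most frequent follower. The function names (train_bigram, predict_next) and the sample corpus are hypothetical, chosen for this example.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follow[current][nxt] += 1
    return follow


def predict_next(follow, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = follow.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Tiny made-up corpus (an assumption for illustration only).
corpus = "the model reads text . the model predicts words . the model reads text"
model = train_bigram(corpus)

print(predict_next(model, "model"))  # "reads" follows "model" twice, "predicts" once
```

A real language model replaces these raw counts with billions of learned parameters and looks at far more context than a single preceding word, but the task it is trained on is the same: guess what comes next.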

The most advanced systems—Large Language Models (LLMs)—process information using billions of parameters, enabling them to perform complex tasks and generate sophisticated responses.

Strategic Investments and Partners

With a total budget of €37.4 million—€20.6 million of which is funded by the Digital Europe Programme—OpenEuroLLM stands out as one of the most important initiatives for building accessible, open, and value-aligned AI in Europe.

The project is driven by a consortium of 20 organizations across the continent, including research centers, companies, and high-performance computing institutions. Italy is represented by Cineca, the supercomputing consortium based near Bologna, which contributes supercomputing infrastructure and Italian-language models for public administration and private sector innovation.

Key partners also include the Barcelona Supercomputing Center, CSC (Finland), and SURF (Netherlands), along with 11 universities and five companies, among them Germany’s Aleph Alpha and France’s LightOn, the first publicly listed European generative AI company.

STEP Recognition

In recognition of the project’s strategic value, the European Commission awarded OpenEuroLLM the STEP (Strategic Technologies for Europe Platform) seal.

This prestigious recognition is a mark of excellence that facilitates access to additional funding and increases visibility to institutions and investors through the InvestEU platform. It is granted to projects that demonstrate outstanding quality and innovation across five EU-funded programs, including Digital Europe.

Open-Source Models for Digital Autonomy

The models developed by OpenEuroLLM will be open-source, transparent, and compliant with European regulations. This will allow public administrations and companies to innovate without handing over their data to proprietary Big Tech AI systems.

Thanks to strict adherence to the European regulatory framework on privacy, security, and algorithmic transparency, these models will help democratize access to high-quality AI technologies, ensure legal and ethical compliance, and preserve the EU’s rich linguistic and cultural diversity.

The project also collaborates with communities and organizations such as LAION, open-sci, and OpenML, as well as with AI experts united under the project’s Open Strategic Partnership Board, ensuring that models, software, datasets, and evaluation metrics are fully open and adaptable to local needs in both the public and private sectors.

Digital Sovereignty and Technological Independence

At the heart of OpenEuroLLM’s vision is the principle of digital sovereignty.

In an era in which data is the “new oil,” Europe can no longer afford to depend on AI technologies developed elsewhere. Such dependence exposes citizens and institutions to risks related to autonomy, security, privacy, and loss of control over their own data.

The project’s open-source approach reflects Europe’s commitment to transparent, fair, and inclusive innovation, in stark contrast with proprietary models developed by global tech giants—often operating as “black boxes” with unknown internal mechanisms.

Linguistic Diversity as a Strength

Europe’s linguistic diversity, often seen as an obstacle, is actually a tremendous opportunity to build more sophisticated and culturally inclusive AI models. Unlike Big Tech, which has historically prioritized English, OpenEuroLLM guarantees equal attention to all 24 official EU languages.

This multilingual focus is not only about cultural inclusion; it is also a competitive advantage. AI that communicates effectively in every EU language can better serve the EU’s roughly 450 million citizens, many of whom prefer to engage in their native tongue.

Moreover, OpenEuroLLM models enable local solutions for local needs: a municipal administration in Italy could use an Italian model to improve citizen services, while a Finnish company might use a Finnish-language model to optimize its internal processes.

A European Model for Transparent Innovation

OpenEuroLLM is a strong example of the kind of technological infrastructure needed to boost the development and refinement of European AI products. It sets out to demonstrate that solutions can be built on transparency, openness, and community involvement, values that are widely recognized in the European tech ecosystem.

At a time of growing public concern about the social impacts of AI, this project shows that it is possible to develop advanced technologies that respect fundamental rights and serve the common good.