Swiss AI sovereignty – market opportunity or really complicated?
- Jürg Stuker
- 4 days ago
- 2 min read
The AI systems we are about to deploy are not only produced in the US or China; in most cases they also run in a cloud setup where the second endpoint is neither in Switzerland nor under our control.
Since hosting models locally with tools like Ollama, GPT4All or OpenLLM is quite an easy fix, let’s talk about the elephant in the room: building and tuning models under your control.
To structure this, I will use the free software definition from GNU: “the users have the freedom to run, copy, distribute, study, change and improve the software”. These freedoms apply both to a fully functioning system and to discrete elements of a system.
In the case of AI, the broader challenge lies in the governance of the data: Do I have access to the data from which the system is built, and to all the information and configuration needed to rebuild the same weights? The Open Source Initiative (OSI) has published a white paper about Open Source AI, which states that only sufficiently detailed information about the training data needs to be open, and not the data itself. So the next time you talk about AI sovereignty, go through the following table and tick off each row.
Freedom to use | Legal terms. |
Freedom to study | Availability of source code. |
Freedom to modify | Fine-tuning only? OR Information and configuration to rebuild the model (incl. tuning and alignment)? |
Freedom to share | Legal terms. |
Training data | Sufficiently detailed information about training data? OR Training data itself is openly accessible? |
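If you prefer to keep such a checklist next to your model inventory, the table can be written down as a small data structure. The following is only a minimal sketch in Python: the class name, the field names and the example values are my own illustrative assumptions and not part of any standard or of the OSI text.

```python
from dataclasses import dataclass, fields


@dataclass
class SovereigntyChecklist:
    """One flag per row of the table; all field names are illustrative."""
    free_to_use: bool          # legal terms permit use
    free_to_study: bool        # source code is available
    fine_tuning_allowed: bool  # weaker form of "freedom to modify"
    rebuild_possible: bool     # stronger form: info and config to rebuild the model
    free_to_share: bool        # legal terms permit sharing
    data_documented: bool      # sufficiently detailed information about training data
    data_accessible: bool      # training data itself is openly accessible

    def open_items(self) -> list[str]:
        """Return the names of all criteria that are not met, in table order."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


def summarize(checklist: SovereigntyChecklist) -> str:
    """Render the unmet criteria as a single line of text."""
    missing = checklist.open_items()
    if not missing:
        return "all criteria met"
    return "open: " + ", ".join(missing)


# Hypothetical example: a model with open weights and permissive terms,
# but without accessible training data or a reproducible training recipe.
example = SovereigntyChecklist(
    free_to_use=True,
    free_to_study=True,
    fine_tuning_allowed=True,
    rebuild_possible=False,
    free_to_share=True,
    data_documented=True,
    data_accessible=False,
)
print(summarize(example))
```

Splitting "freedom to modify" and "training data" into a weaker and a stronger flag mirrors the OR in the table: a model can pass the weaker test and still leave you unable to rebuild it.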
Both the data aspect and the modification of the system (beyond fine-tuning) are very complex. Personally, I am much more concerned about the data, because of questions like these:
Where does the data come from and how can it be traced?
How was the data selected to avoid bias and meet ethical standards?
How was the data prepared, classified and anonymized?
How is the data licensed and is the value generated shared?
Do the owners know that it is being used, and can they opt out?
With all this in mind, I leave the floor to an upcoming foundation model with Swiss societal values, which is currently being built in a collaboration between EPFL and ETH. Besides having nice hardware, they will start by training on an open-source corpus. Let’s see where they will end up…
And finally, how will you define sovereignty the next time you use the word? As an opportunity, or as really complicated?