Making Quality Visible: Franziska Weindauer on the Path to Reliable Low-Risk AI
Artificial intelligence is finding its way into more and more corporate processes – but how can its quality be reliably assessed, especially in areas where no legal requirements apply? Together with partners from research and industry, MISSION KI is developing a voluntary quality standard for low-risk AI that offers guidance and gives companies a structured procedure to work with.
The TÜV AI Lab plays a key role in this effort. In this conversation, Franziska Weindauer, CEO of the TÜV AI Lab, explains why such a standard is particularly important right now, which requirements are decisive from a testing perspective, and how an approach can be created that is both robust and practical – for large organizations as well as for startups and SMEs.
Ms. Weindauer, the TÜV AI Lab was actively involved in developing the MISSION KI quality standard. What role did you and your team take on specifically – and why was this an important step for you?
At the TÜV AI Lab, we made two main contributions. First, we brought in our testing perspective – that is, the question of how requirements for trustworthy AI can be formulated so that they are genuinely testable and practicable later on. Second, we translated our experience in conformity assessment, testing procedures, and risk management into the development of the standard. This included preparing for the tests themselves, for example determining which test criteria to apply and which tools to use for the evaluation.
This step was important because low-risk AI is barely addressed by regulation, yet in practice makes up the majority of deployed systems. Even here, quality needs to be made measurable early, so that trust issues do not surface only once a system is already widely deployed in the market. The standard closes a gap between voluntary quality initiatives and the binding requirements of the AI Act – and gives SMEs in particular a manageable tool.
The quality standard aims to make trust measurable. How do you translate this ambition into practice?
With the MISSION KI quality standard, we work in a very practice-oriented way: for each system, a so-called protection needs analysis is carried out across various dimensions, such as transparency or reliability. It consists of simple questions with predefined answer options, each of which is assigned a value from 1 to 3. From these values, the protection needs level for each dimension is calculated automatically.
Based on this, we then derive which measures a provider must take in order to meet the respective protection needs and which evidence they can provide. This may include technical evidence, such as test results on robustness, accuracy, or cybersecurity, but also organizational evidence, such as documentation, governance, or human oversight mechanisms.
Crucially, we do not evaluate individual technological components but always the entire system. Only then does a meaningful overall picture emerge, one that can later be reported transparently and comprehensibly.
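To illustrate the two-step logic Weindauer describes – scoring predefined answers from 1 to 3 per dimension, then deriving the required evidence from the resulting protection needs level – here is a minimal sketch in Python. The dimension names, the aggregation rule, and the evidence lists are illustrative assumptions, not the actual catalog of the MISSION KI standard.

```python
# Illustrative sketch of the two-step protection needs logic:
# 1) aggregate questionnaire answers (values 1-3) into a level per dimension,
# 2) map that level to the evidence a provider would have to supply.
# All names and mappings below are assumptions for illustration,
# not the actual catalog of the MISSION KI standard.

# Each predefined answer option carries a value from 1 to 3.
answers = {
    "transparency": [1, 2, 2],
    "reliability": [3, 2, 3],
}

def protection_needs_level(values: list[int]) -> int:
    """Hypothetical aggregation rule: the highest answer value
    determines the dimension's protection needs level (worst case)."""
    return max(values)

# Hypothetical mapping from level to required measures and evidence.
EVIDENCE = {
    1: ["basic documentation"],
    2: ["basic documentation",
        "test results (robustness, accuracy)"],
    3: ["basic documentation",
        "test results (robustness, accuracy)",
        "cybersecurity assessment",
        "human oversight mechanism"],
}

for dimension, values in answers.items():
    level = protection_needs_level(values)
    print(f"{dimension}: level {level} -> {', '.join(EVIDENCE[level])}")
```

Taking the maximum is just one cautious aggregation choice; the actual standard may weight or combine answers differently.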
How do you ensure that the standard works in practice – especially for SMEs or startups that do not have large compliance teams?
That was a core objective from the very beginning. The standard is designed to be scalable: the higher the protection needs, the greater the depth of testing. If a startup develops products with high protection needs, the requirements and the quality assurance effort naturally increase with them. This adaptability to the differing protection needs of AI-based products reflects the diversity of the technology while still enabling valid statements about system quality.
Moreover, the language of the standard is intentionally pragmatic. Following the motto “guidance, not bureaucracy,” we place great emphasis on clear and manageable requirements.
For startups and SMEs, it is particularly important that the standard enables them to demonstrate quality at an early stage. This gives them genuine competitive advantages.
Germany wants to play a more significant international role in artificial intelligence. How can standardization and testing expertise support this ambition?
Standardization is the foundation of international competitiveness. Those who set standards shape markets. This holds even more for AI than for traditional technologies, because development progresses so rapidly here. Testing expertise ensures that these standards do not remain merely on paper but are reliably implemented in practice. This is precisely where the traditional strengths of the German TIC (testing, inspection, and certification) landscape lie: it is known for its neutrality, technical expertise, and established reputation for quality assurance.
If we transfer these competencies to the world of AI – for example through tests for robustness, transparency, or data governance – a clear added value emerges for manufacturers. Companies can examine the quality of their products, demonstrate it, and identify risks early on. This strengthens not only individual companies but also Europe’s strategic position as a hub for trustworthy AI overall.
What has personally influenced or surprised you the most during this process?
I was impressed by how heterogeneous the systems in the low-risk area are, and how important it therefore is to have a testing procedure that remains flexible while still providing clear guidance. The close exchange between science, industry, and testing organizations within MISSION KI has shown how much innovative potential arises when quality considerations are incorporated early.
From your perspective, how far along is Europe in harmonizing AI standards, and where is further progress needed?
Europe is on a good path. The AI Act creates a unified legal foundation for the first time. At the same time, numerous standards are being developed at the European and international levels – for example within CEN/CENELEC standardization committees or at ISO and IEC. We would like to see faster progress here, because there is often a gap between the legislative text and engineering practice. What is needed are more harmonized testing methods, benchmarks, and test procedures that are recognized across Europe. Established standards are a crucial first step.
Looking ahead, how do you expect the testing landscape for AI to develop in the coming years?
I expect a two-track system to emerge: on the one hand, a highly formalized regime for high-risk AI, including conformity assessment and market surveillance; on the other, a growing ecosystem of voluntary evaluations for low-risk AI – similar to what we already know from IT security or data protection.
Testing organizations will increasingly act as “translators” between regulation and technical practice, and AI testing will become more automated and scalable. In short: the testing landscape will become more digital and more dynamic.