MISSION KI
Quality Standard for Low-Risk AI
A Clear Framework for AI Quality
The EU AI Act regulates high-risk AI — but many AI applications fall below this threshold. Nevertheless, customers, partners, and investors expect reliable evidence of quality, safety, and fairness.
To strengthen the competitiveness of German AI development and deployment, MISSION KI, together with leading stakeholders from research, standardization, and testing, has developed a quality standard that makes AI quality systematically measurable and verifiable. The process enables companies to document their quality commitments in a structured way — supporting faster trust-building with customers and investors, more efficient procurement, and targeted preparation for regulatory requirements such as the EU AI Act.
Documents for Download
The Standard
The quality standard translates key principles of trustworthy AI — such as transparency, non-discrimination, and reliability — into six specific quality dimensions, which are systematically converted into verifiable criteria and measurable requirements. This creates a transparent evaluation framework for assessing how well an AI system meets quality requirements.
The document describes the complete assessment process — from the use-case description and protection needs analysis through to the overall evaluation — and provides directly usable templates for implementation. It is aimed at AI providers, AI deployers, and procurement authorities, and serves both as a basis for self-assessments and as a reference for external validation.
Quality Standard – Complete Document
All chapters and appendices in a single file
Download (PDF)
Quality Standard – Individual Documents
All appendices as standalone, ready-to-use templates for the assessment process.
Download (.zip file)
Quality Standard on GitHub
The Assessment Process
Structure and Process of the Quality Assessment
Assessing AI quality is challenging: Which aspects matter, how are they measured, and when can a result be considered robust? The assessment process answers these questions with a scientifically grounded methodology.
It combines established approaches (VDE SPEC 90012, Fraunhofer IAIS Assessment Catalogue, Joint Specification for Trustworthy AI Systems) into an integrated assessment framework. The overview shows how the procedure leads from the definition of the AI system to the final assessment statement.
1. Use Case Description
The AI system is described precisely: What purpose does it fulfill? In which context is it deployed? Which technical components are part of it, and which are not? Specifically: application context, limits of the application domain, output format, mode of operation (local/cloud).
This description defines what the entire assessment refers to – and ensures that the quality statement is clearly bound to a defined area of application.
2. Protection Needs Analysis
Not every quality criterion is equally important for every AI system. The protection needs analysis evaluates: What concrete harms could arise if requirements are not met? (e.g., health risks, discrimination, data protection breaches).
Result: Each criterion is classified as low, moderate, high – or as not applicable. This makes it clear where especially thorough assessment is required and where basic requirements are sufficient.
3. Classification According to the Assessment Requirements
Which quality measures has the company actually implemented? The assessment follows the VCIO system, a four-level hierarchy from top (abstract) to bottom (concretely measurable):
Quality dimensions (e.g., “Reliability,” “Transparency”)
Criteria (e.g., “Performance and robustness”)
Indicators (concrete actions: analyses, measures, evaluations)
Observables (measurable evidence with assessment)
The classification step determines, for each relevant indicator, which observable level the AI system meets:
A = best practice
B = advanced
C = basic
D = not fulfilled.
The principle: the higher the protection need from Step 2, the higher the requirements. The selected classification must be supported by evidence – this takes place in Step 4.
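The four-level VCIO hierarchy above can be pictured as a simple nested data model. The following is a minimal, illustrative sketch: all class names, field names, and example entries are hypothetical and not taken from the standard itself.

```python
from dataclasses import dataclass, field

# Observable levels from the standard: A = best practice, B = advanced,
# C = basic, D = not fulfilled.
LEVELS = ["A", "B", "C", "D"]

@dataclass
class Indicator:
    """A concrete action (analysis, measure, evaluation) with the
    observable level the AI system was classified at."""
    name: str
    achieved_level: str = "D"

@dataclass
class Criterion:
    """A criterion within a quality dimension, carrying the protection
    need assigned in Step 2 ('low', 'moderate', 'high', or 'n/a')."""
    name: str
    protection_need: str
    indicators: list = field(default_factory=list)

@dataclass
class QualityDimension:
    name: str
    criteria: list = field(default_factory=list)

# Hypothetical example: the "Reliability" dimension with one criterion.
reliability = QualityDimension(
    name="Reliability",
    criteria=[
        Criterion(
            name="Performance and robustness",
            protection_need="high",
            indicators=[
                Indicator("Robustness tests against input perturbations", "A"),
                Indicator("Performance evaluation on held-out data", "B"),
            ],
        )
    ],
)
```

The nesting mirrors the top-down direction of the VCIO system: dimensions contain criteria, criteria contain indicators, and each indicator is classified via its observables.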
4. Provision of Evidence
The selected classifications must be systematically substantiated. Depending on the level of protection need, different requirements apply:
Level C (low): Basic documentation + test results
Level B (moderate): Detailed documentation + tests with metadata (e.g., dataset versions, configurations)
Level A (high): Reproducible tests + audit trail + full traceability
Possible evidence includes risk analyses, test reports, process descriptions, and existing certifications (ISO 27001, GDPR).
5. Validation of the Evidence
The classifications and evidence are systematically reviewed. The requirements for the reviewing persons increase with the level of protection need:
Low: Self-validation by the responsible team
Moderate: Review by uninvolved internal experts
High: Validation according to the four-eyes principle by internal reviewers who are hierarchically independent
Optional: External assessment bodies may be involved.
Objective: To ensure that the documented measures actually exist and are plausible.
6. Overall Assessment
Have the requirements been met? The achieved quality levels are summarized for each criterion and compared against the identified level of protection need.
The rule:
High protection need → at least Level A required
Moderate protection need → at least Level B required
Low protection need → at least Level C required
Passed? Yes, if all relevant criteria reach the minimum level. Over-fulfillment is documented.
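The overall-assessment rule above is a straightforward mapping and comparison, sketched below as a minimal example. Function and variable names are illustrative, not terms defined by the standard.

```python
# Each protection-need level maps to a required minimum quality level;
# the assessment passes only if every relevant criterion meets its minimum.
REQUIRED_LEVEL = {"high": "A", "moderate": "B", "low": "C"}
RANK = {"A": 3, "B": 2, "C": 1, "D": 0}  # higher rank = stronger level

def overall_assessment(criteria):
    """criteria: list of (name, protection_need, achieved_level) tuples.
    Criteria classified as 'n/a' in the protection needs analysis are
    skipped. Returns (passed, per-criterion detail)."""
    results = {}
    for name, need, achieved in criteria:
        if need == "n/a":
            continue
        required = REQUIRED_LEVEL[need]
        results[name] = {
            "required": required,
            "achieved": achieved,
            "met": RANK[achieved] >= RANK[required],
            # Over-fulfillment is documented, per the standard's rule.
            "exceeded": RANK[achieved] > RANK[required],
        }
    passed = all(r["met"] for r in results.values())
    return passed, results

# Hypothetical example run:
passed, detail = overall_assessment([
    ("Performance and robustness", "high", "A"),
    ("Transparency of data use", "moderate", "A"),  # over-fulfilled
    ("Documentation quality", "low", "C"),
])
print(passed)  # True: every relevant criterion reaches its minimum level
```

A single criterion below its minimum (say, an achieved "B" against a high protection need) would flip the result to a fail, which matches the "all relevant criteria" wording of the rule.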
7. Documentation of Results
The assessment report summarizes:
System information: application context, components, data use
Protection needs: which criteria were critical, which were not applicable?
Quality levels: achieved levels vs. required levels per criterion
Measures & tests: which quality safeguards were implemented?
Validation: which level of review was carried out?
Statement: which criteria were met / not met / exceeded
The report follows a standard template and is signed by the responsible parties, with a formal declaration of correctness.
8. Monitoring of Validity
The assessment is point-in-time based: the quality statement applies only to the evaluated system version.
Validity ends when there is:
A change in the intended purpose
Significant technical changes (software/hardware)
External changes (e.g., concept drift or data drift, i.e., changes in the input data)
Validity is retained for: organizational adjustments (e.g., new responsibilities) and hardware updates without functional impact.
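The validity rule in Step 8 can be sketched as a small check over observed change types. The change-type labels below are illustrative shorthands for the categories named above, not identifiers defined by the standard.

```python
# Change types that END the validity of the quality statement:
INVALIDATING = {
    "purpose_change",         # change in the intended purpose
    "significant_technical",  # significant software/hardware changes
    "external_change",        # e.g., concept drift or data drift
}

# Change types that do NOT affect validity:
NON_INVALIDATING = {
    "organizational",          # e.g., new responsibilities
    "hardware_no_functional",  # hardware update without functional impact
}

def statement_still_valid(changes):
    """Return False as soon as any invalidating change has occurred
    since the assessed system version."""
    return not any(c in INVALIDATING for c in changes)
```

For example, a purely organizational change leaves the statement valid, while the same change combined with observed data drift would end its validity and trigger a re-assessment.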
Objective: To ensure, through regular monitoring, that the quality statement remains up to date.
Practical implementation
Efficient assessment with the MISSION KI assessment portal
MISSION KI Assessment Portal (Beta)
Your central tool for quality assessment: The assessment portal is now available in a beta version and digitally maps the assessment process. Guided processes and integrated support features facilitate self-assessment for organizations and make it more efficient.
Guide to quality assessment
The guide complements the assessment process in the quality standard with practical guidance and shows, using an example AI system, what the individual assessment steps look like in practice — from the use case description to the validated evidence.
Toolbox for technical tests
A curated collection of proven test methods for bias detection, robustness testing, performance measurement, and more. Each method is directly assigned to the quality dimensions and provided with concrete instructions. Searchable by system type and application domain.
Strategic benefit
Quality as a competitive advantage
In an increasingly regulated AI market, structured proof of quality pays off in multiple ways: through market differentiation, greater acceptance by customers and investors, and reduced compliance risks. The MISSION KI Quality Standard has been developed so that these benefits can be achieved with a reasonable level of effort. Six guiding principles ensure a balance between the robustness of the assessment results and the efficiency of the process:
Voluntariness
The assessment is voluntary and primarily driven by economic motivation. The EU AI Act primarily regulates high-risk AI — for most other AI applications, there are no comparable quality requirements. However, customers, investors, and partners increasingly expect robust evidence of quality, safety, and fairness. The standard enables you to proactively build trust and differentiate yourself through demonstrable quality — especially valuable in tenders and investor pitches.
Practical orientation & efficiency
The standard was specifically designed for use in organizations with limited assessment resources, particularly start-ups and SMEs. The use-case-specific protection needs analysis focuses assessment efforts on truly critical areas. The digital assessment portal automates routine steps and provides context-specific guidance. This results in structured proof of quality with a reasonable level of effort. At the same time, the standard builds on established approaches such as VDE SPEC 90012 and the Fraunhofer IAIS Assessment Catalogue.
Regulatory consistency
The standard was deliberately developed to be compatible with key regulations and standards. The six quality dimensions are based on the EU Ethics Guidelines for Trustworthy AI and are harmonized with the high-risk requirements of the EU AI Act. The processes and documentation produced as part of the assessment — for example on protection needs analysis, data quality, robustness, or human oversight and control — support you in preparing for regulatory requirements. Existing evidence from ISO 27001, GDPR compliance, or other standards is recognized and can be integrated.
Robustness
Robust proof of quality requires objective criteria and traceable procedures. The standard uses a four-level assessment system that defines clear requirements for each quality aspect — from minimum standard to best practice. Technical test methods are documented with reproducible parameters so that results remain verifiable. The test depth can be scaled: from self-assessment to internal validation according to the four-eyes principle, up to the optional involvement of external assessment bodies. The assessment report transparently documents which quality measures were implemented and to what extent.
Comparability
The standard creates comparability through uniform assessment criteria. All assessments follow the same system: six quality dimensions are translated into concrete criteria and measurable requirements. The protection needs analysis ensures that different use contexts are taken into account — while the evaluation logic remains comparable. For procurement authorities, customers, and investors, this means that AI offerings can be assessed and compared in a structured way — regardless of industry, AI provider, or system type.
Accessibility
The standard was deliberately designed to have a low entry barrier in order to make it easier for start-ups and SMEs to get started with systematic AI quality management. The assessment portal guides users step by step through the evaluation process, integrated templates reduce documentation effort, and the Test Method Catalogue provides directly applicable procedures with concrete instructions. Assessment results are communicated in an understandable form — for internal governance as well as for customers, partners, and investors. This makes professional quality management possible even without a dedicated compliance department.
Pilot testing of the assessment procedure
Cross-sector applicability
During its development, the MISSION KI Quality Standard was tested on five real AI systems from the fields of mobility, medical technology, agricultural technology, industry, and the financial sector. The pilot partners evaluated the clarity, feasibility, and efficiency of the process. Their feedback led to clearer assessment steps, more precise requirements, and a more coherent interplay between the protection needs analysis, criteria, and evidence. The result is a practice-refined standard that works across sectors and reflects real development and operational conditions.
BeIntelli
Assessed application
An AI system supports automated buses by detecting relevant traffic lights, assessing their status, and preparing the decision “stop or proceed” – a safety-critical component in urban traffic.
Benefit for BeIntelli
The structured assessment demonstrated how quality requirements, in particular transparency, traceability, and human oversight, can be practically integrated into technological development processes. In particular, safety-critical workflows could be documented more clearly.
Contribution to the standard
The use case contributed to a more precise definition of criteria and assessment steps for safety-relevant applications, especially in the dimensions of transparency, reliability, and human oversight.
CLAAS
Assessed application
An AI system automatically reads and posts delivery notes in goods receiving — a central process for stable workflows and error-free postings in production.
Benefit for CLAAS
The pilot served to
– prepare for upcoming requirements of the EU AI Act,
– enable transparent documentation for the works council and compliance,
– systematically validate the AI models in use.
The standard also helped make existing control mechanisms more visible and strengthen internal governance.
Contribution to the standard
The use case particularly sharpened the Human Oversight & Control dimension and helped define evidence formats more consistently.
enamentis
Assessed application
Cir.Log® uses AI-based image analysis to identify and document surgical instruments, a safety-relevant step in surgical preparation and set management.
Benefit for enamentis
The assessment helped to systematically evaluate the robustness of the recognition methods, data quality, and documentation processes. Particularly valuable: the structured derivation of requirements for transparency, reliability, and AI-specific cybersecurity.
Contribution to the standard
The use case provided key input for the refinement of criteria in medical-related applications, especially with regard to evidence on data quality and robustness.
KfW
Assessed application
The AI application KARINA analyzes news and market information in order to identify potential credit risks at an early stage, a central building block for the bank’s data-driven decision-making.
Benefit for KfW
Participation helped to
– systematically review transparency and reliability,
– clearly document role distributions between humans and AI,
– benchmark internal quality measures against future certification requirements.
Contribution to the standard
The use case provided concrete input for sharpening the criteria for text analysis systems, in particular with regard to transparency, explainability, and reliability.
ZEISS
Assessed application
An AI system forecasts daily order volumes to optimize production planning and capacity. Reliability and the protection of proprietary data are critical in this context.
Benefit for ZEISS
The validation supported:
– identification of blind spots,
– further development of existing quality processes,
– assessment of cybersecurity around sensitive data.
Contribution to the standard
ZEISS provided important input for the concretization of requirements for forecasting models and AI-specific cybersecurity.
FAQs
Who is the standard intended for?
The MISSION KI Quality Standard is primarily aimed at AI providers, in particular start-ups and small and medium-sized enterprises (SMEs) that want to demonstrate and systematically improve the quality and trustworthiness of their AI applications. The standard provides a structured approach for the internal assessment, documentation, and external communication of quality measures.
It supports AI providers in:
systematizing processes and responsibilities in dealing with AI,
creating transparency for customers, partners, and supervisory authorities,
and preparing early for future regulatory requirements.
The MISSION KI Quality Standard thus helps innovative companies in particular to build trust, reduce risks, and establish quality as a competitive advantage.
Can other stakeholders also benefit from the standard?
Yes. In addition to AI providers, customers, investors, supervisory authorities, and procurement authorities also benefit from the application of the standard. Uniform evaluation criteria create comparability, foster trust, and facilitate well-informed decisions in procurement, investment, and assessment processes.
For contracting authorities and partner companies, the standard provides an objective basis for assessing the quality and reliability of AI systems. Investors, in turn, gain greater transparency into a company’s maturity level and risk management.
In this way, the MISSION KI Quality Standard not only contributes to quality assurance for individual systems, but also strengthens trust, market transparency, and accountability across the entire AI ecosystem.
Is the application of the MISSION KI standard mandatory?
No. The MISSION KI Quality Standard is voluntary. Its value lies in building trust, reducing risks, and preparing early for regulatory requirements – especially with regard to the AI Act and future certification frameworks.
How do I carry out an assessment according to the MISSION KI quality standard?
The MISSION KI Quality Standard describes a complete assessment process and provides all the necessary templates – including use case description, protection needs analysis, Assessment Catalogue, Test Method Catalogue, and assessment report template. Organizations can carry out the assessment independently on the basis of these documents.
In addition, the MISSION KI assessment portal is available to facilitate the application. The portal digitizes the assessment process and guides users step by step through the entire workflow.
Can the assessment be carried out independently by organizations themselves?
Yes. The standard serves as a basis for organizations to carry out self-assessments of their AI systems and can optionally be supplemented by validation from external assessors or assessment bodies.
How much time should be planned for carrying out an assessment?
The time required for an assessment typically ranges between four and eight working days. The exact duration depends on several factors:
Existing documentation: Organizations with established quality management processes and existing technical documentation can build on available materials and thus significantly reduce the effort.
System complexity: The type and scope of the AI system, as well as the number of relevant quality dimensions, influence the duration of the assessment.
Available evidence: Tests already carried out, existing metrics, and existing risk analyses speed up the process.
Team availability: The timely availability of relevant contacts and experts from development and operations has a direct impact on the turnaround time.
For well-prepared organizations with structured processes, the effort tends toward the lower end of the range, while initial assessments without established documentation structures require correspondingly more time.
Which documents and evidence are required for the assessment?
The following are needed for the assessment:
the assessment procedure document for the MISSION KI quality standard, including the explanations and annexes contained therein
existing documentation of your AI system, as well as, if applicable, of your IT landscape
access to data, processes, and tests
How are my data protected in the MISSION KI assessment portal?
The MISSION KI assessment portal has been designed in such a way that the collection of sensitive data is avoided. The self-assessment primarily takes place in your own system environment. Documented evidence and assessment results remain under your control. The portal serves as a structuring tool and stores only the data necessary for carrying out the assessment.
Who has access to my data during the assessment?
In a self-assessment, only the persons designated and authorized by your organization have access to the assessment data. Depending on the chosen test depth, these may include members of the responsible team, independent internal assessors, or – in the case of voluntary external validation – accredited assessment bodies. Access rights are controlled and documented by your organization.
How is the AI quality standard related to other regulations and standards?
The MISSION KI Quality Standard is closely linked to other regulations and standards. It serves as a practice-oriented complement to and concretization of existing legal requirements, international standards, and assessment catalogues, such as the EU AI Act, the Fraunhofer IAIS Assessment Catalogue, VDE SPEC 90012, or standards of ISO/IEC JTC 1/SC 42. The standard provides guidance for companies to facilitate compliance with regulatory requirements. At the same time, it ensures that AI systems are developed and operated in accordance with global best practices and ethical principles.
Through its regulatory consistency with other regulations (e.g., the GDPR or MDR) and standards, the AI Quality Standard supports companies in harmonizing their processes and ensuring a consistent and trustworthy implementation of AI technologies.
What does “compatible with the AI Act” mean?
“Compatible with the AI Act” means that the standard is aligned with the European AI Act. Although the standard goes into more detail than the abstract regulatory requirements, it nevertheless includes provisions that are substantively linked to the legal text. In addition, the requirements of the Quality Standard do not conflict with the AI Act.
Companies can therefore use the standard as a reference point for building effective AI compliance management. At the same time, due to this compatibility, meeting the legal requirements or established standards also makes it easier to achieve compliance with the Quality Standard.
Beyond the AI Act itself, compatibility with sector-specific regulation, for example, was also taken into account during the development of the standard.