Guide to Quality Assessment
Overview
This guide explains the assessment procedure of the MISSION KI Quality Standard step by step and uses the example system “PeWi” to illustrate what each step looks like in practice. The aim is to make the standard easier to apply and to give organisations clear, traceable guidance for conducting their own quality assessment.
Who this is for
Start-ups and SMEs with low-risk AI systems
Product, engineering/development, data, quality and compliance teams
Organisations that want to carry out an independent quality assessment of their AI systems
What you need
MISSION KI Quality Standard
Documentation of your AI system
Access to data, processes and tests
Example system PeWi
PeWi is an AI-based forecasting tool developed by a German weather-service start-up. The system combines data from weather models with local information (e.g. buildings, road network) to forecast near-ground temperatures and precipitation at street level.
Municipal and private winter road maintenance services use these granular forecasts to optimise duty rosters and staff deployment based on weather conditions – with the goal of ensuring road safety while keeping overtime and premium-pay costs under control. PeWi thus exemplifies a practical low-risk AI system: high operational impact, but no immediate high-risk classification under the EU AI Act.
All of the following assessment steps are explained using this PeWi example.
Course of the assessment procedure
1. Use case description
What happens in this step
You document the purpose, functionality, inputs, outputs, boundaries and the application context of your AI system. This description forms the basis for all subsequent process steps (protection needs analysis, Assessment Catalogue and assessment report).
PeWi example
In the case of PeWi, the entire application context was first described in order to clearly record what the system is used for and which data it operates on. Among other things, the following aspects were documented:
Inputs: weather model data, geodata, measurement stations
Outputs: map-based temperature and precipitation forecasts
User roles: operations management / dispatching for winter road maintenance
System boundaries: no forecast is generated if required data sources are not available
Practical tips for step 1
☑️ Answer all questions completely and truthfully.
☑️ Do not treat the word limit too strictly if you consider a more detailed description necessary.
☑️ Clarify uncertainties and ambiguities with the responsible colleagues and resolve them.
2. Protection needs analysis
What happens in this step
For every criterion of the quality standard you check whether it is applicable to the AI system and which protection need applies (e.g. low, moderate, high). The protection needs analysis thus determines how deeply the criteria have to be assessed later on and what requirements apply to validation.
PeWi example
As part of the assessment of PeWi, the criteria were systematically reviewed in the protection needs analysis and assigned protection needs. Among others, the following classifications were made:
Data quality → protection need moderate (B)
Human oversight and control → protection need high (A)
AI-specific cybersecurity → protection need high (A)
Fairness → not applicable (D), as no personal characteristics are processed
In the PeWi practical example, the questions were filled in and commented on as completely as possible in order to provide many examples. Such a highly granular way of working is not always strictly necessary. Please also note the practical tips for step 2.
Importance of the protection needs analysis for the further process
The protection needs analysis determines how rigorously the evidence must later be validated. The assessment depth relates only to validation, not to the creation of the content.
D – not applicable: the criterion is omitted entirely
C – low protection need: validation can be carried out by subject-matter experts from the project team
B – moderate protection need: validation by internal, independent persons (not involved in development)
A – high protection need: validation according to the four-eyes principle by two organisationally independent persons – at least one with audit experience and at least one with AI/technical expertise
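The four assessment depths above can be sketched as a simple lookup. This is an illustrative aid, not part of the standard; the role descriptions are paraphrased from this guide, and the function and key names are assumptions for the sketch:

```python
# Illustrative lookup of who may validate evidence at each assessment depth.
# Role descriptions are paraphrased from this guide; names are assumptions.

VALIDATION_REQUIREMENTS = {
    "D": "criterion omitted entirely (not applicable)",
    "C": "subject-matter experts from the project team",
    "B": "internal, independent persons not involved in development",
    "A": ("four-eyes principle: two organisationally independent persons, "
         "at least one with audit experience and one with AI/technical expertise"),
}

def required_validators(protection_need: str) -> str:
    """Return the validation requirement for a protection-need level A-D."""
    return VALIDATION_REQUIREMENTS[protection_need.upper()]

print(required_validators("b"))
# internal, independent persons not involved in development
```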
Practical tips for step 2
☑️ Answer the questions for each criterion in the order Applicability Questions (1) → Basic Questions (2) → Extension Questions (3). If the Applicability Questions result in “not applicable”, the criterion is omitted entirely. If one of the Basic Questions is answered with “3”, the criterion is necessarily applicable – further detailed assessment via the Extension Questions is then not required, which saves time.
☑️ Use the comment function where helpful – both to ensure later traceability for others and to critically reflect on your own answers in borderline cases.
3. Rating against the assessment requirements
What happens in this step
In this step, the assessment depths determined in the protection needs analysis are applied to the Assessment Catalogue. For each applicable criterion:
the associated indicators are considered,
the assessment requirements of the required level (A–D) are read,
a control statement is formulated describing which measures have been implemented in the AI system,
suitable evidence is assigned, and
the level achieved is determined.
The Assessment Catalogue clearly specifies the assessment requirements that must be fulfilled for each level. The rating results from comparing
the required level (from the PNA) and
the demonstrably achieved level (based on measures and evidence).
Step 3 is therefore the core of the quality assessment.
PeWi example:
As part of the assessment of PeWi, three indicators were fully developed and evaluated as illustrative examples. The corresponding control statements were documented in the reference document in column D, “Control statement (completed by the auditee)”. The control statements record, for each indicator, which measures have been implemented and how PeWi meets the required assessment requirements of the respective level:
VE1.1 – Risk analysis for performance & robustness (Level C)
The control statement describes annual qualitative risk analyses, prioritisation of hazards, and designated responsibilities.
VE1.2 – Performance: metrics & tests (Level C)
The control statement documents that suitable metrics and tests have been defined to verify whether the AI system achieves its intended functionality.
DA1.1 – Data quality, protection & governance (Level B)
The control statement describes the complete documentation of the characteristics of each dataset used, as well as all processing steps (e.g., feature explanation, labeling, cleaning). In addition, systematic quality checks are carried out for each dataset, including checks for completeness, consistency, and plausibility.
Practical tips for step 3
☑️ For each criterion, take over the assessment depth from the PNA.
☑️ Read the observables of the respective level in full and interpret them in your own context.
☑️ Formulate one concrete, system-specific control statement per indicator.
☑️ Assign evidence clearly.
☑️ Justify the achieved level (A–D) in a logically consistent way.
☑️ Note: One piece of evidence can cover multiple indicators. It does not have to be listed multiple times; you can refer to it via a cross-reference.
4. Providing evidence
What happens in this step:
All evidence demonstrating that the described measures have actually been implemented is collected and documented in an evidence list (e.g., documents, reports, tests, process descriptions).
PeWi example:
As part of the assessment of PeWi, suitable evidence was provided for the three illustrative indicators (see step 3). This evidence was documented in the reference document in column E, “Illustrative content excerpt from evidence”.
For PeWi, the following evidence was recorded, among others:
VE1.1 – Risk analysis for performance & robustness
Evidence: Excerpt from the risk analysis report, including among other things:
identified risks
designated responsibilities
qualitative impacts
prioritised hazards
recommended measures
VE1.2 – Performance: metrics & tests
Evidence: Excerpt from the test plan, including among other things:
defined quality metrics
established thresholds
subject-matter justifications
regular evaluations
supplementary user assessments
DA1.1 – Data quality
Evidence: Excerpt from the dataset description, including among other things:
scope and structure of the datasets
data types used
origin and purpose of the data
quality assurance measures
documented processing steps
Practical tips for step 4:
☑️ Record all evidence mentioned in the control statements.
☑️ Give evidence items clear names (including version/date) and ideally link them directly to internal documents — this makes later review easier.
☑️ Document your technical tests in a reproducible way (e.g., data splits, seeds, parameters).
☑️ Combine organisational and technical evidence in a reasonable way.
☑️ Ensure that the central evidence list is complete and up to date.
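The tip on documenting technical tests reproducibly can be illustrated with a minimal sketch: fix the seed, record the split parameters, and store them alongside the results so the run can be repeated exactly. The function, file and field names here are illustrative assumptions, not part of the standard:

```python
# Minimal sketch of reproducible test documentation: same seed -> same split,
# and the parameters are recorded next to the result as evidence.
# Names and fields are illustrative assumptions.
import json
import random

def make_split(n_samples: int, test_fraction: float, seed: int):
    """Deterministic train/test index split."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(n_samples * (1 - test_fraction))
    return indices[:cut], indices[cut:]

# Parameters recorded alongside the test results make the run reproducible.
params = {"n_samples": 100, "test_fraction": 0.2, "seed": 42}
train_idx, test_idx = make_split(**params)

evidence_record = {
    "test": "hold-out evaluation",
    "parameters": params,
    "train_size": len(train_idx),
    "test_size": len(test_idx),
}
print(json.dumps(evidence_record, indent=2))
```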
5. Validation of evidence
What happens in this step
The collected evidence is reviewed: Is it complete, plausible, up to date and consistent with the rating in the Assessment Catalogue? Who is allowed to carry out this validation depends on the assessment depth determined in the PNA.
PeWi example
For PeWi, the responsible persons for validation were derived directly from the protection needs and assessment depths. The assessment report documents how this was implemented:
For criteria with assessment depth A (high protection need), the CTO (with audit experience) and the administrator validate jointly, thereby implementing the required four-eyes principle.
For criteria with assessment depth B (moderate protection need), the CTO, as an independent subject-matter expert, carries out the validation.
For criteria with assessment depth C (low protection need), validation is performed by subject-matter experts in the project team.
Practical tips for step 5
☑️ Make sure that relevant and current evidence exists for every criterion.
☑️ Ensure that evidence covers the aspects required in the observables and that the rating in the Assessment Catalogue is plausibly supported by this evidence.
☑️ Document the validation results with date, persons involved and outcomes.
6. Overall assessment
What happens in this step
All ratings at indicator level are aggregated per criterion and compared with the protection need from the PNA. The minimum principle applies: the lowest achieved level of a criterion is decisive. The outcome is a statement on whether all applicable criteria are fulfilled.
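The aggregation described above can be sketched in a few lines. The rank ordering (A highest, C lowest assessable level; D excluded beforehand as not applicable) follows this guide; the function names are assumptions for the sketch:

```python
# Illustrative sketch of the step-6 aggregation under the minimum principle.
# Level ordering is taken from this guide; names are assumptions.

RANK = {"C": 1, "B": 2, "A": 3}  # D = not applicable, excluded beforehand

def criterion_level(indicator_levels):
    """Minimum principle: the lowest level achieved by any indicator
    determines the level of the whole criterion."""
    return min(indicator_levels, key=RANK.__getitem__)

def verdict(required, achieved):
    """Compare protection need (target) with the achieved level (actual)."""
    if RANK[achieved] > RANK[required]:
        return "exceeded"
    if RANK[achieved] == RANK[required]:
        return "fulfilled"
    return "not fulfilled"

# Example: two indicators reach B, one only C -> criterion level C.
level = criterion_level(["B", "C", "B"])
print(level)                # C
print(verdict("C", level))  # fulfilled
print(verdict("B", level))  # not fulfilled
```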
PeWi example
For PeWi, the ratings for all applicable criteria were consolidated. The result:
10 out of 10 applicable criteria are fulfilled,
several criteria are even exceeded (a higher level was achieved than required).
Practical tips for step 6
☑️ Aggregate the indicators correctly to obtain one level per criterion.
☑️ Compare protection need (target) and achieved level (actual) for each criterion.
☑️ Document clearly whether each criterion is fulfilled, exceeded or not fulfilled.
☑️ Highlight any over-fulfilment as a particularly positive feature, which you can, for example, use in external communication.
☑️ Draw up appropriate action plans for any criteria that are not fulfilled.
7. Results documentation
What happens in this step
All results of the quality assessment are compiled in an assessment report. The report serves internal documentation purposes and can also be used externally to build trust (e.g. with customers, partners, investors).
PeWi example
The PeWi assessment report includes, among other things:
Description of the AI system (purpose, version, components, data sources)
Summary of the protection needs analysis
Presentation of the ratings per criterion
Overview of key evidence
Overall assessment and conclusion on the fulfillment of quality requirements
Information on validity and required re-assessments
Signatures of the responsible persons
Practical tips for step 7
☑️ Transfer the data carefully to the existing assessment documents and ensure that all mandatory sections of the assessment report are completed.
☑️ State clearly the assessed system version of your AI system and any conditions of validity.
☑️ Ensure that relevant evidence is referenced in the report. Even if external parties cannot or can only partially view it, this level of detail supports transparency.
☑️ Use the opportunity to highlight over-fulfilment and particular strengths. The assessment report also serves as a means of external communication.
☑️ Obtain the signatures of the responsible persons – only then does your assessment report become valid.
8. Monitoring validity
What happens in this step
You define the conditions under which the validity of the assessment ends and when a re-assessment is required (e.g. in the case of major technical changes, new data sources or detected drifts).
PeWi example
For PeWi, the assessment report records that the assessment statement is tied to a specific system version and specific data sources (e.g. the weather model used). If the model, data sources or application context change significantly, a renewed assessment of the affected criteria is required.
Checklist for step 8
☑️ Define clear rules for when a re-assessment is required (e.g. model change, new data sources, changed application context). You can refer back to the validity conditions specified in the assessment report.
☑️ Establish appropriate monitoring for your AI system in order to detect irregularities as early as possible (e.g. performance drift).
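Performance-drift monitoring, as mentioned in the checklist, can be as simple as watching a rolling mean of forecast errors. The window size, threshold and metric below are illustrative assumptions; a real deployment would tie them to the documented test plan:

```python
# Minimal drift-monitoring sketch: flag drift when the rolling mean error
# exceeds a fixed threshold. Window, threshold and metric are assumptions.
from collections import deque

class DriftMonitor:
    """Flags drift when the rolling mean error exceeds a fixed threshold."""

    def __init__(self, window: int, threshold: float):
        self.errors = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, error: float) -> bool:
        """Record one forecast error; return True if drift is suspected."""
        self.errors.append(error)
        if len(self.errors) < self.errors.maxlen:
            return False  # not enough data yet
        return sum(self.errors) / len(self.errors) > self.threshold

monitor = DriftMonitor(window=3, threshold=1.0)
for err in [0.4, 0.5, 0.6, 2.5, 2.8]:
    print(monitor.observe(err))
# False, False, False, then True, True once the rolling mean exceeds 1.0
```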
MISSION KI Assessment Portal (Beta)
Your central tool for quality assessment: The assessment portal is now available in a beta version and digitally maps the assessment process. Guided processes and integrated support features facilitate self-assessment for organizations and make it more efficient.
Toolbox for technical tests
A curated collection of proven test methods for bias detection, robustness testing, performance measurement, and more. Each method is directly assigned to the quality dimensions and provided with concrete instructions. Searchable by system type and application domain.
The MISSION KI
Quality Standard
To strengthen the competitiveness of German AI development and deployment, MISSION KI, together with leading stakeholders from research, standardization, and testing, has developed a quality standard that makes AI quality systematically measurable and verifiable.
FAQs
Who is the standard intended for?
The MISSION KI Quality Standard is primarily aimed at AI providers, in particular start-ups and small and medium-sized enterprises (SMEs) that want to demonstrate and systematically improve the quality and trustworthiness of their AI applications. The standard provides a structured approach for the internal assessment, documentation, and external communication of quality measures.
It supports AI providers in:
systematizing processes and responsibilities in dealing with AI,
creating transparency for customers, partners, and supervisory authorities,
and preparing early for future regulatory requirements.
The MISSION KI Quality Standard thus helps innovative companies in particular to build trust, reduce risks, and establish quality as a competitive advantage.
Can other stakeholders also benefit from the standard?
Yes. In addition to AI providers, customers, investors, supervisory authorities, and procurement authorities also benefit from the application of the standard. Uniform evaluation criteria create comparability, foster trust, and facilitate well-informed decisions in procurement, investment, and assessment processes.
For contracting authorities and partner companies, the standard provides an objective basis for assessing the quality and reliability of AI systems. Investors, in turn, gain greater transparency into a company’s maturity level and risk management.
In this way, the MISSION KI Quality Standard not only contributes to quality assurance for individual systems, but also strengthens trust, market transparency, and accountability across the entire AI ecosystem.
Is the application of the MISSION KI standard mandatory?
No. The MISSION KI Quality Standard is voluntary. Its value lies in building trust, reducing risks, and preparing early for regulatory requirements – especially with regard to the AI Act and future certification frameworks.
How do I carry out an assessment according to the MISSION KI quality standard?
The MISSION KI Quality Standard describes a complete assessment process and provides all the necessary templates – including use case description, protection needs analysis, Assessment Catalogue, Test Method Catalogue, and assessment report template. Organizations can carry out the assessment independently on the basis of these documents.
In addition, the MISSION KI assessment portal is available to facilitate the application. The portal digitizes the assessment process and guides users step by step through the entire workflow.
Can the assessment be carried out independently by organizations themselves?
Yes. The standard serves as a basis for organizations to carry out self-assessments of their AI systems and can optionally be supplemented by validation from external assessors or assessment bodies.
How much time should be planned for carrying out an assessment?
The time required for an assessment typically ranges between four and eight working days. The exact duration depends on several factors:
Existing documentation: Organizations with established quality management processes and existing technical documentation can build on available materials and thus significantly reduce the effort.
System complexity: The type and scope of the AI system, as well as the number of relevant quality dimensions, influence the duration of the assessment.
Available evidence: Tests already carried out, existing metrics, and existing risk analyses speed up the process.
Team availability: The timely availability of relevant contacts and experts from development and operations has a direct impact on the turnaround time.
For well-prepared organizations with structured processes, the effort tends toward the lower end of the range, while initial assessments without established documentation structures require correspondingly more time.
Which documents and evidence are required for the assessment?
The following are needed for the assessment:
the assessment procedure document for the MISSION KI quality standard, including the explanations and annexes contained therein
existing documentation of your AI system, as well as, if applicable, of your IT landscape
access to data, processes, and tests
How are my data protected in the MISSION KI assessment portal?
The MISSION KI assessment portal has been designed in such a way that the collection of sensitive data is avoided. The self-assessment primarily takes place in your own system environment. Documented evidence and assessment results remain under your control. The portal serves as a structuring tool and stores only the data necessary for carrying out the assessment.
Who has access to my data during the assessment?
In a self-assessment, only the persons designated and authorized by your organization have access to the assessment data. Depending on the chosen test depth, these may include members of the responsible team, independent internal assessors, or – in the case of voluntary external validation – accredited assessment bodies. Access rights are controlled and documented by your organization.
How is the AI quality standard related to other regulations and standards?
The MISSION KI Quality Standard is closely linked to other regulations and standards. It serves as a practice-oriented complement to and concretization of existing legal requirements, international standards, and assessment catalogues, such as the EU AI Act, the Fraunhofer IAIS Assessment Catalogue, VDE SPEC 90012, or standards of ISO/IEC JTC 1/SC 42. The standard provides guidance for companies to facilitate compliance with regulatory requirements. At the same time, it ensures that AI systems are developed and operated in accordance with global best practices and ethical principles.
Through its regulatory consistency with other regulations (e.g., the GDPR or MDR) and standards, the AI Quality Standard supports companies in harmonizing their processes and ensuring a consistent and trustworthy implementation of AI technologies.
What does “compatible with the AI Act” mean?
“Compatible with the AI Act” means that the standard is aligned with the European AI Act. Although the standard goes into more detail than the abstract regulatory requirements, it nevertheless includes provisions that are substantively linked to the legal text. In addition, the requirements of the Quality Standard do not conflict with the AI Act.
Companies can therefore use the standard as a reference point for building effective AI compliance management. At the same time, due to this compatibility, meeting the legal requirements or established standards also makes it easier to achieve compliance with the Quality Standard.
Beyond the AI Act itself, the development of the standard also pays attention to compatibility with other legislation, for example sector-specific regulation.
Implementation partners
The development of our MISSION KI Quality Standard is supported by a strong partnership of leading institutions.