MISSION KI Quality Standard

Guide to Quality Assessment

Overview

This guide explains the assessment procedure of the MISSION KI Quality Standard step by step and uses the example system “PeWi” to illustrate what each step looks like in practice. The aim is to make the standard easier to apply and to give organisations clear, traceable guidance for conducting their own quality assessment.

Who this is for

  • Start-ups and SMEs with low-risk AI systems

  • Product, engineering/development, data, quality and compliance teams

  • Organisations that want to carry out an independent quality assessment of their AI systems

What you need

  • MISSION KI Quality Standard

  • Documentation of your AI system

  • Access to data, processes and tests

Example system PeWi

PeWi is an AI-based forecasting tool developed by a German weather-service start-up. The system combines data from weather models with local information (e.g. buildings, road network) to forecast near-ground temperatures and precipitation at street level.

Municipal and private winter road maintenance services use these granular forecasts to optimise duty rosters and staff deployment based on weather conditions, with the goal of ensuring road safety while keeping overtime and premium pay costs under control. PeWi thus exemplifies practical low-risk AI systems with high operational impact that do not fall under an immediate high-risk classification under the EU AI Act.

All of the following assessment steps are explained using this PeWi example.

Course of the assessment procedure


1. Use case description

What happens in this step
You document the purpose, functionality, inputs, outputs, boundaries and the application context of your AI system. This description forms the basis for all subsequent process steps (protection needs analysis, Assessment Catalogue and assessment report).

PeWi example
In the case of PeWi, the entire application context was first described in order to clearly record what the system is used for and which data it operates on. Among other things, the following aspects were documented:

  • Inputs: weather model data, geodata, measurement stations

  • Outputs: map-based temperature and precipitation forecasts

  • User roles: operations management / dispatching for winter road maintenance

  • System boundaries: no forecast is generated if required data sources are not available
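Teams that keep their documentation in structured form could capture the facts above as a machine-readable record. The following is only an illustrative sketch: every field name is an assumption chosen for this example, and the standard's own use case template governs the actual format.

```python
# Illustrative sketch (not part of the standard's templates): the PeWi
# use case description from step 1 as a structured record. All field
# names are assumptions chosen for this example.
use_case = {
    "system": "PeWi",
    "purpose": "street-level forecasts of near-ground temperature and precipitation",
    "inputs": ["weather model data", "geodata", "measurement stations"],
    "outputs": ["map-based temperature and precipitation forecasts"],
    "user_roles": ["operations management / dispatching for winter road maintenance"],
    "system_boundaries": [
        "no forecast is generated if required data sources are not available",
    ],
}

# A quick completeness check before moving on to the protection needs analysis.
required_fields = {"system", "purpose", "inputs", "outputs",
                   "user_roles", "system_boundaries"}
missing = required_fields - use_case.keys()
print("complete" if not missing else f"missing: {sorted(missing)}")  # prints "complete"
```

A structured record like this makes the later steps easier, because the protection needs analysis and the assessment report draw on exactly these fields.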

Practical tips for step 1

  • ☑️ Answer all questions completely and truthfully.

  • ☑️ Treat the word limit as a guideline: exceed it if you consider a more detailed description necessary.

  • ☑️ Clarify uncertainties and ambiguities with the responsible colleagues and resolve them.

2. Protection needs analysis

What happens in this step
For every criterion of the quality standard you check whether it is applicable to the AI system and which protection need applies (e.g. low, moderate, high). The protection needs analysis (PNA) thus determines how deeply the criteria have to be assessed later on and what requirements apply to validation.

PeWi example
As part of the assessment of PeWi, the criteria were systematically reviewed in the protection needs analysis and assigned protection needs. Among others, the following classifications were made:

  • Data quality → protection need moderate (B)

  • Human oversight and control → protection need high (A)

  • AI-specific cybersecurity → protection need high (A)

  • Fairness → not applicable (D), as no personal characteristics are processed

In the PeWi practical example, the questions were answered and commented on as completely as possible in order to provide plenty of examples. Such a granular approach is not always strictly necessary. Please also note the practical tips for step 2.

Importance of the protection needs analysis for the further process
The protection needs analysis determines how rigorously the evidence must later be validated. The assessment depth relates only to validation, not to the creation of the content.

  • D – not applicable: the criterion is omitted entirely

  • C – low protection need: validation can be carried out by subject-matter experts from the project team

  • B – moderate protection need: validation by internal, independent persons (not involved in development)

  • A – high protection need: validation according to the four-eyes principle by two organisationally independent persons – at least one with audit experience and at least one with AI/technical expertise
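To make the consequences of the A–D classification concrete, the mapping above can be sketched in code. This is a hypothetical illustration: the function and dictionary names are assumptions, and the rule texts merely paraphrase the list above.

```python
# Hypothetical sketch: mapping a criterion's protection need (A-D) to the
# validation requirement it triggers. The dictionary entries paraphrase the
# rules above; names and structure are illustrative assumptions.
VALIDATION_RULES = {
    "D": "not applicable: criterion omitted entirely",
    "C": "validation by subject-matter experts from the project team",
    "B": "validation by internal, independent persons not involved in development",
    "A": "four-eyes validation by two organisationally independent persons",
}

def validation_requirement(protection_need: str) -> str:
    """Return who must validate the evidence for a given protection need."""
    if protection_need not in VALIDATION_RULES:
        raise ValueError(f"unknown protection need: {protection_need!r}")
    return VALIDATION_RULES[protection_need]

# PeWi example: 'Human oversight and control' was classified as A (high).
print(validation_requirement("A"))
```

Encoding the mapping once, rather than re-deciding it per criterion, keeps the validation step in line with the PNA results.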

Practical tips for step 2

☑️ Answer the questions for each criterion in the order Applicability Questions (1) → Basic Questions (2) → Extension Questions (3). If the Applicability Questions result in “not applicable”, the criterion is omitted entirely. If one of the Basic Questions is answered with “3”, the criterion is necessarily applicable – further detailed assessment via the Extension Questions is then not required and saves time.

☑️ Use the comment function where helpful – both to ensure later traceability for others and to critically reflect on your own answers in borderline cases.

3. Rating against the assessment requirements

What happens in this step
In this step, the assessment depths determined in the protection needs analysis are applied to the Assessment Catalogue. For each applicable criterion:

  • the associated indicators are considered,

  • the assessment requirements of the required level (A–D) are read,

  • a control statement is formulated describing which measures have been implemented in the AI system,

  • suitable evidence is assigned, and

  • the level achieved is determined.

The Assessment Catalogue clearly specifies the assessment requirements that must be fulfilled for each level. The rating results from comparing

  • the required level (from the PNA) and

  • the demonstrably achieved level (based on measures and evidence).

Step 3 is therefore the core of the quality assessment.

PeWi example:
As part of the assessment of PeWi, three indicators were fully developed and evaluated as illustrative examples. The corresponding control statements were documented in the reference document in column D, “Control statement (completed by the auditee)”. The control statements record, for each indicator, which measures have been implemented and how PeWi meets the required assessment requirements of the respective level:

  • VE1.1 – Risk analysis for performance & robustness (Level C)
    The control statement describes annual qualitative risk analyses, prioritisation of hazards, and designated responsibilities.

  • VE1.2 – Performance: metrics & tests (Level C)
    The control statement documents that suitable metrics and tests have been defined to verify whether the AI system achieves its intended functionality.

  • DA1.1 – Data quality, protection & governance (Level B)
    The control statement describes the complete documentation of the characteristics of each dataset used, as well as all processing steps (e.g., feature explanation, labeling, cleaning). In addition, systematic quality checks are carried out for each dataset, including checks for completeness, consistency, and plausibility.

Practical tips for step 3

☑️ For each criterion, take over the assessment depth from the PNA.
☑️ Read the assessment requirements (observables) of the respective level in full and relate them to your own context.
☑️ Formulate one concrete, system-specific control statement per indicator.
☑️ Assign suitable evidence clearly.
☑️ Justify the achieved level (A–D) in a logically consistent way.
☑️ Note: one piece of evidence can cover multiple indicators. It does not have to be listed multiple times; you can refer to it via a cross-reference.

4. Providing evidence

What happens in this step:
All evidence demonstrating that the described measures have actually been implemented is collected and documented in an evidence list (e.g., documents, reports, tests, process descriptions).

PeWi example:
As part of the assessment of PeWi, suitable evidence was provided for the three illustrative indicators (see step 3). This evidence was documented in the reference document in column E, “Illustrative content excerpt from evidence”.

For PeWi, the following evidence was recorded, among others:

  • VE1.1 – Risk analysis for performance & robustness
    Evidence: Excerpt from the risk analysis report, including among other things:

    • identified risks

    • designated responsibilities

    • qualitative impacts

    • prioritised hazards

    • recommended measures

  • VE1.2 – Performance: metrics & tests
    Evidence: Excerpt from the test plan, including among other things:

    • defined quality metrics

    • established thresholds

    • subject-matter justifications

    • regular evaluations

    • supplementary user assessments

  • DA1.1 – Data quality
    Evidence: Excerpt from the dataset description, including among other things:

    • scope and structure of the datasets

    • data types used

    • origin and purpose of the data

    • quality assurance measures

    • documented processing steps

Practical tips for step 4:

☑️ Record all evidence mentioned in the control statements.
☑️ Give evidence items clear names (including version/date) and ideally link them directly to internal documents — this makes later review easier.
☑️ Document your technical tests in a reproducible way (e.g., data splits, seeds, parameters).
☑️ Combine organisational and technical evidence in a reasonable way.
☑️ Ensure that the central evidence list is complete and up to date.
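The tip on reproducible technical tests can be illustrated with a small sketch: fix the seed, derive the data split deterministically, and record the parameters together with the result. Everything here (function names, the evidence ID, the field layout) is an assumption for illustration only.

```python
# Hypothetical sketch of a reproducibly documented technical test:
# the same seed always yields the same split, and the parameters are
# stored alongside the result in the evidence record. All names and
# values are illustrative assumptions.
import json
import random

def reproducible_split(n_samples: int, test_fraction: float, seed: int):
    """Deterministic train/test split: identical seed, identical split."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = int(n_samples * (1 - test_fraction))
    return indices[:cut], indices[cut:]

train_idx, test_idx = reproducible_split(n_samples=1000, test_fraction=0.2, seed=42)

# Entry for the central evidence list: clear name, version, parameters.
evidence_record = {
    "evidence_id": "TEST-PEWI-001 (v1.0)",
    "description": "Forecast accuracy test on the hold-out set",
    "parameters": {"n_samples": 1000, "test_fraction": 0.2, "seed": 42},
    "n_train": len(train_idx),
    "n_test": len(test_idx),
}
print(json.dumps(evidence_record, indent=2))
```

Because the split is a pure function of its recorded parameters, a validator can re-run the test later and obtain the same result.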

5. Validation of evidence

What happens in this step
The collected evidence is reviewed: Is it complete, plausible, up to date and consistent with the rating in the Assessment Catalogue? Who is allowed to carry out this validation depends on the assessment depth determined in the PNA.

PeWi example
For PeWi, the responsible persons for validation were derived directly from the protection needs and assessment depths. The assessment report documents how this was implemented:

  • For criteria with assessment depth A (high protection need), the CTO (with audit experience) and the administrator validate jointly, thereby implementing the required four-eyes principle.

  • For criteria with assessment depth B (moderate protection need), the CTO, as an independent subject-matter expert, carries out the validation.

  • For criteria with assessment depth C (low protection need), validation is performed by subject-matter experts in the project team.

Practical tips for step 5

☑️ Make sure that relevant and current evidence exists for every criterion.

☑️ Ensure that evidence covers the aspects required in the observables and that the rating in the Assessment Catalogue is plausibly supported by this evidence.

☑️ Document the validation results with date, persons involved and outcomes.

6. Overall assessment

What happens in this step
All ratings at indicator level are aggregated per criterion and compared with the protection need from the PNA. The minimum principle applies: the lowest achieved level of a criterion is decisive. The outcome is a statement on whether all applicable criteria are fulfilled.
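The minimum principle can be stated compactly in code. This is an illustrative sketch, assuming the A–D ordering from the PNA with A as the most demanding level; the function names and the example data are invented for this illustration.

```python
# Illustrative sketch of the minimum principle: the weakest indicator
# determines the level of the whole criterion, which is then compared
# with the protection need (target) from the PNA. Names are assumptions.
LEVEL_ORDER = {"D": 0, "C": 1, "B": 2, "A": 3}  # A = most demanding

def criterion_level(indicator_levels):
    """Minimum principle: lowest achieved level among the indicators."""
    return min(indicator_levels, key=LEVEL_ORDER.__getitem__)

def is_fulfilled(indicator_levels, required_level):
    """Fulfilled when the achieved level meets or exceeds the target level."""
    return LEVEL_ORDER[criterion_level(indicator_levels)] >= LEVEL_ORDER[required_level]

# Illustrative criterion with three indicators and required level B:
levels = ["A", "B", "B"]
print(criterion_level(levels))    # prints "B"
print(is_fulfilled(levels, "B"))  # prints "True"
```

Note that a single weak indicator pulls the whole criterion down, which is exactly why over-fulfilled indicators cannot compensate for an unfulfilled one.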

PeWi example
For PeWi, the ratings for all applicable criteria were consolidated. The result:

  • 10 out of 10 applicable criteria are fulfilled,

  • for several criteria, the requirements are even exceeded (a higher level was achieved than required).

Practical tips for step 6

☑️ Aggregate the indicators correctly to obtain one level per criterion.

☑️ Compare protection need (target) and achieved level (actual) for each criterion.

☑️ Document clearly whether each criterion is fulfilled, exceeded or not fulfilled.

☑️ Highlight any over-fulfilment as a particularly positive feature, which you can, for example, use in external communication.

☑️ Draw up appropriate action plans for any criteria that are not fulfilled.

7. Results documentation

What happens in this step
All results of the quality assessment are compiled in an assessment report. The report serves internal documentation purposes and can also be used externally to build trust (e.g. with customers, partners, investors).

PeWi example
The PeWi assessment report includes, among other things:

  • Description of the AI system (purpose, version, components, data sources)

  • Summary of the protection needs analysis

  • Presentation of the ratings per criterion

  • Overview of key evidence

  • Overall assessment and conclusion on the fulfillment of quality requirements

  • Information on validity and required re-assessments

  • Signatures of the responsible persons

Practical tips for step 7

☑️ Transfer the data carefully to the existing assessment documents and ensure that all mandatory sections of the assessment report are completed.
☑️ State clearly the assessed system version of your AI system and any conditions of validity.
☑️ Ensure that relevant evidence is referenced in the report. Even if external parties cannot view it, or can view it only in part, this level of detail supports transparency.
☑️ Use the opportunity to highlight over-fulfilment and particular strengths. The assessment report also serves as a means of external communication.
☑️ Obtain the signatures of the responsible persons – only then does your assessment report become valid.

8. Monitoring validity

What happens in this step
You define the conditions under which the validity of the assessment ends and when a re-assessment is required (e.g. in the case of major technical changes, new data sources or detected drifts).

PeWi example
For PeWi, the assessment report records that the assessment statement is tied to a specific system version and specific data sources (e.g. the weather model used). If the model, data sources or application context change significantly, a renewed assessment of the affected criteria is required.

Checklist for step 8

☑️ Define clear rules for when a re-assessment is required (e.g. model change, new data sources, changed application context). You can refer back to the validity conditions specified in the assessment report.
☑️ Establish appropriate monitoring for your AI system in order to detect irregularities as early as possible (e.g. performance drift).
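A simple drift check of the kind suggested above could look as follows. The tolerance, the metric, and the baseline value are purely illustrative assumptions; real monitoring would be tied to the metrics documented in the assessment report.

```python
# Hypothetical sketch: flag a re-assessment when a monitored error metric
# worsens by more than a relative tolerance compared to the value recorded
# at assessment time. All numbers and names are illustrative assumptions.
def needs_reassessment(baseline_error: float, current_error: float,
                       tolerance: float = 0.10) -> bool:
    """True when the error has degraded by more than `tolerance` (relative)."""
    return current_error > baseline_error * (1 + tolerance)

# Baseline error recorded at assessment time: 0.80 (illustrative value)
print(needs_reassessment(baseline_error=0.80, current_error=0.95))  # prints "True"
```

Tying the trigger to the validity conditions in the assessment report keeps the re-assessment rule objective rather than ad hoc.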

MISSION KI Assessment Portal (Beta)

Your central tool for quality assessment: the assessment portal is now available in a beta version and implements the assessment process digitally. Guided workflows and integrated support features make self-assessment easier and more efficient for organizations.

Toolbox for technical tests

A curated collection of proven test methods for bias detection, robustness testing, performance measurement, and more. Each method is directly assigned to the quality dimensions and provided with concrete instructions. Searchable by system type and application domain.

The MISSION KI
Quality Standard

To strengthen the competitiveness of German AI development and deployment, MISSION KI, together with leading stakeholders from research, standardization, and testing, has developed a quality standard that makes AI quality systematically measurable and verifiable.

FAQs

Who is the standard intended for?

The MISSION KI Quality Standard is primarily aimed at AI providers, in particular start-ups and small and medium-sized enterprises (SMEs) that want to demonstrate and systematically improve the quality and trustworthiness of their AI applications. The standard provides a structured approach for the internal assessment, documentation, and external communication of quality measures.

It supports AI providers in:

  • systematizing processes and responsibilities in dealing with AI,

  • creating transparency for customers, partners, and supervisory authorities,

  • and preparing early for future regulatory requirements.

The MISSION KI Quality Standard thus helps innovative companies in particular to build trust, reduce risks, and establish quality as a competitive advantage.

Can other stakeholders also benefit from the standard?

Yes. In addition to AI providers, customers, investors, supervisory authorities, and procurement authorities also benefit from the application of the standard. Uniform evaluation criteria create comparability, foster trust, and facilitate well-informed decisions in procurement, investment, and assessment processes.

For contracting authorities and partner companies, the standard provides an objective basis for assessing the quality and reliability of AI systems. Investors, in turn, gain greater transparency into a company’s maturity level and risk management.

In this way, the MISSION KI Quality Standard not only contributes to quality assurance for individual systems, but also strengthens trust, market transparency, and accountability across the entire AI ecosystem.

Is the application of the MISSION KI standard mandatory?

No. The MISSION KI Quality Standard is voluntary. Its value lies in building trust, reducing risks, and preparing early for regulatory requirements – especially with regard to the AI Act and future certification frameworks.

How do I carry out an assessment according to the MISSION KI quality standard?

The MISSION KI Quality Standard describes a complete assessment process and provides all the necessary templates – including use case description, protection needs analysis, Assessment Catalogue, Test Method Catalogue, and assessment report template. Organizations can carry out the assessment independently on the basis of these documents.

In addition, the MISSION KI assessment portal is available to facilitate the application. The portal digitizes the assessment process and guides users step by step through the entire workflow.

Can the assessment be carried out independently by organizations themselves?

Yes. The standard serves as a basis for organizations to carry out self-assessments of their AI systems and can optionally be supplemented by validation from external assessors or assessment bodies.

How much time should be planned for carrying out an assessment?

The time required for an assessment typically ranges between four and eight working days. The exact duration depends on several factors:

  • Existing documentation: Organizations with established quality management processes and existing technical documentation can build on available materials and thus significantly reduce the effort.

  • System complexity: The type and scope of the AI system, as well as the number of relevant quality dimensions, influence the duration of the assessment.

  • Available evidence: Tests already carried out, existing metrics, and existing risk analyses speed up the process.

  • Team availability: The timely availability of relevant contacts and experts from development and operations has a direct impact on the turnaround time.

For well-prepared organizations with structured processes, the effort tends toward the lower end of the range, while initial assessments without established documentation structures require correspondingly more time.

Which documents and evidence are required for the assessment?

The following are needed for the assessment:

  • the assessment procedure document for the MISSION KI quality standard, including the explanations and annexes contained therein

  • existing documentation of your AI system, as well as, if applicable, of your IT landscape

  • access to data, processes, and tests

How are my data protected in the MISSION KI assessment portal?

The MISSION KI assessment portal has been designed in such a way that the collection of sensitive data is avoided. The self-assessment primarily takes place in your own system environment. Documented evidence and assessment results remain under your control. The portal serves as a structuring tool and stores only the data necessary for carrying out the assessment.

Who has access to my data during the assessment?

In a self-assessment, only the persons designated and authorized by your organization have access to the assessment data. Depending on the chosen test depth, these may include members of the responsible team, independent internal assessors, or – in the case of voluntary external validation – accredited assessment bodies. Access rights are controlled and documented by your organization.

How is the AI quality standard related to other regulations and standards?

The MISSION KI Quality Standard is closely linked to other regulations and standards. It serves as a practice-oriented complement to and concretization of existing legal requirements, international standards, and assessment catalogues, such as the EU AI Act, the Fraunhofer IAIS Assessment Catalogue, VDE SPEC 90012, or standards of ISO/IEC JTC 1/SC 42. The standard provides guidance for companies to facilitate compliance with regulatory requirements. At the same time, it ensures that AI systems are developed and operated in accordance with global best practices and ethical principles.

Through its regulatory consistency with other regulations (e.g., the GDPR or MDR) and standards, the AI Quality Standard supports companies in harmonizing their processes and ensuring a consistent and trustworthy implementation of AI technologies.

What does “compatible with the AI Act” mean?

“Compatible with the AI Act” means that the standard is aligned with the European AI Act. Although the standard goes into more detail than the abstract regulatory requirements, it nevertheless includes provisions that are substantively linked to the legal text. In addition, the requirements of the Quality Standard do not conflict with the AI Act.

Companies can therefore use the standard as a reference point for building effective AI compliance management. At the same time, due to this compatibility, meeting the legal requirements or established standards also makes it easier to achieve compliance with the Quality Standard.

Compatibility is also taken into account during the development of the standard beyond the AI Act itself, for example with sector-specific regulation.

Implementation partners

The development of our MISSION KI Quality Standard is supported by a strong partnership of leading institutions.