EU AI Act compliance guide

Evidence-Based vs. Questionnaire Compliance: Why Auditors Reject Assertions

Questionnaire-based AI Act compliance tools produce self-attestations that enterprise security reviewers and auditors cannot verify. This guide explains the technical and practical difference between assertion-based and evidence-based approaches — and why it matters for closing enterprise deals.


The compliance theater problem

A new category of compliance software has emerged to help companies respond to EU AI Act requirements: questionnaire tools that ask a series of questions and produce a PDF. The provider fills in boxes, the tool generates a report that says "Compliant," and the PDF lands in the procurement folder.

This approach has a fundamental problem: none of the claims can be independently verified.

This guide explains the technical difference between assertion-based and evidence-based compliance, why sophisticated auditors reject the assertion-based approach, and what evidence-linked documentation actually looks like.


What is assertion-based compliance?

Assertion-based compliance is the approach taken by questionnaire tools and many Excel-based self-assessment frameworks. The provider answers a series of yes/no or free-text questions:

  • "Do you validate your models before deployment?" → Yes
  • "Do you have a data governance process?" → Yes
  • "Is there monitoring in place for production models?" → Yes

The tool aggregates these answers into a report. The report states that the provider asserts compliance.

What an assertion-based report actually says

Every claim in an assertion-based report is epistemically the same statement: "We say we do this."

There is no mechanism to verify the claim. The security reviewer cannot check whether validation actually happened, cannot examine the validation data, and cannot determine whether the monitoring described is meaningful or performative.


What is evidence-based compliance?

Evidence-based compliance links every claim in the Annex IV technical documentation to a verifiable artefact: something live that an engineer or auditor outside the company can open and examine.

Claim | Assertion | Evidence
"Models are validated before deployment" | Free-text declaration | MLflow experiment #447, 2026-05-12: AUC-ROC 0.893 on n=12,847 held-out set
"Training data is documented" | "Yes, we maintain data sheets" | Link to data sheet v2.1 with provenance, annotation methodology, and bias audit
"Performance is monitored in production" | "We have a monitoring dashboard" | Grafana dashboard with p95 latency, drift detection, and alert thresholds defined
"Cybersecurity measures are in place" | "We follow security best practices" | Penetration test report dated 2026-04-15; adversarial robustness evaluation #88

The difference is not about what the provider says — it is about what a reviewer can verify independently.
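
One way to make the distinction concrete is to model each claim as a record that either carries an artefact reference or does not. The sketch below is illustrative only: the `Claim` and `ArtefactRef` types, their fields, and the "validation" section label are hypothetical names, not part of any standard or tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ArtefactRef:
    """Pointer to something a reviewer can open (hypothetical structure)."""
    kind: str         # e.g. "experiment_run", "data_sheet", "dashboard"
    identifier: str   # e.g. a run ID or a document version
    recorded_on: str  # ISO date on which the artefact was produced


@dataclass(frozen=True)
class Claim:
    section: str      # technical-file section the claim belongs to
    statement: str
    evidence: Optional[ArtefactRef] = None

    @property
    def is_evidence_linked(self) -> bool:
        # A claim counts as evidence-linked only if it names a concrete artefact.
        return self.evidence is not None and bool(self.evidence.identifier.strip())


asserted = Claim("validation", "Models are validated before deployment")
evidenced = Claim(
    "validation",
    "Models are validated before deployment",
    ArtefactRef("experiment_run", "447", "2026-05-12"),
)

print(asserted.is_evidence_linked)   # False
print(evidenced.is_evidence_linked)  # True
```

Both records make the same statement; only the second gives a reviewer something to look up.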


Why auditors reject assertions

1. Security reviewers are trained to distrust self-attestation

Enterprise security teams at regulated buyers (financial services, healthcare, government) conduct vendor due diligence using the same scepticism they apply to any externally provided claim. In security work, "we follow best practices" is considered an empty statement without proof.

The same framework now applies to AI compliance. A security reviewer who has been asked to evaluate whether a vendor's AI system meets EU AI Act requirements cannot accept a provider's self-declaration as evidence of compliance — any more than they would accept a vendor's claim to have a firewall without being shown evidence of one.

2. Notified bodies require reproducible evidence

For systems that do require third-party conformity assessment, notified bodies have published guidance making clear that assertion-based files will not be accepted. Even for internal conformity assessment (which most HR-tech providers are allowed to conduct), the standard of evidence required is the same.

The practical question to ask about any claim in your technical file is: could a notified body verify this independently? If the answer is no, the evidence is insufficient.

3. Article 13 transparency obligations flow to deployers

Article 13 of the EU AI Act requires providers to design high-risk AI systems to be sufficiently transparent that deployers can interpret the system's output and use it appropriately. This implies that the provider must be able to demonstrate, not merely assert, what the system does and how it performs.

A deployer who cannot obtain verifiable evidence of a provider's compliance cannot fulfil their own monitoring and oversight obligations. This creates a procurement risk that sophisticated legal teams now catch.

4. Assertions decay; evidence is versioned

A questionnaire-based report reflects a point-in-time statement of intent. Models are retrained, datasets change, and monitoring configurations evolve. Without a versioned, evidence-linked technical file, the compliance report is stale within weeks of production changes.

Evidence-linked documentation is tied to specific artefacts (experiment run IDs, dataset version hashes, dashboard snapshots) and can be updated automatically when systems change. Assertions must be manually re-attested, creating compliance drift.
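
A dataset version hash can be as simple as a content digest over the files that make up the dataset. The following is a minimal sketch, assuming the dataset is a directory of files on local disk; `dataset_fingerprint` is a hypothetical helper, and dedicated data-versioning tools compute their own identifiers differently.

```python
import hashlib
from pathlib import Path


def dataset_fingerprint(root: Path) -> str:
    """Return a SHA-256 digest covering every file name and file content under root."""
    digest = hashlib.sha256()
    # Sort paths so the result does not depend on filesystem listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        # Hash each file separately so every entry contributes a fixed-length value.
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()
```

Recording this value in the technical file next to the dataset description means any later change to the data produces a different hash, which a reviewer can recompute.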


The three failure modes of questionnaire compliance

Failure mode 1: The questionnaire asks the wrong questions

Commercial questionnaire tools are designed for a wide audience. They ask generic questions that avoid the technical specificity required for Annex IV technical documentation. A provider of a candidate-ranking system and a provider of a medical diagnostic system get the same questions. Neither file ends up with the technical depth required for their specific context.

Failure mode 2: The answers cannot be tested

When a security reviewer asks "can you show me the experiment log behind this validation claim?", an assertion-based provider has two options: go find the data (if it exists and is accessible) or admit they do not have it. The second outcome is more common than it should be.

Evidence-linked documentation is designed for this scenario. Every claim has a source reference. The review conversation becomes: "here is the claim, here is the artefact, here is how to read it."

Failure mode 3: No update trigger exists

When a model is retrained, the questionnaire-based compliance report is immediately out of date. There is no mechanism to detect this — no version control, no artefact linkage, no automated notification.

Evidence-linked documentation has a direct relationship with the model registry: when the model version changes, the technical file reflects it. Sections that are affected by the change are flagged for review.
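
The update trigger can be sketched as a comparison between the model version each section was last reviewed against and the version currently in the registry. This is an illustration under assumptions: the `Section` type, the section names, and the version labels are hypothetical, and the current version would in practice be read from whatever model registry is in use.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    reviewed_against: str          # model version this section was last reviewed for
    depends_on_model: bool = True  # False for sections a retrain does not affect


def sections_needing_review(sections: list[Section], current_version: str) -> list[str]:
    """Names of model-dependent sections whose last review predates the current version."""
    return [
        s.name
        for s in sections
        if s.depends_on_model and s.reviewed_against != current_version
    ]


technical_file = [
    Section("validation", reviewed_against="v7"),
    Section("monitoring", reviewed_against="v8"),
    Section("provider details", reviewed_against="v7", depends_on_model=False),
]

stale = sections_needing_review(technical_file, current_version="v8")
print(stale)  # ['validation']
```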


What enterprise security reviewers are actually checking

Based on the security review process at large enterprise buyers in the financial services and healthcare sectors, the review of an AI vendor's compliance documentation typically follows this pattern:

  1. Request the Annex IV technical file — if the vendor produces a questionnaire PDF, the review typically escalates; reviewers expect a structured technical document
  2. Check the validation section — reviewers look for experiment IDs or test run references that can be cross-checked; performance numbers without a source are flagged
  3. Check the data governance section — reviewers look for evidence of dataset provenance and bias evaluation; "we curate our training data carefully" is not acceptable
  4. Check the monitoring section — reviewers ask to see the monitoring dashboard; "we monitor performance" without evidence of what is measured and at what threshold is flagged
  5. Check the change management section — reviewers ask what happens when the model is retrained; "we review before release" without a documented process and version history is flagged

The pattern is consistent: reviewers are looking for evidence they can verify, not assertions they cannot.
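
A provider can run a rough version of the second check on its own file before a reviewer does. The sketch below flags lines that quote a performance figure without an experiment or run reference on the same line. The regular expressions and the metric list are assumptions chosen for illustration, and a real technical file would need a less naive check.

```python
import re

# A metric name followed, within the same sentence, by a decimal figure such as 0.893.
METRIC = re.compile(
    r"\b(?:AUC(?:-ROC)?|accuracy|precision|recall|F1)\b[^.\n]*?\b\d\.\d+",
    re.IGNORECASE,
)
# A reference such as "experiment #447" or "run 2f9c1a".
SOURCE = re.compile(r"\b(?:run|experiment)\s*#?\s*[\w-]*\d[\w-]*", re.IGNORECASE)


def unsourced_metric_lines(text: str) -> list[str]:
    """Return lines that state a metric value but cite no run or experiment."""
    return [
        line.strip()
        for line in text.splitlines()
        if METRIC.search(line) and not SOURCE.search(line)
    ]


validation_section = """\
Validation AUC-ROC 0.893 on the held-out set.
MLflow experiment #447: AUC-ROC 0.893 on n=12,847 held-out set.
The model is reviewed before release.
"""

for line in unsourced_metric_lines(validation_section):
    print("No source cited:", line)
```

Only the first line is reported: it states the same figure as the second but gives the reviewer nothing to cross-check.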


The commercial impact

The security review is now part of the enterprise sales cycle for EU AI products. Deal timelines extend when the compliance review cannot be closed quickly. Deals are lost when reviewers cannot verify claims.

A questionnaire-based compliance file typically spends 6–11 weeks in security review. The usual outcome is an escalation to the CISO, a request for additional evidence, and a procurement hold.

Evidence-linked documentation closes the compliance review stage faster because the reviewer can verify claims in the session rather than requesting additional evidence across multiple email cycles.

The providers winning enterprise deals are not the ones with the most comprehensive questionnaire answers. They are the ones who can open a live artefact in response to every question.


How to build evidence-linked documentation

  1. Start with your model registry. Every model version in production should have a corresponding experiment run ID. If you use MLflow, Weights & Biases, or a similar tool, this infrastructure already exists.

  2. Create data sheets for every training dataset. Document provenance, annotation methodology, known limitations, and bias evaluation results. Link to the actual dataset version, not a general description.

  3. Connect your monitoring dashboard to the technical file. Take a snapshot of the dashboard configuration at the time of each documentation update. Record the metrics tracked and the alert thresholds.

  4. Use version control for the technical file itself. The file should live in git or equivalent, with a change history that reflects model version changes.

  5. Automate what you can. modeldocs connects to your ML infrastructure via read-only API and automatically maps live artefacts to the Annex IV sections that require them. When the model changes, the affected sections are flagged.
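
Step 3 can be approximated with a few lines of code. The sketch below captures a monitoring configuration together with a timestamp and a content hash, so that the technical file can reference exactly what was being measured at the time of a documentation update. The function name, the configuration keys, and the threshold values are hypothetical, and exporting the configuration from a real dashboard is left out.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Optional


def snapshot_monitoring_config(config: dict, taken_at: Optional[datetime] = None) -> dict:
    """Bundle a monitoring configuration with a timestamp and a SHA-256 digest."""
    # Canonical JSON (sorted keys, fixed separators) so equal configs hash equally.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return {
        "taken_at": (taken_at or datetime.now(timezone.utc)).isoformat(),
        "sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "config": json.loads(canonical),
    }


# Hypothetical configuration: metric names and thresholds are placeholders.
config = {
    "metrics": ["p95_latency_ms", "feature_drift"],
    "alert_thresholds": {"p95_latency_ms": 800, "feature_drift": 0.2},
}

snapshot = snapshot_monitoring_config(config)
print(snapshot["taken_at"], snapshot["sha256"][:12])
```

Committing each snapshot alongside the technical file (step 4) gives the change history a record of when alert thresholds changed, not only when the prose did.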

→ Run the free Readiness Check to see which sections of your current documentation are assertion-based and which are evidence-linked, or read the complete Annex IV requirements breakdown.
