By Donal Kerr, CEO, Run Audit
In March 2026, the Financial Reporting Council published something that has no precedent anywhere in the world: formal guidance for audit firms on the use of generative and agentic AI in audit engagements. It’s the first document of its kind from any audit regulator globally, and if you work in IT audit, GRC, or audit technology, it deserves your full attention.
Not because it’s prescriptive — it deliberately isn’t. Not because it threatens sanctions for firms that get it wrong — at least not directly. But because it articulates, clearly and authoritatively, the framework through which AI tools deployed on audit engagements will be evaluated. This is the intellectual foundation that the profession has been operating without. And now it exists.
Three Risks the FRC Wants Firms to Take Seriously
The guidance opens with a statement that should be obvious but frequently isn’t: generative and agentic AI tools have the potential to significantly enhance audit quality and simultaneously pose risks to audit quality. Both things are true. Both matter.
The first risk is deficient output — the risk that what the AI system produces is simply wrong. The guidance identifies five failure modes: hallucinations (fabricated information), omissions (missing information), distortions (misrepresentation of meaning), faulty reasoning (unsupported conclusions), and inconsistencies (contradictory outputs). For agentic systems, this risk is amplified because errors compound across steps.
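For teams that log review findings, the five failure modes map naturally onto a small taxonomy. The sketch below is illustrative only: the guidance names the failure modes, but the FailureMode enum, the tally_findings helper, and the sample review log are hypothetical names and data invented for this example.

```python
from collections import Counter
from enum import Enum


class FailureMode(Enum):
    """The five deficient-output failure modes named in the guidance."""
    HALLUCINATION = "fabricated information"
    OMISSION = "missing information"
    DISTORTION = "misrepresentation of meaning"
    FAULTY_REASONING = "unsupported conclusions"
    INCONSISTENCY = "contradictory outputs"


def tally_findings(findings):
    """Count reviewer findings per failure mode, keeping modes with zero hits."""
    counts = Counter(findings)
    return {mode: counts.get(mode, 0) for mode in FailureMode}


# Hypothetical review log for one AI-assisted procedure (made-up data).
review_log = [FailureMode.OMISSION, FailureMode.HALLUCINATION, FailureMode.OMISSION]
summary = tally_findings(review_log)

for mode, count in summary.items():
    print(f"{mode.name:<17} {count}  ({mode.value})")
```

Keeping the zero-count modes in the summary makes it visible which failure modes a review has not yet surfaced, which is different from a review that never looked for them.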
The second is misuse of output: the risk that an appropriate output is used inappropriately. There are two variants: misinterpretation (the audit team misunderstands what the output means) and misunderstanding of methodology (the team draws inappropriate conclusions from it). This is subtler but arguably more dangerous, because the AI has done its job correctly and the error lies entirely with the human response.
The third is non-compliant methodology — the risk that the firm’s audit methodology, when it incorporates AI-enabled procedures, no longer meets auditing standards. AI-enabled procedures may produce outputs relating to entire populations rather than samples, making direct comparison with traditional substantive procedures difficult.
The Four Mitigations — and the One That Matters Most
The framework has four pillars: system design and development, certification, staff education and governance, and human-in-the-loop review and oversight. But one principle runs through all four like a spine: human accountability doesn’t change.
The guidance is explicit — it is people, the firms and Responsible Individuals, who remain accountable for audit quality. The technology changes. The regulatory framework does not. You cannot outsource audit judgement to an AI system. The conclusion is yours. The methodology is yours. The professional liability is yours.
What the guidance calls “human in the loop” isn’t a checkbox. It’s the mechanism by which accountability is actually exercised: a human who directs, authorises, or reviews the system’s actions or outputs at runtime.
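One way to picture a runtime review gate is as a hard stop in the workflow: an AI output cannot enter the audit file until a named person has approved it. The following is a minimal sketch under that assumption, not a description of any firm’s or vendor’s system. AIOutput, approve, add_to_audit_file, and NotReviewedError are hypothetical names, and the sample data is made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AIOutput:
    procedure: str
    content: str
    approved_by: Optional[str] = None  # set only when a human approves


class NotReviewedError(Exception):
    """Raised when an unreviewed AI output is pushed into the audit file."""


def approve(output: AIOutput, reviewer: str) -> AIOutput:
    """Record the named human who reviewed and accepted the output."""
    if not reviewer.strip():
        raise ValueError("a named reviewer is required")
    output.approved_by = reviewer
    return output


def add_to_audit_file(output: AIOutput, audit_file: List[AIOutput]) -> None:
    """Control point: refuse any output that no human has approved."""
    if output.approved_by is None:
        raise NotReviewedError(f"'{output.procedure}' has not been reviewed")
    audit_file.append(output)


# Hypothetical usage with made-up data.
audit_file: List[AIOutput] = []
draft = AIOutput("journal entry testing", "3 entries flagged for follow-up")

try:
    add_to_audit_file(draft, audit_file)  # blocked: nobody has reviewed it
except NotReviewedError as exc:
    print(f"blocked: {exc}")

add_to_audit_file(approve(draft, "A. Reviewer"), audit_file)
print(f"{len(audit_file)} output(s) in the audit file")
```

The design choice worth noting is that the gate records who approved the output, not merely that someone did, so the accountability trail points to a person.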
The Agentic AI Question
An agentic AI system is one that can orchestrate and execute multiple components and tasks toward a goal, with some degree of autonomy. That’s materially different from a single LLM you prompt and review. In an agentic system, an orchestrator LLM interprets the goal, designs a work programme, integrates the outputs of other components, and manages iteration. Each of those responsibilities creates its own risk of deficient output, and those risks interact. The FRC uses the phrase “combination risk” to describe how individually minor errors in one component can be amplified as they pass through subsequent components.
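A rough back-of-the-envelope model shows why chained steps deserve this attention. Assume, purely for illustration, that every step in a pipeline is independently correct with the same probability; the chance that the whole run is error-free then shrinks geometrically with the number of steps. The 98% figure and the function name below are assumptions made for this example, not numbers from the guidance, and the model ignores amplification, so real combination risk could be worse.

```python
def chain_error_free_probability(step_accuracy: float, steps: int) -> float:
    """Probability that every step in a chain is correct.

    Assumes each step succeeds independently with the same probability,
    a simplification chosen for illustration only.
    """
    if not 0.0 <= step_accuracy <= 1.0:
        raise ValueError("step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_accuracy ** steps


# Illustrative per-step accuracy of 98% (an assumed figure).
for steps in (1, 5, 10, 20):
    p = chain_error_free_probability(0.98, steps)
    print(f"{steps:>2} steps: {p:.1%} chance of an error-free run")
```

Under these assumptions, a component that looks reliable in isolation yields an error-free ten-step run only about 82% of the time, which is the arithmetic case for placing review at control points rather than only at the end.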
The guidance doesn’t say “don’t use agentic AI”. It says: understand the risks specifically, design your mitigations specifically, and maintain meaningful human oversight at control points where the consequences of deficient output are greatest. That’s a demanding standard. It’s also a reasonable one.
What This Means for You
If you’re a practitioner evaluating AI tools for your engagements, the FRC’s framework gives you a precise vocabulary for due diligence. When a vendor claims their tool “automates compliance” or “generates findings automatically”, the right questions are: where does the human judgement live in this workflow? What control points exist? How are deficient outputs identified? What does the certification process look like?
These aren’t hostile questions. They’re the questions any serious practitioner should be asking. The FRC has now given you the vocabulary to ask them precisely.
In Part 2, I’ll explain how we’ve built Run Audit with these principles at its core: what the FRC’s framework looks like when translated into the specific design decisions of an AI-native audit platform, and why the guiding principle — automate the grind, preserve the judgement — isn’t just a slogan but a system architecture.
