I, Rulebot: How AI is turning compliance into software

Policies create work, but AI stands to lighten the load—and it may be the start of a bigger transformation
Jörn Kaspuhl

Sometimes it can seem like generative AI is everywhere and nowhere all at once.

Billions of dollars are pouring into infrastructure to train and run foundation models, ChatGPT has 800 million weekly active users, and new developments (OpenClaw! Auto Browse!) seem to arrive every week.

Yet at the same time we’re still not seeing widespread, scaled adoption of generative AI in business settings. Recent data from the US Census Bureau suggested that only 11 per cent of firms were using AI to produce goods and services—a percentage point lower than the year before.

Part of the issue is that adopting generative AI takes engineering, not just to build the tools but to deploy them into the complex realities of most enterprise tech estates. Another problem is that people and culture take time to recalibrate. But, perhaps most crucially, there is significant risk-aversion. LLMs make things up and get things wrong; they are unpredictable and they can be convinced to do things that they should not. When it comes to regulated industries or activities, where “going wrong” is unacceptable, the challenge is especially pronounced.

“Generative AI is probabilistic. It's non-deterministic by design, and that's a problem for high-stakes settings,” says Colin Payne, head of innovation at the UK’s Financial Conduct Authority (FCA). In other words, traditional software uses rules to produce a repeatable, predictable response, whereas generative AI’s output depends in part on randomness, so the same input can produce different outputs. “That immediately raises questions around explainability, auditability, responsibility. Those three things are probably what keeps us up at night as a regulator, as protecting trust is the keystone of what we do.”

So where does that leave leaders who want to make use of generative AI in areas where failure is not an option? It’s an opportunity—if it can be done safely.

From subjective and opaque to objective and auditable

In trying to tackle this problem, one emerging strategy is to flip the usual proposition, and think about AI less as a decision-maker than as a generator of systems that make decisions. After all, the main reason generative AI works so well in software development is that it creates a deterministic product that can be tested to assess whether it works as it should. The question is: where else can this approach be applied?

A use case that is attracting particular interest is automating regulatory workflows in sectors such as financial services, telcos, government, and healthcare. Compliance processes are well suited because they are both rules-based and time-consuming: traditionally, they involve a great deal of manual intervention across multiple documents and processes.

“We were working with a bank, and they told us that 60 per cent of their total cost was spent on compliance—and yet doing it better than the competition doesn’t confer a differential advantage,” says Patrick Gormley, senior vice president of data science and AI consulting lead at technology services provider Kyndryl. “But much of what happens in a compliance workflow is actually a set of ‘swivel-chair’ processes—check this database, check that form—that are easily automatable.”

Enter: Kyndryl’s policy as code capability, which the firm is using to automate regulatory workflows. The ambition is not only to accelerate those workflows, but to achieve new levels of observability, auditability, and control.


How it works

The first stage is to use AI to break down the content of a written policybook into a chain of discrete steps that a computer can follow. These are visualized as a decision tree through which the compliance process can automatically flow. Doing this involves LLMs alongside other advanced, multi-modal data ingestion technologies, such as optical character recognition and vision models.

The requirements within each node of this tree are enshrined as machine-readable rules. “If X is true, then move to step 2; if X is false, then move to step 3,” and so on. The decision at each step is binary—there is no room for “maybe”. This creates a deterministic workflow, which is reviewed and approved by a human.
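The kind of deterministic node-and-routing structure described above can be sketched as ordinary, testable code. The following is a minimal illustration only—the node names, checks, and thresholds are invented for the example, not drawn from Kyndryl's implementation. Each node applies a binary check and routes the application to exactly one next step, so every input follows a single, auditable path.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Node:
    """One step in the compiled policy: a binary check plus routing."""
    name: str
    check: Callable[[dict], bool]   # deterministic yes/no predicate
    on_true: Optional[str] = None   # next node (or terminal state) if the check passes
    on_false: Optional[str] = None  # next node (or terminal state) if it fails

# A toy two-node tree for a hypothetical licensing policy.
TREE = {
    "has_valid_id": Node("has_valid_id",
                         lambda a: a.get("id_verified", False),
                         on_true="fee_paid", on_false="REJECT"),
    "fee_paid":     Node("fee_paid",
                         lambda a: a.get("fee_paid", False),
                         on_true="ACCEPT_FOR_REVIEW", on_false="REJECT"),
}

def run(application: dict, start: str = "has_valid_id") -> str:
    """Walk the tree; the same input always reaches the same terminal state."""
    current = start
    while current in TREE:
        node = TREE[current]
        current = node.on_true if node.check(application) else node.on_false
    return current  # e.g. "ACCEPT_FOR_REVIEW" or "REJECT"

print(run({"id_verified": True, "fee_paid": True}))  # ACCEPT_FOR_REVIEW
print(run({"id_verified": False}))                   # REJECT
```

Because each node is a pure predicate, the whole tree can be unit-tested like conventional software—the property that makes this scaffolding reviewable and approvable by a human.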

Often the process running within a node will resemble traditional software, using logic, API calls, or conventional machine learning to determine whether a specific policy requirement has been met. Sometimes, what’s happening within a node may constitute an “agent”. For Kyndryl, this term encompasses everything from a straightforward LLM call to an LLM dynamically choosing what tools to use to complete the task and make a decision.

But agents are used sparingly. “We value the deterministic approach,” says Lasma Alberte, associate director of artificial intelligence at Kyndryl. “So wherever it can be deterministic we want it to be.” Where agents are used, the system builder can set clear parameters around what data and tools those agents can access, and can make it a requirement for a human to review and approve the output.

Kyndryl devised this codified compliance approach in 2025 in response to a brief from a Middle Eastern government asking for help automating the workflow for business trade licensing. The decision tree produced by Kyndryl’s LLM-driven workflow generation tool comprised more than 60 nodes. When someone wants to register a business, the website takes them through the process and the application travels through these nodes until it gets to an internal review stage, where it is either approved or rejected. The approach improves both speed and accuracy, and removes friction from the overall user experience—friction that previously led users to abandon applications midway.

Rule-based automation is nothing new, but translating lengthy, complex regulations into sets of machine-executable rules was once prohibitively time-consuming. LLMs now make this possible at scale. “Transformer architectures like LLMs have a probabilistic contextual 'understanding' of the text, based on all the training data that the models have been exposed to,” says Alberte. “It’s not just matching some words and phrases. That is what has changed to enable this right now.”

To Dr Shaun Barney, VP and global head of agentic AI at Kyndryl, the codified compliance framework is especially notable for how it governs agents. “The genuinely differentiated capability here is an enforceable control layer that sits between the LLM and the toolset and deterministically governs what agents are allowed to do,” he says. “This means LLMs can reason freely, but policy decides what can execute; tool access is gated, auditable, and deterministic.” The shift is toward governance enforced before execution, not after the fact.

“This whole approach is new,” says Gormley. “It is a unique bundling of capabilities and it all comes together in a uniquely rich user interface. Plus it’s backed up with the enterprise technology chops of Kyndryl, which operates more than 50 per cent of the world’s mainframes. We believe we're the only people on the planet who can offer that package at the moment, and the implication for organizations is significant.”

Why ‘better’ beats ‘flawless’

The UK FCA agrees that machine-readable policies are the future of regulation. It is already experimenting with the idea itself. “The FCA has a complex rule book. It used to be the case that it was measured in feet and inches, if you printed it all out,” says Payne. “Now, since it’s machine readable, AI can quickly highlight where something a business is doing might be an issue.”

Policy as code is not a silver bullet, however. Policies that leave a lot of room for interpretation can be hard to enshrine as rules, and can lead to decision trees that branch to an unmanageable degree.

There is also a non-zero risk that an agent may still make an error in judgment, and that the error may even slip through the net at the review stage. But Kyndryl makes the argument that error rates need to be considered in relative rather than absolute terms. “Often the starting position is, ‘If it’s high-risk, don’t give it to AI,’ but ironically the opposite is true, since people can be more fallible than machines,” says Gormley. “If a person is manually reviewing an application, and it's 5pm on Friday, they might be skipping over it and not being so careful. The crucial thing is that the system as a whole is more effective than a human.”

Making compliance smarter

The broader goals of codified compliance are not simply about efficiency, but effectiveness.

On the one hand that’s about making policies themselves more effective. “Leveraging policy as code lets the business gain more insights into their documented policies,” says Alberte. “It gives them an impartial view of issues like conflicts or inconsistencies.” There is also an opportunity, she continues, to combine an automated policy with other data to simulate the consequences of changing rules. For instance, how would tweaking a policy impact the rejection rate across different market segments and change the make-up of your customer base?
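A toy illustration of that kind of what-if analysis (the applications, the credit-score rule, and both thresholds are invented for the example): once a policy is code, historical applications can be replayed against a proposed tweak and rejection rates compared segment by segment.

```python
# Invented historical data: past applications tagged by market segment.
applications = [
    {"segment": "retail", "credit_score": 610},
    {"segment": "retail", "credit_score": 700},
    {"segment": "sme",    "credit_score": 640},
    {"segment": "sme",    "credit_score": 590},
]

def rejection_rate(apps, min_score):
    """Replay applications against a score threshold; return per-segment rejection rates."""
    by_segment = {}
    for a in apps:
        seg = by_segment.setdefault(a["segment"], {"total": 0, "rejected": 0})
        seg["total"] += 1
        if a["credit_score"] < min_score:
            seg["rejected"] += 1
    return {s: v["rejected"] / v["total"] for s, v in by_segment.items()}

print(rejection_rate(applications, min_score=600))  # current rule
print(rejection_rate(applications, min_score=650))  # proposed tweak
```

Because the rule is executable, the consequence of a tweak—here, SME rejections doubling—surfaces before the change ships, rather than in next quarter's numbers.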

Policy as code also stands to make the application of policies more effective. “Take sanctions screening, for example,” says Payne. “It is incredibly difficult, and it's often done on a periodic basis. You do it every few weeks or months. But if you could do that in real time, with a system executing every hour, every day, you would avoid the gaps of time that give you vulnerabilities.”

At a more fundamental level, the hope is that once companies see the viability of this approach—of using AI to make deterministic scaffolding for workflow automations—it may spur a wider transformation. “It is a significant door-opener to more AI-native deployment, and for anyone who wants to stay ahead of the competition that is essential,” says Gormley, who argues that a great deal of working life is effectively workflows. Kyndryl is in discussion with a financial services company, for instance, about using policy as code to help automate mandated “know-your-customer” onboarding processes. It is also working with a healthcare provider on a proof-of-concept for managing radiology patients through clinical pathways.

“But for anyone exploring bringing this idea into their organization, my advice would be that this is execution-dependent,” says Gormley. “You need a provider that has world-class engineering, mission-critical expertise, and who can get you an MVP at speed. And do all that in a way that stands up to the scrutiny of people like CSOs at global banks. It has got to be absolutely bulletproof.”

Want to find out more about Kyndryl's policy as code?