Agentic AI in AML: What It Is and How It's Changing Compliance
AI has been used in Anti-Money Laundering (AML) for a long time. That part is not new. What is changing is how AI in AML compliance works: from tools that help with one task at a time to systems that can move through a workflow, make choices along the way, use multiple tools, and hand a human reviewer a case that is almost done. That's what agentic AI really promises. That's also why the word is suddenly everywhere. According to McKinsey, agentic systems are a step up from reactive Generative AI (GenAI) tools because they automate complicated tasks instead of just responding to prompts. They do this by combining autonomy, planning, memory, and integration. The Bank of England (BoE) reports that 75% of firms already use AI in some form. The compliance function is no longer asking whether AI will show up. It already has. The question now is what kind of AI comes next.
In the sections below, we look at where agentic AI fits into AML work, where it adds value, and where firms still need to be careful.
- What Is Agentic AI? Explaining the Buzzword
- How Agentic AI Differs from Traditional AI, Machine Learning (ML), and Generative Artificial Intelligence (GenAI)
- How Agentic AI Works in AML, Step by Step
- AML Use Cases Where Agentic AI Adds Real Value
- The Human-in-the-Loop Imperative
- Risks and Limitations to Be Aware Of
- Is Your Organization Ready for Agentic AI?
What Is Agentic AI? Explaining the Buzzword
The simplest way to explain agentic AI is this: AI systems that can pursue a goal with little human prompting, make decisions based on context, use tools, retain memory, and adapt based on feedback. McKinsey sees agents as proactive digital collaborators rather than passive helpers. In its work on financial crime, it describes a "workforce" of AI agents that can carry tasks from start to finish on their own while humans remain responsible for oversight and exceptions.
That sounds vague until you put it next to older tools. A traditional AML model might score a transaction. A GenAI system might produce a short summary of the case. An agentic system can chain those steps together: pulling the transaction history, checking the customer record, running Politically Exposed Person (PEP) and sanctions checks, searching adverse media, comparing the pattern to known typologies, drafting a narrative, and then deciding whether the file should go to a human investigator. It does not replace the analyst. But it behaves a lot more like one.
The simplest analogy is still the best one. Traditional AI is like a calculator. Agentic AI is like an analyst who uses calculators, databases, search tools, and checklists to run an investigation. That is why autonomy matters here. Not autonomy in the sense that "the machine does the compliance work." Autonomy in the sense that the system can move through a chain of tasks without waiting for a person to click "next" every 30 seconds.
How Agentic AI Differs from Traditional AI, Machine Learning (ML), and Generative Artificial Intelligence (GenAI)
A big part of the problem is that people use four terms interchangeably when they do not mean the same thing.
When people talk about traditional AI or machine learning (ML) in AML, they usually mean models that classify, score, rank, or predict. They are good at anomaly detection, risk scoring, network analysis, or generating alerts. They are not usually built to handle an entire investigation from beginning to end.
GenAI adds a different capability. It produces drafts, summaries, explanations, or other text. In compliance terms, that could mean drafting a Suspicious Activity Report (SAR), summarizing a Know Your Customer (KYC) file, or turning investigator notes into a clearer case narrative. The benefit is usually speed and readability. The weakness is that GenAI can sound confident even when it is wrong.
Agentic AI is what you get when the system stops being just a model and starts acting like a workflow operator. It can collect evidence, call external tools, pass work from one agent to another, check outputs, and escalate when confidence drops. McKinsey's financial crime example is helpful here because it shows groups of agents passing work along a chain rather than one model producing one answer.
In practical AML terms, the difference in behavior looks like this:
- Traditional AI/ML: "This alert looks high risk."
- GenAI: "Here is a short summary of the case for you to read."
- Agentic AI: "I pulled the history, checked sanctions, reviewed adverse media, compared the pattern to previous alerts, drafted the narrative, and escalated the case because the evidence chain crossed your risk threshold."
That's the real change. It's not smarter text. It's intelligent orchestration.
How Agentic AI Works in AML, Step by Step
The easiest way to understand it is to follow a scenario.
Think about what happens when a transaction monitoring alert fires. In a conventional setup, a human analyst opens the case, pulls the account history, checks the customer data, runs sanctions and PEP screening, searches for adverse media, and then writes up the findings. In an agentic setup, those steps can be split into roles for agents.
One agent pulls the recent transaction history and identifies linked accounts. Another checks the customer's KYC profile and screens names against sanctions lists and PEP databases. A third agent searches adverse media and public records. A fourth assembles the evidence and forms a risk view. A fifth drafts the investigation narrative and recommends whether the case should be closed, escalated, or filed. A human then reviews the package, challenges weak reasoning, confirms the conclusion, and makes the final decision.
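To make that concrete, here is a minimal sketch of such a pipeline in Python. Everything in it is hypothetical: the agent functions are stubs standing in for LLM-backed components with tool access, and the two-signal escalation rule is an illustrative placeholder, not a real risk policy. The point is the shape: specialized steps, a synthesized recommendation, and every case ending at a human decision.

```python
from dataclasses import dataclass, field

@dataclass
class CaseFile:
    case_id: str
    findings: dict = field(default_factory=dict)
    narrative: str = ""
    recommendation: str = "pending"

def transaction_agent(case: CaseFile) -> None:
    # Stub: a real agent would query the transaction store.
    case.findings["transactions"] = {"linked_accounts": 3, "velocity_spike": True}

def screening_agent(case: CaseFile) -> None:
    # Stub: a real agent would call sanctions/PEP screening engines.
    case.findings["screening"] = {"sanctions_hit": False, "pep_match": True}

def media_agent(case: CaseFile) -> None:
    # Stub: a real agent would search adverse media and public records.
    case.findings["adverse_media"] = {"articles": 1, "severity": "low"}

def synthesis_agent(case: CaseFile) -> None:
    # Illustrative placeholder policy: recommend escalation on 2+ signals.
    signals = [
        case.findings["transactions"]["velocity_spike"],
        case.findings["screening"]["pep_match"],
        case.findings["adverse_media"]["articles"] > 0,
    ]
    case.recommendation = "escalate" if sum(signals) >= 2 else "close"
    case.narrative = f"Signals: {case.findings}; recommendation: {case.recommendation}"

def run_pipeline(case: CaseFile) -> CaseFile:
    # Agents run in sequence; every case ends at a human decision point.
    for agent in (transaction_agent, screening_agent, media_agent, synthesis_agent):
        agent(case)
    print(f"[human review queue] {case.case_id}: {case.narrative}")
    return case

run_pipeline(CaseFile(case_id="ALERT-20417"))
```

Note that the agents here only recommend. The final disposition stays outside the pipeline, with the human reviewer.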
This isn't just a thought experiment. McKinsey describes a global bank that built a "factory" for end-to-end KYC with ten agent squads, each made up of four or five agents. Different squads handled corporate registry checks, data extraction, beneficial ownership analysis, sanctions and PEP screening, review of the purpose and nature of transactions, adverse media screening, and assembly of the final memo for human review. McKinsey also notes that in this kind of model, each human worker can typically supervise 20 or more AI agents.
What compliance teams should pay attention to is not the novelty. It is the architecture. AML work is already modular: alerts, KYC updates, sanctions reviews, adverse media checks, and memo writing are all separate steps. Agentic AI works when those steps are easy to assign, sequence, validate, and log. In other words, it works best where the process flows are already well defined.
AML Use Cases Where Agentic AI Adds Real Value
Alert triage and investigation remains the strongest use case. The tedious middle of AML work is where analysts spend time gathering context before they even start thinking, and agentic systems are well suited to it. An agent can review past activity, find previous alerts, compare current behavior to expected behavior, and route low-risk cases one way and potentially serious cases another. McKinsey suggests that more basic AI and GenAI support can lift productivity by 15% to 20%. Agentic models have a much larger upside, though, because they can reshape the entire workflow instead of just helping with one step.
KYC refresh and periodic review are also a good fit. Agentic systems can collect new filings, pull ownership data, screen directors, review adverse media, and assemble an event-driven refresh pack without waiting for a calendar date. McKinsey's bank example was built on exactly that shift: from periodic review to digital, event-driven due diligence. The human role doesn't go away. It shifts toward decision-making and oversight.
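As a rough illustration of the event-driven idea, the sketch below reacts to a registry event instead of a review date. The event types, the customer ID, and the `build_refresh_pack` helper are all invented for the example; a real system would plug into registry feeds, screening engines, and a case management queue.

```python
import time
from typing import Callable

# Hypothetical material-event types a registry-monitoring agent might emit.
REFRESH_TRIGGERS = {"director_change", "ownership_change", "new_sanctions_designation"}

def build_refresh_pack(customer_id: str) -> None:
    # Stub: agents would pull filings, ownership data, director screening,
    # and adverse media, then queue the assembled pack for human review.
    print(f"Refresh pack queued for {customer_id} at {time.ctime()}")

def on_registry_event(event: dict, refresh: Callable[[str], None]) -> None:
    # Trigger a KYC refresh on a material event instead of a calendar date.
    if event["type"] in REFRESH_TRIGGERS:
        refresh(event["customer_id"])

on_registry_event({"type": "ownership_change", "customer_id": "CUST-88231"},
                  build_refresh_pack)
```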
Drafting SAR narratives is a third area where both the value and the risk are clear. Good narratives take time to write. Agentic workflows can order the evidence, structure the timeline, and produce a first draft much faster than a person starting from a blank page. But this is also where the risk of hallucination is most acute. A 2025 arXiv paper on agentic AML narrative generation notes that Large Language Models (LLMs) are fluent but suffer from factual hallucination and poor explainability. The authors' answer is to keep human investigators firmly in the loop.
Next is re-screening after sanctions updates. The work isn't glamorous, but it is painful for the business. When new designations land, a firm may need to re-screen a large customer base quickly, surface possible matches, gather customer context, and prioritize investigations. Agentic orchestration helps by directing the work: one agent handles list ingestion, another matches names and identifiers, another collects customer context, another drafts the match memo, and a human reviews the borderline cases. The value here is less about magic and more about speed, consistency, and auditability.
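The matching step can be pictured in a few lines of Python. This sketch uses the standard library's `difflib` for fuzzy name similarity purely for illustration; real screening engines use far more sophisticated matching (aliases, transliteration, date-of-birth logic), and the 0.85 threshold is an arbitrary placeholder.

```python
from difflib import SequenceMatcher

def rescreen(customers: list[dict], new_designations: list[str],
             review_threshold: float = 0.85) -> list[dict]:
    # Score every customer name against each newly designated name and
    # return possible matches, highest similarity first, for human review.
    hits = []
    for customer in customers:
        for name in new_designations:
            score = SequenceMatcher(None, customer["name"].lower(), name.lower()).ratio()
            if score >= review_threshold:
                hits.append({"customer": customer, "match": name,
                             "score": round(score, 2)})
    return sorted(hits, key=lambda h: h["score"], reverse=True)

review_queue = rescreen(
    customers=[{"id": "C1", "name": "Ivan Petrov"}, {"id": "C2", "name": "Ana Diaz"}],
    new_designations=["Ivan Petrov", "Juan Perez"],
)
for hit in review_queue:
    print(hit)  # borderline cases go to a human investigator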
Lastly, there is the complex case investigation. These are the messy ones: many entities, cross-border movement, open-source information, conflicting signals. Agentic AI helps here because the problem is not just summarization. It is coordination. Different agents can retrieve documents, analyze clusters of transactions, compare public records, and produce structured findings, while a human investigator focuses only on the parts that still require judgment. At that point, the technology starts to feel less like a chatbot and more like a case team.
The Human-in-the-Loop Imperative
This is the most important part, and it's usually the part the hype glosses over.
Responsible agentic AI in AML still requires human-in-the-loop control. According to McKinsey, people are needed for coaching, oversight, and exception handling. In a different but still instructive AI governance context, the New York State Department of Financial Services (NYDFS) says that both senior management and the board remain accountable for the outcomes of AI systems, not just for approving them and walking away. The principle maps directly onto compliance: the model does not take responsibility away from the institution.
Three things follow from that. First, agents need bounded autonomy. They can collect, sort, compare, and draft, but their permissions should be limited. Second, outputs need to be explainable. If an examiner asks why a case was escalated, the firm needs more than "the model thought so." Third, humans must remain ultimately accountable for material compliance decisions. A survey by the Bank of England and the Financial Conduct Authority found that only 2% of AI use cases in UK financial services are fully autonomous, while 24% are semi-autonomous and still require human oversight for important or unclear decisions. That is a sensible way to proceed.
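What bounded autonomy plus auditability could look like is sketched below. The action names, the split between agent-allowed and human-only actions, and the in-memory audit log are all hypothetical simplifications; in practice this would be a policy engine backed by an immutable audit store.

```python
import json
import time

# Hypothetical action split: what agents may do alone vs. what stays human-only.
AGENT_ALLOWED = {"collect_evidence", "draft_narrative", "recommend"}
HUMAN_ONLY = {"file_sar", "close_case", "exit_customer"}

AUDIT_LOG: list[dict] = []

def perform(action: str, case_id: str, actor: str) -> bool:
    # Agents get an explicit allowlist; human-only actions are always refused
    # for agents. Every attempt is logged for reviewers and examiners.
    allowed = actor == "human" or (action in AGENT_ALLOWED and action not in HUMAN_ONLY)
    AUDIT_LOG.append({"ts": time.time(), "case": case_id,
                      "actor": actor, "action": action, "allowed": allowed})
    return allowed

perform("draft_narrative", "ALERT-20417", actor="agent")  # permitted
perform("file_sar", "ALERT-20417", actor="agent")         # refused -> human queue
print(json.dumps(AUDIT_LOG, indent=2))
```

The useful property is that refusals are recorded, not silent: the audit trail is exactly what an examiner asking "why was this escalated?" would need.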
"Replacement" is not the best way to frame agentic AI. "Force multiplier" is. It's better to have both a good compliance analyst and good tools than either one alone.
Risks and Limitations to Be Aware Of
The first risk is hallucination, and it is not hypothetical. The U.S. Treasury's 2024 report on AI in financial services notes that survey respondents flagged hallucination as a risk specific to GenAI models, alongside bias and explainability. The 2025 arXiv paper on AML narratives makes the point even more sharply: factual hallucination in compliance-critical content is unacceptable. If an agent drafts a SAR narrative containing invented facts, that is not a cosmetic problem. It is a regulatory one.
The second risk is data quality. The BoE and FCA list data privacy and protection, data quality, and data security among the biggest current AI risks in financial services. That matches AML experience. If your customer data is fragmented, your adverse media feeds are noisy, and your alert history is inconsistent, agentic AI won't help. It will amplify the mess.
The third is fairness and bias. The Treasury report flags bias as a major issue, and the 2024 NYDFS circular letter on AI makes clear that governance, proxy discrimination, and board-level accountability still apply when AI is used for regulated decisions. That circular addresses insurance underwriting, not AML, but the governance lesson carries over: firms need to test whether their models embed unfair or harmful bias and whether they can defend the outcomes.
The fourth is the explainability gap. The Treasury report describes explainability challenges as the difficulty of understanding how models produce their outputs. That matters fast in AML. It matters when investigators question an escalation. It matters when internal audit asks why an alert was closed. And it matters most when a regulator wants to know how an AI-assisted process reached a conclusion. If the answer is murky, trust erodes.
The fifth is implementation complexity. According to the BoE and FCA, a third of AI use cases are third-party implementations, the top three cloud providers account for 73% of all named providers, and firms see third-party dependency, model complexity, and hidden models as growing risks. That is not a minor technical issue. It is a compliance issue. Agentic systems are rarely just one model. They are stacks of models, tools, APIs, permissions, prompts, logs, and vendors.
And finally, there is regulatory uncertainty. Official guidance still addresses AI in general terms: safety, fairness, explainability, automated decision-making, third-party dependence, and accountability. Agentic AI as a category does not yet have a dedicated set of AML-specific rules. That doesn't mean firms should wait. It does mean they should plan for examiner scrutiny from the start. That is an inference from the current regulatory landscape, not a promise that more detailed guidance will arrive by a certain date.
Is Your Organization Ready for Agentic AI?
Some firms are ready to experiment now. Not all are. The gap is usually about fundamentals, not budget.
You are closer to ready if you have reasonably clean and organized data, well-defined AML workflows, clear escalation paths, and compliance staff who can check and challenge AI-generated work. You are also better positioned if your current system already generates heavy alert volumes and false positives that strain triage and evidence gathering. That is where agentic models usually deliver their first returns.
If your process map is still fuzzy, your data is inconsistent across systems, and your AI governance model is mostly a wish list, you probably aren't ready yet. In that case, jumping straight to agentic AI may not be the best idea. Traditional ML, Natural Language Processing (NLP), or narrower GenAI applications may be the more prudent starting point. BoE and FCA data show the market is still early in that journey: only a small share of current use cases are fully autonomous, and firms are still focused on explainability, data governance, and third-party risk.
That's probably the right closing thought. Agentic AI in AML is real. It's not science fiction, and it's not just a new name for automation. But it's also not a way around bad data, bad governance, or bad human judgment. The firms that benefit first will be the ones that treat it as an operating-model decision, not just a technology purchase.
