Most AI assistants sit idle until you ask them something. You type a question, they answer it, and then they wait. In trade compliance, that model has a fundamental problem: risks don't wait for someone to ask about them.
At Atlas Verified, we built a system of autonomous cloud agents that continuously monitor supply chains, scan government databases, and surface compliance risks on a schedule — without anyone typing a prompt. They run in the background, across every organization on our platform, and deliver structured intelligence directly to a dashboard.
This post covers the engineering behind always-on AI agents: how they differ from conversational AI, why structured output matters, and the orchestration challenges of running autonomous agents at scale.
From Reactive to Proactive
Conversational AI is inherently reactive. A user asks a question, the model reasons about it, calls some tools, and returns an answer. This works well for ad-hoc research and document analysis. But compliance monitoring demands proactive intelligence.
Consider a few scenarios: a supplier's organic certification is suspended by the USDA overnight, while a shipment from that supplier is already in transit. A new OFAC designation adds a freight forwarder to the sanctions list. An FDA import alert is issued for a commodity you're actively importing.
None of these events wait for a compliance officer to log in and ask the right question. By the time someone thinks to check, the window for proactive action may have closed. The goods are at the port. The paperwork is filed. The exposure is real.
Cloud agents solve this by inverting the relationship. Instead of the user driving the AI, the AI drives itself on a schedule, checks everything that matters, and pushes findings to the user.
Agent-Per-Quadrant Architecture
When we first prototyped cloud agents, we tried a single agent that covered everything: regulatory changes, supplier status, shipment tracking, and document processing. It produced unfocused, rambling output. The agent would chase one interesting thread and neglect entire categories of intelligence.
The solution was to divide the problem space into focused domains, which we call quadrants:
Intelligence. Regulatory alerts, market trends, and cross-cutting risks. This agent monitors government databases for changes that affect the organization — new FDA import alerts, OFAC list updates, tariff changes, organic certification status changes.
Suppliers. Supplier network health. This agent reviews supplier verification status, compliance gaps, certification expirations, and network concentration risks. If 80% of your organic supply comes from a single certifying agent, this agent flags it.
Shipments. Active shipment monitoring. This agent tracks vessel positions, monitors route deviations, checks for delays, and validates that shipment documentation is complete. It surfaces issues like containers sitting at port longer than expected or vessels deviating from declared routes.
Workspace. Operational health. This agent reviews document processing status, pending verifications, team activity, and platform usage patterns. It surfaces things like documents that failed processing and need attention, or verifications that have been pending for too long.
Each agent runs with a focused system prompt and a constrained set of tools relevant to its domain. This produces dramatically better output than a single omniscient agent, for the same reason that a team of specialists outperforms a single generalist: depth of focus.
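The quadrant split can be sketched as a set of focused agent definitions, each pairing a narrow system prompt with a constrained subset of the shared tools. This is a minimal illustration, not Atlas Verified's actual code; the tool names and prompts here are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuadrantAgent:
    """One domain, one focused prompt, one constrained tool subset."""
    name: str
    system_prompt: str
    tools: frozenset[str]  # subset of the shared tool registry

QUADRANTS = [
    QuadrantAgent(
        name="intelligence",
        system_prompt="Monitor regulatory sources for changes affecting this org.",
        tools=frozenset({"ofac_screening", "fda_import_alerts", "usda_organic_lookup"}),
    ),
    QuadrantAgent(
        name="suppliers",
        system_prompt="Review supplier verification status and concentration risk.",
        tools=frozenset({"supplier_status", "certification_expiry", "ofac_screening"}),
    ),
    QuadrantAgent(
        name="shipments",
        system_prompt="Track active shipments for delays and route deviations.",
        tools=frozenset({"vessel_tracking", "port_status", "document_check"}),
    ),
    QuadrantAgent(
        name="workspace",
        system_prompt="Review document processing and pending verifications.",
        tools=frozenset({"document_queue", "verification_queue"}),
    ),
]
```

Note that tools may appear in more than one quadrant (sanctions screening matters to both Intelligence and Suppliers), but each agent only ever sees its own subset — that constraint is what keeps the output focused.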
Sharing a Tool Layer with the Chat Agent
One of the most important architectural decisions was making cloud agents use the exact same tool layer as our conversational chat agent. The same OFAC screening, USDA lookups, vessel tracking, trade data queries, and document analysis tools power both systems.
This has significant engineering advantages:
No tool duplication. When we add a new data source or improve an existing adapter, both the chat agent and cloud agents benefit immediately. There's one implementation, one set of tests, one maintenance surface.
Consistent results. A sanctions check run by a cloud agent and a sanctions check run by a user in chat call the same function with the same logic. Results are directly comparable and cross-referenceable.
The challenge is that tools designed for interactive use don't always work the same way in batch context. A chat agent can ask the user for clarification when a query is ambiguous. A cloud agent can't. We handle this by giving cloud agents more specific prompts and higher recursion limits — they have more time to self-correct because no one is waiting in real time for a response.
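A minimal sketch of what sharing the tool layer looks like, assuming a simple function registry — the tool names, the registry shape, and the specific recursion limits are illustrative, not the production implementation:

```python
from typing import Callable

# One registry; both the chat agent and cloud agents draw from it.
TOOL_REGISTRY: dict[str, Callable[..., dict]] = {}

def tool(name: str):
    """Register a function once: one implementation, one maintenance surface."""
    def register(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return register

@tool("ofac_screening")
def ofac_screening(entity: str) -> dict:
    # Placeholder body; a chat run and a cloud run call this same function.
    return {"entity": entity, "matches": []}

def build_agent(mode: str) -> dict:
    """Same tools either way; cloud runs get a higher recursion limit
    because no one is waiting in real time for a response."""
    return {
        "tools": TOOL_REGISTRY,
        "recursion_limit": 50 if mode == "cloud" else 15,
    }

chat_agent = build_agent("chat")
cloud_agent = build_agent("cloud")
```

Because both agents hold a reference to the same registry, a sanctions check from either path runs identical logic, which is what makes the results directly comparable.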
Structured Output: From Free Text to Dashboard Cards
This is where cloud agents diverge most sharply from chat agents. A chat agent produces narrative text, sometimes with embedded structured cards. A cloud agent must produce exclusively structured output that renders directly on a dashboard.
Each agent run produces an array of intelligence cards. Each card has a defined type and schema:
Counters show key metrics with trend indicators — total active shipments, suppliers with expiring certifications, pending verifications.
Risk flags highlight specific compliance concerns that need attention — a supplier with a suspended certification, a sanctions match, an FDA import alert.
Alerts communicate time-sensitive information — regulatory changes, new OFAC designations, tariff updates that affect active shipments.
Charts and tables present analytical data — shipment volume trends, trade lane distribution, verification completion rates.
Maps show geographic intelligence — vessel positions, port activity, supplier distribution.
The engineering challenge is constraining an LLM to produce valid structured output consistently. Language models want to narrate. They want to explain and qualify. Getting them to produce a clean JSON array of cards with the right types, the right fields, and the right data requires careful prompt engineering and output validation.
When the agent's output doesn't match the expected card schema, we parse what we can and discard malformed cards rather than failing the entire run. Partial intelligence is better than no intelligence.
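The salvage step can be sketched as per-card validation: parse the model's output, keep every card that matches the schema, and count rather than fail on the ones that don't. The card types come from the list above; the required fields are an assumption for the example.

```python
import json

CARD_TYPES = {"counter", "risk_flag", "alert", "chart", "table", "map"}
REQUIRED_FIELDS = {"type", "title", "data"}

def parse_cards(raw: str) -> tuple[list[dict], int]:
    """Return (valid cards, number discarded).
    Partial intelligence beats a failed run."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return [], 0  # nothing salvageable at all
    if not isinstance(payload, list):
        return [], 0
    valid, discarded = [], 0
    for card in payload:
        if (isinstance(card, dict)
                and REQUIRED_FIELDS <= card.keys()
                and card["type"] in CARD_TYPES):
            valid.append(card)
        else:
            discarded += 1  # the model narrated instead of structuring
    return valid, discarded

cards, dropped = parse_cards(
    '[{"type": "risk_flag", "title": "Suspended certification", "data": {}},'
    ' {"type": "narration", "text": "Let me explain the context..."}]'
)
```

Here the second "card" is exactly the failure mode described above — the model slipping into explanation — and it is dropped while the valid risk flag survives.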
Custom Agents: User-Defined Autonomous Monitors
Beyond the four system agents, users can create their own cloud agents. A custom agent has a name, a system prompt (written by the user), and a schedule. It uses the same tool layer and produces the same card types, but its focus is entirely user-defined.
A user might create an agent that monitors a specific trade lane for disruptions. Or one that checks a particular set of suppliers against sanctions lists daily. Or one that tracks price movements for a specific commodity using trade data sources. The possibilities are limited only by what the tool layer can access.
Making this safe and useful required several engineering decisions:
Organizational scoping. Each custom agent belongs to an organization and can only access that organization's data. An agent can't cross organizational boundaries, even if the user's prompt tries to make it.
Tool capability filtering. Custom agents have access to the same verification and research tools as system agents, but they inherit the organization's permission boundaries. A custom agent can't access tools or data that the user themselves couldn't access through the chat interface.
Schedule management. Custom agents run on user-defined schedules, but within platform constraints. This prevents a single organization from consuming disproportionate compute resources by scheduling dozens of agents to run every hour.
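Taken together, these constraints can be sketched as validation at creation time, before a custom agent is ever scheduled. The specific limits (a 24-hour minimum interval, a per-organization agent cap) are invented for the example:

```python
from dataclasses import dataclass

MIN_INTERVAL_HOURS = 24   # illustrative platform floor on schedules
MAX_AGENTS_PER_ORG = 10   # illustrative cap on agents per organization

@dataclass(frozen=True)
class CustomAgent:
    org_id: str           # every tool call the agent makes is scoped to this org
    name: str
    system_prompt: str    # written by the user, in natural language
    interval_hours: int

def create_custom_agent(org_id: str, name: str, prompt: str,
                        interval_hours: int,
                        existing: list[CustomAgent]) -> CustomAgent:
    """Enforce platform constraints up front rather than at run time."""
    if interval_hours < MIN_INTERVAL_HOURS:
        raise ValueError(f"minimum schedule interval is {MIN_INTERVAL_HOURS}h")
    if sum(1 for a in existing if a.org_id == org_id) >= MAX_AGENTS_PER_ORG:
        raise ValueError("organization agent limit reached")
    return CustomAgent(org_id, name, prompt, interval_hours)

agent = create_custom_agent(
    "org-123", "lane-watch",
    "Monitor the Rotterdam-Santos trade lane for disruptions daily.",
    interval_hours=24, existing=[],
)
```

Organizational scoping itself lives a layer deeper — every tool the agent calls receives the `org_id` and filters on it — so even a prompt that asks for another organization's data has nothing to retrieve.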
Orchestration at Scale
Running autonomous agents across many organizations on a daily schedule introduces orchestration challenges that don't exist in a chat-based system.
Deduplication. If the scheduler fires and an agent for Organization A already ran within the deduplication window, it's skipped. This prevents double-runs from retries, overlapping schedules, or scheduler hiccups. Each agent-organization pair is treated as a unique scheduling unit.
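The deduplication check can be sketched as a lookup keyed on the agent-organization pair; the 20-hour window is an illustrative value for a daily schedule, not the platform's actual setting:

```python
from datetime import datetime, timedelta

DEDUP_WINDOW = timedelta(hours=20)  # illustrative window for a daily cadence

# last run per (agent, organization) pair — the unique scheduling unit
last_run: dict[tuple[str, str], datetime] = {}

def should_run(agent_id: str, org_id: str, now: datetime) -> bool:
    """Skip if this agent-org pair already ran within the dedup window."""
    key = (agent_id, org_id)
    prev = last_run.get(key)
    if prev is not None and now - prev < DEDUP_WINDOW:
        return False  # retry, overlapping schedule, or scheduler hiccup
    last_run[key] = now
    return True

now = datetime(2025, 1, 15, 6, 0)
assert should_run("intelligence", "org-a", now)                            # first fire runs
assert not should_run("intelligence", "org-a", now + timedelta(hours=1))   # duplicate skipped
assert should_run("intelligence", "org-b", now)                            # other org unaffected
```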
Failure isolation. If one organization's agent fails — because a tool returns an error, or the LLM produces unparseable output, or a dependent service is down — it must not affect other organizations' agent runs. Each run is an independent job. Failures are logged, the run is marked as failed, and the scheduler moves on.
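Failure isolation reduces to a simple invariant in the run loop: every job is wrapped independently, and an exception marks that run failed without touching any other. A minimal sketch:

```python
import logging

logger = logging.getLogger("scheduler")

def run_all(jobs: list[tuple[str, callable]]) -> dict:
    """Each run is an independent job: one org's failure never
    blocks another's."""
    results = {}
    for job_id, run in jobs:
        try:
            results[job_id] = ("ok", run())
        except Exception as exc:  # tool error, unparseable output, service down
            logger.warning("run %s failed: %s", job_id, exc)
            results[job_id] = ("failed", None)  # log it, mark it, move on
    return results

def failing_run():
    raise RuntimeError("upstream sanctions API is down")

results = run_all([
    ("org-a/intelligence", lambda: ["card"]),
    ("org-b/intelligence", failing_run),
    ("org-c/intelligence", lambda: ["card", "card"]),
])
```

In production the jobs would be dispatched to a queue rather than run in a loop, but the invariant is the same: the failure of org-b's run is recorded, and org-c's run still completes.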
Card lifecycle. Intelligence cards have a time-to-live. Old cards expire and are replaced when the agent runs again. This ensures the dashboard always shows current intelligence rather than accumulating stale findings. When an agent produces new cards, the previous cards for that quadrant are atomically replaced.
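The lifecycle can be sketched as a store keyed by organization and quadrant, where publishing replaces the whole entry in one assignment and reads treat expired entries as absent. The 36-hour TTL is an illustrative value:

```python
from datetime import datetime, timedelta

CARD_TTL = timedelta(hours=36)  # illustrative time-to-live

# dashboard state: (org, quadrant) -> (cards, expires_at)
dashboard: dict[tuple[str, str], tuple[list[dict], datetime]] = {}

def publish(org: str, quadrant: str, cards: list[dict], now: datetime) -> None:
    """Replace the quadrant's previous cards in a single assignment,
    so readers never see a mix of old and new findings."""
    dashboard[(org, quadrant)] = (cards, now + CARD_TTL)

def current_cards(org: str, quadrant: str, now: datetime) -> list[dict]:
    """Expired cards are treated as absent rather than shown stale."""
    entry = dashboard.get((org, quadrant))
    if entry is None:
        return []
    cards, expires_at = entry
    return cards if now < expires_at else []

now = datetime(2025, 1, 15, 6, 0)
publish("org-a", "suppliers",
        [{"type": "risk_flag", "title": "Certification expiring"}], now)
```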
Resource budgeting. Agent runs consume LLM tokens, external API calls, and database connections. We track token usage per run and per organization to manage costs and identify agents that are consuming disproportionate resources. A runaway agent that makes 50 tool calls when 10 would suffice gets flagged.
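The budgeting side can be sketched as per-run usage counters with a simple flagging rule; the tool-call ceiling here is an invented threshold, not the platform's real budget:

```python
from collections import defaultdict

TOOL_CALL_BUDGET = 25  # illustrative per-run ceiling

usage: dict[str, dict[str, int]] = defaultdict(
    lambda: {"tokens": 0, "tool_calls": 0}
)

def record(run_id: str, tokens: int, tool_calls: int) -> None:
    """Accumulate per-run usage for cost tracking and anomaly detection."""
    usage[run_id]["tokens"] += tokens
    usage[run_id]["tool_calls"] += tool_calls

def flag_runaways() -> list[str]:
    """Surface runs that burned far more tool calls than they should."""
    return [run_id for run_id, u in usage.items()
            if u["tool_calls"] > TOOL_CALL_BUDGET]

record("org-a/shipments", tokens=12_000, tool_calls=9)
record("org-b/shipments", tokens=95_000, tool_calls=50)  # runaway agent
```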
Bridging Cloud Agents and Chat
Cloud agents and the chat agent are not isolated systems. They form a feedback loop.
The chat agent has the ability to query cloud agent output. When a user asks "what did my agents find?" or "are there any compliance risks I should know about?", the chat agent retrieves the latest intelligence cards and presents them conversationally. It can explain a risk flag, provide additional context, or help the user take action on a finding.
This creates a natural workflow: cloud agents surface issues overnight, the user reviews the dashboard in the morning, and when they want to dig deeper into a finding, they open a chat conversation. The chat agent already knows what the cloud agents found and can pick up where they left off.
The reverse path also works: patterns that emerge from user chat interactions inform how we tune cloud agent prompts. If users frequently ask about a specific type of risk that cloud agents aren't surfacing, that's a signal to adjust the agent's focus.
What We've Learned
Building autonomous AI agents has taught us lessons that aren't obvious from building conversational ones:
Structured output is harder than free text. Getting an LLM to consistently produce valid, schema-compliant JSON is a meaningfully different challenge than getting it to write a good paragraph. The models want to explain, and you need to channel that analytical energy into structured fields.
Scope is everything. A focused agent with a clear domain produces better intelligence than a broad agent with access to everything. The agent-per-quadrant architecture was the single biggest quality improvement we made.
Partial results beat failed runs. In a batch system, you can't ask the user what to do when something goes wrong. Graceful degradation — producing the cards you can and noting the gaps — is essential.
Users want control. Custom agents turned out to be one of the most requested features. Compliance professionals know their specific risks better than any general-purpose system. Giving them the ability to program autonomous monitors in natural language unlocked significant value.
The feedback loop matters. Cloud agents and chat agents are more valuable together than either is alone. Proactive intelligence sets the agenda; conversational AI provides the depth. Designing for this bidirectional flow from the start shaped both systems for the better.
Looking Forward
Cloud agents represent a fundamental shift in how we think about AI in compliance. The question isn't "what can the AI answer when asked?" but "what should the AI be checking without being asked?"
As we expand the tool layer with more data sources and as the regulatory landscape grows more complex, the value of autonomous monitoring increases. The compliance professional's day shouldn't start with manually checking six government databases and three tracking services. It should start with a dashboard that tells them exactly what changed overnight and what needs their attention.
That's what cloud agents deliver. And the engineering challenge of making that reliable, scalable, and useful is one of the most rewarding problems we get to work on.
Atlas Verified's cloud agents autonomously monitor supply chains, scan government databases, and surface compliance risks — so compliance professionals can focus on decisions, not data gathering.