How Conversational AI Integration with Legacy Systems Works (CRM, ERP, APIs)

Discover the engineering behind Autonomous AI Agents: how Language Models (LLMs) communicate in real time with legacy CRMs and ERPs through corporate APIs.

Marlos Carmo

June 6, 2026

7 min read

TL;DR

The true utility of corporate Artificial Intelligence lies not in generating text but in executing actions. This requires Autonomous Agents (driven by LLMs) to connect to legacy systems (ERPs, CRMs, legacy databases). Through a technique called **Function Calling (Tool Calling)**, the AI converts the customer's natural language into validated JSON payloads, the platform fires REST/SOAP requests to the company's APIs, and the AI translates the technical response (e.g., shipping status) back into empathetic human dialogue.

When we discuss the impact of Artificial Intelligence on the corporate market, the general focus falls on language fluency: how the machine sounds natural, how it demonstrates empathy, and how it understands complex contexts. But for CTOs and software architects, the real conversation is different.

An AI that only knows how to "talk" beautifully is of little use to a large-scale operation. If the customer wants to cancel an internet plan, the AI needs to reach into the company's legacy billing system, pause the credit card charge, verify contractual penalties in the ERP, and register the cancellation.

In this engineering deep dive, we break down how the integration of advanced Conversational AI with corporate legacy systems works behind the scenes.

The Agent Paradigm: How AI Learns to "Act"

Historically, connecting a chat interface to a database required rigid paths. If the customer clicked the "My Orders" button, the button's handler fired a static GET request to /api/orders/{userId} and returned the status on screen. Everything was predictable, rigid, and hardcoded.

With Conversational AI (state-of-the-art LLMs, such as GPT-4o or Claude 3.5), the architecture changed drastically thanks to a mechanism known as Function Calling (Tool Calling / Tools).

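The machine has no rigid routes. The AI Agent is fed not only the conversation history but also the API schema of your legacy system: a description of each available operation and the parameters it accepts. The model is not retrained on this documentation; it reads it as context on every request, and that is how it "learns" what it can do.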
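To make the idea of an "API schema" concrete, here is a minimal sketch of how one operation might be described to the model. The tool name fetch_invoice_status_erp and the cpf and reference_month parameters come from the walkthrough in this article; the descriptions, the patterns, and the tools_for_prompt helper are assumptions for illustration, and the exact envelope differs between LLM providers.

```python
import json

# Hypothetical tool description in JSON Schema style.
# The field layout is an assumption; each LLM provider wraps it differently.
FETCH_INVOICE_STATUS_TOOL = {
    "name": "fetch_invoice_status_erp",
    "description": "Look up the payment status of a customer's invoice in the legacy ERP.",
    "parameters": {
        "type": "object",
        "properties": {
            "cpf": {
                "type": "string",
                "pattern": "^[0-9]{11}$",
                "description": "Customer CPF, digits only.",
            },
            "reference_month": {
                "type": "string",
                "pattern": "^(0[1-9]|1[0-2])-[0-9]{4}$",
                "description": "Billing month in MM-YYYY format.",
            },
        },
        "required": ["cpf", "reference_month"],
        "additionalProperties": False,
    },
}


def tools_for_prompt(tools: list) -> str:
    """Serialize the tool list so it can be sent alongside the conversation history."""
    return json.dumps(tools, indent=2)


print(tools_for_prompt([FETCH_INVOICE_STATUS_TOOL]))
```

The description fields matter as much as the types: they are the only documentation the model sees when deciding which tool fits the customer's request.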

The technical sequence happens like this:

Prompt Reception: The customer sends on WhatsApp: "Hey, the installment for my motorcycle this month still hasn't cleared in the app; can you check if something went wrong at the bank?"
Internal Reasoning: The LLM understands the intent. It concludes: "I need to find the financial status of the CPF (the Brazilian taxpayer ID) linked to this phone number for the current month."
Tool Selection: The AI analyzes the list of "Tools" that developers connected to it. It finds the tool fetch_invoice_status_erp.
Payload Generation: The AI translates the request into structured JSON that matches the tool's schema (e.g., {"cpf": "12345678900", "reference_month": "06-2026"}) and returns this JSON to the orchestration layer.
The Execution: The platform's orchestration (such as Tolky's engine) validates this JSON, fires the secure HTTPS request against the client's legacy ERP, and receives the raw return (e.g., {"status": "pending_clearing_bank"}).
The Natural Response: The Agent receives the technical data, understands the meaning, and formulates the customer's response: "I checked here in the system! Your payment is processing (awaiting bank clearance). Since you paid via bank slip yesterday, it takes up to 48 business hours. Rest assured it's already on our radar!"
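The execution step of the sequence above can be sketched as follows. This is a simplified illustration, not any platform's actual engine: the model's output is hardcoded as a JSON string, the ERP call is replaced by a stub that returns the status from the example, and the names TOOLS, validate_invoice_args, and execute_tool_call are hypothetical.

```python
import json
import re
from typing import Callable, Dict


def fetch_invoice_status_erp(cpf: str, reference_month: str) -> dict:
    # Stub standing in for the HTTPS request to the legacy ERP.
    return {"status": "pending_clearing_bank"}


# Registry of the tools the developers connected to the agent.
TOOLS: Dict[str, Callable[..., dict]] = {
    "fetch_invoice_status_erp": fetch_invoice_status_erp,
}


def validate_invoice_args(args: dict) -> None:
    """Reject payloads that do not match the expected shape before touching the ERP."""
    if set(args) != {"cpf", "reference_month"}:
        raise ValueError("unexpected or missing arguments")
    if not re.fullmatch(r"[0-9]{11}", str(args["cpf"])):
        raise ValueError("cpf must be 11 digits")
    if not re.fullmatch(r"(0[1-9]|1[0-2])-[0-9]{4}", str(args["reference_month"])):
        raise ValueError("reference_month must be MM-YYYY")


def execute_tool_call(tool_call_json: str) -> dict:
    """Parse the model's tool call, validate it, and dispatch to the matching tool."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    validate_invoice_args(call["arguments"])
    return TOOLS[name](**call["arguments"])


# What the model would hand back to the orchestration layer in the example.
model_output = json.dumps({
    "name": "fetch_invoice_status_erp",
    "arguments": {"cpf": "12345678900", "reference_month": "06-2026"},
})
raw_result = execute_tool_call(model_output)
print(raw_result)
```

The raw result is then appended to the conversation so the model can phrase the final answer. Note that validation happens in ordinary code, outside the model, so a malformed payload never reaches the ERP.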

The Legacy Systems Abyss: ERPs, CRMs, and Mainframe Databases

In theory, Function Calling is beautiful. In daily corporate practice, APIs are rarely modern, clean, or fast. Major automakers, banks, logistics companies, and health insurers operate on top of "silicon dinosaurs": legacy systems developed 10, 15, or 20 years ago.

Integrating the modern fluency of AI with these systems requires robust strategies for working around their failures.

The SOAP and XML Problem

While LLM tooling expects JSON and modern RESTful APIs, much of corporate Brazil still runs on SOAP and intricate XML responses. The solution: Enterprise AI platforms (like Tolky) use Middleware layers (Integration Orchestrators). The AI Agent always generates and reads JSON. The Middleware intercepts the call, wraps it in XML, authenticates to the legacy server via SOAP, and, when the verbose ERP response arrives, translates it back to JSON so the Agent reads clean, structured data instead of guessing at raw XML and hallucinating.
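The translation step such a middleware performs can be sketched with Python's standard library. This is a minimal illustration under assumptions: the operation name GetInvoiceStatus and the flat response shape are hypothetical, real SOAP services usually add their own namespaces, headers, and authentication, and the HTTP transport is left out.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)


def json_to_soap(operation: str, args: dict) -> str:
    """Wrap the agent's JSON arguments in a SOAP envelope for the legacy server."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for key, value in args.items():
        ET.SubElement(op, key).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def soap_to_json(xml_text: str) -> dict:
    """Flatten the first element inside the SOAP body into a plain dict."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("SOAP body is missing or empty")
    response = body[0]
    # Drop any namespace prefix from child tags so the agent sees simple keys.
    return {child.tag.split("}")[-1]: (child.text or "") for child in response}


request_xml = json_to_soap(
    "GetInvoiceStatus", {"cpf": "12345678900", "reference_month": "06-2026"}
)

# Hypothetical response, shaped the way a legacy ERP might answer.
response_xml = (
    f'<soap:Envelope xmlns:soap="{SOAP_NS}"><soap:Body>'
    "<GetInvoiceStatusResponse><status>pending_clearing_bank</status>"
    "</GetInvoiceStatusResponse></soap:Body></soap:Envelope>"
)
print(soap_to_json(response_xml))
```

Keeping the conversion in the middleware means the tool schema shown to the model never has to mention XML at all.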

The Extreme Latency Challenge

LLMs already have inherent latency: the wait before the first token appears (known as Time to First Token) plus the time to generate the rest of the response. When you add a legacy ERP that takes 12 seconds to answer a query against a gigantic database, the customer is left staring at a silent WhatsApp chat, thinking the bot "froze." The solution: Asynchronous Conversational UX. Agents are programmed to send natural holding messages. While the request runs in the background in the legacy system, the AI Agent types: "Just a moment, I'm accessing your region's logistics server..." This reassures the user and masks the infrastructure delay.
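One way to implement a holding message is to start the slow request as a background task and send the message only if the task has not finished within a short patience window. The sketch below uses asyncio; slow_erp_query simulates the legacy system with a sleep, and the function names and the patience_s parameter are assumptions for illustration.

```python
import asyncio
from typing import Any, Awaitable, Callable, Coroutine, List, Tuple

HOLDING_MESSAGE = "Just a moment, I'm accessing your region's logistics server..."


async def slow_erp_query(delay_s: float) -> dict:
    # Simulates a legacy ERP that takes delay_s seconds to answer.
    await asyncio.sleep(delay_s)
    return {"status": "in_transit"}


async def answer_with_holding_message(
    erp_call: Coroutine[Any, Any, dict],
    send: Callable[[str], Awaitable[None]],
    patience_s: float,
) -> dict:
    """Run the ERP call in the background; send a holding message if it is slow."""
    task = asyncio.create_task(erp_call)
    done, _pending = await asyncio.wait({task}, timeout=patience_s)
    if not done:
        # The ERP is still working: tell the customer instead of staying silent.
        await send(HOLDING_MESSAGE)
    return await task


async def demo(erp_delay_s: float, patience_s: float) -> Tuple[List[str], dict]:
    sent: List[str] = []

    async def send(text: str) -> None:
        sent.append(text)  # A real agent would post to the chat channel here.

    result = await answer_with_holding_message(
        slow_erp_query(erp_delay_s), send, patience_s
    )
    return sent, result


print(asyncio.run(demo(erp_delay_s=0.2, patience_s=0.05)))
```

If the ERP answers within the patience window, no holding message is sent, so fast queries are not cluttered with unnecessary filler.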

Security, Governance, and Damage Limitation (Guardrails)

One of a CTO's biggest concerns when deploying Conversational AI is security and compliance. The classic question is: "What if the AI goes rogue and starts refunding everyone in the financial system?"

In enterprise integration architecture, the LLM never has direct database access. The model runs in an isolated, sterile zone.

The critical concept here is building Guardrails (Protection Fences) at the API level.

Orchestration Protection Layers:

Passive Authentication: The Agent does not have the administrator password. It sends the intent of the action, but the orchestration intercepts it and applies a restricted OAuth token scoped solely to the client in that session. If the AI tries to request client B's data while talking to client A, the company's API will reject the call with HTTP 403 (Forbidden).
Mathematical Hard-Limits in Code: The AI might try to execute the command apply_discount({"value": 90}). But the integration middleware has a hardcoded rule (not influenceable by prompts) saying: if (discount > 30) throw Error. The AI receives the error and is forced to say to the customer: "Sorry, my system limit for discounts is 30%."
Human-in-the-Loop (Critical Decisions): For integrations with high-risk systems (e.g., canceling a life insurance policy in the legacy system), the API requires human approval. The AI Agent packages all data, creates the request, and sends it to the human supervisor's screen. The supervisor clicks "Approve" and the legacy system executes the action.
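The three protection layers above can be sketched in plain code. This is a simplified, in-process illustration under assumptions: a real deployment would enforce client scoping through the OAuth token at the company's API rather than with a local check, and the names Session, authorize, request_action, and HIGH_RISK_ACTIONS are hypothetical. Only apply_discount and the 30% limit come from the text.

```python
from dataclasses import dataclass, field
from typing import List

MAX_DISCOUNT_PERCENT = 30  # Hard limit from the example; lives in code, not in the prompt.
HIGH_RISK_ACTIONS = {"cancel_life_insurance_policy"}  # Hypothetical action name.


class GuardrailError(Exception):
    """Raised when a request breaks a hardcoded business rule."""


@dataclass
class Session:
    client_id: str
    pending_approvals: List[dict] = field(default_factory=list)


def authorize(session: Session, requested_client_id: str) -> None:
    # Stand-in for the scoped token: a session may only touch its own client.
    if requested_client_id != session.client_id:
        raise PermissionError("403 Forbidden: token is not valid for this client")


def apply_discount(session: Session, client_id: str, value: float) -> dict:
    authorize(session, client_id)
    if value > MAX_DISCOUNT_PERCENT:
        raise GuardrailError(f"discount limit is {MAX_DISCOUNT_PERCENT}%")
    return {"client_id": client_id, "discount_percent": value, "status": "applied"}


def request_action(session: Session, client_id: str, action: str, payload: dict) -> dict:
    authorize(session, client_id)
    if action in HIGH_RISK_ACTIONS:
        # Park the request for a human supervisor instead of executing it.
        ticket = {"action": action, "payload": payload, "status": "awaiting_human_approval"}
        session.pending_approvals.append(ticket)
        return ticket
    return {"action": action, "payload": payload, "status": "executed"}


session = Session(client_id="client-A")
print(apply_discount(session, "client-A", 15))
```

Because every rule is checked outside the model, a manipulated prompt can change what the agent asks for but not what the middleware allows.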

The Intelligence Cycle: Feeding Back into the CRM

A successful integration is not just a one-way street (where AI fetches data from the system). The real power is when AI writes intelligence back into your systems.

Historically, the richest data in a company (the customer's voice and pain in daily conversations) died in unstructured log databases, impossible for managers to read.

With deep integration, the Conversational AI Agent acts as an autonomous data-entry clerk. After finishing a 40-minute WhatsApp service interaction that resolved a complex return in the legacy ERP, the Agent generates a three-line summary ("Customer dissatisfied with logistics delay, but retained subscription after 15% discount") and calls the corporate CRM's API (Salesforce, HubSpot, or Tolky's native AI CRM, for example) to update the customer card.
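A write-back step of this kind might look like the sketch below. It is deliberately vendor-neutral: the payload fields, the /customers/{id}/notes path, and the injected post function are assumptions for illustration and do not correspond to the actual Salesforce, HubSpot, or Tolky APIs.

```python
import json
from datetime import datetime, timezone
from typing import Callable


def build_crm_note(
    customer_id: str,
    summary: str,
    channel: str,
    duration_minutes: int,
    max_lines: int = 3,
) -> dict:
    """Package the agent's summary as a structured note for the customer card."""
    lines = [line.strip() for line in summary.splitlines() if line.strip()]
    if not lines:
        raise ValueError("summary is empty")
    if len(lines) > max_lines:
        raise ValueError(f"summary must be at most {max_lines} lines")
    return {
        "customer_id": customer_id,
        "summary": "\n".join(lines),
        "channel": channel,
        "duration_minutes": duration_minutes,
        "created_at": datetime.now(timezone.utc).isoformat(),
        "author": "ai_agent",
    }


def write_back(note: dict, post: Callable[[str, str], int]) -> bool:
    """Send the note through an injected HTTP function; True on a 2xx status."""
    status = post(f"/customers/{note['customer_id']}/notes", json.dumps(note))
    return 200 <= status < 300


def fake_post(path: str, body: str) -> int:
    # Stand-in for a real HTTP client call to the CRM.
    print(path, body)
    return 201


note = build_crm_note(
    customer_id="42",
    summary="Customer dissatisfied with logistics delay, but retained subscription after 15% discount",
    channel="whatsapp",
    duration_minutes=40,
)
print(write_back(note, fake_post))
```

Injecting the post function keeps the summary logic testable without network access and makes it easy to swap in whichever CRM client the company actually uses.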

The salesperson or director no longer needs to read hundreds of messages in the panel. The AI extracted, summarized, and reinjected the "juice" of intelligence directly into the veins of the executive management system.

The Technical Verdict

The technical complexity of scaling artificial intelligence in corporate environments does not lie in choosing the OpenAI or Anthropic model. The abyss lies in reliable integration and API orchestration in fragmented architectures.

It doesn't help to invest millions in the world's best LLM if your data pipes don't support the traffic or lack the necessary mathematical security fences.

Modern architecture requires platforms that abstract this burden from the company's engineering team. A solution like Tolky differentiates itself precisely at this obscure layer: it acts as the central nervous system. Tolky handles the complexity of the LLM, Rate Limiting, Timeouts from slow APIs, and secure packaging, allowing legacy companies to become hyper-modern giants without needing to rewrite their systems from scratch.

The true turning point of 2026 is clear: software doesn't just converse. Software works.