Build a WhatsApp AI support agent with n8n: RAG answers, per-contact memory, auto-replies
Wire the WhatsApp Business Cloud API into an n8n RAG agent that answers from your own docs, remembers each contact's conversation, and replies automatically, with a sales handoff built in.
WhatsApp is where a lot of inbound actually lands, and most businesses answer it with a human copying canned replies or, worse, leaving messages on read for hours. The fix is not a dumb autoresponder that says "we'll get back to you." It is an agent that reads the question, answers from your real knowledge base, remembers the conversation, and nudges interested people toward a booking.
This guide builds exactly that in n8n: an inbound WhatsApp agent on the official Business Cloud API, grounded in your own documents with retrieval-augmented generation (RAG), with per-contact memory so it never forgets what the customer just said.
Available resources#
This build uses one workflow plus a small amount of backing infrastructure:
- n8n workflow "WhatsApp Client Engagement" (webhook verification + message handling + RAG reply).
- A Supabase vector store (whatsapp_documents) holding your embedded knowledge base.
- A Postgres table (whatsapp_n8n_chat_histories) for per-contact conversation memory.
What you'll need#
Before you begin, make sure you have:
- An n8n account (cloud or self-hosted) with the workflow editor.
- A Meta WhatsApp Business app with the Cloud API enabled: a phone number ID, a permanent access token, and a verify token you choose.
- An OpenAI API key for both the chat model (gpt-4o-mini) and embeddings (text-embedding-3-small).
- A Supabase project with pgvector enabled, a whatsapp_documents table, and the match_documents query function.
- A Postgres database (the Supabase one works) for the chat memory table.
Overview of the automation#
There are two jobs in one workflow. They share a single webhook path (fusionsk).
- Verification. Meta sends a one-time GET handshake to confirm you own the webhook. The workflow echoes back the hub.challenge value.
- Message handling. Every inbound message hits the same path as a POST. The workflow extracts the text and sender, runs a RAG agent grounded in your docs and the conversation history, and sends the reply back over the WhatsApp Cloud API.
The workflow runs these stages in order:
- Verification: echo hub.challenge back to Meta once.
- Receive: inbound message events arrive from the Cloud API.
- Extract: pull chatInput (message text) and sessionid (wa_id); drop empty events.
- Agent: OpenAI chat model + Postgres memory + Supabase document retrieval.
- Reply: answer the original sender with the agent output.
What comes out:
- WhatsApp: automatic, grounded reply per message
- Postgres + Supabase: memory and knowledge that persist
The design decision that makes this reliable is grounding plus memory. RAG keeps answers tied to your real documents instead of the model's imagination, and per-contact memory keyed on the WhatsApp ID means a back-and-forth conversation actually holds together.
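To make the per-contact keying concrete, here is a minimal in-memory sketch of what a memory store keyed on the WhatsApp ID does. It is not the n8n Postgres Chat Memory node; SessionMemory, windowSize, and the sample messages are hypothetical names and data chosen for illustration.

```javascript
// Illustrative stand-in for the chat memory table (hypothetical, not an n8n API).
// Each wa_id gets its own thread, so one contact's context never leaks into another's.
class SessionMemory {
  constructor(windowSize = 10) {
    this.windowSize = windowSize; // assumption: keep only the most recent turns
    this.threads = new Map();     // wa_id -> array of { role, text }
  }

  append(sessionId, role, text) {
    const thread = this.threads.get(sessionId) ?? [];
    thread.push({ role, text });
    // Trim to the most recent windowSize turns.
    this.threads.set(sessionId, thread.slice(-this.windowSize));
  }

  history(sessionId) {
    // Return a copy so callers cannot mutate the stored thread.
    return [...(this.threads.get(sessionId) ?? [])];
  }
}

const memory = new SessionMemory();
memory.append("15550001111", "user", "Do you offer onboarding?");
memory.append("15550001111", "assistant", "Placeholder answer from the docs.");
memory.append("15550002222", "user", "What are your hours?");
```

If every call used the same key instead of the contact's wa_id, the third message would land in the first contact's thread, which is exactly the context bleed that per-contact keying avoids.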
Step-by-step setup#
1. Stand up the knowledge base in Supabase#
Create a whatsapp_documents table with a vector column and the match_documents function (the standard Supabase + LangChain pgvector setup). Embed your source material (service descriptions, FAQs, pricing notes, policies) with text-embedding-3-small and load it in.
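For intuition about what the retrieval step does with those rows, here is a small in-memory sketch of similarity search: rank stored rows by cosine similarity to the query embedding and keep the top few. This is an illustration only, not the Supabase function itself. The row shape (content, metadata, embedding) follows the usual LangChain-style table, the three-dimensional vectors are toy values (text-embedding-3-small returns 1536 dimensions by default), and matchDocuments here is a hypothetical JavaScript stand-in.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical in-memory stand-in for a match_documents-style query:
// score every row against the query embedding, sort descending, keep matchCount.
function matchDocuments(rows, queryEmbedding, matchCount) {
  return rows
    .map((row) => ({
      content: row.content,
      metadata: row.metadata,
      similarity: cosineSimilarity(row.embedding, queryEmbedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}

// Toy rows with made-up 3-dimensional embeddings.
const rows = [
  { content: "Pricing notes", metadata: { source: "pricing.md" }, embedding: [0.9, 0.1, 0.0] },
  { content: "Refund policy", metadata: { source: "policies.md" }, embedding: [0.0, 0.2, 0.9] },
  { content: "Service FAQ", metadata: { source: "faq.md" }, embedding: [0.6, 0.6, 0.1] },
];

const top = matchDocuments(rows, [1, 0, 0], 2);
```

With these toy vectors the pricing row ranks first and the FAQ row second, because their embeddings point closest to the query direction.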
2. Handle the Meta verification handshake#
Add a Webhook node on path fusionsk set to respond via a response node. Meta calls it once with a GET and a hub.challenge query parameter. Wire it to a Respond to Webhook node that returns that value as plain text:
{{ $json.query['hub.challenge'] }}
Enter the same verify token in n8n and in the Meta dashboard. Once Meta gets the challenge back, the webhook is verified.
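The workflow above simply echoes the challenge. A stricter variant would also compare the token Meta sends before echoing. The sketch below shows that check as a plain function, assuming the query parameters hub.mode, hub.verify_token, and hub.challenge that Meta includes in the verification GET. answerVerification and the sample token are hypothetical names for illustration, not part of the original workflow.

```javascript
// Decide how to answer Meta's GET verification request.
// `query` is the parsed query string; `expectedToken` is the verify token you chose.
// answerVerification is a hypothetical helper, e.g. for an n8n Code node.
function answerVerification(query, expectedToken) {
  const isSubscribe = query["hub.mode"] === "subscribe";
  const tokenMatches = query["hub.verify_token"] === expectedToken;

  if (isSubscribe && tokenMatches) {
    // Echo the challenge back as plain text to complete the handshake.
    return { status: 200, body: String(query["hub.challenge"]) };
  }
  // Anything else is not a valid verification attempt.
  return { status: 403, body: "Forbidden" };
}

const ok = answerVerification(
  { "hub.mode": "subscribe", "hub.verify_token": "my-verify-token", "hub.challenge": "1158201444" },
  "my-verify-token"
);
const rejected = answerVerification(
  { "hub.mode": "subscribe", "hub.verify_token": "wrong", "hub.challenge": "1158201444" },
  "my-verify-token"
);
```

Rejecting mismatched tokens means a random caller cannot get your endpoint to confirm itself as a webhook.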
3. Receive inbound messages#
Add a second Webhook node on the same fusionsk path, this time POST. This is where the Cloud API delivers message events. A Set node pulls the two fields the agent needs out of Meta's deeply nested payload:
chatInput = {{ $('Received New Message').first().json.body.entry[0].changes[0].value.messages[0].text.body }}
sessionid = {{ $('Received New Message').first().json.body.entry[0].changes[0].value.contacts[0].wa_id }}
sessionid is the contact's WhatsApp ID. It becomes the memory key, so each person gets their own conversation thread.
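The same extraction can be written as a small function, which makes the nesting easier to see and avoids errors when a level is missing. This is a sketch, not a replacement for the Set node. extractFields and the sample payload are illustrative, and the payload is trimmed to the fields the expressions above read.

```javascript
// Pull the two fields the agent needs out of a webhook body.
// Optional chaining returns undefined instead of throwing when a level is absent,
// and the ?? fallback turns that into an empty string.
function extractFields(body) {
  const value = body?.entry?.[0]?.changes?.[0]?.value;
  return {
    chatInput: value?.messages?.[0]?.text?.body ?? "",
    sessionid: value?.contacts?.[0]?.wa_id ?? "",
  };
}

// Trimmed example of an inbound text message event (placeholder values).
const sampleBody = {
  object: "whatsapp_business_account",
  entry: [
    {
      changes: [
        {
          field: "messages",
          value: {
            contacts: [{ wa_id: "15550001111", profile: { name: "Test Contact" } }],
            messages: [
              { from: "15550001111", type: "text", text: { body: "Do you offer onboarding?" } },
            ],
          },
        },
      ],
    },
  ],
};

const fields = extractFields(sampleBody);
```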
4. Filter out noise#
Meta sends webhook events for delivery receipts, read status, and more, not just text. A Filter node only proceeds when both chatInput and sessionid are non-empty, so status callbacks do not trigger a pointless agent run.
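As a rough sketch of that filter's logic, the function below returns true only when both fields are present. shouldRunAgent and the sample events are hypothetical. The status event is trimmed and assumes the statuses array that delivery and read callbacks carry in place of messages. Trimming whitespace is a small addition beyond a plain non-empty check.

```javascript
// Return true only for events that carry message text and a sender ID.
function shouldRunAgent(body) {
  const value = body?.entry?.[0]?.changes?.[0]?.value;
  const chatInput = value?.messages?.[0]?.text?.body ?? "";
  const sessionid = value?.contacts?.[0]?.wa_id ?? "";
  return chatInput.trim() !== "" && sessionid.trim() !== "";
}

// A text message event (trimmed, placeholder values).
const textEvent = {
  entry: [{ changes: [{ field: "messages", value: {
    contacts: [{ wa_id: "15550001111" }],
    messages: [{ from: "15550001111", type: "text", text: { body: "Hi there" } }],
  } }] }],
};

// A delivery receipt: no messages array, so both fields come back empty.
const statusEvent = {
  entry: [{ changes: [{ field: "messages", value: {
    statuses: [{ id: "wamid.EXAMPLE", status: "delivered", recipient_id: "15550001111" }],
  } }] }],
};

const runForText = shouldRunAgent(textEvent);     // message events proceed
const runForStatus = shouldRunAgent(statusEvent); // status callbacks are dropped
```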
5. Build the RAG agent#
Add an AI Agent node and wire four things into it:
- Chat model. OpenAI Chat Model (gpt-4o-mini) as the language model.
- Memory. Postgres Chat Memory using sessionid as the key and table whatsapp_n8n_chat_histories. This is what gives the bot continuity per contact.
- Document retrieval. A Vector Store tool named user_documents backed by the Supabase Vector Store (table whatsapp_documents, query match_documents) with Embeddings OpenAI. The agent calls this tool to fetch relevant context before answering.
- System prompt. Your agent persona and rules.
The prompt does the sales work. It answers service questions, gauges interest, and branches:
Interested -> share the booking link (cal.com/your-handle)
Not interested -> point to the website and leave the door open
6. Send the reply#
The agent output goes to the WhatsApp Business Cloud node (operation send). Set your phoneNumberId, and address the reply to the original sender:
recipient = {{ $('Received New Message').first().json.body.entry[0].changes[0].value.messages[0].from }}
text = {{ $json.output }}
That closes the loop: message in, grounded answer out, conversation remembered.
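For reference, the send operation corresponds roughly to one POST against the Cloud API messages endpoint. The sketch below only builds that request so you can inspect it. buildTextReply, the placeholder IDs, and the pinned API version are assumptions for illustration, and the n8n node handles all of this for you.

```javascript
// Assumption: use whichever Graph API version your Meta app targets.
const GRAPH_API_VERSION = "v20.0";

// Build (but do not send) the HTTP request for a plain text reply.
// buildTextReply is a hypothetical helper, not part of n8n.
function buildTextReply({ phoneNumberId, accessToken, recipient, text }) {
  return {
    url: `https://graph.facebook.com/${GRAPH_API_VERSION}/${phoneNumberId}/messages`,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        messaging_product: "whatsapp",
        to: recipient,
        type: "text",
        text: { body: text },
      }),
    },
  };
}

const request = buildTextReply({
  phoneNumberId: "PHONE_NUMBER_ID",   // placeholder
  accessToken: "ACCESS_TOKEN",        // placeholder
  recipient: "15550001111",           // the `from` value of the inbound message
  text: "Thanks for reaching out!",
});
// To actually send it: await fetch(request.url, request.options);
```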
Testing the workflow#
- Verify the webhook. Add the fusionsk URL and your verify token in the Meta dashboard. Confirm the subscription turns green (the hub.challenge echo succeeded).
- Send a real message. Message your WhatsApp Business number from a personal phone. You should get an answer within a few seconds.
- Check grounding. Ask something only answerable from your docs. Confirm the reply uses that content rather than a generic guess. If it guesses, your vector store is empty or match_documents is misconfigured.
- Check memory. Send a follow-up like "and how much is that?" without repeating context. The agent should understand it from history. Confirm a row exists in whatsapp_n8n_chat_histories for your wa_id.
- Check the handoff. Express interest and confirm it shares the booking link; decline and confirm it shares the website instead.
Customization options#
- Swap the model. Replace gpt-4o-mini with another OpenAI model or a different provider; the agent wiring stays the same.
- Handle media and buttons. Extend the Set node to read image, audio, or interactive button payloads, not just text.body.
- Add a human handoff. Detect intent like "talk to a person" and route the conversation to a Slack channel or a live-agent queue.
- Log leads. Append qualified conversations to a CRM or Google Sheet so interest does not live only in chat history.
- Add guardrails. Cap response length, add a profanity or off-topic filter, and set a fallback message for when retrieval returns nothing relevant.
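As one possible shape for the "Handle media and buttons" option, the sketch below maps a single inbound message object to the text the agent should see. messageToChatInput is a hypothetical helper. The field names reflect my understanding of the Cloud API webhook payloads, so check them against a real event before relying on them. The bracketed placeholders for media are an arbitrary choice.

```javascript
// Turn one inbound message object into agent input text.
// Unknown types return an empty string so the empty-event filter drops them.
function messageToChatInput(message) {
  switch (message?.type) {
    case "text":
      return message.text?.body ?? "";
    case "button":
      // Reply to a template quick-reply button.
      return message.button?.text ?? "";
    case "interactive": {
      // Reply to an interactive button or list message.
      const reply = message.interactive?.button_reply ?? message.interactive?.list_reply;
      return reply?.title ?? "";
    }
    case "image":
      // Assumption: pass the caption along and flag that an image was attached.
      return message.image?.caption ? `[image] ${message.image.caption}` : "[image]";
    case "audio":
      // Audio would need transcription first; here it is only flagged.
      return "[audio message]";
    default:
      return "";
  }
}

const fromButton = messageToChatInput({
  type: "interactive",
  interactive: { type: "button_reply", button_reply: { id: "book", title: "Book a call" } },
});
```

Returning an empty string for unsupported types keeps the existing empty-event filter as the single gate in front of the agent.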
Common mistakes that quietly break this#
- Skipping the empty-event filter. Without it, delivery and read receipts trigger the agent and burn tokens on nothing.
- Reusing one global memory key. Memory must key on wa_id. A shared key bleeds one customer's context into another's conversation.
- Returning markdown to WhatsApp. Asterisks and hashes show up literally. Force plain text in the prompt.
- An empty or stale vector store. RAG only grounds answers if documents are actually embedded and current. Re-index when your offering changes.
- Mismatched verify tokens. If the token in n8n and Meta differ, the handshake fails and no messages ever arrive.
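Prompt instructions are not always obeyed, so a small cleanup step between the agent and the send node can act as a backstop for the markdown mistake above. The sketch below strips common markdown and caps the length. toPlainText is a hypothetical helper, the patterns cover only the most frequent cases, and the 4096-character cap is my assumption about the text message limit, so confirm it against the Cloud API documentation.

```javascript
// Assumption: maximum length of a WhatsApp text body.
const MAX_TEXT_LENGTH = 4096;

// Strip common markdown so the reply reads cleanly as plain text.
function toPlainText(markdown, maxLength = MAX_TEXT_LENGTH) {
  const plain = markdown
    .replace(/^#{1,6}[ \t]+/gm, "")                  // heading markers
    .replace(/\*\*(.+?)\*\*/g, "$1")                 // **bold**
    .replace(/__(.+?)__/g, "$1")                     // __bold__
    .replace(/`([^`]+)`/g, "$1")                     // `inline code`
    .replace(/\[([^\]]+)\]\(([^)]+)\)/g, "$1 ($2)")  // [label](url) -> label (url)
    .replace(/^[ \t]*[*+][ \t]+/gm, "- ")            // "* item" bullets -> "- item"
    .trim();
  // Hard cap as a last resort; the prompt should keep answers short anyway.
  return plain.length > maxLength ? plain.slice(0, maxLength) : plain;
}

const cleaned = toPlainText(
  "## Pricing\n**Starter** plan: see [our site](https://example.com)\n* monthly billing"
);
```

In n8n this could sit in a Code node just before the WhatsApp send step, with the agent output passed in as the argument.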
Conclusion#
You now have a WhatsApp agent that behaves like a sharp first responder: it answers from your real knowledge base, keeps each conversation straight with per-contact memory, replies in seconds over the official Cloud API, and steers interested people toward a booking. Point your knowledge base at it, verify the webhook once, and inbound WhatsApp stops being a queue someone has to babysit.