Enterprise Multi-Agent Slack Assistant

A multi-agent AI platform embedded in Slack that automates high-demand internal processes — document generation, credit-line increases, and assignment workflows — using Google ADK for agent orchestration, GPT-4.1 models for reasoning, and Azure Container Apps for serverless execution.
Project Overview
Internal teams — especially the Risk department — were overwhelmed by repetitive, time-consuming processes that required cross-area coordination: generating templated documents that depend on database lookups, evaluating credit-line increase requests from ticket data, processing line assignments, and reviewing pending balances against unregistered deposits.
Since Slack was already the company’s primary workspace, the project embedded an intelligent multi-agent assistant directly into it. Employees simply mention the bot in any channel, describe what they need in natural language, and the system orchestrates the appropriate AI agent to handle the request — querying databases, running evaluations, and returning results as threaded Slack messages or PDF documents.
Architecture
High-Level Flow
Slack mention → FastAPI webhook (Azure Container App) → ADK orchestrator → specialized sub-agent → threaded Slack reply or generated PDF
Infrastructure
- Azure Container App — Serverless deployment without the execution-time constraints of Azure Functions. This was critical because agent reasoning chains and database queries can take significant time. It also allows installing custom libraries and maintaining a structured codebase.
- FastAPI — Serves as the webhook receiver for Slack events and the HTTP layer managing request/response cycles.
- Slack Webhook — Configured to trigger only when the bot is mentioned, preventing unnecessary invocations and keeping conversations contextual.
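A minimal sketch of the event-handling logic behind that webhook. It assumes the standard Slack Events API payload shapes (the one-time `url_verification` handshake and `app_mention` events); the function name and return structure are illustrative, not taken from the actual codebase:

```python
def handle_slack_event(payload: dict) -> dict:
    """Decide what the webhook should do with an incoming Slack event payload."""
    # Slack sends a one-time URL verification challenge when the webhook is registered;
    # it must be echoed back for the subscription to activate.
    if payload.get("type") == "url_verification":
        return {"challenge": payload["challenge"]}

    event = payload.get("event", {})
    # Only direct mentions of the bot trigger processing, as described above.
    if event.get("type") == "app_mention":
        # Replies go to the existing thread if there is one, else start a new thread.
        return {"action": "enqueue", "thread_ts": event.get("thread_ts") or event.get("ts")}

    return {"action": "ignore"}
```

In the real service this function would sit behind a FastAPI POST route, with the `enqueue` branch handing the message off to the orchestrator.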
Response Flow
- A user mentions the bot in a Slack channel or thread.
- The Azure Container App receives the webhook and sends an intermediate message to Slack (e.g., “Processing your request…”).
- The request is forwarded to the ADK orchestrator agent.
- The orchestrator routes to the appropriate sub-agent.
- The sub-agent may ask follow-up questions back in the Slack thread if information is missing.
- Once complete, the result is posted as a threaded reply — either as text or a generated PDF document.
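The two Slack messages in this flow (the intermediate acknowledgment and the final threaded reply) both map to `chat.postMessage` payloads keyed by `thread_ts`. A sketch of the payload construction, with illustrative helper names:

```python
from typing import Optional

def ack_payload(channel: str, thread_ts: str) -> dict:
    """Intermediate message posted as soon as the webhook is received."""
    return {"channel": channel, "thread_ts": thread_ts, "text": "Processing your request…"}

def result_payload(channel: str, thread_ts: str,
                   text: Optional[str] = None,
                   pdf_url: Optional[str] = None) -> dict:
    """Final threaded reply: plain text, or a pointer to a generated PDF."""
    payload = {"channel": channel, "thread_ts": thread_ts}
    if pdf_url:
        payload["text"] = f"Your document is ready: {pdf_url}"
    else:
        payload["text"] = text or "Done."
    return payload
```

Posting both messages with the same `thread_ts` is what keeps the whole exchange inside one Slack thread.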
Agent Architecture
The system uses Google’s Agent Development Kit (ADK) to implement a hierarchical agent orchestration pattern with session memory and external tool integration.
Agent Hierarchy
| Agent | Role | Model |
|---|---|---|
| Orchestrator | Routes incoming messages to the correct sub-agent based on intent | GPT-4.1 mini |
| Promissory Notes Agent | Generates legal documents from templates, querying the database for required fields | GPT-4.1 |
| Line Assignment Agent | Processes credit-line assignment requests through the evaluation workflow | GPT-4.1 |
| Line Increase Agent | Evaluates increase requests based on ticket data and risk criteria | GPT-4.1 |
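The routing step in this hierarchy can be sketched in plain Python. In the real system the orchestrator is an ADK agent backed by GPT-4.1 mini; the keyword rules below are an illustrative stand-in for the model's intent classification, and the agent names mirror the table above:

```python
# Keyword triggers per sub-agent (stand-in for LLM intent classification).
SUB_AGENTS = {
    "promissory_notes": ("promissory", "note", "document"),
    "line_assignment": ("assign", "assignment"),
    "line_increase": ("increase", "raise"),
}

def route(message: str) -> str:
    """Return the name of the sub-agent that should handle the message."""
    text = message.lower()
    for agent, keywords in SUB_AGENTS.items():
        if any(keyword in text for keyword in keywords):
            return agent
    # No clear intent: stay with the orchestrator, which asks a follow-up question.
    return "orchestrator"
```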
Agent Design Principles
- Conversational sub-agents — Each sub-agent has clear instructions in its prompt defining when and how to ask the user for missing information. The conversation continues within the Slack thread until all required data is gathered.
- Function calling via .md specs — Each sub-agent’s available tools are defined in markdown files that describe the function signatures, parameters, and expected behavior — serving as both documentation and function-calling schemas.
- Return to orchestrator — Once a sub-agent completes its task, control returns to the orchestrator, which can route subsequent messages to a different agent if needed.
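The markdown tool specs can be parsed into a simple schema at load time. The spec format below (a heading carrying the function name, bullets with backtick-quoted parameters) is an assumption for illustration; the real files also double as documentation:

```python
import re

# Hypothetical example of a tool spec in the assumed markdown format.
SPEC = """\
## `generate_promissory_note`
Generates a promissory note PDF from the legal template.
Parameters:
- `client_id`: internal client identifier
- `amount`: principal amount of the note
"""

def parse_tool_spec(md: str) -> dict:
    """Extract the function name and parameter names from a markdown tool spec."""
    name = re.search(r"^#+\s*`?(\w+)`?", md, re.M).group(1)
    params = re.findall(r"^-\s*`(\w+)`", md, re.M)
    return {"name": name, "parameters": params}
```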
Standardized Folder Structure
The codebase was organized for easy expansion — adding a new agent means creating a new folder following the established pattern:
```
agents/
├── orchestrator.py
├── tools/
│   └── shared_utils.py
└── sub_agents/
    ├── promissory_notes/
    │   ├── agent.py
    │   ├── prompt.md
    │   └── tools/
    │       ├── db_queries.py
    │       └── pdf_generator.py
    ├── line_assignment/
    │   ├── agent.py
    │   ├── prompt.md
    │   └── tools/
    │       └── assignment_engine.py
    └── line_increase/
        ├── agent.py
        ├── prompt.md
        └── tools/
            └── evaluation_pipeline.py
```
Session Memory
Session memory was held in RAM, scoped to each area's request, and captured both the conversation history and the key information extracted from each interaction. This allowed the assistant to maintain context across multiple exchanges, even across different days, so employees didn't need to repeat themselves.
A migration to a vector database for persistent memory was planned to survive container restarts and enable cross-session knowledge retrieval, but was not implemented during this phase.
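A minimal sketch of such a store, keyed by Slack thread so context carries across exchanges (class and method names are illustrative, not from the codebase):

```python
class SessionMemory:
    """In-memory session store: conversation history plus extracted key facts."""

    def __init__(self):
        self._sessions: dict = {}

    def _session(self, thread_key: str) -> dict:
        # Lazily create a session per Slack thread (e.g. "channel_id:thread_ts").
        return self._sessions.setdefault(thread_key, {"history": [], "facts": {}})

    def append(self, thread_key: str, role: str, text: str) -> None:
        """Record one conversational turn."""
        self._session(thread_key)["history"].append((role, text))

    def remember(self, thread_key: str, key: str, value: str) -> None:
        """Store a key fact extracted from the conversation (e.g. a client id)."""
        self._session(thread_key)["facts"][key] = value

    def context(self, thread_key: str) -> dict:
        """Return everything an agent needs to resume the conversation."""
        return self._session(thread_key)
```

Because the store lives in process memory, it vanishes whenever the container is recycled, which is exactly the fragility that motivated the planned vector-database migration.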
External Integrations
Beyond the core agent logic, the platform also leveraged Dify — a low-code orchestration tool similar to N8N — for specific sequential workflows that didn’t require the full agent reasoning loop. These drag-and-drop pipelines handled deterministic, step-by-step processes without consuming agent compute resources.
Results & Lessons Learned
The system was deployed to production and successfully automated several high-demand workflows for the Risk team. However, production usage revealed challenges:
- Slack thread processing caused duplicate messages when multiple webhook events fired for the same interaction.
- The assignment and line-increase agents occasionally confused user messages in threaded conversations, leading to misrouted requests.
The system was kept under active observation for iterative improvements. Key takeaways:
- Slack’s threading model requires careful deduplication logic at the webhook layer.
- Agent routing accuracy improves significantly with stricter intent-classification prompts and conversation state tracking.
- In-memory session storage, while fast, creates fragility in serverless environments where containers can be recycled.
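One way to implement the deduplication the first takeaway calls for: Slack's retries redeliver the same `event_id`, so a small TTL cache at the webhook layer can drop repeats before they reach the orchestrator. This is a sketch, not the production implementation; names are illustrative:

```python
import time
from typing import Optional

class EventDeduper:
    """Remembers recently seen Slack event_ids for a bounded time window."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._seen: dict = {}  # event_id -> first-seen timestamp

    def is_duplicate(self, event_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Evict entries older than the TTL so memory stays bounded.
        self._seen = {eid: t for eid, t in self._seen.items() if now - t < self.ttl}
        if event_id in self._seen:
            return True
        self._seen[event_id] = now
        return False
```

The webhook handler would call `is_duplicate(payload["event_id"])` and return immediately on a hit, so retried deliveries never produce a second Slack reply.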
Tech Stack
| Layer | Technology |
|---|---|
| Interface | Slack API (Webhook) |
| Compute | Azure Container Apps (Serverless) |
| Framework | FastAPI (Python) |
| Agent Orchestration | Google ADK |
| LLM (Router) | GPT-4.1 mini |
| LLM (Agents) | GPT-4.1 |
| Workflow Automation | Dify |
| Storage | Internal Databases, In-Memory Session Store |
Conclusion
This project demonstrated that embedding AI agents directly into the tools employees already use (Slack) dramatically lowers the adoption barrier for automation. The hierarchical agent architecture — with a lightweight router and specialized sub-agents — proved effective for handling diverse business processes within a single conversational interface. The lessons learned from production deployment, particularly around Slack event deduplication and agent routing precision, provide a clear roadmap for the next iteration.