VoC Agent Architecture Patterns

How to structure a Voice of the Customer agent system — from data infrastructure to agent orchestration.

Data Infrastructure First

Before deploying any agents, you need a unified data layer:

CDP Stack (Ramp's choice): Snowflake + dbt + Hightouch
- Snowflake: warehouse + Cortex AI for natural language queries
- dbt: transforms raw data into queryable models
- Hightouch: reverse ETL pushes insights back into tools (Salesforce, Slack, etc.)
Unstructured data handling: Gong transcripts, support tickets, emails — all flowing into the warehouse
PII controls: Exclude SSNs and partial identifiers before data reaches the warehouse; build privacy in from day one
MCP Server: Snowflake now supports Model Context Protocol for direct LLM connectivity
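
The PII control above could be sketched as a redaction pass that runs before text is loaded into the warehouse. This is a minimal illustration, not Ramp's implementation: the two regex patterns, the placeholder strings, and the `redact_pii` name are all assumptions, and a real pipeline would likely need a broader detector.

```python
import re

# Assumed patterns: full US SSNs, and "last four ..." style partial identifiers.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
PARTIAL_RE = re.compile(
    r"\b(last\s+(?:four|4)(?:\s+digits)?\s*(?:are|is|:)?\s*)\d{4}\b",
    re.IGNORECASE,
)

def redact_pii(text: str) -> str:
    """Replace SSNs and partial identifiers with placeholders (hypothetical helper)."""
    text = SSN_RE.sub("[REDACTED_SSN]", text)
    # Keep the leading phrase, drop only the digits.
    return PARTIAL_RE.sub(lambda m: m.group(1) + "[REDACTED]", text)
```

Running this at ingestion time, rather than at query time, means downstream agents never see the raw values.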

Parallel Agent Architecture

Ramp's core pattern — launch multiple specialized agents simultaneously:

[Coordinator Agent]
    |
    ├── [Gong Agent] ——> scans call transcripts
    ├── [Zendesk Agent] ——> scans support tickets
    ├── [Competitor Agent] ——> scrapes competitor sites
    ├── [Codebase Agent] ——> searches internal code
    ├── [Salesforce Agent] ——> pulls CRM data
    └── [Usage Agent] ——> analyzes product analytics
    |
    v
[Synthesis Agent] ——> merges artifacts into insights

Each agent writes a markdown artifact with its findings
A coordinator agent applies map-reduce synthesis, identifying patterns that recur across all artifacts
6-10 agents per research task is Ramp's sweet spot
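
A rough sketch of the fan-out and synthesis flow, using `asyncio` for the parallel step. Everything here is illustrative: `run_agent` is a stand-in that returns canned themes instead of calling an LLM, and the "reduce" rule (keep themes reported by at least two sources) is an assumed heuristic, not Ramp's actual synthesis logic.

```python
import asyncio
from collections import Counter
from dataclasses import dataclass

@dataclass
class Artifact:
    source: str
    markdown: str
    themes: list[str]

async def run_agent(source: str, themes: list[str]) -> Artifact:
    # Stand-in for a real agent run (LLM plus tools); yields a markdown artifact.
    await asyncio.sleep(0)
    body = "\n".join(f"- {t}" for t in themes)
    return Artifact(source, f"# {source} findings\n{body}", themes)

def synthesize(artifacts: list[Artifact], min_sources: int = 2) -> list[str]:
    # Reduce step: keep themes that appear in at least `min_sources` artifacts.
    counts = Counter(t for a in artifacts for t in set(a.themes))
    return sorted(t for t, n in counts.items() if n >= min_sources)

async def research(plan: dict[str, list[str]]) -> list[str]:
    # Map step: launch every specialized agent at once and wait for all artifacts.
    artifacts = await asyncio.gather(*(run_agent(s, t) for s, t in plan.items()))
    return synthesize(list(artifacts))

PLAN = {  # hypothetical findings per source
    "gong": ["slow exports", "sso setup"],
    "zendesk": ["slow exports", "billing confusion"],
    "usage": ["sso setup", "slow exports"],
}
insights = asyncio.run(research(PLAN))
```

With the sample plan, only the themes seen by two or more agents survive synthesis; single-source findings stay in the individual artifacts for manual review.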

Agent Design Principles

Constrained Decision Spaces

Learned from Ramp's Tour Guide agent: the most effective improvement is constraining what the agent can do
Give agents narrow, well-defined tasks rather than broad mandates
Single-step action generation — agent produces one action at a time based on current state
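
One way to picture a constrained decision space with single-step action generation: the agent may only propose one action from a small allowed set, and anything else is rejected before it runs. The action names, the `Step` shape, and the `policy` callable are hypothetical; the notes do not describe the Tour Guide agent's actual action set.

```python
from dataclasses import dataclass
from enum import Enum

class Action(Enum):  # assumed, deliberately small action set
    CLICK = "click"
    TYPE = "type"
    DONE = "done"

@dataclass(frozen=True)
class Step:
    action: Action
    target: str = ""

def parse_step(raw: dict) -> Step:
    """Validate one model-proposed action against the allowed set."""
    try:
        action = Action(raw.get("action"))
    except ValueError:
        raise ValueError(f"action not allowed: {raw.get('action')!r}") from None
    if action is not Action.DONE and not raw.get("target"):
        raise ValueError("click/type actions need a target")
    return Step(action, raw.get("target") or "")

def run_episode(policy, state: dict, max_steps: int = 10) -> list[Step]:
    # The policy sees the current state and history and returns exactly one action.
    history: list[Step] = []
    for _ in range(max_steps):
        step = parse_step(policy(state, history))
        history.append(step)
        if step.action is Action.DONE:
            break
    return history
```

The `max_steps` cap and the validation in `parse_step` are the two constraints doing the work: the agent cannot invent new actions and cannot loop indefinitely.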

L0-L3 AI Adoption Framework

Level	Description	Example
L0	No AI	Manual research
L1	AI assists human	Copilot suggests, human decides
L2	AI acts, human reviews	Agent drafts spec, PM edits
L3	Full autonomy	Agent handles end-to-end

Ramp pushes every function toward L2-L3.

Two-Tier Model Strategy

Tier	Data	Use Case
General	Aggregated across 15K+ customers	Benchmarks, trends, category insights
Sensitive	Opt-in, in-context only	Customer-specific analysis, no persistent training
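
The two tiers can be read as a routing rule applied when assembling each request's context. The sketch below is an assumption about how that might look; `Customer`, `build_context`, and the field names are hypothetical, and the "no persistent training" guarantee is a policy that code like this can support but not enforce on its own.

```python
from dataclasses import dataclass, field

@dataclass
class Customer:
    customer_id: str
    opted_in: bool
    records: list[str] = field(default_factory=list)

def build_context(customer: Customer, benchmarks: list[str]) -> dict:
    """Assemble per-request context; nothing here is stored or used for training."""
    context = {"tier": "general", "benchmarks": benchmarks, "customer_records": []}
    if customer.opted_in:
        # Sensitive tier: customer-specific data enters the prompt context only.
        context["tier"] = "sensitive"
        context["customer_records"] = list(customer.records)
    return context
```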

RAG Pipeline for Customer Data

For ingesting Gong transcripts and similar unstructured sources:

Chunking: Fixed-size 512-1024 tokens (recommended for transcripts)
Metadata enrichment: Call ID, participants, timestamps, topic labels, sentiment
Speaker mapping: Map speaker IDs to actual names
Namespace partitioning: Isolate by tenant/customer for security
Vector indexing: Embed chunks with metadata for semantic search
Sync strategy: Historical bulk load + webhook/polling for ongoing updates
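
The chunking, speaker-mapping, metadata, and namespace steps above can be sketched together. Assumptions: whitespace splitting stands in for a real tokenizer, the turn format `(speaker_id, start_seconds, text)` and the `tenant-<id>` namespace scheme are invented for illustration, and embedding, topic labels, and sentiment are left out because they depend on whichever models and vector store are in use.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    namespace: str
    text: str
    metadata: dict

def chunk_transcript(turns, call_id, tenant_id, speaker_names, max_tokens=512):
    """Split a transcript into fixed-size chunks with per-chunk metadata."""
    # Flatten turns into (token, speaker name, turn start time) triples,
    # mapping raw speaker IDs to real names along the way.
    tokens = []
    for speaker_id, start_s, text in turns:
        name = speaker_names.get(speaker_id, speaker_id)
        tokens.extend((word, name, start_s) for word in f"{name}: {text}".split())

    chunks = []
    for i in range(0, len(tokens), max_tokens):
        window = tokens[i:i + max_tokens]
        chunks.append(Chunk(
            namespace=f"tenant-{tenant_id}",  # isolates each tenant's vectors
            text=" ".join(word for word, _, _ in window),
            metadata={
                "call_id": call_id,
                "chunk_index": i // max_tokens,
                "participants": sorted({name for _, name, _ in window}),
                "start_s": window[0][2],
                "end_s": window[-1][2],
            },
        ))
    return chunks
```

Each `Chunk` would then be embedded and written to the index under its namespace, so a query for one tenant never searches another tenant's vectors.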

Key Takeaways

Data infrastructure is prerequisite #1 — agents without clean, unified data just hallucinate faster
Parallel agent execution with map-reduce synthesis is the dominant pattern for multi-source VoC
Constrain agent decision spaces aggressively: narrow, well-defined tasks outperform broad mandates
Build two model tiers (general aggregated vs. sensitive opt-in) to handle data privacy
RAG with 512-1024 token chunks, rich metadata, and namespace isolation is the standard approach for transcript data
