Gong Integration for VoC

Technical guide to pulling customer intelligence from Gong — API endpoints, transcript ingestion, RAG strategies, and the new MCP integration.

Why Gong Data Matters

Sales calls are the richest source of unstructured customer feedback — objections, feature requests, competitor mentions, pain points
Ramp processes hundreds of thousands of call transcripts annually through its data cloud
Gong transcripts feed directly into Ramp's parallel agent architecture for Voice of the Customer (VoC) synthesis

Key API Endpoints

Endpoint	Method	Purpose
`/v2/calls`	GET	List calls with date range, user, and custom filters
`/v2/calls/transcript`	POST	Retrieve call transcripts (speaker turns + timestamps)
`/v2/calls/extensive`	POST	Full call metadata — participants, duration, outcomes
`/v2/users/{id}/stats`	GET	User activity metrics
`/crm/object`	POST	Bidirectional CRM sync
`/crm/map-fields`	POST	Field mapping configuration

Authentication

Method: HTTP Basic auth. Send an Authorization header of the form "Basic <token>", where <token> is the Base64 encoding of access_key:access_key_secret
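Rate limits: ~1,000 requests/hour per API key (confirm the current limits for your account in Gong's API documentation, and back off on HTTP 429 responses)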
Scopes: Configure in Gong admin — limit to read-only for VoC use cases
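
A minimal sketch of building the Basic auth header with the Python standard library. The function name and the placeholder credentials are illustrative assumptions, not part of any Gong SDK.

```python
import base64


def gong_basic_auth_header(access_key: str, access_key_secret: str) -> dict:
    """Return an HTTP Basic Authorization header for the given key pair."""
    raw = f"{access_key}:{access_key_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only; load real ones from a secret store.
headers = gong_basic_auth_header("my-key", "my-secret")
```

The returned dict can be passed as the headers argument of whichever HTTP client the pipeline uses.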

Transcript Data Structure

Each transcript response contains:

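Speaker turns: Array of { speakerId, topic, sentences: [{ start, end, text }] }, where start and end are offsets in milliseconds from the start of the call
Speaker IDs: Opaque numeric identifiers; map them to actual names via the participant data returned by /v2/calls/extensive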
Topics: Gong's auto-detected topic labels (pricing, competition, timeline, etc.)
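
The structure above can be flattened into one record per speaker turn before any further processing. The sketch below assumes the field names shown in the list (with the casing speakerId, which should be checked against a real response); the Turn class and the sample payload are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    speaker_id: str
    topic: Optional[str]
    start: int  # offset of the first sentence in the turn
    end: int    # offset of the last sentence in the turn
    text: str


def flatten_transcript(transcript: List[dict]) -> List[Turn]:
    """Collapse each speaker turn's sentences into a single Turn record."""
    turns = []
    for entry in transcript:
        sentences = entry.get("sentences", [])
        if not sentences:
            continue  # skip turns with no spoken text
        turns.append(Turn(
            speaker_id=str(entry["speakerId"]),
            topic=entry.get("topic"),
            start=sentences[0]["start"],
            end=sentences[-1]["end"],
            text=" ".join(s["text"] for s in sentences),
        ))
    return turns


# Hand-written sample in the assumed shape, not real Gong output.
sample = [
    {"speakerId": "101", "topic": "pricing", "sentences": [
        {"start": 0, "end": 1800, "text": "Thanks for joining."},
        {"start": 1800, "end": 4200, "text": "Let's talk pricing."},
    ]},
    {"speakerId": "202", "topic": "pricing", "sentences": [
        {"start": 4300, "end": 7000, "text": "Our budget is tight."},
    ]},
]
turns = flatten_transcript(sample)
```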

Data Pipeline for RAG

Step 1: Historical Sync

Pull all calls via GET /v2/calls with date range filters
Paginate through results (cursor-based): pass the cursor from each response into the next request until no cursor is returned
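
A sketch of the cursor loop, with the HTTP call injected as a function so the logic can be tested offline. The query parameter names (fromDateTime, toDateTime, cursor) and the response keys (calls, records.cursor) are assumptions to verify against Gong's API reference; fake_fetch_page stands in for a real GET /v2/calls request.

```python
from typing import Callable, Iterator, Optional


def iter_calls(fetch_page: Callable[[dict], dict],
               from_dt: str, to_dt: str) -> Iterator[dict]:
    """Yield every call in the date range, following cursors until none is returned."""
    cursor: Optional[str] = None
    while True:
        params = {"fromDateTime": from_dt, "toDateTime": to_dt}
        if cursor:
            params["cursor"] = cursor
        page = fetch_page(params)
        yield from page.get("calls", [])
        cursor = page.get("records", {}).get("cursor")
        if not cursor:
            break


def fake_fetch_page(params: dict) -> dict:
    """Stand-in for the real request: two pages keyed by cursor."""
    pages = {
        None: {"calls": [{"id": "1"}, {"id": "2"}], "records": {"cursor": "abc"}},
        "abc": {"calls": [{"id": "3"}], "records": {}},
    }
    return pages[params.get("cursor")]


call_ids = [c["id"] for c in iter_calls(
    fake_fetch_page, "2024-01-01T00:00:00Z", "2024-04-01T00:00:00Z")]
```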

Step 2: Transcript Retrieval

Batch POST /v2/calls/transcript with call IDs
Respect rate limits — queue with backoff
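
One way to batch call IDs and retry with exponential backoff is sketched below. The RateLimited exception, the batch size of 50, and the retry settings are illustrative assumptions; post_batch stands in for a function that issues POST /v2/calls/transcript and raises RateLimited when the API signals throttling.

```python
import time
from typing import Callable, Iterator, List, Optional


class RateLimited(Exception):
    """Raised by post_batch when the API asks the client to slow down."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def batched(ids: List[str], size: int) -> Iterator[List[str]]:
    for i in range(0, len(ids), size):
        yield ids[i:i + size]


def fetch_with_backoff(post_batch: Callable[[List[str]], list], call_ids: List[str],
                       max_retries: int = 5, base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> list:
    attempt = 0
    while True:
        try:
            return post_batch(call_ids)
        except RateLimited as exc:
            if attempt >= max_retries:
                raise
            # Prefer the server's hint; otherwise double the delay each attempt.
            delay = exc.retry_after if exc.retry_after is not None else base_delay * 2 ** attempt
            sleep(delay)
            attempt += 1


def fetch_all_transcripts(post_batch, call_ids, batch_size=50, sleep=time.sleep) -> list:
    results = []
    for batch in batched(call_ids, batch_size):
        results.extend(fetch_with_backoff(post_batch, batch, sleep=sleep))
    return results


# Demo with a fake endpoint that throttles the first request only.
_attempts = {"n": 0}

def flaky_post(batch):
    _attempts["n"] += 1
    if _attempts["n"] == 1:
        raise RateLimited()
    return [{"callId": cid} for cid in batch]

slept = []  # records requested delays instead of actually sleeping
transcripts = fetch_all_transcripts(flaky_post, ["a", "b", "c"], batch_size=2, sleep=slept.append)
```

Injecting sleep keeps the retry logic testable without real waiting; production code can leave the default.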

Step 3: Speaker Mapping

POST /v2/calls/extensive returns participant details
Map speakerID to name, email, company
Tag internal vs. external speakers
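
The mapping step might look like the sketch below. The participant field names (speakerId, name, emailAddress) are assumptions about the /v2/calls/extensive payload, company is approximated by email domain, and internal speakers are identified by a caller-supplied set of domains; all sample data is invented.

```python
from typing import Dict, Iterable, Set


def build_speaker_index(parties: Iterable[dict], internal_domains: Set[str]) -> Dict[str, dict]:
    """Map each speaker ID to name, email, company (email domain), and an internal flag."""
    index = {}
    for party in parties:
        email = (party.get("emailAddress") or "").lower()
        domain = email.rsplit("@", 1)[-1] if "@" in email else ""
        index[str(party["speakerId"])] = {
            "name": party.get("name") or "Unknown",
            "email": email or None,
            "company": domain or None,
            "internal": domain in internal_domains,
        }
    return index


def label_turn(speaker_id, index: Dict[str, dict]) -> str:
    """Return a readable label for a speaker, or a fallback for unmapped IDs."""
    info = index.get(str(speaker_id))
    if info is None:
        return "Unknown speaker"
    side = "internal" if info["internal"] else "external"
    return f"{info['name']} ({side})"


parties = [
    {"speakerId": "101", "name": "Dana Lee", "emailAddress": "dana@vendor.example"},
    {"speakerId": "202", "name": "John Smith", "emailAddress": "john@acme.example"},
]
index = build_speaker_index(parties, internal_domains={"vendor.example"})
```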

Step 4: Chunking

Recommended: Fixed-size chunks of 512-1024 tokens
Preserve speaker boundaries where possible
Include speaker label in each chunk: [John Smith, Acme Corp]: "We've been looking at..."
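
A sketch of speaker-aware chunking under a token budget: whole turns are packed into a chunk until the budget would be exceeded, and only turns longer than the budget are split. Token counts are approximated by whitespace-separated words, which is an assumption; swap in the tokenizer of the chosen embedding model for real use.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def chunk_turns(turns: List[Tuple[str, str]], max_tokens: int = 512) -> List[str]:
    """Pack (speaker_label, text) turns into chunks of at most max_tokens."""
    chunks, current, used = [], [], 0
    for label, text in turns:
        words = text.split()
        if not words:
            continue
        room = max_tokens - count_tokens(f"[{label}]:")
        if room <= 0:
            raise ValueError("max_tokens is too small to fit the speaker label")
        # Split only when a single turn cannot fit in one chunk.
        for i in range(0, len(words), room):
            line = f"[{label}]: " + " ".join(words[i:i + room])
            cost = count_tokens(line)
            if current and used + cost > max_tokens:
                chunks.append("\n".join(current))
                current, used = [], 0
            current.append(line)
            used += cost
    if current:
        chunks.append("\n".join(current))
    return chunks


# Tiny budget so the packing behaviour is visible on a short sample.
turns = [
    ("John Smith, Acme Corp", "We've been looking at a few vendors this quarter."),
    ("Dana Lee, Example Inc", "Happy to walk through pricing."),
    ("John Smith, Acme Corp", "Great."),
]
chunks = chunk_turns(turns, max_tokens=16)
```

Because every line in a chunk carries its own speaker label, a retrieved chunk stays attributable even when read out of context.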

Step 5: Metadata Enrichment

Attach to each chunk: call ID, date, participants, deal stage, topic labels, sentiment score
This metadata enables filtered retrieval (e.g., "show me all competitor mentions from enterprise deals in Q1")
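
A sketch of how attached metadata turns the example query into a simple filter. The dictionary keys, the extra segment field (needed for the "enterprise" part of the query), and all sample values are hypothetical; a production system would push these filters down to the vector store rather than filter in Python.

```python
from datetime import date


def enrich(text: str, call_meta: dict) -> dict:
    """Attach call-level metadata to a chunk of transcript text."""
    return {"text": text, **call_meta}


def matches(chunk: dict, topic=None, date_from=None, date_to=None, **equals) -> bool:
    """True if the chunk has the topic, falls in the date range, and matches exact fields."""
    if topic is not None and topic not in chunk.get("topics", []):
        return False
    if date_from is not None and chunk["date"] < date_from:
        return False
    if date_to is not None and chunk["date"] > date_to:
        return False
    return all(chunk.get(key) == value for key, value in equals.items())


# Invented sample chunks for illustration.
chunks = [
    enrich("[John Smith, Acme Corp]: We also looked at a rival tool.",
           {"call_id": "c1", "date": date(2024, 2, 14), "deal_stage": "evaluation",
            "segment": "enterprise", "topics": ["competition"], "sentiment": -0.2}),
    enrich("[John Smith, Acme Corp]: What does the annual plan cost?",
           {"call_id": "c2", "date": date(2024, 3, 3), "deal_stage": "negotiation",
            "segment": "enterprise", "topics": ["pricing"], "sentiment": 0.1}),
    enrich("[Ana Ruiz, Beta LLC]: Their competitor offered a discount.",
           {"call_id": "c3", "date": date(2024, 5, 20), "deal_stage": "evaluation",
            "segment": "enterprise", "topics": ["competition"], "sentiment": -0.4}),
]

# "Competitor mentions from enterprise deals in Q1"
hits = [c["call_id"] for c in chunks
        if matches(c, topic="competition", date_from=date(2024, 1, 1),
                   date_to=date(2024, 3, 31), segment="enterprise")]
```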

Step 6: Vector Indexing

Namespace-partition by tenant/customer for data isolation
Embed chunks with a standard embedding model (OpenAI, Cohere, or Snowflake Arctic)
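
To illustrate what namespace partitioning guarantees, the toy in-memory index below keeps each tenant's vectors in a separate bucket, so a query can only ever see its own namespace. The class, its method names, and the two-dimensional vectors are all made up for illustration; a real deployment would rely on the namespace or partition feature of the chosen vector database and on real embeddings.

```python
import math
from collections import defaultdict
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class NamespacedIndex:
    """Toy vector index with one isolated bucket per namespace (tenant)."""

    def __init__(self) -> None:
        self._store: Dict[str, Dict[str, tuple]] = defaultdict(dict)

    def upsert(self, namespace: str, chunk_id: str, vector: List[float],
               metadata: Optional[dict] = None) -> None:
        self._store[namespace][chunk_id] = (vector, metadata or {})

    def query(self, namespace: str, vector: List[float], top_k: int = 3) -> List[str]:
        # Only the requested namespace is searched; other tenants are never touched.
        bucket = self._store.get(namespace, {})
        scored = sorted(((cosine(vector, v), cid) for cid, (v, _) in bucket.items()),
                        reverse=True)
        return [cid for _, cid in scored[:top_k]]


index = NamespacedIndex()
index.upsert("tenant-a", "a1", [1.0, 0.0])
index.upsert("tenant-a", "a2", [0.0, 1.0])
index.upsert("tenant-b", "b1", [1.0, 0.0])

top_a = index.query("tenant-a", [0.9, 0.1], top_k=1)
top_b = index.query("tenant-b", [0.9, 0.1])
```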

Step 7: Ongoing Sync

Webhook: Gong can notify on new call completion (preferred)
Polling: Fallback — poll /v2/calls for new entries every 15-30 min

Gong MCP Integration (New)

Gong now supports Model Context Protocol (MCP) for direct AI agent connectivity
Enables natural language access to meeting data without custom API plumbing
Compatible with Claude, Cursor, and other MCP-enabled tools
Reduces integration complexity significantly for agent-based architectures

Ramp's Gong Usage

Direct

Parallel agents query Gong transcripts alongside Zendesk and competitor data
Transcripts flow into Snowflake, queryable via Cortex AI natural language

Via Actively.ai

Actively.ai layers on top of Gong + Salesforce
Continuously learns from transcripts to identify account signals
Example: when Ramp launched Procurement, Actively analyzed historical calls to find accounts that discussed centralized purchasing
Provides "why you, why you now" hypotheses from call analysis

Key Takeaways

Gong transcripts are the highest-signal VoC source — rich, unstructured, and full of competitive intelligence
Build the RAG pipeline with 512-1024 token chunks, speaker labels, and rich metadata
Use namespace partitioning for multi-tenant data isolation
The new MCP integration gives agents natural language access to meeting data without custom API plumbing
Actively.ai adds an intelligence layer on top of raw Gong data — worth evaluating vs. building custom