Gong Integration for VoC

Technical guide to pulling customer intelligence from Gong — API endpoints, transcript ingestion, RAG strategies, and the new MCP integration.

Why Gong Data Matters

  • Sales calls are the richest source of unstructured customer feedback — objections, feature requests, competitor mentions, pain points
  • Ramp processes hundreds of thousands of call transcripts annually through their data cloud
  • Gong transcripts feed directly into the parallel agent architecture for VoC synthesis

Key API Endpoints

Endpoint                Method  Purpose
/v2/calls               GET     List calls with date range, user, and custom filters
/v2/calls/transcript    POST    Retrieve call transcripts (speaker turns + timestamps)
/v2/calls/extensive     POST    Full call metadata: participants, duration, outcomes
/v2/users/{id}/stats    GET     User activity metrics
/crm/object             POST    Bidirectional CRM sync
/crm/map-fields         POST    Field mapping configuration

Authentication

  • Method: Basic auth with Base64-encoded access_key:access_key_secret
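  • Rate limits: Gong's documented defaults are 3 requests/second and 10,000 requests/day; requests over the limit return HTTP 429 with a Retry-After header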
  • Scopes: Configure in Gong admin — limit to read-only for VoC use cases
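
As a minimal sketch of the Basic auth scheme described above, the header can be built with the standard library alone. The function name and the placeholder key values are illustrative and not part of Gong's API.

```python
import base64


def gong_auth_header(access_key: str, access_key_secret: str) -> dict:
    """Build an HTTP Basic Authorization header from a key/secret pair."""
    raw = f"{access_key}:{access_key_secret}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials for illustration only; load real ones from a secret store.
headers = gong_auth_header("my-key", "my-secret")
```

The resulting dictionary can be passed as request headers to whichever HTTP client the pipeline uses.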

Transcript Data Structure

Each transcript response contains:

  • Speaker turns: Array of { speakerId, topic, sentences: [{ start, end, text }] }
  • Speaker IDs: Numeric identifiers; map them to actual names via the participant data returned by /v2/calls/extensive
  • Topics: Gong's auto-detected topic labels (pricing, competition, timeline, etc.)
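
To make the structure above concrete, here is a hedged sketch that flattens speaker turns into simple records. The key spelling `speakerId` and the sample payload are assumptions to verify against a real response; `Turn` and `flatten_transcript` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    speaker_id: str
    topic: Optional[str]
    start: int
    end: int
    text: str


def flatten_transcript(transcript: list) -> list:
    """Collapse each speaker turn's sentences into one Turn record."""
    turns = []
    for entry in transcript:
        sentences = entry.get("sentences", [])
        if not sentences:
            continue  # skip turns with no spoken text
        turns.append(Turn(
            speaker_id=str(entry["speakerId"]),  # assumed key name
            topic=entry.get("topic"),
            start=sentences[0]["start"],
            end=sentences[-1]["end"],
            text=" ".join(s["text"] for s in sentences),
        ))
    return turns


# Invented sample shaped like the structure described above.
sample = [
    {"speakerId": "101", "topic": "pricing", "sentences": [
        {"start": 0, "end": 1800, "text": "What does the enterprise tier cost?"},
        {"start": 1800, "end": 3200, "text": "We have a fixed budget."},
    ]},
    {"speakerId": "202", "topic": None, "sentences": []},
]
turns = flatten_transcript(sample)
```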

Data Pipeline for RAG

Step 1: Historical Sync

  • Pull all calls via GET /v2/calls with date range filters
  • Paginate through results (cursor-based): pass the cursor from each response into the next request until no cursor is returned
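
Cursor-based pagination can be sketched as a generator that keeps requesting pages until the response stops returning a cursor. The HTTP call is injected as `fetch_page` so the loop stays independent of any client library; the response keys `calls` and `records.cursor` are assumptions about the payload shape, and the fake pages are invented.

```python
from typing import Callable, Iterator, Optional


def iter_calls(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every call across all pages, following the cursor."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("calls", [])
        # Assumed location of the next-page cursor in the response body.
        cursor = page.get("records", {}).get("cursor")
        if not cursor:
            break


# Fake two-page response standing in for GET /v2/calls.
_pages = {
    None: {"calls": [{"id": "1"}, {"id": "2"}], "records": {"cursor": "abc"}},
    "abc": {"calls": [{"id": "3"}], "records": {}},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return _pages[cursor]


call_ids = [call["id"] for call in iter_calls(fake_fetch)]
```

In a real sync, `fetch_page` would issue the GET request with the date range filters plus the cursor parameter.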

Step 2: Transcript Retrieval

  • Batch POST /v2/calls/transcript with call IDs
  • Respect rate limits: queue requests and retry with backoff when the API returns HTTP 429
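
One way to combine batching with backoff is sketched below. The `RateLimited` exception, the batch size, and the retry parameters are all hypothetical; the caller-supplied `fetch` function is expected to raise `RateLimited` when the API signals that the limit was hit.

```python
import time
from typing import Callable, Iterator, Optional, Sequence


class RateLimited(Exception):
    """Hypothetical error raised by `fetch` when the API rejects a request."""

    def __init__(self, retry_after: Optional[float] = None):
        super().__init__("rate limited")
        self.retry_after = retry_after


def batched(call_ids: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` call IDs."""
    for i in range(0, len(call_ids), size):
        yield call_ids[i:i + size]


def fetch_with_backoff(
    fetch: Callable[[Sequence[str]], dict],
    batch: Sequence[str],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call `fetch(batch)`, retrying with exponential backoff when rate limited."""
    for attempt in range(max_retries + 1):
        try:
            return fetch(batch)
        except RateLimited as exc:
            if attempt == max_retries:
                raise
            # Prefer the server's hint; otherwise double the delay each attempt.
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** attempt
            sleep(delay)
```

A sync job would loop over `batched(all_call_ids, size)` and pass each batch to `fetch_with_backoff`, where `fetch` wraps the POST to /v2/calls/transcript.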

Step 3: Speaker Mapping

  • POST /v2/calls/extensive returns participant details
  • Map speakerId to name, email, company
  • Tag internal vs. external speakers
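
The mapping step might look like the sketch below. The participant field names (`speakerId`, `name`, `emailAddress`, `affiliation`) are assumptions about the /v2/calls/extensive payload, and using the email domain as a stand-in for company is a simplification chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Speaker:
    name: str
    email: Optional[str]
    company: Optional[str]  # email domain used as a rough proxy
    internal: bool


def build_speaker_map(parties: list, internal_domains: set) -> dict:
    """Map speaker IDs to Speaker records and tag internal vs. external."""
    speakers = {}
    for party in parties:
        speaker_id = party.get("speakerId")
        if speaker_id is None:
            continue  # participant with no spoken turns
        email = party.get("emailAddress")
        domain = email.split("@", 1)[1].lower() if email and "@" in email else None
        affiliation = party.get("affiliation")
        if affiliation in ("Internal", "External"):
            internal = affiliation == "Internal"
        else:
            # Fall back to the email domain when no affiliation is given.
            internal = domain in internal_domains
        speakers[str(speaker_id)] = Speaker(
            name=party.get("name") or "Unknown speaker",
            email=email,
            company=domain,
            internal=internal,
        )
    return speakers


# Invented participants for illustration.
parties = [
    {"speakerId": "101", "name": "John Smith", "emailAddress": "john@acme.com",
     "affiliation": "External"},
    {"speakerId": "202", "name": "Dana Lee", "emailAddress": "dana@ramp.com"},
    {"name": "Silent Observer", "emailAddress": "obs@acme.com"},
]
speaker_map = build_speaker_map(parties, internal_domains={"ramp.com"})
```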

Step 4: Chunking

  • Recommended: Fixed-size 512-1024 tokens
  • Preserve speaker boundaries where possible
  • Include speaker label in each chunk: [John Smith, Acme Corp]: "We've been looking at..."
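
A minimal chunker that follows these rules is sketched below. It packs whole speaker turns into a chunk until the token budget is reached and only splits a turn when that turn alone exceeds the budget. Whitespace-separated words stand in for tokens, which is an approximation; a real pipeline would count with the embedding model's tokenizer. Labels are not counted against the budget here.

```python
def chunk_turns(turns: list, max_tokens: int = 512) -> list:
    """Pack (label, text) speaker turns into labelled chunks of at most max_tokens words."""
    chunks = []
    current = []
    used = 0
    for label, text in turns:
        words = text.split()
        # Split an oversized turn into pieces that each fit the budget.
        pieces = [words[i:i + max_tokens] for i in range(0, len(words), max_tokens)]
        for piece in pieces:
            if current and used + len(piece) > max_tokens:
                chunks.append("\n".join(current))
                current, used = [], 0
            body = " ".join(piece)
            current.append(f'[{label}]: "{body}"')
            used += len(piece)
    if current:
        chunks.append("\n".join(current))
    return chunks


# Invented turns; the tiny budget just forces a split for demonstration.
demo_turns = [
    ("John Smith, Acme Corp", "We have been looking at alternatives"),
    ("Dana Lee, Ramp", "What is driving that"),
]
demo_chunks = chunk_turns(demo_turns, max_tokens=6)
```

Because every line carries its own speaker label, a chunk remains interpretable even when retrieved in isolation.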

Step 5: Metadata Enrichment

  • Attach to each chunk: call ID, date, participants, deal stage, topic labels, sentiment score
  • This metadata enables filtered retrieval (e.g., "show me all competitor mentions from enterprise deals in Q1")
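
The sketch below illustrates how attached metadata enables filtered retrieval, using a query similar to the competitor-mentions example. The `Chunk` fields mirror the list above; the field names, the sample values, and the `filter_chunks` helper are hypothetical. In practice this filter would usually be expressed as a metadata filter in the vector store query.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Chunk:
    text: str
    call_id: str
    call_date: date
    participants: list
    deal_stage: str
    topics: list
    sentiment: float


def filter_chunks(
    chunks: list,
    topic: Optional[str] = None,
    deal_stage: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> list:
    """Return chunks matching every filter that was supplied."""
    hits = []
    for chunk in chunks:
        if topic is not None and topic not in chunk.topics:
            continue
        if deal_stage is not None and chunk.deal_stage != deal_stage:
            continue
        if start is not None and chunk.call_date < start:
            continue
        if end is not None and chunk.call_date > end:
            continue
        hits.append(chunk)
    return hits


# Invented records for illustration.
corpus = [
    Chunk("They quoted us less.", "c1", date(2024, 2, 10), ["John Smith"],
          "negotiation", ["competition"], -0.2),
    Chunk("We also saw a rival demo.", "c2", date(2024, 5, 1), ["John Smith"],
          "negotiation", ["competition"], 0.1),
    Chunk("Is there a volume discount?", "c3", date(2024, 1, 5), ["Ann Ray"],
          "discovery", ["pricing"], 0.0),
]
q1_competitor = filter_chunks(
    corpus, topic="competition", start=date(2024, 1, 1), end=date(2024, 3, 31)
)
```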

Step 6: Vector Indexing

  • Namespace-partition by tenant/customer for data isolation
  • Embed chunks with a standard embedding model (OpenAI, Cohere, or Snowflake Arctic)
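
Namespace partitioning can be illustrated with a toy in-memory index, shown below. It is not a substitute for a real vector database; the class and method names are hypothetical and the vectors are hand-written stand-ins for embeddings. The point is that a query only ever scans the namespace it names, so one tenant's chunks cannot surface in another tenant's results.

```python
import math
from collections import defaultdict


def _cosine(a: list, b: list) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class NamespacedIndex:
    """Toy vector index that keeps each tenant's vectors in a separate namespace."""

    def __init__(self) -> None:
        self._store = defaultdict(dict)

    def upsert(self, namespace: str, chunk_id: str, vector: list, text: str) -> None:
        self._store[namespace][chunk_id] = (vector, text)

    def query(self, namespace: str, vector: list, top_k: int = 3) -> list:
        # Only the requested namespace is searched.
        items = self._store.get(namespace, {})
        scored = [(_cosine(vector, v), cid, text) for cid, (v, text) in items.items()]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(cid, text) for _, cid, text in scored[:top_k]]


index = NamespacedIndex()
index.upsert("tenant-a", "c1", [1.0, 0.0], "pricing objection")
index.upsert("tenant-a", "c2", [0.0, 1.0], "timeline question")
index.upsert("tenant-b", "c3", [1.0, 0.0], "other tenant's call")
top_a = index.query("tenant-a", [0.9, 0.1], top_k=1)
```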

Step 7: Ongoing Sync

  • Webhook: Gong can notify on new call completion (preferred)
  • Polling: Fallback — poll /v2/calls for new entries every 15-30 min
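
The polling fallback can be sketched as a single pass that asks for calls since a watermark, skips IDs already ingested, and advances the watermark. The `list_calls` callable, the `started` field name, and the in-memory `seen_ids` set are assumptions for illustration; a production job would persist both the watermark and the seen IDs.

```python
from datetime import datetime, timezone
from typing import Callable


def poll_new_calls(
    list_calls: Callable[[datetime], list],
    watermark: datetime,
    seen_ids: set,
) -> tuple:
    """Return (new_calls, new_watermark) for one polling pass."""
    new_calls = []
    latest = watermark
    for call in list_calls(watermark):
        if call["id"] in seen_ids:
            continue  # already ingested on an earlier pass
        seen_ids.add(call["id"])
        new_calls.append(call)
        # Assumes `started` is an ISO 8601 timestamp with a UTC offset.
        started = datetime.fromisoformat(call["started"])
        latest = max(latest, started)
    return new_calls, latest


# Invented data standing in for a /v2/calls listing.
_listing = [
    {"id": "1", "started": "2024-01-01T10:00:00+00:00"},
    {"id": "2", "started": "2024-01-01T11:00:00+00:00"},
]
seen = set()
start_mark = datetime(2024, 1, 1, tzinfo=timezone.utc)
first_pass, mark = poll_new_calls(lambda since: _listing, start_mark, seen)
second_pass, mark_again = poll_new_calls(lambda since: _listing, mark, seen)
```

A scheduler would run this pass every 15-30 minutes and hand `new_calls` to the transcript retrieval step.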

Gong MCP Integration (New)

  • Gong now supports Model Context Protocol (MCP) for direct AI agent connectivity
  • Enables natural language access to meeting data without custom API plumbing
  • Compatible with Claude, Cursor, and other MCP-enabled tools
  • Reduces integration complexity significantly for agent-based architectures

Ramp's Gong Usage

Direct

  • Parallel agents query Gong transcripts alongside Zendesk and competitor data
  • Transcripts flow into Snowflake, queryable via Cortex AI natural language

Via Actively.ai

  • Actively.ai layers on top of Gong + Salesforce
  • Continuously learns from transcripts to identify account signals
  • Example: when Ramp launched Procurement, Actively analyzed historical calls to find accounts that discussed centralized purchasing
  • Provides "why you, why you now" hypotheses from call analysis

Key Takeaways

  • Gong transcripts are the highest-signal VoC source — rich, unstructured, and full of competitive intelligence
  • Build the RAG pipeline with 512-1024 token chunks, speaker labels, and rich metadata
  • Use namespace partitioning for multi-tenant data isolation
  • The new MCP integration dramatically simplifies agent connectivity
  • Actively.ai adds an intelligence layer on top of raw Gong data — worth evaluating vs. building custom