Test your custom voice agents by hosting them in a sandbox environment with WebRTC support. UserTrace will interact with your voice agent using WebRTC protocols for real-time voice communication.

API Requirements

Your sandbox must expose a WebRTC-compatible endpoint, plus a health check endpoint.

1. WebRTC Voice Endpoint

Endpoint: wss://your-domain.com/voice (secure WebSocket)

Connection Handshake:
{
  "type": "start_call",
  "user_id": "sim_user_abc123",
  "session_id": "session_xyz789",
  "metadata": {
    "scenario_id": "customer_support",
    "persona": "frustrated_customer",
    "audio_config": {
      "codec": "opus",
      "sampleRate": 48000
    }
  }
}
Response Format:
{
  "type": "call_ready",
  "session_id": "session_xyz789",
  "call_id": "call_456def",
  "webrtc_offer": {
    "type": "offer",
    "sdp": "v=0\r\no=- 123456789 123456789 IN IP4..."
  },
  "metadata": {
    "agent_ready": true,
    "audio_codec": "opus",
    "estimated_setup_time": "2s"
  },
  "evaluation_metadata": {
    "connection_quality": "excellent",
    "setup_success": true,
    "audio_path_established": true
  }
}
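On the client side, the first thing to do with an incoming call_ready message is validate it and pull out the fields needed to complete the WebRTC setup. A minimal sketch of that step, assuming JSON text messages over the WebSocket (the helper name and error handling are illustrative, not part of the UserTrace API):

```python
import json

def parse_call_ready(raw: str) -> dict:
    """Validate a call_ready message and extract what a client needs.

    The message shape follows the response format documented above;
    this helper and its return shape are illustrative assumptions.
    """
    msg = json.loads(raw)
    if msg.get("type") != "call_ready":
        raise ValueError(f"expected call_ready, got {msg.get('type')!r}")
    offer = msg.get("webrtc_offer", {})
    if offer.get("type") != "offer" or not offer.get("sdp"):
        raise ValueError("call_ready is missing a valid SDP offer")
    return {
        "call_id": msg["call_id"],
        "session_id": msg["session_id"],
        "sdp": offer["sdp"],
        "agent_ready": msg.get("metadata", {}).get("agent_ready", False),
    }
```

The extracted sdp string is what you would feed into your WebRTC stack as the remote description before generating an answer.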

2. Health Check Endpoint

Endpoint: GET /health

Request:
curl -X GET https://your-sandbox.com/health
Response Format:
{
  "status": "healthy",
  "timestamp": "2024-01-21T10:30:00Z",
  "version": "1.0.0",
  "webrtc_status": "operational"
}
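A health endpoint like this can be served with a few lines of standard-library code. A minimal sketch using Python's http.server (the version string, the degraded/unavailable fallback values, and the port are assumptions; only the healthy-state fields come from the format above):

```python
import json
from datetime import datetime, timezone
from http.server import BaseHTTPRequestHandler, HTTPServer

def health_payload(version: str = "1.0.0", webrtc_ok: bool = True) -> dict:
    """Build the health response documented above.

    The "degraded"/"unavailable" values are assumptions for the
    failure case; the doc only specifies the healthy state.
    """
    return {
        "status": "healthy" if webrtc_ok else "degraded",
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "version": version,
        "webrtc_status": "operational" if webrtc_ok else "unavailable",
    }

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = json.dumps(health_payload()).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

# To serve: HTTPServer(("", 8080), HealthHandler).serve_forever()
```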

Advanced Configuration

Optional Parameters

user_id and session_id are optional in requests. If your agent doesn’t require session management, you can omit them. metadata is also optional and can contain any scenario or context information your agent needs.

Request with minimal parameters:
{
  "type": "start_call",
  "audio_config": {
    "codec": "opus"
  }
}
Request with full parameters:
{
  "type": "start_call",
  "user_id": "sim_user_abc123",
  "session_id": "session_xyz789",
  "metadata": {
    "scenario_id": "customer_support",
    "persona": "frustrated_customer",
    "context": {
      "issue_type": "billing",
      "urgency": "high",
      "customer_tier": "premium"
    }
  }
}
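A small builder makes it easy to emit either form without leaving null fields in the payload. A sketch, assuming the minimal-request shape above with a top-level audio_config (the helper name and defaults are illustrative):

```python
def build_start_call(audio_config=None, user_id=None, session_id=None,
                     metadata=None) -> dict:
    """Assemble a start_call message, omitting optional fields left as None.

    Only the message shape comes from the examples above; the function
    itself is an illustrative client-side helper.
    """
    msg = {
        "type": "start_call",
        # Default to the opus codec shown in the minimal example.
        "audio_config": audio_config or {"codec": "opus"},
    }
    if user_id is not None:
        msg["user_id"] = user_id
    if session_id is not None:
        msg["session_id"] = session_id
    if metadata is not None:
        msg["metadata"] = metadata
    return msg
```

Omitting keys entirely, rather than sending them as null, keeps the minimal request byte-identical to the example above.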

Implementation Guide

1. Set Up Your Sandbox

Deploy your voice agent to a publicly accessible WebSocket endpoint with WSS (secure WebSocket) support.

Headers:
  • Sec-WebSocket-Protocol: voice-agent (required)
  • Authorization: Bearer <your-api-key> (optional, but recommended for securing your endpoint)

2. Test Your Endpoints

WebSocket Handshake Test (a 101 Switching Protocols response means the endpoint accepts WebSocket upgrades; note that curl only speaks wss:// natively from version 7.86, so sending the upgrade headers to the https:// URL is the more portable form):
curl --include \
     --no-buffer \
     --header "Connection: Upgrade" \
     --header "Upgrade: websocket" \
     --header "Sec-WebSocket-Key: SGVsbG8sIHdvcmxkIQ==" \
     --header "Sec-WebSocket-Version: 13" \
     --header "Sec-WebSocket-Protocol: voice-agent" \
     https://your-sandbox.com/voice
Health Check Test:
curl -X GET https://your-sandbox.com/health

3. Configure in UserTrace

  1. Add your sandbox URL in the UserTrace dashboard
  2. Set authentication if required (API keys, tokens)
  3. Configure audio settings (codec preferences, quality settings)
  4. Test connection using the built-in connectivity checker

Best Practices

Performance

Response Optimization
  • Target < 200ms audio latency
  • Support 100 concurrent connections
  • Implement proper error handling
  • Add connection quality monitoring
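Connection quality monitoring can be as simple as a rolling window over recent round-trip measurements. A sketch, assuming you sample latency per audio exchange (the class, window size, and threshold handling are illustrative, not a UserTrace API):

```python
from collections import deque

class LatencyMonitor:
    """Rolling-window audio latency tracker.

    Keeps the last `window` round-trip samples and flags when the
    average exceeds the 200 ms target mentioned above.
    """
    def __init__(self, window: int = 50, target_ms: float = 200.0):
        self.samples = deque(maxlen=window)  # old samples age out automatically
        self.target_ms = target_ms

    def record(self, latency_ms: float) -> None:
        self.samples.append(latency_ms)

    @property
    def average_ms(self) -> float:
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

    def within_target(self) -> bool:
        return self.average_ms < self.target_ms
```

A windowed average reacts to sustained degradation without flapping on a single slow exchange; feeding it into your connection_quality field is one way to keep that metric honest.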

Security

WebRTC Security
  • Use WSS (secure WebSocket) only
  • Implement DTLS-SRTP for media encryption
  • Validate all signaling messages
  • Use secure TURN servers

Error Handling

Your API should return appropriate WebSocket close codes and error messages.

Connection Errors:
{
  "type": "error",
  "error": {
    "code": "connection_failed",
    "message": "Failed to establish WebRTC connection",
    "details": "ICE connectivity check failed"
  }
}
Audio Errors:
{
  "type": "error", 
  "error": {
    "code": "audio_error",
    "message": "Audio stream initialization failed",
    "details": "Unsupported codec: G.722"
  }
}
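In practice this means serializing the error message, sending it, and then closing the socket with a matching RFC 6455 close code. A sketch of both halves; the specific code-to-error mapping below is an assumption, so pick codes that match your own failure semantics:

```python
import json

# RFC 6455 close codes; this mapping is illustrative, not mandated
# by the error formats above.
CLOSE_CODES = {
    "connection_failed": 1011,  # 1011: internal/unexpected condition
    "audio_error": 1003,        # 1003: unsupported data (e.g. unknown codec)
}

def make_error(code: str, message: str, details: str = "") -> str:
    """Serialize an error message in the format documented above."""
    return json.dumps({
        "type": "error",
        "error": {"code": code, "message": message, "details": details},
    })

def close_code_for(code: str) -> int:
    """Pick a WebSocket close code, defaulting to 1011 (internal error)."""
    return CLOSE_CODES.get(code, 1011)
```

The typical server-side sequence is: send the JSON error frame first so the client sees a structured reason, then close with the numeric code so generic WebSocket tooling also understands the failure.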

Voice Processing Support

For agents that handle voice conversations, your response should include both the voice interaction results and conversation metadata.

Voice Response Example:
{
  "type": "conversation_update",
  "session_id": "session_xyz789",
  "call_id": "call_456def",
  "transcript": {
    "user": "Hi, I need help with my billing issue",
    "agent": "I'd be happy to help you with your billing. Can you provide your account number?"
  },
  "metadata": {
    "audio_quality": "excellent",
    "response_time_ms": 1800,
    "confidence_score": 0.94
  },
  "evaluation_metadata": {
    "understood_intent": true,
    "appropriate_response": true,
    "tone": "professional",
    "next_action": "gather_account_info"
  }
}
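Before emitting a conversation_update, it is worth checking the message against the expected shape so malformed updates surface in your sandbox rather than in evaluation. A validation sketch; the required/optional split is an assumption (evaluation_metadata is treated as optional here), and only the field names come from the example above:

```python
import json

# Fields present in the example above; treating these as required
# is an assumption of this sketch.
REQUIRED_FIELDS = {"type", "session_id", "call_id", "transcript", "metadata"}

def validate_conversation_update(raw: str) -> list:
    """Return a list of problems with a conversation_update message.

    An empty list means the message matches the expected shape.
    """
    msg = json.loads(raw)
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - msg.keys())]
    if msg.get("type") != "conversation_update":
        problems.append(f"unexpected type: {msg.get('type')!r}")
    transcript = msg.get("transcript", {})
    if not ("user" in transcript and "agent" in transcript):
        problems.append("transcript must include both user and agent turns")
    return problems
```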

Common Integration Patterns

  • Speech Recognition: Real-time transcription with confidence scores
  • Voice Synthesis: Text-to-speech with natural voice generation
  • Intent Detection: Understanding user requests from voice input
  • Context Management: Maintaining conversation state across turns
  • Quality Monitoring: Real-time audio quality and connection metrics
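Of these patterns, context management is the one that is mostly plumbing rather than ML. A minimal sketch of per-session state, assuming one state object keyed by session_id (the class is illustrative; in production you would back it with a real store such as Redis or a database):

```python
from dataclasses import dataclass, field

@dataclass
class ConversationState:
    """Minimal per-session conversation state holder.

    Stores alternating turns so each agent response can be generated
    with the full history of the call so far.
    """
    session_id: str
    turns: list = field(default_factory=list)

    def add_turn(self, speaker: str, text: str) -> None:
        self.turns.append({"speaker": speaker, "text": text})

    def history(self) -> str:
        """Render the turn list as a plain transcript string."""
        return "\n".join(f"{t['speaker']}: {t['text']}" for t in self.turns)
```

Keying state by the session_id from the start_call handshake is what lets a multi-turn scenario like the billing example above stay coherent across turns.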

Troubleshooting

Common Issues:
  • Connection not establishing: Check WebSocket endpoint and SSL certificates
  • Audio not flowing: Verify WebRTC signaling and ICE connectivity
  • Poor audio quality: Optimize codec settings and network configuration
  • High latency: Review server location and network routing
Need help with WebRTC implementation? Our team can assist with technical setup and optimization. Contact [email protected].