rtcStats architecture

Explore the technical architecture of rtcStats, including the client-side SDK, mediation server, and SaaS analysis engine.

Understanding the architecture of rtcStats is key to successfully integrating it into your application's tracing and observability workflow. The platform is designed with a privacy-first, modular approach, allowing you to collect, process, and visualize WebRTC data without exposing sensitive user information.

You can get quite far with our open-source and freemium offerings without incurring any ongoing costs.

The architecture consists of three primary layers: Collection, Mediation, and Analysis.

The three-layer flow

Each layer is in charge of a different piece of your WebRTC observability workflow:

  1. The Collection Layer (rtcstats.js): Our open-source SDK, rtcstats.js, wraps the standard WebRTC RTCPeerConnection. It periodically calls getStats() and records internal events (like state changes or track additions).

    • Key Function: Intercepts and collects raw WebRTC metrics and traces in real time.
    • Payload: Generates a series of JSON "dumps" containing the locally collected trace data, to be sent to the mediation layer.
  2. The Mediation Layer (rtcstats-server & rtcstats-features): This is the "buffer" between your application and the rtcStats analysis engine. It consists of two primary components geared towards maintaining your data sovereignty:

    • Data Collection (rtcstats-server): You host your own instances of rtcstats-server to collect incoming traces. It writes the raw "dumps" to S3-compatible storage and records metadata in a Postgres database.
    • Anonymization: rtcstats-server can be configured to strip or mask sensitive data (like IP addresses, URLs, or device labels) before it is stored.
    • Feature Extraction (rtcstats-features): A separate, "offline" process that polls the database for new dumps, downloads them from storage, and extracts call-related features. This separation ensures that the time-critical data collection is not impacted by CPU-intensive analysis.
  3. The Analysis Layer (rtcstats.com): Performs deep analysis of WebRTC metrics through its Observations and provides a visualization engine for efficient troubleshooting and debugging. It accesses the extracted features and raw dumps to provide:

    • Inference Engine: This is where Observations and Deductions happen. The system analyzes metrics and trace events to identify issues and their root causes.
    • AI-driven: An AI model reviews the results and offers analysis and summaries based on our best practices.
    • Collaboration: Offers collaborative capabilities such as commenting and public sharing of results.
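
The anonymization step in the Mediation Layer can be pictured with a small sketch. The Python below is illustrative only: it is not the rtcstats-server implementation, and the event shape, the `anonymize` and `mask_text` names, and the choice to mask only the last IPv4 octet are all assumptions made for this example.

```python
import re

# Matches dotted-quad IPv4 addresses; IPv6 is left out to keep the sketch short.
IPV4 = re.compile(r"\b(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\b")


def mask_text(text: str) -> str:
    """Replace the last octet of every IPv4 address with 'x'."""
    return IPV4.sub(r"\1.\2.\3.x", text)


def anonymize(value):
    """Recursively mask IPv4 addresses in strings nested inside dicts and lists."""
    if isinstance(value, str):
        return mask_text(value)
    if isinstance(value, list):
        return [anonymize(item) for item in value]
    if isinstance(value, dict):
        return {key: anonymize(item) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


# Hypothetical trace event; real rtcStats events may be shaped differently.
event = {
    "type": "onicecandidate",
    "candidate": "candidate:1 1 udp 2122260223 192.168.1.23 54321 typ host",
    "timestamp": 1700000000000,
}
masked = anonymize(event)
```

Because `anonymize` builds new containers instead of editing in place, the original event stays untouched. That makes it easy to compare the raw and masked versions while tuning what gets stripped.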

rtcStats flow diagram

Below is a high-level representation of how data moves from a user's device to a shareable report:

       +-----------------+
       |   rtcstats-js   |
       |  (client-side)  |
       +--------+--------+
                |
                | Data Collection (WebSocket)
                |
       +--------v----------+
       |  rtcstats-server  | (1..N)
       +-------------------+
                |
       +--------+----------+
       |                   |
+------v-------+    +------v--------+      +------------------+
|   Database   |    |    Storage    |<-----|  Visualization   |
|  (Postgres)  |    |   (e.g. S3)   |      |  (rtcstats.com)  |
+------^-------+    +------^--------+      +------------------+
       |                   |
       |  Dump Processing  |
       +--------+----------+
                |
       +--------v----------+
       | rtcstats-features | (1..M)
       +-------------------+

Understanding the flow

  1. Collection: The rtcstats-js SDK runs in the user's browser, transparently wrapping WebRTC APIs. It sends real-time metrics and event traces to your mediation server via a secure WebSocket connection.
  2. Storage: Your rtcstats-server instances receive the data. Metadata (like session start/end and identifiers) is saved to Postgres, while the high-volume raw trace (the "dump") is uploaded to S3-compatible storage in a compressed JSON Lines format.
  3. Processing: The rtcstats-features component works "offline" to avoid impacting live collection. It identifies new dumps in the database, downloads them from S3, extracts critical WebRTC features (like ICE failures or packet loss patterns), and writes these back to the database.
  4. Analysis: When you open a report on rtcstats.com, the analysis engine securely accesses the extracted features and raw dumps. It then applies our inference engine and AI models to provide the visualizations and root-cause analysis you see in the dashboard.
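
Steps 2 and 3 above can be sketched in a few lines. The Python below is a minimal illustration, not the actual rtcstats-server or rtcstats-features code: the event fields (`type`, `state`, `packetsLost`), the function names, and the feature set are hypothetical, and the dump stays in memory instead of going to S3 or Postgres.

```python
import gzip
import json


def write_dump(events) -> bytes:
    """Serialize events as gzip-compressed JSON Lines (one JSON object per line)."""
    lines = "".join(json.dumps(event) + "\n" for event in events)
    return gzip.compress(lines.encode("utf-8"))


def read_dump(blob: bytes) -> list:
    """Decompress a dump and parse each non-empty line back into an event."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


def extract_features(events) -> dict:
    """Derive a few call-level features from a list of trace events."""
    features = {"ice_failed": False, "max_packets_lost": 0, "event_count": 0}
    for event in events:
        features["event_count"] += 1
        if event.get("type") == "iceconnectionstatechange" and event.get("state") == "failed":
            features["ice_failed"] = True
        if event.get("type") == "stats":
            # Treated as a cumulative counter, so keep the largest value seen.
            features["max_packets_lost"] = max(
                features["max_packets_lost"], event.get("packetsLost", 0)
            )
    return features


# Hypothetical session trace used only to exercise the sketch.
events = [
    {"type": "iceconnectionstatechange", "state": "checking"},
    {"type": "stats", "packetsLost": 3},
    {"type": "stats", "packetsLost": 7},
    {"type": "iceconnectionstatechange", "state": "failed"},
]
blob = write_dump(events)                     # what the collection side would upload
features = extract_features(read_dump(blob))  # what the offline worker would compute
```

Keeping `write_dump` and `extract_features` independent mirrors the separation described above: the write path only serializes and compresses, while the heavier parsing happens later and can run on separate workers.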

Key architectural benefits

We've built rtcStats with scale and real-world use in mind. The following benefits are built into every aspect of our service:

  • Reduces your workload: Cuts the time spent debugging and troubleshooting WebRTC issues and finding their root causes from hours to minutes

  • Low Overhead: The collection SDK is designed to be “non-blocking” and lean, so that monitoring your WebRTC sessions does not degrade the performance of the call itself (CPU and network use are kept to a minimum)

  • Privacy by design: Because the mediation server is self-hosted, rtcStats never has to see PII (Personally Identifiable Information) unless you explicitly allow it. You control and own your data, deciding whether, what, and when to share

  • Scalability: The architecture supports everything from individual manual uploads to high-volume platforms processing millions of concurrent sessions. It scales horizontally with common DevOps tooling
