
Platform Schematic

Our measurement platform is used by the world’s biggest ISPs and supports many millions of devices in a single logical database. The data is protected by world-class security provided by Google BigQuery. You can choose where your data is stored to meet compliance and regulatory requirements. You can also stream measurement data almost instantly, so you are alerted as soon as there are any changes.

Data pipeline

A key component of any platform that deals with large amounts of data is its data pipeline. We have invested significantly in ours over the years, ensuring that it’s robust and can scale to the ever-increasing demands of our customers. The latest iteration of the pipeline adds speed as a requirement: it allows us to deliver measurement results almost instantly to our customers and provides a firm grounding for future products that will act upon real-time data.

Flume

Flume acts as our gateway for incoming data. It receives data over HTTP (over TLS), validates and authenticates it, and then publishes it on a Kafka topic.
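The gateway flow described above (authenticate, validate, publish) can be sketched in a few lines. This is a minimal illustration only, not the actual Flume configuration: the HMAC signature scheme, the `SECRET` value, the `measurements` topic name, the required field names and the `publish` callback are all assumptions made for the example.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret and topic name, for illustration only.
SECRET = b"example-secret"
TOPIC = "measurements"
# Hypothetical set of fields a measurement record must carry.
REQUIRED_FIELDS = {"agent_id", "metric", "value"}


def sign(body: bytes, secret: bytes = SECRET) -> str:
    """Return a hex HMAC-SHA256 signature for a request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def handle_request(body: bytes, signature: str, publish) -> int:
    """Authenticate and validate a body, publish it, return an HTTP-style status."""
    # Authenticate: reject bodies whose signature does not match.
    if not hmac.compare_digest(sign(body), signature):
        return 401
    # Validate: the body must be a JSON object with the expected fields.
    try:
        record = json.loads(body)
    except ValueError:
        return 400
    if not isinstance(record, dict) or not REQUIRED_FIELDS <= record.keys():
        return 400
    # Publish: hand the record to whatever sits behind the topic.
    publish(TOPIC, record)
    return 202
```

In this sketch `publish` is any callable taking a topic name and a record, so an in-memory list can stand in for a Kafka producer when trying it out.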

Kafka

Kafka is an event-streaming data store. We use it to stream real-time measurement data and perform some transformations on it (such as splicing in metadata).
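As a rough illustration of what "splicing in metadata" could look like, the sketch below enriches each event in a stream with the metadata stored for its agent. The event shape, the `agent_id` key and the in-memory lookup table are assumptions made for the example; a real deployment would do this inside a stream processor rather than a plain generator.

```python
def splice_metadata(event: dict, metadata_by_agent: dict) -> dict:
    """Return a copy of the event with the agent's metadata attached."""
    enriched = dict(event)
    # Agents without stored metadata get an empty dict rather than an error.
    enriched["metadata"] = dict(metadata_by_agent.get(event["agent_id"], {}))
    return enriched


def enrich_stream(events, metadata_by_agent):
    """Lazily enrich each event as it arrives, one at a time."""
    for event in events:
        yield splice_metadata(event, metadata_by_agent)
```

Because `enrich_stream` is a generator, events are transformed one by one as they are consumed, which mirrors the streaming (rather than batch) approach the pipeline takes.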

BigQuery

BigQuery is the data store that SamKnows uses for long-term storage of measurement data and metadata. This provides very high scalability.

Google Datacenters

SamKnows uses GCP (Google Cloud Platform) to host its core data pipeline. Google’s cloud provides resilience and failover automatically.

We’ve re-architected the entire pipeline to remove batch processing; we are now streaming data. But we still retain the ability to buffer data at any point in the pipeline if connectivity to the subsequent part is temporarily lost.
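The idea of buffering at a pipeline stage when the next stage is unreachable can be sketched as follows. This is an assumption-laden illustration, not the production mechanism: `BufferedForwarder` and its `send` callback are hypothetical names, the buffer is held in memory, and a lost connection is modelled as a `ConnectionError`.

```python
from collections import deque


class BufferedForwarder:
    """Forward records downstream, holding them in order while the link is down."""

    def __init__(self, send):
        self._send = send        # callable that delivers one record downstream
        self._buffer = deque()   # records waiting to be delivered, oldest first

    def submit(self, record):
        """Queue a record, then try to deliver everything that is waiting."""
        self._buffer.append(record)
        self.flush()

    def flush(self) -> int:
        """Deliver buffered records in order; return how many remain buffered."""
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                break  # downstream still unavailable, keep the record for later
            self._buffer.popleft()  # remove only after a successful send
        return len(self._buffer)
```

A record is removed from the buffer only after it has been sent successfully, so ordering is preserved and nothing is dropped during an outage. A real system would also bound the buffer or persist it to disk.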

SamKnows services

These are some of our proprietary tools and technologies used to control and access the measurement platform and its data.

MySQL

MySQL is a relational database we use to store user data, Whitebox data, CPE data, app data and metadata.

SamKnows One

Provides users with an interface to access the measurement results and manage their devices.

Test Agent Management

View and edit Agents and Metadata. This also allows you to import and export Metadata for Agents.

Triggered Testing Controller

Gateway between the Agent (Whitebox or Router Agent) and the Instant Test / RealSpeed functionality.

Scheduled Test

Lets users set a schedule in SamKnows One so that tests run on an Enabled Router at pre-defined times.

The cornerstone of SamKnows services is SamKnows One, our cloud-based analytics system. Here, you can visualise your data and manage your measurement platform.

Programmatic interfaces 

We have a powerful set of APIs that let you interact with the SamKnows One backend programmatically.

Metadata API

The Metadata API allows you to attach supporting metadata to Whiteboxes or CPE. This includes items such as ISP, package, or service tier.

Data API

The Data API provides raw and aggregated measurement results. It is intended for clients who wish to integrate our data into their internal platforms or backend systems.

Instant Test API

The Instant Test API allows tests to be executed remotely in real time.

Agent Activation API

The Agent Activation API allows you to activate or deactivate a CPE for testing.

It’s important to assign custom metadata to your CPE or Whiteboxes, because this allows you to split and filter your aggregated data for complex analysis. Our Metadata API makes this process quick and efficient.
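To show why metadata matters for analysis, here is a small sketch that splits measurement results by a metadata field and averages each group. It does not call the Metadata API or the Data API: the record shape, the `agent_id` key, the `package` field and the sample values are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_by(results, metadata_by_agent, key):
    """Average result values per value of one metadata field."""
    groups = defaultdict(list)
    for result in results:
        # Agents with no metadata for this key fall into an "unknown" group.
        label = metadata_by_agent.get(result["agent_id"], {}).get(key, "unknown")
        groups[label].append(result["value"])
    return {label: mean(values) for label, values in groups.items()}


# Illustrative data only: three agents, two of them with a package assigned.
metadata = {"a1": {"package": "100M"}, "a2": {"package": "100M"}}
results = [
    {"agent_id": "a1", "value": 100},
    {"agent_id": "a2", "value": 90},
    {"agent_id": "a3", "value": 50},
]
averages = mean_by(results, metadata, "package")
```

Without the `package` metadata, all three results would land in a single group, and the difference between service tiers would be hidden in the overall average.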