OONI Probe specification

  • version: 3.0.1
  • date: 2019-06-05
  • authors: Simone Basso

The purpose of this document is to explain how OONI Probe works. It is versioned 3.0.0+ because version 2.0.0 is considered to be described by the existing implementations, whereas this version is the reference that upcoming code changes are meant to match.

This document should serve as an introduction for the reader interested in the OONI-verse. We will strive to keep it current, but it will inherently age quicker than more specific specifications. Please let us know if parts of this document have become obsolete without our noticing.

Architecture

Probe

The probe is the software running network tests (aka nettests). The probe is an app for mobile or desktop. Current implementations are:

Engine

The engine is the piece of code running nettests. A specific implementation of the probe uses an engine. Current implementations are:

The operations discussed here are valid for all implementations.

Orchestra

The orchestra is a set of servers used to provide probes with input for automatic network tests. This is currently experimental.

Geolookup

The geolookup is a set of servers and databases used to discover the probe's IP, ASN (autonomous system number), CC (country code), and network name (name of the entity owning the ASN).

Bouncer

The bouncer is a set of servers used by the probe to discover the collector and the test helper.

Collector

The collector is a set of servers to which the probe submits the results of nettests.

Test helpers

The test helpers are a set of servers useful to perform specific nettests. Their specs are available as part of this repository. We only consider as test helpers the servers that are under OONI control. As we will see later, there are other servers we don't control that are part of our testing (e.g., when we test a specific URL for censorship, the server being tested is obviously part of the testing process but is also most likely not under our control).

Nettest flow

Orchestra

Nettests are either user-initiated or automatically initiated when using the orchestra. Interaction (0) describes when the probe communicates with orchestra to get information, such as what test to run and with which input. Users can choose whether to enable orchestra or not. The specific policy for doing that depends on the app. (As of this writing, we have not finished implementing all of orchestra yet.)

Discovering the input for the test is also part of orchestra. For example, there is an orchestra endpoint for discovering the list of URLs that need to be tested when performing Web Connectivity tests. We aim to use this functionality to decide which URLs to test, rather than using static URLs shipped inside of the mobile and desktop apps.

When the test name and its input are known, we can move forward with the following steps.

Bouncer

The engine contacts the bouncer, as shown in interaction (1). This will tell the engine the available collectors and test helpers.
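As a rough illustration of this step, the sketch below models what the engine might do with the bouncer's answer. The `BouncerReply` class, its fields, and `select_services` are hypothetical names invented for this example; they are not the bouncer's actual protocol, which is specified elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class BouncerReply:
    """Hypothetical shape of what the bouncer tells the engine."""
    collectors: list = field(default_factory=list)    # collector addresses
    test_helpers: dict = field(default_factory=dict)  # helper name -> address


def select_services(reply: BouncerReply, needed_helpers: list) -> tuple:
    """Pick the first collector plus the test helpers a nettest needs."""
    if not reply.collectors:
        raise LookupError("bouncer returned no collectors")
    missing = [h for h in needed_helpers if h not in reply.test_helpers]
    if missing:
        raise LookupError(f"missing test helpers: {missing}")
    helpers = {h: reply.test_helpers[h] for h in needed_helpers}
    return reply.collectors[0], helpers
```

A nettest that needs no test helper would pass an empty list and simply get back a collector.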

Geolookup

Unless configured to skip this step, the engine will perform a geolookup as shown in interaction (2). The purpose of geolookup is to know the user's IP and the information that can be derived from it, like the ASN, the CC, etc. By default, the IP is not included in the report. Knowing the IP also allows the engine to attempt to scrub it from the results when the user has not asked for their IP address to be included. In this document we don't get into the details of our Data Policy, which you can read separately; when in doubt, the Data Policy will always have precedence over this document, which is mainly meant to explain to new developers how all the pieces fit together.
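To make the scrubbing idea concrete, here is a deliberately naive sketch that replaces every occurrence of the probe's IP in a measurement. The function name `scrub_ip` and the `[scrubbed]` placeholder are assumptions made for this example; the engines may well scrub differently.

```python
import json


def scrub_ip(measurement: dict, probe_ip: str, placeholder: str = "[scrubbed]") -> dict:
    """Return a copy of `measurement` with occurrences of `probe_ip` replaced."""
    # Serialize to JSON so that nested strings are covered by one pass.
    serialized = json.dumps(measurement)
    if probe_ip:
        # Naive textual replacement: this is only an attempt, as the text says.
        serialized = serialized.replace(probe_ip, placeholder)
    return json.loads(serialized)
```

Note that this only works when the geolookup succeeded: without knowing the IP there is nothing to search for.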

Opening a report

At this point, the engine will contact the collector, interaction (3), to open a report for the specific nettest. This means that the collector will be prepared for receiving and storing the results of the nettest. In code terms, this means the collector will tell the engine the ID of the report, to be used to submit measurements as part of this report.

Nettesting

When the report is open, the engine will perform the nettest, as modeled by interaction (4). Depending on the nettest, there may or may not be inputs, and the engine may or may not use test helpers. Two examples:

  1. If you run Web Connectivity, this will require one or more URLs as input. The engine will access those URLs and use a specific test helper to also access those URLs and do a comparison. The results of comparing the engine and the test helper measurement will become the result of the web measurement;

  2. If you run an NDT test, there will be no input and no OONI-controlled test helper. However, the test will measure the performance between the engine and a measurement server (which we don't consider a test helper because it is not directly controlled by OONI, but rather is provided by Measurement Lab). The performance measurements will be included in the results.

Nettests that require input produce one measurement for each input. When there is no input, the nettest instead produces a single measurement. In this context, a measurement is a JSON document. The data format used by measurements is specified in this repository, and every nettest adds its own specific pieces of data on top of the general data format.
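The rule of one measurement per input, or a single measurement without input, can be sketched as follows. The `run_nettest` helper and the `measure` callback are hypothetical, and the field names are illustrative only; the data format specification remains the authoritative description of a measurement.

```python
import json
from typing import Callable, Iterator, Optional, Sequence


def run_nettest(
    name: str,
    measure: Callable[[Optional[str]], dict],
    inputs: Optional[Sequence[str]] = None,
) -> Iterator[dict]:
    """Yield one measurement per input, or exactly one when there is no input."""
    # A nettest without input is treated as a single run whose input is None.
    for item in (inputs if inputs else [None]):
        yield {
            "test_name": name,
            "input": item,
            # Nettest-specific data sits on top of the general fields.
            "test_keys": measure(item),
        }
```

Each yielded dictionary can be serialized with `json.dumps` to obtain the JSON document that is later submitted.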

Submitting measurements

Measurements produced by nettests are submitted to the OONI collector in the context of the previously opened report. This is again interaction (3), where the ID of the report is used to submit measurements.

Closing report and beyond

Finally, the engine tells the collector to close the report (again interaction (3)). This means that the report will not accept further measurements using the previously communicated report ID. This will also trigger the automatic archiving and processing of the measurements. These actions are performed by the OONI pipeline. Data is accessible through the OONI API and browsable using OONI Explorer. The OONI sysadmin repository contains the rules that we use to deploy and provision all the servers we control.
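The report lifecycle described in the last three sections (open, submit, close) can be summarized with a toy in-memory collector. Everything here, including `InMemoryCollector`, its method names, and the report ID format, is an assumption made for illustration and does not reflect the real collector API.

```python
import uuid


class ReportClosedError(Exception):
    """Raised when submitting to a report that is closed or unknown."""


class InMemoryCollector:
    """Toy stand-in for a collector; not the real OONI collector."""

    def __init__(self) -> None:
        self._open = {}     # report ID -> measurements received so far
        self.archived = {}  # report ID -> measurements handed off on close

    def open_report(self, test_name: str) -> str:
        # The collector tells the engine the ID to use for submissions.
        report_id = f"{test_name}-{uuid.uuid4().hex}"
        self._open[report_id] = []
        return report_id

    def submit(self, report_id: str, measurement: dict) -> None:
        if report_id not in self._open:
            raise ReportClosedError(report_id)
        self._open[report_id].append(measurement)

    def close_report(self, report_id: str) -> None:
        # Closing retires the ID and hands the measurements off for processing.
        self.archived[report_id] = self._open.pop(report_id)
```

After `close_report`, a further `submit` with the same ID raises `ReportClosedError`, mirroring the rule that a closed report accepts no more measurements.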