
.. index::
   double: subsystems; Random peer sampling
   see: RPS; Random peer sampling

.. _RPS-Subsystem:

RPS — Random peer sampling
==========================

In literature, Random Peer Sampling (RPS) refers to the problem of
reliably [1]_ drawing random samples from an unstructured p2p network.

Doing so in a reliable manner is not only hard because of inherent
problems but also because of possible malicious peers that could try to
bias the selection.

It is useful for all kind of gossip protocols that require the selection
of random peers in the whole network like gathering statistics,
spreading and aggregating information in the network, load balancing and
overlay topology management.

The approach chosen in the RPS service implementation in GNUnet follows
the `Brahms <https://bib.gnunet.org/full/date.html\#2009_5f0>`__ design.

The current state is \"work in progress\". There are a lot of things
that need to be done, primarily finishing the experimental evaluation
and a re-design of the API.

The abstract idea is to subscribe to connect to/start the RPS service
and request random peers that will be returned when they represent a
random selection from the whole network with high probability.

An additional feature to the original Brahms-design is the selection of
sub-groups: The GNUnet implementation of RPS enables clients to ask for
random peers from a group that is defined by a common shared secret.
(The secret could of course also be public, depending on the use-case.)

Another addition to the original protocol was made: The sampler
mechanism that was introduced in Brahms was slightly adapted and used to
actually sample the peers and returned to the client. This is necessary
as the original design only keeps peers connected to random other peers
in the network. In order to return random peers to client requests
independently random, they cannot be drawn from the connected peers. The
adapted sampler makes sure that each request for random peers is
independent from the others.

.. _Brahms:

Brahms
------

The high-level concept of Brahms is two-fold: Combining push-pull gossip
with locally fixing a assumed bias using cryptographic min-wise
permutations. The central data structure is the view - a peer's current
local sample. This view is used to select peers to push to and pull
from. This simple mechanism can be biased easily. For this reason Brahms
'fixes' the bias by using the so-called sampler. A data structure that
takes a list of elements as input and outputs a random one of them
independently of the frequency in the input set. Both an element that
was put into the sampler a single time and an element that was put into
it a million times have the same probability of being the output. This
is achieved with exploiting min-wise independent permutations. In the
RPS service we use HMACs: On the initialisation of a sampler element, a
key is chosen at random. On each input the HMAC with the random key is
computed. The sampler element keeps the element with the minimal HMAC.

In order to fix the bias in the view, a fraction of the elements in the
view are sampled through the sampler from the random stream of peer IDs.

According to the theoretical analysis of Bortnikov et al. this suffices
to keep the network connected and having random peers in the view.

.. [1]
   \"Reliable\" in this context means having no bias, neither spatial,
   nor temporal, nor through malicious activity.
