Have you ever wanted to share files at light speed, but couldn't find a service that lets you? No!? I mean... really!? Is your bandwidth too low? Well, too bad! Because sharing files at light speed is exactly what MP2P aims to do.

 

Bandwidth meme

 

More precisely?

MP2P is an online object storage service. It lets you upload and download your files using all the bandwidth you have, so you can share them faster.

 

Main objectives

Objectives of MP2P are as follows:

This project was developed by Julien Dubiel, Francis Visoiu Mistrih and Mathieu Corre during EPITA's third year. It is no longer maintained. 

 

How does it work?

MP2P is divided into 3 applications: the client app., the master app. and the storage app.

MP2P Architecture

  

File storage

On MP2P, each file is stored as multiple chunks. Because we want to be redundant and blazingly fast, we put several replicas of each chunk on different storages. Hence, when the client app. uploads or downloads a file, it opens as many connections as it needs to store or get the file's chunks, and so it uses its whole bandwidth. We assume that MP2P has enough servers to handle high-bandwidth connections.
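To make the chunk and replica idea more concrete, here is a minimal sketch in C++14. It is not the actual MP2P code: the names `ChunkPlan` and `plan_chunks`, the fixed chunk size and the round-robin placement are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one chunk: its byte range in the file and
// the storages that hold a replica of it.
struct ChunkPlan {
    std::uint64_t offset;               // first byte of the chunk in the file
    std::uint64_t size;                 // chunk length in bytes
    std::vector<std::size_t> storages;  // indices of the storages holding a replica
};

// Split a file of `file_size` bytes into chunks of at most `chunk_size` bytes
// and place `replication` replicas of each chunk on distinct storages.
// Assumption: a simple round-robin placement; the real policy may differ.
std::vector<ChunkPlan> plan_chunks(std::uint64_t file_size,
                                   std::uint64_t chunk_size,
                                   std::size_t storage_count,
                                   std::size_t replication) {
    assert(chunk_size > 0);
    assert(replication >= 1 && replication <= storage_count);

    std::vector<ChunkPlan> plan;
    std::size_t index = 0;
    for (std::uint64_t offset = 0; offset < file_size;
         offset += chunk_size, ++index) {
        ChunkPlan chunk;
        chunk.offset = offset;
        // The last chunk may be shorter than the others.
        chunk.size = std::min(chunk_size, file_size - offset);
        // Consecutive indices modulo storage_count are always distinct
        // because replication <= storage_count.
        for (std::size_t r = 0; r < replication; ++r)
            chunk.storages.push_back((index + r) % storage_count);
        plan.push_back(chunk);
    }
    return plan;
}
```

With 5 storages and 3 replicas per chunk, chunk 0 would land on storages 0, 1 and 2, chunk 1 on storages 1, 2 and 3, and so on, so every storage ends up serving part of the file.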

Download and upload with MP2P

 

 Upload

When a client wants to upload a file to MP2P, the client app. asks the master app. how many chunks the file must be split into and where to upload each of them. The chunks are then uploaded to multiple storage apps. When a storage app. has successfully received a chunk, it notifies the master app., which updates the metadata about this file and notifies the client that the transfer went fine.

 

File upload
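The bookkeeping on the master side during an upload could look roughly like the sketch below. The class `UploadTracker` is hypothetical, and it assumes that the upload counts as finished once every chunk has been acknowledged by `replication` distinct storages; the real master app. may decide this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: remembers which storages confirmed which chunk.
class UploadTracker {
public:
    UploadTracker(std::size_t chunk_count, std::size_t replication)
        : acks_(chunk_count), replication_(replication) {}

    // Record that `storage_id` has stored `chunk_id`.
    // Returns true if the whole upload is complete after this notification.
    bool acknowledge(std::size_t chunk_id, std::size_t storage_id) {
        assert(chunk_id < acks_.size());
        acks_[chunk_id].insert(storage_id);  // a repeated ack changes nothing
        return complete();
    }

    // True when every chunk has been confirmed by enough distinct storages.
    bool complete() const {
        for (const auto& storages : acks_)
            if (storages.size() < replication_) return false;
        return true;
    }

private:
    std::vector<std::set<std::size_t>> acks_;  // one set of storages per chunk
    std::size_t replication_;
};
```

In this sketch, the master would notify the client as soon as `acknowledge` returns true.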

Download 

When a client wants to download a file from MP2P, the client app. asks the master app. for the locations of the file's chunks and downloads them from the corresponding storage apps.

 File download
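Since each chunk lives on several storages, the client has to pick one of them for every chunk. One possible strategy, sketched below, is to always pick the least-loaded holder so that the downloads are spread over as many storages as possible. The function `pick_sources` and the strategy itself are assumptions for illustration, not necessarily what the MP2P client does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// `locations[i]` lists the storages holding a replica of chunk i.
// Returns, for each chunk, the storage to download it from.
// Assumption: a greedy "least-loaded holder first" choice.
std::vector<std::size_t> pick_sources(
        const std::vector<std::vector<std::size_t>>& locations,
        std::size_t storage_count) {
    std::vector<std::size_t> load(storage_count, 0);  // chunks per storage so far
    std::vector<std::size_t> sources;
    sources.reserve(locations.size());

    for (const auto& holders : locations) {
        assert(!holders.empty());
        std::size_t best = holders.front();
        for (std::size_t s : holders) {
            assert(s < storage_count);
            if (load[s] < load[best]) best = s;  // ties keep the earlier holder
        }
        ++load[best];
        sources.push_back(best);
    }
    return sources;
}
```

The client can then open one connection per selected storage and fetch its chunks from all of them at the same time.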

 

And programmatically?

To build this project, we needed to create our own protocol. We also needed to store metadata about each file: the chunk locations, the hash of each chunk (SHA-1), the file name and the file size. We needed a database! We chose Couchbase to try out a NoSQL database and because it provides a really simple way to do database replication. We chose C++ for network performance (with Boost.Asio) and hashing performance (using OpenSSL's SHA implementation), but also because we were all comfortable with it.
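The per-file metadata listed above can be pictured as the following C++ structures. This is only a sketch: the type and field names are hypothetical and do not reflect the actual documents stored in Couchbase. The `to_hex` helper simply turns a 20-byte SHA-1 digest into the usual 40-character hexadecimal string.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// A SHA-1 digest is 160 bits, i.e. 20 bytes.
using Sha1Digest = std::array<std::uint8_t, 20>;

// Hypothetical metadata for one chunk.
struct ChunkMetadata {
    Sha1Digest sha1;                     // hash of the chunk's content
    std::vector<std::string> locations;  // storages holding a replica
};

// Hypothetical metadata for one file.
struct FileMetadata {
    std::string name;
    std::uint64_t size;                  // file size in bytes
    std::vector<ChunkMetadata> chunks;
};

// Encode a digest as lowercase hexadecimal text.
std::string to_hex(const Sha1Digest& digest) {
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(digest.size() * 2);
    for (std::uint8_t byte : digest) {
        hex.push_back(digits[byte >> 4]);    // high nibble
        hex.push_back(digits[byte & 0x0f]);  // low nibble
    }
    return hex;
}
```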

Here are the details of the protocol and the database usage depending on file size (with replication = 3):

Protocol blocks

Protocol

Database usage

 

 

How did we test?

The setup

As simple students without access to powerful machines with Gigabit Ethernet, we came up with an easier solution that only needs Raspberry Pis as storage servers. Take a simple Raspberry Pi v1 or v2: the maximum bandwidth you can get is limited to approx. 45 Mbps. Now let's say we have 5 of them: if MP2P is working well, we should get roughly 5 * 45 = 225 Mbps. We also need a Gigabit switch, a master server with the database (we used one laptop) and a client (we used another laptop). We used one laptop as the network router.

Testing MP2P
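The expectation behind this setup can be written as a tiny function. It is a back-of-the-envelope model, not a measurement: it assumes that the storages add up linearly and that the only other limit is the client's own link (the `client_link_mbps` parameter is an assumption, e.g. 1000 for a Gigabit link).

```cpp
#include <algorithm>

// Ideal aggregate throughput in Mbps: the sum of what every storage can
// deliver, capped by what the client's link can absorb.
double expected_throughput_mbps(unsigned storage_count,
                                double per_storage_mbps,
                                double client_link_mbps) {
    return std::min(storage_count * per_storage_mbps, client_link_mbps);
}
```

For example, `expected_throughput_mbps(5, 45.0, 1000.0)` gives 225, the figure quoted above, while 30 such storages would be capped at 1000 by the client's link.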

The results

As expected, the results showed a real improvement in transfer speed:

chart speed by number of storages

 

We also monitored the client app's CPU and RAM usage. The laptop (i7-3632QM) was running it inside an Arch Linux virtual machine:

Chart of CPU and RAM usage

We also discovered that MP2P is not as fast with a small file as with a larger one. This is simply due to the multiple requests needed to create and retrieve metadata from the master app.:

chart Speed by file size
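A simple way to picture this effect is to treat the metadata requests as a fixed cost paid once per transfer. The function below is a simplified model built on that assumption, and the numbers in the example are made up for illustration, not taken from our measurements.

```cpp
// Effective throughput in Mbps when every transfer first pays a fixed
// overhead (in seconds) for the metadata round trips with the master.
// Assumption: the overhead does not depend on the file size.
double effective_mbps(double file_megabits,
                      double link_mbps,
                      double overhead_seconds) {
    double transfer_seconds = file_megabits / link_mbps;
    return file_megabits / (overhead_seconds + transfer_seconds);
}
```

With a 225 Mbps link and a hypothetical overhead of 1 second, a 225 Mb file would only reach 112.5 Mbps, whereas a file 100 times larger would get close to the full 225 Mbps.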

 

What improvements could be made?

The development is currently stopped, but here are a few things we thought could be implemented to improve our project:

 

Tools used

The programming language used is C++14. We like C++ for its performance, and we’re always looking forward to learning how to write clean C++.

The Boost.Asio library is the second most-used tool for MP2P. Boost.Asio allows us to take a C++ approach to networking. This means writing clean code while keeping the performance we could have achieved using C.

Couchbase: a NoSQL database featuring master-master replication.

Catch: our unit-test framework.

CMake: our build system.

 

More info

GitHub: https://github.com/MP2P/MP2P

Published by Julien Dubiel on 01/06/2017

Last modified on 12/09/2018