Using market mechanism in BitTorrent (part 1)

A few months ago I wondered how BitTorrent deals with resource allocation. I didn’t find so much and I thought about that myself. I think that is just a market. But let’s start at the beginning.

What is BitTorrent?

BitTorrent is a Peer-to-Peer protocol for file sharing. File sharing means down- and uploading files and peer-to-peer means that you don’t connect to a (dedicated) server but to some other person in the network.

However, it neglects one important actor in BitTorrent and that is trackers. If you want to read the BitTorrent specs.

What about trackers?

morehawes.co.uk created an excellent simple overview about BitTorrent and especially trackers. I also took this illustration from there:trackerI quote from the site:

its main role is allow peers to ‘find each other’ and start communication, i.e. to find peers with the data they require. Peers know nothing of each other until a response is received from the tracker. Whenever a peer contacts the tracker, it reports which pieces of a file they have. That way, when another peer queries the tracker, it can provide a random list of peers who are participating in the torrent, and have the required piece.

Which information is available?

Generally we know which pieces (data) a client has and which he needs. In economic terms we have supply and demand.

How is this handled?

There are two main methods used currently. The client requests a random piece or the rarest piece.

Using market mechanism to allocate resources

At the first sight it looks like a market. On the second it doesn’t because you don’t lose your goods if you give it to another (upload one piece). On the third look it looks like a market again because the same applies to digital goods.

So, we have actors in our market and goods (pieces of data). We know that the first actor alone on the market has all data.

What is the objective?

This is a really important question to ask. What do we want? The great thing is that we work with machines (clients) instead of people which means we can build a mechanism which assumes rationality to an extremely high level.

I think the most important objective is distribution. We want a file as quickly distributed and available as possible. So we have two measurements of success:

  1. How many actors (client) do have the complete file after some time period?
  2. How much data was transferred after some time period?

Why do we need two? The first one helps us to measure the main objective, i.e. file distribution. The second one helps us to get rid of some artificial time period. Let’s say we have an allocation mechanism which allocates 99% of the data as fast as possible but waits until every client has 99%. In this case it would suck on measurement one but rock on measurement two. You can imagine the other way around.

Create prices

Prices are subjective. Clients aren’t. So we have to build some which are subjective to some degree. Most interesting is the subjectivity in relation to needs.

For example, if there are 2000 clients at 95% in the network only you and the original uploader have a piece of data that is missing – all the other missing pieces are already distributed evenly. I think it’s pretty clear that they need this piece from you or the original uploader more than the other pieces.

This is our first factor which I call distribution level.

The other factor becomes evidently if we look at our measurements. We want speed. Again, it’s better that a client which has a high upload speed to get an important piece than one with a low upload speed. I name our second factor speed level.

Problem: No information about speed

Yeah, we don’t know the speed level and even if we knew how fast the internet connect is we don’t know how much a client can send. We have two options at this point:

  1. Throw away the speed level and work without it
  2. Try to estimate it

Let’s talk about the second option. We know which pieces a client requests, which ones he has, how much he uploaded and downloaded. We can estimate the network speed level. We could try to maximize this with experiments. We could work with assumptions (e.g. avg upload speed ~ 1/8 avg download speed) etc. Because this is more a thought experiment than a real implementation, I will just make speed levels up.

I will stop here and continue this post tomorrow or so.

Shopping for a language

I’ve written my first line of Python about 8 or 9 years ago. I remember buying a small book about it. Having written some PHP it was awesome. The language looked so clean and intentionally designed.

Then about 6 years ago I dived deeper into Python and used it since then quite regularly. Most of my code was written in Python, then some code in JS, Stata / R and C. Now Python 3 is the official standard and yesterday I read a post called Python 3 is killing Python.

I share one side of the argument which is also represented in this post. That is that Python 3 wasn’t a good thing. There should have been Python 2.8 but I don’t want to talk about it.

This article lead me to revise my belief about languages. I was so deep into Python that I didn’t take a look at new languages like Scala, Clojure, Dart and Go. Heck, I even just wrote my first jQuery code a few weeks ago.

The mystical unicorn

The first times I heard about functional programming must have been in esr’s How to Become a Hacker:

LISP is worth learning for a different reason — the profound enlightenment experience you will have when you finally get it. That experience will make you a better programmer for the rest of your days, even if you never actually use LISP itself a lot.

Norvig’s Teach yourself programming in ten years:

[Learn languages …] one that supports functional abstraction (like Lisp or ML), one that supports syntactic abstraction (like Lisp), one that supports declarative specifications (like Prolog or C++ templates), one that supports coroutines (like Icon or Scheme),

And of course Paul Graham. Interesting enough all are role models for me.

Finally doing it

At the time Haskell became more popular and with it some tutorials. So I started learning Haskell. At the time I’ve written code in PHP, C and Python and the abilities of Haskell were insane. It was so elegant and simple.

Later I wrote some Scheme working through Concrete Abstractions. It was really nice. However, all this code was no production code – just for learning.

So I looked around to see if there are books which are more focused on production and found one on Common Lisp. However, one thing that turned me off was the lack of a standard – not a Lisp standard but a standard work environment. There were thousands of lisp implements and dialects and so I gave it up. Because not writing lisp is easy than writing it.

Going shopping

So yesterday I’ve gone language shopping. I wanted to learn a new one. There were a few names which I’ve seem lately which I wanted to look at closer: Go, Scala, Clojure, Rust and Julia.

My “survey” was totally unscientific and I just looked at the project and wiki pages of each language and decided if I want to know more.

My final two languages were Scala and Clojure. Both run on the JVM which is fast and has tons of libraries. Also Go and Rust are more low-level and Julia is more for scientific calculations.

I looked at code of both and Clojure reminded me immediately of Lisp. It really liked it. Here are a few things which are also great about Clojure – so far:

  • Dynamically typed
  • Prototyping should be easy
  • JVM is fast and tons of libraries
  • Code looks great
  • REPL
  • I never learned about lisp macros

An investment?

I read a few articles on Clojure and one talked about how it’s going to explode. I don’t think so. Lisp exists for such a long time and it never really took off. I think a lot of companies just decide that it’s easier to use C / PHP because there are more hiring opportunities. Also developers still find FP harder. I don’t think that FP languages are inherit hard but that developers start with a C-like language (C++, Java, Ruby, Python, etc.) and the transition isn’t easy.

Would I recommend learning Clojure for a job? Nope. There are basically no jobs with Clojure. To be fair there were also basically none in Python when I first started to learn it. But I think it will stay a niche language. I want to end this with a quote which I already posted but it’s still great:

“So next time I hear the “you can’t get the programmers” line I’m going to respond with something like this:

“If you post an advert for a Haskell developer you will get 20
applicants. All of those people will be the kind of developer who
learns new programming languages to improve their own abilities and
stretch themselves, because nobody yet learns Haskell just to get a job.”

“If you post an advert for a Java developer you will get 200
applicants. Most of them will be the kind of developer who learned
Java because there are lots of Java jobs out there, and as long as
they know enough to hold down a job then they see no reason to learn
anything.”” —haskell.org

 

AI / Automate: Flappy Bird

So I had the urge to program something yesterday and somehow decided to take a look at flappy bird. I used a HTML5 version by uralozden. I wanted to write an algorithm that plays flappy bird.

The first idea was to try random moves and record the good ones. But before that I changed the game a bit so that I can get access to the flap() function and interject an AI function. This version of flappy bird is written with Phaser.

Making AI possible

I created a Timer which looks like this:

    aiTimer = new Phaser.Timer(game);
    aiTimer.onEvent.add(randomAI);
    aiTimer.start();
    aiTimer.add(0.2);

In line 2 I insert my function which handles the AI in this case randomAI.

Creating the random AI
The basic idea of the random AI is super easy. Pseudo code looks like this:

If best score > current score:
    use saved moves
else:
    use random moves

I used this approach so that I won’t take forever to get a good enough series of moves. How do we get saved moves? It’s pretty easy. I added code to the score function which works as follows:

if best score > current score:
    do nothing
else:
    best moves = current moves

At that’s basically it. Here you can see at video of the bot playing (started recording mid simulation). It’s a bit laggy but super easy to record thanks to recordit.co:


Using (perfect) information

This version didn’t work so great. The timing was really important. Even if the timer had a few milliseconds lag the bird didn’t make it. You can see this in the video pretty good. The great thing is that we can get access to all the data. Two things would interest us especially. Firstly, the y coordinate of our bird and secondly, the y coordinate of the pipes.

Again I looked around and we can get access to these two data points pretty easily.

birdie.body.y // position of the bird

fingers.forEachAlive(function(finger) {
    finger.y // position of all pipes
}

I don’t want to get into details but Phaser has some internal workings which doesn’t makes it so easy to access the next finger aka. pipe.

The next step was to write a function which controls our height, which basically looks like this:

if(birdie.body.y + offset > goal) {
    flap();
}

Super easy. Nothing special. If we are too low, flap otherwise do nothing. So how does this AI perform? Take a look at the video:

Looks pretty good, eh? I let it run for a while and it stopped because there was an error in the original game which created pipes which were so high that it was impossible to flight through the gap. It was around a score of 99. I’m satisfied.

Setting up nginx with Bottle and Flup (FastCGI)

Writing the bottle app

from bottle import route, run

@route('/app/')
def hello_name(name):
    return "Hello " + name

run(server="flup" host='localhost', port=8080)

Again, two things are important here.

  1. We got to have the routing correct
  2. We have to set the server = “flup”

Using flup

What is flup? Flup is a random assortment of WSGI server and most importantly for us, fcgi. You can install it with pip:

pip install flup

Now we can run our application with flup.

Installing and configuring nginx

If you’re running a Debian-based distribution it’s just:

sudo apt-get install nginx

Start by making a backup copy of the nginx.conf which is in /etc/nginx/:

cp /etc/nginx/nginx.conf .

Now we can edit it and connect it to our bottle application.

Setting nginx up for fast cgi

The basic structure in nginx’s config files is:

http {
    server {
        location {
        }
}

in server {} comes everything related to the servers, e.g. ports and host names. In location /foobar {} you insert everything to related a location – in this case /foobar.

Configuring a server

We start with the following code:

server {
    listen 8080;
    
    location / {
        root /home/foobar/www;
    }
    
    location /app/ {
        # magic
    }
}

This creates a server which runs on port 8080. Furthermore, we apply rules to two locations. First / to which we assign /home/foobar/www. That means that every request at / will request files / paths in /home/foobar/www. Some examples:

http://location:8080/ -> /home/foobar/www
http://location:8080/test -> /home/foobar/www/test
http://location:8080/test.html -> /home/foobar/www/test.html
http://location:8080/some/directory/baz.html -> /home/foobar/www/some/directory/baz.html

However, we defined an exception: /app/. We want to use fastcgi in this case. nginx offers this ability natively. You just write the following:

    location /queue/ {
            fastcgi_pass   localhost:8889;
            fastcgi_param  GATEWAY_INTERFACE  CGI/1.1;
            fastcgi_param  SERVER_SOFTWARE    nginx;
            fastcgi_param  QUERY_STRING       $query_string;
            fastcgi_param  REQUEST_METHOD     $request_method;
            fastcgi_param  CONTENT_TYPE       $content_type;
            fastcgi_param  CONTENT_LENGTH     $content_length;
            fastcgi_param  SCRIPT_FILENAME     $document_root$fastcgi_script_name;
            fastcgi_param  SCRIPT_NAME        $fastcgi_script_name;
            fastcgi_param  REQUEST_URI        $request_uri;
            fastcgi_param  DOCUMENT_URI       $document_uri;
            fastcgi_param  DOCUMENT_ROOT      $document_root;
            fastcgi_param  SERVER_PROTOCOL    $server_protocol;
            fastcgi_param  REMOTE_ADDR        $remote_addr;
            fastcgi_param  REMOTE_PORT        $remote_port;
            fastcgi_param  SERVER_ADDR        $server_addr;
            fastcgi_param  SERVER_PORT        $server_port;
            fastcgi_param  SERVER_NAME        $server_name;
        }

It’s a lot of lines but the first one is the one you have to edit. It says that fastcgi requests should be forwarded to localhost:8889 which, in this case, is our bottle application. You can put the rest into an extra file and use include file to include it.

Starting nginx and bottle

First, we want to start our nginx server:

sudo nginx # or if we want to use a special config file
sudo nginx -c /path/to/nginx.conf

If that runs, we just start our bottle app and we’re done.

python bottle.py