Archive for the ‘Technical’ Category

sys/queue.h sucks on Linux

Friday, June 27th, 2008

I was vaguely aware that the copy of sys/queue.h on Linux systems was old. However, I’d forgotten it actually lacks some important features of the modern version shipped with BSD systems. There is a very common pattern of usage with linked lists, which the Linux version of queue.h doesn’t support too easily - iterating over a list and selectively removing elements, without invalidating the list.

It is unsafe, for example, to iterate over a TAILQ using the TAILQ_FOREACH macro if you are going to then do a TAILQ_REMOVE on certain elements within the TAILQ_FOREACH. The idiom suggested in the OpenBSD queue(2) manual page is:

for (var = LIST_FIRST(head); var != LIST_END(head); var = nxt) {
    nxt = LIST_NEXT(var, entry);
    if (some_condition) {
        LIST_REMOVE(var, entry);
        some_function(var);
    }
}

Note the need for LIST_END macro as the loop guard. Since LIST_END, TAILQ_END, etc are all missing from the Linux version of sys/queue.h, this code will fail to compile on Linux systems. So it is necessary to bundle the BSD version. Fortunately, this isn’t a problem - the license is of course extremely liberal and you just drop it in your source directory and forget about it. It is annoying to stumble upon though. Why haven’t they updated their copy?

Python and poor documentation - urllib2.urlopen() exception layering problems

Saturday, April 5th, 2008

UPDATE: An informative email thread which I started on the Python BayPiggies email list, can be found here.

One of the things I like most about Python is the “batteries included” philosophy. The standard library is comprehensive. One of the things which I dislike most about Python is the documentation. While superficially comprehensive, it leaves out many important details.

A case in point is the urlopen() function from the urllib2 module. This is very useful for fetching HTTP resources and allowing you to process the results trivially. Unfortunately it has some strange behaviours. The documentation claims “Raises URLError on errors”. One would imagine that this means one simply needs to catch the URLError exception, to successfully handle errors. This is incorrect. I have written a number of Python programs which are long-lived in nature (run indefinitely - over months) which use this function over the Internet. I want these programs to be robust and to recover from individual urlopen() failures. In the course of running these programs I have found the following exceptions are in fact thrown (note that I operate only on HTTP URLs):

  • urllib2.HTTPError
  • urllib2.URLError
  • httplib.BadStatusLine
  • httplib.InvalidURL
  • ValueError
  • IOError

Of course none of these are mentioned in the documentation. It is also unclear if it is the intention of the API designers that these exceptions should bubble up to the urlopen() caller. For example, certain httplib errors are caught by urllib2 and raised as a URLError, but obviously many are not. In my opinion, the documentation should either be clear that the caller needs to check for various underlying library exceptions, or urllib2 should be modified to convert all these errors into URLError.

To this effect, I intend to bring this issue up for discussion on some Python lists and if consensus is reached, I will propose a patch to handle at least those httplib exceptions which I have encountered.

Top 10 Torrented Films of 2007

Tuesday, March 18th, 2008

Over at the Peer to Peer Research Institute, we have performed analysis on over 128,959 film torrents. From this data, we were able to extract the most torrented films of 2007. Without futher ado, here is the list:

  1. Harry Potter and the Order of the Phoenix
  2. 300
  3. Transformers
  4. Ghost Rider
  5. Ratatouille
  6. Shrek The Third
  7. The Simpsons Movie
  8. Sunshine
  9. I am Legend
  10. I now pronounce you Chuck and Larry

You can expect more interesting data extracted from the dynamic world of P2P file sharing on this blog in the future.

Amusing PostgreSQL tid-bit: dates treated as arithmetic expressions

Monday, March 17th, 2008

This evening I was trying to select entries within a specific date range, for example, all torrents for films which had been released in 2007. My query looked something like:

SELECT name FROM table WHERE date > 2007-01-01 AND date < 2008-01-01

PostgreSQL consistently returned very odd results. As far as it was concerned, Transformers was not a 2007 release, furthermore Batman Begins - which I distinctly remember going to see about three years ago - was.

Any relatively experienced SQL hacker is no doubt chuckling, having immediately seen my error. Of course, PostgreSQL is treating the un-quoted dates as arithmetic expressions and evaluating them numerically. When you think about it, 2005 - 05 - 05 = 1995. This is a perfectly valid arithmetic expression, its just that it happens to look like a definitive calendar date to my brain.

I found this mistake on my part absolutely hilarious.

Why I am not renewing my ACM membership

Sunday, March 16th, 2008

After a year of Association for Computing Machinery “professional-level” membership ($200 / year) I’ve decided not to renew. Why not? A number of things really rubbed me up the wrong way about the ACM.

First of all, I had been looking forward to having an @acm.org email alias which I could use as a neutral and professional contact address on personal cards. Unfortunately, the ACM gave me a completely useless address which included an apostrophe in it! Yes, RFC 821 allows this, but such an alias is horribly prone to mis-typing and is difficult to remember. It simply wouldn’t work on a business card. There was no way, as far as I could see, to change the alias. I could have called them up and shouted at them to change it I suppose, but I really didn’t have time to bother with that sort of thing.

Secondly, they had the nerve to mail me ads for dental insurance. I assumed since I had received a letter from the ACM - a supposedly serious and useful professional organisation - that it would be something of actual value - such as perhaps an invitation to a conference or something of that nature. Nope - they are trying to sell me dental insurance. I already had dental insurance, and even if I hadn’t, I wouldn’t have appreciated the spam from an organisation which is supposed to be helping my career as a computer scientist, not flogging on insurance. I found it incredibly cheeky that after paying them $200, they would send me such garbage.

Thirdly, their publications are terrible. “Communications of the ACM” literally makes me nauseous. Its chocked full of industry advertisements and vapid, content-free articles. For a publication which is supposed to be of practical value, its pages are surprisingly full of utter trash - titles straight out of some Sokal hoax paper such as “amoeba-based neurocomputing with chaotic dynamics”. Come on. Their other magazine, “Queue” is similarly full of rubbish, but of a slightly different nature. Its articles are criticism-free, glorified adverts for various large computer vendors. Vendors get to trot out all sorts of insane new technologies, sure to be “the next big thing” while puppy-dog like interviewers stare wide-eyed.

Overall, the ACM reeks of an organisation which was once probably authentic, and useful, but has now utterly sold out to industry and become effectively an advertising racket specifically targetting professionals in computer-related fields. Not only do they provide very little content (beyond their digital library, which can be accessed from various public libraries for free in any case) but they are over-priced and actively engage in cheeky, rude maneuvers like trying to sell on insurance. Sorry ACM, but you won’t be getting any more of my money.

Unworkable 0.4 released

Monday, January 7th, 2008

I have just tagged, packaged and announced version 0.4 of my BitTorrent implementation, Unworkable.

Here are the release notes:

  • Implemented sending peer keep-alives.
  • Trace log now contains timestamps.
  • Make us more tolerant of intermittent tracker failures.
  • Added support for Arch Linux.
  • Fixed an off-by-four bug which could cause segfaults on some platforms.
  • Fix zero padding in peer id generation.
  • Overall code reduction and re-factoring plus improvements to documentation.

BitTorrent Strategies: The Beginning

Thursday, December 27th, 2007

To follow up on my last post on the bittorrent end-game, I’m going to write about a strategy to bootstrap a torrent download. I am talking here about the case where you start a download with no existing data, in other words, from scratch. As I described in one of my earlier articles about the basics of the BitTorrent protocol, torrents are divided into units called pieces. The torrent meta data contains the SHA-1 checksums for each piece, so that we can hash each piece once we have downloaded it, for verification purposes. Pieces are downloaded on a block-by-block basis - typically the block size is sixteen kilobytes (16KB).

Another important thing to understand is that BitTorrent peers use a “tit-for-tat” transfer scheduling algorithm. That is, peers will be rewarded by other peers for uploading data to them. It is therefore important that a bootstrapping peer have at least a small number of complete pieces as soon as possible, in order to share them, and hence be rewarded by the “tit-for-tat” algorithm. The bootstrapping peer wants to get at least a few pieces to share as soon as possible.

Additionally, pieces are usually downloaded in rarest-first order. This ensures that rare pieces are replicated sufficiently so that holes do not appear in the torrent as peers leave the swarm. One of the design considerations of BitTorrent - vs other distributed data mechanisms - is to have as reliable replication as possible. It is highly undesirable in BitTorrent for even a single piece to disappear, as this could render the final file(s) completely unusable. This contrasts heavily with a file distribution system such as Amazon’s Dynamo, where it may be quite acceptable for data to disappear - is it really a huge deal if a user loses the contents of their shopping cart in an outage?

In any case, the desire to ensure replication of the rarest pieces in a torrent is overridden by the need to get a small number of pieces as a peer is bootstrapping. To this end, the initial pieces are downloaded in random order as opposed to rarest-first order. In practice this means the bootstrapping peer should get its initial data quite a bit faster. The number of pieces to download before switching to rarest-first order is suggested to be four.

As a general point, the peer will aim to complete a piece fully before requesting blocks belonging to another piece. Alternately put, if the peer has a few blocks in one piece, it will concentrate on downloading the remaining blocks belonging to that piece before starting on another piece.

BitTorrent Strategies: The End Game

Wednesday, December 26th, 2007

Downloads in BitTorrent take place according to a number of strategies, which map to stages. Initiating a torrent download has one strategy, normal operation has another strategy, and finally pulling down the last remaining pieces has yet another strategy.

The End Game is the name for the final download strategy - there is a tendency for the last few pieces of a torrent to download quite slowly. To avoid this, many BitTorrent implementations issue requests for the same remaining blocks to all its peers. When a block comes in from one peer, you send CANCEL messages to all the other peers requested from, in order to save bandwidth. Its cheaper to send a CANCEL message than to receive the full block and just discard it.

However, there is no formal definition of when to enter End Game Mode. In my BitTorrent implementation, Unworkable, I quite stupidly used a percentage-based threshold. When only 5% of the torrent download was incomplete, Unworkable would enter End Game Mode. I didn’t notice how stupid this was until I got around to testing with a large torrent download, a Fedora Core 8 DVD image which is 4G in size or so. It turns out that 5% of 4G is quite a bit of data, and requesting each bit of it from every single peer is extremely inefficient. So I did some more research to find a more effective way to detect when to enter End Game.

I found two popular definitions:

  1. All blocks have been requested
  2. Number of blocks in transit is greater than number of blocks left, and no more than 20

I didn’t understand the choice of the number 20 in method 2, so I decided to go with option 1), which I did understand. In Unworkable CVS HEAD (and in the 0.4 release, coming in a week or two), the ill-informed percentage-based threshold is gone and End Game Mode is entered when all blocks are requested.

How to Create a Highly Successful Open Source Project

Tuesday, December 25th, 2007

I’ve been making public releases of Unworkable, a simple and efficient from-scratch BitTorrent implementation written in C for a couple of months now. Prior to that, I had been working on it for about 18 months on and off. While I wouldn’t claim it to be highly successful at this early point, I make an effort to try to manage the project well, and I intend to write about some of my ideas and the things I have learned along the way.

Here are some of my thoughts so far:

Release early, release often. This advice was, to my knowledge, first phrased in this way by Eric S. Raymond, but its a pretty straight-forward observation. I attempt to put out two releases per month - roughly one every fourteen days. A reliable release cycle sends a great message to users. It shows them that you care about your project, that it is actively and reliably developed and it pushes you toward an evolutionary, as opposed to revolutionary, approach to development.

Focus on incremental improvements. This is closely related to the above point, and is something I greatly admire the OpenBSD project for adhering to so strictly. Reliability and stability are far more important in the long term than having the latest whiz-bang feature. There is no shortage of low quality software in the world - concentrate first on doing a small set of things very well, gradually adding new features as existing ones stabilise.

Strive to make your project as portable as possible. This is sadly often overlooked. A depressingly large amount of open source software today is written with the assumption that it will only ever run on Linux on the i386 platform. First of all, your goal should be to attract as many users and contributers to your project as possible. The more platforms you can support, the larger your potential pool of feedback and support. Secondly, supporting multiple platforms forces you to write code capable of dealing with both little and big endian architectures, different pointer sizes (32bit vs 64bit), and differing alignment requirements (for example sparc64 is rather strict about this). All of these differences can expose nasty bugs in your software, and its great to squash them early on! Thirdly, it pushes you in the direction of increased modularity and almost certainly necessitates a well-maintained build system - both of which are important in the long term.

Embrace criticism. One of the most terrifying things about releasing your code as open source is that other people are able to look at it, pick it apart, criticize it. Many people are very afraid of this sort of negative criticism - they feel that it is a personal attack. Instead of dreading criticism, learn to embrace it. Realise that negative feedback is completely disconnected from you - it says nothing about you as a person. See it as extremely valuable input, use it to drive yourself to make improvements. Commercial software developers pay top dollar for QA departments, third party audits and analyses, user surveys and so on - you are getting this stuff for free! Furthermore, the ability to constructively accept criticism of your work is a great asset that will make you highly attractive to potential employers.

Encourage feedback and contributions. I like nothing more than a detailed bug report, or a bunch of suggestions about features to implement - or even if someone submits my software to a compendium on my behalf, thats great! Some people are sadly completely incapable of dealing with users of their software. They are unable to see that just because someone emails you asking for support on an old version in a rude way, you are free to ignore them. If someone submits a diff that you don’t like, you don’t have to accept it. If someone emails you a list of features you “must implement”, you are not compelled to do what they ask. Sometimes user contributions are imperfect - I’ve received diffs that I haven’t committed, and I’ve received feature requests which I have no intention of implementing. However, I’ve gotten some great ideas, some nice diffs, and just generally gotten a feel for what people are interested in - and I didn’t even have to pay to conduct a market survey!

Thats it for now. Next time I will write a bit more about my thoughts on documentation, promotion and distribution for a highly successful open source project.

BitTorrent Distributed Hash Table (DHT) or Trackerless BitTorrent I

Monday, December 24th, 2007

One of the more interesting extensions to the BitTorrent protocol has been the introduction of a distributed hash table implementation. As mentioned in my previous article on the basics of the BitTorrent protocol, traditionally BitTorrent relies upon a centralised “tracker” application - which runs over standard HTTP - in order to facilitate contacting peers and so on. The requirement for a centralised tracker is obviously a major barrier to fully decentralised operation, and a problem in terms of BitTorrent’s resistance to tracker outage (perhaps even caused by legal actions). In part one of this article I’m going to look a bit at the network side of BitTorrent’s DHT.

The official BitTorrent DHT specification states that the protocol is based on Kademilia. In BitTorrent, DHT is mostly separated from the original protocol. Peers listen on an additional port, using a UDP protocol, to issue network searches and so forth. The DHT protocol is known as KRPC and consists of three message types - query (”q”), response (”r”) and error (”e”). There are four queries:

  • PING
  • FIND_NODE
  • GET_PEERS
  • ANNOUNCE_PEER

Each KRPC message is formatted in B-Encode, with various key-value pairs encoded in a dictionary structure. Each node participating in the DHT has its own routing table containing contact information for nodes “near” to it (according to a mathematical ‘distance function’). This routing table is used to produce responses to queries. We will go more into details of how this works in the next article.

The TCP BitTorrent peer protocol itself has only been very slightly extended in order to take advantage of this new DHT functionality. During the handshake, peers may indicate that they support DHT. If the receiver also supports DHT, it should send a new message called PORT (id is 0×09) with the port number of its UDP DHT service, encoded as a 16-bit value. The receiver of this message will then try to send a PING message to the peer on this port, and if it gets a reply, routing table adjustments etc should take place.

Those are the very basics of BitTorrent DHT from a network perspective. I will write more about the details of the algorithms used for routing and how the consistent hashing is employed in part two.