How to Create a Highly Successful Open Source Project

December 25th, 2007

I’ve been making public releases of Unworkable, a simple and efficient from-scratch BitTorrent implementation written in C for a couple of months now. Prior to that, I had been working on it for about 18 months on and off. While I wouldn’t claim it to be highly successful at this early point, I make an effort to try to manage the project well, and I intend to write about some of my ideas and the things I have learned along the way.

Here are some of my thoughts so far:

Release early, release often. This advice was, to my knowledge, first phrased in this way by Eric S. Raymond, but its a pretty straight-forward observation. I attempt to put out two releases per month - roughly one every fourteen days. A reliable release cycle sends a great message to users. It shows them that you care about your project, that it is actively and reliably developed and it pushes you toward an evolutionary, as opposed to revolutionary, approach to development.

Focus on incremental improvements. This is closely related to the above point, and is something I greatly admire the OpenBSD project for adhering to so strictly. Reliability and stability are far more important in the long term than having the latest whiz-bang feature. There is no shortage of low quality software in the world - concentrate first on doing a small set of things very well, gradually adding new features as existing ones stabilise.

Strive to make your project as portable as possible. This is sadly often overlooked. A depressingly large amount of open source software today is written with the assumption that it will only ever run on Linux on the i386 platform. First of all, your goal should be to attract as many users and contributers to your project as possible. The more platforms you can support, the larger your potential pool of feedback and support. Secondly, supporting multiple platforms forces you to write code capable of dealing with both little and big endian architectures, different pointer sizes (32bit vs 64bit), and differing alignment requirements (for example sparc64 is rather strict about this). All of these differences can expose nasty bugs in your software, and its great to squash them early on! Thirdly, it pushes you in the direction of increased modularity and almost certainly necessitates a well-maintained build system - both of which are important in the long term.

Embrace criticism. One of the most terrifying things about releasing your code as open source is that other people are able to look at it, pick it apart, criticize it. Many people are very afraid of this sort of negative criticism - they feel that it is a personal attack. Instead of dreading criticism, learn to embrace it. Realise that negative feedback is completely disconnected from you - it says nothing about you as a person. See it as extremely valuable input, use it to drive yourself to make improvements. Commercial software developers pay top dollar for QA departments, third party audits and analyses, user surveys and so on - you are getting this stuff for free! Furthermore, the ability to constructively accept criticism of your work is a great asset that will make you highly attractive to potential employers.

Encourage feedback and contributions. I like nothing more than a detailed bug report, or a bunch of suggestions about features to implement - or even if someone submits my software to a compendium on my behalf, thats great! Some people are sadly completely incapable of dealing with users of their software. They are unable to see that just because someone emails you asking for support on an old version in a rude way, you are free to ignore them. If someone submits a diff that you don’t like, you don’t have to accept it. If someone emails you a list of features you “must implement”, you are not compelled to do what they ask. Sometimes user contributions are imperfect - I’ve received diffs that I haven’t committed, and I’ve received feature requests which I have no intention of implementing. However, I’ve gotten some great ideas, some nice diffs, and just generally gotten a feel for what people are interested in - and I didn’t even have to pay to conduct a market survey!

Thats it for now. Next time I will write a bit more about my thoughts on documentation, promotion and distribution for a highly successful open source project.

BitTorrent Distributed Hash Table (DHT) or Trackerless BitTorrent I

December 24th, 2007

One of the more interesting extensions to the BitTorrent protocol has been the introduction of a distributed hash table implementation. As mentioned in my previous article on the basics of the BitTorrent protocol, traditionally BitTorrent relies upon a centralised “tracker” application - which runs over standard HTTP - in order to facilitate contacting peers and so on. The requirement for a centralised tracker is obviously a major barrier to fully decentralised operation, and a problem in terms of BitTorrent’s resistance to tracker outage (perhaps even caused by legal actions). In part one of this article I’m going to look a bit at the network side of BitTorrent’s DHT.

The official BitTorrent DHT specification states that the protocol is based on Kademilia. In BitTorrent, DHT is mostly separated from the original protocol. Peers listen on an additional port, using a UDP protocol, to issue network searches and so forth. The DHT protocol is known as KRPC and consists of three message types - query (”q”), response (”r”) and error (”e”). There are four queries:

  • PING
  • FIND_NODE
  • GET_PEERS
  • ANNOUNCE_PEER

Each KRPC message is formatted in B-Encode, with various key-value pairs encoded in a dictionary structure. Each node participating in the DHT has its own routing table containing contact information for nodes “near” to it (according to a mathematical ‘distance function’). This routing table is used to produce responses to queries. We will go more into details of how this works in the next article.

The TCP BitTorrent peer protocol itself has only been very slightly extended in order to take advantage of this new DHT functionality. During the handshake, peers may indicate that they support DHT. If the receiver also supports DHT, it should send a new message called PORT (id is 0×09) with the port number of its UDP DHT service, encoded as a 16-bit value. The receiver of this message will then try to send a PING message to the peer on this port, and if it gets a reply, routing table adjustments etc should take place.

Those are the very basics of BitTorrent DHT from a network perspective. I will write more about the details of the algorithms used for routing and how the consistent hashing is employed in part two.

Enderby Land in 300 words

December 24th, 2007

On our boat, cold, in the midst of a particularly lonely sea. Clouds sitting overhead, fat fearsome and lazy, refused to let any sun filter through. Biscoe was standing on the deck, eyes dulled by the frozen wind, struggling for a matchstick to set his meagre smoke alight. Both hands were in pockets, clenched against the heartless temperature - he knew one must be sacrificed to the elements, in order that the smoke be lit. But the hiss was overwhelming, that primitive engine spewing forth steady clouds of smoke, chuffing on and on through the endless swathes of water, wasted and useless. Now and then a great clanging would come, as the hull punched through some nonsensical underwater ice pack. He resigned to his fate, removing one hand - his left - first from the pocket, then from its own tender glove. Steam rose up from that now-empty glove hole in a childish imitation of the grand machine’s chimney stack. Vapours are at the essence of our construction, much the same as any good boat, Biscoe thought, as he inhaled the product of the cigarette’s own combustion.

“Lad!” Biscoe shouted through the unfriendly air. “Lad! What do you make of that”. He pointed the lad toward what he had been regarding. It looked like a very vague dash in the sea at the horizon. A dirty smudge on an otherwise flawless daguerreotype. The lad peered and peered, trembling terribly through his too-thin clothes, until he, too, saw the smudge. “Could it be real, sir?” said the lad, “could it be land?”. Biscoe took another drag through his lips, holding the warm smoke in his own lungs for a long moment, before spewing forth another cloud of smoke. “Yes, lad, its land. I’d say its Enderby Land”. At that very moment, a great cachalot breached by the aft of the boat.

Decoupled Python GUI Construction, or BitTorrent visualisation

December 22nd, 2007

While in general I appreciate very simple, no-nonsense user interfaces for applications that work efficiently on the console and so can be used via SSH, there are times when increased visualisation is very useful.

Specifically with regard to my BitTorrent client, Unworkable, the default user interface is exceedingly simple. Inspired by the ubiquitous scp program on UNIX, the idea is that it should be just as simple to download a file via BitTorrent as via SSH or FTP/HTTP. Indeed, I borrowed the progressmeter.c source code from OpenSSH (a great repository of nice code). Have a look at this screenshot of Unworkable running on Windows to get an idea of what it looks like.

While I’m pretty happy with the current console user interface, it is obviously very limited in how much information it can display. To overcome this, I have for a long time had an extremely verbose trace mechanism. Pass the -t option to Unworkable, and it will write a detailed log. This is incredibly useful for debugging, but it suffers from the opposite problem - information overload. Its quite difficult for a human to parse scrolling pages of ASCII text and spot patterns. It can take considerable analysis to determine exactly why the program is making the decisions it is. This is the main motivation for adding support for a graphical user interface - information can be much more easily distilled into graphs and other visualisations which make it much easier to understand, at a glance, what is going on. For some examples of the kind of things I have in mind, take a look at the Azureus screenshots page.

This brings me to the question of how to nicely add a GUI to my application. I strongly want to keep the existing simple, console interface, and don’t want to bloat the application with additional external dependencies for widget sets and so forth. I decided to go with an IPC mechanism, to split the GUI from Unworkable entirely. While considering IPC mechanisms, I figured why not simply use TCP/IP. While in general the GUI is going to be running on the same host as Unworkable, this at least gives the option for operation over a network.

So, I have added a simple “control server” to Unworkable, which currently has a fairly basic ASCII protocol. At the moment, communication is unidirectional - Unworkable only pushes out some event notification messages. There is no mechanism for clients to send instructions back, since this isn’t required for visualisation just yet and that is my first priority. I have started a client implementation in Python. “Why Python?”, you may ask. Python has pretty good networking support, good UI toolkit support and good multi-platform support. I’m also pretty experienced with the language, and I find it very fast to write applications with. Since the GUI itself performs practically no hard computation nor I/O, the performance penalties of a higher-level language are hardly a concern.

In closing, the formula of having the application split into a C program which does the performance-sensitive stuff, exposing a simple ASCII protocol over TCP/IP, while implementing the user interface in Python, permits maintaining a lean and efficient core application with a slick graphical user interface.

Anton Chekhov’s Ward No. 6

December 21st, 2007

Just yesterday I was sent a link by a good friend to the English translation of a short story called Ward No. 6 by Anton Chekhov.

I had never read anything by Chekhov before, but I was familiar with a number of Russian authors - Gogol, Dostoevsky, Bulgakov, Tolstoy - all of whom I enjoyed greatly.

Chekhov’s style reminds me somewhat of a more-psychological Balzac, certainly somewhat in the realist vein, or of Zola. Very cutting, gritty and in my opinion worthwhile.

Unworkable 0.3 released

December 20th, 2007

I have just tagged, packaged and announced version 0.3 of my BitTorrent implementation, Unworkable. My goal with Unworkable is to make releases frequently - hopefully twice a month or so - with incremental improvements each release. The hope is that each release should be of a higher quality than the last. Therefore I try to test new features well and ensure the stability is at least as good as the previous release. I also try to run tests across a wide variety of platforms (Solaris, OpenBSD, Linux, Windows, Mac OS X, etc).

Anyway, here’s whats new in this version:

  • Fixed a subtle bug in download strategy
  • Removed numerous format specifier bugs by bringing source in line with C99.
  • Major refactoring and code cleanup.
  • Added initial implementation of a TCP/IP “control server”
  • Checked in some initial work towards a decoupled Python UI.
  • Portability improvements to build and run on Windows (Cygwin).
  • Build and runtime testing on Fedora 7 and Gentoo Linux.

Porting software from OpenBSD to Windows

December 8th, 2007

Just committed the bits for Unworkable to build under Windows, using Cygwin. The code changes were pretty minimal - specifically, Cygwin lacks getaddrinfo() - however this was a relatively simple matter of bundling the KAME implementation (which is BSD licensed) in the source tree and building and linking with that if it wasn’t found. The other issue was that Cygwin doesn’t implement the madvise() function - this isn’t strictly necessary, its just supposed to give some hints to the OS VM - so I could safely #ifdef it out on Cygwin.

More difficult was getting SCons to work properly under Cygwin. Don’t bother trying to use the Windows package on their website, simply install Python through Cygwin, download the SCons source, and build and install it yourself manually. Then it will work fine.

Cygwin is a good starting point. Now to see about getting it to build as a native win32 app under MingW or even MS Visual Studio. Fortunately libevent appears to already build natively under Windows, and I’d imagine OpenSSL does too - so it shouldn’t be a horrendous amount of work.

Comments were broken, reCAPTCHA issue with chroot Apache and PF

November 21st, 2007

I just realised that posting of comments was broken on my site for at least a while. I had been wondering why there was nothing in my moderation queue for some time.

The exact reason is interesting. First of all, this site is hosted under a chroot()-ed Apache, with mod_php5, running on OpenBSD. Because of the prevalence of PHP vulnerabilities - especially in terms of using them to send out spam and so forth - we’ve locked down outgoing connections from the apache user at the packet filter (PF) level. The exact rule is: block out log proto { tcp, udp } all user www. However, the captcha service I use - reCAPTCHA - requires that the webserver connect to its hosts to verify input. I had therefore explicitly allowed Apache to connect to the specific reCAPTCHA host for this purpose.

It turns out that the DNS for the various reCAPTCHA services had changed, so I needed to update my PF rules. Sorry for the outage!

cvs log output parser, find N most recent commits, in Python

November 20th, 2007

Over the weekend I decided to spend some time trying to improve the web page for my BitTorrent project. I felt that there is a good bit going on with the project, but that that web page was not reflecting this to people who might be visiting it. I already write CVS commit log messages, I don’t feel like also writing those things into the HTML page. Why not simply display the latest N commit logs on the web page? This informs the user at a glance a) how active the project is b) what I’ve been working on.

Probably the traditional approach to doing this kind of thing is using CVS ‘commit hooks’. These are scripts run on every check in that can do things with the log messages and so on. For example, the OpenBSD ’source-changes’ mailing list is run using this feature. I didn’t really want to bother with this though - not least because the web server is not the same machine as the CVS server. Also, the commit hook approach only works from the point it has been set up onwards. I wanted prior changes to be visible.

The output of ‘cvs log’ contains all the relevant information. The cvs log command doesn’t require access to the CVSROOT, indeed I can do it quite easily read-only over SSH, since cvs.unworkable.org uses anoncvs - thus it is highly secure, and distributed. All I needed to do was to write a program to parse the output of ‘cvs log’, order it by timestamp (latest first ordering), and display the first N entries. It turns out to be pretty easy to do this in Python. My first question was, what should the data structures be? I reasoned that each commit could be represented by a dictionary with a few keys - timestamp, file, and commit log message. So all we needed was a list of these dictionaries - easy! After this, I saw two challenges - the parser and the sorter. The parser is a fairly simple finite state machine. String handling is pretty nice in Python - at least compared to C. It did not take too much work to get it pulling out the data I needed - great. Now to order the list. It is not hard to grasp what we need to do - we want something to compare each dict in the list based on its timestamp. But how exactly do we get Python to do this for us? We could write our own sort routine, but that shouldn’t really be necessary. Python already has a perfectly good list.sort() method. The question thus becomes, how do we make the sort() API do what we want? Python’s internal API docs reveal:

sort(...)
    L.sort(cmp=None, key=None, reverse=False) -- stable sort *IN PLACE*;
        cmp(x, y) -> -1, 0, 1

This tells us very little beyond a few named parameters. I can guess the meanings of ‘reverse’ and ‘cmp’ roughly, but there is no info at all on ‘key’. Sparse documentation like this is one of my pet peeves about Python. Fortunately I was able to figure it out. ‘key’ expects a function to be passed in, which will be called on each list entry and should return a value to sort by. Aha! All we need is to write a function to return the timestamp from each list entry. Oh wait - it turns out we don’t. Python’s standard library has a module called ‘operator‘ which includes a pre-written function “itemgettr” to do exactly this! So the entire ’sorting’ part of the problem can be achieved in a single line: entries.sort(key=operator.itemgetter(’dt’), reverse=True)

Having finished the program, it simply reads from stdin and writes to stdout. It defaults to printing the most recent three entries, but this is configurable through a single command-line argument. For example, cvs log unworkable | python changes.py -n 1 currently prints out:

network.c
revision 1.180
date: 2007/11/20 04:44:07;  author: niallo;  state: Exp;  lines: +3 -2
guard against a NULL piece_dl in the PIECE message handler.  this needs to be re-examined,
but it should at least stop us segfaulting.

which is exactly what I want!

Anyway, you can download the Python program (which I’ve BSD-licensed) here. Feel free to post any comments or feedback.

Porting software from OpenBSD to Linux II

November 19th, 2007

I wrote the other day about porting software to Linux. I talked about some of the differences between the platforms in terms of the implementation of some well-known C library functions, and hinted at some other incompatibilities. Well, now I’m going to write about some of the other issues I encountered.

There are a number of security-oriented libc functions originating in OpenBSD which are very useful but have had varying degrees of uptake from other OSes. The classic examples are the well-known strlcat and strlcpy functions. As is discussed in this paper, these functions offer a less error-prone interface (specifically in the realm of truncation detection) compared to the strncat/strncpy ANSI C functions. Although they’ve been integrated into to at least Solaris, FreeBSD and NetBSD libc’s, for some bizarre reason they are not included in GNU libc. An additional OpenBSD function I make use of is strtonum. The issue here is that for my software to build on platforms other than OpenBSD, I must include copies of these functions and build them on platforms which don’t already have them. This leads quickly to the need for a way of dynamically detecting at build time whether these functions are available on the host system. Traditionally, the solution to this situation has been to use GNU Autoconf. However, I’ve never been a big fan of autoconf, mostly since its so complicated and tricky to get working with your project. Many people have been dissatisfied with autoconf though, and so alternatives now exist.

Much as I like the simplicity and ubiquity of make, I didn’t fancy implementing all the function call dependency checking in it. Furthermore, I am only familiar with the BSD flavour of make, and didn’t have much inclination to port my makefile over to other platforms. So I decided that I would try out one of the contemporary build systems. I came across something called SCons. I read the documentation and liked it because it seemed to make the tricky things pretty easy. Additionally, its written in Python - which I quite like - and seems to be pretty portable. At least, there are packages for OpenBSD, Linux and Windows.

I actually ended up with two different build systems. A BSDmakefile which will build my application under OpenBSD with no dependencies beyond the base install, and an SConstruct file for all the other platforms - requiring the presence of SCons. I’ve tested the SConstruct build under both Ubuntu and Centos Linux and it works great. I hope to add support for Solaris and Mac OS X in the near future - as soon as I get access to these operating systems.