Posts Tagged ‘UNIX’

Porting software from OpenBSD to Windows

Saturday, December 8th, 2007

Just committed the bits for Unworkable to build under Windows, using Cygwin. The code changes were pretty minimal - specifically, Cygwin lacks getaddrinfo() - however this was a relatively simple matter of bundling the KAME implementation (which is BSD licensed) in the source tree and building and linking with that if it wasn’t found. The other issue was that Cygwin doesn’t implement the madvise() function - this isn’t strictly necessary, its just supposed to give some hints to the OS VM - so I could safely #ifdef it out on Cygwin.

More difficult was getting SCons to work properly under Cygwin. Don’t bother trying to use the Windows package on their website, simply install Python through Cygwin, download the SCons source, and build and install it yourself manually. Then it will work fine.

Cygwin is a good starting point. Now to see about getting it to build as a native win32 app under MingW or even MS Visual Studio. Fortunately libevent appears to already build natively under Windows, and I’d imagine OpenSSL does too - so it shouldn’t be a horrendous amount of work.

Porting software from OpenBSD to Linux II

Monday, November 19th, 2007

I wrote the other day about porting software to Linux. I talked about some of the differences between the platforms in terms of the implementation of some well-known C library functions, and hinted at some other incompatibilities. Well, now I’m going to write about some of the other issues I encountered.

There are a number of security-oriented libc functions originating in OpenBSD which are very useful but have had varying degrees of uptake from other OSes. The classic examples are the well-known strlcat and strlcpy functions. As is discussed in this paper, these functions offer a less error-prone interface (specifically in the realm of truncation detection) compared to the strncat/strncpy ANSI C functions. Although they’ve been integrated into to at least Solaris, FreeBSD and NetBSD libc’s, for some bizarre reason they are not included in GNU libc. An additional OpenBSD function I make use of is strtonum. The issue here is that for my software to build on platforms other than OpenBSD, I must include copies of these functions and build them on platforms which don’t already have them. This leads quickly to the need for a way of dynamically detecting at build time whether these functions are available on the host system. Traditionally, the solution to this situation has been to use GNU Autoconf. However, I’ve never been a big fan of autoconf, mostly since its so complicated and tricky to get working with your project. Many people have been dissatisfied with autoconf though, and so alternatives now exist.

Much as I like the simplicity and ubiquity of make, I didn’t fancy implementing all the function call dependency checking in it. Furthermore, I am only familiar with the BSD flavour of make, and didn’t have much inclination to port my makefile over to other platforms. So I decided that I would try out one of the contemporary build systems. I came across something called SCons. I read the documentation and liked it because it seemed to make the tricky things pretty easy. Additionally, its written in Python - which I quite like - and seems to be pretty portable. At least, there are packages for OpenBSD, Linux and Windows.

I actually ended up with two different build systems. A BSDmakefile which will build my application under OpenBSD with no dependencies beyond the base install, and an SConstruct file for all the other platforms - requiring the presence of SCons. I’ve tested the SConstruct build under both Ubuntu and Centos Linux and it works great. I hope to add support for Solaris and Mac OS X in the near future - as soon as I get access to these operating systems.

Faster BitTorrent, or, SHA1 is slow

Sunday, October 28th, 2007

This weekend I have been concentrating on improving the performance of my BitTorrent implementation. I somewhat believe in the mantra “premature optimisation is the root of all evil”, or at least, I’m not too worried about making everything super fast the first time around. Asynchronous network programs are complicated enough to write that I expected off the bat to have to rewrite much of the code after having sketched it out. This process of hacking up an initial approach, pushing it as far as I could, figuring out a better way, then re-doing it the better way, has brought my project reasonable success. In any case, I was happy enough with the download process that I decided to spend some time making it faster.

In order to make something fast, it helps to understand what is making it slow. Efficient BitTorrent protocol handling requires keeping lots of “in-flight” data block requests around. Both queued to individual peers, or spread out across many peers. At any one time, there are many block requests which the implementation must track, especially in order to make decisions about what it should try to download next. While working on my initial sketch which used a single linear tail queue for all “in-flight” requests, I found there are two common ways I need to access these requests. Firstly, I often ask “give me the in-flight request for offset X of piece index Y”. This maps nicely onto a binary tree structure, searched by request index and offset. Currently, these properties are guaranteed to be unique - i.e. only one request for a given index/offset pair can exist at a time - but in the future this will not be guaranteed. Think especially of “the endgame”, where we may want to request the same block from many peers. To future proof against this, I have the tree actually point to a tail queue of requests. Currently I am using a red-black tree. The second way I often need to access “in-flight” requests is by peer. I need to know “what are all the in-flight requests for peer X”. This is quite a bit easier to manage. Each peer simply has its own tail queue of its associated requests, which is at most one hundred elements long, and so cheap to search linearly. Furthermore, when inspecting a peer’s requests, it is almost always to perform an operation - e.g. deallocation - on all of them, so the the linear data structure is perfectly appropriate.

I didn’t measure exactly how much of an improvement this data structure change made - perhaps very little, since the linear lists were never more than a thousand entries long - however I’m happier to use the ‘right’ data structures in general, and not have to worry about it. After some profiling, I found another pretty obvious consumer of resources - SHA1 hashing. I had not especially optimised unworkable’s handling of the PIECE message, and it was performing a SHA1 hash of a piece as each block for it came in - regardless of whether we had received all the other blocks or not. I added a simple conditional which only does the checksum when we at least believe we have received all the blocks in the piece. This drastically improved performance, since we are now doing on the order of 16x (but depending on piece/block sizes, of course) fewer SHA1 hash operations.

I have not looked closely at other BitTorrent implementations, but I believe I can answer with relative confidence the question “why is BitTorrent so resource hungry?”. The answer: SHA1 hashing is expensive. As an aside, I do not think that OpenBSD yet supports hardware-accelerated SHA1 features in the newer VIA CPUs. Such hardware crypto features would be nice to see in more chips, maybe. Also, I wonder if other BitTorrent implementation authors have come up with better data structures or storage schemes to keep track of the stuff I am keeping track of. I must take a look sometime.

Porting software from OpenBSD to Linux

Thursday, October 25th, 2007

I have been making very good progress with my BitTorrent implementation recently. However, I have felt the need to make it compile out of the box under systems other than OpenBSD. The first system I have ported to is Linux, specifically Ubuntu 7.10. In doing the port, I found some interesting differences between OpenBSD and Linux.

mmap() differences

Unworkable (not officially released yet, but blog readers can have a sneak peek) uses the mmap() system call rather heavily. For each piece in a torrent, I create at least one corresponding mmap’d segment - more if the piece spans multiple files. OpenBSD’s mmap() accepts arbitrary offsets, which is very straight forward. Linux, however, requires that the offset be aligned to page size. That is, offset must be a multiple of the page size.

On a POSIX operating system, the page size can be queried easily using the sysconf libc function, passing the _SC_PAGESIZE value. Once I have the page size, I find the nearest multiple to the offset we really want, and mmap that. Then I do some pointer arithmetic to get to the ‘real’ offset from there, and do all my reads and writes with a ’synthetic’ pointer into the mmap as the base address. The alignment requirement makes things a little more complicated, but its not too tough to handle.

dirname() / basename()

I use dirname() to ensure that directories exist before I try to write files underneath them - something analogous to mkdir -p. Under OpenBSD, it is safe to call dirname() directly on your path string, and use the returned value. On Linux however, dirname() and basename() modify the string you give it, meaning that you can’t re-use it. This caused me a bit of confusion until I realised what was going on. Under Linux, its necessary to create a copy of your string before passing it to dirname(), so that your original is not mangled.

snprintf()

The snprintf() function is very useful for conveniently building strings in C. I use it especially in Unworkable’s HTTP/1.0 client implementation, to build the query string. Under OpenBSD, its fine to snprintf a variable into itself, e.g.

snprintf(x, XLEN, "%s = %d", x, y);

Under Linux, this doesn’t work - in fact, glibc’s snprintf() fails completely silently in this case, from what I could tell. It was necessary for me to snprintf() into a temporary buffer, and copy that around, rather than just using the snrpintf() on its own, like you can under OpenBSD.

Well, thats it for now. I’m going to write more about the porting process, in terms of adding missing functions and dealing with Make incompatibilities, next.

BitTorrent and protocol analysis, instrumentation

Sunday, October 14th, 2007

I have been working, time permitting, on my BitTorrent implementation over the past couple of months. I’d like to note first of all that a fellow
BitTorrent hacker, the maintainer of LibBt, Ryan Walklin, contacted me about some of the issues I was writing posts about. Unfortunately I don’t think I was of much help to him with his MSE/OpenSSL problems, but it was a very useful contact for me since Ryan took the time to explain some of the downloading algorithms used in LibBT.

It was great to get some answers to some of my questions, thanks for that Ryan! The discussion made me all the more aware of the limitations in my design, and I have since been re-structuring much of the network code. This correspondence alone proves to me that its worth taking the time to write posts on some of the issues I’m working through. Today I finally got where I want to be in terms of keeping track of in-progress block transfers and peer connections and so forth, so I feel I’m making good progress. However for a long time now I’ve been wishing for some sort of analysis tool to see exactly what messages existing clients (which work!) are sending, and in what order. Just now I discovered an “instrumented” version of the mainline BitTorrent client, from the folks at INRIA. I’d been aware of their work in BitTorrent for some time, having read their papers - which are responsible for me using the rarest-first piece download strategy. Anyway, this patched client they have published is truly eye-opening. It shows exactly the messages the client is sending and receiving, along with timing and internal state information. Absolutely invaluable for seeing how a real world transfer should go! This information should greatly help me in making my implementation much more robust.

BitTorrent encryption, MSE basics

Sunday, July 29th, 2007

I have been working quite a bit recently on my BitTorrent implementation. My project is faring reasonably well so far - it can successfully download large real-world torrents at decent speed. In the course of development, I have been focusing solely on the core specification, as documented here. Having gotten support for peer connections - outgoing and incoming - working, I started to notice something. Some peers would send ostensibly random data to my client. Investigation of this led me to conclude these were peers using the Message Stream Encryption protocol.

I am not hugely interested in avoiding traffic shaping, but the encryption protocol looks like fun to implement and I’d like my software to be able to communicate with these funky peers (instead of just killing them, as it does now). I know very little about cryptography, but I know a little more now that I have begun implementing MSE. MSE really has two components - a Diffie-Hellman (D-H) key exchange to negotiate a secret, and RC4 stream cipher to encrypt the data. Both these functions are used in Secure Sockets Layer. However it should be pointed out that MSE is more about obfuscating the data to confuse traffic shapers than hiding its contents from prying eyes.

Fortunately for me, the OpenSSL library contains nice implementations of both D-H key exchange and RC4 . Before I can get to the nitty gritty of actually negotiating obfuscated streams, I had to make considerable changes to my software’s peer handling logic. I had written the peer response handler with quite hard assumptions about the nature of the incoming peer handshakes - for example, assuming the first byte would be the length of the protocol string. Furthermore I needed to add some timekeeping logic for each peer, because MSE handshake depends on timing for synchronisation. The initial peer handshake message is variably padded and so time must be considered to know when the message is complete. For example a MSE handshake from peer A may be considered invalid by peer B a) if A sent less than 96 Bytes within 30 seconds or b) if A sent more than 608 bytes.

My first step has been to re-work my peer handshake handling so that MSE handshakes can be distinguished and separately handled by the software, while traditional handshakes continue to work. Code to do this has been committed. The next step is to actually support crypto peers. For anyone curious about where the source code for my software is, I am not yet ready to release it. I have distributed it to a small group of testers to help iron out some of the bugs, but I don’t feel its ready for public release. I am being somewhat selfish and sitting on my personal project until I have polished it to where I want it to be. If you are really interested though, shoot me an email (niallo (*) niallo dot net).

BitTorrent basics - protocol etc

Tuesday, July 10th, 2007

I started working on a BitTorrent implementation in 2006. The protocol is kind of fun and there is a need for a lightweight UNIX implementation in my opinion. rTorrent is quite usable, but absolutely huge - something like over 40,000 lines of C++ last time I looked. My implementation is ~3,500 lines at present and has most of the needed functionality - although admittedly it will not support downloading of multiple torrents at a time, nor the fancy UI of rTorrent. Even assuming it doubles in size by the time its finished, that is still quite lightweight in comparison. It is written in C, Yacc and uses libevent - and its under the BSD license. I develop it completely under OpenBSD, although across multiple platforms (to weed out the endianness and pointer-size bugs). It should be easily portable to other systems with some glue (much like OpenSSH or OpenNTPd). Anyway, it has shaped up rather well and in fact I have been able to actually download some ‘real-world’ torrents with it. However, I still need to overhaul the network scheduler. Once I am satisfied with this part, I will publish the source code. The network scheduler is a complicated component - with numerous research papers on the subject! - and I plan to cover it in future postings.

For now, I want to post a quick summary of how BitTorrent works. I’m going to describe ‘vanilla’ BitTorrent - without distributed, trackerless operation and unofficial encrypted streams. The best complete “specification” I have found is this wiki page - the official specification being rather useless. Unfortunately, the official Python implementation sources are a neceesary reference at times also.

BitTorrent splits data into a number of units called pieces. Typically these pieces are 256K. Each piece has an associated 20 byte SHA-1 hash to verify integrity. The pieces are sub-divided further into what are called ‘blocks’ - typically 16K. Each peer has zero, some or all pieces. Metadata is distributed in .torrent files. Very likely you are familiar with downloading these files via HTTP and then feeding them to a BitTorrent implementation in some way. These files are encoded in a mixed binary/ASCII format called ‘bencode’ - which is a kind of mini-language supporting nested strings, integers, dictionaries and lists. Its actually quite a pain to parse in a language like C - it took me quite some time to write the Yacc parser. The metadata contains things like, number of pieces, piece sizes, hashes for each piece, tracker URL and filenames. Pieces are all zero-indexed. This gets a little complicated in multi-file torrents, where you must determine which file piece X falls within, and at what offset. My implementation uses mmap(), and has translation routines to nicely abstract these calculations away :-)

Each peer must initially ‘announce’ itself to the tracker, and is supposed to send further updates periodically. Communication with the tracker is done via HTTP GET, with the tracker returning a bencode’ed dictionary as its response. This response includes a list of other peers - whose addresses are in fixed-length (IPv4) binary format, making it a little difficult to use IPv6 for BitTorrent, at least without protocol changes. There are a variety of peer messages. Data is transferred in terms of pieces, offsets and lengths (block size). Peers tell each other which pieces they have by sending a bitmap - one bit per piece.

The basics of the protocol are fairly straight forward. Aside from some poor decisions in terms of the bencode format for encoding messages, it is not hard to code support. As I said above, things start to get tricky when it comes to peer communication strategies. One thing of note is that the most efficient algorithm for downloading pieces is believed to be ‘rarest first order’. To my knowledge, nobody is using a sequential download order - which is why partial BitTorrent downloads of audio or video data are notoriously unplayable.

Keyboard-only X, cwm hacks and Vimperator

Monday, July 9th, 2007

A few weeks ago, I wrote a short entry on some of the software I’d been looking at to reduce my reliance on the mouse. Specifically, I mentioned a window manager, cwm, and a FireFox extension, Conkeror.

Since I wrote that piece, I have refined my setup a little more. I found a Firefox extension called Vimperator which adds Vi-like keybindings. I prefer it to Conkeror mostly because it seems a little less buggy. Both Conkeror and Vimperator have a feature where you press a key and all links on a page get associated numbers (in the case of Conkeror) or letters (in the case of Vimperator). This feature enables you to very quickly visit links on a page. I found that Conkeror’s implementation of this feature would not work with certain focus-manipulating JavaScript used on websites, and also it would sometimes insert the HTML and CSS for these numbered links into HTML you were editing! In fact, because of Conkeror, I ended up with it’s numbered links embedded into an entry on this site - which was very odd and quite annoying. Conkeror also had a tendency to crash in some way which disabled all its keyboard shortcuts on windows, such that the only way to kill a window was via terminal - again, highly annoying. Vimperator is a little buggy still, but doesn’t seem quite so bad. Furthermore, Vimperator makes it easier to open links in a new tab and copy links via the keyboard. Finally, I feel more at home with Vi-like keybindings than with Emacs-like keybindings, and I think they are a little easier on the hands in any case (less chording).

As for cwm, the window manager, I in fact went ahead and made some improvements to it of my own. The cwm code is quite clean and elegant C code, and was a pleasure to hack on. Already, it featured a window-switching dialog which had auto-complete (a.k.a. type ahead find). This feature is very nice and makes switching windows via the keyboard very easy. However, cwm also had an execution dialog which lacked this feature. I felt it would be relatively easy to populate the execution dialog with the executable files in the default path to add such auto-completion. It was a little more involved than I originally anticipated, owing to the fact that the existing cwm API for these auto-completing dialogs was designed for selecting one item from many, not for entering input. With a little hacking, though, I made a working patch. I find this feature very useful, as it allows me to launch common applications like Firefox and Pidgin with a minimal number of keystrokes. The next feature I added was based on something that I saw in the Ion3 window manager - namely, an auto-completing `ssh-to’ dialog. Much like the window-switch dialog or the exec dialog, the `ssh-to’ dialog is populated with entries from your ~/.ssh/known_hosts file. OpenSSH has support for hashing these entries - the `ssh-to’ dialog simply skips hashed entries. For me at least, my work involves ssh’ing to a large number of different hosts, many of which have long names. The auto-completing dialog saves me considerable time. After committing these two hacks to the OpenBSD Xenocara tree, cwm was becoming very comfortable for me indeed. Only one major thing was bugging me - window movement. Cwm still required you to use the mouse to move windows around. Unlike a window manager such as Ion3, cwm does not (yet…) support tiling layout, and as such terminals and so on must be moved manually to maximise screen real estate. A simple solution to this is the ability to move windows around with the keyboard. After a productive conversation with OpenBSD developer Todd T. Fries I hacked up a simple patch which moved the active window with a Vi-style M-hjkl keybinding. Todd improved on this initial hack and committed it to the Xenocara tree. With these improvements, I find cwm quite comfortable and have little use for the mouse.

There are still a few niggling things in my setup. Mostly they revolve around X11’s copy/paste buffer management. A thing I do very frequently is to copy a link or text selection from Firefox into an xterm, and vice versa. X11 has effectively two copy buffers - the “PRIMARY selection” and the “CLIPBOARD”. If you select text with the mouse in an xterm, it goes into the “PRIMARY selection”. You can paste the contents into an xterm using the keybinding shift-insert. Additionally, links yanked with Vimperator, or text selected in Firefox with the mouse - this all ends up in the “PRIMARY selection” also. However, typing shift-insert in the “open” dialog (`o’) of Vimperator pastes the contents of the “CLIPBOARD”! Actually, not just the “open” dialog behaves like this, but every text input field in Firefox. Effectively this means I have to use the mouse to paste text into Firefox, which is quite irritating. I hope to find a utility which can keep the “CLIPBOARD” and “PRIMARY selection” synchronised. Personally, I do not find it helpful for these to be separate and it simply causes me annoyance. More information on X copy and paste can be found in this jwz article.

Vi(M) tips #1

Friday, June 29th, 2007

I’ve been using the ViM and Vi editors for years, but I would not say I know all their short cuts or nifty features by any stretch. Here are three shortcuts (two for plan old Vi) which I learned just this week:

- (Vi/ViM) shift-a puts you in insert mode and brings you to the end of current line.

- (Vi/ViM) shift-i puts you in insert mode and brings you to the start of current line.

- (ViM) Key sequence gqip reformats the current paragraph nicely, much like M-q in Emacs. Very useful for writing email.

I also found one very nice ViM plugin this week, SuperTab. I used to not bother with ViM plugins because of the hassle of installing them on many machines, but I found now that plugin installation consists merely of creating ~/.vim/plugin/ and dropping the .vim file in there. ViM already has some fairly cool word-completion but SuperTab makes it work all with one key (tab). I’m not exactly sure about what it will complete, but so far it will complete based on a) the content of its dictionary file and b) words in the file you are editing. Very useful for completing variable or function names in source code, and also useful as a spelling tool for words you may be unsure of in email. Perhaps even useful as a poor man’s thesauraus ;-)

Lastly, I finally got around to binding my Caps Lock key to Escape. Obviously using Vi/ViM, you hit the escape key all the time. Escape is positioned out of my left hand’s easy reach on just about every keyboard I have used, however. Binding it to Caps Lock means less strain on my hands and more speed in my editing. When I was a big Emacs user, I similarly bound Caps Lock to Control. The magic xmodmap incantation to bind Caps Lock to Escape this is:


keycode 0x42 =  Escape

Word Press post from shell

Thursday, June 28th, 2007

One of the most annoying things about this whole weblog business is having to write your posts in the browser. I can’t stand entering any more than a few sentences in a web browser text area. Not only because web browsers are so unstable, prone to crashing, or that its so easy to press the wrong key and kill the window or go back in the history and lose what you’ve written - but that you are forced to use the crummy browser text field component. I use Vi for most of my editing these days, and I have grown very fond of the movement keys (hjlk). Using these means I don’t have to move my hands from the keyboard to move the cursor, which reduces strain and increases speed. Of course I miss many other features, such as on-the-fly spell-checking (although Firefox 2+ has this), word completion, auto-save, auto-format, syntax highlighting, and bracket matching also.

I initially assumed that WordPress is too poorly thought-out to have something clever like an XML-RPC interface. Well, thankfully, I was wrong. WordPress in fact supports the Blogger, Moveable Type and a bunch of other XML-RPC interfaces. This meant I wasn’t going to have to screen-scrape and hack together some horrible and brittle HTTP client.

I do a fair bit of work in Python for my job, and I’ve come to rather like the language for whipping together smallish programs. Especially tasks involving processing files or URLS, Python particularly excels for these. While it would not be a big deal to use the Python XML-RPC library directly myself (its in the Python standard library since version 2.2), I figured I would use someone else’s code if at all possible. After a bit of searching, I found wordpresslib. This little .py file is available under the LGPL license and makes it trivial to perform a variety of Word Press operations from Python.

My rather trivial program simply allows you to submit a post in draft or published form from the shell. It accepts input either on STDIN or a specified file. You can also set a post title. For the hell of it, I’ve put it under a BSD license and packaged it up in a little tarball with wordress.py available for download here. Before using it, you need to set three variables in main.py - the URL of your Word Press install, your user and your password. Sample run:


$ echo this is a post | ./main.py -t 'a test post'

If you are seeing this post, that is proof that it works. There are many more features which could be implemented of course. Maybe I’ll bother maybe I won’t. I’ll see if I get the itch! Feel free to send me patches of course ;-)