Year at LShift – in photos

Marek Majkowski wrote “by marek on 30/04/10”

What has happened to the segment registers?

Marek Majkowski wrote “16-bit days: There were days when computers had 16-bit registers and 20-bit addressable memory. That is a total of 1MB of memory – some claimed that it ought to be enough for anybody. The memory address space was flat and not protected by anything; this is now known as real mode. How was it possible to address 20-bit memory…”
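The question the excerpt ends on has a compact answer on the 8086: a 16-bit segment register is shifted left by four bits and added to a 16-bit offset. The sketch below only illustrates that arithmetic; the function name `real_mode_address` is made up for this example, and the wrap-around at 1MB assumes the original 8086 with its 20 address lines.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address for a 16-bit segment:offset pair in real mode."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and keep 20 bits, as an 8086 with 20 address lines would.
    return ((segment << 4) + offset) & 0xFFFFF


# Different segment:offset pairs can name the same physical byte.
print(hex(real_mode_address(0xB800, 0x0000)))  # 0xb8000
print(hex(real_mode_address(0xB000, 0x8000)))  # 0xb8000
```

Because the segment only moves in 16-byte steps, many segment:offset pairs overlap, which is why the two calls above point at the same address.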

Memory matters – even in Erlang

Marek Majkowski wrote “Some time ago we got an interesting bug report for RabbitMQ. Surprisingly, unlike other complex bugs, this one is easy to describe:  At some point basic.get suddenly starts being very slow – about 9 times slower!”

Introducing rabbitmq-status plugin

Marek Majkowski wrote “RabbitMQ is becoming a decent product, but it shares some of the common problems of young software – for example, beginners have a hard time understanding what happens under the hood. Don’t get me wrong, Rabbit generally works perfectly as a black box. But at some point, when things go wrong or when Rabbit needs to…”

Python quirks

Marek Majkowski wrote “I’ve been using Python for a while. Recently I have noted some nuances, wonders and counter-intuitive things I ran into. The list grew surprisingly fast.”

Yet another Key-Value database

Marek Majkowski wrote “Following the NoSQL movement, I became a fan of key-value databases. Usually there’s nothing interesting to say as they work fine out-of-the-box. But in a project I was recently working on, the K-V store started to be a major bottleneck.”

Python Queue interface for AMQP

Marek Majkowski wrote “Here at LShift we’re often discussing RabbitMQ. We’re keen on complicated deployment scenarios, redundancy of the broker and other complex use cases. While these problems are extremely interesting, some believe they are irrelevant for the great majority of RabbitMQ users. People keep asking how to get started with Rabbit. There are some very good sources…”

Memcached protocol is not enough

Marek Majkowski wrote “A few months ago I was wondering if it’s feasible to build a scalable realtime search engine using a shared-nothing architecture. One of the essential project decisions I need to make is choosing a decent communication protocol to the storage nodes. Recently, the memcached protocol is becoming a standard as…”

LShift at QCon!

Marek Majkowski wrote “LShift will be attending QCon London! Please come over and meet us at stand 20 during the conference, from March 11th to 13th. I will also be presenting an Etherpad clone at the Skillsmatter stand (booth number 10). This will happen in the break between sessions on Wednesday at 4:45 pm.”

Evserver, part3: Simplified Etherpad clone

Marek Majkowski wrote “I hate to write using markup languages. The problem with markups is that when I see a typo in the rendered output, I have to click through the text and search for the exact place with the mistake. I have the same feeling about editing Wikipedia, documentation on, Trac, Blogger, Wordpress and so on. But I hate writing in WYSIWYG editors even more. Almost all graphical editors generate crappy output: badly closed HTML tags, broken styles, stripped white space. Considering these problems, I usually try to stay with markups. The next problem is that I'm the only person who can fix mistakes in my texts. My friends tell me about typos, but I have to fix them by hand. I tried to share texts on Google Docs, but the collaboration doesn't work well enough. A few months ago I saw an online real-time editor, Etherpad. That's quite a cool toy. It solves the problem of sharing the text with my friends, but it doesn't support any markups - it's just a plaintext editor. But I know how to create Comet applications easily using EvServer and Django. I realized that I could build a simplified Etherpad clone which supports a markup language!”

EvServer, part2: Rabbit and Comet

Marek Majkowski wrote “A few days ago I introduced EvServer. In this post I'll present a simple EvServer example. EvServer is a normal WSGI server, but with one additional feature. Instead of blocking in your WSGI application, you yield a file descriptor to the server. On descriptor activity the server will continue your WSGI app until it yields again. I'll show how to wait for AMQP messages inside the WSGI application and how to push them up to the browser. If you can't wait till the end of the post, please feel free to view the online demo (outdated) of the code described below.”

EvServer, Introduction: The tale of a forgotten feature

Marek Majkowski wrote “A long, long time ago there was a WSGI spec. This document described a lot of interesting stuff. Among other very important paragraphs you could find a hidden gem:

[...] applications will usually return an iterator (often a generator-iterator) that produces the output in a block-by-block fashion. These blocks may be broken to coincide with multipart boundaries (for "server push"), or just before time-consuming tasks (such as reading another block of an on-disk file). [...]

It means that all WSGI-conforming servers should be able to send multipart HTTP responses. A WSGI clock application could theoretically be written like this:

```python
import datetime
import time

def clock_demo(environ, start_response):
    start_response("200 OK", [('Content-type', 'text/plain')])
    for i in range(100):
        # Assumed: the clock prints the current time (the original argument is missing).
        yield "%s\n" % (datetime.datetime.now(),)
        time.sleep(1)  # blocks this worker for a whole second
```

The problem is that this way of programming just doesn't work well. It's not scalable, requires a lot of threads and can eat a lot of resources. That's why the feature has been forgotten. Until May 2008, when Christopher Stawarz reminded us of this feature and proposed an enhancement to it. He suggested that, instead of blocking inside the code (as time.sleep(1) does), the WSGI application should return a file descriptor to the server. When an event happens on this descriptor, the WSGI app will be continued. Here's the equivalent of the previous code, but using the extension. With an appropriate server this could be scalable and work as expected:

```python
import datetime
import socket

def clock_demo(environ, start_response):
    start_response("200 OK", [('Content-type', 'text/plain')])
    # A throwaway UDP socket that never becomes readable, so every wait
    # simply times out after one second.
    sd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i in range(100):
            # Hand control back to the server instead of sleeping.
            yield environ['x-wsgiorg.fdevent.readable'](sd, 1.0)
            # Assumed: the current time, as in the blocking version.
            yield "%s\n" % (datetime.datetime.now(),)
    except GeneratorExit:
        pass
    sd.close()
```

So I created a server that supports it: EvServer, the Asynchronous Python WSGI Server.”
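To make the server side of that extension a little more concrete, here is a toy driver for a single application. It is only a sketch under stated assumptions and is not EvServer's actual implementation: `run_app` and `ticker` are hypothetical names, the `readable` callable returns an empty string so the app can yield its result as in the excerpt, and waiting is done with a plain `select.select` call rather than a real event loop.

```python
import select
import socket


def run_app(app, write):
    """Drive one WSGI-style generator, pausing in select() instead of sleep()."""
    pending = []  # (fd, timeout) pairs requested by the app since its last yield

    def readable(fd, timeout=None):
        # Record the request; the app is expected to yield right afterwards.
        pending.append((fd, timeout))
        return ""

    environ = {"x-wsgiorg.fdevent.readable": readable}
    for chunk in app(environ, lambda status, headers: None):
        if chunk:
            write(chunk)
        if pending:
            fd, timeout = pending.pop()
            # A real server would register fd in its event loop and serve
            # other clients meanwhile; this toy version waits for one fd.
            select.select([fd], [], [], timeout)


def ticker(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    sd = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for i in range(3):
            yield environ["x-wsgiorg.fdevent.readable"](sd, 0.01)
            yield "tick %d\n" % i
    finally:
        sd.close()


output = []
run_app(ticker, output.append)
print("".join(output))
```

The point of the design is that the application never blocks on its own: every pause is visible to the server, which is free to multiplex many such generators over one thread.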

My thoughts on real time full-text search

Marek Majkowski wrote “Usually, search engines can look through data outdated by a few days. But Twitter search seems to be returning real time search results. That’s why it’s interesting how it works. In this post I’ll present a short introduction to full-text search engines and my private thoughts about a possible implementation of a better one. Let’s…”

Asynchronous libraries performance

Marek Majkowski wrote “Recently I found some pretty libevent benchmarks. For me they show terrifying results. The blood-freezing fact is that the more connections you have, the higher the cost of adding new connections to the asynchronous loop. It means that if you have 1 connection registered to the asynchronous loop, the cost of registering a callback would be…”

Tracing Python memory leaks

Marek Majkowski wrote “While I was writing a Python daemon, I noticed that my application's memory usage was growing over time. The data wasn’t increasing, so there must have been some memory leak. It’s not so easy for a Python application to leak memory. Usually there are three scenarios: some low level C library is leaking your…”

Simple inter-process locks

Marek Majkowski wrote “I recently faced a very common problem: how to make sure that only one instance of my program is running at a time on the host. There are a lot of approaches that can be taken to solve this problem, but I needed a portable solution for Python. My first idea was to use widely…”
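The excerpt cuts off before naming an approach, so the following is not necessarily what the post ends up using. It is a minimal sketch of one common technique, an advisory lock taken with `fcntl.flock`, which works on POSIX systems only and is therefore not the fully portable solution the post asks for. The helper name `acquire_instance_lock` and the lock file path are made up for this example.

```python
import fcntl
import os
import tempfile


def acquire_instance_lock(path):
    """Return an open file holding an exclusive lock, or None if already locked."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle  # keep it open: closing the file releases the lock


lock_path = os.path.join(tempfile.mkdtemp(), "example-instance.lock")
first = acquire_instance_lock(lock_path)
second = acquire_instance_lock(lock_path)  # stands in for a second instance
print(first is not None, second is None)
```

The lock disappears automatically when the holding process exits, even after a crash, which avoids the stale PID file problem.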