The problem with markup languages is that when I see a typo in the rendered output, I have to click through to the source text and search for the exact place with the mistake. I have the same feeling about editing Wikipedia, documentation on code.google.com, Trac, Blogger, WordPress and so on.
But I hate writing in WYSIWYG editors even more. Almost all graphical editors generate crappy output: badly closed HTML tags, broken styles, stripped white space. Considering these problems, I usually try to stay with markup.
The next problem is that I’m the only person who can fix mistakes in my texts. My friends tell me about typos, but I have to fix them by hand. I tried to share texts on Google Docs, but the collaboration doesn’t work well enough.
A few months ago I saw Etherpad, an online real-time editor. That’s quite a cool toy. It solves the problem of sharing the text with my friends, but it doesn’t support any markup: it’s just a plain-text editor.
But I know how to create Comet applications easily using EvServer and Django. I realized that I could build a simplified Etherpad clone, which supports a markup language!
The Etherpad clone
I decided to spend very limited time on this project. Actually, I wanted to do everything in my spare time within a week, that is, about 6 afternoons.
Features I wanted, ordered by importance:
- must support editing by many users in real time – like Etherpad
- must generate rendered markup with reasonable latency
- must support all major browsers (though IE and Konqueror are not a must)
- must be dead simple
- should be able to scale up (for reasons I’ll describe below)
- should show who created what – like Etherpad. In the end I dropped this requirement due to the limited time.
With such tight time constraints, I made some technology decisions up front:
- Python on the server side
- EvServer as a server
- HAProxy as a load balancer
- use EvServer’s Comet transports
- RabbitMQ as a message broker
- MemcacheDB as a database
- Memcache for temporary storage
- Support reStructuredText as a markup language
The hardest part of the project is synchronizing and merging updates from many clients in real time, resolving collisions, and propagating the changes to users. Fortunately, Neil Fraser from Google solved this problem and published the solution as a very nice project, diff-match-patch.
It seems that the only job I have to do is to glue these parts together.
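To give a feel for what merging concurrent edits involves, here is a minimal sketch. It deliberately uses Python’s standard difflib module as a stand-in rather than diff-match-patch itself, and the function names edits and merge are made up for this illustration. It only handles the easy case where two clients changed different parts of the same base text; a real synchronizer has to cope with much more.

```python
from difflib import SequenceMatcher


def edits(base, new):
    """Return the changes from base to new as (start, end, replacement)."""
    matcher = SequenceMatcher(None, base, new, autojunk=False)
    return [
        (i1, i2, new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def merge(base, a, b):
    """Apply the edits of two clients to their common base text.

    Illustrative only: overlapping edits raise ValueError instead of
    being resolved, and identical edits from both clients are not deduplicated.
    """
    combined = sorted(edits(base, a) + edits(base, b), key=lambda e: (e[0], e[1]))
    out, pos = [], 0
    for start, end, replacement in combined:
        if start < pos:
            raise ValueError("conflicting edits")
        out.append(base[pos:start])  # untouched text before this edit
        out.append(replacement)
        pos = end
    out.append(base[pos:])
    return "".join(out)


base = "The quick brown fox"
print(merge(base, "The quick red fox", "The quick brown fox jumps"))
# The quick red fox jumps
```

When both clients touch the same characters, this sketch simply gives up with an error. Handling that case gracefully is exactly the hard part that a dedicated library takes care of.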
A few days after Etherpad was launched I saw this dialog:
It seems that Etherpad failed to scale. I don’t expect my clone to be popular, but it would be nice to scale better, even if it’s only in theory.
So I built the application with scalability in mind. The major hub in the application is the message broker, RabbitMQ. As long as this AMQP server scales, my application will too. After all, RabbitMQ ought to be scalable: it was built exactly for that.
To synchronize user changes between browsers with low latency, the EvServer application shouldn’t do any jobs that require high CPU usage. Managing and comparing changes from browsers is quite fast. The tough part is generating the rendered output from the markup. Rendering reStructuredText to HTML can take up to a few seconds of processor time! I decided that this job should be done by an external process that can afford greater latency. Apart from that, the message flow is pretty standard for a real-time web application.
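The idea of keeping the slow rendering off the fast path can be sketched roughly like this. It is only an illustration: a standard-library queue and a thread stand in for RabbitMQ and the external renderer process, and render, render_worker and run_demo are hypothetical names, not code from the application.

```python
import html
import queue
import threading


def render(markup):
    # Placeholder for the slow step (reStructuredText -> HTML in the real app).
    return "<pre>" + html.escape(markup) + "</pre>"


def render_worker(jobs, results):
    """Consume (doc_id, markup) jobs until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        doc_id, markup = job
        results[doc_id] = render(markup)  # a later job overwrites an earlier one


def run_demo():
    jobs = queue.Queue()  # stand-in for the message broker
    results = {}
    worker = threading.Thread(target=render_worker, args=(jobs, results))
    worker.start()
    # The fast path only enqueues work; it never waits for rendering.
    jobs.put(("doc1", "Hello"))
    jobs.put(("doc1", "Hello *world*"))
    jobs.put(None)  # tell the worker to stop
    worker.join()
    return results
```

The point of the split is that the code handling user edits returns immediately after putting a job on the queue, so a slow render delays only the preview and not the synchronization between browsers.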
Burning the time
Once I had decided what I wanted to build, I was ready to start programming. I didn’t have a detailed plan, but I accomplished the task almost on time. This is roughly how the effort broke down:
- 1st evening: creating initial html view
- 3rd evening: creating the basic message flow; the latency counter started to work
- waiting at the airport and during the flight: integrating the diff-match-patch library; synchronization started to work
- 4th evening: fixing message routing in AMQP, integrating the persistent database
- all Sunday: making it work on IE and final fixes
Here you can play with the final application. It’s not especially impressive, but it works. I met my goals and proved that, with the proper tools, it’s possible to build a simplified Etherpad clone in just a few days. I must admit that creating this app was great fun.
I’m even starting to think that the application could be useful. Maybe I should consider adding more popular markup languages like Markdown, Wiki or Trac.