Google Talk and scalability
A talk here by a Google Talk guy:
I didn’t pay close attention, but some things caught my notice.
Google Talk started as a standalone application and became embeddable in Gmail, Orkut, iGoogle (the personalised homepage), usable from cellphones, and so on. This is no mean feat, and shows that modularity and reusability are not unattainable ideals.
It also has important lessons in scalability. Questions like “how many IM messages do you deliver?” or “how many users do you have?” might be relevant from the perspective of the product’s success, but they are not the right measure from an engineering perspective. Most of the packets on the network are presence packets, and this is the number of users × the number of buddies they have, which does not grow linearly with the number of users (think integration into Orkut).
Before deploying into Orkut, they did real-life load testing with a “backend launch” — Orkut started fetching presence status from Google Talk several weeks before launch (starting slowly from 1% of Orkut page views), without showing anything in the UI. With enough confidence and some bugs fixed, the integration was finally made visible. They did something similar with Gmail.
Sharding and re-sharding: Different users are allocated to different servers, and this can be changed easily too.
Modularity etc: Different parts (like Orkut and Gmail) know very little about each other, and interact using the same interface that the rest of the world uses, so one can be changed easily without affecting the other.
Not afraid of going low-level (TCP, epoll kernel calls, etc!)