Wednesday, December 09, 2009

High performance, high traffic websites

While looking at performance in eCommerce and large social sites on the web, I came across a couple of surprising profiling reports. The most impressive to me, so far at least, is that of Plenty of Fish, a very popular free online dating site that has been valued at close to 1 billion USD. This in itself is not that surprising in the big stakes of online systems. What is surprising is the number of hits and users their site gets and, more importantly, just how little hardware it runs on and how simple its architecture is. It's truly beautiful in this day and age of bloat and the high expense of implementing server arrays and grids.

Here are some basic stats:

Plenty of Fish gets approximately 1.2 billion page views a month, with an average of 500,000 unique logins per day. That's not bad at all.

The system deals out 30+ million hits a day, around 500-600 hits a second. Now that is very impressive!

The technology used is ASP.NET at its core. The server setup can apparently deal with 2 million page views an hour under stress.
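As a quick sanity check on those numbers, here is a rough Python sketch that turns the quoted figures into per-second rates. The 30-day month and the idea that traffic is spread evenly across the day are my own simplifying assumptions, not anything Plenty of Fish has published, so the results are averages only.

```python
# Convert the quoted traffic figures into average per-second rates.
# Assumptions (mine, for illustration): a 30-day month and perfectly even traffic.
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
DAYS_PER_MONTH = 30


def per_second(count, period_seconds):
    """Average events per second for `count` events spread over `period_seconds`."""
    return count / period_seconds


monthly_page_views = 1_200_000_000      # "1.2 billion page views a month"
daily_hits = 30_000_000                 # "30+ million hits a day"
stress_page_views_per_hour = 2_000_000  # "2 million page views an hour under stress"

avg_page_views_per_sec = per_second(monthly_page_views, DAYS_PER_MONTH * SECONDS_PER_DAY)
avg_hits_per_sec = per_second(daily_hits, SECONDS_PER_DAY)
stress_page_views_per_sec = per_second(stress_page_views_per_hour, SECONDS_PER_HOUR)

print(f"Average page views/sec: {avg_page_views_per_sec:.0f}")
print(f"Average hits/sec:       {avg_hits_per_sec:.0f}")
print(f"Stress page views/sec:  {stress_page_views_per_sec:.0f}")
```

On these assumptions, 30 million hits a day averages out to roughly 347 a second, while the 2 million page views an hour stress figure works out to roughly 556 a second. So the 500-600 a second quoted above looks more like a peak or stress rate than an all-day average.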

Now the hardware, and this is quite impressive:

2 load-balanced web servers (to get past the IIS restriction of 64,000 simultaneous connections).

3 database servers: a main account server and 2 others used for searching and, at a guess, profiling of data. (I do suspect these are seriously loaded-up 'Iron Servers', which in some cases can easily cost anywhere between 50-100K GBP.)

Despite all of this, it's pretty impressive that a site can reach that level of performance using so little hardware.

Over the next week or two I will explore similar architectures, and also the additional application of cloud/parallel processing across a service bus, to see if anything can get close to those stats (although I don't have that hardware, some things can be approximated). I do think a virtual cloud running on a suitable batch of 5 decent machines should be able to deal with 200-500K web requests an hour with a moderately complex, media-rich (images, not video), dynamically generated system. I will also be looking at the issue of transactions over boundaries, as this is where I see the biggest hit on performance (database clusters/clouds accessed by cloud or service applications, for example .NET applications, or Java clients if REST is used).

I will also look at the issue of scaling out vs. scaling up the physical architecture.
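To put the 200-500K an hour target in perspective, here is a minimal Python sketch of what each of the 5 machines would need to sustain. It assumes the load splits evenly across the machines and arrives at a steady rate, which is a simplification on my part; real traffic will be burstier.

```python
# Rough per-machine load for the 200-500K requests/hour target on 5 machines.
# Assumption (mine, for illustration): load is split evenly and arrives at a steady rate.
SECONDS_PER_HOUR = 60 * 60
MACHINES = 5


def per_machine_rate(requests_per_hour, machines=MACHINES):
    """Average requests per second each machine must handle under an even split."""
    return requests_per_hour / machines / SECONDS_PER_HOUR


low_target = per_machine_rate(200_000)
high_target = per_machine_rate(500_000)

print(f"Low target:  {low_target:.1f} requests/sec per machine")
print(f"High target: {high_target:.1f} requests/sec per machine")
```

That comes to somewhere between about 11 and 28 requests a second per machine on average, so the interesting question is less the raw rate and more what each request costs once it crosses a boundary to the database.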
