EIC Henry Blodget recently updated and reprised his original Business Insider Full Monty article with a new presentation about the media business and BI specifically, so I thought it would be a great time to update my own BI tech infrastructure article from 2011 and detail some of the changes we’ve worked on since then.

WARNING: This is going to get technical, so hold on to your pocket protector. I’m not going to rehash, so if you haven’t read the previous BI infrastructure post, you should check that out first.
Last time I discussed our technology, BI was pushing around 80 million monthly page views, using two Varnish cache servers, four Apache web servers, and three MongoDB database servers. Traffic has more than doubled since then, and BI regularly delivers 175+ million page views a month … using the same two Varnish servers, four Apache servers, and three MongoDB servers. (All are hosted at our hosting partner, Datapipe, who, despite being based in NJ, didn’t even hiccup during Hurricane Sandy.)
Even though traffic has doubled, not much has changed in the core infrastructure. We’ve upgraded RAM across the board and moved the database storage onto SSDs, but we haven’t had to add any servers or make any major architectural changes to accommodate the growth. A high-traffic day in my last post was described as 4 million page views; these days that’s considered a slow weekend day, and a true high-traffic day clears 8 million page views easily.
Though our core hasn’t changed, the software running on it is undergoing plenty of changes. We’re refactoring our PHP code and moving from our legacy custom MVC framework to the Symfony 2 open source framework. We’ve also been steadily streamlining our editorial CMS and improving it with new features to make the editors’ lives easier.
We migrated away from the Google Search Appliance to an open source Solr server, which has been very successful at making our search results better, with great filtering and sorting options that we couldn’t offer while we were still using the GSA.
We’ve begun setting up a Jenkins server to do continuous integration, and unit test coverage is getting better as we continue integrating Symfony 2. We’ve set up Nagios to do our own internal monitoring of the network and services and help catch any hiccups right away. We’re still using Catchpoint to keep an eye on site speed and availability, which has been invaluable for spotting problems immediately.
Unfortunately, as good as it’s been, this architecture won’t withstand another doubling. At peak traffic, a single Varnish server could probably handle our current load, but it would struggle on its own. Since we need to be able to withstand a server crash, that leaves us close to having a single point of failure.
However, there are a few catches involved in simply adding a third Varnish server. Right now each of our two front-end caching servers holds a full cache of every URL on the site, and requests are balanced between them randomly. Each time a post or vertical page is purged, the Apache and MongoDB backend needs to generate two new copies, one for each server. If we simply add a third server, our backend will need to generate a third copy, increasing the load on the back end by 50 percent. Needless to say, that’s the opposite of what we want.
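To make that arithmetic concrete, here is a minimal Python sketch of the purge behavior described above. It is a toy model, not our actual Varnish or PHP setup: the `CacheServer`, `Backend`, and `renders_after_purge` names are hypothetical, and it assumes every cache server holds its own full copy of a page and that requests are spread randomly across the servers.

```python
import random


class Backend:
    """Stands in for the Apache + MongoDB tier; counts page renders."""

    def __init__(self):
        self.renders = 0

    def render(self, url):
        self.renders += 1
        return f"<html>{url}</html>"


class CacheServer:
    """Toy full-copy cache: each server stores its own copy of each URL."""

    def __init__(self):
        self.store = {}

    def get(self, url, backend):
        # On a miss, this server asks the backend for a fresh copy.
        if url not in self.store:
            self.store[url] = backend.render(url)
        return self.store[url]

    def purge(self, url):
        self.store.pop(url, None)


def renders_after_purge(num_caches, requests=1000, seed=0):
    """Count backend renders caused by purging one URL on every cache."""
    rng = random.Random(seed)  # fixed seed keeps the run repeatable
    backend = Backend()
    caches = [CacheServer() for _ in range(num_caches)]
    url = "/some-post"  # hypothetical URL

    # Warm every cache, then reset the counter so only post-purge
    # renders are measured.
    for cache in caches:
        cache.get(url, backend)
    backend.renders = 0

    # Purge everywhere, as happens when a post is updated.
    for cache in caches:
        cache.purge(url)

    # Random balancing: each request lands on an arbitrary cache server.
    for _ in range(requests):
        rng.choice(caches).get(url, backend)

    return backend.renders


two = renders_after_purge(2)
three = renders_after_purge(3)
print(two, three, f"{(three - two) / two:.0%} more backend renders")
```

In this model a purge costs one backend render per cache server (2 with two servers, 3 with three), so going from two servers to three raises the regeneration work by half, no matter how much traffic follows the purge.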
So how will we get around this issue? I have a few possible ideas cooking, but you’ll have to wait for my follow-up article to find out how we ultimately decide to solve it.