Hi, I’m Andrew, and I’ve got a problem. I’m one of two System and Network Admins here at DomainTools and because we’re growing, we have enough work for three! Actually we had enough work for three last year but they wouldn’t let me hire anybody 🙂 So I thought I’d write a quick blog post with some fun facts about our network and systems, and use it as a plug for a new role we are hiring for at DomainTools.
Providing the services we do requires a major amount of work. Gathering the data we do and processing it is a complicated job, and while we’re nowhere near a Google or Facebook, we need a good bit of computing power to handle things. We handle just short of a terabyte of data per day. This data is pulled in as zone files, whois records, and DNS/IP addresses, as well as screenshots and text for websites we know about. It must then be analyzed in relation to our existing data to see what’s new, what’s changed, what’s stayed the same, and what has expired.
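To give a rough idea of what that comparison looks like, here is a minimal sketch in Python. It is not our production code: the function name `classify_snapshots`, the dictionary layout, and the example domains are all made up for illustration, and it assumes that a record missing from the latest snapshot can simply be treated as expired.

```python
def classify_snapshots(previous, current):
    """Compare two snapshots, each a dict mapping a domain name to its record.

    Hypothetical helper for illustration only. A domain that is absent from
    `current` is treated as expired, which is a simplifying assumption.
    """
    prev_keys = set(previous)
    curr_keys = set(current)
    shared = prev_keys & curr_keys
    return {
        "new": sorted(curr_keys - prev_keys),
        "expired": sorted(prev_keys - curr_keys),
        "changed": sorted(k for k in shared if previous[k] != current[k]),
        "unchanged": sorted(k for k in shared if previous[k] == current[k]),
    }


# Made-up example data: domain name -> name server.
yesterday = {
    "example.com": "ns1.example.net",
    "example.org": "ns1.example.net",
    "example.net": "ns2.example.net",
}
today = {
    "example.com": "ns1.example.net",   # same record as yesterday
    "example.org": "ns9.example.net",   # name server changed
    "example.edu": "ns1.example.net",   # not seen yesterday
}

result = classify_snapshots(yesterday, today)
for label, domains in result.items():
    print(label, domains)
```

At our volumes the real work can’t be done with in-memory dictionaries like this, but the four buckets are the same ones described above.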
All of this work is what underpins our websites, such as domaintools.com, dailychanges.com, and screenshots.com, as well as all the daily emails and reports that we produce. We currently handle over 12 million page views per day, and according to Alexa we rank higher than the NFL, NBA, and AT&T. This means we have to be just as robust and redundant as any of those sites.
Maintaining that level of reliability while handling that much processing on a daily basis requires a lot of horsepower. At present we are running over 100 hardware systems, and roughly a quarter of those hardware nodes host our 170-plus virtual machines. These systems run a staggering array of services: MySQL, memcache, beanstalkd, sphinx, hadoop, postfix, apache, lighttpd, and haproxy are all run at various points in our production environment. We also have several services we’ve developed in house, because when they were created we weren’t able to find anything that provided answers fast enough out of a data set as large as ours. Our codebase is just as diverse, with code in C, C++, C#, Java, Perl, Python, PHP, and Ruby all in use, in a number of cases tied together with shell scripts. As if all this wasn’t enough, when the company was founded it was decided to use FreeBSD to run the servers. Over the years, however, we’ve found Linux to work better for us, so most of our systems now run Linux, with a few legacy systems still running FreeBSD just to keep everyone on their toes. Technical specialists need not apply here; to work at DomainTools you need to be a generalist, have a solid understanding of computing fundamentals, and be ready to deal with large data sets.
Of course we also use tools internally to keep all this organized. We use cfengine to provide configuration management across our servers and Git to manage our code base. There is an internal wiki for technical documentation. And we use LDAP, XMPP (jabber), and asterisk to manage employee accounts and provide communication channels within the company and with the world at large.
Looking at the projects for the coming year, things are only going to get bigger. We’re planning to offer more products and more services to more clients and more users, and the infrastructure has to grow to handle all the new activity. So I’m going to sign off and get back to keeping things running as smoothly as possible. But not before I leave you all with a link to the job posting for DevOps, a new and unique role at DomainTools that reports to our Director of Technology but works closely with our super awesome 10-person Engineering Team. If this sounds interesting to you or someone you know (in Seattle), please pass them this link. Thank you!