
This is a free-flowing discussion for the IT folks in our community, along with some friendly questions for ct.

I'm a professional computer programmer specializing in Ruby on Rails.

I'm a little curious about how you're going to manage pageviews tonight.

Warning: This is some really complicated stuff, so if you don't understand what's being discussed, feel free to crack a couple of Ted Stevens jokes.

More after the flip

Now I'm scanning the earlier diaries you wrote, ct, and I'm picking up some interesting information.

The front-facing proxy is lighttpd, with your custom memcached enhancements.

Do you have a failover setup in place?

How big is the memcached server? A gig? Is there just one memcached instance, or are there several spread across your app servers?

I think I read somewhere that Facebook has the biggest memcached installation anywhere, at (I could be really wrong) 16 TB.

Using Daily Kos as an example, the lighttpd proxy server sits in front of the mod_perl apache that runs Scoop. When a request is made, lighttpd checks to see if the request does not have a session cookie. If it doesn't, it sees if the URI matches the pattern of URIs to check. If it does, it goes to a mod_magnet lua script that queries the memcached server for the page. If it's present, memcached returns the page, lighttpd gunzips it if necessary, sets the content type, and returns it to the user. If the page is not present in memcached, the request proceeds to the backend. There, the page is made, and if the request is an anonymous one and fits the same pattern of URIs that lighttpd looks in, it places a copy of the page into memcached before serving it up to the user. While that page is active in memcached, any of the other webservers can retrieve the page, saving them the work of regenerating it themselves.
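To check that I'm reading that flow right, here's a rough Python sketch of it. This is not ct's code: the real lookup is a mod_magnet Lua script in lighttpd and the store happens in Scoop on the backend, while this collapses both into one function. The `CACHEABLE` pattern, the `cache` dict standing in for memcached, and the `render_on_backend` helper are all made up for illustration.

```python
import gzip
import re

# Hypothetical pattern of cacheable URIs; the real one is not given in the diary.
CACHEABLE = re.compile(r"^/story/")

cache = {}         # stand-in for memcached: URI -> gzipped page bytes
backend_hits = []  # records every URI that had to be rendered by the backend


def render_on_backend(uri):
    """Stand-in for mod_perl/Scoop building the page."""
    backend_hits.append(uri)
    return ("<html>page for %s</html>" % uri).encode("utf-8")


def handle(uri, has_session_cookie):
    """Serve one request, using the cache only for anonymous, pattern-matching URIs."""
    cacheable = not has_session_cookie and CACHEABLE.match(uri) is not None

    if cacheable:
        cached = cache.get(uri)
        if cached is not None:
            # Cache hit: gunzip and return without touching the backend.
            return gzip.decompress(cached)

    # Cache miss, logged-in user, or non-cacheable URI: go to the backend.
    body = render_on_backend(uri)

    if cacheable:
        # Store a gzipped copy so the next anonymous request is a hit.
        cache[uri] = gzip.compress(body)
    return body
```

In this sketch, two anonymous requests for the same story reach the backend once, while a request carrying a session cookie always goes through to the backend and never populates the cache.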

A wise man once said that the hardest things in computer science are naming things and expiring caches. Was the cache expiration logic a pain to set up?

How many apache/mod_perl instances are you running on each quad core Xeon server?

For the new webservers, we got six quad core Xeons, each with 8GB RAM and an 80GB SATA disk for logs and such. After much trial, tribulation, and confusion with both nfsroot and iSCSI, they were finally set up with an nfsroot served up from a Sun x4500. This way, they all share one root filesystem for ease of maintenance. Plus, if required we can throw extra machines into the pool and they'll come right up, even configuring swap space if it isn't there.

I'm not entirely familiar with nfsroot. IIRC, it's roughly the equivalent of a Samba share, but older and better. Would a SAN be a good idea for shared disks? Or how about ZFS on Solaris?

The new database machines are eight core (or, as I like to say, OCTOCORE) Xeons with 16GB RAM, one 73GB disk for the OS, one 73GB disk dedicated to tmp, and a 6x73 GB RAID-10 for the database files (and with tmp and the db RAID each having a finely tuned XFS filesystem set up on them). Setting those machines up was easier than the webservers, except for the time involved in loading all the data onto them and getting kicked in the head with this MySQL bug, necessitating me upgrading all the MySQL servers to 5.0.51. For the database servers, I'm running a 64 bit Debian etch and the icc compiled MySQL 5.0.51 server. The difference between the icc and gcc versions of MySQL doesn't seem to be too extreme, but I'm keeping icc for the moment anyway.

I just bought the O'Reilly High Performance MySQL book. How are you handling replication between the DBs?

So, if I get this right, you custom-compiled MySQL 5 on Debian etch?
What does icc do better than gcc for the requirements of this site?

ct, I know these are a ton of questions. If you have time to answer them, great. If not, good work keeping DKos up and running.

More on my background:

I'm a Rails programmer. I'm changing my stack from Apache2/mod_proxy/mongrel_cluster to NGINX/HAProxy/Thin. I use MySQL for my database, and at some point I'm going to use memcached. I deploy on Ubuntu, and just got the God process monitoring framework to work on Ubuntu Intrepid... I can't get the damn thing to work on Hardy due to a kernel issue, and I can't change the kernel on Slicehost...

Once this is all over, I strongly recommend that you take a look at HAProxy. It works well with Rails apps because it makes sure that only one request hits an instance of Mongrel or Thin at a time. Rails isn't threadsafe, and has some nasty mutex lock problems that they're just beginning to resolve. Even so, I'm using Thin, an evented web server, which is great.
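For anyone wondering what "only one request per instance at a time" means in practice, here's a toy Python model of the idea. As I understand it, HAProxy does this with a per-server `maxconn` limit and queues the overflow; the `Balancer` class and the backend names below are purely hypothetical and have nothing to do with HAProxy's actual internals.

```python
from collections import deque


class Balancer:
    """Toy dispatcher: at most `maxconn` in-flight requests per backend."""

    def __init__(self, backends, maxconn=1):
        self.maxconn = maxconn
        self.in_flight = {name: 0 for name in backends}
        self.queue = deque()  # requests waiting for a free backend

    def submit(self, request):
        """Return the backend that takes the request, or None if it was queued."""
        for name, count in self.in_flight.items():
            if count < self.maxconn:
                self.in_flight[name] = count + 1
                return name
        self.queue.append(request)
        return None

    def finish(self, name):
        """Free a slot on `name`; hand it to the oldest queued request, if any.

        Returns (request, backend) for the request that was dequeued, else None.
        """
        self.in_flight[name] -= 1
        if self.queue:
            request = self.queue.popleft()
            self.in_flight[name] += 1
            return (request, name)
        return None
```

With two Thin instances, a third simultaneous request waits in the balancer's queue instead of piling up behind a slow request inside one of the app servers, and it gets dispatched as soon as either instance frees up.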

Thanks for all the good work!

Originally posted to sjgman9 on Tue Nov 04, 2008 at 08:38 AM PST.
