Why it sometimes takes a long time for a Sift page to load
Well, we don't really know why, but at least we've established that you are not all crazy or lying (my first assumption, naturally).
Here's a graph that Lucky put together. It represents the load on our two web servers. Something is making them spike almost hourly. We thought it might have been some scheduled processes, but that doesn't seem to be the case. We're still investigating and trying to find the cause, but the first step is admitting we have a problem. We have a problem.
10 Comments
If anyone has any good ideas on what might make web servers spike at a regular interval like this, we'd love to hear your crackpot theories.
Automated Google web crawling, because they recognize our premier-ness in their search categories for all vids of import.
No genuine answers though. Do the spikes link to some freak net traffic spikes or is it just their own load that has this pattern?
I know nothing about this, but it does seem automated.
A quick search for "server spike automated" over at Google brought up this:
http://www.xav.com/scripts/search/help/1177.html
Could it be that?
It could be that people go to videosift first thing after the end of whatever TV show they were watching that ended on the hour. What does the net traffic graph look like?
Putting a 15 second cooldown between searches from the same IP is a good idea regardless. Someone could easily DDoS by spamming search queries. And make sure you disallow the search page in your robots.txt.
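For what it's worth, here is a minimal sketch of that per-IP cooldown. It is written in Python purely for illustration (the site itself may well be PHP), and the `SearchCooldown` class and its `allow` method are made-up names, not anything from the Sift codebase. It keeps state in memory in a single process, so with two web servers you would need some shared store instead, and old entries are never cleaned up here.

```python
import time


class SearchCooldown:
    """Allow at most one search per IP every `cooldown` seconds (hypothetical helper)."""

    def __init__(self, cooldown=15.0, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock    # injectable so the logic can be tested without sleeping
        self.last_seen = {}   # ip -> time of the last accepted search

    def allow(self, ip):
        """Return True if this IP may search now, False if it is still cooling down."""
        now = self.clock()
        last = self.last_seen.get(ip)
        if last is not None and now - last < self.cooldown:
            # Rejected attempts do not restart the timer.
            return False
        self.last_seen[ip] = now
        return True
```

The search handler would call `allow(ip)` before running the query and send back an error page when it returns False.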
It could be Googlebot, but they usually play nice, and our new search tool is pretty lightweight and doesn't produce much load.
>> ^jwray:
It could be that people go to videosift first thing after the end of whatever TV show they were watching that ended on the hour.
Like a server version of the Super Bowl water pressure myth?
It sounds a lot like spambots or searchbots, though I can't imagine they'd be pounding your servers at such a regular interval. Just in case, you could think about globally marking your links as nofollow to keep bot indexing down.
I think you're using PHP, so you could look into a code accelerator like eAccelerator, though don't hold me accountable if your Xeon machines start spewing sparks. And if you're using MySQL, here's a nifty article on query caching which you may or may not find useful. Dunno. If it is rogue bot traffic, I've heard of people using bot traps as well.
I hope you find some of this helpful at least.
Is there any correlation between CPU load and network load? If not, then look inward. Could it be siftbot doing maintenance, checking the age of queued vids?
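If you can export both graphs as numbers sampled at the same times, a rough way to check is a Pearson correlation coefficient. This is only a sketch in Python; the `pearson` function is a hand-rolled illustration and the two input lists stand in for whatever your monitoring tool exports.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length series of samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 samples each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

A value near 1 would mean CPU load rises and falls with traffic, which points at visitors or bots. A value near 0 would suggest the spikes come from something running on the servers themselves.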
We thought it might have been our scheduled processes. It could be, though when we're watching them they never seem to be the problem. It might be our search.
You don't need to sit there and watch them. Just use a profiler. Hopefully it wouldn't add too much load, but I don't know for certain as I've never run one on a live website before.