SPAM, blackout, overload, missing comments and captcha

There was another blackout. The official reason – excessive SPAM led to CPU/memory overload and overusage.

Not sure if it indeed was the problem (I think it was overloaded MySQL), but the amount of SPAM the site has been getting since October last year is crazy (see the graph). Blocking IPs does help a bit, but at this rate I’m afraid I will have to block the whole Internet within a couple of years.

As you may have noticed, I made commenting as easy as possible. No registration, no user name, no email required. It’s only the text I’m after. However, some legit comments are wrongly labelled by Akismet as SPAM (false positives). Usually I go down the list of SPAM messages and check them manually, but sometimes I’m too late, and sometimes I miss the “ham” (Akismet jargon for a legitimate comment). So, if you think that your comment is missing, please resend it (apologies for the trouble).

The current solution is to use Captcha. It’s not “prettified” yet, as I still need to change the page styles, but it’s already working. Try it and let me know (in the forum, via email or see Contact).

If it does not work and the performance is still suffering, I will have to change the hosting plan. But that will cost me $528 for two years ($22 a month).

15 thoughts on “SPAM, blackout, overload, missing comments and captcha”

  1. I got an internal server error on the first attempt to post. Let’s see, if this works better now.

    Hmmm, same captcha code as before…

  2. blackduck wrote:

    what hardware does your server have at the moment?

    No idea, it’s a shared server with possibly lots of other virtual web servers. Going “the business way” will allow for more available resources (but still no dedicated server).

  3. Strappado wrote:

    an internal server error

    As I said: high utilization…

    # top

    top - 13:34:59 up 21 days, 10:28, 1 user, load average: 7.21, 6.64, 6.27
    Tasks: 594 total, 14 running, 576 sleeping, 0 stopped, 4 zombie
    Cpu(s): 25.9%us, 16.5%sy, 0.0%ni, 41.3%id, 16.0%wa, 0.0%hi, 0.2%si, 0.0%st
    Mem: 6095340k total, 5728816k used, 366524k free, 245108k buffers
    Swap: 2096440k total, 476084k used, 1620356k free, 3844208k cached

    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    6183 mysql 11 -5 467m 93m 5160 S 99.6 1.6 4385:24 mysqld
    18716 64320 15 -5 191m 22m 8772 S 75.4 0.4 0:04.66 php
    21487 68923 15 -5 301m 133m 8748 R 13.6 2.2 0:00.41 php
    21490 like_ra 11 -5 212m 44m 8596 D 9.9 0.8 0:00.30 php
    21475 like_ra 11 -5 195m 27m 8388 R 5.0 0.5 0:00.15 php

    MySQL is overloaded as always, I bet this is the bottleneck.

    Strappado wrote:

    Let’s see, if this works better now.

    You could simply “reload” or “shift+reload”. It usually works for me.

    Strappado wrote:

    Hmmm, same captcha code as before…

    Strange, but I don’t know how the plugin works. Probably the page was not actually reloaded.

    But at least it works, and I’m getting much less SPAM.

  4. Just counted – about 50 websites are hosted on this box. Hence the performance. But to upgrade the hosting plan I need $528 …

  5. Do you know how to run a server yourself? You can get a VPS at Linode.com for $20/month that would handle a blog like this without any load at all. Once you get off shared hosting you’ll never go back.

  6. VPS does not mean “not shared”. It’s still shared, but you do not see what’s taking all the CPU ;-D

    How do I get my fair share of CPU?

    We limit the number of Linodes placed on each host machine. We also only place one plan type on each host. In the worst-case scenario, you’re splitting CPU time evenly with your fellow Linoders, but are still able to use the full potential of the host if others are idle.

    Basically, I see no advantages. It’s $2 cheaper ($20 vs $22), but has a traffic limitation. Currently I have no limits, I can see what’s going on with the box, and I log a ticket if something is slow or not working. The problem is – too many servers on one box. Upgrading to the business plan ($22 per month if paid for 2 years) will reduce this number.

  7. A VPS is (in my experience) significantly better than shared hosting. My needs eventually outgrew both shared hosting and a VPS – it made sense to get a dedicated server. If you know where to look, even dedicated servers really aren’t that expensive.
    It might be worth taking a look at http://leaseweb.com – they have a nice range.

    If cash is an issue, I’m sure a few of us can hit the Donate button in the side menu.

  8. nc wrote:

    it made sense to get a dedicated server.

    Of course, but it depends on who pays 😉

    nc wrote:

    I’m sure a few of us can hit the Donate button in the side menu.

    This is the biggest problem 😉

  9. OK, it is VPS now. The load exceeded the Business plan limit. That means even more money.

    As a side effect the site is not reachable from my provider.

  10. Do you mind if I ask how much traffic you receive? An average + peak would be all – I might be able to offer you some alternative hosting cheaper than $20/22 per month, guess it just depends on the traffic levels!

  11. Something like 100GB/month. I expect this value to grow in the next months.

    The package includes 40GB disk space (not an issue for the time being), 1GB dedicated memory, 2 core CPU, unlimited traffic, extremely good 24×7 support. $50 per month or $40 if paid for the whole year (also I have some discount). To be honest, I’m pretty happy with this provider. It’s not cheap, though.

  12. Hmm, I suppose that is a reasonably good price. I would consider moving away from shared hosting, though. Although a VPS is “sharing” resources with other virtual machines on the host server, the sharing is generally much better controlled than on a directly shared host. I wouldn’t have thought CPU would be an issue for a site like this; it will be disk access and memory usage.

  13. Already moved to VPS (hence the downtime). Currently struggling with performance tuning. The problem with VPS is that you never know where the actual bottleneck is. I do not like the response time.

    Yes, you are right about the disk. The disk is always a bottleneck. And to make use of cache you need more memory. And the longer the service time, the more requests are waiting in the queue, the more memory is used, etc…

    The bottom line – the actual bottleneck is the amount of money you can afford to pay ;-D
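
For anyone who wants to check the “MySQL is the bottleneck” guess from comment 3 against the numbers, here is a minimal sketch that parses the five process rows from that top output and sums %CPU per command. The function names (parse_rows, cpu_by_command) are made up for illustration, and the column positions assume the exact layout shown in that comment (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND). A single snapshot like this is only a hint, not a proper measurement.

```python
from collections import defaultdict

# Process rows copied from the top output quoted in comment 3.
TOP_ROWS = """\
6183 mysql 11 -5 467m 93m 5160 S 99.6 1.6 4385:24 mysqld
18716 64320 15 -5 191m 22m 8772 S 75.4 0.4 0:04.66 php
21487 68923 15 -5 301m 133m 8748 R 13.6 2.2 0:00.41 php
21490 like_ra 11 -5 212m 44m 8596 D 9.9 0.8 0:00.30 php
21475 like_ra 11 -5 195m 27m 8388 R 5.0 0.5 0:00.15 php
"""


def parse_rows(text):
    """Return a (pid, command, cpu_percent) tuple for each complete row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 12:
            continue  # skip blank or truncated lines
        # Assumed column order:
        # PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        rows.append((int(fields[0]), fields[11], float(fields[8])))
    return rows


def cpu_by_command(rows):
    """Sum the %CPU column per command name."""
    totals = defaultdict(float)
    for _pid, command, cpu in rows:
        totals[command] += cpu
    return dict(totals)


rows = parse_rows(TOP_ROWS)
busiest = max(rows, key=lambda row: row[2])
totals = cpu_by_command(rows)

print("busiest single process:", busiest)
for command, cpu in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{command}: {cpu:.1f}% CPU")
```

In this snapshot mysqld is the busiest single process, while the four php processes together add up to a comparable share. The TIME+ column points the same way as the guess in comment 3: mysqld has accumulated far more CPU time than the short-lived php workers.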
