Tag Archives: hosting

Attack of the Chinese Comment Spam Robots

First, if you’re reading this through FeedBurner, you may have noticed a little “Payday Loan” thing right below the title of past posts. Hopefully that is dead now. I’ll explain what’s going on with that in a minute. But if you see it above, please let me know!

I’m going to get a little geeky here, so skip to the headlines if you just want the big picture. You may have read my article What’s In a Website? It’s not all puppies and butterflies to set up a website and keep it running. That was proved all the more last week.

Holy Smoking Internet

Below is a graph of the bandwidth (the number of megabytes of data) that was consumed by visitors to our little niche in cyberspace. I suspect that the “zeroes” here are due to throttling by GoDaddy, not actually spots where there was no traffic.  You see those spikes? Those are ROBOTS. It’s obvious the attacks began around 5/10, but didn’t reach a crescendo until 5/22 (as you can see on the next graph).

The attack begins.

But it’s about to get ugly:

Clever but seriously demented people have written programs that run about the web, find blogs that accept comments, and then automatically add drivel to hawk their worthless (and often scammy) products. They also hope that by adding their links to more sites, their rank in search engines will go up. How else are you going to learn about Viagra and fancy watches? The comments are sometimes blatant:

soccer cleats for sale… quality. These cheap and quality nike air maxConcords are my favorite in life. You also give me good service, thank you. When I saw the nike air max Concords here, I know they are good and can be the best choice. My friend told me to buy nike air max …

Sometimes truly stupid:

replica watches… checked out demonstrate just about all invisible records within machine nonetheless absolutely no htaccess record. my spouse and i dont buy it. this is certainly insane, very much assist desired in this article. all i wish to accomplish will be be capa…

And sometimes sneaky:

Pretty nice post. I just stumbled upon your weblog and wished to say that I’ve really enjoyed browsing your blog posts. In any case I will be subscribing to your feed and I hope you write again soon!…

Huge banks of zombified machines do this comment spamming on a massive scale. We had 545,611 “hits” in a 5-day period. Those hits consumed 15.96 gigabytes of bandwidth. Of those 545,611 hits, 372,000 were robots! Each was trying to comment on one of my old blog articles, and collectively they slurped up 11.6 gigabytes of bandwidth. 72% of all that traffic went to Chinese servers. Talk about trade imbalance! We didn’t know we were that popular abroad. There was also that guy in Poland who tried super hard to crack into our site through the login portal. He/it was turned away about 4,000 times in one day.
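
As a quick sanity check on those numbers, the shares can be worked out in a few lines of Python. This is only a sketch using the figures quoted above; the variable and function names are made up for illustration, and the post does not say how the 72% figure was derived.

```python
# Figures quoted in the paragraph above.
total_hits = 545_611
robot_hits = 372_000
total_gb = 15.96
robot_gb = 11.6


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


robot_hit_share = share(robot_hits, total_hits)
robot_bandwidth_share = share(robot_gb, total_gb)

print(f"robot share of hits:      {robot_hit_share}%")
print(f"robot share of bandwidth: {robot_bandwidth_share}%")
```

Robots come out at about 68% of the hits and roughly 73% of the bandwidth, which is in the same ballpark as the 72% quoted above.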

Unfortunately the extreme load caused GoDaddy, our hosting provider, to shut us down and hold us hostage.  It’s a good thing I know a little bit about networking… otherwise they probably would never have turned us back on. This all happened right when I launched the initiative to raise money for the Oklahoma Tornado victims. Talk about stress!  I felt like I was facing my own tornado – this one made in China. Fortunately the destruction was nothing like what my friends in Oklahoma endured.

This was the second time that the GoDaddy strategy was to punish the innocent. I won’t bother with all the details – I’ll just let you know that we’re folding up shop on GoDaddy and moving to HostGator real soon now. Hopefully it will be mostly transparent to you.

Countermeasures

With the thousands of assaults on the one blog article, it was obvious that I had to fix it. The problem is that WordPress incurs quite a bit of overhead to serve a particular article – it has to find it in the database, format all the ancillary content and then spit out all the parts of the page. That overhead was crushing the server, so I had to eliminate it.

I first took a look at where all the hits were coming from. I needed to shed most of the traffic, so I used .htaccess instructions to deny a large range of network addresses from China. If you’re in China on one of those subnets you still won’t be able to read my pages. So there! Hah!

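<Limit GET POST>
# With "order allow,deny" the allow rules are evaluated first, then the
# deny rules; a request that matches a deny rule is rejected.
order allow,deny
# - Chop the balls off of an intruder from triolan.net
deny from 178.151.216.53
# - Cut the balls off of the Chinese Spam bots-
# A partial address (trailing dot) blocks the whole range, e.g. 110.85.x.x
deny from 110.85.
deny from 110.86.
deny from 121.205.
deny from 117.26.
deny from 218.86.
deny from 27.153.
allow from all
</Limit>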
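
To see what those partial addresses do, here is a small Python sketch of the matching idea. It is a simplification written for illustration, not how Apache is implemented: a rule ending in a dot is treated as "any address starting with these octets", and a full address must match exactly. The name is_denied and the sample addresses passed to it are hypothetical.

```python
# Deny rules copied from the .htaccess block above.
DENY_RULES = [
    "178.151.216.53",
    "110.85.",
    "110.86.",
    "121.205.",
    "117.26.",
    "218.86.",
    "27.153.",
]


def is_denied(ip: str) -> bool:
    """Return True if the address matches any deny rule."""
    for rule in DENY_RULES:
        if rule.endswith("."):
            # Partial address: compare the leading octets only.
            if ip.startswith(rule):
                return True
        elif ip == rule:
            # Full address: must match exactly.
            return True
    return False


for address in ("110.85.1.2", "110.8.5.1", "178.151.216.53"):
    print(address, "denied" if is_denied(address) else "allowed")
```

Because the trailing dot is part of the prefix, 110.85. covers 110.85.x.x but not 110.8.x.x.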

Then I wrote a very simple page that only says “Go away jerk”. Nothing fancy there.

WordPress relies on .htaccess rules to help serve content on my blog. Here are two important .htaccess conditions. What they mean is simply: if the “path” is actually a file or a directory, then serve that path; otherwise hand the work off to the scripts that do all the processing.

RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

Since the url that the robots were hitting was
blog.starcircleacademy.com/2011/03/driven-to-abstraction

All I really needed to do was to create a directory path, 2011/03/driven-to-abstraction, and add an index.html file in there. WordPress is thus bypassed, and all I need to do to restore access is to delete the folders.
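
The effect of those two conditions, and of the directory trick, can be mimicked in a few lines of Python. This is a rough sketch under stated assumptions, not Apache's actual logic: handled_by is a hypothetical helper, the document root is a throwaway temporary folder, and the second URL is invented for the comparison.

```python
import tempfile
from pathlib import Path


def handled_by(docroot: Path, url_path: str) -> str:
    """Mimic the two RewriteCond tests: a path that is a real file or
    directory is served directly; anything else goes to WordPress."""
    target = docroot / url_path.strip("/")
    if target.is_file() or target.is_dir():
        return "static"
    return "wordpress"


with tempfile.TemporaryDirectory() as tmp:
    docroot = Path(tmp)
    url = "/2011/03/driven-to-abstraction"

    # Before the trick: no such directory exists, so WordPress does the work.
    before = handled_by(docroot, url)

    # The trick: create the directory path and drop an index.html into it.
    page_dir = docroot / "2011" / "03" / "driven-to-abstraction"
    page_dir.mkdir(parents=True)
    (page_dir / "index.html").write_text("Go away jerk")

    after = handled_by(docroot, url)
    other = handled_by(docroot, "/2012/01/some-other-post")

print(before, after, other)
```

Only the one hammered URL is taken away from WordPress; every other article still goes through the normal processing.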

However, I first played with redirecting the traffic by putting this in the .htaccess at the root of my website(s).

RedirectMatch 301 /2011/03/driven-to-abstraction(.*) http://theamusing.com/badrobo/jerk.html

I noticed that most of the robots that were generating the spam hits were following the redirects.  If they keep this up I’ll redirect them to 127.0.0.1 in which case they’ll be asking themselves for content – and burning their own bandwidth!

RedirectMatch 301 /2011/03/driven-to-abstraction(.*) http://127.0.0.1:8676/youareanidiot
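
RedirectMatch treats its first argument as a regular expression, so the rule catches the article URL plus anything tacked on after it. A minimal Python sketch of that matching, assuming a plain unanchored regex search (redirect_for is a hypothetical helper, and the sample paths are invented):

```python
import re

# Pattern and target taken from the first RedirectMatch line above.
PATTERN = re.compile(r"/2011/03/driven-to-abstraction(.*)")
TARGET = "http://theamusing.com/badrobo/jerk.html"


def redirect_for(url_path: str):
    """Return (status, location) if the path matches the rule, else None."""
    if PATTERN.search(url_path):
        return 301, TARGET
    return None


for path in (
    "/2011/03/driven-to-abstraction",
    "/2011/03/driven-to-abstraction/comment-page-2",
    "/2011/04/another-post",
):
    print(path, "->", redirect_for(path))
```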

I’ve also thought about pointing them at those damn loan sites that have been infecting many webs.

Payday Loan Hack

Now getting back to that “payday loan” garbage. The problem is that GoDaddy’s servers are not secure – they are vulnerable to attack, especially via “sneaky files” designed to commandeer some aspects of the blog and surreptitiously insert their own drivel. In this case, the insertion was only visible to those who either looked at the HTML code or happened to read from email. Why? Because only Google’s FeedBurner exposed the otherwise invisible spam. The sneaky code has infected THOUSANDS of websites. Don’t believe me? I found 700+ infected sites with a simple Google search. All the ones I checked are hosted on GoDaddy! And please note that the search I did – trying to find news of a cure – only includes sites that contain both the subversive ads AND the word malware. Guess how many hits there are for this search: t0inpaydayloans.com xmlrpc. Over 4,000! And that hawked site is only one of THOUSANDS of similar sites.

Get your Geek On

I used a number of tools to hunt down the invasion. There is a WordPress plugin called “Exploit Scanner” which gave me so many false positives at first that I had to drop back and do some cleanup. Most of the false positives were related to our store, WordPress E-Cart. A great free site that helped me is Sucuri.net. I could kiss them. They even offer a reasonably priced plan where they will regularly scan your site and fix any problems. It was tempting, except the geek in me wanted to find the source of the garbage.

Boy was it well hidden.  Doubly encrypted – it had to be because there is plenty of advice out on the network warning about the presence of code like this: base64_decode(

This attacker hid his code well. But not well enough!
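
For the plain, unhidden kind of injection, a naive scan is easy to sketch in Python. This is an illustration only, not the tool used here: the list of suspicious markers is an assumption, the two sample files are invented, and code that has been encoded a second time (as described above) would slip straight past a text search like this.

```python
import tempfile
from pathlib import Path

# Markers often mentioned in malware advice; this list is an assumption.
SUSPICIOUS = ("base64_decode(", "eval(", "gzinflate(")


def scan(root: Path) -> dict:
    """Return {relative path: [markers found]} for PHP files under root."""
    findings = {}
    for path in sorted(root.rglob("*.php")):
        text = path.read_text(errors="ignore")
        hits = [marker for marker in SUSPICIOUS if marker in text]
        if hits:
            findings[str(path.relative_to(root))] = hits
    return findings


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Two made-up files: one harmless, one with an obvious payload.
    (root / "clean.php").write_text("<?php echo 'hello'; ?>")
    (root / "odd.php").write_text("<?php eval(base64_decode('ZWNobyAxOw==')); ?>")
    report = scan(root)

print(report)
```

Legitimate plugins use these functions too, so expect false positives and treat every match as a lead to check by hand.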

We’ll be back on topic with the next article. Promise.

What’s In a Website?

Because I’m often asked, I thought I’d take a moment or two to describe how I’ve set up a website and what pitfalls I face. This is NOT meant to be a primer on how to set up a website. And to be frank, unless you’re willing to pay me, I am NOT offering to assist you with the process, sorry – but hey, this advice is free!

Here is the basic outline of what is needed:

  • Buy the domain(s) you want
  • Arrange for your site to be hosted
  • Pick a tool/product/system for keeping the website running
  • Add the features you want, including plugins
  • Back up the site
  • Do regular maintenance
  • Be vigilant about spam and security
  • Handle the occasional disaster, misdeed, or dead-end

I purchased all of my dozen or so domain names from GoDaddy.com. GoDaddy’s salacious advertising turns my stomach. Their abrasive founder, Bob Parsons, is widely – and probably fairly – excoriated for his antics. But I have to say that, until recently, GoDaddy has been cheap and efficient with good support. How they could afford to spend 20 minutes on the phone with me when I had only purchased $20 worth of product is perplexing. I’ve used other “registrars” to get domain names. None I found were as inexpensive or efficient. Indeed there is no point in spending more on a domain name than you have to, so don’t.

What’s In a (Domain) Name

A domain name is nothing more than a handle that can be used to “find you” on the internet. Aim for a domain that is:

  1. Memorable
  2. Unique and easy to say and spell (if you get the domain sqakizamazula, nobody is going to find you by name, even if they manage to remember it!)
  3. Not too similar to other domains (what about misspellings? You might want to get those, too)
  4. Amenable to keeping your private information private
  5. Inexpensive – no need to pay more than about $15 a year for a domain name
  6. Appropriate for what you’ll use it for (if you’re not on TV it doesn’t make sense to get a .TV domain)

When you buy a domain name, you’re required to give personal contact information. Not surprisingly there are many spammy/scammy businesses that grab that information to automate calling and emailing you… so you will want a “private registration” – that is, a service that keeps your information private – at additional cost, of course. Some domains, however, such as all .US domains, do not allow private registration. And because “StarCircleAcademy” is a bit long and not always properly remembered, I made sure to also grab StarTrailAcademy, StarTrailsAcademy, and StarCirclesAcademy.com. You can point many names at the same place.

One BIG benefit to having your own domain is that all the email addresses for that domain are yours! Oh, and as long as you keep that domain, you’ll never have to worry about changing your email address.  Even if I move to Timbuktu – which is NOT planned – I can still be Steven(at)StarTrailsAcademy.com  or SuperHandsomeFellow(at)StarTrailAcademy.com

Not all domain providers bundle in email for free, so beware.

What is a Host?

I host (store) my files on a GoDaddy.com server; however, a series of recent misfortunes has me looking at HostGator.com as a better alternative. There are many choices for hosting. I won’t describe them all, but here they are roughly ordered by cost, lowest to highest: economy-shared, performance-shared, private address shared, private resources, and dedicated. In the last category, what you are basically paying for is a machine that is used exclusively by you. Performance of any shared solution may range from sluggish to extremely sluggish. And there is a HUGE downside to being on a shared machine. A shared machine houses lots of websites, not just yours. You share bandwidth, hardware, and an Internet address. The downside is that there are many tools that find websites that have malware on them and “blacklist” those sites. This happened to me recently. Apparently a compromised website running on the same server as mine (with the same IP address) ticked off the Consolidate Block List, and all hundred or so websites on the server were effectively inaccessible.

GoDaddy’s solution to this problem was… Gee, that’s too bad. If you want to pay us for a private address or move to a dedicated machine at an extra $6 monthly cost we can do that for you. It will only take 24 to 72 hours. It actually took 3. Unfortunately one of the tools I want to use on my website requires an intricate and painfully laborious series of steps to configure. HostGator charges about the same for hosting and has all the support set to go.

What Tool?

A website can be created in many ways. Early on I used tools like Microsoft FrontPage (which later became Expression Web) and Dreamweaver to create websites. You get a lot of control using tools like that, but you pay a high manual overhead to keep things up to date – and you had better know something about HTML and JavaScript or you’ll have a dull site. After a while interactive online site builders became available. None of the ones I’d seen looked interesting or unique. There is a huge amount of complexity involved in creating and maintaining a “swanky site”. After the manual tools and the online site builders there arose an armada of Content Management Systems (CMS): Joomla, BBoard, and so on. But I chose WordPress because: A. It’s free (mostly), B. It’s widely supported on hosts, C. It’s flexible enough and configurable enough, D. It’s pretty easy to use – unless you want to do fancy things.

Getting the Features

As I noted, WordPress has lots and lots of free and almost free customizations you can add. Some are really nifty. Some, like the scads of useless iPhone apps, will disappear soon after you test drive them. My most recent addition is the “WordPress eStore”. I had looked at many things including ZenCart and others. Honestly, though, I wanted something less painful to set up and manage. Unfortunately setting up WP eStore has taken me more than a week of twiddling to get close to what I want… but it’s still not there. Other things I’ve added in (and many that I’ve customized) include the Meetup Events (see the margin on the right), a Gallery of Flickr images, maps and much more. All of these required effort, and in most cases you really do need to understand HTML well enough to fix/correct/update them.

Fight The Spam

I get three kinds of comment spam: blocked, sneaky, and low-brow. Several WordPress plugins block the majority of the automated junk. For example, 1,105 bits of blocked spam have accrued in my queue in less than a month. As my site’s popularity grows, so do the automated comment spam attempts. About once or twice a week a spam item makes it through the filters. I have turned on WordPress comment moderation so that I must approve all comments. So far I’ve described the auto-rejected spam (blocked) and the sneaky spam that I have to mark as SPAM. The last type, low-brow, comes from well-meaning people who sometimes post four or five comments that basically say nothing at all or things that are self-contradictory – not you, of course! Hey, I welcome your comment if it helps people understand, but if you just want to be argumentative or hawk your photos, get your own site! Sorry, was that harsh?

The Disasters

Things break. Sometimes they break in mild ways – like a single article that I could no longer open until I completed some upgrades. Sometimes the breakage is spectacular like the whole site going offline – or forgetting my password, or putting an embarrassing typographical mistake in my articles.  Or configuring a plugin incorrectly…   Backups and maintenance are meant to overcome these issues, but of course they pop up at the worst of times… like when WordPress DEMANDED that I upgrade it in the hours before I got on a plane to a place where I’d have no internet for two weeks!

To make matters worse, I had just published in a private location the details for an upcoming Field Expedition and blasted out the link in an email.  As luck would have it the flight had on-board WiFi so I could spend some $ and fix the problem instead of catching up on my sleep. As worse luck would have it, the on-board WiFi was broken 🙁

In a Nutshell

Setting up and maintaining a website is not for the faint of heart or the technologically illiterate. It can be a huge time waster. On the other hand, had I not done it, well, you wouldn’t be here, would you?!

If after all this you’re thinking that with a little of my help you’d like to set up your own site, please re-read the first paragraph. 😉