A few months ago, I noticed that cnicholson.net was serving a blank page, and that Google Chrome was warning me of malware. I sent an email to Dreamhost asking if things were ok, and they responded with a link to their Troubleshooting Hacked Sites wiki page. Uh oh. I let the site languish because work was busy and my personal life was more important. Just yesterday I devoted a few hours to figuring out what was going on, and fixing it. Fortunately this was pretty fun, and I got to brush up on my GNU tools. Read on for the gory details.
Every .js file in my WordPress install had a malicious script tag in it, and many of the .php files had code that generated malicious script tags. Sorry this is vague, I didn’t keep any details around. Anyway, my plan was just to download my site from Dreamhost to my Win7 desktop at home, fix the site locally, upload it, and get on with life. When the download went up to 1800 files (WTF?), though, I realized that I had no idea how to easily do a regex-based search-and-replace in Windows. I’m sure it can be done, I just didn’t want to start looking for new tools.
Then I realized that I had all the tools I needed to fix this inside of my Dreamhost ssh terminal. A stock GNU/Linux environment was all I needed; this kind of stuff is exactly what the GNU tools were built for.
I needed code that would go through every file that contained a Russian link, and remove a “script” pattern. There are many different ways to do this (find, grep, sed, awk, perl, …), but I’m the most fluent with grep and perl, so I used them. I can’t imagine that there are any actual trade-offs for a project this small, so it was an easy choice. Here’s what I ended up doing:
grep -lr ".ru/" . | xargs perl -pi -e "s/malware-pattern//"
A quick explanation:
grep -lr ".ru/" .
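Search recursively (-r) through ‘.’ for all files that contain a match for “.ru/” (as in “malware.ru/evil.js”). When the first match in a file is found, print only the filename and continue to the next file (-l). One caveat: grep treats the pattern as a regular expression, so the leading “.” matches any character, not just a literal dot. That makes the search slightly broader than intended (“guru/” matches too); adding -F would make the match literal.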
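To see what grep -l reports and how the pattern’s dot behaves, here is a minimal sketch on a scratch directory. The file names and contents are made up for illustration; only the grep flags come from the command above, plus -F as a possible tightening.

```shell
# Scratch directory with made-up files; nothing here touches a real site.
demo=$(mktemp -d)
printf '<script src="http://malware.ru/evil.js"></script>\n' > "$demo/infected.js"
printf 'var url = "/guru/index.html";\n' > "$demo/clean.js"

# As a regex, "." matches any character, so "guru/" also counts as a hit.
regex_hits=$(cd "$demo" && grep -lr ".ru/" . | sort)

# -F treats the pattern as a fixed string, so only a literal ".ru/" matches.
fixed_hits=$(cd "$demo" && grep -lrF ".ru/" . | sort)

printf 'regex match:\n%s\n' "$regex_hits"
printf 'fixed-string match:\n%s\n' "$fixed_hits"
```

The regex search lists both files, while the fixed-string search lists only infected.js. Running the grep half on its own like this is also a cheap dry run before letting anything modify files.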
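| xargs
Take each result from grep and pipe it (|) to xargs. xargs splits a large input list into manageable batches and runs the given command once per batch, with that batch appended as arguments. This passes each reasonably-sized block of filenames from grep to perl without exceeding the command-line length limit. (By default xargs splits its input on whitespace, so filenames containing spaces would need grep’s --null and xargs -0.)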
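The batching is easier to see with an artificially small batch size. This is only an illustration: the file names are invented, -n 2 is there purely to force small batches, and echo stands in front of perl so that the command lines are printed instead of run.

```shell
# -n 2 caps each invocation at two arguments, making the batching visible.
# "echo perl" prints the command line that xargs would build for each batch.
batched=$(printf '%s\n' a.js b.js c.js d.js e.js | xargs -n 2 echo perl)
printf '%s\n' "$batched"
```

Five input names produce three invocations: two with two file names each, and a last one with the leftover name. Without -n, xargs picks the batch size itself based on the system’s argument-length limit.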
perl -pi -e "s/malware-pattern//"
Evaluate the command-line argument “s/malware-pattern//” as a perl script (-e) on each file passed in by xargs. The -p flag wraps the script in a loop that reads every line of every file named on the command line, runs the script on it, and prints the result. The substitution replaces the first occurrence of “malware-pattern” on each line with “”, which has the effect of deleting it (a trailing g, as in s/malware-pattern//g, would be needed to catch several occurrences on one line). The -i flag turns on in-place editing, so that the printed output is written back to the file and the substitution actually updates its contents.
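For anyone repeating this, a slightly more defensive variant of the same pipeline might look like the sketch below. It is not the command I ran: the script tag, the regex, and the file names are made up for the demo, and the extra flags (-F, --null, -0, -i.bak, /g) are my assumptions about what a safer version would want.

```shell
# Scratch copy with made-up content; the tag below is a stand-in for the real injection.
site=$(mktemp -d)
tag='<script src="http://malware.ru/evil.js"></script>'
printf 'a();%sb();%s\n' "$tag" "$tag" > "$site/my script.js"
printf 'var ok = 1;\n' > "$site/clean.js"

# -F: match ".ru/" literally.  --null with xargs -0: NUL-separated names,
# so the space in "my script.js" survives.  -i.bak: edit in place but keep
# the original as *.bak.  /g: remove every match on a line, not just the first.
(cd "$site" && grep -lrF --null ".ru/" . |
  xargs -0 perl -pi.bak -e 's{<script src="http://malware\.ru/[^"]*"></script>}{}g')

cleaned=$(cat "$site/my script.js")
printf '%s\n' "$cleaned"
```

The single quotes around the perl script keep the shell from interpreting anything inside the pattern, and the s{...}{} delimiters avoid having to escape the slashes in the URL. Only files that grep listed get a .bak copy, which doubles as a record of what was touched.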
Anyway, I backed up my web site, ran that command from my website root, and it correctly removed every occurrence of the hacked tags. My WordPress database looks clean, I’ve carefully restored my public html/php/js/css files to their correct (now minimal) permissions, and I’ve upgraded to the latest version of WP. Let’s hope it holds.
More Stupid C++ Tricks coming “soon”.