Website Security Suggestion: Get rid of cruft! (script included)

Right: One of my pet hates is cruft on a production website.

Cruft is stuff - files - which has accumulated because nobody’s paying attention. Cruft includes sampleware. Developer experiments. Readmes. Sample configs. Backups of files which never get cleaned up. Just general accumulated stuff. It’s website navel lint. Hypertext hairballs.

Cruft. Has. No. Place. On. A. Production. Website!

Worst case, it might actually expose security-sensitive information. (That’s the worst type of cruft!)

Want to find cruft? Well, the easiest way to start is:

D:\WebContent> dir /s *.txt

That’s a good start. For every Readme.txt, add 10 points. For every web.config.txt, add 1000 points. Why so many? Because that one’s a potentially huge problem: .config is blocked by Request Filtering by default (with certain exceptions), but .config.txt? No problem! Download away.

If you score more than 10 points, you need to rethink your strategy.
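If you’d rather tally that in PowerShell than eyeball DIR output, here’s a rough sketch of the scoring above. D:\WebContent and the point values are just the examples from this rant; point it at your own content root.

# Rough cruft tally, per the scoring above.
$score = 0
foreach ($file in Get-ChildItem -Path 'D:\WebContent' -Recurse -Filter '*.txt') {
    switch -Wildcard ($file.Name) {
        'readme.txt'   { $score += 10 }
        '*.config.txt' { $score += 1000 }   # web.config.txt and friends: servable config copies
    }
}
"Cruft score: $score"
if ($score -gt 10) { 'Time to rethink your strategy.' }

(Anything beyond those two patterns isn’t scored here; the script further down casts a wider net.)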

  • There is no reason for files like readme.txt to exist within your production website.
    • Okay, there’s one reason, and that’s when you’re providing one you know about, and have vetted, for download.
      • I mean, obviously if the site is there to provide readme.txt files for apps people are downloading, great! But if it’s the readme for some developer library which has been included wholesale: bad pussycat.
  • There is no reason for files like web.config.bak to exist within your production website.
    • Luckily, .bak files aren’t servable with the default StaticFileHandler behaviour. But that doesn’t mean an app (or a * scriptmap…) can’t be convinced to hand you one… (there’s a quick Request Filtering check sketched after this list).
  • If you have web.config.bak.txt files, you’re asking for trouble.
    • Change your operational process. Don’t risk leaking usernames and passwords this way.
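If you’re wondering what Request Filtering actually denies while you clean up, here’s a minimal sketch using the WebAdministration module. It applies server-wide, the choice of .bak is just an example, and none of it replaces deleting the files:

Import-Module WebAdministration

# List the extensions Request Filtering already denies. web.config itself
# is covered by hiddenSegments, but .bak and .txt aren't denied by default.
Get-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' `
    -Filter 'system.webServer/security/requestFiltering/fileExtensions' `
    -Name 'Collection' |
    Where-Object { -not $_.allowed } |
    Select-Object fileExtension

# Belt and braces: deny .bak server-wide. (Mitigation only; the real fix
# is still removing the files.)
Add-WebConfigurationProperty -PSPath 'MACHINE/WEBROOT/APPHOST' `
    -Filter 'system.webServer/security/requestFiltering/fileExtensions' `
    -Name '.' -Value @{ fileExtension = '.bak'; allowed = $false }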

The Core Rationale

Web developers and site designers should be able to explain the presence of every single file on your website.

I don’t care if it’s IIS or Apache or nginx or SuperCoolNewTechnologyX… the developers should be responsible for every single file deployed to production.

And before the admins (Hi!) get smug and self-satisfied (you still can, you just need to check you’re not doing the next thing…): when you deploy a new version of Site X, check that you’re not backing up the last version of Site X to a servable content area within the new version of Site X.

For example, your content is in F:\Websites\CoolNewSite\ with the website pointed to that location…

  • It’s safe to back up to F:\Backups\CoolNewSite\2016-11-13 because it’s outside the servable website.
  • It’s not cool to back up to F:\Websites\CoolNewSite\2016-11-13 because that’s part of the website (see the sketch after this list).
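To make that concrete, here’s a minimal sketch of a deployment backup done the safe way, using the example paths above:

# Safe: the backup target is outside anything IIS serves.
$source = 'F:\Websites\CoolNewSite'
$backup = "F:\Backups\CoolNewSite\$(Get-Date -Format 'yyyy-MM-dd')"

New-Item -ItemType Directory -Path $backup -Force | Out-Null
Copy-Item -Path "$source\*" -Destination $backup -Recurse

# Not cool: a dated folder inside the site root is still part of the
# website, and everything in it is potentially servable.
# Copy-Item -Path "$source\*" -Destination "$source\2016-11-13" -Recurse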

How Do I Know If I’m Crufty?

As I do, I started typing this rant a while ago, and then thought: You know what? I should script that!

I had a bunch of DIR commands I was using and, sure, I could’ve just made a CMD file, but who does that these days? (Says my friend. Singular.)

Then {stuff}… but it finally bubbled to the top of my to-do list, so I wrote a first-draft Get-CruftyWebFiles script.

I have lots of enhancement ideas from here, but I wanted to get something which basically worked. I think this basically works!

Sure, there’s potential duplication if sites and apps overlap (i.e. the same file might be listed repeatedly), which is fine; I figure you weed that out in post-production. And if your site is self-referential, it might get caught in a loop (hit Ctrl+C if you think/know that’s you, and *stop doing that*).

So, if you want to see how crufty your IIS 7.5+ sites are (7.5 assumed; tested on 8.5), feel free:

The Script: https://github.com/TristankMS/IIS-Junk

Usage (roughly):

Copy the script to the target web server. Then, from an Admin PowerShell prompt:

  • .\Get-CruftyWebFiles.ps1   # scans all web content folders linked from Sites, and outputs to .\crufty.csv
  • .\Get-CruftyWebFiles.ps1 -WebSiteName "Default Web Site"     # limits to just the one website.
  • .\Get-CruftyWebFiles.ps1 -DomainName "YOURDOMAIN"    # also checks .txt / .xml files for that text string

Pull the CSV into Excel, Format as Table, and get sorting and filtering. Severity works on a lower-is-more-critical basis. Look at anything with a zero first.
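If Excel isn’t handy, here’s a quick triage sketch in PowerShell. Severity is a real column in the output; anything else here is an assumption about your environment:

# Most-critical first: Severity 0 floats to the top.
Import-Csv .\crufty.csv |
    Sort-Object { [int]$_.Severity } |
    Format-Table -AutoSize

# Overlapping sites/apps can produce duplicate rows; this weeds out exact
# duplicates, if you'd rather do that here than in post-production.
Get-Content .\crufty.csv | Select-Object -Unique | Set-Content .\crufty-deduped.csv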

Todo: cruft scoring (severity’s already in there), more detections/words, general fit and finish. Also considering* building a cruft module for a security scanner, or just for the script, to check what’s findable on a website given some knowledge of the structure.

* oh! No I’m not