How can I figure out what files are corrupt or missing on my hard drive?

David Orlow 1 Reputation point
2023-02-24T23:18:43.37+00:00

OK, this is something I've pondered in the past. I have had pictures and videos on my NAS for 20 years. Every so often, I start looking around folders, something sparks a memory, and I look for a picture or video that I remember having and sometimes can't find it. Sometimes I'll find a picture or video that is corrupt.

I have backups. But backups are no good if I don't notice a file is missing until a year after it disappeared and my backups have cycled. I need a way to know when I have missing or corrupt files. For corrupt files, I was thinking I could take an inventory of my hard drive and generate an MD5 hash of each file, then compare later to see whether a file has changed.
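
As a rough illustration of the inventory idea, here is a minimal sketch in Python (chosen only for illustration; the same approach should carry over to PowerShell). The function names `hash_file` and `build_manifest` are hypothetical, and the sketch assumes every file under the root folder is readable. MD5 is generally adequate for spotting accidental corruption; `hashlib.sha256` could be swapped in the same way.

```python
import hashlib
import os


def hash_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it in chunks
    so that large videos are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Walk root recursively and map each file's path
    (relative to root) to its MD5 digest."""
    manifest = {}
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(folder, name)
            relative_path = os.path.relpath(full_path, root)
            manifest[relative_path] = hash_file(full_path)
    return manifest
```

Calling `build_manifest` on the share's root folder would return one entry per file, which could then be saved and compared against a later scan.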

A while back, I was trying to figure out how to script this. It would be a whole lot nicer if I found a commercial application that was designed for this. Any ideas?

Windows 10
Windows Server
PowerShell

7 answers

  1. David Orlow 1 Reputation point
    2023-02-25T03:57:52.9033333+00:00

    So, MD5 seemed to work fine. Except I'm trying to figure out the logic for detecting that a file is missing. I'd have to loop through each file and then run a loop inside of that which would loop through each file again and compare the MD5 hash. But the work is going to grow quadratically, because the comparison would have to run as many times as the number of files in the directory squared. So, if there were 4 files, it would have to run 16 times. If there were 100 files, it would have to run 10,000 times.

  2. David Orlow 1 Reputation point
    2023-02-25T17:46:25.2833333+00:00

    Wait, maybe not. I guess it needs to run once to generate the MD5 for each file and a timestamp of the run. Then later it runs again and checks whether each file exists within the last x days since the last timestamp and whether the MD5 is the same. But then again, for every single file, I'm going to have to tell it to check the CSV file or database and do a full run looking for that file. So yes, probably a lot of wasted cycles, which means I probably should use a database instead of a CSV file. I can probably make a database call a lot quicker than a scan of a CSV file. Too bad I don't know how to do that, lol. I guess I'll see if I can figure it out. I'm also wondering if PowerShell is the right tool for this job. I'm sure there's a better language; maybe I'll explore that too. Maybe a compiled language would be better. I always default to PowerShell when trying to manipulate files and text files because I learned it first.
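
One possible way around both the nested loop and the repeated CSV scans is to read the saved inventory once into an in-memory lookup table, so that each file costs a single lookup rather than a full pass. The sketch below is in Python purely for illustration (a PowerShell hashtable should behave similarly). It assumes each scan produces a mapping of file path to MD5 hash, and the names `save_manifest`, `load_manifest`, and `compare_manifests` are hypothetical.

```python
import csv


def save_manifest(manifest, csv_path):
    """Write path,hash rows to a CSV file, sorted by path."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for path, digest in sorted(manifest.items()):
            writer.writerow([path, digest])


def load_manifest(csv_path):
    """Read the CSV once into a dict so later lookups avoid rescanning it."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return {row[0]: row[1] for row in csv.reader(handle) if row}


def compare_manifests(old, new):
    """Return (missing, changed, added) path lists.

    Each path needs one dict lookup, so the work grows with the
    number of files rather than with its square."""
    missing = sorted(path for path in old if path not in new)
    added = sorted(path for path in new if path not in old)
    changed = sorted(
        path for path in old if path in new and old[path] != new[path]
    )
    return missing, changed, added
```

With this shape, 100 files need on the order of 100 lookups per run instead of 10,000 comparisons. A changed hash only says the content differs; whether that is corruption or a deliberate edit still has to be judged per file, and a database would mainly start to pay off if the inventory no longer fits comfortably in memory.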
