How can I extract only certain files from a large folder?

Jon N 0 Reputation points
2024-10-21T15:29:08.44+00:00

Hello,

First, I apologize for most likely describing this poorly!

I'm trying to create a new, smaller folder from a large folder that holds ~10,000 pictures, without manually copying the folder and deleting the unwanted pictures one by one. The pictures are updated annually, so we would have to repeat this every year.

There are ~3,500 separate properties but ~10,000 pictures. Ideally, I would like a new, separate folder containing only the first picture of the first building for each property.

The files are labeled in this manner: K = property, B = building, P = photo. There are ~3,500 K's; each K can have multiple buildings, and each building can have multiple photos. For example:

k1b1p1 is the first picture of building 1 on property 1 (this is really what I'm looking to extract, but for each "K"); k1b1p2 is the second picture of building 1 on property 1; k2b1p1 is property 2, building 1, picture 1; then k2b1p2, k2b2p1, etc.

If this helps and is simpler: my coworker just said that what we really want is to extract or isolate the files ending with b1p1, for each K.

Any kind of assistance would really be appreciated and save us lots of time!

Windows for business Windows Client for IT Pros User experience Other

4 answers

  1. Darrell Gorter 2,731 Reputation points
    2024-10-21T17:16:46.3+00:00

    Hello,

    You could try using the ? wildcard in place of the property number. Use one question mark for each digit to be replaced:

    copy k?b1p1

    copy k??b1p1

    etc.
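The one-question-mark-per-digit idea can also be tried out in Python with the standard `fnmatch` module, where `?` likewise matches exactly one character. This is only a sketch: the `.jpg` extension and the sample names below are assumptions for illustration, not names from the real folder.

```python
from fnmatch import fnmatch

# Hypothetical sample names; the .jpg extension is an assumption.
names = ["k1b1p1.jpg", "k1b1p2.jpg", "k12b1p1.jpg", "k345b1p1.jpg", "k12b2p1.jpg"]

# One pattern per possible digit count of the property number (1 to 4 digits).
patterns = ["k" + "?" * digits + "b1p1.jpg" for digits in range(1, 5)]

# Keep a name if it matches any of the patterns (compared in lower case).
matches = [n for n in names if any(fnmatch(n.lower(), p) for p in patterns)]
print(matches)
```

Note that `?` matches any single character, not only a digit, so the patterns rely on the names following the k/b/p scheme strictly.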

    Darrell


  2. MotoX80 36,291 Reputation points
    2024-10-21T18:26:05.99+00:00

    PowerShell can do this. The main question is the format of the K, B, and P values. That is, is the K value always the first n characters of the file name, or can it vary in length (K01, K999, K1234)? How would you determine where the B value starts? If the values have fixed lengths, this is easy.
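If the K, B, and P numbers turn out to vary in length, a regular expression can split the name without relying on fixed positions. Below is a minimal Python sketch of that idea. It assumes names of the form k<digits>b<digits>p<digits> with an optional extension; `NAME_RE` and `parse_name` are names made up for this illustration.

```python
import re

# Assumed naming scheme: k<digits>b<digits>p<digits> plus an optional extension.
NAME_RE = re.compile(r"^k(\d+)b(\d+)p(\d+)(?:\.\w+)?$", re.IGNORECASE)

def parse_name(filename):
    """Return (property, building, photo) as ints, or None if the name does not fit."""
    m = NAME_RE.match(filename)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())

print(parse_name("k1b1p1.jpg"))
print(parse_name("K1234b2p10.JPG"))
print(parse_name("notes.txt"))
```

Running this over every file name first, and listing the names that return `None`, is a quick way to find out whether the whole folder really follows the scheme.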

    Here's a sample. I have a folder of JPGs that are all named in the format yyyymmdd_tttttt, like 20240829_091244.jpg. I want to organize all the photos that I took in August of any year, so I want an August folder with subfolders for 2023, 2024, 2025, etc.

    $Source = "C:\Temp\XXXX\zzzzz"
    $Dest = "C:\Temp\XXXX\new"
    $files = Get-ChildItem $Source -File        # files only, skip any subfolders
    foreach ($f in $files) {
        "Base file name is {0}" -f $f.BaseName
        $year = $f.BaseName.Substring(0,4)      # get the first 4 characters from the name
        "Year is {0}" -f $year
        $month = $f.BaseName.Substring(4,2)     # the 5th and 6th characters are the month
        "Month is {0}" -f $month
        $target = "{0}\{1}\{2}" -f $Dest, $month, $year
        "Copy this file to {0}" -f $target
        if (-not (Test-Path $target)) {         # does the folder exist?
            "Creating folder {0}" -f $target
            New-Item $target -ItemType Directory
        }
        $f | Copy-Item -Destination $target     # Use Move-Item if you want to remove the file from the source.
        ""
    }
    

    I end up with a folder structure like this.

    C:\Temp\XXXX>tree
    Folder PATH listing for volume OS
    Volume serial number is 36C0-6121
    C:.
    ├───new
    │   ├───08
    │   │   └───2024
    │   └───09
    │       └───2024
    

    Running the script in PowerShell ISE looks like this.

    [Screenshot of the script output]

    You would need to modify the script to parse the K, B, and P values and then build the destination folder name that you require.

    I recommend copying a few files to a temporary folder that you can use as a source to test with. Verify that the files are copied to the correct folder. Use PowerShell's -WhatIf switch to test the copy (or move) without actually performing it.

    Make sure that you have a backup of the folder that has the 10,000 files in case something goes wrong.

    Comment out the Copy-Item line and run the script against the 10,000 files to verify that you can parse each name correctly.
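The same "test before you copy" advice can be followed in Python with a dry-run flag. The sketch below is only an illustration: `copy_b1p1` and its `dry_run` parameter are hypothetical names, and it assumes the wanted files are those whose base name ends in b1p1.

```python
import shutil
from pathlib import Path

def copy_b1p1(source, dest, dry_run=True):
    """Copy files whose base name ends in 'b1p1'; with dry_run=True only report them."""
    source, dest = Path(source), Path(dest)
    # p.stem is the file name without its extension, e.g. 'k1b1p1'
    planned = sorted(p for p in source.iterdir()
                     if p.is_file() and p.stem.lower().endswith("b1p1"))
    for p in planned:
        print(("Would copy " if dry_run else "Copying ") + p.name)
        if not dry_run:
            dest.mkdir(parents=True, exist_ok=True)
            shutil.copy2(p, dest / p.name)
    return [p.name for p in planned]
```

Calling it with the default `dry_run=True` against a small test folder shows what would be copied without creating or changing anything; passing `dry_run=False` performs the copy.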


  3. Yanhong Liu 14,195 Reputation points Microsoft External Staff
    2024-10-22T07:14:44.0466667+00:00

    Hello,

    Thank you for posting in the Q&A forum.

    You need to extract the first picture of the first building for each property. Let’s use a script to help with that. Here’s a simple Python script to get you started:

    import os
    import shutil

    source_folder = 'path_to_your_folder'
    destination_folder = 'path_to_new_folder'

    # Create the destination folder if it does not exist yet
    os.makedirs(destination_folder, exist_ok=True)

    copied_properties = set()  # property codes that already have a picture copied

    for root, dirs, files in os.walk(source_folder):
        for file in sorted(files):
            if file.lower().endswith('b1p1.jpg'):  # Adjust the file extension if needed
                # Extract the property code (e.g., k1 from k1b1p1.jpg)
                property_code = file.lower().split('b')[0]
                if property_code not in copied_properties:
                    shutil.copy(os.path.join(root, file), destination_folder)
                    copied_properties.add(property_code)

    Change path_to_your_folder to the path of the folder containing your pictures.

    Change path_to_new_folder to the path where you want to save the extracted pictures.

    Adjust the file extension if needed.

    This script walks through your folder, checks for files ending with 'b1p1.jpg', and copies the first matching file for each property to the new folder.
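If some properties have no b1p1 file at all, one possible variation is to pick the lowest building and photo number that exists for each property. The sketch below only selects names and copies nothing. It assumes the k<digits>b<digits>p<digits> scheme with an extension; `first_photo_per_property` is a hypothetical helper and the sample names are made up.

```python
import re

# Assumed naming scheme: k<digits>b<digits>p<digits>.<extension>
NAME_RE = re.compile(r"^k(\d+)b(\d+)p(\d+)\.\w+$", re.IGNORECASE)

def first_photo_per_property(filenames):
    """Map each property number to the file with its lowest (building, photo) pair."""
    best = {}
    for name in filenames:
        m = NAME_RE.match(name)
        if m is None:
            continue  # skip names that do not follow the assumed scheme
        prop, bldg, photo = (int(g) for g in m.groups())
        # Compare numerically so that p2 sorts before p10
        if prop not in best or (bldg, photo) < best[prop][0]:
            best[prop] = ((bldg, photo), name)
    return {prop: name for prop, (_, name) in sorted(best.items())}

sample = ["k1b1p2.jpg", "k1b1p1.jpg", "k2b2p1.jpg", "k2b2p3.jpg", "k10b1p1.jpg"]
print(first_photo_per_property(sample))
```

Comparing the numbers as integers rather than as text avoids mixing up k1 with k10 or p1 with p10.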

    Give it a try and see if it meets your needs!

    Best regards

    Yanhong



  4. Jon N 0 Reputation points
    2024-10-23T17:12:38.73+00:00

    Thanks so much. Unfortunately this did not work exactly, but I was able to do something similar and fix my issue. Within the folder I searched for *b1p1, which gave me all the files matching b1p1. Because some of those were p10 through p19, I then searched for *b1p1.jpg, and that gave me what I needed.

    I then selected all the results (Ctrl+A), copied them (Ctrl+C), created a new folder, and pasted them (Ctrl+V). That gave me everything I needed.

    Thank you very much, Darrell, for taking the time to help me. I really appreciate it! Sincerely, Jon
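The p10 through p19 problem described above can be reproduced with Python's `fnmatch` module. This is only a sketch: the loose Explorer search is approximated here by a pattern with a trailing `*`, which is an assumption about how that search behaves, and the sample names are made up.

```python
from fnmatch import fnmatch

# Hypothetical sample names for illustration.
names = ["k1b1p1.jpg", "k1b1p10.jpg", "k1b1p19.jpg", "k1b1p2.jpg", "k2b1p1.jpg"]

# Loose pattern: anything may follow "b1p1", so p10..p19 match as well.
loose = [n for n in names if fnmatch(n, "*b1p1*")]

# Strict pattern: the extension must follow "b1p1" directly, so only photo 1 matches.
strict = [n for n in names if fnmatch(n, "*b1p1.jpg")]

print(loose)
print(strict)
```

Anchoring the pattern on the extension is what excludes the two-digit photo numbers.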
