Copy files from Site A to Site B to the same folders

SR VSP 1,251 Reputation points
2023-08-21T16:31:36.1966667+00:00

Hi Guys,

I have the script below, which generates a report of the files in Site A that are older than 2 years. At the same time, I would like to move the reported files to Site B, into the same document libraries/folders (the structure in Site B is already created). Please advise.

#Function to Generate Report on all documents in a SharePoint Online Site Collection
Function Get-SPODocumentInventory($SiteURL)
{
    Try {
        #Setup the context
        $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
        $Ctx.Credentials = $Cred
    
        #Get the web from given URL and its subsites
        $Web = $Ctx.web
        $Ctx.Load($Web)
        $Ctx.Load($Web.Lists)
        $Ctx.Load($web.Webs)
        $Ctx.executeQuery()
  
        #Array to Skip System Lists and Libraries
        $SystemLists =@("Converted Forms", "Master Page Gallery", "Customized Reports", "Form Templates", "List Template Gallery", "Theme Gallery",
               "Reporting Templates", "Solution Gallery", "Style Library", "Web Part Gallery","Site Assets", "wfpub", "Site Pages", "Images")
      
        Write-host -f Yellow "Processing Site: $SiteURL"
  
        #Filter Document Libraries to Scan
        $Lists = $Web.Lists | Where {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $false -and $SystemLists -notcontains $_.Title -and $_.ItemCount -gt 0}
        #Loop through each document library
        Foreach ($List in $Lists)
        {
            #Define CAML Query to Get List Items in batches
            $Query = New-Object Microsoft.SharePoint.Client.CamlQuery
            $Query.ViewXml ="
                <View Scope='RecursiveAll'>
                   <Query>
                      <OrderBy><FieldRef Name='ID' Ascending='TRUE'/></OrderBy>
                   </Query>
                   <RowLimit Paged='TRUE'>$BatchSize</RowLimit>
                </View>"
  
            Write-host -f Cyan "`t Processing Document Library: '$($List.Title)' with $($List.ItemCount) Item(s)"
  
            Do {
                #Get List items
                $ListItems = $List.GetItems($Query)
                $Ctx.Load($ListItems)
                $Ctx.ExecuteQuery()
 
                #Filter Files
                $Files = $ListItems | Where { $_.FileSystemObjectType -eq "File"}
 
                #Iterate through each file and get data
                $DocumentInventory = @()
                Foreach($Item in $Files)
                {
                    $File = $Item.File
                    $Ctx.Load($File)
                    $Ctx.ExecuteQuery()

                 
                    #Report only files last modified more than two years (730 days) ago
                    $filterDate = (Get-Date).AddDays(-730).Date
                    if ($File.TimeLastModified -lt $filterDate){

                        $DocumentData = New-Object PSObject
                        $DocumentData | Add-Member NoteProperty SiteURL($SiteURL)
                        $DocumentData | Add-Member NoteProperty DocLibraryName($List.Title)
                        $DocumentData | Add-Member NoteProperty FileName($File.Name)
                        $DocumentData | Add-Member NoteProperty FileURL($File.ServerRelativeUrl)
                        $DocumentData | Add-Member NoteProperty CreatedBy($Item["Author"].Email)
                        $DocumentData | Add-Member NoteProperty CreatedOn($File.TimeCreated)
                        $DocumentData | Add-Member NoteProperty ModifiedBy($Item["Editor"].Email)
                        $DocumentData | Add-Member NoteProperty LastModifiedOn($File.TimeLastModified)
                        $DocumentData | Add-Member NoteProperty Size-KB([math]::Round($File.Length/1KB))

                        #Add the result to the array only when the file matches the filter
                        $DocumentInventory += $DocumentData
                    }
                }
                #Export the result to CSV file
                $DocumentInventory | Export-CSV $ReportOutput -NoTypeInformation -Append
                $Query.ListItemCollectionPosition = $ListItems.ListItemCollectionPosition
            } While($Query.ListItemCollectionPosition -ne $null)
        }
           
        #Iterate through each subsite of the current web and call the function recursively
        ForEach ($Subweb in $Web.Webs)
        {
            #Call the function recursively to process all subsites underneath the current web
            Get-SPODocumentInventory($Subweb.url)
        }
    }
    Catch {
        write-host -f Red "Error Generating Document Inventory!" $_.Exception.Message
    }
}
  

[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12

#Config Parameters
$SiteCollURL="https://xxxx.sharepoint.com/sites/name"
$ReportOutput="C:\xxx\DocInventory.csv"
$BatchSize = 500
 
#Set user name and password to connect
$UserName="******@yyy.onmicrosoft.com"
$Password = "pass"
 
#Create Credential object from given user name and password
$Cred = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($UserName,(ConvertTo-SecureString $Password -AsPlainText -Force))
    
#Set up the context
$Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteCollURL)
$Ctx.Credentials = $Cred


   
#Delete the Output Report, if exists
if (Test-Path $ReportOutput) { Remove-Item $ReportOutput }
   
#Call the function
Get-SPODocumentInventory $SiteCollURL
Microsoft 365 and Office | SharePoint | Development
Microsoft 365 and Office | SharePoint | For business | Windows

1 answer

  1. Yanli Jiang - MSFT 31,611 Reputation points Microsoft External Staff
    2023-08-23T09:09:11.46+00:00

    Hi @SR VSP ,

    You can first use the method in the following article to copy all the files in site A to site B:

    https://www.sharepointdiary.com/2017/02/sharepoint-online-copy-files-between-site-collections-using-powershell.html

    Note: Microsoft is providing this information as a convenience to you. The sites are not controlled by Microsoft. Microsoft cannot make any representations regarding the quality, safety, or suitability of any software or information found there. Please make sure that you completely understand the risk before retrieving any suggestions from the above link.
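
    Because the folder structure already exists in site B, each copied file can keep its library and folder path and only the site prefix needs to change. The following is a minimal sketch of that mapping, not part of the linked article: the function name `Convert-ToTargetUrl` and the `/sites/siteA` and `/sites/siteB` paths are hypothetical and used only for illustration.

    ```powershell
    # Hypothetical helper: maps a file's server-relative URL in the source site
    # to the matching URL under the target site (same library and folders).
    function Convert-ToTargetUrl {
        param(
            [Parameter(Mandatory)][string]$FileUrl,        # e.g. /sites/siteA/Shared Documents/2020/a.docx
            [Parameter(Mandatory)][string]$SourceSitePath, # e.g. /sites/siteA
            [Parameter(Mandatory)][string]$TargetSitePath  # e.g. /sites/siteB
        )
        $source = $SourceSitePath.TrimEnd('/')
        $target = $TargetSitePath.TrimEnd('/')

        # Match on a whole path segment so /sites/siteA does not match /sites/siteAB
        if (-not $FileUrl.StartsWith("$source/", [System.StringComparison]::OrdinalIgnoreCase)) {
            throw "File URL '$FileUrl' is not under '$source'."
        }

        # Swap the site prefix and keep the rest of the path unchanged
        return $target + $FileUrl.Substring($source.Length)
    }

    # Example with assumed site paths
    Convert-ToTargetUrl -FileUrl '/sites/siteA/Shared Documents/2020/a.docx' `
                        -SourceSitePath '/sites/siteA' `
                        -TargetSitePath '/sites/siteB'
    # -> /sites/siteB/Shared Documents/2020/a.docx
    ```

    The returned value is the server-relative URL a copy routine would write to in site B, assuming the same library and folder names exist there.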

    Then use the following code to find the files in site B that were modified within the last two years and delete them, so that only the files older than two years remain.

    #Function to delete documents in a SharePoint Online Site Collection
    Function Remove-SPODocument($SiteURL)
    {
        Try {
            #Setup the context
            $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
            $Ctx.Credentials = $Cred
        
            #Get the web from given URL and its subsites
            $Web = $Ctx.web
            $Ctx.Load($Web)
            $Ctx.Load($Web.Lists)
            $Ctx.Load($web.Webs)
            $Ctx.executeQuery()
      
            #Array to Skip System Lists and Libraries
            $SystemLists =@("Converted Forms", "Master Page Gallery", "Customized Reports", "Form Templates", "List Template Gallery", "Theme Gallery",
                   "Reporting Templates", "Solution Gallery", "Style Library", "Web Part Gallery","Site Assets", "wfpub", "Site Pages", "Images")
          
            Write-host -f Yellow "Processing Site: $SiteURL"
      
            #Filter Document Libraries to Scan
            $Lists = $Web.Lists | Where {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $false -and $SystemLists -notcontains $_.Title -and $_.ItemCount -gt 0}
            #Loop through each document library
            Foreach ($List in $Lists)
            {
                #Define CAML Query to Get List Items in batches
                $Query = New-Object Microsoft.SharePoint.Client.CamlQuery
                $Query.ViewXml ="
                    <View Scope='RecursiveAll'>
                       <Query>
                          <OrderBy><FieldRef Name='ID' Ascending='TRUE'/></OrderBy>
                       </Query>
                       <RowLimit Paged='TRUE'>$BatchSize</RowLimit>
                    </View>"
      
                Write-host -f Cyan "`t Processing Document Library: '$($List.Title)' with $($List.ItemCount) Item(s)"
      
                Do {
                    #Get List items
                    $ListItems = $List.GetItems($Query)
                    $Ctx.Load($ListItems)
                    $Ctx.ExecuteQuery()
     
                    #Filter Files
                    $Files = $ListItems | Where { $_.FileSystemObjectType -eq "File"}
     
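                    #Iterate through each file and delete those modified within the last two years
                    $filterDate = (Get-Date).AddDays(-730).Date
                    Foreach($Item in $Files)
                    {
                        #Load the file so its last-modified time can be read
                        $File = $Item.File
                        $Ctx.Load($File)
                        $Ctx.ExecuteQuery()

                        if ($File.TimeLastModified -gt $filterDate){
                            #Delete the list item (use $Item.Recycle() instead to send it to the recycle bin)
                            $Item.DeleteObject()
                            $Ctx.ExecuteQuery()
                        }
                    }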
                    $Query.ListItemCollectionPosition = $ListItems.ListItemCollectionPosition
                } While($Query.ListItemCollectionPosition -ne $null)
            }
               
            #Iterate through each subsite of the current web and call the function recursively
            ForEach ($Subweb in $Web.Webs)
            {
                #Call the function recursively to process all subsites underneath the current web
                Remove-SPODocument($Subweb.url)
            }
        }
        Catch {
            write-host -f Red "Error Deleting Document!" $_.Exception.Message
        }
    }
      
    
    [Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
    
    #Config Parameters
    $SiteCollURL="https://xxxx.sharepoint.com/sites/siteB"
    $BatchSize = 500
     
    #Set user name and password to connect
    $UserName="******@yyy.onmicrosoft.com"
    $Password = "pass"
     
    #Create Credential object from given user name and password
    $Cred = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($UserName,(ConvertTo-SecureString $Password -AsPlainText -Force))
        
    #Set up the context
    $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteCollURL)
    $Ctx.Credentials = $Cred
       
    #Call the function
    Remove-SPODocument $SiteCollURL
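
    Before deleting anything, the date check can be tried in isolation. The sketch below is only an illustration: `Test-ModifiedWithinTwoYears` is a hypothetical helper name, and the sample dates are made up. It also assumes the copy into site B preserved each file's original last-modified time. If it did not, every copied file would look recent and would match the filter.

    ```powershell
    # Hypothetical helper: returns $true when a file was modified after the
    # two-year (730-day) cutoff, i.e. when the script above would delete it.
    function Test-ModifiedWithinTwoYears {
        param(
            [Parameter(Mandatory)][datetime]$LastModified,
            [datetime]$Now = (Get-Date)
        )
        $cutoff = $Now.AddDays(-730).Date   # same cutoff calculation as the script above
        return $LastModified -gt $cutoff
    }

    # Fixed reference date so the results are repeatable
    $now = Get-Date '2023-08-23'
    Test-ModifiedWithinTwoYears -LastModified (Get-Date '2023-01-15') -Now $now   # True: would be deleted
    Test-ModifiedWithinTwoYears -LastModified (Get-Date '2020-06-01') -Now $now   # False: would be kept
    ```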
    


