Not getting all the documents from SharePoint Site using Powershell

Anonymous
2023-06-27T18:06:27.5633333+00:00

Hello, I am trying to get the metadata for about 95,000 documents from a SharePoint site and export it to a .csv file using the SharePoint Online Management Shell (CSOM).

I am not able to get all of the files; partway through the run I start getting 503 Service Unavailable errors.

Below is my code:

#Load SharePoint CSOM Assemblies
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.SharePoint.Client.dll" 
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.SharePoint.Client.Runtime.dll" 
Add-Type -Path "C:\Program Files\SharePoint Online Management Shell\Microsoft.Online.SharePoint.PowerShell\Microsoft.Online.SharePoint.Client.Tenant.dll" 
    
#Function to Generate Report on all documents in a SharePoint Online Site Collection
Function Get-SPODocumentInventory($SiteURL)
{
    Try {
        #Setup the context
        $Ctx = New-Object Microsoft.SharePoint.Client.ClientContext($SiteURL)
        $Ctx.Credentials = $Credentials
    
        #Get the web from given URL and its subsites
        $Web = $Ctx.web
        $Ctx.Load($Web)
        $Ctx.Load($Web.Lists)
        $Ctx.Load($web.Webs)
        $Ctx.executeQuery()
  
        #Array of System Lists and Libraries to skip
        $SystemLists =@("Converted Forms", "Master Page Gallery", "Customized Reports", "Form Templates", "List Template Gallery", "Theme Gallery",
               "Reporting Templates", "Solution Gallery", "Style Library", "Web Part Gallery","Site Assets", "wfpub", "Site Pages", "Images")
      
        Write-host -f Yellow "Processing Site: $SiteURL"
  
        #Filter Document Libraries to Scan
        $Lists = $Web.Lists | Where {$_.BaseType -eq "DocumentLibrary" -and $_.Hidden -eq $false -and $SystemLists -notcontains $_.Title -and $_.ItemCount -gt 0}
        #Loop through each document library
        Foreach ($List in $Lists)
        {
            #Define CAML Query to Get List Items in batches
            $Query = New-Object Microsoft.SharePoint.Client.CamlQuery
            $Query.ViewXml ="
                <View Scope='RecursiveAll'>
                   <Query>
                      <OrderBy><FieldRef Name='ID' Ascending='TRUE'/></OrderBy>
                   </Query>
                   <RowLimit Paged='TRUE'>$BatchSize</RowLimit>
                </View>"
  
            Write-host -f Cyan "`t Processing Document Library: '$($List.Title)' with $($List.ItemCount) Item(s)"
  
            Do {
                #Get List items
                $ListItems = $List.GetItems($Query)
                $Ctx.Load($ListItems)
                $Ctx.ExecuteQuery()
 
                #Filter Files
                $Files = $ListItems | Where { $_.FileSystemObjectType -eq "File"}
 
                #Iterate through each file and get data
                $DocumentInventory = @()
                Foreach($Item in $Files)
                {
                    $File = $Item.File
                    $Ctx.Load($File)
                    $Ctx.ExecuteQuery()
  
                    $DocumentData = New-Object PSObject
                    $DocumentData | Add-Member NoteProperty SiteURL($SiteURL)
                    $DocumentData | Add-Member NoteProperty DocLibraryName($List.Title)
                    $DocumentData | Add-Member NoteProperty FileName($File.Name)
                    $DocumentData | Add-Member NoteProperty FileURL($File.ServerRelativeUrl)
                    $DocumentData | Add-Member NoteProperty CreatedBy($Item["Author"].Email)
                    $DocumentData | Add-Member NoteProperty CreatedOn($File.TimeCreated)
                    $DocumentData | Add-Member NoteProperty ModifiedBy($Item["Editor"].Email)
                    $DocumentData | Add-Member NoteProperty LastModifiedOn($File.TimeLastModified)
                    $DocumentData | Add-Member NoteProperty FileIcon($File.DocIcon)
                    $DocumentData | Add-Member NoteProperty GUID($List.Id)

                    #Add the result to an Array
                    $DocumentInventory += $DocumentData
                }
                #Export the result to CSV file
                $DocumentInventory | Export-CSV $ReportOutput -NoTypeInformation -Append
                $Query.ListItemCollectionPosition = $ListItems.ListItemCollectionPosition
            } While($Query.ListItemCollectionPosition -ne $null)
        }
           
        #Iterate through each subsite of the current web and call the function recursively
        ForEach ($Subweb in $Web.Webs)
        {
            #Call the function recursively to process all subsites underneath the current web
            Get-SPODocumentInventory($Subweb.url)
        }
    }
    Catch {
        Write-Host -f Red "Error generating CSV. Please contact your administrator:" $_.Exception.Message
    }
}
   
#Config Parameters
$SiteCollURL="https://wvbss.sharepoint.com/sites/AH1"
$ReportOutput="C:\temp\BubEx1.csv"
$BatchSize = 500
 
#Setup Credentials to connect
$Cred= Get-Credential
$Credentials = New-Object Microsoft.SharePoint.Client.SharePointOnlineCredentials($Cred.Username, $Cred.Password)
   
#Delete the Output Report, if exists
if (Test-Path $ReportOutput) { Remove-Item $ReportOutput }
   
#Call the function
Get-SPODocumentInventory $SiteCollURL

Accepted answer
  1. Anonymous
    2023-06-28T05:42:47.41+00:00

    Hi @Sayantan Ganguly,

    A 503 error can occur when a large number of requests are sent to the server in a short time. SharePoint Online throttles clients that exceed its usage limits and rejects the excess requests, typically with 429 or 503 responses.

    In your case, fetching metadata for roughly 95,000 documents and exporting it to a CSV file generates a very large number of requests, so the server cannot respond to all of them and returns 503 errors.
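    A large share of those requests comes from the inner loop in your script: it calls ExecuteQuery once per file, i.e. one server round trip per document, which is about 95,000 round trips in total. The same metadata is already present on the list items loaded in each batch, so the per-file calls can be avoided. The following is an illustrative sketch only; FileLeafRef, FileRef, Created, and Modified are standard SharePoint internal field names, but please verify them against your libraries:

```powershell
# Sketch: read metadata from the already-loaded list item fields instead of
# loading $Item.File with a separate ExecuteQuery per document.
# Field names are standard internal names; verify them in your tenant.
Foreach ($Item in $Files)
{
    $DocumentData = New-Object PSObject
    $DocumentData | Add-Member NoteProperty FileName($Item["FileLeafRef"])
    $DocumentData | Add-Member NoteProperty FileURL($Item["FileRef"])
    $DocumentData | Add-Member NoteProperty CreatedBy($Item["Author"].Email)
    $DocumentData | Add-Member NoteProperty CreatedOn($Item["Created"])
    $DocumentData | Add-Member NoteProperty ModifiedBy($Item["Editor"].Email)
    $DocumentData | Add-Member NoteProperty LastModifiedOn($Item["Modified"])

    #Add the result to the array
    $DocumentInventory += $DocumentData
}
```

    With this change, each batch of $BatchSize items needs only one ExecuteQuery instead of one per file, which drastically reduces the request volume.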

    Here are the recommended steps to check and debug the issue:

    1. The official documentation describes SharePoint Online's request limits and how to avoid getting throttled or blocked; please review it first: https://learn.microsoft.com/en-us/sharepoint/dev/general-development/how-to-avoid-getting-throttled-or-blocked-in-sharepoint-online

    2. We also need to understand the limitations of CSOM: https://learn.microsoft.com/en-us/office/client-developer/project/what-the-csom-does-and-does-not-do

    3. Set $BatchSize to a smaller value, such as 200. Smaller batches mean fewer items per request, which reduces the load each call places on the server and makes 503 errors less likely, at the cost of more round trips.

    4. Add a delay between batches: a Start-Sleep command between batch queries gives the server time to recover, reducing the chance of 503 errors. Here's an example of adding a delay to your script:

    Do {
        #Get List items
        $ListItems = $List.GetItems($Query)
        $Ctx.Load($ListItems)
        $Ctx.ExecuteQuery()
    
        #Filter Files
        $Files = $ListItems | Where { $_.FileSystemObjectType -eq "File" }
    
        #Iterate through each file and get data
        $DocumentInventory = @()
        Foreach ($Item in $Files) {
            $File = $Item.File
            $Ctx.Load($File)
            $Ctx.ExecuteQuery()
    
            # ...
    
            #Add the result to an Array
            $DocumentInventory += $DocumentData
        }
    
        #Export the result to CSV file
        $DocumentInventory | Export-CSV $ReportOutput -NoTypeInformation -Append
        $Query.ListItemCollectionPosition = $ListItems.ListItemCollectionPosition
    
        #Introduce a delay between batches
        Start-Sleep -Seconds 1  # Adjust the delay as needed
    } While ($Query.ListItemCollectionPosition -ne $null)
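    If you still see occasional 503 responses, you can additionally wrap each ExecuteQuery call in a small retry helper that backs off and honors the Retry-After header the server sends when it throttles. This is only a sketch (Invoke-WithRetry is an illustrative name, not a built-in cmdlet); adjust the status codes, retry count, and delays as needed:

```powershell
# Illustrative retry helper (not a built-in cmdlet): retries a script block
# when the server throttles with HTTP 429 or 503, honoring the Retry-After
# header when the server supplies it, otherwise backing off exponentially.
Function Invoke-WithRetry {
    param(
        [scriptblock]$Action,
        [int]$MaxRetries = 5
    )
    $Attempt = 0
    While ($true) {
        Try {
            return & $Action
        }
        Catch [System.Net.WebException] {
            $Attempt++
            $Response = $_.Exception.Response
            $Status = If ($Response) { [int]$Response.StatusCode } Else { 0 }
            #Rethrow anything that is not throttling, or when retries are exhausted
            If (($Status -ne 429 -and $Status -ne 503) -or $Attempt -ge $MaxRetries) { Throw }
            $RetryAfter = $Response.Headers["Retry-After"]
            $Delay = If ($RetryAfter) { [int]$RetryAfter } Else { [int][Math]::Pow(2, $Attempt) }
            Write-Host -f Yellow "Throttled (HTTP $Status). Retrying in $Delay second(s)..."
            Start-Sleep -Seconds $Delay
        }
    }
}

# Usage: replace each $Ctx.ExecuteQuery() with
#   Invoke-WithRetry { $Ctx.ExecuteQuery() }
```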
    
    
    

    When dealing with this many files, these adjustments are necessary to keep the script within the service limits and avoid request failures, even though they will make the run take noticeably longer. There is currently no alternative that avoids the longer execution time while staying under the throttling thresholds, so please allow the program to run to completion.

    If you have any questions or need further assistance, please don't hesitate to contact us.


    Best Regards

    Cheng Feng

