SharePoint 2013: Crawl does not happen with error m_DocumentFeeder->Init failed with 0x80131537
Worked with a customer on a SharePoint 2013 farm crawling SharePoint 2010 sites and also crawling content from Open Text. They had database connectivity issues over the weekend. After this, none of the crawls progressed: the status read "Crawling Full" but the success/error count stayed at 0. A crawl on any of the content sources would run for roughly an hour without a single success.
Symptoms:
In a topology of 7 servers with 4 content processing components and an existing index of, say, 7 million items, freshness of searchable items clearly matters. The issue here is that the existing items remain searchable without problems, but further crawls do not work.
Crawl rate: The foremost indication is the crawl rate remaining at zero items throughout the crawl period. This should immediately catch your attention when you start a crawl on any content source.
Services restart: The natural second step is to stop the Search Host Controller and OSearch services and then start OSearch (which brings the host controller back up as well). This does not make the crawl successful either.
New content source test: Create a new NTLM-based web application, add a content source for it, and test the behavior. Since the content access account must be a domain account, the crawler should be able to fetch content from the new web application. If the crawl fails on the new web application/content source as well, the inference is that the issue lies not with the SSA configuration itself but with the components of its topology.
A quick run of the following command lists the current server locations of those components.
(Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchTopology).GetComponents() | select Name, servername
| Name | ServerName |
| ---- | ---------- |
| CrawlComponent0 | Server3 |
| CrawlComponent1 | Server4 |
| CrawlComponent2 | Server5 |
| CrawlComponent3 | Server6 |
| IndexComponent2 | Server7 |
| IndexComponent1 | Server1 |
| IndexComponent2 | Server2 |
| AdminComponent1 | Server3 |
| AdminComponent2 | Server4 |
| ContentProcessingComponent1 | Server3 |
| ContentProcessingComponent2 | Server4 |
| ContentProcessingComponent3 | Server5 |
| ContentProcessingComponent4 | Server6 |
| AnalyticsProcessingComponent0 | Server3 |
| AnalyticsProcessingComponent1 | Server4 |
| AnalyticsProcessingComponent2 | Server5 |
| AnalyticsProcessingComponent3 | Server6 |
| QueryProcessingComponent0 | Server1 |
| QueryProcessingComponent1 | Server2 |
| QueryProcessingComponent2 | Server7 |
In the ULS logs, CSSFeedersManager initializes the content processing components after verifying the CGatherer object with a PingCrawl (thread 0x57D0 below). In the same 0x57D0 thread sequence we see the message m_DocumentFeeder->Init failed with 0x80131537:
08/18/2015 19:52:31.77 mssearch.exe (0xBFDC) 0x57D0 SharePoint Server Search Crawler:Gatherer Plugin e5bn Verbose CGatherer::PingCrawl 2, component 938dfd17-de71-4c33-8c25-c54075b96d00-crawl-1 [gatherobj.cxx:2492] search\native\gather\server\gatherobj.cxx
08/18/2015 19:52:31.77 mssearch.exe (0xBFDC) 0x57D0 SharePoint Server Search Crawler:Content Plugin af7x6 High CSSFeedersManager::Init: addresses = net.tcp://Server3/AF3440/ContentProcessingComponent1/ContentSubmissionServices/content, net.tcp://Server4/AF3440/ContentProcessingComponent2/ContentSubmissionServices/content, net.tcp:///AF3440/ContentProcessingComponent3/ContentSubmissionServices/content, net.tcp:///AF3440/ContentProcessingComponent4/ContentSubmissionServices/content
08/18/2015 19:52:31.77 mssearch.exe (0xBFDC) 0x57D0 SharePoint Server Search Crawler:Content Plugin ab3jl High m_DocumentFeeder->Init failed with 0x80131537 [contentpiobj.cxx:386] search\native\gather\plugins\contentpi\contentpiobj.cxx
In our case, the list of content processing component addresses looks peculiar. From the earlier component list we expect ContentProcessingComponent1 and 2 on Server3 and Server4, and ContentProcessingComponent3 and 4 on Server5 and Server6. But the server names Server5 and Server6 are missing from the net.tcp addresses. For instance, we see net.tcp:///AF3440/ContentProcessingComponent3 instead of net.tcp://Server5/AF3440/ContentProcessingComponent3. This is the litmus test: it tells us that the server names for these components are, for some reason, not populated in the active topology.
We can also confirm this via a simple Process Monitor trace filtered on mssearch.exe. The RegQueryValue happens on the following registry key:
HKLM\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications\<SearchServiceAppGuid>\CatalogNames\FastConnector:ContentDistributor
6:48:27.6165187 AM mssearch.exe 49116 RegCloseKey HKLM\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications\938dfd17-de71-4c33-8c25-c54075b96d00 SUCCESS
6:48:27.6165515 AM mssearch.exe 49116 RegCloseKey HKLM\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications\938dfd17-de71-4c33-8c25-c54075b96d00 SUCCESS
6:48:27.6165843 AM mssearch.exe 49116 RegCloseKey HKLM\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications SUCCESS
6:48:27.6166170 AM mssearch.exe 49116 RegQueryValue HKLM\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications\938dfd17-de71-4c33-8c25-
6:48:27.6167305 AM mssearch.exe 49116 RegQueryValue HKLM\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications\938dfd17-de71-4c33-8c25-c54075b96d00\CatalogNames\FastConnector:ContentDistributor SUCCESS Type: REG_SZ, Length: 668, Data:
net.tcp://Server3/AF3440/ContentProcessingComponent1/ContentSubmissionServices/content,
net.tcp://Server4/AF3440/ContentProcessingComponent2/ContentSubmissionServices/content,
net.tcp:///AF3440/ContentProcessingComponent3/ContentSubmissionServices/content,
net.tcp:///AF3440/ContentProcessingComponent4/ContentSubmissionServices/content
As can be seen, the server names are missing here as well.
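As an alternative to Process Monitor, the same registry value can be read directly with PowerShell. A minimal sketch, assuming you run it on a server hosting a crawl component (the SSA GUID below is the one from this case; substitute your own):

```powershell
# Read the ContentDistributor address list straight from the registry.
# The GUID is this case's SSA ID -- replace it with your own SSA GUID.
$ssaId = "938dfd17-de71-4c33-8c25-c54075b96d00"
$key   = "HKLM:\SOFTWARE\Microsoft\Office Server\15.0\Search\Applications\$ssaId\CatalogNames"
$value = (Get-ItemProperty -Path $key -Name "FastConnector:ContentDistributor")."FastConnector:ContentDistributor"

# Flag any address with an empty host: net.tcp:/// means the server name is missing.
$value -split ',' | ForEach-Object {
    if ($_ -match 'net\.tcp:///') { Write-Warning "Missing host: $_" } else { $_ }
}
```

Any line reported as "Missing host" corresponds to a component whose server name was not populated in the active topology.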
Solution
In SharePoint 2013, this value comes from the SSA's Search Admin database, specifically the MSSConfiguration table, in the rows whose Value contains net.tcp addresses:
SELECT [Name]
,[Value]
,[LastModified]
FROM [SSA_AdminDB].[dbo].[MSSConfiguration]
where convert(nvarchar, Value ) like '%net.tcp%'
We expect to get two rows, one for the content processing components and one for the query processing components, and these values should coincide with what is present in the registry entries mentioned above.
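For those who prefer scripting the check rather than opening SQL Server Management Studio, the same query can be issued from PowerShell via .NET's SqlClient. A sketch under the assumption that your SQL instance is reachable as "SQLSERVER" and the Search Admin database is named SSA_AdminDB (both are placeholders; substitute your own):

```powershell
# Placeholders: replace SQLSERVER and SSA_AdminDB with your instance/database.
$connStr = "Server=SQLSERVER;Database=SSA_AdminDB;Integrated Security=True"
$conn = New-Object System.Data.SqlClient.SqlConnection $connStr
$conn.Open()
try {
    $cmd = $conn.CreateCommand()
    # Same query as above, mirrored here for convenience.
    $cmd.CommandText = @"
SELECT [Name], CONVERT(nvarchar(max), [Value]) AS [Value], [LastModified]
FROM dbo.MSSConfiguration
WHERE CONVERT(nvarchar(max), [Value]) LIKE '%net.tcp%'
"@
    $reader = $cmd.ExecuteReader()
    while ($reader.Read()) {
        # Print each row's name and the net.tcp address list it holds.
        "{0}`n  {1}" -f $reader["Name"], $reader["Value"]
    }
} finally {
    $conn.Close()
}
```

As always with SharePoint databases, read-only queries like this are for diagnosis; never modify the tables directly.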
| Name | Value |
| ---- | ----- |
| 938dfd17-de71-4c33-8c25-c54075b96d00\CatalogNames\FastConnector:ContentDistributor | net.tcp://Server3/AF3440/ContentProcessingComponent1/ContentSubmissionServices/content, net.tcp://Server4/AF3440/ContentProcessingComponent2/ContentSubmissionServices/content, net.tcp:///AF3440/ContentProcessingComponent3/ContentSubmissionServices/content, net.tcp:///AF3440/ContentProcessingComponent4/ContentSubmissionServices/content |
| ImsQueryInternalUri | net.tcp://Server1/AF3440/QueryProcessingComponent1/ImsQueryInternal; net.tcp://Server2/AF3440/QueryProcessingComponent2/ImsQueryInternal; net.tcp://Server7/AF3440/QueryProcessingComponent3/ImsQueryInternal; |
From here you can take one of two routes.
Scenario 1:
If the values are fine on the SQL side and yet the registry keys are still wrong, then the polling job-application-server-admin-service timer job, which executes once every minute, did not update them. The next step is to check in the ULS logs whether this job ran on the affected servers.
| Time | Process | EventID | Level | Message | Correlation |
| ---- | ------- | ------- | ----- | ------- | ----------- |
| 8/18/2015 19:53 | OWSTIMER.EXE (0x7484) | xmnv | Medium | Name=Timer Job job-application-server-admin-service | 8256259d-c875-7067-44f2-e6e7a47a74fa |
| 8/18/2015 19:54 | OWSTIMER.EXE (0x7484) | xmnv | Medium | Name=Timer Job job-application-server-admin-service | 9156259d-8828-7067-44f2-ec6295432737 |
| 8/18/2015 19:55 | OWSTIMER.EXE (0x7484) | xmnv | Medium | Name=Timer Job job-application-server-admin-service | 9f56259d-28b8-7067-44f2-e03e8ba63a84 |
| 8/18/2015 19:56 | OWSTIMER.EXE (0x7484) | xmnv | Medium | Name=Timer Job job-application-server-admin-service | ae56259d-e86a-7067-44f2-e54caca2ccd5 |
| 8/18/2015 19:57 | OWSTIMER.EXE (0x7484) | xmnv | Medium | Name=Timer Job job-application-server-admin-service | bd56259d-a81b-7067-44f2-ef63668b17b7 |
Alternatively, we can take several corrective measures: check whether the timer service instances are running on these servers via (Get-SPFarm).TimerService.Instances, restart the timer service on these boxes, or clear the configuration cache as described at https://blogs.msdn.com/b/josrod/archive/2007/12/12/clear-the-sharepoint-configuration-cache-for-timer-job-and-psconfig-errors.aspx and take it from there. Often we just have to get that job to run again and things are fixed.
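A quick sketch of those checks from a SharePoint 2013 management shell (SPTimerV4 is the standard service name for the SharePoint timer service):

```powershell
# List the timer service instance on every server in the farm and flag any
# instance that is not Online.
(Get-SPFarm).TimerService.Instances | ForEach-Object {
    if ($_.Status -ne 'Online') {
        Write-Warning ("Timer service not online on {0}: {1}" -f $_.Server.Name, $_.Status)
    } else {
        "{0}: {1}" -f $_.Server.Name, $_.Status
    }
}

# On an affected server, restart the SharePoint 2013 timer service locally.
Restart-Service -Name SPTimerV4
```

Once the timer service is healthy, the job-application-server-admin-service job should run within a minute and push the SQL values back into the registry.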
Scenario 2:
In the second case, the values in the SQL table are also wrong. These components are then never usable and the crawl will never run, since the ping to these components always fails with the message m_DocumentFeeder->Init failed with 0x80131537.
This means we have to modify the topology: remove the components, recreate them on the same machines, and activate the topology again.
Please note that modifying the topology is a very CPU-intensive operation that causes downtime for crawl and query. Also note that we clone the topology onto exactly the same machines. If a topology is activated with index components moved to other servers, there is a good chance the index copy will either take a very long time or fail and corrupt the index. Although a topology clone-and-switch is doable and there are methods for it, we do not suggest it here. In our case the content processing components are the lowest-impact components to remove and add back; if the issue were with an index component, this would not be the way out.
Modifying the current topology, removing the content processing components and adding them back, can be achieved with the following script:
$ssa = Get-SPEnterpriseSearchServiceApplication
$active = $ssa.ActiveTopology
$clone = $active.Clone()
$cpc4_old = $clone.GetComponents() | ?{$_.name -match 'contentprocessingcomponent4'} #Server4
$cpc3_old = $clone.GetComponents() | ?{$_.name -match 'contentprocessingcomponent3'} #Server3
$cpc2_old = $clone.GetComponents() | ?{$_.name -match 'contentprocessingcomponent2'} #Server2
$cpc1_old = $clone.GetComponents() | ?{$_.name -match 'contentprocessingcomponent1'} #Server1
$Server4= Get-SPEnterpriseSearchServiceInstance -Identity Server4
$Server3 = Get-SPEnterpriseSearchServiceInstance -Identity Server3
$Server2 = Get-SPEnterpriseSearchServiceInstance -Identity Server2
$Server1 = Get-SPEnterpriseSearchServiceInstance -Identity Server1
$clone.RemoveComponent($cpc4_old)
$clone.RemoveComponent($cpc3_old)
$clone.RemoveComponent($cpc2_old)
$clone.RemoveComponent($cpc1_old)
$clone.GetComponents()
$cpc4_new = New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $Server4
$cpc3_new = New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $Server3
$cpc2_new = New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $Server2
$cpc1_new = New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $Server1
$clone.GetComponents() | ?{$_.name -match 'contentprocessingcomponent'}
$clone.Activate()
Once we have modified the topology, we can go back and verify the current values in the MSSConfiguration table again. In our case the values were corrected:
| Name | Value |
| ---- | ----- |
| 938dfd17-de71-4c33-8c25-c54075b96d00\CatalogNames\FastConnector:ContentDistributor | net.tcp://Server4/AF3440/ContentProcessingComponent5/ContentSubmissionServices/content, net.tcp://Server3/AF3440/ContentProcessingComponent6/ContentSubmissionServices/content, net.tcp://Server2/AF3440/ContentProcessingComponent7/ContentSubmissionServices/content, net.tcp://Server1/AF3440/ContentProcessingComponent8/ContentSubmissionServices/content |
| ImsQueryInternalUri | net.tcp://Server1/AF3440/QueryProcessingComponent1/ImsQueryInternal; net.tcp://Server2/AF3440/QueryProcessingComponent2/ImsQueryInternal; net.tcp://Server7/AF3440/QueryProcessingComponent3/ImsQueryInternal; |
We should see the same values reflected at the registry level as well. We can re-provision the SSA using (Get-SPEnterpriseSearchServiceApplication).Provision() for the changes to flow through.
Crawls can now be restarted, and they will progress with successes.
Post By : Ramanathan Rajamani [MSFT]