Thursday, April 25, 2013

Manage deletion of index items (SharePoint Server 2010)

While reconfiguring search and validating the search functionality, I discovered around 15,000 errors in the crawl log.
The reason was that I had changed the architecture of a specific site collection. In my first setup, I had a site collection with about 55 sub-sites. Each sub-site had its own document library and security profile.

I had already migrated about 15,000 documents into some of these sub-sites when I realized this structure wasn't going to work for me.
Maintaining these sub-sites and creating templates for them took too much effort, and SharePoint itself had problems with them: I could not get the import server to see the sub-sites so it could import data into them, and some users complained they could not see the sub-sites even though they had access to them.

So I decided to delete all the sub-sites (thank god for PowerShell) and created 55 document libraries under the main site collection.
Of course, the search crawler had already crawled the old content (with the sub-sites), so the next scheduled full crawl could not find the 15K documents and logged 15K errors!

Thanks to this great TechNet article, I found the correct properties to change in the SharePoint 2010 Search Service application. It was quite a challenge because of the many other documented problems with SP2010 Search (404 not found -> loopback check, etc.), but after a while I found the relevant information:

Delete policy for access denied or file not found

ErrorDeleteCountAllowed: default value 30
ErrorDeleteIntervalAllowed: default value 720 hours (30 days)

So it would have taken SharePoint 30 + 30 days (2 months!) before it removed the references and resolved the errors automatically...
I lowered the numbers a bit and saw this morning that the crawler had deleted the 15K references :)

Next: find out why the SearchContentAccess account can't access 2 site collections in a web application whose policy grants this account Full Read access (it works for the other 50+ site collections...)
Cheers, Jeroen

Command to change the policy-parameter values:

$SearchApplication = Get-SPEnterpriseSearchServiceApplication -Identity "<SearchServiceApplicationName>"
$SearchApplication.SetProperty("<PropertyName>", <NewValue>)
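
Filled in for the two delete-policy properties listed above, the template could look roughly like the sketch below. The application name "Search Service Application" and the values 3 and 24 are placeholders chosen for illustration, not the numbers actually used here, and the GetProperty calls assume that method is available on the search service application object in your environment.

```powershell
# Sketch only: run in the SharePoint 2010 Management Shell with farm administrator rights.
# "Search Service Application" is a placeholder name; replace it with your own.
$SearchApplication = Get-SPEnterpriseSearchServiceApplication -Identity "Search Service Application"

# Show the current values before changing anything (assumes GetProperty is available).
$SearchApplication.GetProperty("ErrorDeleteCountAllowed")
$SearchApplication.GetProperty("ErrorDeleteIntervalAllowed")

# Illustrative values only: number of failed crawls, and interval in hours.
$SearchApplication.SetProperty("ErrorDeleteCountAllowed", 3)
$SearchApplication.SetProperty("ErrorDeleteIntervalAllowed", 24)
```

Lower values mean the crawler gives up on unreachable items sooner, so it is probably wise to note the original values and restore them once the stale references are gone.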