KN and Desktop Search

Recently, we received a comment on the KN team blog asking how the KN client analysis process relates to Windows Desktop Search and expressing concern about the performance implications of running both Desktop Search and KN analysis. Since we get these questions frequently, I’ve decided to answer them here rather than in the comments section. Also, since this is my first blog entry, let me first introduce myself. I’m Glen Anderson, Group Program Manager for the KN team. I had a chance to meet some of you during my KN “Under the Hood” session at the SharePoint conference earlier this year, and more of you will “see” me in an upcoming Channel 9 interview.

For those who want to cut to the chase, the simple answer is “with KN V1 there is no dependency between the KN analysis engine and the Windows Desktop Search indexing engine.” They are two different processes that can run in parallel. An important point here is that we strongly encourage you to run KN analysis only after the initial indexing of your desktop is complete. This applies to any Desktop Search indexing solution, Microsoft or otherwise. For KN V2 we will evaluate deeper integration between the KN mail syncing component (this makes more sense after you read my longer answer below) and Desktop Search indexing.

The longer answer is more involved (as always). Although KN and Windows Desktop Search seem very similar on the surface, their requirements are quite different. The KN analysis process consists of four distinct steps from a logical perspective. (From a physical perspective we are still optimizing the KN algorithms for performance prior to final release.) These steps take place when you first run your KN client and on each subsequent “incremental” analysis:

1. Synchronization: Your KN client scans your emails from the folders you select. Keyword and contact data is captured locally per email.

2. Contact resolution: Your KN client compares each unique contact entry found in your email headers against your Outlook Address Book to determine whether the contact is internal to your organization, external, or unresolved.

3. Update: Your KN client calculates statistics across all the keyword and contact data captured in steps 1 and 2.

4. Recommendation: Your KN client “recommends” your profile to you, applying some of our “special sauce” along the way.
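
To make the four logical steps a little more concrete, here is a minimal Python sketch of how they chain together. This is not the KN implementation. Every name in it (Email, synchronize, resolve_contacts, update_statistics, recommend) is hypothetical, the keyword capture is a naive word split rather than real noun phrase extraction, the internal/external rule based on an email domain is purely an assumption for illustration, and the “special sauce” is deliberately left out.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Email:
    """Hypothetical stand-in for a message read from a selected folder."""
    sender: str
    recipients: List[str]
    body: str


def synchronize(emails: List[Email]) -> List[dict]:
    """Step 1: capture keyword and contact data locally, per email."""
    records = []
    for mail in emails:
        # Assumption: a naive word split stands in for real keyword capture.
        keywords = [w.lower() for w in mail.body.split() if len(w) > 3]
        contacts = [mail.sender] + list(mail.recipients)
        records.append({"keywords": keywords, "contacts": contacts})
    return records


def resolve_contacts(records: List[dict], address_book: Set[str],
                     internal_domain: str) -> Dict[str, str]:
    """Step 2: classify each unique contact against an address book."""
    unique = {c for r in records for c in r["contacts"]}
    resolved = {}
    for contact in unique:
        if contact not in address_book:
            resolved[contact] = "unresolved"
        elif contact.endswith("@" + internal_domain):
            # Assumption: "internal" is decided by domain in this sketch.
            resolved[contact] = "internal"
        else:
            resolved[contact] = "external"
    return resolved


def update_statistics(records: List[dict]):
    """Step 3: aggregate statistics over everything captured so far."""
    keyword_counts = Counter(k for r in records for k in r["keywords"])
    contact_counts = Counter(c for r in records for c in r["contacts"])
    return keyword_counts, contact_counts


def recommend(keyword_counts: Counter, contact_counts: Counter,
              resolved: Dict[str, str], top_n: int = 3) -> dict:
    """Step 4: propose a profile from the most frequent items."""
    contacts = [c for c, _ in contact_counts.most_common()
                if resolved[c] != "unresolved"]
    return {
        "keywords": [k for k, _ in keyword_counts.most_common(top_n)],
        "contacts": contacts[:top_n],
    }
```

Note that only synchronize touches the raw mail. The other three steps work entirely on the data captured locally in step 1.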

So to start with, you can see that it is really only the “synchronization” step that is similar in nature to Desktop Search indexing. If you look a level deeper and evaluate the form of data that KN processes, you can also see that the requirements are different even in the “synchronization” step. A search index is typically composed of individual “broken” keywords with links, positions, counts, etc. KN processes “unbroken” paragraphs from the email body for “noun phrase” extraction. Access to this unbroken information is not reliably available across all Desktop Search indexing solutions. KN would have to use MAPI to get the unbroken information anyway, so each message would still be processed twice.
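
The difference between “broken” and “unbroken” data is easier to see side by side. The sketch below is illustrative only: build_positional_index is a toy positional index and not how Windows Desktop Search or any other product stores its data, and split_paragraphs simply stands in for whatever hands KN the intact text. Both function names are hypothetical.

```python
import re
from collections import defaultdict


def build_positional_index(text):
    """Toy "broken" form: each keyword maps to the positions where it occurs.

    Case, punctuation, and sentence boundaries are discarded on the way in.
    """
    index = defaultdict(list)
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for position, token in enumerate(tokens):
        index[token].append(position)
    return dict(index)


def split_paragraphs(body):
    """Toy "unbroken" form: paragraphs kept whole, with their original
    casing and punctuation, ready for phrase-level analysis."""
    return [p.strip() for p in re.split(r"\n\s*\n", body) if p.strip()]


body = "The Search team met. Team leads agreed.\n\nNext review is in June."
index = build_positional_index(body)
paragraphs = split_paragraphs(body)
```

In this example the index records that “team” occurs at two positions, but it no longer knows that a sentence ended between “met” and “Team”. The paragraph list keeps exactly that information, which is the kind of context phrase extraction depends on.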

The performance characteristics of KN analysis and Desktop Search indexing are also quite different. The first time through, both systems need to process all the email data selected by the user. But that is where the similarity ends. Windows Desktop Search (WDS) updates its index with new items as they arrive, based on notifications. Users usually prefer that new items be available to Desktop Search as soon as possible. In contrast, incremental KN analysis does not need to be real time and is executed on a scheduled basis only (default is every 14 days). As a user, you usually don’t need to update your expertise profile or social network on an hourly or even daily basis. Your social network typically just doesn’t change significantly that often. So after the initial KN analysis, there is very little cause for concern about performance issues related to having both the KN client and a Desktop Search solution on your machine. For more information about Windows Desktop Search behavior see the Windows Desktop Search Administration Guide.
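
As a rough illustration of what “scheduled basis only” means in practice, a scheduled check can be as simple as the sketch below. The function name analysis_due and its parameters are hypothetical and say nothing about how the KN client actually schedules work. Only the 14-day default comes from the text above.

```python
from datetime import datetime, timedelta

# The default interval mentioned above; everything else here is assumed.
DEFAULT_INTERVAL = timedelta(days=14)


def analysis_due(last_run, now, interval=DEFAULT_INTERVAL):
    """Return True when a scheduled incremental analysis should run.

    last_run is None when no analysis has run yet (the initial analysis).
    """
    return last_run is None or now - last_run >= interval


now = datetime(2024, 1, 15)
run_needed = analysis_due(datetime(2024, 1, 1), now)
```

A notification-driven indexer, by contrast, does a small amount of work every time an item arrives, so the two workloads overlap only occasionally once the initial pass is done.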

A final note on performance: our KN Dev team has already made great improvements in the performance of the KN analysis process in the last month or so, both in terms of the time to complete client analysis and in terms of disk I/Os generated. So those of you who have had the chance to get your hands on the KN beta client will see significant additional performance benefits by the time we complete the KN final release.