FOXIT vs. Adobe PDF IFilter [ 32-bit only ]
Some time back I had the chance to run a performance and international sufficiency analysis on the Adobe and FOXIT IFilters for some of our customers. The following report is now available to a broader audience.
PERFORMANCE ANALYSIS OF 32-BIT FOXIT PDF IFILTER vs. ADOBE PDF IFILTER
Machine : Intel Xeon CPU @ 1.4 GHz (4 hyperthreaded processors)
4.00 GB of RAM
32-bit Win2K3 SP1
Indexer performance set to partly reduced.
| Metric | FOXIT v1.0 | ADOBE v.8 |
|---|---|---|
| Total # of PDF documents | 10917 | 10917 |
| # successful crawls | 10871 | 10909 |
| # errors | 44 (expired ebooks etc) | 0 |
| # warnings | 2 (corrupted doc) | 2 (corrupted doc) |
| Crawl time: Portal Content | 00:49:21.163 | 03:34:39.237 |
| Crawl time: Anchor Crawl 1 | 00:02:03.527 | 00:02:39.073 |
| Crawl time: Anchor Crawl 2 | 00:00:02.173 | 00:00:02.437 |
| TOTAL Crawl Time | 00:51:26.863 (~ 51 minutes) | 03:38:00.747 (~ 218 minutes) |
Analysis:
1. The FOXIT filter is roughly 4.2 times as fast as the Adobe filter on a quad-processor machine (about 51 minutes versus about 218 minutes of total crawl time). This is expected, since the Adobe filter is not truly multithreaded and serializes the threads.
2. The Adobe filter crawls some documents which ideally should not be crawled (expired ebooks etc).
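The speed ratio in point 1 can be rechecked from the total crawl times in the results table. The snippet below is only a minimal sketch: the helper name `parse_hms` is made up for illustration, and the two totals are copied from the table as reported. Dividing the rounded minute figures (218 / 51) gives about 4.27, while the millisecond totals give a slightly lower ratio.

```python
from datetime import timedelta


def parse_hms(text: str) -> timedelta:
    """Parse an 'HH:MM:SS.mmm' duration as printed in the crawl results."""
    hours, minutes, seconds = text.split(":")
    return timedelta(hours=int(hours), minutes=int(minutes), seconds=float(seconds))


# Total crawl times copied from the results table (as reported, not re-measured).
foxit_total = parse_hms("00:51:26.863")
adobe_total = parse_hms("03:38:00.747")

# Dividing two timedeltas yields a plain float: how many times longer Adobe took.
speedup = adobe_total / foxit_total

print(f"FOXIT total: {foxit_total.total_seconds() / 60:.1f} min")
print(f"Adobe total: {adobe_total.total_seconds() / 60:.1f} min")
print(f"Adobe / FOXIT: {speedup:.2f}")
```

Either way the conclusion is the same: on this machine the Adobe crawl took a little over four times as long as the FOXIT crawl.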
INTL SUFFICIENCY ANALYSIS OF 32-BIT FOXIT PDF IFILTER vs. ADOBE PDF IFILTER
Neither the Adobe nor the FOXIT filter returns the correct locale for non-English documents. Both always emit LOCALE = 1033 (en-us). Hence we pass the extracted text to the neutral wordbreaker, and this compromises search relevance.
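To make the locale values concrete, here is a small sketch of how a Windows LCID such as 1033 breaks down into a primary language and a sublanguage. The function name `split_lcid` and the table of expected values are illustrative assumptions, not part of either filter; the LCIDs listed are the standard identifiers for ja-JP, zh-CN, fr-FR and he-IL, which a locale-aware filter would be expected to report for documents in those languages.

```python
from typing import Tuple


def split_lcid(lcid: int) -> Tuple[int, int]:
    """Split a Windows LCID into (primary language ID, sublanguage ID)."""
    lang_id = lcid & 0xFFFF    # low 16 bits hold the language identifier
    primary = lang_id & 0x3FF  # low 10 bits: primary language
    sublang = lang_id >> 10    # high 6 bits: sublanguage
    return primary, sublang


# Standard LCIDs for the languages discussed here (illustrative expectation,
# not output captured from either filter).
EXPECTED_LCID = {
    "JPN": 1041,  # ja-JP
    "CHS": 2052,  # zh-CN
    "FRE": 1036,  # fr-FR
    "HEB": 1037,  # he-IL
}
EMITTED_LCID = 1033  # en-US, the value both filters report

emitted_primary = split_lcid(EMITTED_LCID)[0]
for language, expected in EXPECTED_LCID.items():
    primary, sublang = split_lcid(expected)
    same_language = primary == emitted_primary
    print(f"{language}: LCID {expected} (0x{expected:04X}) "
          f"primary=0x{primary:02X} sublang=0x{sublang:02X} "
          f"matches emitted language: {same_language}")
```

None of the four expected values shares a primary language with 1033, so an indexer that trusts the emitted locale has no way to pick a language-specific wordbreaker.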
Tests were performed on JPN, CHS, FRE and HEB pdf documents using both the indexer and standalone test tools.
| Language | # Tokens | MOSS returns result with FOXIT? | MOSS returns result with Adobe? | Correct locale emitted by FOXIT? | Correct locale emitted by Adobe? |
|---|---|---|---|---|---|
| JPN | 2 | No | No | No | No |
| CHS | 2 | No | No | No | No |
| FRE | 2 | Yes | Yes | No | No |
| HEB | 2 | Yes | Yes | No | No |
Note that since French is syntactically very close to English, we still get back valid results. In the case of the Hebrew documents, I’d say it’s a matter of coincidence that the token the language expert gave me was correctly wordbroken.
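The difference between the French and the Japanese/Chinese results can be illustrated with a toy example. This is only a sketch under a loud assumption: the neutral wordbreaker is approximated here by plain whitespace splitting, which is certainly cruder than the real component, and the sample sentences and function names are made up for illustration.

```python
def whitespace_tokens(text: str) -> set:
    """Crude stand-in for a neutral wordbreaker: split on whitespace only."""
    return {token.strip(".,;:!?。、").lower() for token in text.split()}


def is_searchable(document: str, query: str) -> bool:
    """A query term is found only if it was indexed as a whole token."""
    return query.lower() in whitespace_tokens(document)


# Made-up sample sentences, not taken from the test corpus.
french = "Le filtre indexe les documents rapidement."
japanese = "文書を検索します。"

# French words are separated by spaces, so the term becomes its own token.
print(is_searchable(french, "documents"))

# Japanese has no spaces between words, so the whole sentence is one token
# and the term inside it is never indexed on its own.
print(is_searchable(japanese, "文書"))
```

With a space-delimited language the fallback happens to produce usable tokens; for Japanese or Chinese it does not, which is consistent with the "No" rows in the table.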
Comments
Anonymous
November 15, 2007
Deb, regarding your comments about the returned locale information: Do the indexed PDF documents actually contain locale information? As far as I know only "Tagged PDF" documents can optionally contain locale information, and those are fairly rare. Regards, Stephan
Anonymous
November 15, 2007
Stephan, thanks for bringing up the point. I was not aware that only Tagged PDF docs contain locale info. A lot of our customers (especially ones in East Asia) complained about poor relevance in search results on localized PDF docs. The reason is that, since we always get back an English locale, the proper WordBreaker and stemmer are never invoked. As Stephan mentioned above, tagging the PDF documents with the correct locale might give better relevance. Any thoughts? :) Regards, Deb.
Anonymous
November 26, 2007
Adobe has finally released the 64-bit version of the IFilter for indexing PDF documents
Anonymous
November 30, 2007
Filter pack - when will it be available? We are now in December and there was an announcement elsewhere that it would be available in July and an announcement here that it would be available in August. If it was released can you tell us where to find it, and if it was not released can you tell us whether it ever will be?
Anonymous
January 21, 2008
it's weird. i can index and search non-english documents with Foxit ifilter. hmmmm.
Anonymous
January 21, 2008
Foxit PDF iFilter Support Chinese/Japanese/Korean PDF documents
Anonymous
July 22, 2008
I often work with customers who are running the 64 bit version of MOSS, and they always want to know
Anonymous
December 09, 2008
I think one of the most common tasks any newborn MOSS environment gets as a christening is an installation
Anonymous
December 09, 2008
After so long a time Adobe finally released its 64bit version of PDF iFilter! http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025
Anonymous
December 09, 2008
Adobe has finally released their free 64-bit PDF iFilter for MOSS. If you do not know what this is for;
Anonymous
December 10, 2008
Adobe has FINALLY released a 64 bit PDF iFilter. For everyone out there running 64 bit MOSS, without
Anonymous
December 15, 2008
PDF iFilter Battle FoxIT v.s. Adobe iFilter 9
Anonymous
January 29, 2009
Somehow I missed this news late last year. Adobe has finally released a 64bit version of its PDF iFilter
Anonymous
May 06, 2009
Windows Share Point 3.0 and IFilter for PDF