Lucene and Content is not allowed in prolog

Hi

We’ve recently upgraded a customer from 6.5.2 to 6.6.1 (templates are still XSL). After building the search catalogue using “search index type” in the ServerAdministrator, I noticed the following error in the console.log file. The field dw_description is a shared textarea field used on pretty much every content type, so the error appears for each content item.

2009-09-11 12:25:48,976 ERROR [PSSearchIndexerImpl] Skipping indexing of field (name:dw_description,unitid:5033).
com.percussion.extension.PSExtensionProcessingException: An exception occurred while processing the "com.percussion.search.lucene.textconverter.PSTextConverterHtml" extension: org.xml.sax.SAXParseException: Content is not allowed in prolog..
	at com.percussion.search.lucene.textconverter.PSTextConverterHtml.getConvertedText(Unknown Source)
	at com.percussion.search.lucene.PSSearchIndexerImpl.o00000(Unknown Source)
	at com.percussion.search.lucene.PSSearchIndexerImpl.o00000(Unknown Source)
	at com.percussion.search.lucene.PSSearchIndexerImpl.o00000(Unknown Source)
	at com.percussion.search.lucene.PSSearchIndexerImpl.update(Unknown Source)
	at com.percussion.search.i.super(Unknown Source)
	at com.percussion.search.i.super(Unknown Source)
	at com.percussion.search.i.super(Unknown Source)
	at com.percussion.search.i.super(Unknown Source)
	at com.percussion.search.i$1.run(Unknown Source)

I had a look at other forum posts and made sure that the “allow inline links” checkbox is not ticked.

Any clues?

Cheers
James

This has been fixed in 6.7.
There is a workaround you can use if you are not planning on upgrading.
If the content of the description field is plain text, then set the mimetype mode to specified and the value to text/plain on that shared field.
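
For background on the message itself: a Java XML parser typically raises "Content is not allowed in prolog" when it meets non-whitespace text before the first tag, which is what happens if plain text is handed to a converter that expects markup. The sketch below reproduces that with the JDK's own parser. It is not Percussion code; the class name PrologCheck and the method parseError are made up for illustration, and the exact wording of the message can vary with the JVM's locale.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of the Percussion API.
public class PrologCheck {

    // Returns null if the text parses as XML, otherwise the parser's error message.
    static String parseError(String text) {
        try {
            DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(text)));
            return null;
        } catch (SAXParseException e) {
            // Raised for malformed input, e.g. text before the root element.
            return e.getMessage();
        } catch (Exception e) {
            // Parser configuration or I/O problems.
            return e.toString();
        }
    }

    public static void main(String[] args) {
        // Well-formed markup: no error expected.
        System.out.println(parseError("<div><p>Marked-up description</p></div>"));
        // Plain text: the parser rejects it before reaching any element.
        System.out.println(parseError("A plain-text description"));
    }
}
```

Running a few sample values of the field through a check like this can show whether the stored content is plain text or markup, which is the distinction the mimetype setting depends on.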

Hi Bhuvaneshwar

Thanks for the advice, but unfortunately, after making the change, restarting the service and running “search index type”, the error still appears in the logs.

Cheers
James