piater writes

Rather than skipping XML dialects that lack dedicated support altogether, it would be better to index them with baseline support. To this end, I created a new filter, rclxml (attached). It creates an HTML file, includes a <title> if it finds a <title> element (in any namespace) in the XML document, and turns all text() nodes into <p> elements.
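For readers without the attachment, here is a minimal Python sketch of the transformation described above. It is not the attached rclxml filter, only an illustration of the same idea under a few assumptions: the names `local_name` and `xml_to_html` are hypothetical, the first `title` element found in document order is used, and whitespace-only text nodes are dropped.

```python
import html
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a leading '{namespace}' from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def xml_to_html(xml_data):
    """Return a bare-bones HTML rendering of an XML document (str or bytes)."""
    root = ET.fromstring(xml_data)

    # Use the first <title> element, whatever its namespace.
    title = ""
    for elem in root.iter():
        if isinstance(elem.tag, str) and local_name(elem.tag) == "title":
            title = "".join(elem.itertext()).strip()
            break

    # Every non-empty text node becomes its own paragraph.
    paragraphs = [t.strip() for t in root.itertext() if t.strip()]

    lines = ["<html><head>"]
    if title:
        lines.append("<title>%s</title>" % html.escape(title))
    lines.append("</head><body>")
    lines.extend("<p>%s</p>" % html.escape(t) for t in paragraphs)
    lines.append("</body></html>")
    return "\n".join(lines)
```

To use something like this as a filter, one would wrap it in a small script that reads the file named on the command line and prints the result to standard output. Note that the title text also appears as a paragraph in the body, since it is a text node like any other.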

To trigger this reliably, I found that I had to add

{{{
application/xml = exec rclxml
text/xml = exec rclxml
}}}

to my ~/.recoll/mimeconf. The first is the MIME type returned by "file -i", while the second appears to be (at least in my case) what recoll concludes based on the .xml extension.

medoc writes

I’ll add the rclxml filter to the next release and the "new filters" section on the web site.

About text/xml vs application/xml: I think that I have seen //file -i// output either of them depending on system/version. It doesn’t hurt significantly to have both anyway.

Thanks a lot!

jf