Unknown reporter writes

Hi, I love recoll. I’m using the program with Debian Wheezy combined with KDE. So far I can only say that it’s the best indexing software I know of.

One thing really bothers me though: when I index files, I keep getting errors because this or that file cannot be indexed. Recoll always completely interrupts the whole process and I have to exclude single files manually. This takes very long and is extremely annoying.

Would it be possible to include a parameter for recollindex that makes recoll index everything it CAN index and ignore the rest? This would save a lot of time!

Best regards, desputin

medoc writes

Hi,

What makes you think that the indexer stops on errors? It’s certainly not supposed to! In theory it already behaves the way you describe in your last paragraph. Could you please describe in more detail the command you are using and what is happening?

desputin writes

Hi, thanks for the quick answer!

I get a message that ends with:

"
:3:../rcldb/rcldb.cpp:1352:Db::add: docid 2904936 updated [/home/desputin/farben openoffice.rar|]
:2:../internfile/internfile.cpp:878:FileInterner::internfile: next_document error [/home/desputin/fuehrerschein.zip] application/zip
:3:../rcldb/rcldb.cpp:1352:Db::add: docid 2925361 updated [/home/desputin/fuehrerschein.zip|]
This OLE file does not contain a Word document
:2:../internfile/mh_exec.cpp:115:MimeHandlerExec: command status 0x100 for /usr/share/recoll/filters/rcldoc
:2:../internfile/internfile.cpp:878:FileInterner::internfile: next_document error [/home/desputin/Desktop/Diverses/locrSetup.msi] application/msword
:3:../rcldb/rcldb.cpp:1352:Db::add: docid 2925630 updated [/home/desputin/Desktop/Diverses/locrSetup.msi|]
:3:../index/fsindexer.cpp:149:FsIndexer::index missing helper program(s): python:midi (audio/x-karaoke) python:rarfile (application/x-rar)

^CGot signal, registering stop request
:3:../rcldb/rcldb.cpp:1554:Db::purge: partially cancelled
Indexing failed
"

The operating system is, as I said, Debian Wheezy.

medoc writes

The only abnormal thing in the above messages is that the indexer was interrupted by a signal from the terminal (^C), so it did not complete the index purge (deletion of data for files which no longer exist). The other error messages concern individual documents. It might make sense to look at why these fail, but the messages do not indicate a general indexing failure.

It’s quite common to have a few documents which the indexer is unable to process, and the other documents are probably indexed normally, as far as I can see.

You probably get the same messages at each indexing, because the failed documents are retried every time.
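To tell per-document failures apart from a real interruption, the output can be scanned mechanically. The following is only a rough sketch: it assumes the ":level:source file:line:message" shape seen in the messages quoted above, and the function name summarize and the SAMPLE list are made up for illustration, not part of recoll.

```python
import re

# Assumed line shape, inferred from the messages quoted in this thread:
#   :<level>:<source file>:<line number>:<message>
LOG_LINE = re.compile(r"^:(\d+):([^:]+):(\d+):(.*)$")

# Per-document failures in the quoted output name the file in square brackets.
DOC_ERROR = re.compile(r"next_document error \[([^\]]+)\]")


def summarize(lines):
    """Return (failed_paths, interrupted) for an iterable of output lines.

    Hypothetical helper: it only recognizes the message shapes shown in
    this thread and ignores everything else.
    """
    failed = []
    interrupted = False
    for raw in lines:
        line = raw.strip()
        # The stop request is printed without the ":level:" prefix.
        if "Got signal" in line:
            interrupted = True
            continue
        match = LOG_LINE.match(line)
        if match is None:
            continue  # helper/filter output such as "This OLE file ..."
        hit = DOC_ERROR.search(match.group(4))
        if hit:
            failed.append(hit.group(1))
    return failed, interrupted


# Lines copied from the output quoted earlier in this thread.
SAMPLE = [
    ":3:../rcldb/rcldb.cpp:1352:Db::add: docid 2904936 updated [/home/desputin/farben openoffice.rar|]",
    ":2:../internfile/internfile.cpp:878:FileInterner::internfile: next_document error [/home/desputin/fuehrerschein.zip] application/zip",
    "This OLE file does not contain a Word document",
    ":2:../internfile/internfile.cpp:878:FileInterner::internfile: next_document error [/home/desputin/Desktop/Diverses/locrSetup.msi] application/msword",
    "^CGot signal, registering stop request",
    ":3:../rcldb/rcldb.cpp:1554:Db::purge: partially cancelled",
]

if __name__ == "__main__":
    failed_paths, was_interrupted = summarize(SAMPLE)
    print("documents that failed:", failed_paths)
    print("interrupted by a signal:", was_interrupted)
```

On the sample, the two files reported with "next_document error" are listed as failed documents, and the "Got signal" line is the only thing that marks the run as interrupted.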

desputin writes

Hi, so as I understand it, the indexer reads all documents until the "end" and prints this error output for certain files when it’s done with indexing.

Nevertheless, I had the impression that the earlier interruptions (before I added quite a few files to the blacklist) came before the indexer was all done.

medoc writes

Actually, the error output comes whenever an error is encountered, not at the end of indexing, but you are otherwise correct. The messages above indicate that indexing was done: printing the missing helpers and purging come at the end.

recollindex will never stop "voluntarily" because of a file error. On the other hand, maybe you saw an indexer crash, which would be another issue.

medoc writes

No real feedback about what the problem actually was.