idealist1508 writes

I noticed this on my book collection. I have some fb2 and epub files and many PDFs. It occurs only with the PDFs.

CPU load goes to zero and nothing happens. In the log I found only something about a handler timeout.

3:..\..\rcldb\rcldb.cpp:624:Db::add: docid 4093 added [C:/CalibreLib/Nieizviestnyi/187 (52)/187 - Nieizviestnyi.pdf|]
:2:..\..\internfile\mh_exec.cpp:62:MimeHandlerExec: filter timeout (1200 S)
:2:..\..\internfile\mh_exec.cpp:121:MimeHandlerExec: handler timeout
:2:..\..\internfile\mh_exec.cpp:130:MimeHandlerExec: command status 0x110f for c:/tools/pdftotextFilter.bat
:2:..\..\internfile\internfile.cpp:744:FileInterner::internfile: next_document error [C:/CalibreLib/Josephine Mutzenbacher/Die Geschichte einer Wienerischen D (2683)/Die Geschichte einer Wienerisch - Josephine Mutzenbacher.pdf] application/pdf
:3:..\..\rcldb\rcldb.cpp:624:Db::add: docid 4094 added [C:/CalibreLib/Josephine Mutzenbacher/Die Geschichte einer Wienerischen D (2683)/Die Geschichte einer Wienerisch - Josephine Mutzenbacher.pdf|]

I thought something might be wrong with the execm filter. I tried other converters, such as mutool, and I also tried using it as an exec filter, but nothing helps.

If I index only this one particular file, it works. So I think something must be wrong with the way the filter is called. For example, I can index my complete library with Swish-e, which uses the same external filters as Recoll.

I suppose it will be hard to reproduce. Maybe it is possible to implement different methods of calling a filter, and I can test them on my collection.
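
To illustrate what an alternative way of calling a filter with a timeout might look like, here is a minimal Python sketch. It is not how recollindex invokes its handlers; `run_filter` and the timeout values are hypothetical names and numbers chosen for the example, and the demo command only stands in for a real converter such as the batch file above.

```python
import subprocess
import sys


def run_filter(cmd, timeout_s):
    """Run an external filter and return its stdout as text.

    Returns None if the filter times out or exits with a non-zero status.
    (Hypothetical helper for illustration, not part of Recoll.)
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            timeout=timeout_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode != 0:
        return None
    # Assumes the filter emits UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Stand-in for a real converter command line.
    text = run_filter([sys.executable, "-c", "print('extracted text')"], 10)
    print(text)
```

One caveat with this approach: when the command is a batch file, the timeout kills the command interpreter that was started, and a converter launched by the batch file may keep running.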

Thanks

medoc writes

Is a particular PDF involved each time, or does this happen on random PDFs?

What is the pdftotextFilter.bat program?

idealist1508 writes

I suppose they are random. I will run some tests at home to be sure.

pdftotextFilter.bat converts PDF to text. On the last run, its content was the following:

#!batch
mutool draw -F text -i %1
rem pdftotext -enc UTF-8 %1 -

idealist1508 writes

This is very strange. I got no errors from the filter, but recollindex seems to freeze. It has been doing nothing for hours. The last log entries are:

#!log
:3:..\..\rcldb\rcldb.cpp:624:Db::add: docid 11948 added [C:/calibrelib/Wollschlager, Daniel/R Kompakt_ Der Schnelle Einstieg in (2933)/R Kompakt_ Der Schnelle Einstie - Wollschlager, Daniel.pdf|]
:3:..\..\rcldb\rcldb.cpp:1713:Db::waitUpdIdle: total xapian work 382204 mS
:3:..\..\index\fsindexer.cpp:243:fsindexer index time:  4005528 mS
:3:..\..\utils/workqueue.h:202:DbUpd: tasks 11948 nowakes 14215 wsleeps 8600 csleeps 1029
:2:..\..\utils/workqueue.h:148:WorkQueue::waitIdle:DbUpd: not ok or ca

medoc writes

Yes, this is weird, it looks like multithreading is enabled. I could never get this to work on Windows (with the symptom you are getting: process never exits).

What version are you using?

idealist1508 writes

Recoll version: Recoll 1.22.0 + Xapian 1.2.21. Yes, multithreading was enabled. Is that a bad idea? After I disabled it, I was able to run recollindex from cmd without issues.

I do not use Recoll directly, but through a [plugin for Calibre](https://github.com/idealist1508/Recoll-Full-Text-Search-For-Calibre). The issue with the filter occurred only when I called Recoll from Calibre, and I have now been able to find out [why](https://github.com/idealist1508/Recoll-Full-Text-Search-For-Calibre/commit/5393c33911ce40e459d512d87efe2afe2a06a93e).

Thank you for helping me. If I can somehow help you diagnose the "process never exits when multithreading is enabled" issue, tell me what I can do.

idealist1508 writes

Works for me without multithreading.

medoc writes

I just realized that you are not on Windows, are you? I had just done some pretty intensive testing of PDF indexing on Linux (http://www.recoll.org/perfs.html), so I assumed that your problem had to be Windows-related…

There is no problem that I know of with Unix/Linux Recoll multithreading. On Windows, I’ve more or less given up on multithreading, because it’s both slower (!) and buggy. Windows Recoll indexing is very slow. I guess the architecture is simply not adapted to Windows. Maybe a Windows expert would find something to fix, but I’m not even sure of this.

idealist1508 writes

I am on Windows. It is my work computer, but privately I use Linux. By the way, is it possible to use the same database for searching under Linux and Windows? Is the format compatible? I know that the paths will be different, but I only need the file name.

medoc writes

Ok. I should wait till I’m awake to write stuff. It says Windows in the title …

The index is fully portable. You can enter a path translation so that Preview and Open will work even if the local paths are different: https://www.lesbonscomptes.com/recoll/usermanual/webhelp/docs/RCL.SEARCH.PTRANS.html
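
To make the idea of a path translation concrete, here is a minimal Python sketch of prefix-based rewriting. It only illustrates the concept and is not Recoll's implementation or configuration syntax (see the linked manual page for the real setup); `translate_path` and the Linux directory `/home/me/CalibreLib` are hypothetical.

```python
def translate_path(path, translations):
    """Rewrite the longest matching directory prefix of `path`.

    `translations` maps an old prefix (as stored in the index) to the
    prefix that is valid on the local machine. Hypothetical helper.
    """
    # Try longer prefixes first so the most specific rule wins.
    for old in sorted(translations, key=len, reverse=True):
        prefix = old.rstrip("/")
        # Match only on whole path components.
        if path == prefix or path.startswith(prefix + "/"):
            return translations[old].rstrip("/") + path[len(prefix):]
    return path  # no rule applies: leave the path unchanged


if __name__ == "__main__":
    rules = {"C:/CalibreLib": "/home/me/CalibreLib"}  # assumed example mapping
    print(translate_path("C:/CalibreLib/some author/book.pdf", rules))
```

Matching on whole path components keeps a rule for `C:/CalibreLib` from also rewriting an unrelated directory such as `C:/CalibreLib2`.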