Unknown reporter writes

terminate called after throwing an instance of std::out_of_range
what(): basic_string::substr: __pos (which is 12) > this->size() (which is 7)
[1] 16185 abort (core dumped) recollindex

medoc writes

Hi,

Thanks for reporting this.

Is the problem reproducible?

Could you please follow the instructions below to collect more usable data? A stack trace would be especially helpful: https://bitbucket.org/medoc/recoll/wiki/ProblemSolvingData

mskowr writes

Yes, the issue is reproducible. I’ll follow the instructions you provided to gather a more detailed description and post it later.

medoc writes

Ok, that’s nice. If you are able to reproduce it in single-threaded mode, it will be even better. Try adding:

thrQSizes = -1 -1 -1

to ~/.recoll/recoll.conf

mskowr writes

Yes, in single-threaded mode it still core dumps, with the following messages:

:2:../internfile/mh_mail.cpp:640:walkmime: transcode failed from cs ks_c_5601-1987 to UTF-8
:2:../common/textsplit.cpp:480:Textsplit: error occured while scanning UTF-8 string
terminate called after throwing an instance of std::out_of_range
what(): basic_string::substr: __pos (which is 12) > this->size() (which is 7)
[1] 8969 abort (core dumped) recollindex
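For readers unfamiliar with the abort message: `std::string::substr` throws `std::out_of_range` when the start position is greater than the string's size, and an uncaught exception ends the process through `terminate`. The sketch below reproduces only that mechanism, using the numbers from the report (position 12, size 7). It does not show where the call sits in Recoll, which the thread has not established; `substr_throws` and `safe_tail` are hypothetical names made up for this illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not Recoll code): reports whether s.substr(pos)
// throws std::out_of_range, i.e. whether pos > s.size().
bool substr_throws(const std::string& s, std::string::size_type pos) {
    try {
        (void)s.substr(pos);
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

// Hypothetical guarded variant: returns the tail of s starting at pos,
// or an empty string when pos lies beyond the end of s.
std::string safe_tail(const std::string& s, std::string::size_type pos) {
    if (pos > s.size()) {
        return std::string();
    }
    return s.substr(pos);
}
```

With a 7-character string, `substr_throws(s, 12)` is true, while `substr_throws(s, 7)` is false because a position equal to the size is allowed and yields an empty string. Whether a guard like `safe_tail` would be the right fix here is an assumption; a truncated string after the failed transcoding may call for different handling.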

medoc writes

Any chance you could forward the problem message to me? (I understand that there may be privacy issues, of course.) jf at dockes.org.

If the message is part of an mbox, you can see its number by looking at the indexing log file. For example, with the verbosity level at 3, it would look like the following:

:3:rcldb/rcldb.cpp:606:Db::add: docid 160 added [/home/dockes/mbox|10]
:3:rcldb/rcldb.cpp:606:Db::add: docid 161 added [/home/dockes/mbox|11]
:3:rcldb/rcldb.cpp:606:Db::add: docid 162 added [/home/dockes/mbox|12]
:3:rcldb/rcldb.cpp:606:Db::add: docid 163 added [/home/dockes/mbox|13]

The number after the | is the message number, so you can see at which point it stops.

If this is not possible, I’ll just go look into the UTF-8 code to see how this can happen, but it would help to have a test sample.
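As a rough illustration of reading that number programmatically, the sketch below pulls the value between the last `|` and the last `]` of a log line. It assumes the line layout is exactly as in the sample lines above; `mbox_message_number` is a hypothetical helper, not part of Recoll.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not Recoll code): extracts the mbox message number
// from a line such as
//   :3:rcldb/rcldb.cpp:606:Db::add: docid 162 added [/home/dockes/mbox|12]
// Returns -1 if the line does not end in the "[path|number]" form.
long mbox_message_number(const std::string& line) {
    const std::string::size_type bar = line.rfind('|');
    const std::string::size_type close = line.rfind(']');
    if (bar == std::string::npos || close == std::string::npos ||
        close <= bar + 1) {
        return -1;
    }
    // Text strictly between '|' and ']'; non-empty because close > bar + 1.
    const std::string digits = line.substr(bar + 1, close - bar - 1);
    if (digits.find_first_not_of("0123456789") != std::string::npos) {
        return -1;
    }
    try {
        return std::stol(digits);
    } catch (const std::out_of_range&) {
        return -1; // Number too large for long.
    }
}
```

Applied to the last `Db::add` line written before the crash, this gives the number of the last message that was indexed successfully, which narrows down where indexing stops.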

medoc writes

Hi again. I had a look at the code, but I have no way of telling where the problem occurs. I really need sample data or a stack trace to fix this (which I would very much like to do).

mskowr writes

I will increase the verbosity as described above and run recollindex again. Also, I was able to generate a core dump file, but had some issues running the backtrace in gdb because the file was compressed. I should have some time this weekend to try this again.

Also, if I downgrade to 1.20.4.1, the problem isn’t there.

Thanks!

medoc writes

Do you mean 1.20.1?

medoc writes

no feedback