monperrus writes

I noticed that ~/.recollweb/ToIndex takes 4 GB, and that the files there are not removed once they have been indexed.

Is there a specific configuration parameter for this?

medoc writes

This is normally done by default, but I am seeing errors too. Apparently it happens for files where the auxiliary metadata file was not created by the firefox module. This would be signalled by the following kind of message in the log:

:2:/home/dockes/projets/fulltext/recoll/src/index/beaglequeue.cpp:86::BeagleDotFile: open failed for [/home/dockes/.recollweb/ToIndex/.firefox-recoll-web-8ed8a659dd206fd66f3b1b636e46d154]

I need to check why the firefox module is not creating some of the dot files. Meanwhile, you can just delete the files.
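As a rough illustration of the manual cleanup, here is a minimal Python sketch that looks for queue files with no companion dot file. It assumes that the metadata file is named by prefixing the data file name with a dot, which is what the log message above suggests but is not confirmed here. The function names `find_orphans` and `delete_orphans` are made up for this example, and it is probably safest to run it while the browser is closed, so that files still being written are not mistaken for leftovers.

```python
from pathlib import Path


def find_orphans(queue_dir):
    """Return regular files in queue_dir that have no companion dot file.

    Assumption: the metadata for 'name' lives in '.name' in the same directory.
    """
    queue = Path(queue_dir)
    orphans = []
    for entry in sorted(queue.iterdir()):
        # Skip the dot files themselves and anything that is not a regular file.
        if entry.name.startswith(".") or not entry.is_file():
            continue
        if not (queue / ("." + entry.name)).exists():
            orphans.append(entry)
    return orphans


def delete_orphans(queue_dir, dry_run=True):
    """List orphaned files, and delete them only when dry_run is False."""
    orphans = find_orphans(queue_dir)
    for path in orphans:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
    return orphans
```

Calling `delete_orphans(Path.home() / ".recollweb" / "ToIndex")` only prints the candidates; passing `dry_run=False` actually removes them.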

medoc writes

Wow, thanks for reporting this! This was a really nasty bug in the Firefox extension, resulting in bogus files being left around and, worse, probably the wrong data being indexed for a given URL in some cases.

Version 2.2.2 of the extension fixes the issue. It will appear on the Mozilla add-ons site as soon as it is reviewed (I could provide a file now, but it would not be signed, which results in complications).

medoc writes

Solved by recoll firefox extension update

monperrus writes

Awesome, thanks a lot.

Do you have a URL to the diff?

medoc writes

The only interesting part is the few lines about currentListener. The rest is uninteresting logging stuff.