franckw writes

When I enable the webcache indexing:

    processbeaglequeue = 1
    webcachedir = /home/USERNAME/tmp/wq

and start recollindex:

    $ recollindex
    gzip: /tmp/rcltmpfo2fbw/local_cluster.graphmlz: unknown suffix -- ignored
    recollindex: ../utils/circache.cpp:408: CCScanHook::status CirCacheInternal::readEntryHeader(off_t, EntryHeaderData&): Assertion ‘m_fd >= 0’ failed.
    Abandon (core dumped)

Recoll version used is 1.17.3 (on ubuntu 12.04)

medoc writes

I can reproduce the problem if /home/USERNAME/tmp/wq does not exist.

Could you please create /home/USERNAME/tmp/wq and retry?

I’ll fix the abort for the next release (bad sleep the day I wrote this I guess :) )

franckw writes

Well, directory exists already…

It happens, whatever name I put as directory, empty or with documents inside.

As soon as I set processbeaglequeue = 1 in recoll.conf, it doesn’t work anymore.

medoc writes

Well, stupid me, it's the opposite. The parent directory of the cache file (the webcachedir itself) should not exist: the code always assumes that the file exists if the directory exists. Could you please try either deleting the directory or touching an empty circache.crch inside it?

Thanks, jf

franckw writes

OK, it works in both cases:
- empty circache.crch
- no wq directory

Good for me then… I just have to teach elinks to save the proper thing in that directory. Thanks.
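
For anyone hitting the same assertion on 1.17.3, the "empty circache.crch" workaround confirmed above can be scripted. This is only a minimal Python sketch, not part of Recoll: the helper name `ensure_circache_placeholder` is hypothetical, and it relies solely on what medoc describes in this thread, namely that an existing webcachedir must already contain a circache.crch file.

```python
from pathlib import Path


def ensure_circache_placeholder(webcachedir):
    """Create an empty circache.crch if the directory exists without one.

    Hypothetical helper, not part of Recoll. Returns True if a placeholder
    file was created, False if nothing had to be done.
    """
    cache_dir = Path(webcachedir)
    if not cache_dir.is_dir():
        # Directory absent: per the thread, recollindex copes with this case.
        return False
    cache_file = cache_dir / "circache.crch"
    if cache_file.exists():
        # A cache file is already there; leave it alone.
        return False
    cache_file.touch()  # same effect as `touch circache.crch`
    return True
```

Calling it with the webcachedir value from recoll.conf before running recollindex should be harmless on versions where the crash is already fixed, since it never touches an existing cache file.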

franckw writes

Hm... is Recoll supposed to index stuff in $HOME/.beagle/ToIndex if I specify webcachedir = /home/USERNAME/tmp/wq?

medoc writes

I think that you are beginning to discover that I never tested this with non-default paths :)

I think that the best approach while I sort this out might be to keep the default webcachedir, maybe set up a symbolic link if you really don’t want the data to be in there.

By the way, I'm interested in what you are trying to do. My email is jf@dockes.org; email may be more convenient. Have you got a sample of a Beagle "page description" file? These have the same name as the page file, with a dot prepended, and contain the URL etc. I am attaching one to this. The code to read them is at the top of beaglequeue.cpp (look for BeagleDotFile::toDoc).
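
The naming rule for description files (same name as the page file, with a dot prepended) can be illustrated with a short Python sketch. The function name `description_file_for` is made up for this example and is unrelated to the actual code in beaglequeue.cpp; only the naming convention comes from the post above.

```python
from pathlib import PurePosixPath


def description_file_for(page_path):
    """Return the path of the dot-prefixed description file for a page file.

    Hypothetical helper: it only applies the naming rule, it does not read
    or validate either file.
    """
    page = PurePosixPath(page_path)
    # Same directory, same name, with a dot prepended to the file name.
    return str(page.with_name("." + page.name))
```

For a page stored as ToIndex/page.html, this yields ToIndex/.page.html, which is where the URL and other metadata would be expected.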

medoc writes

I should not try to answer stuff in the morning. Yes, it's actually normal that it's indexing from ~/.beagle/ToIndex. That's the place where the Beagle/Firefox module stores stuff, and there is no easy way (as in: editing a config file) to change it; it's hard-coded in the module. The webcachedir is the place where Recoll stores a copy of the data after it's removed from the Beagle queue. This is the place from which Preview gets its data, and also from which we reindex after an index reset (the data in .beagle/ToIndex is gone after being processed).
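
The relationship between the two locations can be pictured with a toy Python model. This is an illustration of the data flow described above, not Recoll's implementation: `drain_queue` is a hypothetical name, and a plain dict stands in for the web cache, whose real storage format is not modelled here.

```python
from pathlib import Path


def drain_queue(queue_dir, cache):
    """Move every regular file from queue_dir into the cache mapping.

    Toy model only: `cache` is a plain dict standing in for Recoll's web
    cache, keyed by file name. Returns the names processed, sorted.
    """
    processed = []
    for entry in sorted(Path(queue_dir).iterdir()):
        if not entry.is_file():
            continue
        # Keep a copy so preview and later reindexing still have the data.
        cache[entry.name] = entry.read_bytes()
        # The queue entry is gone once it has been processed.
        entry.unlink()
        processed.append(entry.name)
    return processed
```

After a run, the queue directory is empty and the cache holds the only remaining copy, which is why the cache, and not the queue, is the source for previews and for reindexing after a reset.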

franckw writes

OK… I figured it out by looking at the Beagle sources. I'll keep the lil chat by mail, no need to push the discussion further here.

medoc writes

The crash is fixed in the current trunk code, closing.