humble_user writes
Recoll real-time indexing was hogging processor time and disk access, slowing the machine down.
The most noticeable symptom, aside from the generally slow performance of the computer, was a persistent pattern of disk access: 7 seconds of continuous access to the main disk, with a sound like nutmeg on a grater, followed by 4 seconds of no access, followed by another 7 seconds of grating access, and so on, repeating without pause.
Switching off the Recoll real-time indexer caused the problem to stop.
Please let me know if there is anything I can do to help diagnose the problem. I’ll see if I can’t generate a bug report while it’s grating.
I have been using Recoll for a good nine months. The error coincided with two events: the upgrade to 1.19.3 (I had been using 1.17.3, which was served up by the Ubuntu Software Centre), and moving the primary target of the indexer from the local hard disk to an external disk over USB.
humble_user writes
The problem has occurred again with the real-time indexer off. The recurrence suggests a more fundamental problem with the indexer.
With real-time off, the following happened:
- Connected the USB drive with the content to be indexed.
- Selected File > Update Index.
Indexer seemed to be progressing normally. But it eventually seemed to get stuck and go into a similar pattern to the one described in my original error report above: grating disk access for 5 seconds, followed by about 10 to 20 seconds of inactivity, followed by grating disk access, and so on. The cycle is slower than with the real-time indexer.
During these periodic disk accesses, Recoll hogs machine resources and locks other applications up. Indexing is progressing very slowly. It’s been like this for about an hour. Progress bar presently says: Indexing in progress (15120/69/15120). It’s increasing at about a handful of files per minute.
I cannot determine whether it is a problem with any particular file type because the indexer is using too narrow a window to display its progress: its current path is too long to be displayed, so I cannot tell precisely where it is stuck.
Firefox extension was in use but was disabled a few days ago from within the Recoll preferences. Firefox continued reporting that the Recoll indexer was operating. So I turned off the Firefox add-on from within Firefox.
This may of course be a problem with having the files for index on an external, encrypted USB drive.
A problem I shall have to report separately when I get round to it is that Recoll has to rebuild the index from scratch every time the USB drive is reconnected or the computer is switched on. With the USB drive plugged in and Recoll switched on, a search will get no results until Recoll has rebuilt the index.
humble_user writes
E.g.
  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM     TIME+ COMMAND
 3842 mark      39  19  429m 164m 2896 S 161.8  4.7 151:23.72 recollindex
 2362 mark      20   0 1607m 553m  29m S   4.6 15.8  31:21.54 firefox
25614 root      20   0     0    0    0 S   3.0  0.0   1:00.05 kworker/1:2
 2435 mark      20   0  513m  37m  12m S   1.3  1.1   3:22.76 plugin-containe
 1328 root      20   0  194m  60m  41m S   0.7  1.7   5:48.95 Xorg
 3753 mark      20   0 1935m  35m  20m S   0.7  1.0   0:58.55 recoll
   82 root       0 -20     0    0    0 S   0.3  0.0   0:00.44 kworker/0:1H
 3672 mark      20   0  554m  14m 9344 S   0.3  0.4   0:20.91 xfce4-terminal
26036 root      20   0     0    0    0 S   0.3  0.0   0:00.02 kworker/u:0
26046 mark      20   0 20704 1632 1100 R   0.3  0.0   0:00.08 top
    1 root      20   0 27072 2104 1048 S   0.0  0.1   0:02.24 init
    2 root      20   0     0    0    0 S   0.0  0.0   0:00.00 kthreadd
medoc writes
Hi,
We need to see what recoll is doing while it’s having problems. For this, we need to set up the message log.
There is a small doc about using the log here: https://bitbucket.org/medoc/recoll/wiki/WhyIsMyFileNotIndexed
For the first try, I think that it should be enough to set the log level to 3, which should show what the indexer is doing. If this is not enough, we’ll set it to 6, which is very verbose.
About external drives, there is basically no specific support for them in recoll. I think that the best approach would be to only run the indexer when the drive is connected (so real-time indexing will not really be practical). If the indexer runs while the drive is not connected, it will purge all the data for the absent files.
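One way to make sure the indexer only runs while the drive is connected is a small wrapper that checks the mount point first. This is only a sketch: the helper name `run_if_mounted` and the path `/media/usbdrive` are assumptions for illustration, and it relies on the `mountpoint` utility from util-linux being installed.

```shell
#!/bin/sh
# run_if_mounted MOUNT_POINT COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND only when MOUNT_POINT is a mounted
# filesystem, so the indexer never sees the drive's files as "absent".
run_if_mounted() {
    mount_point=$1
    shift
    if mountpoint -q "$mount_point"; then
        "$@"
    else
        echo "skipping: $mount_point is not mounted" >&2
        return 1
    fi
}

# Example (the mount point is an assumption, adjust to your system):
# run_if_mounted /media/usbdrive recollindex
```

When the drive is absent, the wrapper returns a non-zero status and the indexer is not started, so the existing index data is left alone.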
humble_user writes
Very helpful, thank you. Thanks for tip re. real-time indexer and external drive. (Real-time indexer not important for the purpose, but had intended to make greater use of Recoll with external drives for accessing, e.g., old email records).
As per your instructions, edited ~/.recoll/recoll.conf.
Then:
$ sudo bash -c recollindex > /tmp/myindexlog 2>&1
$ cat /tmp/myindexlog
:3:recollindex.cpp:402:recollindex: changing current directory to [/tmp]
:3:recollindex.cpp:423:recollindex: starting up
:3:../rcldb/rcldb.cpp:491:Db::add: docid 1 added [/root|]
:3:../rcldb/rcldb.cpp:1454:Db::waitUpdIdle: total xapian work 252 mS
:2:../index/fsindexer.cpp:249:fsindexer index time: 232 mS
:3:../utils/workqueue.h:215:Internfile: tasks 1 nowakes 1 wsleeps 3 csleeps 0
:3:../utils/workqueue.h:215:Split: tasks 1 nowakes 1 wsleeps 3 csleeps 0
:3:../rcldb/rcldb.cpp:491:Db::add: docid 1 added [/root|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 2 added [/root/.local|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 3 added [/root/.local/share|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 4 added [/root/.local/share/webkit|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 5 added [/root/.local/share/webkit/icondatabase|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 6 added [/root/.local/share/webkit/icondatabase/WebpageIcons.db|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 7 added [/root/.gnome2_private|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 8 added [/root/.pulse-cookie|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 9 added [/root/.pulse/e9205e39d377fcb981de2f7550b62ff9-runtime|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 10 added [/root/.pulse|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 11 added [/root/.dbus|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 12 added [/root/.dbus/session-bus|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 13 added [/root/.config|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 14 added [/root/.config/gtk-2.0|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 15 added [/root/.bashrc|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 16 added [/root/.config/ibus|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 17 added [/root/.dbus/session-bus/e9205e39d377fcb981de2f7550b62ff9-0|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 18 added [/root/.config/ibus/bus|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 19 added [/root/.config/gtk-2.0/gtkfilechooser.ini|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 20 added [/root/.gnome2|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 21 added [/root/.gnome2/accels|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 22 added [/root/.gconf|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 23 added [/root/.gnome2/accels/system-config-lvm|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 24 added [/root/.gconf/apps|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 25 added [/root/.gconf/apps/firestarter|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 26 added [/root/.gnome2/firestarter|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 27 added [/root/.gconf/apps/firestarter/firewall|]
/root/.gconf/apps/firestarter/%gconf.xml:1: parser error : Document is empty
^
/root/.gconf/apps/firestarter/%gconf.xml:1: parser error : Start tag expected, < not found
^
unable to parse /root/.gconf/apps/firestarter/%gconf.xml
:3:../rcldb/rcldb.cpp:491:Db::add: docid 28 added [/root/.gconf/apps/firestarter/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 29 added [/root/.gconf/apps/firestarter/firewall/tos|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 30 added [/root/.gconf/apps/firestarter/firewall/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 31 added [/root/.gconf/apps/firestarter/firewall/icmp|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 32 added [/root/.gconf/apps/firestarter/firewall/tos/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 33 added [/root/.gconf/apps/firestarter/firewall/dhcp|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 34 added [/root/.gconf/apps/firestarter/firewall/dhcp/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 35 added [/root/.gconf/apps/firestarter/client|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 36 added [/root/.gconf/apps/firestarter/firewall/icmp/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 37 added [/root/.gconf/apps/firestarter/client/filter|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 38 added [/root/.gconf/apps/firestarter/client/filter/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 39 added [/root/.gconf/apps/firestarter/client/%gconf.xml|]
/root/.gconf/apps/%gconf.xml:1: parser error : Document is empty
^
/root/.gconf/apps/%gconf.xml:1: parser error : Start tag expected, < not found
^
unable to parse /root/.gconf/apps/%gconf.xml
:3:../rcldb/rcldb.cpp:491:Db::add: docid 40 added [/root/.gconf/apps/%gconf.xml|]
:3:../rcldb/rcldb.cpp:491:Db::add: docid 41 added [/root/.profile|]
:3:../rcldb/rcldb.cpp:1454:Db::waitUpdIdle: total xapian work 517 mS
:2:../index/fsindexer.cpp:249:fsindexer index time: 512 mS
:3:../utils/workqueue.h:215:DbUpd: tasks 42 nowakes 45 wsleeps 27 csleeps 3
:2:../utils/workqueue.h:161:WorkQueue::waitIdle:DbUpd: not ok or can’t lock
:3:../rcldb/rcldb.cpp:1454:Db::waitUpdIdle: total xapian work 605 mS
:3:../rcldb/rcldb.cpp:1454:Db::waitUpdIdle: total xapian work 193 mS
:3:../utils/workqueue.h:215:DbUpd: tasks 0 nowakes 0 wsleeps 1 csleeps 0
:3:../utils/workqueue.h:215:Internfile: tasks 41 nowakes 45 wsleeps 13 csleeps 12
:3:../utils/workqueue.h:215:Split: tasks 41 nowakes 35 wsleeps 41 csleeps 3
It doesn’t appear to have done anything more.
medoc writes
I don’t see anything strange in the log. I guess that you did not see any periodic disk thrashing during this run?
The GUI update index menu entry just executes recollindex, like you would do on the command line; there is no difference (afaik), so you should be able to reproduce the problem on the command line. Maybe retry after unmounting/remounting the volume?
By the way, using multiple indexes can be handy with removable volumes, there is a small doc [here](https://bitbucket.org/medoc/recoll/wiki/MultipleIndexes), in addition to the documentation from the manual.
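As a rough sketch of what a separate index for a removable volume could look like: a second configuration directory whose `recoll.conf` lists only the drive under `topdirs`, indexed with `recollindex -c`. The helper name `make_index_config` and both paths are assumptions for illustration; the wiki page and the manual remain the reference for the details.

```shell
#!/bin/sh
# make_index_config CONFDIR TOPDIR
# Hypothetical helper: writes a minimal recoll.conf so that CONFDIR
# describes an index covering only TOPDIR.
make_index_config() {
    confdir=$1
    topdir=$2
    mkdir -p "$confdir" || return 1
    printf 'topdirs = %s\n' "$topdir" > "$confdir/recoll.conf"
}

# Example (both paths are assumptions, adjust to your system):
# make_index_config "$HOME/.recoll-usb" /media/usbdrive
# recollindex -c "$HOME/.recoll-usb"   # index the drive into its own index
# recoll -c "$HOME/.recoll-usb"        # query that index from the GUI
```

Keeping the drive in its own index means that running the main indexer while the drive is unplugged does not touch the drive's data.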
Also, about the Firefox history indexing, there are two parts in it:
- Recoll scanning a queue directory, which you can turn off from the recoll GUI (or the config file).
- The Firefox extension creating files inside the queue directory (which it signals by a quite misleading message of indexing such and such…), which you can turn off in Firefox.
In general if you use this intermittently, it should be enough to toggle the Firefox extension, the fact that Recoll scans the queue directory or not will not significantly affect anything.
humble_user writes
Removed usb drive. Replaced. Ran indexer from command line. Trace log as follows:
It was over within seconds. As when executed from the command line before, it appears not to have actually indexed the target. It usually takes rather a long time because there are rather a lot of files. Although, I had noticed recently that the indexer had been taking a while to wake up.
I shall leave this a while and see if there is anything that wakes up. Though this is quite doubtful as there are no Recoll processes running.
I shall also run from the GUI again and see if the problem repeats as before.
medoc writes
I think that the last run did not do anything because there was apparently another indexer running. Anyway, what I’d do would be to set the log file at level 4 to have a bit of debugging messages, let it go to some file (set logfilename), proceed with normal usage, and wait until the thrashing happens, at which moment I would tail the log file to see what it’s doing.
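To see where the indexer is when the thrashing starts, the most recently added paths can be pulled out of the log file. This is a minimal sketch that assumes the log contains `Db::add: docid N added [path|]` lines like those in the trace posted earlier in this thread, one entry per line; the function name `last_added_paths` and the log path in the example are hypothetical.

```shell
#!/bin/sh
# last_added_paths LOGFILE [COUNT]
# Hypothetical helper: prints the paths from the last COUNT (default 5)
# "Db::add: docid N added [path|]" lines found in LOGFILE.
last_added_paths() {
    logfile=$1
    count=${2:-5}
    sed -n 's/.*Db::add: docid [0-9][0-9]* added \[\(.*\)|].*/\1/p' "$logfile" |
        tail -n "$count"
}

# Example (the log path is an assumption, use your logfilename value):
# last_added_paths /tmp/recoll.log 10
# tail -f /tmp/recoll.log    # or simply follow the log live
```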
humble_user writes
I will do this.
I have meanwhile re-run update index from the GUI. It finished within 10 minutes without a problem. This perhaps suggests that the crunching problem was related to the real-time indexer. But the problem had persisted after I switched the real-time indexer off this morning, and even after replugging the target USB, as reported in my second comment above. I may try and recreate the problem at the higher reporting level and see what happens.
humble_user writes
BTW, as far as I’m aware there is no other indexer. I verified that there were no recoll processes left running when the command line indexer finished its job and I reported the trace log here. Was there anything to indicate another indexer?
medoc writes
About the other indexer: sorry, my bad, I misread the log file (it’s garbled by bitbucket).
Actually, pursuing this may be simpler by email: jf at dockes.org
If you want to try the real-time indexer again, you can set an alternate log file and level for it. This may make diagnosis easier (the log will not be reset by using recoll to query). In recoll.conf:
daemloglevel = 4
daemlogfilename = /path/to/file
But, as written earlier, I don’t think that the real-time indexer will work well with a removable volume.
humble_user writes
Yes, I thought it might be helpful to try and recreate the problem. But it’s working fine for now. I shall report if it reoccurs.
medoc writes
Until further data comes up