Real time indexing

Real time monitoring/indexing is performed by starting the recollindex -m command. With this option, recollindex will detach from the terminal and become a daemon, permanently monitoring file changes and updating the index.

Under KDE, Gnome and some other desktop environments, the daemon can be started automatically when you log in, by creating a desktop file inside the ~/.config/autostart directory. The Recoll GUI can do this for you: use the Preferences->Indexing Schedule menu.
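
If the GUI is not available, a similar entry can be written by hand. The following is a minimal sketch, not the file that the Recoll GUI generates: the file name recollindex.desktop, the Name value and the write_autostart_entry helper are assumptions made for illustration. Only the ~/.config/autostart location and the recollindex -m command come from the text above.

```shell
# Hypothetical helper: write an autostart entry for the indexing daemon.
# $1 is the autostart directory (defaults to ~/.config/autostart).
write_autostart_entry() {
    dir=${1:-"$HOME/.config/autostart"}
    mkdir -p "$dir" || return 1
    # Standard desktop entry keys; the file name and Name value are assumptions.
    cat > "$dir/recollindex.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=Recoll real time indexer
Exec=recollindex -m
EOF
}
```

Calling write_autostart_entry with no argument writes to the current user's autostart directory. Passing another directory is convenient for checking the result before installing it.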

With older X11 setups, starting the daemon is normally performed as part of the user session script.

The rclmon.sh script can be used to easily start and stop the daemon. It can be found in the examples directory (typically /usr/local/[share/]recoll/examples).

For example, my out of fashion xdm-based session has a .xsession script with the following lines at the end:

      recollconf=$HOME/.recoll-home
      recolldata=/usr/local/share/recoll
      RECOLL_CONFDIR=$recollconf $recolldata/examples/rclmon.sh start

      fvwm

The indexing daemon gets started, then the window manager, for which the session waits.

By default, the indexing daemon monitors the state of the X11 session and exits when the session finishes, so it is not necessary to kill it explicitly. (X11 session monitoring can be disabled with the -x option to recollindex.)

If you run the daemon entirely outside an X11 session, you need to add the -x option to disable X11 session monitoring (otherwise the daemon will not start).
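
A start script that is used both inside and outside X11 could choose the options at run time. The following is a minimal sketch under one assumption that is not part of Recoll: a non-empty DISPLAY variable is taken to mean that an X11 session is available. The monitor_options helper is hypothetical.

```shell
# Hypothetical helper: print the recollindex options for the current environment.
# Assumption: a non-empty DISPLAY variable means an X11 session is available.
monitor_options() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "-m"        # inside X11: the daemon exits with the session
    else
        echo "-m -x"     # no X11: disable X11 session monitoring
    fi
}
```

The daemon would then be started with recollindex $(monitor_options), leaving the substitution unquoted so that the two options are passed as separate words.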

By default, the messages from the indexing daemon are sent to the same file as those from the interactive commands (logfilename). You may want to change this by setting the daemlogfilename and daemloglevel configuration parameters. Note also that the log file is only truncated when the daemon starts. If the daemon runs permanently, the log file may grow quite big, depending on the log level.
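
As an illustration, the two parameters can be added to the configuration file with a small helper. This is a sketch only: the set_daemon_logging function is hypothetical, the log file path and level in the usage note are example values rather than recommendations, and the helper appends blindly, so it assumes the parameters are not already set in the file.

```shell
# Hypothetical helper: append daemon-specific log settings to a Recoll
# configuration file. $1 is the configuration file, $2 the log file path,
# $3 the log level. Existing entries are not checked or replaced.
set_daemon_logging() {
    conf=$1
    printf 'daemlogfilename = %s\n' "$2" >> "$conf" || return 1
    printf 'daemloglevel = %s\n' "$3" >> "$conf"
}
```

For example, set_daemon_logging "$HOME/.recoll/recoll.conf" /tmp/recollindex-daemon.log 2 would apply to a configuration kept in ~/.recoll. If RECOLL_CONFDIR is set, as in the session script above, the file to edit is the recoll.conf in that directory instead.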

When building Recoll, the real time indexing support can be customised during package configuration with the --with[out]-fam or --with[out]-inotify options. The default is currently to include inotify monitoring on systems that support it, and, as of Recoll 1.17, gamin support on FreeBSD.

While it is convenient that data is indexed in real time, repeated indexing can generate a significant load on the system when files such as email folders change. Also, monitoring large file trees by itself significantly taxes system resources. You probably do not want to enable it if your system is short on resources. Periodic indexing is adequate in most cases.

Increasing resources for inotify

On Linux systems, monitoring a big tree may require increasing the resources available to inotify, which are normally defined in /etc/sysctl.conf.

        ### inotify
        #
        # cat  /proc/sys/fs/inotify/max_queued_events   - 16384
        # cat  /proc/sys/fs/inotify/max_user_instances  - 128
        # cat  /proc/sys/fs/inotify/max_user_watches    - 16384
        #
        # -- Change to:
        #
        fs.inotify.max_queued_events=32768
        fs.inotify.max_user_instances=256
        fs.inotify.max_user_watches=32768
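
Before changing anything, it can help to look at the values currently in effect. The following sketch prints them in the same name=value form as the settings above. The show_inotify_limits helper is hypothetical, and its optional directory argument exists only so that the function can be tried against sample files.

```shell
# Hypothetical helper: print the current inotify limits as name=value lines.
# $1 is the directory holding the limit files (defaults to the Linux location).
show_inotify_limits() {
    dir=${1:-/proc/sys/fs/inotify}
    for name in max_queued_events max_user_instances max_user_watches; do
        # Skip limits that the running kernel does not expose.
        if [ -r "$dir/$name" ]; then
            printf 'fs.inotify.%s=%s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}
```

After editing /etc/sysctl.conf, running sysctl -p as root loads the new values without a reboot. A single value can also be changed until the next reboot with, for example, sysctl -w fs.inotify.max_user_watches=32768.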
      

In particular, you will need to trim your tree or increase the max_user_watches value if indexing exits with a message about errno ENOSPC (28) from inotify_add_watch.
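
A rough way to anticipate this error is to compare the number of directories in the tree with the watch limit. The sketch below assumes that one inotify watch is needed per directory, which follows from inotify watches not being recursive. The daemon's actual usage may differ somewhat, for example if parts of the tree are skipped by the configuration. The tree_fits_watch_limit helper is hypothetical.

```shell
# Hypothetical helper: succeed if the directory tree under $1 fits within
# a watch limit. $2 is the limit (defaults to the current max_user_watches).
# Assumption: one inotify watch is needed per directory in the tree.
tree_fits_watch_limit() {
    limit=${2:-$(cat /proc/sys/fs/inotify/max_user_watches)}
    # Arithmetic expansion strips any padding that wc adds on some systems.
    ndirs=$(( $(find "$1" -type d | wc -l) ))
    echo "directories: $ndirs, limit: $limit"
    [ "$ndirs" -le "$limit" ]
}
```

For example, tree_fits_watch_limit "$HOME" reports both numbers and returns a non-zero status when the tree has more directories than the limit allows.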