Movable datasets

As of Recoll 1.24, it has become easy to build self-contained datasets including a Recoll configuration directory and index together with the indexed documents, and to move such a dataset around (for example by copying it to a USB drive), without having to adjust the configuration for querying the index.

Note

This is a query-time feature only. The index must only be updated in its original location. If an update is necessary in a different location, the index must be reset.

The examples below will assume that you have a dataset under /home/me/mydata/, with the index configuration and data stored inside /home/me/mydata/recoll-confdir.

In order to be able to run queries after the dataset has been moved, you must ensure the following:

  • The main configuration file must define the orgidxconfdir variable to be the original location of the configuration directory (orgidxconfdir=/home/me/mydata/recoll-confdir must be set inside /home/me/mydata/recoll-confdir/recoll.conf in the example above).

  • The configuration directory must exist with the documents, somewhere under the directory which will be moved. E.g. if you are moving /home/me/mydata around, the configuration directory must exist somewhere below this point, for example /home/me/mydata/recoll-confdir, or /home/me/mydata/sub/recoll-confdir.

  • You should keep the default locations for the index elements (they are relative to the configuration directory by default). Only the paths referring to the documents themselves (e.g. topdirs values) should be absolute (in general, they are only used when indexing anyway).

Only the first point needs an explicit user action: the Recoll defaults are compatible with the third one, and the second is natural.
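As an illustration of the first point, here is a minimal Python sketch that sets orgidxconfdir in the text of a recoll.conf file. The helper name set_top_level is hypothetical (it is not part of Recoll), and the sketch assumes the usual "name = value" syntax with "[...]" lines starting per-directory sections, and no backslash-continued value for the variable being set. Editing the file by hand works just as well.

```python
import re


def set_top_level(conf_text: str, name: str, value: str) -> str:
    """Return conf_text with `name = value` defined in the global part.

    Hypothetical helper, not part of Recoll. An existing global
    assignment is replaced; otherwise the new line is placed first,
    so that it cannot end up inside a "[...]" section.
    """
    new_line = f"{name} = {value}"
    out = []
    in_global = True
    done = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            # A section header ends the global part of the file.
            in_global = False
        if in_global and not done and re.match(rf"{re.escape(name)}\s*=", stripped):
            out.append(new_line)
            done = True
        else:
            out.append(line)
    if not done:
        out.insert(0, new_line)
    return "\n".join(out) + "\n"


# Example on an in-memory configuration (paths from the text above).
sample = "topdirs = /home/me/mydata\n"
updated = set_top_level(sample, "orgidxconfdir", "/home/me/mydata/recoll-confdir")
print(updated)
```

To apply this to the example dataset, you would read /home/me/mydata/recoll-confdir/recoll.conf, pass its contents through the function, and write the result back before moving the dataset.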

If, after the move, the configuration directory needs to be copied out of the dataset (for example because the thumb drive is too slow), you can set the curidxconfdir variable inside the copied configuration to define the location of the moved one. For example, if /home/me/mydata is now mounted onto /media/me/somelabel, but the configuration directory and index have been copied to /tmp/tempconfig, you would set curidxconfdir to /media/me/somelabel/recoll-confdir inside /tmp/tempconfig/recoll.conf. orgidxconfdir would still be /home/me/mydata/recoll-confdir in both the original and the copy.
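The following Python sketch is a conceptual illustration only of why these two variables are sufficient to locate the documents again. It is not taken from Recoll's source code, and the function name translate and the document path are made up for the example. The idea assumed here is that a document keeps the same position relative to the configuration directory before and after the move.

```python
import posixpath


def translate(doc_path: str, orgidxconfdir: str, current_confdir: str) -> str:
    """Map a path recorded at indexing time to its location after the move.

    Conceptual sketch, not Recoll's implementation. current_confdir is
    where the configuration directory sits inside the moved dataset
    (the value you would give to curidxconfdir if the configuration
    has been copied elsewhere).
    """
    # Position of the document relative to the original configuration directory.
    rel = posixpath.relpath(doc_path, start=orgidxconfdir)
    # Apply the same relative position from the moved configuration directory.
    return posixpath.normpath(posixpath.join(current_confdir, rel))


new_path = translate(
    "/home/me/mydata/docs/report.pdf",        # hypothetical indexed document
    "/home/me/mydata/recoll-confdir",         # orgidxconfdir
    "/media/me/somelabel/recoll-confdir",     # curidxconfdir
)
print(new_path)
```

This also shows why the configuration directory has to live somewhere under the directory being moved: the relative position between it and the documents is the only thing that survives the move.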

If you are regularly copying the configuration out of the dataset, it will be useful to write a script to automate the procedure. This can't really be done inside Recoll because there are probably many possible variants. One example would be to copy the configuration to make it writable, but keep the index data on the medium because it is too big. In this case, the script would also need to set dbdir in the copied configuration.
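A minimal sketch of such a script in Python follows. It is one possible variant, not a recommended procedure: the function name copy_config_out is hypothetical, the index is assumed to live in the default xapiandb subdirectory of the configuration directory, the dataset's recoll.conf is assumed not to define curidxconfdir or dbdir already, and the destination directory must not exist yet.

```python
import shutil
from pathlib import Path


def copy_config_out(moved_confdir, local_confdir, keep_index_on_medium=True):
    """Copy a configuration out of a moved dataset and adjust the copy.

    Hypothetical helper, not part of Recoll. Returns the path of the
    copied recoll.conf.
    """
    moved = Path(moved_confdir)
    local = Path(local_confdir)

    # Optionally leave the (possibly large) index on the medium.
    # "xapiandb" is assumed to be the index directory name.
    ignore = shutil.ignore_patterns("xapiandb") if keep_index_on_medium else None

    # copyfile copies file contents only, so the copied files do not
    # inherit read-only permission bits from the medium.
    shutil.copytree(moved, local, ignore=ignore, copy_function=shutil.copyfile)

    # Record where the configuration sits inside the moved dataset.
    extra = [f"curidxconfdir = {moved}"]
    if keep_index_on_medium:
        # Point the copied configuration at the index left on the medium.
        extra.append(f"dbdir = {moved / 'xapiandb'}")

    conf = local / "recoll.conf"
    original = conf.read_text(encoding="utf-8") if conf.exists() else ""
    # Prepend, so that the new lines stay in the global part of the file.
    conf.write_text("\n".join(extra) + "\n" + original, encoding="utf-8")
    return conf
```

With the paths used in the example above, the call would be copy_config_out("/media/me/somelabel/recoll-confdir", "/tmp/tempconfig"), after which queries can be run with recoll -c /tmp/tempconfig.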

The same set of modifications (Recoll 1.24) has also made it possible to run queries from a read-only configuration directory (with slightly reduced functionality, of course, such as not recording the query history).