Parameters affecting what documents we index

topdirs

Space-separated list of files or directories to recursively index. Default to ~ (indexes $HOME). You can use symbolic links in the list, they will be followed, independently of the value of the followLinks variable.

monitordirs

Space-separated list of files or directories to monitor for updates. When running the real-time indexer, this allows monitoring only a subset of the whole indexed area. The elements must be included in the tree defined by the 'topdirs' members.

skippedNames

Files and directories which should be ignored. White space separated list of wildcard patterns (simple ones, not paths, must contain no '/' characters), which will be tested against file and directory names.

Have a look at the default configuration for the initial value, some entries may not suit your situation. The easiest way to see it is through the GUI Index configuration "local parameters" panel.

The list in the default configuration does not exclude hidden directories (names beginning with a dot), which means that it may index quite a few things that you do not want. On the other hand, email user agents like Thunderbird usually store messages in hidden directories, and you probably want this indexed. One possible solution is to have ".*" in "skippedNames", and add things like "~/.thunderbird" "~/.evolution" to "topdirs".

Not even the file names are indexed for patterns in this list, see the "noContentSuffixes" variable for an alternative approach which indexes the file names. Can be redefined for any subtree.

skippedNames-

List of name endings to remove from the default skippedNames list.

skippedNames+

List of name endings to add to the default skippedNames list.

onlyNames

Regular file name filter patterns If this is set, only the file names not in skippedNames and matching one of the patterns will be considered for indexing. Can be redefined per subtree. Does not apply to directories.

noContentSuffixes

List of name endings (not necessarily dot-separated suffixes) for which we don't try MIME type identification, and don't uncompress or index content. Only the names will be indexed. This complements the now obsoleted recoll_noindex list from the mimemap file, which will go away in a future release (the move from mimemap to recoll.conf allows editing the list through the GUI). This is different from skippedNames because these are name ending matches only (not wildcard patterns), and the file name itself gets indexed normally. This can be redefined for subdirectories.

noContentSuffixes-

List of name endings to remove from the default noContentSuffixes list.

noContentSuffixes+

List of name endings to add to the default noContentSuffixes list.

skippedPaths

Absolute paths we should not go into. Space-separated list of wildcard expressions for absolute filesystem paths (for files or directories). The variable must be defined at the top level of the configuration file, not in a subsection.

Any value in the list must be textually consistent with the values in topdirs, no attempts are made to resolve symbolic links. In practise, if, as is frequently the case, /home is a link to /usr/home, your default topdirs will have a single entry '~' which will be translated to '/home/yourlogin'. In this case, any skippedPaths entry should start with '/home/yourlogin' *not* with '/usr/home/yourlogin'.

The index and configuration directories will automatically be added to the list.

The expressions are matched using 'fnmatch(3)' with the FNM_PATHNAME flag set by default. This means that '/' characters must be matched explicitly. You can set 'skippedPathsFnmPathname' to 0 to disable the use of FNM_PATHNAME (meaning that '/*/dir3' will match '/dir1/dir2/dir3').

The default value contains the usual mount point for removable media to remind you that it is in most cases a bad idea to have Recoll work on these Explicitly adding '/media/xxx' to the 'topdirs' variable will override this.

skippedPathsFnmPathname

Set to 0 to override use of FNM_PATHNAME for matching skipped paths.

nowalkfn

File name which will cause its parent directory to be skipped. Any directory containing a file with this name will be skipped as if it was part of the skippedPaths list. Ex: .recoll-noindex

daemSkippedPaths

skippedPaths equivalent specific to real time indexing. This enables having parts of the tree which are initially indexed but not monitored. If daemSkippedPaths is not set, the daemon uses skippedPaths.

zipUseSkippedNames

Use skippedNames inside Zip archives. Fetched directly by the rclzip.py handler. Skip the patterns defined by skippedNames inside Zip archives. Can be redefined for subdirectories. See https://www.recoll.org/faqsandhowtos/FilteringOutZipArchiveMembers.html

zipSkippedNames

Space-separated list of wildcard expressions for names that should be ignored inside zip archives. This is used directly by the zip handler. If zipUseSkippedNames is not set, zipSkippedNames defines the patterns to be skipped inside archives. If zipUseSkippedNames is set, the two lists are concatenated and used. Can be redefined for subdirectories. See https://www.recoll.org/faqsandhowtos/FilteringOutZipArchiveMembers.html

followLinks

Follow symbolic links during indexing. The default is to ignore symbolic links to avoid multiple indexing of linked files. No effort is made to avoid duplication when this option is set to true. This option can be set individually for each of the 'topdirs' members by using sections. It can not be changed below the 'topdirs' level. Links in the 'topdirs' list itself are always followed.

indexedmimetypes

Restrictive list of indexed mime types. Normally not set (in which case all supported types are indexed). If it is set, only the types from the list will have their contents indexed. The names will be indexed anyway if indexallfilenames is set (default). MIME type names should be taken from the mimemap file (the values may be different from xdg-mime or file -i output in some cases). Can be redefined for subtrees.

excludedmimetypes

List of excluded MIME types. Lets you exclude some types from indexing. MIME type names should be taken from the mimemap file (the values may be different from xdg-mime or file -i output in some cases) Can be redefined for subtrees.

nomd5types

Don't compute md5 for these types. md5 checksums are used only for deduplicating results, and can be very expensive to compute on multimedia or other big files. This list lets you turn off md5 computation for selected types. It is global (no redefinition for subtrees). At the moment, it only has an effect for external handlers (exec and execm). The file types can be specified by listing either MIME types (e.g. audio/mpeg) or handler names (e.g. rclaudio.py).

compressedfilemaxkbs

Size limit for compressed files. We need to decompress these in a temporary directory for identification, which can be wasteful in some cases. Limit the waste. Negative means no limit. 0 results in no processing of any compressed file. Default 100 MB.

textfilemaxmbs

Size limit for text files. Mostly for skipping monster logs. Default 20 MB. Use a value of -1 to disable.

textunknownasplain

Process unknown text/xxx files as text/plain Allows indexing misc. text files identified as text/whatever by 'file' or 'xdg-mime' without having to explicitely set config entries for them. This works fine for indexing (but will cause processing of a lot of garbage though), but the documents indexed this way will be opened by the desktop viewer, even if text/plain has a specific editor.

indexallfilenames

Index the file names of unprocessed files Index the names of files the contents of which we don't index because of an excluded or unsupported MIME type.

usesystemfilecommand

Use a system mechanism as last resort to guess a MIME type. Depending on platform and version, a compile-time configuration will decide if this executes a command or uses libmagic. This is generally useful, but will usually cause the indexing of many bogus extension-less 'text' files. See 'systemfilecommand' for the command used.

systemfilecommand

Command to use for guessing the MIME type if the internal methods fail. This is ignored on Windows or with Recoll 1.38+ if compiled with --enable-libmagic (the default). Otherwise, this should be a "file -i" workalike. The file path will be added as a last parameter to the command line. "xdg-mime" works better than the traditional "file" command, and is now the configured default (with a hard-coded fallback to "file")

processwebqueue

Decide if we process the Web queue. The queue is a directory where the Recoll Web browser plugins create the copies of visited pages.

textfilepagekbs

Page size for text files. If this is set, text/plain files will be divided into documents of approximately this size. Will reduce memory usage at index time and help with loading data in the preview window at query time. Particularly useful with very big files, such as application or system logs. Also see textfilemaxmbs and compressedfilemaxkbs.

membermaxkbs

Size limit for archive members. This is passed to the filters in the environment as RECOLL_FILTER_MAXMEMBERKB.