Using Synonyms (1.22)

Term synonyms can be used for searching text in two main ways:

  • At index creation time, they can be used to alter the indexed terms, either increasing or decreasing their number, by expanding the original terms to all synonyms, or by reducing all synonym terms to a canonical one.

  • At query time, they can be used to match texts containing terms which are synonyms of the ones specified by the user, either by expanding the query for all synonyms, or by reducing the user entry to canonical terms (the latter only works if the corresponding processing has been performed while creating the index).

Recoll only uses synonyms at query time. A user query term which is part of a synonym group will optionally be expanded into an OR query for all terms in the group.

Synonym groups are defined inside ordinary text files. Each line in the file defines a group.

Example:

        hi hello "good morning"

        # not sure about "au revoir" though. Is this english ?
        bye goodbye "see you" \
        "au revoir" 
      

As usual, lines beginning with a # are comments, empty lines are ignored, and lines can be continued by ending them with a backslash.
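The file format rules above can be sketched in a few lines of Python. This is only an illustration of the syntax, not Recoll's own parser: the function name parse_synonyms is hypothetical, and using shlex for the quoting rules is an assumption (for instance, it treats apostrophes as quote characters, which the real parser may not do).

```python
import shlex

def parse_synonyms(text):
    """Parse synonyms-file text into a list of groups (each a list of terms)."""
    groups = []
    pending = ""  # accumulates a logical line across backslash continuations
    for raw in text.splitlines():
        line = raw.strip()
        # Outside a continuation, skip empty lines and '#' comment lines.
        if not pending and (not line or line.startswith("#")):
            continue
        # A trailing backslash means the group continues on the next line.
        if line.endswith("\\"):
            pending += line[:-1] + " "
            continue
        pending += line
        if pending.strip():
            # shlex.split keeps a double-quoted multi-word term as one item.
            groups.append(shlex.split(pending))
        pending = ""
    if pending.strip():
        groups.append(shlex.split(pending))
    return groups

# The example file shown above, as a raw string so the backslash is kept.
SAMPLE = r'''
hi hello "good morning"

# not sure about "au revoir" though. Is this english ?
bye goodbye "see you" \
"au revoir"
'''

groups = parse_synonyms(SAMPLE)
for group in groups:
    print(group)
```

Each resulting list is one synonym group, with multi-word synonyms kept as single entries.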

Multi-word synonyms are supported, but be aware that these will generate phrase queries, which may degrade performance and will disable stemming expansion for the phrase terms.
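As a rough illustration of the expansion, the sketch below turns a user term into an OR query over its group, writing multi-word synonyms as quoted phrases. The function names and the textual output are assumptions made for the example; Recoll performs the expansion internally and this is not its actual code.

```python
def as_query_element(term):
    """Write a multi-word synonym as a quoted phrase, a single word as-is."""
    return f'"{term}"' if " " in term else term

def expand_term(term, groups):
    """Return an OR query over the group containing term, or term unchanged."""
    for group in groups:
        if term in group:
            return " OR ".join(as_query_element(t) for t in group)
    return as_query_element(term)

# Groups matching the example file shown earlier in this section.
example_groups = [
    ["hi", "hello", "good morning"],
    ["bye", "goodbye", "see you", "au revoir"],
]

print(expand_term("bye", example_groups))
print(expand_term("smith", example_groups))
```

The quoted elements correspond to the phrase queries mentioned above, which is where the performance cost and the loss of stemming expansion come from.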

The synonyms file can be specified in the Search parameters tab of the GUI configuration Preferences menu entry, or as an option for command-line searches.

Once the file is defined, the use of synonyms can be enabled or disabled directly from the Preferences menu.

The synonyms are searched for matches with user terms after the latter are stem-expanded, but the contents of the synonyms file itself are not subjected to stem expansion. This means that a match will not be found if the form present in the synonyms file is not present anywhere in the document set.
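The following toy model may help to picture this ordering. Everything in it is an assumption made for illustration: toy_stem is a crude stand-in for a real stemmer, stem expansion is modelled as "terms from the index sharing the stem, plus the user term itself", and none of the names come from Recoll.

```python
def toy_stem(word):
    """Hypothetical, very crude stemmer: strips a trailing 'ing' or 's'."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def stem_expand(term, index_terms):
    """Forms present in the index that share the stem, plus the term itself."""
    stem = toy_stem(term)
    return {t for t in index_terms if toy_stem(t) == stem} | {term}

def synonyms_for(term, index_terms, groups):
    """Look up the stem-expanded forms, unchanged, in the synonym groups."""
    forms = stem_expand(term, index_terms)
    found = set()
    for group in groups:
        # Group members are compared literally; they are never stemmed.
        if forms & set(group):
            found.update(group)
    return found

toy_groups = [["floor", "ground"]]

# "floor" occurs in the documents, so expanding "floors" reaches the group.
with_form = synonyms_for("floors", {"floor", "floors", "ground"}, toy_groups)
# "floor" occurs nowhere in the documents, so the group is never matched.
without_form = synonyms_for("floors", {"floors", "ground"}, toy_groups)

print(sorted(with_form))
print(sorted(without_form))
```

In the second case the synonyms file does contain a related form, but since that exact form is absent from the document set, the expansion of the user term never produces it and no synonym is applied.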

The synonyms function is probably not going to help you find your letters to Mr. Smith. It is best used for domain-specific searches. For example, it was initially suggested by a user performing searches among historical documents: the synonyms file would contain nicknames and aliases for each of the persons of interest.