Introduction

The Recoll Python programming interface can be used both for searching and for creating/updating an index with a program run by the Python3 interpreter. It is available on all platforms (Unix-like systems, MS Windows, Mac OS).

The search interface is used in a number of active projects: the Recoll Gnome Shell Search Provider, the Recoll Web UI, and the upmpdcli UPnP Media Server, in addition to many small scripts.

The index updating part of the API can be used to create and update Recoll indexes. Before Recoll 1.37, such indexes had to use separate configurations (although they could be queried together with the regular index). As of Recoll 1.37, an external indexer based on the Python extension can update the main index. For example, the Recoll indexer for the Joplin notes application uses this method.

The search API is modeled on the Python database API version 2.0 specification (early versions used the version 1.0 spec).

The recoll package contains two modules:

  • The recoll module contains functions and classes used to query or update the index.

  • The rclextract module contains functions and classes used at query time to access document data. This can be used, for example, for extracting embedded documents into standalone files.
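As a rough illustration of the extraction use case mentioned above, the following sketch writes the document behind a query result to a file. It assumes that rclextract provides an Extractor class built from a result document, with an idoctofile method taking the document's ipath and MIME type. The names extract_to_file and make_extractor are hypothetical and exist only for this example.

```python
def extract_to_file(doc, make_extractor=None):
    """Write the (possibly embedded) document behind a query result to a
    file and return the file name.

    extract_to_file and make_extractor are hypothetical names for this
    sketch; Extractor and idoctofile are assumed to come from rclextract.
    """
    if make_extractor is None:
        # Imported lazily so that the function can be tried with a stand-in
        # extractor on a machine without the Recoll Python package.
        from recoll import rclextract
        make_extractor = rclextract.Extractor
    extractor = make_extractor(doc)
    # No output file name is passed here, so the extractor is expected to
    # choose one itself; the caller should delete the file when done.
    return extractor.idoctofile(doc.ipath, doc.mimetype)
```

In a real program, doc would be one of the objects returned by a query, for example by fetchone().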

There is a good chance that your system repository has packages for the Recoll Python API, sometimes in a package separate from the main one (maybe named something like python-recoll). Otherwise, refer to the Building from source chapter.

As an introductory example, the following small program runs a query and lists the URL and title of each result. The python/samples source directory contains several examples of Python programming with Recoll, which exercise the extension more completely, especially its data extraction features.

#!/usr/bin/python3

from recoll import recoll

# Open the index of the default configuration
db = recoll.connect()
query = db.query()
# Run the query
nres = query.execute("some query")
# Fetch at most 20 results
results = query.fetchmany(20)
for doc in results:
    print("%s %s" % (doc.url, doc.title))
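The same steps can be wrapped in a pair of small functions, which makes the query reusable from other code. This is only a sketch: format_hit and search are hypothetical helper names, not part of the Recoll API, and the "(no title)" placeholder is an arbitrary choice for results whose title is missing or empty.

```python
def format_hit(doc):
    """Return 'url title' for one result, tolerating a missing title."""
    title = getattr(doc, "title", None) or "(no title)"
    return "%s %s" % (doc.url, title)


def search(query_string, maxhits=20):
    """Query the default index and return one formatted line per result."""
    # Imported here so that format_hit stays usable on a machine where
    # the Recoll Python package is not installed.
    from recoll import recoll

    db = recoll.connect()
    query = db.query()
    query.execute(query_string)
    # fetchmany() returns at most maxhits result documents
    return [format_hit(doc) for doc in query.fetchmany(maxhits)]
```

With an existing index, calling search("some query") and printing each returned line should give output comparable to the program above.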

You can also take a look at the source for (in order of complexity) the Recoll Gnome Shell Search Provider or WebUI, and the upmpdcli local media server.