The update methods are part of the recoll module described above. The connect() method is used with a writable=True parameter to obtain a writable Db object. The following Db object methods are then available.
- addOrUpdate(udi, doc, parent_udi=None)
Add or update index data for a given document. The udi string must define a unique id for the document. It is an opaque interface element and is not interpreted inside Recoll. doc is a Doc object, created from the data to be indexed (the main text should be in doc.text). If parent_udi is set, this is a unique identifier for the top-level container (e.g. for the filesystem indexer, this would be the one which is an actual file). The doc url and possibly ipath fields should also be set to allow access to the actual document after a query. Other fields should also be set to allow further access to the data, see the description further down: rclbes, sig, mimetype. Of course, any standard or custom Recoll field can also be added.
- delete(udi)
Purge the index of all data for udi, and of all documents (if any) which have a matching parent_udi.
- needUpdate(udi, sig)
Test if the index needs to be updated for the document identified by udi. If this call is to be used, the doc.sig field should contain a signature value when calling addOrUpdate(). The needUpdate() call then compares its parameter value with the stored sig for udi. sig is an opaque value, compared as a string.
The filesystem indexer uses a concatenation of the decimal string values for file size and update time, but a hash of the contents could also be used.
As a side effect, if the return value is false (the index is up to date), the call will set the existence flag for the document (and any subdocument defined by its parent_udi), so that a later purge() call will preserve them.
The use of needUpdate() and purge() is optional, and the indexer may use another method for checking the need to reindex or to delete stale entries.
- purge()
Delete all documents that were not touched during the just-finished indexing pass (since open-for-write). These are the documents for which the needUpdate() call was not performed, indicating that they no longer exist in the primary storage system.
- createStemDbs(lang|sequence of langs)
Create stemming dictionaries for query stemming expansion. Should be called when done updating the index. Available only after Recoll 1.34.3. As an alternative, you can close the index and execute:
recollindex -c <confdir> -s <lang(s)>
The Python module currently has no interface to the Aspell speller functions, so the same approach can be used for creating the spelling dictionary (with option -S).
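The needUpdate(), addOrUpdate() and purge() calls described above could be combined into one indexing pass roughly as follows. This is only a sketch under stated assumptions, not the reference indexer: file_sig(), run_pass() and the backend parameter are hypothetical names, the file path is assumed to be an acceptable udi, every file is assumed to be plain text, and the Doc factory is passed in by the caller (recoll.Doc would be the natural choice with the real module).

```python
import os


def file_sig(path):
    """Build a signature the way the text describes for the filesystem
    indexer: decimal file size concatenated with decimal update time."""
    st = os.stat(path)
    return f"{st.st_size}{int(st.st_mtime)}"


def run_pass(db, new_doc, paths, backend):
    """Run one indexing pass over `paths` and return the udis (re)indexed.

    db      -- a writable Db object
    new_doc -- callable returning an empty Doc object
    backend -- value stored in the rclbes field (hypothetical parameter)
    """
    updated = []
    for path in paths:
        udi = path  # assumption: the path is unique enough to serve as udi
        sig = file_sig(path)
        # A false return means the index is up to date. It also sets the
        # existence flag, so the purge() below keeps this document.
        if not db.needUpdate(udi, sig):
            continue
        doc = new_doc()
        doc.url = "file://" + path
        doc.mimetype = "text/plain"  # assumption: plain text input only
        doc.rclbes = backend
        doc.sig = sig  # stored so that later needUpdate() calls can compare
        with open(path, encoding="utf-8", errors="replace") as f:
            doc.text = f.read()
        db.addOrUpdate(udi, doc)
        updated.append(udi)
    # Remove documents that were not touched during this pass.
    db.purge()
    return updated


# Possible use with the real module (not executed here):
#   from recoll import recoll
#   db = recoll.connect(writable=True)
#   run_pass(db, recoll.Doc, ["/some/file.txt"], "DEMO")
```

Passing the Db object and the Doc factory as parameters keeps the loop independent of how the index was opened, and makes it possible to try the logic against a stand-in object before touching a real index.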