Release notes for Recoll 1.25.x
Installing over an older version
1.20-25 indexes are fully backward compatible. Installing 1.25 over a 1.19 index is possible, but there have been small changes in the way compound words (e.g. email addresses) are indexed, so it is best to reset the index. Still, in a pinch, 1.25 search can mostly use a 1.19 index.
New index format with Xapian 1.4: the default on-disk format of Xapian 1.4 (Glass) has changed to improve the performance of phrase searches. This had the unfortunate consequence of making the Recoll snippets generation method excessively slow except for very small indexes. As a consequence, new indexes created by Recoll 1.24/25 using Xapian 1.4 have a different format and store the document texts inside the index. No specific action is required from the user, unless you have an old index and want to use the new format (nicer snippets, faster phrase searches), in which case you should delete the old index (see next).
Always reset the index if you do not know which version created it (e.g. you're not sure it's at least 1.18). The best method is to quit all Recoll programs and delete the index directory (rm -rf ~/.recoll/xapiandb), then start Recoll again. recollindex -z will do the same in most, but not all, cases. It's better to use the rm method, which will also ensure that no debris from older releases remains (e.g. old stemming files which are not used any more).
On Windows, the index is located by default in C:/Users/[me]/AppData/Local/Recoll/xapiandb.
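As an illustration of the reset procedure described above, here is a minimal Python sketch that removes an index directory. It is not part of Recoll: the function name reset_index is hypothetical, and the path passed to it is assumed to be the default location given in the text (adjust it if your configuration stores the index elsewhere). Quit all Recoll programs before running anything like this.

```python
import shutil
from pathlib import Path


def reset_index(index_dir):
    """Delete an index directory so that the next indexing run starts clean.

    Hypothetical helper, not a Recoll API. Returns True if a directory
    was removed, False if there was nothing to delete.
    """
    path = Path(index_dir).expanduser()
    if not path.is_dir():
        return False
    shutil.rmtree(path)
    return True


# Example call for the default Unix location mentioned above:
#   reset_index("~/.recoll/xapiandb")
```

This does the same thing as the rm -rf command: the whole directory goes away, including any leftover files from older releases, and the index is rebuilt from scratch on the next indexing run.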
Case/diacritics sensitivity is off by default. It can be turned on only by editing recoll.conf (see the manual). If you do so, you must then reset the index.
Changes in Recoll 1.25 (.3)
- GUI: the search entry now has a completion window with contents from the search history and the index terms. This can be turned off in the GUI preferences.
- Better support for mountable volumes: empty elements of topdirs are ignored (the corresponding index data is not purged), making indexing while the volume is not mounted possible.
- GUI: remember and manage maximized window state.
- Path translations changes are used without need for a GUI restart.
- Windows: most system interfaces have been converted to using Unicode (wide chars), and file names are now stored as UTF-8. This fixes a number of serious issues with file names which could not be translated in the default multibyte code page.
- Windows: the installation includes a stripped down Python3 package, no need for a separate Python installation.
- All non-internal input handlers now use Python3.
- Most XML formats which were processed by a Python script interpreting an XSL stylesheet are now processed internally in the C++ code. The place for customization is the stylesheet; the Python code was identical for all formats. This creates a direct dependency between the Recoll code and libxml2/libxslt, but the dependency existed before through the Python module anyway. As an exception, .pptx files are still processed by the script.
- Windows: image tags: the Python3 version of the module previously used to extract image tags under Windows proved too difficult to port, so the image tag extraction utility is now Perl-based (like on Unix, but it is a single-file Perl application on Windows).
Minor releases at a glance
- Fix mbox parser: had issues with gmail dumps.
- GUI: clickable links in about dialog, fix F1 not starting help anymore.
- Windows: avoid purging the index if a share is disconnected while we work. Avoid picking up the wrong pdftotext.
- GUI: ensure WM_CLASS is set to help window management.
- GUI: the size of a prefs entry (restable column widths) could repeatedly double under some circumstances, resulting in an enormous preferences file and slow or impossible startup.
- Linux document Open: pass an encoded URL to xdg-open.
- GUI action editor: ensure that the current row data is always reflected when the row changes, not only on mouse clicks.
- GUI mbox preview: wrong message was displayed for non-cached folders.
- GUI: fix highlighting in table mode (broken again...).
- Windows: fix preview access for big mbox files. The offset cache was not working at all, resulting in extremely slow access to messages.
- GUI: preview: improve operation when the original file has changed or can't be read: use stored data.
- GUI: result table contents were not reset when the search was cleared.
- Windows: fix processing of accented characters in paths in the configuration GUI.
- Windows: fix temporary file deletion. Many were not deleted because they were still opened by the handler when we tried to remove them. Fixed a number of handlers to properly close the file, and add a cleanup pass after the handler cache is cleaned at the end of indexing.
- 1.25.19: fix misc issues in the pdf handler (XMP metadata handling and broken OCR in some cases).
- 1.25.18: fix the webengine version of the GUI.
- 1.25.17: misc issues
- Fixed bug in snippets generation. We sometimes dropped the best snippets in case of phrase searches.
- 1.25.16: misc issues
- Small improvements in synonyms handling, significant for big synonym files.
- Raw indexes: fix test for capitals in user input in a few marginal cases.
- GUI: fix highlighting in table mode.
- GUI: add "Open" button to preview window.
- 1.25.15: kio build issues and Python module memory leak.
- The Python module was not recovering memory from documents fetched by a query.
- Forgot an adjustment needed for the kio to build after the changes in 1.25.13.
- Error in djvutxt would produce incorrect output for non-ASCII files.
- Caught more exceptions in some Python handlers to avoid system reports.
- 1.25.13: fix GUI crash
- The GUI would crash when reusing a result list after changing preferences (and in other cases too maybe).
- 1.25.11: performance improvement for big pdf and xls files
- pdf, xls and others: avoid building a big Python string by appending. Use a list and join() at the end instead. Massive performance improvement of the input handler for very big files (~10x for 100000 lines).
- Less noise when processing an xls file fails.
- Add an OLE file identification check to avoid needless processing and possible loops.
- 1.25.9: Crash in Python module.
- Fixes crash in Python module. This was causing crashes in the upmpdcli local media server uprcl module, and could possibly have affected recoll-WebUI users too. The problem was present in all semi-recent versions.
- 1.25.8: Small improvements.
- Abstract building did not work when external indexes had been added and the query ran against multiple text-storing indexes.
- Slightly modified the text extraction code so that errors let the file have an index record, with at least its name indexed. Also improves the behaviour of retries.
- Added the pylogfilename and pyloglevel variables to allow logging for the Python module to be set up separately.
- Windows: be more selective about what power events interrupt an indexing.
- Windows: improved antiword text extraction for some table kinds.
- 1.25.5: Small improvements.
- Automatically start search after simple search entry completer activation. You can disable it through a preference.
- Restore the focus to the search entry after a completer selection.
- Change the Preview window title to the user entry for simple searches.
- Windows: have the file selection dialog initially default to the desktop instead of the installation directory.
- Windows: only retry failed files after a software update.
- Python module: fix 'for doc in query' generating an exception when query has no results.
- 1.25.4: Fixes a crash and some issues.
- The snippets generator would crash the GUI in a relatively rare case.
- Conversion errors on archive members ended up mangling the ipaths.
- Preview: multi-word search strings were concatenated without spaces.
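The 1.25.11 entry above describes replacing repeated string appends with a list and a single join() at the end. The following minimal sketch contrasts the two approaches. It is not the actual handler code: both function names are hypothetical, and the real handlers may work on different data types. The actual speedup depends on the interpreter and on the input.

```python
def build_by_append(lines):
    """Grow one string by repeated concatenation (the approach being replaced)."""
    out = ""
    for line in lines:
        # Each += may copy the whole accumulated string.
        out += line + "\n"
    return out


def build_by_join(lines):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    for line in lines:
        parts.append(line)
        parts.append("\n")
    # A single pass allocates the final string once.
    return "".join(parts)


if __name__ == "__main__":
    sample = ["line %d" % i for i in range(100000)]
    # Both approaches must produce identical output.
    assert build_by_append(sample) == build_by_join(sample)
```

Both functions return the same text; only the way the intermediate result is accumulated differs, which is what matters for very big files.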