Unknown reporter writes

In proofreading #285 (can’t edit as I don’t have a bitbucket login), I realize paragraph one seems to contradict the final statement "all other rcltmp* present in /tmp seem to represent LEGITIMATE" extractions. My confusion (in understanding, troubleshooting, reporting) stems from the fact that although this has been a long-running desktop session (apparently using "suspend", vs "shutdown", at least since Dec 30, 2015), I can only troubleshoot based on the files which are CURRENTLY present in /tmp. So, I can’t "prove" it, but I do suspect that other not-residing-in-archivefile items have been affected. Across the past month, I’ve experienced MANY "wha?? I’m certain I’ve already edited that. Lemme try rebuilding the index" episodes.

Below, I’m describing a usability issue regarding working with extracted-from-archive file "opening".

I can’t guess how the webform will wrap/split the following. It is a single search results line from recoll gui:

Today, finally, I understand that line indicates a result found in (upload_docs.py) which is contained on disk within an archive file (setuptools-11.0-py2.py3-none-any.whl). Until today, I didn’t even realize that *.whl is an archive filetype. Of course, when I "open" this file for editing… I’ll be editing a copy of the file extracted to and residing in /tmp, NOT the original file. Problem: until today, it hasn’t been "of course". It’s been more like "wtf?" after realizing performed edits have been lost.
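
To make the "edits go to a copy" behaviour concrete, here is a minimal Python sketch. It is not Recoll's code: it only assumes that a .whl file is an ordinary zip archive, and the file names and the "rcltmp-demo" directory are made up for illustration. It extracts one member to a temporary directory, edits the extracted copy, and then reads the member back from the archive.

```python
import os
import tempfile
import zipfile


def demo_edit_extracted_copy():
    """Show that editing an extracted member leaves the archive untouched."""
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical names, chosen only for this illustration.
        archive_path = os.path.join(workdir, "example.whl")
        member = "pkg/upload_docs.py"

        # Build a small zip archive (a wheel is a zip file).
        with zipfile.ZipFile(archive_path, "w") as zf:
            zf.writestr(member, "original\n")

        # Extract the member to a scratch directory, as an "open" would.
        extract_dir = os.path.join(workdir, "rcltmp-demo")
        with zipfile.ZipFile(archive_path) as zf:
            extracted = zf.extract(member, extract_dir)

        # Edit the extracted copy.
        with open(extracted, "w") as f:
            f.write("edited\n")

        # Read both versions back: the archive member and the edited copy.
        with zipfile.ZipFile(archive_path) as zf:
            inside = zf.read(member).decode()
        with open(extracted) as f:
            copy = f.read()
        return inside, copy


if __name__ == "__main__":
    inside, copy = demo_edit_extracted_copy()
    print("in archive:", repr(inside))
    print("extracted copy:", repr(copy))
```

The archive member keeps its original content; only the temporary copy changes, and it disappears when the temporary directory is cleaned up.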

Toward improved usability, I’m suggesting that the searchresults template should be revised (emphasis/color?) to more clearly distinguish matched documents residing within archive files. To prevent me from shooting myself in the foot (losing edits) maybe even a popup dialogbox informing "document will be extracted from archive and opened in your editor" would be warranted here?

Although I’m not an "icon-centric" person, I kinda-sorta recall that for most of those "lost my edits" incidents, the affected files were python scripts, and that a yellow+blue pythonic icon was displayed in the search results. In the example I’m citing here, although that icon is "accurate", its presence (for a match result residing within an archive) is confusing.

With thanks, and best wishes for the New Year,


medoc writes

I am very sorry about your lost edits.

Nobody reported this as confusing up to now, but I can truly see how it can be, all the more if you like and trust the software, because you will tend to think that it can do things more difficult than what it actually does.

Being a programmer, and having written this code, of course, I know that Recoll can’t be replacing pieces of archives. This is what comes from doing things alone: you find evident what will confuse others.

I entirely concur with your suggestion that there should be a warning in this situation. I think that the right place would be when you click Open, not in the result list. Open is the only dangerous operation in this regard.
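
As a rough illustration of warning at Open time rather than in the result list, here is a small Python sketch. It is not the actual Recoll implementation: the `Result` class, its `ipath` field (standing for the internal path of a subdocument inside a container) and the message text are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical search result: a file URL plus an optional internal path."""
    url: str
    ipath: str = ""  # empty for a plain file, non-empty for an archive member


def needs_copy_warning(result: Result) -> bool:
    # A non-empty internal path means the document lives inside a container,
    # so opening it can only operate on a temporary extracted copy.
    return bool(result.ipath)


def open_message(result: Result) -> str:
    # Return the text a dialog could show before handing off to the editor.
    if needs_copy_warning(result):
        return ("Opening a temporary copy extracted from "
                f"{result.url}; edits will not be saved to the archive.")
    return f"Opening {result.url}"


if __name__ == "__main__":
    print(open_message(Result("file:///home/me/notes.py")))
    print(open_message(Result("file:///home/me/pkg.whl", "pkg/upload_docs.py")))
```

The check runs only when the user asks to open a result, so browsing the list is not affected.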

I think that an option to suppress archive members from the results would also be useful. Colouring the affected results would be distracting for other usages, and not as useful when you want to see real files only.

medoc writes

I have looked into finding a reasonably simple way to exclude subdocuments from results, but this would be seriously complicated, needing a change to the query language etc. Maybe one day.

The warning dialog is implemented by https://bitbucket.org/medoc/recoll/commits/c280089f47c3d9dea300472a1600adfca2a37ce9 and will be in recoll 1.21.4.

There are 2 possible ways to help manage the result list: use the query fragments tool to easily restrict the searches to a subtree (supposing that live files and archives are not mixed up too much), or edit the result paragraph format to highlight the internal path, for example with <b>%i</b>, so that the results which are not files become very apparent.