humble_user writes

Base the indexer’s GUI on a file browser (ideally, actually use the local file browser): show the actual view of the file system, colour-coded by index status:

Blue : folder in index

Green : file in index

Yellow / normal : file excluded from index

Red : folder / file not in index

Pink : folder / file changed since index

Grey : folder / file missing since index
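To make the colour scheme concrete, here is a minimal Python sketch of how an entry might be classified. Everything in it is an assumption for illustration: the `Colour` enum, the `classify` function, and the idea that both the file system and the index can be summarised as a mapping from path to `(is_dir, mtime)`. It is not how Recoll stores or exposes its index.

```python
from enum import Enum


class Colour(Enum):
    BLUE = "folder in index"
    GREEN = "file in index"
    YELLOW = "file excluded from index"
    RED = "folder / file not in index"
    PINK = "folder / file changed since index"
    GREY = "folder / file missing since index"


def classify(path, on_disk, index, excluded):
    """Return the display colour for one path, or None if it is unknown.

    on_disk  -- hypothetical dict: path -> (is_dir, mtime) as found on disk now
    index    -- hypothetical dict: path -> (is_dir, mtime) recorded at index time
    excluded -- hypothetical set of paths the user has excluded
    """
    disk_entry = on_disk.get(path)
    index_entry = index.get(path)

    if disk_entry is None:
        # Not on disk any more: grey if the index still remembers it.
        return Colour.GREY if index_entry is not None else None
    if path in excluded:
        return Colour.YELLOW
    if index_entry is None:
        return Colour.RED

    is_dir, mtime = disk_entry
    if mtime > index_entry[1]:
        # Modified after the index recorded it.
        return Colour.PINK
    return Colour.BLUE if is_dir else Colour.GREEN
```

The order of the checks is a design choice: "missing" and "excluded" are tested before "not in index" so that an excluded file does not show up as red.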

Allow user to earmark selected files / folders using button / context-menu for certain actions:

< Include / Exclude selected folders / files from index > < Re-index files / folders >

And: < Commit changes to index > < Update all files / folders > < Rebuild index >

The separate option to commit changes to index would be important, not only to make it feasible for the indexer, but also so the user can toggle their choices on/off while curating the index.
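The earmark-then-commit idea could be sketched roughly as below. This is only an illustration in Python under my own assumptions: the `StagedIndex` class, its `toggle` and `commit` methods, and the plain set of included paths are all hypothetical and stand in for whatever the indexer really keeps.

```python
class StagedIndex:
    """Hypothetical holder for pending include/exclude choices."""

    def __init__(self, included):
        self.included = set(included)  # paths currently in the index
        self.pending = {}              # path -> "include" or "exclude"

    def toggle(self, path, action):
        """Earmark a path; earmarking it again with the same action undoes it."""
        if action not in ("include", "exclude"):
            raise ValueError("unknown action: %r" % (action,))
        if self.pending.get(path) == action:
            del self.pending[path]
        else:
            self.pending[path] = action

    def commit(self):
        """Apply all earmarked changes at once and return what was applied."""
        for path, action in self.pending.items():
            if action == "include":
                self.included.add(path)
            else:
                self.included.discard(path)
        applied = dict(self.pending)
        self.pending.clear()
        return applied
```

Nothing touches `included` until `commit` is called, so the user can switch choices on and off freely and the indexer only has to do its work once.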

The result would be:

  1. a more efficient index : because it would have less clutter

  2. a more up to date index : because the user would have better control over it

  3. a more manageable index : because the index is more likely to be updated quickly, as you go

Most ideal of all, this would be implemented as a modification of a decent file browser like Thunar (or most ideally, made common to all file browsers by common agreement). If so, then the index view would be something you would switch on/off in the view menu.

Speculation on implementation method:

Pardon me a moment for speculating on things that are beyond my ken, but I would imagine the only stickler would be mapping the index to the file system.

Ideally, the Linux file system would broadcast changes to names and locations of files and folders so applications like Recoll could keep track of them. I assume this cannot yet be done because I’m not aware of any application that does keep track of such changes.

(This has got to be one of the greatest inconveniences that is most easily rectified*).

Failing that, and if Recoll can’t simply monitor file system changes using the locate db or something similar, the GUI would at least make user control of the index more efficient.

(* Given that it is desirable and long overdue for apps to keep track of file system changes, then if such a solution as proposed in this aside is actually necessary, it could be implemented simply as a dated list of before/after path changes. That would allow all applications to update their repositories from the date when they last checked. Apps that used the track-change facility would register with it that they were doing so, and log with it the date of their last check. The facility could then do last-in/first-out on change entries from the last date where all registered apps had checked in. I can think of three apps off-hand that would instantly benefit from this: office apps and other editors that don’t track file system changes that affect files they have open; FreeFileSync; and of course Recoll.)
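A rough Python sketch of the facility described in this aside might look like the following. All names (`ChangeLog`, `register`, `record`, `changes_for`) are made up for illustration, timestamps are plain numbers, and the pruning rule is only one reading of the aside: entries are dropped once every registered app has checked in after them.

```python
class ChangeLog:
    """Hypothetical dated log of before/after path changes."""

    def __init__(self):
        self.entries = []     # (timestamp, old_path, new_path), oldest first
        self.last_check = {}  # app name -> timestamp of its last check

    def register(self, app, now):
        """An app announces that it will use the log from time `now` on."""
        self.last_check[app] = now

    def record(self, timestamp, old_path, new_path):
        """Log one rename or move."""
        self.entries.append((timestamp, old_path, new_path))

    def changes_for(self, app, now):
        """Return the changes logged since the app's last check."""
        since = self.last_check[app]
        result = [e for e in self.entries if e[0] > since]
        self.last_check[app] = now
        self._prune()
        return result

    def _prune(self):
        # Keep only entries that some registered app has not seen yet.
        oldest = min(self.last_check.values())
        self.entries = [e for e in self.entries if e[0] > oldest]
```

With two registered apps, an entry stays in the log until both have called `changes_for` at a later time, so the list never grows beyond what the slowest app still needs.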

medoc writes

File system monitoring. Interesting article here if you want background: http://www.lanedo.com/filesystem-monitoring-linux-kernel/

This exists, and the recollindex daemon mode uses inotify to check for changes.

About the file browser indexer interface: this is a great idea, but quite complicated to implement. I am not able to invest the time needed at the moment.

I am aware that you have had major performance issues with the indexing, but this should be investigated as it is not normal at all. For most users, an incremental indexing pass takes at most a few minutes, and even running the real-time indexer is not an issue. So they’ll just run the indexer once a day, or the real-time indexer, and won’t bother investigating exactly what is indexed or not (because in most cases, what should be indexed is indexed…).

Also, I’m not sure whether I mentioned that a search for dir:/path/to/my/dir will find all files indexed under a given directory?

If someone wants to implement this (which should probably best be done in Python), I’m ready to help, but I won’t do it myself.

humble_user writes

Righto. Cheers for the link and the offer of diagnosis help.