scasier writes

I think the file name search isn’t as user-friendly as it could be because of the handling of white space in the entry field. [The documentation](http://www.lesbonscomptes.com/recoll/usermanual/RCL.SEARCH.html) states that:

> White space in the entry should match white space in the file name, and is not treated specially

I think this is problematic and can hinder new and experienced users alike.

Say you have indexed a directory with the following files:

sparrow dog warthog.txt
sparrow dog hedgehog.txt
sparrow dog badger.txt
sparrow dog drake.txt
cat eft sparrow.txt
cat fawn sparrow.txt
cat gibbon sparrow.txt
cat heron sparrow.txt

Now you want to find sparrow dog drake.txt. Simple enough, just type drake into the search field. This works because, as the documentation says:

> An entry without any wild card character and not capitalized will be prepended and appended with '*' (i.e. etc -> *etc*, but Etc -> etc)
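A minimal Python sketch of the rule quoted above. It only illustrates the documented behaviour and is not Recoll's actual code; the function name `expand_entry` and the set of characters treated as wildcards (`*`, `?`, `[`) are assumptions.

```python
def expand_entry(entry: str) -> str:
    """Illustrative only: apply the implicit-wildcard rule from the manual."""
    # Assumption: these are the characters that count as wildcards.
    if any(ch in entry for ch in "*?["):
        return entry  # the user supplied wildcards, leave the entry alone
    if entry[:1].isupper():
        # Capitalized entry: no implicit wildcards, as in "Etc -> etc".
        return entry.lower()
    return f"*{entry}*"  # plain lowercase entry: "etc -> *etc*"


print(expand_entry("drake"))  # *drake*
print(expand_entry("Etc"))    # etc
```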

But let’s say you want to find all file names containing cat and sparrow.

cat sparrow won’t work as a search term because Recoll treats it as a single expression, space included, so it only matches names containing the literal string cat sparrow.

Currently the only way to find the document in question would be by using *cat*sparrow*. *sparrow*cat* wouldn’t work because of the order of the terms.
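The behaviour described so far can be reproduced with shell-style matching from Python's standard `fnmatch` module. This is only a stand-in for Recoll's own matcher, and the helper name `search` is made up for the example.

```python
from fnmatch import fnmatchcase

FILES = [
    "sparrow dog warthog.txt",
    "sparrow dog hedgehog.txt",
    "sparrow dog badger.txt",
    "sparrow dog drake.txt",
    "cat eft sparrow.txt",
    "cat fawn sparrow.txt",
    "cat gibbon sparrow.txt",
    "cat heron sparrow.txt",
]


def search(pattern):
    """Return the file names matching a shell-style wildcard pattern."""
    return [name for name in FILES if fnmatchcase(name, pattern)]


print(search("*drake*"))             # ['sparrow dog drake.txt']
print(search("*cat sparrow*"))       # [] (the space must match literally)
print(len(search("*cat*sparrow*")))  # 4 (all the "cat ... sparrow" files)
print(search("*sparrow*cat*"))       # [] (wrong order)
```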

I feel that this is inconsistent with the behaviour of the search field for single-word expressions. If a user can enter drake to find sparrow dog drake.txt they will expect to be able to find cat gibbon sparrow.txt with cat sparrow and sparrow cat.

Most file managers like Nautilus and Dolphin do perform order-insensitive wildcard lookups like this for each whitespace-separated word. Because I was expecting a similar behavior in Recoll I was pretty confused at the start. Eventually I had to consult the documentation to find the correct query mentioned above.

With this in mind, I would recommend modifying the query interpretation to do a wildcard expansion for each whitespace-separated word. Even more importantly, the search should be insensitive to the order of the terms. Having to iterate over all possible permutations with wild cards is not only frustrating (e.g. *cat*sparrow* OR *sparrow*cat*) but can quickly become next to impossible (e.g. *cat*sparrow*fawn* OR *cat*fawn*sparrow* OR *sparrow*cat*fawn* OR *sparrow*fawn*cat* OR *fawn*cat*sparrow* OR *fawn*sparrow*cat*).
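To make the difference concrete, here is a rough Python sketch, again using `fnmatch` as a stand-in for Recoll's matcher. `permutation_patterns` spells out the workaround, whose size grows as n! for n words, and `matches_all_words` shows the proposed per-word behaviour. Both function names are hypothetical.

```python
from fnmatch import fnmatchcase
from itertools import permutations


def permutation_patterns(words):
    """All ordered wildcard patterns needed today: n! of them for n words."""
    return ["*" + "*".join(p) + "*" for p in permutations(words)]


def matches_all_words(filename, entry):
    """Proposed behaviour: every whitespace-separated word must occur, in any order."""
    return all(fnmatchcase(filename, f"*{word}*") for word in entry.split())


print(permutation_patterns(["cat", "sparrow"]))
# ['*cat*sparrow*', '*sparrow*cat*']
print(len(permutation_patterns(["cat", "sparrow", "fawn"])))   # 6
print(matches_all_words("cat gibbon sparrow.txt", "sparrow cat"))  # True
```

The two approaches are not strictly equivalent: the per-word check also accepts words that overlap inside the file name, which the ordered patterns do not.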

To preserve whitespace-sensitive search, I would propose handling double-quoted expressions enclosed in * (e.g. *"sparrow cat"*) as whitespace-exact search terms.
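One way the proposed entry syntax could be interpreted, sketched in Python. It assumes that `shlex.split` from the standard library is an acceptable tokenizer (it keeps a double-quoted phrase together and drops the quotes) and that `*`, `?` and `[` are the wildcard characters. The function names are hypothetical and nothing here reflects how Recoll parses the field.

```python
import shlex
from fnmatch import fnmatchcase

WILDCARDS = "*?["  # assumption: characters that count as wildcards


def entry_to_patterns(entry):
    """Split on whitespace, keeping quoted phrases whole, then add wildcards."""
    patterns = []
    for token in shlex.split(entry):
        if any(ch in token for ch in WILDCARDS):
            patterns.append(token)         # explicit wildcards: use as typed
        else:
            patterns.append(f"*{token}*")  # bare word: implicit wildcards
    return patterns


def matches_entry(filename, entry):
    """A file matches when every pattern matches, in any order."""
    return all(fnmatchcase(filename, p) for p in entry_to_patterns(entry))


print(entry_to_patterns('*"sparrow cat"* fawn'))  # ['*sparrow cat*', '*fawn*']
print(matches_entry("sparrow dog drake.txt", '*"dog drake"* sparrow'))  # True
print(matches_entry("sparrow dog drake.txt", '*"drake dog"*'))          # False
```

A real implementation would need to deal with entries containing an unbalanced quote or apostrophe, which `shlex.split` rejects with a `ValueError`.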

I would be very happy to hear your thoughts on this proposal. SC.

medoc writes

Making this change would basically make the filename search a duplicate of a search on the filename field in the query language.

There are currently two ways to search for a filename in Recoll:

  • Either you use the specific filename mode, which mostly behaves like shell expansion or find

  • Or you use the query language and search within the "filename" field, which behaves more like text search. Up to 1.19, entering multiple terms in this mode would entail repeating the "filename" specification which was not convenient. As of 1.20, there is a new syntax for field searches, where fieldname:word1,word2 will mean to search for a match of word1 and word2 within the field, and fieldname:word1/word2 means to search for word1 or word2.

It seems to me that this mode satisfies what you are looking for by modifying the filename search.
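For illustration, the field syntax described above can be assembled with a small Python helper. The helper names are hypothetical; only the resulting query strings (`fieldname:word1,word2`, `fieldname:word1/word2`, and the older repeated-field form) come from the description above.

```python
def field_query(field, words, mode="and"):
    """Build a 1.20-style field query: ',' joins with AND, '/' joins with OR."""
    separator = {"and": ",", "or": "/"}[mode]
    return f"{field}:{separator.join(words)}"


def field_query_repeated(field, words):
    """Older form (up to 1.19): repeat the field specification for each word."""
    return " ".join(f"{field}:{word}" for word in words)


print(field_query("filename", ["cat", "sparrow"]))             # filename:cat,sparrow
print(field_query("filename", ["cat", "sparrow"], mode="or"))  # filename:cat/sparrow
print(field_query_repeated("filename", ["cat", "sparrow"]))    # filename:cat filename:sparrow
```

The resulting string is what you would type in the query language search mode; the order of the words does not matter.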

Also, the filename search mode would be quite difficult to modify along the lines you suggest, because it does not work like the rest of the text search: it performs wildcard matching on the unsplit file names.

jf

scasier writes

Thanks for getting back to me so quickly!

>  As of 1.20, there is a new syntax for field searches, where fieldname:word1,word2 will mean to search for a match of word1 and word2 within the field, and fieldname:word1/word2 means to search for word1 or word2.

Oh, wow. I didn’t know about that. That’s very convenient, indeed.

> It seems to me that this mode satisfies what you are looking for by modifying the filename search.

Yes, this pretty much covers my main problem which was to be able to do order-insensitive filename searches. I think we can close this issue.

I do have one last question, though: Is there any way to specify an alias for tags? I would love to be able to type something like fn:this,that instead of the longer version.

SC

medoc writes

There is an alias mechanism for fields. It’s not very convenient because there is no GUI: you will need to create a file named fields in the recoll configuration directory (e.g. ~/.recoll), and add an entry to a section named aliases, such as:

[aliases]
filename = fn

The canonical name is on the left, a list of aliases on the right. You can have a look at the default fields file in /usr/share/recoll/examples for more comments, and there is probably also something in the manual.

I should probably add filename = fn to the default file actually :)

There is just a caveat: the aliases mechanism is used both for indexing and querying, so that, with the previous entry, any field named fn found inside a document will be turned into filename and may interfere with the normal file name indexing (I’d have to check the code to be sure of what happens exactly). I ought to add a mechanism for pure query-time aliasing.
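Until such a mechanism exists, pure query-time aliasing can be approximated outside Recoll by rewriting the query string before it is submitted. The Python sketch below is a user-side workaround, not a Recoll feature; the `QUERY_ALIASES` table and the `expand_aliases` function are made up for the example.

```python
import re

# Hypothetical user-side alias table: short name -> canonical field name.
QUERY_ALIASES = {"fn": "filename"}


def expand_aliases(query, aliases=QUERY_ALIASES):
    """Replace 'alias:' field prefixes with the canonical field name."""
    def replace(match):
        name = match.group(1)
        return aliases.get(name, name) + ":"

    # A field prefix is a whole word directly followed by ':'.
    return re.sub(r"(?<![\w:])(\w+):", replace, query)


print(expand_aliases("fn:this,that"))     # filename:this,that
print(expand_aliases("author:fn fn:x"))   # author:fn filename:x
```

Because the rewrite happens before the query reaches Recoll, the index is untouched and a document field named fn is not affected. The sketch does not skip quoted phrases, so an alias followed by a colon inside a quoted string would be rewritten as well.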

scasier writes

> There is an alias mechanism for fields, it's not very convenient because there is no GUI, you will need to create a file named fields in the recoll configuration directory (e.g. ~/.recoll), and add an entry to a section named aliases

Thank you. That works perfectly. I’ll make sure to report any issues/conflicts I might run into.