aperezmendez writes

Lately, upon restart recoll updates a lot of files that haven’t changed at all.

:3:rcldb/rcldb.cpp:609::Db::add: docid 45847 updated [/home/alex/Documentos/Copias de seguridad de Proyectos/ENABLE/ENABLE FTP/wp0/project_reviews/1st_project_review/OLDs/enable_costCOerBLhfNNKCNatdYy9hhQ]
:4:rcldb/rcldb.cpp:1827::Db::needUpdate:yes: olsig [444091399274329+] new [444091399274329] [Q/home/alex/Documentos/Copias de seguridad de Proyectos/ENABLE/ENABLE FTP/wp0/project_reviews/1st_project_review/OLDs/enable_costaihGFxIEJjZEJ6ukp1t6OA]

The old and new signatures look identical except for the "+" sign. I have deleted the entire DB and recreated it, and it is still happening.

Using recoll 1.23.1 in Arch Linux.

medoc writes

The + at the end of the sig indicates that there was an error during the last indexing attempt, so that only the file name and attributes were indexed. I think that you need to increase the log level and retry indexing the file with recollindex -e -i [filename] to see what is happening, and maybe prevent those files from being reindexed some other way.
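To make the comparison in the log concrete, here is a small Python sketch. It is not Recoll's code: the function name `failed_last_time` is made up for illustration, and only the two signature strings are taken from the log above.

```python
def failed_last_time(sig):
    """Return True if a stored signature ends with the '+' error marker.

    Hypothetical helper for illustration only; not part of Recoll.
    """
    return sig.endswith("+")


old_sig = "444091399274329+"  # 'olsig' value from the log
new_sig = "444091399274329"   # 'new' value from the log

marked = failed_last_time(old_sig)                 # stored sig carries the marker
same_as_is = old_sig == new_sig                    # plain string comparison
same_without_marker = old_sig.rstrip("+") == new_sig

print(marked, same_as_is, same_without_marker)
```

The strings differ only by the marker, which is why the two signatures look the same at first glance but do not compare equal.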

aperezmendez writes

  File "/usr/share/recoll/filters/rclpdf.py", line 386, in <module>
    rclexecm.main(proto, extract)
  File "/usr/share/recoll/filters/rclexecm.py", line 338, in main
    proto.mainloop(extract)
  File "/usr/share/recoll/filters/rclexecm.py", line 265, in mainloop
    self.processmessage(processor, params)
  File "/usr/share/recoll/filters/rclexecm.py", line 245, in processmessage
    self.answer(data, ipath, eof)
  File "/usr/share/recoll/filters/rclexecm.py", line 192, in answer
    for nm,value in self.fields.iteritems():
AttributeError: 'dict' object has no attribute 'iteritems'

aperezmendez writes

It seems the code is not Python 3 compatible? In that case, the header (shebang line) should use "python2" rather than "python".

EDIT: See comment below.

aperezmendez writes

# Python 2 and 3: option 3
from future.utils import iteritems
# or
from six import iteritems

for (key, value) in iteritems(heights):
    …

aperezmendez writes

The new code introducing the "bug?" is from mid 2016. That would explain why indexing has been silently complaining :). I hadn't noticed before because this computer (the one with a huge number of files) is always on, so it hardly ever reindexes files, and I always search by filename. So as long as it is up, it works. The laptop probably has the same issue, but it has an SSD with a significantly smaller number of files to index.

aperezmendez writes

BTW, I couldn’t make it work with the future thing. It says the future module does not exist, and neither does future.utils. You can use

#!python

for nm, value in self.fields.items():

instead. It is not such as efficient, but do you expect having a really large list here?

medoc writes

Actually python should point to python2.

Are you using Arch?

aperezmendez writes

Yes, I am :) I see the recommendation; sadly, Arch does not follow it.

medoc writes

I am going to change the shebang lines to explicitly use python2. See the comment at the end of https://bitbucket.org/medoc/recoll/issues/354/python3-error-when-indexing-pdf-in

medoc writes