837183 writes
I’m sorry, this is a question, not a defect, enhancement request, or task.
My ebook library is getting very big: it takes weeks to index (1 TB of PDFs), and once indexed, a simple search takes minutes.
This is understandable, because that’s a lot of data.
But I’m wondering if there’s a way to help Recoll be faster. Would buying better hardware (CPU?) make Recoll index and search faster?
medoc writes
Weeks and minutes seem a lot, even for 1 TB of source documents. I have a number of questions first:
- What Recoll version are you using?
- What is the size of the index itself (du $HOME/.recoll/)?
- What hardware are you running this on at the moment?
- By simple query, do you mean a single-term query? Or something a bit more complicated?
I have little experience with datasets of this size, but I know that there are others, and the Xapian people have quite a lot, so I’m reasonably certain that there are things to be done.
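For the index-size question above, here is a minimal sketch for anyone who would rather not use du. It assumes the index lives under the default $HOME/.recoll/ directory mentioned above; the helper names `dir_size_bytes` and `human` are made up for this illustration and are not part of Recoll. Note that it sums apparent file sizes, so the figure can differ somewhat from what du reports (du counts allocated disk blocks).

```python
import os
import sys


def dir_size_bytes(root):
    """Sum the apparent sizes of all regular files and links under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat: do not follow symlinks, count the link itself
                total += os.lstat(path).st_size
            except OSError:
                # File vanished or is unreadable (e.g. indexer is running)
                pass
    return total


def human(n):
    """Format a byte count using binary units."""
    size = float(n)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024:
            return f"{size:.1f} {unit}"
        size /= 1024
    return f"{size:.1f} TiB"


if __name__ == "__main__":
    # Assumption: default location; pass another path as the first argument
    root = sys.argv[1] if len(sys.argv) > 1 else os.path.expanduser("~/.recoll")
    print(root, human(dir_size_bytes(root)))
```

Run it with no arguments to measure the default directory, or pass the index directory as the first argument if it is stored elsewhere.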
medoc writes
Hi, I do have ideas about what to do to speed things up, but I need feedback. Please re-open if you want.