rdzidlic writes
I recently deleted a large tree and noticed that flushing sometimes takes a long time. In some cases a single flush seems to take minutes; in others the line scrolls by without stopping. What makes the difference?
:4:../rcldb/rcldb.cpp:1703:Db::add/delete: txt size >= 10 Mb, flushing
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273619
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273620
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273621
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273622
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273623
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273624
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273625
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273626
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273627
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273628
:4:../rcldb/rcldb.cpp:1703:Db::add/delete: txt size >= 10 Mb, flushing
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273629
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273631
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273632
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273633
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273634
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273635
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273636
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273637
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273638
:4:../rcldb/rcldb.cpp:1933:Db::purge: deleted document #1273639
:4:../rcldb/rcldb.cpp:1703:Db::add/delete: txt size >= 10 Mb, flushing
medoc writes
I am not sure what Xapian is doing behind the scenes; this is not something I control. The varying times could have any number of causes. Minutes seems excessive though, unless your disk is very busy.
You could try increasing idxflushmb to 20 instead of the default 10 and see if it decreases the overall indexing time.
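For reference, a minimal sketch of that change. It assumes the index uses the default configuration directory, so the file to edit would be ~/.recoll/recoll.conf (adjust the path if you run with a different configuration directory). The value is in megabytes of indexed text between flushes:

```
# ~/.recoll/recoll.conf (assumed default location)
# Flush to the Xapian index after this many megabytes of text.
# Raised from 10 to 20 as an experiment; larger values use more memory.
idxflushmb = 20
```

The new value should apply to the next indexing run.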
When removing a big tree, I think that it’s sometimes faster to run a full indexing, but this depends on the rest of the data of course.
In any case, there isn’t anything I can do about this: the data has to be flushed from time to time (to keep memory usage under control, if I remember correctly).