[Archivesspace_Users_Group] Spreadsheet imports locking up database
ph448 at cam.ac.uk
Sun Jun 6 04:02:03 EDT 2021
James Bullen wrote on 2021-06-06 11:35:03:
> Bulk update operations usually run in database transactions. This is great because it guarantees that the database will be in a consistent state when the job completes - if it succeeds then all of the updates are applied, if it fails then none of the updates are applied.
> The price for this guarantee is that any affected records are locked until the job completes.
Thanks for your response. I generally agree with your points but have mine to make.
One is that the consistency unit here need not be the entire job, it is sufficient to isolate a single line, and I would normally expect the database to cope fine with such a workload.
Secondly, if I read my database output correctly, in this case the problem is not a wholly isolated bulk update but unnecessary table scans occurring for each line that make the database hold a lock for several seconds instead of milliseconds. I've yet to understand the purpose of the subnote_metadata table and why the delete is taking place, but we did notice that the query tried IDs that were not in the database.
In any case, it should be possible to unroll the subquery or provide optimiser hints for the bottleneck to go away.
> We recently implemented the ability to submit import jobs to be run after hours for a client to work around this.
We were thinking of a similar workaround, do you have anything you could share with us?
More information about the Archivesspace_Users_Group