[Archivesspace_Users_Group] PUI indexing issues

Tom Hanstra hanstra at nd.edu
Tue Mar 16 12:51:30 EDT 2021


Hello again.

I'm still trying to understand some indexing issues. I've now left my PUI
indexer thread count and records-per-thread settings at their defaults
(which I believe are 1 thread and 25 records/thread), and I have given 4GB
to the Java processes. I've tried other values as well, with similar results.

No matter what values I use, I cannot seem to fully index the PUI. Each
run starts well but steadily slows down. I've kept a spreadsheet of the
number of records/hr I'm indexing, and several attempts have started in
the 50-60K/hr range and then slowed to 1,500-1,800/hr before finally dying
with a Java heap error. I think this round is headed the same way.
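
To put those rates in perspective, here is a rough back-of-the-envelope
sketch. It assumes the 760K record total mentioned later in this message and
treats each observed rate as if it held for the whole run, which is a
simplification; the function and variable names are illustrative only.

```python
# Rough estimate of full-index time at each observed throughput.
# Assumption: 760,000 records total, and a constant rate for the whole run.
TOTAL_RECORDS = 760_000


def hours_to_index(total_records: int, records_per_hour: float) -> float:
    """Hours needed to index total_records at a constant rate."""
    return total_records / records_per_hour


# Rates taken from the figures quoted in this message (records per hour).
rates = {
    "initial, low": 50_000,
    "initial, high": 60_000,
    "degraded, high": 1_800,
    "degraded, low": 1_500,
}

estimates = {
    label: hours_to_index(TOTAL_RECORDS, rate) for label, rate in rates.items()
}

for label, hours in estimates.items():
    print(f"{label:>15}: {hours:7.1f} h ({hours / 24:5.1f} days)")
```

At the initial rates a full run would fit within a day, while at the degraded
rates it would stretch to weeks, so most of the wall-clock time is being spent
after the slowdown sets in.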

Why might this be happening?  Could my data have been corrupted during the
transfer from Lyrasis? (I'm working with a database export of our
production data.) Is the database too far away? (Our database is in AWS
RDS and is accessed from our AWS EC2 instance.)

I do have one log which gave this error:

E, [2021-03-12T18:14:53.886243 #2919] ERROR -- : Thread-9472: Failed
fetching archival_object id=1484623: too many connection resets (due to
Net::ReadTimeout - Net::ReadTimeout) after 0 requests on 3150, last used
1615590893.870297 seconds ago

prior to the Java heap error. In that log, there were a number of
connections for the staff indexer after the PUI indexer stopped reporting,
then an 88-minute gap before the above connection error, and then finally
a Java heap error in the archivesspace.out log.
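
One detail in that log line may be worth checking: the "last used ... seconds
ago" figure is suspiciously large. The sketch below interprets it as a Unix
epoch timestamp, which is an assumption rather than a confirmed reading, and
it also assumes the log timestamp is in US Eastern Standard Time (UTC-5).

```python
from datetime import datetime, timedelta, timezone

# Value copied from the "last used ... seconds ago" part of the log line.
last_used_seconds_ago = 1615590893.870297

# Assumption: treat the number as seconds since the Unix epoch.
as_utc = datetime.fromtimestamp(last_used_seconds_ago, tz=timezone.utc)

# Assumption: the log timestamp is US Eastern Standard Time (UTC-5).
est = timezone(timedelta(hours=-5))
as_est = as_utc.astimezone(est)

print("UTC:", as_utc.strftime("%Y-%m-%d %H:%M:%S"))
print("EST:", as_est.strftime("%Y-%m-%d %H:%M:%S"))
```

Under those assumptions the value decodes to 2021-03-12 18:14:53 EST, the same
second as the log entry itself. That would be consistent with the HTTP client
having no recorded last use for that connection (a last-use time of zero),
rather than with a connection that had genuinely sat idle for that long.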

Does the indexer reauthenticate each time it connects to get more
information?  The earlier question about authentication has me wondering if
my database server might be balking at the number of reconnections or
something. I'm trying to index 760K records.

Bottom line is that I'm still not getting my PUI index creation to
complete. Each run can take several days before it finally fails and I have
to start all over again.  I'm looking for any help to track down why this
slowdown is occurring and what I can do to address it.

Thanks,
Tom
-- 
*Tom Hanstra*
*Sr. Systems Administrator*
hanstra at nd.edu