<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">

<head>

<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">

<meta name="Generator" content="Microsoft Word 15 (filtered medium)">

<style><!--

/* Font Definitions */

@font-face

        {font-family:"Cambria Math";

        panose-1:2 4 5 3 5 4 6 3 2 4;}

@font-face

        {font-family:Calibri;

        panose-1:2 15 5 2 2 2 4 3 2 4;}

/* Style Definitions */

p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0in;

        font-size:10.0pt;

        font-family:"Calibri",sans-serif;}

a:link, span.MsoHyperlink

        {mso-style-priority:99;

        color:blue;

        text-decoration:underline;}

span.EmailStyle19

        {mso-style-type:personal-reply;

        font-family:"Calibri",sans-serif;

        color:windowtext;}

.MsoChpDefault

        {mso-style-type:export-only;

        font-size:10.0pt;}

@page WordSection1

        {size:8.5in 11.0in;

        margin:1.0in 1.0in 1.0in 1.0in;}

div.WordSection1

        {page:WordSection1;}

--></style>

</head>

<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">

<div class="WordSection1">

<p class="MsoNormal"><span style="color:black">&lt;&lt; Why does one even have to run a periodic indexer? Aren't there guarantees in<br>

the system that updates are seen through to the index in realtime, do bulk<br>

updates not trigger a refresh of updated records selectively? Reading the code<br>

seems to suggest that updates are queued until processed, does AS need a more<br>

durable queue? &gt;&gt;</span><span style="font-size:11.0pt"><o:p></o:p></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>

<p class="MsoNormal"><span style="font-size:11.0pt">In theory, if your index is “up to date” (according to the indexer_state directory), the periodic indexer should have no work to do. I think this belongs to a class of problems that arise when, for some reason, the periodic indexer cannot get through its workload and therefore retries it again and again. That is what happens, for example, when a MySQL database contains a record with a bad timestamp due to DST. If someone could file a JIRA issue with as much information as possible for recreating the problem (and, ideally, the name of someone who could be contacted to supply a database copy), then it could probably be prioritized and addressed.<o:p></o:p></span></p>
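<p class="MsoNormal"><span style="font-size:11.0pt">As a rough illustration of why a workload that never completes keeps coming back, here is a simplified model of a periodic indexing pass. It is only a sketch under assumptions, not the actual ArchivesSpace indexer code: the names IndexerState, run_periodic_pass and send_to_index are hypothetical, and the state object merely stands in for what the indexer_state directory records.<o:p></o:p></span></p>

```python
from datetime import datetime, timezone

# Sentinel meaning "never indexed" (hypothetical; chosen for this sketch only).
NEVER = datetime.min.replace(tzinfo=timezone.utc)


class IndexerState:
    """Hypothetical stand-in for the indexer_state directory:
    remembers the time of the last successful pass per record type."""

    def __init__(self):
        self._last_indexed = {}

    def last_indexed(self, record_type):
        return self._last_indexed.get(record_type, NEVER)

    def mark_indexed(self, record_type, when):
        self._last_indexed[record_type] = when


def run_periodic_pass(records, state, record_type, now, send_to_index):
    """Send every record modified since the last successful pass.

    The state is advanced only after all pending records were sent.
    If send_to_index raises, the state stays untouched, so the next
    pass picks up the same workload again.
    """
    since = state.last_indexed(record_type)
    pending = [r for r in records if r["mtime"] > since]
    for record in pending:
        send_to_index(record)  # may raise, e.g. on a timeout or a bad record
    state.mark_indexed(record_type, now)
    return len(pending)
```

<p class="MsoNormal"><span style="font-size:11.0pt">In this model, a second pass over unchanged records finds nothing to do, whereas a pass that fails partway never advances the saved state and so repeats the whole workload on every run.<o:p></o:p></span></p>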

<p class="MsoNormal"><span style="font-size:11.0pt"><o:p> </o:p></span></p>

<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">

<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:

</span></b><span style="font-size:12.0pt;color:black">archivesspace_users_group-bounces@lyralists.lyrasis.org &lt;archivesspace_users_group-bounces@lyralists.lyrasis.org&gt; on behalf of Peter Heiner &lt;ph448@cam.ac.uk&gt;<br>

<b>Date: </b>Saturday, January 28, 2023 at 3:51 AM<br>

<b>To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>

<b>Subject: </b>Re: [Archivesspace_Users_Group] External Solr - Memory Allocation?<o:p></o:p></span></p>

</div>

<div>

<p class="MsoNormal"><span style="font-size:11.0pt">Joshua D. Shaw wrote on 2023-01-26 23:02:10:<br>

> Thanks, Blake!<br>

> <br>

> I'm running default config values for the AS log levels so they are all set to 'debug'. I took a closer look, and the timeout message happens exactly after the timeout amount I set (as you'd expect). Interestingly, Solr is in the middle of deleting documents

 when it goes silent<br>

> <br>

> I, [2023-01-26T09:18:40.357101 #78764]  INFO -- : Thread-3384: Deleted 100 documents: #&lt;Net::HTTPOK:0x72b3d9e&gt;<br>

> <br>

> .... 40 minutes pass with all the other AS log chatter ...<br>

> <br>

> E, [2023-01-26T09:58:40.400971 #78764] ERROR -- : Thread-3384: SolrIndexerError when deleting records: Timeout error with  POST {....}<br>

> I, [2023-01-26T09:58:40.410522 #78764]  INFO -- : Thread-3384: Deleted 100 documents: #&lt;Net::HTTPOK:0x4ab44e31&gt;<br>

> <br>

> This continuing delete phase goes on for a bit until it stops logging batch deletes.<br>

> <br>

> I, [2023-01-26T09:59:11.734200 #78764]  INFO -- : Thread-3384: Deleted 9 documents: #&lt;Net::HTTPOK:0x1be6c3e9&gt;<br>

> <br>

> .... 40 minutes pass with all the other AS log chatter ... And then the commit error pops up<br>

> <br>

> E, [2023-01-26T10:39:11.746166 #78764] ERROR -- : Thread-3384: SolrIndexerError when committing:<br>

> Timeout error with  POST {"commit":{"softCommit":false}}.<br>

> <br>

> Then after some more time<br>

> <br>

> I, [2023-01-26T11:06:35.678926 #78764]  INFO -- : Thread-3384: Deleted 186992 documents: #&lt;Net::HTTPOK:0x7e298af9&gt;<br>

> <br>

> .... This all seems to indicate to me that the commit phase is taking an inordinate amount of time (almost 2 hours - maybe that's what I need to set the timeout to?). After that, the indexer starts the 2nd repo<br>

<br>

We're experiencing the exact same issue at least in the largest of our 30-odd<br>

repositories:<br>

<br>

I, [2023-01-28T07:24:48.015356 #2036632]  INFO -- : Thread-2006: Staff Indexer [2023-01-28 07:24:48 +0000] ~~~ Indexed 536300 of 587664 archival_object records in repository CUL<br>

E, [2023-01-28T07:25:47.217953 #2036632] ERROR -- : Thread-2016: SolrIndexerError when committing:

<br>

Timeout error with  POST {"commit":{"softCommit":false}}.<br>

<br>

We've had this problem from the start but were unable to dig deeper because<br>

our in-house monitoring wasn't granular or even capable enough. Our crude<br>

solution was using an external Solr since 2.5 and an external indexer since<br>

around 2.7, and periodic restarts out of hours, but we've started getting<br>

problems despite that. Our Solr is allocated 6GB of memory and timeout is set<br>

to 1200 seconds, but the problem is that searches in AS fail during the wait<br>

and that makes AS unusable, so we're reluctant to increase that.<br>

<br>

Why does one even have to run a periodic indexer? Aren't there guarantees in<br>

the system that updates are seen through to the index in realtime, do bulk<br>

updates not trigger a refresh of updated records selectively? Reading the code<br>

seems to suggest that updates are queued until processed, does AS need a more<br>

durable queue?<br>

<br>

Thanks,<br>

p<br>

<o:p></o:p></span></p>

</div>

</div>

</body>

</html>