<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  </head>
  <body>
    <p>Indexing can also fail at the commit stage, for reasons unrelated
      to any one record. A commit is when ArchivesSpace tells Solr to
      write the changes it holds in memory to storage. That happens at
      several points in the indexing process, but the longest commit
      comes at the end of the PUI indexer's run. If Solr takes longer to
      respond than the value of AppConfig[:indexer_solr_timeout_seconds],
      because you've got a lot of records or slow storage on your Solr
      server, the indexer will start all over again, and can end up in a
      loop. The workaround is to increase the timeout.</p>
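    <p>As a rough sketch of that workaround, the timeout could be raised
      in config/config.rb along these lines. The value shown is only an
      illustrative assumption, not a recommendation; a suitable figure
      depends on how long your Solr server actually takes to commit.</p>
    <pre># config/config.rb
# Illustrative value only (an assumption): give Solr up to 30 minutes to respond.
AppConfig[:indexer_solr_timeout_seconds] = 1800
</pre>
    <p>ArchivesSpace reads this file at startup, so it needs a restart
      to pick up the change.</p>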
    <p><br>
    </p>
    <p>You might not notice you've got enough records to cause this
      until you do a full re-index, or someone edits something linked to
      most or all records (e.g. a repository, or a very widely-used
      subject), triggering the re-indexing of most of the system's
      records.<br>
    </p>
    <p><br>
    </p>
    <p>Andrew.</p>
    <p><br>
    </p>
    <p><br>
    </p>
    <div class="moz-cite-prefix">On 10/05/2022 22:06, Blake Carver
      wrote:<br>
    </div>
    <blockquote type="cite" cite="mid:DM6PR22MB23091ACF2A0423C91EC5B18A9FC99@DM6PR22MB2309.namprd22.prod.outlook.com">
      
      <style type="text/css" style="display:none;">P {margin-top:0;margin-bottom:0;}</style>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
         1x1 would mean setting both records_per_thread and thread_count
        to 1. Having loglevel on debug and running at 1x1, you'll be
        able to see exactly which thing is being indexed as it happens,
        and when it crashes, you'll see what it was working through at
        the time.</div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
      </div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        PUI will always take longer, and a VERY long time at 1x1, but
        unless you're sure which indexer is crashing, I'd switch them
        both to 1x1.</div>
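      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        For illustration only, a 1x1 setup with debug logging might look
        roughly like this in config/config.rb. The PUI and log-level
        setting names are assumed from the stock config-defaults.rb, so
        check them against the file shipped with your version before
        relying on them.</div>
      <pre># config/config.rb
# Staff indexer at 1x1: one thread, one record at a time.
AppConfig[:indexer_records_per_thread] = 1
AppConfig[:indexer_thread_count] = 1

# PUI indexer at 1x1 (setting names assumed from config-defaults.rb).
AppConfig[:pui_indexer_records_per_thread] = 1
AppConfig[:pui_indexer_thread_count] = 1

# Debug logging (setting names assumed from config-defaults.rb).
AppConfig[:backend_log_level] = "debug"
AppConfig[:indexer_log_level] = "debug"
</pre>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        Once the problem record has been found, the previous values can
        be restored so that indexing runs at normal speed again.</div>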
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
      </div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        You can just `grep Indexed archivesspace.out` after it's running
        and watch those numbers. As long as they're going up, all is
        well.</div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        <br>
      </div>
      <div style="font-family: Calibri, Arial, Helvetica, sans-serif;
        font-size: 12pt; color: rgb(0, 0, 0);">
        It is also possible that it will finish without crashing when
        it is running this slowly. I've seen that happen with LARGE records. </div>
      <hr style="display:inline-block;width:98%" tabindex="-1">
      <div id="divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri, sans-serif" color="#000000"><b>From:</b>
          <a class="moz-txt-link-abbreviated" href="mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org">archivesspace_users_group-bounces@lyralists.lyrasis.org</a>
          <a class="moz-txt-link-rfc2396E" href="mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org">&lt;archivesspace_users_group-bounces@lyralists.lyrasis.org&gt;</a>
          on behalf of Tom Hanstra <a class="moz-txt-link-rfc2396E" href="mailto:hanstra@nd.edu">&lt;hanstra@nd.edu&gt;</a><br>
          <b>Sent:</b> Tuesday, May 10, 2022 4:15 PM<br>
          <b>To:</b> Archivesspace Users Group
          <a class="moz-txt-link-rfc2396E" href="mailto:archivesspace_users_group@lyralists.lyrasis.org">&lt;archivesspace_users_group@lyralists.lyrasis.org&gt;</a><br>
          <b>Subject:</b> Re: [Archivesspace_Users_Group] (re)indexing
          in 2.8.1</font>
        <div> </div>
      </div>
      <div>
        <div dir="ltr">Thanks, Blake.
          <div><br>
          </div>
          <div>Turns out we did add quite a few records recently, so
            maybe there was something in there that it did not like all
            that much. </div>
          <div><br>
          </div>
          <div>How can you tell which record it is choking on?  Is that
            your "1x1" suggestion?  Or does the DEBUG option make that
            more clear?  I have my indexing set to:<br>
            <br>
            AppConfig[:indexer_records_per_thread]      = 25<br>
            AppConfig[:indexer_thread_count]            = 2<br>
            <br>
            for both PUI and Staff records. I believe you are suggesting
            it would most easily be found using 1 and 1?  I can see
            where that could take a long time. But if it is going to
            choke over and over on the same record, then that may be the
            best way to address it. <br>
            <br>
            Do you think if I just did staff indexing without PUI, that
            it would be identified faster?  Or could it pass the staff
            side but then die on PUI later?</div>
          <div><br>
          </div>
          <div>I hope to try some of these ideas after hours today, so
            if you can confirm that I've got the right idea, that would
            help.</div>
          <div><br>
          </div>
          <div>Tom</div>
          <div><br>
          </div>
        </div>
        <br>
        <div class="x_gmail_quote">
          <div dir="ltr" class="x_gmail_attr">On Tue, May 10, 2022 at
            2:17 PM Blake Carver &lt;<a href="mailto:blake.carver@lyrasis.org" moz-do-not-send="true" class="moz-txt-link-freetext">blake.carver@lyrasis.org</a>&gt;
            wrote:<br>
          </div>
          <blockquote class="x_gmail_quote" style="margin:0px 0px 0px
            0.8ex; border-left:1px solid rgb(204,204,204);
            padding-left:1ex">
            <div dir="ltr">
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                > <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline">Is
                  this possible?</span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline"><br>
                </span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline">Short
                  answer, Yes, it's possible your indexer is starting
                  over.</span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline"><br>
                </span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline">Long
                  answer. This can be tricky to figure out. Something is
                  wrong, the indexer never wants to do that. Sometimes
                  "something" "bad" gets into ArchivesSpace and the
                  indexer will just crash and start over. The problem is
                  the "something" can be anything and the "bad" can be
                  hard to figure out. The more stuff you have in your
                  DB, the harder it is to figure out.</span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline"><br>
                </span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline">First,
                  I'd make sure this is happening. Your logs should make
                  it obvious. You might see some FATAL errors just
                  before it starts over.  You MIGHT be able to narrow it
                  down from that. That is, what group of records had
                  that error in the logs? Maybe that narrows it down
                  enough. You just got lucky!</span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline"><br>
                </span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline">I
                  don't think I've ever been so lucky. What I'd do next
                  is set your loglevel to DEBUG and restart. If you're
                  feeling lucky or just impatient or both, leave the
                  indexer speed as is. You'll get more details out of
                  the logs and you should be able to narrow it down
                  better. Ideally, you want to run the indexers at 1x1,
                  which means it could take forrreeevverrrrr to get back
                  around to the crash again. If you're lucky, it'll
                  crash on a record, you'll go look at that record, the
                  problem will be obvious, and there will be much
                  rejoicing. With it running 1x1 you should see exactly
                  what's causing the fail. If it's not crashing on the
                  same record every time.... ugh. That's an even longer
                  answer. </span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <span style="color:rgb(32,31,30); font-size:15px;
                  background-color:rgb(255,255,255); display:inline"><br>
                </span></div>
              <div style="font-family:Calibri,Arial,Helvetica,sans-serif;
                font-size:12pt; color:rgb(0,0,0)">
                <br>
              </div>
              <div>
                <div id="x_gmail-m_2727772702640317328Signature">
                  <div>
                    <div id="x_gmail-m_2727772702640317328divtagdefaultwrapper" dir="ltr" style="color:rgb(0,0,0);
                      background-color:rgb(255,255,255)">
                      <div name="x_divtagdefaultwrapper" style="font-family:Calibri,Arial,Helvetica,sans-serif;
                        font-size:12pt; margin:0px">
                        <font size="3" face="Calibri,Arial,Helvetica,sans-serif" color="black"><span dir="ltr" style="font-size:12pt;
                            background-color:white"><font size="2"><span style="font-size:11pt"><br>
                              </span></font></span></font></div>
                    </div>
                  </div>
                </div>
              </div>
              <hr style="display:inline-block; width:98%">
              <div id="x_gmail-m_2727772702640317328divRplyFwdMsg" dir="ltr"><font style="font-size:11pt" face="Calibri,
                  sans-serif" color="#000000"><b>From:</b>
                  <a href="mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">
archivesspace_users_group-bounces@lyralists.lyrasis.org</a> &lt;<a href="mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">archivesspace_users_group-bounces@lyralists.lyrasis.org</a>&gt;
                  on behalf of Tom Hanstra &lt;<a href="mailto:hanstra@nd.edu" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">hanstra@nd.edu</a>&gt;<br>
                  <b>Sent:</b> Tuesday, May 10, 2022 10:23 AM<br>
                  <b>To:</b> Archivesspace Users Group &lt;<a href="mailto:archivesspace_users_group@lyralists.lyrasis.org" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">archivesspace_users_group@lyralists.lyrasis.org</a>&gt;<br>
                  <b>Subject:</b> [Archivesspace_Users_Group]
                  (re)indexing in 2.8.1</font>
                <div> </div>
              </div>
              <div>
                <div dir="ltr">I don't look at the logs a lot unless
                  there are issues with ArchivesSpace, so maybe this is
                  something normal. But, after a restart due to some
                  complaints about database connectivity, it looks like
                  our ArchivesSpace instance has decided to do a full
                  reindex. The index log sure looks as if it is starting
                  from scratch and running through the indexing of both
                  PUI and Staff indexes.
                  <div><br>
                  </div>
                  <div>
                    <div>Is this possible?  Is it something that happens
                      periodically and I just did not notice it? Nothing
                      has changed in my data directory, so I don't see
                      any reason for indexing to occur. Yet that is what
                      the logs show.</div>
                    <div><br>
                    </div>
                    <div>If it is doing this for some reason, and
                      knowing that we restart periodically, it seems
                      like we will get into a loop where indexing just
                      keeps happening all the time. Also, it would be
                      helpful to understand what caused this to happen.</div>
                    <div><br>
                    </div>
                    <div>Any thoughts or experiences from those who have
                      run this for longer would be appreciated. I'd like
                      to understand if it would be a good idea to clear
                      the data directory and perform a full index over
                      the weekend rather than an unexpected and possibly
                      never ending round in the background.</div>
                    <div><br>
                    </div>
                    <div>Thanks,</div>
                    <div>Tom</div>
                    -- <br>
                    <div dir="ltr">
                      <div dir="ltr">
                        <div>
                          <div dir="ltr">
                            <div dir="ltr">
                              <div><b style="font-family:arial,helvetica,sans-serif;
                                  font-size:12.7273px;
                                  color:rgb(136,136,136)">Tom Hanstra</b><br>
                              </div>
                              <div style="color:rgb(136,136,136);
                                font-size:12.8px">
                                <div dir="ltr">
                                  <div dir="ltr">
                                    <div dir="ltr">
                                      <div dir="ltr">
                                        <div style="font-size:12.7273px">
                                          <div>
                                            <div><i style="font-size:12.7273px;
font-family:arial,helvetica,sans-serif">Sr. Systems Administrator</i></div>
                                            <div><a href="mailto:hanstra@nd.edu" target="_blank" style="color:rgb(17,85,204);
                                                font-size:12.7273px;
                                                font-family:arial,helvetica,sans-serif" moz-do-not-send="true" class="moz-txt-link-freetext">hanstra@nd.edu</a><br>
                                            </div>
                                          </div>
                                          <div><span style="font-family:arial,helvetica,sans-serif"><br>
                                            </span></div>
                                        </div>
                                        <div style="font-size:12.7273px"><img src="https://ci3.googleusercontent.com/mail-sig/AIorK4wQjvBdM9TFi5bR5RBsq_1dY3HTxh-Kg_4W690bwTCSKeVGyazMoj0wdmkNgJ0kfjeRnparhiw" moz-do-not-send="true"><br>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </div>
                            </div>
                          </div>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>
            _______________________________________________<br>
            Archivesspace_Users_Group mailing list<br>
            <a href="mailto:Archivesspace_Users_Group@lyralists.lyrasis.org" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">Archivesspace_Users_Group@lyralists.lyrasis.org</a><br>
            <a href="http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group" rel="noreferrer" target="_blank" moz-do-not-send="true" class="moz-txt-link-freetext">http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group</a><br>
          </blockquote>
        </div>
        <br clear="all">
        <div><br>
        </div>
      </div>
      <br>
    </blockquote>
  </body>
</html>