<div dir="ltr"><div dir="ltr">Thanks, Andrew. Some responses intertwined below, italicized:</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Mar 11, 2021 at 7:31 AM Andrew Morrison <<a href="mailto:andrew.morrison@bodleian.ox.ac.uk" target="_blank">andrew.morrison@bodleian.ox.ac.uk</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>You can allocate more memory to ArchivesSpace by setting the
ASPACE_JAVA_XMX environment variable it runs under. Setting that
to "-Xmx4g" should be sufficient.</p></div></blockquote><div><i>I did bump that and the ASPACE_JAVA_XSS up a bit for this round, which looks like it will finally complete. Just a few more PUI records need to be added. </i></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div>
<p>Those FATAL lines in the log snippet are caused by a bot probing
for known vulnerabilities in common web platforms and
applications, hoping to find a web site running an out-of-date
copy (of Drupal in this case) which it can exploit. It has nothing
to do with ArchivesSpace, which has no PHP code. It is merely
logging that it doesn't know what to do with that request.<br></p></div></blockquote><div><i>Thanks. I was hoping this was just extraneous. </i></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
</p>
<p>How did you know your one successful re-indexing completed? There
are two indexers, Staff and PUI, with the latter usually taking
much longer to finish. So if the PUI indexer fails after the staff
indexer finishes, you will see more records in the staff interface
than the public interface, even if they're all set to be public.
Also both indexers log messages that could be interpreted as
meaning they've finished, but they then run additional indexing to
build trees, so that navigation within collections works. And
finally they instruct Solr to commit changes, which can be slow
depending on the performance of your storage. You could try
doubling AppConfig[:indexer_solr_timeout_seconds] to allow more
time for each operation.</p></div></blockquote><div><i>At least one set of logs, from when I earlier gave the server more resources, showed that the indexing had completed. But, because the second repository was showing nothing, I decided to index again. <br><br>This time around, the second repository is found, so that, too, indicates that things have gone better. I guess I have to wait for things to complete, but there are still some questions outstanding. For instance, one search I did for "football" (something dear to the Notre Dame experience) within the repository which is supposed to be pretty much indexed showed over 32K results on our hosted site but only 17K locally. That seems wildly off with only a few PUI records left to be completed (log shows 736500 of 763368). Could the incomplete index really be that far off?</i></div><div><i><br></i></div><div><i>I also notice that indexing overall slows down as it gets farther into our records. Is that because there is just more to be done with the later records, which might not have been reached in earlier attempts, while the first records buzz by rapidly because of earlier indexing attempts? Or could it be that resources are taken up early in the processing and are no longer available for processing the later records? Is resource tuning just a trial-and-error prospect? I don't see a lot of information in the documentation.</i><br><div><p>Or it could've re-indexed one repository but failed on the next.
And it is possible for entire repositories to be set as
non-public, which could be another explanation for fewer records.</p>
<p>Are you running an external Solr? If so, is the
AppConfig[:solr_url] in config.rb pointing to the correct server?<br></p></div></div><div><i>I'm running a local Solr as part of the application. Is an external Solr a good idea for a site like ours? I will also do some tweaking with the Solr settings to see if that might help...after I get through at least one complete index. </i></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>
</p>
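<p>For reference, the two config.rb settings mentioned in this thread could be sketched as below. The values are illustrative assumptions, not taken from either site: 600 assumes the timeout currently sits at 300 seconds, and the URL assumes a Solr instance listening on localhost port 8090. Check config-defaults.rb in your release for the actual defaults before copying anything.</p>
<pre>
# config/config.rb (sketch; the values are assumptions, see the note above)

# Allow more time for each Solr operation, including the final commit.
# 600 is double an assumed current value of 300 seconds.
AppConfig[:indexer_solr_timeout_seconds] = 600

# Where the indexers and the search look for Solr. The URL shown is an
# assumed local address; with an external Solr, replace it with the
# address of that server.
AppConfig[:solr_url] = "http://localhost:8090"
</pre>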
<p>There are many possible reasons for search slowness, including
not enough memory. Are there any differences in the speed of doing
the same search in the staff and public interfaces? Or between two
ways of getting the same results in the PUI? For example, does the
link in the header to list all collections
(/repositories/resources) return results faster than searching
everything then filtering to just collections
(/search?q[]=*&op[]=&field[]=keyword&filter_fields[]=primary_type&filter_values[]=resource)?
There's a fix coming in 3.0.0 for the latter.</p></div></blockquote><div><i>I had not tried comparing staff to public. I will do that (though I first have to get some access to the staff side!). And I'll really not try to do much comparison until we get indexing complete, in case the indexing itself is slowing things down. </i></div><div><i><br></i></div><div><i>More questions to come, I'm sure. But thanks for your input and ideas of places to look further. Much appreciated.</i></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><p>Andrew.</p>
<p><br>
</p>
<div>On 11/03/2021 02:07, Tom Hanstra wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">I'm very new to ArchivesSpace and so my issues may
be early configuration problems. But I'm hoping some out there
can assist. We are moving from hosted to local, so I have a
large database full of data that I'm working with.
<div><br>
</div>
<div>Indexing<br>
<div>Right now, I'm running into two primary problems:</div>
<div><br>
</div>
<div>- Twice now, I've hit issues where the indexing fails due
to the Java heap space being exhausted. Do others run into
this? What do others use for Java settings?</div>
<div>- I've broken out my PUI indexing log into a separate log
and see FATAL errors in the log:<br>
------</div>
<div>I, [2021-03-10T15:32:03.747156 #2919] INFO -- : [1b34df32-d3b7-49c3-b205-01a59daf03e5] Started GET "/system_api.php" for 206.189.134.38 at 2021-03-10 15:32:03 -0500<br>
F, [2021-03-10T15:32:03.881297 #2919] FATAL -- : [1b34df32-d3b7-49c3-b205-01a59daf03e5]<br>
F, [2021-03-10T15:32:03.881658 #2919] FATAL -- : [1b34df32-d3b7-49c3-b205-01a59daf03e5] ActionController::RoutingError (No route matches [GET] "/system_api.php"):<br>
F, [2021-03-10T15:32:03.881866 #2919] FATAL -- : [1b34df32-d3b7-49c3-b205-01a59daf03e5]<br>
F, [2021-03-10T15:32:03.882085 #2919] FATAL -- : [1b34df32-d3b7-49c3-b205-01a59daf03e5] actionpack (5.2.4.4) lib/action_dispatch/middleware/debug_exceptions.rb:65:in `call'<br>
[1b34df32-d3b7-49c3-b205-01a59daf03e5] actionpack (5.2.4.4) lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'</div>
<div>------</div>
<div>Is this something to be concerned about? Why is it
showing up in the PUI log?</div>
<div><br>
</div>
<div>Search issues</div>
<div>- Supposedly, I did get one round of indexing completed
without a heap error. But the resulting searches yielded
numbers which were incorrect compared to our hosted version.
This is why I've been trying reindexing. Is it usual to have
indexing *look* like it is complete but really be
incomplete?</div>
<div>- When I do a search, the response is really slow. I've
got nginx set up as a proxy in front of ArchivesSpace and it
is showing that the slowness is in ArchivesSpace itself
somewhere. I don't see anything in the logs to show what is
taking so long. Where should I be checking for issues?</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Tom</div>
</div>
</div>
<br>
<fieldset></fieldset>
<pre>_______________________________________________
Archivesspace_Users_Group mailing list
<a href="mailto:Archivesspace_Users_Group@lyralists.lyrasis.org" target="_blank">Archivesspace_Users_Group@lyralists.lyrasis.org</a>
<a href="http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group" target="_blank">http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group</a>
</pre>
</blockquote>
</div>
</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div><b style="font-family:arial,helvetica,sans-serif;font-size:12.7273px;color:rgb(136,136,136)">Tom Hanstra</b><br></div><div style="color:rgb(136,136,136);font-size:12.8px"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div style="font-size:12.7273px"><div><div><i style="font-size:12.7273px;font-family:arial,helvetica,sans-serif">Sr. Systems Administrator</i></div><div><a href="mailto:hanstra@nd.edu" style="color:rgb(17,85,204);font-size:12.7273px;font-family:arial,helvetica,sans-serif" target="_blank">hanstra@nd.edu</a><br></div></div><div><span style="font-family:arial,helvetica,sans-serif"><br></span></div></div><div style="font-size:12.7273px"><img src="https://docs.google.com/uc?export=download&id=1GFX1KaaMTtQ2Kg2u8bMXt1YwBp96bvf0&revid=0B7APN9POn6xAQ244WWFYMFU3aVJwZ0lxbmVHK3FxNXlCd0RRPQ"><br></div></div></div></div></div></div></div></div></div></div></div></div>