[Archivesspace_Users_Group] Indexing and search issues
Andrew Morrison
andrew.morrison at bodleian.ox.ac.uk
Thu Mar 11 07:31:06 EST 2021
You can allocate more memory to ArchivesSpace by setting the
ASPACE_JAVA_XMX environment variable in the environment it runs under.
Setting that to "-Xmx4g" (a 4 GB maximum heap) should be sufficient.
Those FATAL lines in the log snippet are caused by a bot probing for
known vulnerabilities in common web platforms and applications, hoping
to find a web site running an out-of-date copy (of Drupal in this case)
which it can exploit. It has nothing to do with ArchivesSpace, which has
no PHP code. ArchivesSpace is merely logging that it doesn't know what
to do with that request.
How did you know your one successful re-indexing completed? There are
two indexers, Staff and PUI, with the latter usually taking much longer
to finish. So if the PUI indexer fails after the staff indexer finishes,
you will see more records in the staff interface than the public
interface, even if they're all set to be public. Also, both indexers log
messages that could be read as meaning they've finished, but they then
run additional indexing to build the trees that make navigation within
collections work. Finally, they instruct Solr to commit changes,
which can be slow depending on the performance of your storage. You
could try doubling AppConfig[:indexer_solr_timeout_seconds] to allow
more time for each operation.
Or it could've re-indexed one repository but failed on the next. And it
is possible for entire repositories to be set as non-public, which could
be another explanation for fewer records.
Are you running an external Solr? If so, is the AppConfig[:solr_url] in
config.rb pointing to the correct server?
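
As a rough sketch, the two settings mentioned above would look something like this in config.rb. The values are assumptions for illustration: I believe the shipped timeout default is 300 seconds (check the commented defaults in your own config.rb), and the Solr host and path are hypothetical placeholders, not a real server.

```ruby
# config/config.rb (fragment; AppConfig is provided by ArchivesSpace at startup)

# Assumed default is 300 seconds; doubling it allows more time per Solr operation.
AppConfig[:indexer_solr_timeout_seconds] = 600

# Hypothetical external Solr address; replace with your real server and path.
AppConfig[:solr_url] = "http://solr.example.org:8983/solr/archivesspace"
```

ArchivesSpace needs a restart to pick up changes to config.rb.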
There are many possible reasons for search slowness, including not
enough memory. Are there any differences in the speed of doing the same
search in the staff and public interfaces? Or between two ways of
getting the same results in the PUI? For example, does the link in the
header to list all collections (/repositories/resources) return results
faster than searching everything and then filtering to just collections
(/search?q[]=*&op[]=&field[]=keyword&filter_fields[]=primary_type&filter_values[]=resource)?
There's a fix coming in 3.0.0 for the latter.
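
If it helps, here is a minimal sketch for timing those two PUI routes from the command line. The base URL is an assumption (8081 is the usual default public port), and the helper names build_uri and time_get are made up for this example, not part of ArchivesSpace.

```ruby
require "net/http"
require "uri"

# The two PUI routes from the message above that should list the same collections.
PATHS = {
  "browse" => "/repositories/resources",
  "search+filter" => "/search?q[]=*&op[]=&field[]=keyword" \
                     "&filter_fields[]=primary_type&filter_values[]=resource"
}.freeze

# Percent-encode the square brackets so the query string is a strictly valid URI.
def build_uri(base, path)
  URI(base + path.gsub("[", "%5B").gsub("]", "%5D"))
end

# Time a single GET with a monotonic clock. The fetcher is injectable so the
# timing logic can be exercised without a running server.
def time_get(uri, fetcher: ->(u) { Net::HTTP.get_response(u) })
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  response = fetcher.call(uri)
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [response.code, elapsed]
end

# Only touches the network when a base URL is passed on the command line,
# e.g. ruby time_pui.rb http://localhost:8081 (assumed PUI address).
base = ARGV[0]
if base
  PATHS.each do |label, path|
    code, seconds = time_get(build_uri(base, path))
    puts format("%-14s HTTP %s  %.2fs", label, code, seconds)
  end
end
```

Run it a few times, since the first request after a restart is often slower than later ones. A large gap between the two routes would point at the search-then-filter path rather than at memory.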
Andrew.
On 11/03/2021 02:07, Tom Hanstra wrote:
> I'm very new to ArchivesSpace and so my issues may be early
> configuration problems. But I'm hoping some out there can assist. We
> are moving from hosted to local, so I have a large database full of
> data that I'm working with.
>
> Indexing
> Right now, I'm running into two primary problems:
>
> - Twice now, I've hit issues where the indexing fails due to the Java
> heap space being exhausted. Do others run into this? What do others
> use for Java settings?
> - I've broken out my PUI indexing log into a separate log and see
> FATAL errors in the log:
> ------
> I, [2021-03-10T15:32:03.747156 #2919] INFO -- :
> [1b34df32-d3b7-49c3-b205-01a59daf03e5] Started GET "/system_api.php"
> for 206.189.134.38 at 2021-03-10 15:32:03 -0500
> F, [2021-03-10T15:32:03.881297 #2919] FATAL -- :
> [1b34df32-d3b7-49c3-b205-01a59daf03e5]
> F, [2021-03-10T15:32:03.881658 #2919] FATAL -- :
> [1b34df32-d3b7-49c3-b205-01a59daf03e5] ActionController::RoutingError
> (No route matches [GET] "/system_api.php"):
> F, [2021-03-10T15:32:03.881866 #2919] FATAL -- :
> [1b34df32-d3b7-49c3-b205-01a59daf03e5]
> F, [2021-03-10T15:32:03.882085 #2919] FATAL -- :
> [1b34df32-d3b7-49c3-b205-01a59daf03e5] actionpack (5.2.4.4)
> lib/action_dispatch/middleware/debug_exceptions.rb:65:in `call'
> [1b34df32-d3b7-49c3-b205-01a59daf03e5] actionpack (5.2.4.4)
> lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
> ------
> Is this something to be concerned about? Why is it showing up in the
> PUI log?
>
> Search issues
> - Supposedly, I did get one round of indexing completed without a heap
> error. But the resulting searches yielded numbers which were incorrect
> compared to our hosted version. This is why I've been trying
> reindexing. Is it usual to have indexing *look* like it is complete
> but really be incomplete?
> - When I do a search, the response is really slow. I've got nginx set
> up as a proxy in front of ArchivesSpace and it is showing that the
> slowness is in ArchivesSpace itself somewhere. I don't see anything in
> the logs to show what is taking so long. Where should I be checking
> for issues?
>
> Thanks,
> Tom
>
> --
> *Tom Hanstra*
> /Sr. Systems Administrator/
> hanstra at nd.edu <mailto:hanstra at nd.edu>