[Archivesspace_Users_Group] Best Way to Reindex with PUI Live?
Andrew Morrison
andrew.morrison at bodleian.ox.ac.uk
Tue Apr 28 09:20:24 EDT 2020
Not knowing which version you are using, I cannot be absolutely sure,
but for versions released in recent years, deleting the /indexer_state/
and /indexer_pui_state/ subfolders inside the /data/ directory will not
cause downtime or missing records for PUI users (or for staff).
If you are re-indexing because you've made changes to config.rb, the
change will require an application restart to take effect. Delete
those state folders immediately after running the restart command, and
the indexer will begin refreshing records in batches once it is back up
and running. If the changes you've made affect how certain records are
indexed (e.g. inherited_fields for archival_objects) then there will be
some inconsistency until every record has been overwritten in Solr's
memory by the ArchivesSpace indexer. But it is unlikely any end user
will notice.
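
As a rough illustration of the "delete those state folders" step, here is a
minimal Ruby sketch. The helper name clear_indexer_state and the throwaway
directory it runs against are made up for the example; on a real install you
would pass the path to your own data directory. It only touches the two state
subfolders named above and leaves the Solr index in place.

```ruby
require "fileutils"
require "tmpdir"

# The two state subfolders named in this message.
STATE_DIRS = %w[indexer_state indexer_pui_state].freeze

# Hypothetical helper: removes the indexer state subfolders under data_dir
# and returns the paths that were actually deleted. Nothing else under
# data_dir (for example the Solr index) is touched.
def clear_indexer_state(data_dir)
  STATE_DIRS.map { |name| File.join(data_dir, name) }
            .select { |path| File.directory?(path) }
            .each { |path| FileUtils.rm_rf(path) }
end

# Demonstration against a temporary directory rather than a real install.
Dir.mktmpdir do |data_dir|
  FileUtils.mkdir_p(File.join(data_dir, "indexer_state"))
  FileUtils.mkdir_p(File.join(data_dir, "indexer_pui_state"))
  FileUtils.mkdir_p(File.join(data_dir, "solr_index", "index"))
  removed = clear_indexer_state(data_dir)
  puts "removed: #{removed.map { |p| File.basename(p) }.join(', ')}"
end
```

Run it right after issuing the restart, as described above, so the indexer
finds no saved state when it comes back up.
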
If you do decide to block user access during the re-index, you should
note that the indexer can go into a loop when doing a full re-index and
never finish, but only if you've got lots of complex records in a
single repository. That is because the last step in re-indexing each
repository is to send an instruction to Solr to commit all changes in
memory to disk. Depending on the speed of whatever storage layer your
system uses, that commit can take longer than 5 minutes, in which case
the indexer times out and starts again from scratch. We've set
AppConfig[:indexer_solr_timeout_seconds] to 1800 to give it half an
hour and avoid this.
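
For reference, the setting mentioned above would look roughly like this in
config.rb. The value 1800 is the one we use; whether you need that much
depends on your storage, so treat it as a starting point rather than a
recommendation. As with any config.rb change, it needs an application
restart to take effect.

```ruby
# In config.rb: allow the indexer to wait up to 30 minutes (1800 seconds)
# for Solr, so that a slow commit at the end of a repository does not
# make the indexer give up and start that re-index again.
AppConfig[:indexer_solr_timeout_seconds] = 1800
```
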
Andrew.
On 27/04/2020 21:00, Joshua D. Shaw wrote:
> Hey Blake-
>
> I usually empty the indexer states directories and the
> data/solr_index/index directory when I do a fresh index run, but this
> is the first time I've had to do a re-index while the PUI is live.
> Staff I can give a heads up and they typically don't work weekends
> anyway. But students & faculty are a different ballgame!
>
> Do you inform users of the PUI that it's down? Or do your stats
> indicate that the use on weekends is low enough not to warrant that
> step? I'm loath to completely take down an online resource -
> especially now when Dartmouth is in the middle of its spring quarter.
>
> I guess I'll try a couple of different approaches on our dev site and
> see which turns out to be best. If none of those work, postponing the
> update till early June is probably the best option for us (when
> classes and finals end).
>
> Thanks!
> Joshua
>
> ------------------------------------------------------------------------
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org
> <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
> Blake Carver <blake.carver at lyrasis.org>
> *Sent:* Monday, April 27, 2020 2:47 PM
> *To:* Archivesspace Users Group
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* Re: [Archivesspace_Users_Group] Best Way to Reindex with
> PUI Live?
> Theoretically another way to do it is to update system_mtime on
> everything as well.
>
> https://gist.github.com/Blake-/538c8d7cc7ade39efc372a3e3e190873
>
> Someplace in the official Solr docs they say the best way to do it is
> to wipe everything. I've found it best to empty /data/.
>
> We'll usually do the full reindexes on a Friday night; most sites will
> have finished up by Monday.
> ------------------------------------------------------------------------
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org
> <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of
> Joshua D. Shaw <Joshua.D.Shaw at dartmouth.edu>
> *Sent:* Monday, April 27, 2020 1:01 PM
> *To:* Archivesspace Users Group
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* [Archivesspace_Users_Group] Best Way to Reindex with PUI Live?
> Hi all-
>
> Just wondering what people have been doing when they need to do a
> total reindex and they have a live PUI? Our reindex takes about 4-6
> hours typically and I'm looking to avoid 4-6 hours of PUI downtime if
> at all possible.
>
> I'm planning to just wipe the indexer_state files and leave the index
> itself in place while the re-index occurs, but I'm wondering if there
> are better/alternate methods? Theoretically the PUI should still be
> functional while the reindex takes place if only the indexer_state
> files are wiped.
>
> Thanks!
> Joshua
>
> ___________________
> Joshua Shaw (he, him)
> Technology Coordinator
> Rauner Special Collections Library & Digital Library Technologies Group
> Dartmouth College
> 603.646.0405