[Archivesspace_Users_Group] Best Way to Reindex with PUI Live?

Andrew Morrison andrew.morrison at bodleian.ox.ac.uk
Tue Apr 28 09:20:24 EDT 2020


Not knowing which version you are using, I cannot be absolutely sure,
but for versions released in recent years, deleting the /indexer_state/
and /indexer_pui_state/ subfolders inside the /data/ directory will not
cause downtime or missing records for PUI users (or staff).


If you are re-indexing because you've made changes to config.rb, an
application restart is required to put the changes into effect. Delete
those state folders immediately after running the restart command, and
the indexer will begin refreshing records in batches once it is back up
and running. If the changes you've made affect how certain records are
indexed (e.g. inherited_fields for archival_objects), then there will be
some inconsistency until every record has been overwritten in Solr's
memory by the ArchivesSpace indexer, but it is unlikely any end user
will notice.
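
As a rough sketch of the deletion step above, something like the
following Ruby could be run right after issuing the restart command. The
helper name clear_indexer_state and the example path are hypothetical;
only the two folder names come from this message, and the data directory
location depends on your installation.

```ruby
require "fileutils"

# Hypothetical helper (not part of ArchivesSpace): removes the two indexer
# state folders so the indexer treats every record as needing a refresh.
# The Solr index itself is deliberately left in place, so searches keep
# working while records are re-indexed in batches.
def clear_indexer_state(data_dir)
  removed = []
  %w[indexer_state indexer_pui_state].each do |name|
    path = File.join(data_dir, name)
    next unless File.directory?(path) # skip folders that are already gone

    FileUtils.rm_rf(path)
    removed << path
  end
  removed # list of the folders that were actually deleted
end

# Example call; the path is an assumption, adjust it to your install:
# clear_indexer_state("/path/to/archivesspace/data")
```

The helper returns the folders it removed, so an empty result suggests
the path was wrong or the state had already been cleared.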


If you do decide to block user access during the re-index, you should
note that it is possible for the indexer to go into a loop when doing a
full re-index and never finish, but only if you've got lots of complex
records in a single repository. That is because the last step in
re-indexing each repository is to send an instruction to Solr to commit
all changes in memory to disk. Depending on the speed of whatever
storage layer your system uses, that can take longer than 5 minutes, in
which case the indexer will start again from scratch. To avoid this,
we've set AppConfig[:indexer_solr_timeout_seconds] to 1800, which gives
it half an hour.
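
For illustration, the setting described above is a single line in
config.rb. The value 1800 is simply the one quoted in this message, not a
recommendation for every site, and like any config.rb change it needs an
application restart to take effect.

```ruby
# In config.rb: allow Solr up to 1800 seconds (30 minutes) to respond,
# so a slow end-of-repository commit does not make the indexer start over.
AppConfig[:indexer_solr_timeout_seconds] = 1800
```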


Andrew.



On 27/04/2020 21:00, Joshua D. Shaw wrote:
> Hey Blake-
>
> I usually empty the indexer states directories and the 
> data/solr_index/index directory when I do a fresh index run, but this 
> is the first time I've had to do a re-index while the PUI is live. 
> Staff I can give a heads up and they typically don't work weekends 
> anyway. But students & faculty are a different ballgame!
>
> Do you inform users of the PUI that it's down? Or do your stats
> indicate that the use on weekends is low enough not to warrant that
> step? I'm loath to completely take down an online resource -
> especially now when Dartmouth is in the middle of its spring quarter.
>
> I guess I'll try a couple of different approaches on our dev site and 
> see which turns out to be best. If none of those work, postponing the 
> update till early June is probably the best option for us (when 
> classes and finals end).
>
> Thanks!
> Joshua
>
> ------------------------------------------------------------------------
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org 
> <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of 
> Blake Carver <blake.carver at lyrasis.org>
> *Sent:* Monday, April 27, 2020 2:47 PM
> *To:* Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* Re: [Archivesspace_Users_Group] Best Way to Reindex with 
> PUI Live?
> Theoretically another way to do it is to update system_mtime on 
> everything as well.
>
> https://gist.github.com/Blake-/538c8d7cc7ade39efc372a3e3e190873
>
> Someplace in the official Solr docs they say the best way to do it is
> to wipe everything. I've found it best to empty /data/.
>
> We'll usually do the full reindexes on a Friday night; most sites will
> have finished up by Monday.
> ------------------------------------------------------------------------
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org 
> <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of 
> Joshua D. Shaw <Joshua.D.Shaw at dartmouth.edu>
> *Sent:* Monday, April 27, 2020 1:01 PM
> *To:* Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* [Archivesspace_Users_Group] Best Way to Reindex with PUI Live?
> Hi all-
>
> Just wondering what people have been doing when they need to do a 
> total reindex and they have a live PUI? Our reindex takes about 4-6 
> hours typically and I'm looking to avoid 4-6 hours of PUI downtime if 
> at all possible.
>
> I'm planning to just wipe the indexer_state files and leave the index 
> itself in place while the re-index occurs, but I'm wondering if there 
> are better/alternate methods? Theoretically the PUI should still be 
> functional while the reindex takes place if only the indexer_state 
> files are wiped.
>
> Thanks!
> Joshua
>
> ___________________
> Joshua Shaw (he, him)
> Technology Coordinator
> Rauner Special Collections Library & Digital Library Technologies Group
> Dartmouth College
> 603.646.0405
>