[Archivesspace_Users_Group] strategies for piecemeal re-index

Callahan, Maureen maureen.callahan at yale.edu
Thu Jul 23 15:11:57 EDT 2015


Hey, thanks so much to everyone for their really great suggestions. We're going to try a re-index by pointing our dev instance at prod, as Brian suggests. We'll let you know how it goes.

Maureen

Maureen Callahan
Archivist, Metadata Specialist
Manuscripts & Archives
Yale University Library
maureen.callahan at yale.edu
203.432.3627

Webpage: web.library.yale.edu/mssa
Collections: drs.library.yale.edu

-----Original Message-----
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Chris Fitzpatrick
Sent: Thursday, July 23, 2015 4:20 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] strategies for piecemeal re-index





Yeah, I think what Brian suggests is not so elegant, but is probably your best bet. Start up another instance, point it to your db and let the indexer run. When it's done, stop both instances, and copy over the index.
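
The copy step described above could be sketched roughly as follows. This is only an illustration, not an ArchivesSpace tool: the function name `replace_dir` and all paths are hypothetical, and it assumes both instances are already stopped and that the index lives in a plain directory on disk.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def replace_dir(rebuilt: Path, production: Path) -> Optional[Path]:
    """Replace `production` with a copy of `rebuilt`.

    The old production directory is renamed rather than deleted, so the
    swap can be rolled back. Returns the backup path, or None if there
    was nothing to back up. Hypothetical helper; stop both instances first.
    """
    if not rebuilt.is_dir():
        raise FileNotFoundError(f"no rebuilt directory at {rebuilt}")
    backup = None
    if production.exists():
        backup = production.with_name(f"{production.name}.bak-{int(time.time())}")
        production.rename(backup)
    shutil.copytree(rebuilt, production)
    return backup
```

If the installation keeps its indexer state in a separate directory next to the index (an assumption worth checking for your version), that directory probably needs the same treatment, for example `replace_dir(dev_data / "indexer_state", prod_data / "indexer_state")`. Otherwise the production indexer may not know the copied index is already current.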



You could also set up an external Solr instance and reindex there, then just swap out the cores or change your production instance's Solr URL to point to the new core.
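
For the core-swap variant, Solr's CoreAdmin API has a SWAP action that exchanges the names of two cores. A minimal sketch of calling it is below; the base URL and core names are placeholders, and it assumes a Solr server with the CoreAdmin endpoint exposed at `/admin/cores`.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def swap_url(solr_base: str, live_core: str, rebuilt_core: str) -> str:
    """Build a CoreAdmin SWAP request URL for the two named cores."""
    query = urlencode({"action": "SWAP", "core": live_core, "other": rebuilt_core})
    return f"{solr_base.rstrip('/')}/admin/cores?{query}"


def swap_cores(solr_base: str, live_core: str, rebuilt_core: str,
               timeout: float = 30) -> int:
    """Send the SWAP request and return the HTTP status code."""
    with urlopen(swap_url(solr_base, live_core, rebuilt_core), timeout=timeout) as resp:
        return resp.status
```

After a successful swap, the application keeps using the same Solr URL but is served by the rebuilt core, and the old index remains available under the other name in case you need to swap back.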



b,chris. 





Chris Fitzpatrick | Developer, ArchivesSpace

Skype: chrisfitzpat  | Phone: 918.236.6048

http://archivesspace.org/



________________________________________

From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Mark Cooper <mark.cooper at lyrasis.org>

Sent: Thursday, July 23, 2015 3:01 AM

To: Archivesspace Users Group

Subject: Re: [Archivesspace_Users_Group] strategies for piecemeal re-index



Hi Maureen,



We haven't *had* to do it in production so far, but we have tested deleting files in indexer_state to trigger reindexing for just the targeted record type. It worked for our test cases, well enough that we feel it could work if there were an urgent need for it before a full rebuild or backup restore.
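
A rough sketch of that targeted approach is below. It assumes the state files are named `<repository id>_<record type>.dat` inside the indexer_state directory, which should be verified against your own data directory before relying on it. The helper names are hypothetical, and it defaults to a dry run that only lists what would be removed.

```python
from pathlib import Path
from typing import List


def state_files_for(state_dir: Path, record_type: str) -> List[Path]:
    """Find state files for one record type across all repositories.

    Assumes names like `1_resource.dat` (repository id, underscore,
    record type). Files that do not fit that pattern are ignored.
    """
    matches = []
    for path in state_dir.glob("*.dat"):
        repo_id, _, rtype = path.stem.partition("_")
        if repo_id.isdigit() and rtype == record_type:
            matches.append(path)
    return sorted(matches)


def clear_state(state_dir: Path, record_type: str, dry_run: bool = True) -> List[Path]:
    """Delete (or, in a dry run, just report) the matching state files."""
    targets = state_files_for(state_dir, record_type)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Running `clear_state(Path("/path/to/data/indexer_state"), "resource")` first, and only then repeating it with `dry_run=False`, gives a chance to confirm the file list. It is probably safest to do this with the indexer stopped.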



You could restore from a "however long ago is necessary Solr backup" if you have it (such as provided by ArchivesSpace's backup script, which includes the indexer state) and let the indexer bring it up to date. That could save you from having to do a full rebuild. If you don't have those, or aren't sure when a good state was guaranteed, I think Brian's suggested approach may be the best one if you've got capacity for it.



I can't think of an obvious way to make Solr replication work to your advantage in the way I think Claire is describing it. ArchivesSpace can only point at a single Solr instance (which can be external), but you can't point at the broken index on one server, while rebuilding another. It's not like other applications I've used where the indexing happens (or can be triggered / configured) independently of the main app.



Last thing I can think of (and probably not too helpful for your immediate need) is to try to aggressively speed up indexing, perhaps by disabling everything apart from the backend, indexer and Solr overnight and tuning the configuration settings to maximize indexing speed (threads and records per thread are the most likely candidates for experimentation, but it's hard to predict how much benefit this will offer, and it is likely to be very "spec" dependent, so your mileage may vary).
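
Since the benefit of tuning is hard to predict, one way to compare settings is to measure throughput directly. The sketch below polls Solr's standard select handler for the document count (`numFound`) and turns two samples into a rate and a rough time remaining. The core URL is a placeholder, the helper names are hypothetical, and the document count only approximates the number of records indexed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def indexed_count(solr_core_url: str, timeout: float = 30) -> int:
    """Ask Solr how many documents the core currently holds."""
    query = urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    url = f"{solr_core_url.rstrip('/')}/select?{query}"
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)["response"]["numFound"]


def docs_per_second(count_start: int, count_end: int, elapsed_seconds: float) -> float:
    """Indexing rate between two samples of the document count."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (count_end - count_start) / elapsed_seconds


def hours_remaining(total_expected: int, indexed: int, rate: float) -> float:
    """Rough time left at the measured rate; infinite if nothing is moving."""
    if rate <= 0:
        return float("inf")
    return max(total_expected - indexed, 0) / rate / 3600
```

Taking two `indexed_count` samples a few minutes apart under each combination of thread count and records per thread gives comparable numbers, which should make it clearer whether a full rebuild fits into an overnight window.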



Mark



Mark Cooper

Technical Lead, Hosting and Support

LYRASIS

email: mark.cooper at lyrasis.org

skype: mark_c_cooper



________________________________________

From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Brian Hoffman <brianjhoffman at gmail.com>

Sent: Wednesday, July 22, 2015 1:41 PM

To: Archivesspace Users Group

Subject: Re: [Archivesspace_Users_Group] strategies for piecemeal re-index



Hi Maureen,



Perhaps you could try using a second instance of ArchivesSpace to build the new index, then copying that index over your production ArchivesSpace index. It’s not elegant but I can’t think of anything better.



Brian





On Jul 22, 2015, at 4:26 PM, Callahan, Maureen <maureen.callahan at yale.edu> wrote:



> Claire, this sounds like a very sensible solution. Thank you! I’m eager to hear any thoughts that Chris or Brian may have. I (or someone else from Yale) may also be in touch directly to learn more about your process.

>

> Many thanks,

> Maureen

>

>

>> On Jul 22, 2015, at 4:10 PM, KNOWLES Claire <Claire.Knowles at ed.ac.uk> wrote:

>>

>> Hi Maureen,

>>

>> With another service we run we replicate the SOLR. If we want to do a full

>> reindex we then point the webapp to the replicated SOLR and turn off

>> replication. I'm not sure if this solution will work with ArchivesSpace,

>> I'm sure Chris can advise.

>>

>> Claire

>>

>> --

>> Claire Knowles

>> Library Digital Development Manager

>> Library and University Collections, Information Services

>> University of Edinburgh

>> Tel: 0131 6503023

>> Email: claire.knowles at ed.ac.uk

>>

>>

>>

>>

>>

>> On 22/07/2015 15:37,

>> "archivesspace_users_group-bounces at lyralists.lyrasis.org on behalf of

>> Callahan, Maureen"

>> <archivesspace_users_group-bounces at lyralists.lyrasis.org on behalf of

>> maureen.callahan at yale.edu> wrote:

>>

>>> Hey everyone,

>>>

>>> At some point, our index got totally jacked. Unfortunately, our database

>>> is way too big to be able to do a full re-index overnight and we're

>>> reluctant to leave it going over the weekend in case something goes wrong.

>>>

>>> Also, our Aeon requesting service relies on a webservice built on top of

>>> ArchivesSpace, so we can't have the index unavailable for too long, even

>>> if it's not during normal working hours.

>>>

>>> Has anyone played with re-indexing a bit at a time by deleting records

>>> from indexer_state? Is this a reliable way to fix index problems? Does

>>> anyone have thoughts on other strategies?

>>>

>>> Thanks,

>>> Maureen

>>> _______________________________________________

>>> Archivesspace_Users_Group mailing list

>>> Archivesspace_Users_Group at lyralists.lyrasis.org

>>> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group

>>

>>

>> --

>> The University of Edinburgh is a charitable body, registered in

>> Scotland, with registration number SC005336.

>>


>


