[Archivesspace_Users_Group] Top container ranges

Andrew Morrison andrew.morrison at bodleian.ox.ac.uk
Fri Jun 19 09:01:33 EDT 2020


Just for completeness, another option is to create your own customized 
version of the EAD importer, by subclassing the EADConverter class 
<https://github.com/archivesspace/archivesspace/blob/master/backend/app/converters/ead_converter.rb> 
in a backend plugin. Then you'd just have another option in the 
drop-down in the import job form, and no need to pre-process.

But that would require both Ruby skills and an understanding of the 
ArchivesSpace data model for containers. I'd say even a complete novice 
with XSLT would find it easier to learn enough to tweak the Yale 
example that Adrien has given below. And it produces EAD you can view, 
validate and import on a test system to check the effects. We do both, 
but only use the plugin for things that changing the EAD cannot affect 
(e.g. how agents get roles, or the rules for whether a certain note is 
published).

Andrew.


On 18/06/2020 14:57, Hilton, Adrien wrote:
>
> Hi Dawne,
>
> I believe Yale created a script to break out container ranges: 
> https://github.com/YaleArchivesSpace/xslt-files/blob/master/EAD_expand_top_container_ranges_prior_to_import.xsl
>
> Best wishes,
>
> Adrien
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org 
> <archivesspace_users_group-bounces at lyralists.lyrasis.org> *On Behalf 
> Of *Mayo, Dave
> *Sent:* Thursday, June 18, 2020 9:23 AM
> *To:* Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* Re: [Archivesspace_Users_Group] Top container ranges
>
> So, with the caveat that we put a lot of resources into it (a bunch of 
> archivists’ time, a full year of a full-time developer (me!)), we had 
> very solid results; I think remediating issues prior to import is 
> almost always worth the expense of significant effort, particularly 
> over a large corpus.
>
> My main advice would be to be very, very careful about changes – 
> version your EADs, compare before and after scripts run, and in 
> general be very systematic about how you find, report, and correct 
> changes.
>
> I don’t know if you’ve seen it, but Kate Bowers and I did a write-up 
> of what we did during our migration – it has links to a number of open 
> source tools I wrote for doing this kind of work.  They’re a bit 
> involved to get running, but they definitely work at basically any 
> scale out there, and I’m happy to help people get started with them. 
> https://journal.code4lib.org/articles/12239 
>
> --
>
> Dave Mayo (he/him)
>
> Senior Digital Library Software Engineer
> Harvard University > HUIT > LTS
>
> *From: *<archivesspace_users_group-bounces at lyralists.lyrasis.org 
> <mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org>> on 
> behalf of "Lucas, Dawne Howard" <dawne_lucas at unc.edu 
> <mailto:dawne_lucas at unc.edu>>
> *Reply-To: *Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org 
> <mailto:archivesspace_users_group at lyralists.lyrasis.org>>
> *Date: *Thursday, June 18, 2020 at 9:12 AM
> *To: *Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org 
> <mailto:archivesspace_users_group at lyralists.lyrasis.org>>
> *Subject: *Re: [Archivesspace_Users_Group] Top container ranges
>
> Thanks, Dave.  I guess I should have specified that changing the EAD 
> isn’t a viable solution for us /unless/ it’s automated. We do not plan 
> to edit individual finding aids manually except in cases where the 
> ranges aren’t regular.
>
> If you’ve done this at Harvard, have there been any drawbacks? 
> Anything we should be looking to avoid?
>
> Thanks again,
>
> Dawne
>
> *From: *Mayo, Dave <mailto:dave_mayo at harvard.edu>
> *Sent: *Thursday, June 18, 2020 9:04 AM
> *To: *Archivesspace Users Group 
> <mailto:archivesspace_users_group at lyralists.lyrasis.org>
> *Subject: *Re: [Archivesspace_Users_Group] Top container ranges
>
> The two options I see here are essentially:
>
> 1. Change the EAD
>
> 2. Change the containers after they’re ingested.
>
> Of the two, changing the EAD seems _/easier/_ to me; if you wouldn’t 
> mind going more into why that’s not a viable solution for you, it 
> might help us provide better advice?
>
>
> Either way, at 7000 finding aids, the solution would basically need to 
> be automated – if your box ranges are very regular (i.e. only single 
> number or range, no “3,4,7-10” or similar), it wouldn’t be too 
> difficult – split the range on '-', generate a list of numbers, replace 
> container with multiple containers.
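
For the regular case described above (a container holding only a simple range such as "3-4"), the split/generate/replace steps might look something like the Python sketch below. It is not the Yale XSLT and not anything from ArchivesSpace itself. The function names (expand_range, expand_container_ranges) are made up for illustration, and it assumes un-namespaced EAD 2002 like the sample in the original question. It puts the new containers side by side in the same <did>, which is an assumption about the result you want, so check the import on a test system. Anything irregular ("3,4,7-10", descending ranges) and any container carrying an id or parent attribute is left untouched for manual review.

```python
import re
import xml.etree.ElementTree as ET

# Matches only a simple "N-M" range, optionally padded with whitespace.
RANGE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*$")


def expand_range(text):
    """Return ['3', '4'] for '3-4'; None for anything that is not a
    simple ascending numeric range (single numbers, lists, etc.)."""
    match = RANGE.match(text or "")
    if not match:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    if end <= start:
        return None
    return [str(n) for n in range(start, end + 1)]


def expand_container_ranges(root):
    """Replace each ranged <container> with one sibling <container> per
    number, in place. Returns how many containers were expanded.

    Assumption: tags are un-namespaced. In namespaced EAD the tag would
    look like '{namespace}container' and nothing here would match.
    """
    expanded = 0
    # Snapshot the elements first so the tree can be edited safely.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != "container":
                continue
            # Cautious choice: copying these would duplicate ids or
            # disturb parent/child container links, so skip them.
            if "id" in child.attrib or "parent" in child.attrib:
                continue
            numbers = expand_range(child.text)
            if numbers is None:
                continue
            position = list(parent).index(child)
            parent.remove(child)
            for offset, number in enumerate(numbers):
                new = ET.Element(child.tag, dict(child.attrib))
                new.text = number
                new.tail = child.tail
                parent.insert(position + offset, new)
            expanded += 1
    return expanded


# The sample from the original question, joined onto one line.
SAMPLE = (
    '<c02><did><container type="box" label="Box">3-4</container>'
    "<unittitle>Photographs</unittitle></did></c02>"
)

root = ET.fromstring(SAMPLE)
count = expand_container_ranges(root)
print(count, "container(s) expanded")
print(ET.tostring(root, encoding="unicode"))
```

For real files you would load each one with ET.parse(), call expand_container_ranges() on the root, and write the result to a new file so the original is kept for comparison. ElementTree does not keep the DOCTYPE (or, by default, comments) when writing back out, so a before/after diff is worth doing on a few files before running it over everything.
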
>
> --
>
> Dave Mayo (he/him)
>
> Senior Digital Library Software Engineer
> Harvard University > HUIT > LTS
>
> *From: *<archivesspace_users_group-bounces at lyralists.lyrasis.org 
> <mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org>> on 
> behalf of "Lucas, Dawne Howard" <dawne_lucas at unc.edu 
> <mailto:dawne_lucas at unc.edu>>
> *Reply-To: *Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org 
> <mailto:archivesspace_users_group at lyralists.lyrasis.org>>
> *Date: *Thursday, June 18, 2020 at 8:13 AM
> *To: *Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org 
> <mailto:archivesspace_users_group at lyralists.lyrasis.org>>
> *Subject: *[Archivesspace_Users_Group] Top container ranges
>
> Hi all,
>
> We are formulating a plan to import our 7000+ EAD finding aids into 
> ArchivesSpace and are wondering how other institutions have handled 
> top container ranges.
>
> For example, we have finding aids coded like this:
>
> <c02><did><container type="box" 
> label="Box">3-4</container><unittitle>Photographs</unittitle></did></c02>
>
> This imports into ASpace just fine (yay!), but of course also creates 
> a top container for Box 3-4 instead of Box 3 and Box 4 (boo!). We 
> assume this will be an issue later when we integrate with Aeon.
>
> The most obvious solution to this problem appears to be to change the 
> encoding to:
>
> <c02><did><container type="box" 
> label="Box">3</container><unittitle>Photographs</unittitle></did></c02>
>
> <c02><did><container type="box" 
> label="Box">4</container><unittitle>Photographs</unittitle></did></c02>
>
> For several reasons, this is not a viable solution for us. Have other 
> institutions figured out a way to deal with this issue that does not 
> include editing the EAD in individual finding aids?
>
> Thanks for your help,
>
> Dawne
>
> --
>
> *Dawne Howard Lucas (she/her/hers)*
>
> Technical Services Archivist
>
> Wilson Special Collections Library
>
> 200 South Road, CB #3926
>
> Chapel Hill, NC 27515
>
> The University of North Carolina at Chapel Hill
>
> P 919-966-1776  E dawne_lucas at unc.edu <mailto:dawne_lucas at unc.edu>
>
>
>
> _______________________________________________
> Archivesspace_Users_Group mailing list
> Archivesspace_Users_Group at lyralists.lyrasis.org
> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group

