<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
{mso-style-priority:34;
margin-top:0in;
margin-right:0in;
margin-bottom:0in;
margin-left:.5in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
p.xmsonormal, li.xmsonormal, div.xmsonormal
{mso-style-name:x_msonormal;
margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle20
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:349835954;
mso-list-type:hybrid;
mso-list-template-ids:-2085041400 67698703 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level2
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level3
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level4
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level5
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level6
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
@list l0:level7
{mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level8
{mso-level-number-format:alpha-lower;
mso-level-tab-stop:none;
mso-level-number-position:left;
text-indent:-.25in;}
@list l0:level9
{mso-level-number-format:roman-lower;
mso-level-tab-stop:none;
mso-level-number-position:right;
text-indent:-9.0pt;}
ol
{margin-bottom:0in;}
ul
{margin-bottom:0in;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="#0563C1" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal">Also, specifically:<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<ol style="margin-top:0in" start="1" type="1">
<li class="MsoListParagraph" style="margin-left:0in;mso-list:l0 level1 lfo1">Using an XML database like eXist-db or BaseX with XPath/XQuery was invaluable for analyzing issues and the impact of changes.<o:p></o:p></li><li class="MsoListParagraph" style="margin-left:0in;mso-list:l0 level1 lfo1">One of the tools I wrote, the EAD Checker, is available online:
<a href="https://eadchecker.lib.harvard.edu">https://eadchecker.lib.harvard.edu</a>. It doesn’t catch this specific issue, but it does catch a bunch of others, some of which cause corrupted data rather than a failure to import.<o:p></o:p></li></ol>
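<p class="MsoNormal">As a rough illustration of the kind of corpus analysis described in the first point, here is a minimal sketch that surveys container values across a set of EAD files and flags the ones that are neither a single number nor a simple range. It uses Python's standard-library ElementTree as a stand-in for an XML database with XQuery; it is not one of the tools mentioned above, and the names <code>classify</code> and <code>survey</code> are hypothetical.</p>

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

SINGLE = re.compile(r"^\d+$")                 # e.g. "7"
SIMPLE_RANGE = re.compile(r"^\d+\s*-\s*\d+$")  # e.g. "3-4"


def classify(value):
    """Label a container value as 'single', 'range', or 'irregular'."""
    value = (value or "").strip()
    if SINGLE.match(value):
        return "single"
    if SIMPLE_RANGE.match(value):
        return "range"
    # Anything else (lists, mixed lists and ranges, non-numeric values)
    # is assumed to need human review.
    return "irregular"


def survey(ead_documents):
    """Count container kinds over an iterable of (name, xml_string) pairs.

    Returns (counts, irregular), where irregular lists (name, value) pairs.
    """
    counts = Counter()
    irregular = []
    for name, xml_text in ead_documents:
        root = ET.fromstring(xml_text)
        for element in root.iter():
            if not isinstance(element.tag, str):
                continue
            # Compare the local name so namespaced EAD also matches.
            if element.tag.rsplit("}", 1)[-1] != "container":
                continue
            kind = classify(element.text)
            counts[kind] += 1
            if kind == "irregular":
                irregular.append((name, (element.text or "").strip()))
    return counts, irregular
```

<p class="MsoNormal">The list of irregular values gives a sense of how many finding aids an automated fix could cover and how many would need manual attention.</p>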
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">--<o:p></o:p></p>
<p class="MsoNormal">Dave Mayo (he/him)<o:p></o:p></p>
</div>
<p class="MsoNormal">Senior Digital Library Software Engineer<br>
Harvard University > HUIT > LTS<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:12.0pt;color:black">From: </span></b><span style="font-size:12.0pt;color:black">&lt;archivesspace_users_group-bounces@lyralists.lyrasis.org&gt; on behalf of "Mayo, Dave" &lt;dave_mayo@harvard.edu&gt;<br>
<b>Reply-To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Date: </b>Thursday, June 18, 2020 at 9:23 AM<br>
<b>To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Subject: </b>Re: [Archivesspace_Users_Group] Top container ranges<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal">So, with the caveat that we put a lot of resources into it (a bunch of archivists’ time, plus a full year of a full-time developer (me!)), we had very solid results. I think remediating issues prior to import is almost always worth the expense of significant
effort, particularly over a large corpus. <o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">My main advice would be to be very, very careful about changes: version your EADs, compare them before and after your scripts run, and in general be very systematic about how you find, report, and correct changes.<o:p></o:p></p>
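<p class="MsoNormal">One way to do the before-and-after comparison is a plain text diff of each file. The following is a minimal sketch using Python's standard-library <code>difflib</code>; the function name <code>ead_diff</code> and the <code>before/</code> and <code>after/</code> labels are hypothetical, and a version control system such as Git would serve the same purpose.</p>

```python
import difflib


def ead_diff(before, after, name="finding-aid.xml"):
    """Return unified-diff lines between two versions of an EAD file's text.

    An empty list means the script left the file unchanged.
    """
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile=f"before/{name}",
            tofile=f"after/{name}",
            lineterm="",
        )
    )
```

<p class="MsoNormal">Reviewing these diffs for a sample of files after every scripted change makes unintended edits much easier to spot.</p>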
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">I don’t know if you’ve seen it, but Kate Bowers and I did a write-up of what we did during our migration. It has links to a number of open source tools I wrote for doing this kind of work. They’re a bit involved to get running, but they
definitely work at basically any scale out there, and I’m happy to help people get started with them.
<a href="https://journal.code4lib.org/articles/12239">
https://journal.code4lib.org/articles/12239</a><o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:12.0pt;color:black">From: </span></b><span style="font-size:12.0pt;color:black">&lt;archivesspace_users_group-bounces@lyralists.lyrasis.org&gt; on behalf of "Lucas, Dawne Howard" &lt;dawne_lucas@unc.edu&gt;<br>
<b>Reply-To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Date: </b>Thursday, June 18, 2020 at 9:12 AM<br>
<b>To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Subject: </b>Re: [Archivesspace_Users_Group] Top container ranges</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<p class="MsoNormal">Thanks, Dave. I guess I should have specified that changing the EAD isn’t a viable solution for us
<i>unless</i> it’s automated. We do not plan to edit individual finding aids manually except in cases where the ranges aren’t regular. <o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:9.0pt;color:#4472C4"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:black">If you’ve done this at Harvard, have there been any drawbacks? Anything we should be looking to avoid?</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:black"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:black">Thanks again,</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:black"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:black">Dawne</span><o:p></o:p></p>
<p class="MsoNormal"><span style="color:#4472C4"> </span><o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From: </b><a href="mailto:dave_mayo@harvard.edu">Mayo, Dave</a><br>
<b>Sent: </b>Thursday, June 18, 2020 9:04 AM<br>
<b>To: </b><a href="mailto:archivesspace_users_group@lyralists.lyrasis.org">Archivesspace Users Group</a><br>
<b>Subject: </b>Re: [Archivesspace_Users_Group] Top container ranges<o:p></o:p></p>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">The two options I see here are essentially:<br>
<br>
1. Change the EAD<o:p></o:p></p>
<p class="MsoNormal">2. Change the containers after they’re ingested.<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">Of the two, changing the EAD seems <i>easier</i> to me. If you wouldn’t mind going more into why that’s not a viable solution for you, it might help us provide better advice.<o:p></o:p></p>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
Either way, at 7000 finding aids, the solution would basically need to be automated. If your box ranges are very regular (i.e. only a single number or a simple range, no “3,4,7-10” or similar), it wouldn’t be too difficult: split the range on ‘-’, generate the list of numbers, and
replace the container with multiple containers. <o:p></o:p></p>
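<p class="MsoNormal">The three steps above (split the range, generate the numbers, replace the container) can be sketched as follows. This is a minimal illustration using Python's standard-library ElementTree, assuming simple numeric ranges such as “3-4”; the function names are hypothetical, irregular values such as “3,4,7-10” are deliberately left untouched, and how the ArchivesSpace importer treats several sibling containers in one did should be checked on a test import.</p>

```python
import copy
import re
import xml.etree.ElementTree as ET

# Matches only a simple "start-end" range such as "3-4".
RANGE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s*$")


def expand_range(text):
    """Return ['3', '4'] for '3-4'; None for anything that is not a simple range."""
    match = RANGE.match(text or "")
    if not match:
        return None
    start, end = int(match.group(1)), int(match.group(2))
    if start >= end:
        return None  # reversed or degenerate range: leave for manual review
    return [str(number) for number in range(start, end + 1)]


def split_container_ranges(root):
    """Replace each ranged container with one container per number, in place."""
    # Snapshot the elements first so the tree is not modified while iterating.
    for parent in list(root.iter()):
        for child in list(parent):
            if not isinstance(child.tag, str):
                continue
            # Compare the local name so namespaced EAD also matches.
            if child.tag.rsplit("}", 1)[-1] != "container":
                continue
            numbers = expand_range(child.text)
            if numbers is None:
                continue
            index = list(parent).index(child)
            parent.remove(child)
            for offset, number in enumerate(numbers):
                clone = copy.deepcopy(child)  # keeps attributes such as type and label
                clone.text = number
                parent.insert(index + offset, clone)
    return root
```

<p class="MsoNormal">Each new container keeps the attributes of the original, so a box range becomes a run of individual box containers in the same position within the did.</p>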
<p class="MsoNormal"> <o:p></o:p></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:12.0pt;color:black">From: </span></b><span style="font-size:12.0pt;color:black">&lt;archivesspace_users_group-bounces@lyralists.lyrasis.org&gt; on behalf of "Lucas, Dawne Howard" &lt;dawne_lucas@unc.edu&gt;<br>
<b>Reply-To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Date: </b>Thursday, June 18, 2020 at 8:13 AM<br>
<b>To: </b>Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Subject: </b>[Archivesspace_Users_Group] Top container ranges</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"> <o:p></o:p></p>
</div>
<p class="xmsonormal">Hi all,<o:p></o:p></p>
<p class="xmsonormal"> <o:p></o:p></p>
<p class="xmsonormal">We are formulating a plan to import our 7000+ EAD finding aids into ArchivesSpace and are wondering how other institutions have handled top container ranges.
<o:p></o:p></p>
<p class="xmsonormal"> <o:p></o:p></p>
<p class="xmsonormal">For example, we have finding aids coded like this:<o:p></o:p></p>
<p class="xmsonormal"> <o:p></o:p></p>
<p class="xmsonormal">&lt;c02&gt;&lt;did&gt;&lt;container type="box" label="Box"&gt;3-4&lt;/container&gt;&lt;unittitle&gt;Photographs&lt;/unittitle&gt;&lt;/did&gt;&lt;/c02&gt;<o:p></o:p></p>
<p class="xmsonormal"> <o:p></o:p></p>
<p class="xmsonormal">This imports into ASpace just fine (yay!), but of course also creates a top container for Box 3-4 instead of Box 3 and Box 4 (boo!). We assume this will be an issue later when we integrate with Aeon.
<o:p></o:p></p>
<p class="xmsonormal"> <o:p></o:p></p>
<p class="xmsonormal" style="margin-left:.5in;text-indent:-.5in">The most obvious solution to this problem appears to be to change the encoding to:<o:p></o:p></p>
<p class="xmsonormal" style="margin-left:.5in;text-indent:-.5in"> <o:p></o:p></p>
<p class="xmsonormal">&lt;c02&gt;&lt;did&gt;&lt;container type="box" label="Box"&gt;3&lt;/container&gt;&lt;unittitle&gt;Photographs&lt;/unittitle&gt;&lt;/did&gt;&lt;/c02&gt;<o:p></o:p></p>
<p class="xmsonormal"> <o:p></o:p></p>
<p class="xmsonormal">&lt;c02&gt;&lt;did&gt;&lt;container type="box" label="Box"&gt;4&lt;/container&gt;&lt;unittitle&gt;Photographs&lt;/unittitle&gt;&lt;/did&gt;&lt;/c02&gt;<o:p></o:p></p>
<p class="xmsonormal" style="margin-left:.5in;text-indent:-.5in"> <o:p></o:p></p>
<p class="xmsonormal">For several reasons, this is not a viable solution for us. Have other institutions figured out a way to deal with this issue that does not include editing the EAD in individual finding aids?<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">Thanks for your help,<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">Dawne<o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<p class="MsoNormal">--<o:p></o:p></p>
<p class="MsoNormal"><b><span style="color:black">Dawne Howard Lucas (she/her/hers)</span></b><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">Technical Services Archivist</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black"> </span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">Wilson Special Collections Library</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">200 South Road, CB #3926</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">Chapel Hill, NC 27515</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:black">The University of North Carolina at Chapel Hill</span><o:p></o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;color:#5AA3D2">P</span><span style="font-size:10.0pt;color:#6AB4DC"> </span><span style="font-size:10.0pt;color:black">919-966-1776</span><span style="font-size:10.0pt;color:#4472C4"> </span><span style="font-size:10.0pt;color:#5AA3D2">E </span><span style="font-size:10.0pt;color:black"><a href="mailto:dawne_lucas@unc.edu">dawne_lucas@unc.edu</a></span><o:p></o:p></p>
</div>
</body>
</html>