[Archivesspace_Users_Group] EAD instance issue

Custer, Mark mark.custer at yale.edu
Tue Feb 24 20:49:42 EST 2015


Matt,

Here's another option in the interim.  You could transform the EAD output from Steady with the following XSLT 1.0 file, which will add @id and @parent attributes if those aren't already present in the EAD so that the ASpace importer will pick them up as a single "container group":

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:ead="urn:isbn:1-931666-22-9"
    version="1.0">

    <!--standard identity template, which does all of the copying-->
    <xsl:template match="@*|node()">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--adds an @id attribute to the first container element that doesn't already have an @id or @parent attribute-->
    <xsl:template match="ead:container[not(@id|@parent)][1]">
        <xsl:copy>
            <xsl:attribute name="id">
                <xsl:value-of select="generate-id()"/>
            </xsl:attribute>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

    <!--adds a @parent attribute to the following container elements that don't already have an @id or @parent attribute-->
    <xsl:template match="ead:container[not(@id|@parent)][position() > 1]">
        <xsl:copy>
            <xsl:attribute name="parent">
                <xsl:value-of select="generate-id(../ead:container[not(@id|@parent)][1])"/>
            </xsl:attribute>
            <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

If you have any issues getting this solution to work, just let me know.  I only did a quick test, with the following input:

                <did>
                    <unittitle>Council Members Lists</unittitle>
                    <unitdate>undated</unitdate>
                    <container type="box" label="Text">1</container>
                    <container type="folder" label="Text">20</container>
                </did>

Here's the output that the above transformation supplied for me:

                <did>
                    <unittitle>Council Members Lists</unittitle>
                    <unitdate>undated</unitdate>
                    <container id="d1e109" type="box" label="Text">1</container>
                    <container parent="d1e109" type="folder" label="Text">20</container>
                </did>

That "d1e109" value will likely be different, but it should always be a unique value within the file, and the value really won't matter; it's just needed to group the containers.

I hope that helps,

Mark





________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [archivesspace_users_group-bounces at lyralists.lyrasis.org] on behalf of Custer, Mark [mark.custer at yale.edu]
Sent: Tuesday, February 24, 2015 2:58 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD instance issue

All,

So I fear that this is caused by my request to allow grouped containers to be imported into ArchivesSpace, since that was never possible with Archivists’ Toolkit. I’ll forward this message to the creator of Steady, Jason Ronallo, to see if it would just be a simple change to include generated @id and @parent values on the container elements.  So, in the example below, the EAD output could look like this, instead:

<container type="box" label="Text" id="c1">1</container>
<container type="folder" label="Text" parent="c1">20</container>

(alternatively, if the @label attribute was only output once in this example, then that value could be used to do the grouping by the EAD importer, but I personally prefer to rely on the @parent and @id attributes since those are the examples given in the EAD tag library, http://www.loc.gov/ead/tglib/elements/container.html)

Another option would be for the ASpace importer to group containers by @id and @parent only when those attributes are available.  In other words, if those attributes aren’t part of the container elements, as is the case with the Steady-produced files, then the sibling containers should be considered a single group (and if there are more than 3 sibling container elements, only the first 3 would be imported, which also mimics the AT’s behavior).  I haven’t looked into the ASpace EAD importer yet, but I’ll see if I can find the relevant pieces of code that govern this behavior.
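
To make the proposed fallback concrete, here is a minimal Python sketch of the grouping rule described above. It is not the ArchivesSpace importer's code; the function name `group_containers` and the constant `MAX_PER_GROUP` are hypothetical, and it assumes a container with @parent appears after the container whose @id it references.

```python
import xml.etree.ElementTree as ET

MAX_PER_GROUP = 3  # assumption taken from the thread: only the first 3 siblings are kept


def group_containers(did):
    """Return a list of container groups (lists of elements) for one <did>.

    Hypothetical helper, not part of ArchivesSpace.
    """
    # Accept <container> with or without a namespace prefix on the tag.
    containers = [c for c in did if c.tag.split("}")[-1] == "container"]
    if not containers:
        return []

    # Fallback: no @id/@parent anywhere, so all siblings form a single group.
    if not any("id" in c.attrib or "parent" in c.attrib for c in containers):
        return [containers[:MAX_PER_GROUP]]

    # Otherwise group by the explicit @id/@parent links.
    groups = []
    by_id = {}
    for c in containers:
        parent = c.get("parent")
        if parent in by_id:
            by_id[parent].append(c)
            continue
        group = [c]
        groups.append(group)
        if c.get("id"):
            by_id[c.get("id")] = group
    return groups


if __name__ == "__main__":
    did = ET.fromstring(
        '<did><unittitle>Council Members Lists</unittitle>'
        '<container type="box" label="Text">1</container>'
        '<container type="folder" label="Text">20</container></did>'
    )
    for group in group_containers(did):
        print([(c.get("type"), c.text) for c in group])
```

With the Steady-style input from this thread, the box and folder end up in one group rather than two separate instances.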

All my best,

Mark



From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of MATTHEW R FRANCIS
Sent: Tuesday, February 24, 2015 2:41 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD instance issue

Noah,

Thank you for the quick and helpful response; it is greatly appreciated.  I think you're right about the cause of the change in behavior, and, more importantly from my perspective, your AT import/export workflow recommendation looks like it will work as a stop-gap process for us until there is a better way to proceed.

Cheers and thanks again.

-Matt

Matt Francis
Archivist for Collection Management
Special Collections Library
Penn State University


Twitter: @archivingmatt
http://www.archivingmatt.com

________________________________
From: "Noah Huffman" <noah.huffman at duke.edu>
To: "Archivesspace Users Group" <archivesspace_users_group at lyralists.lyrasis.org>
Sent: Tuesday, February 24, 2015 2:25:55 PM
Subject: Re: [Archivesspace_Users_Group] EAD instance issue

Hi Matthew,

I just tested and confirmed the issue you describe.  We use the same Excel->Steady->EAD->AT/ASpace workflow for large container lists, so this is also an issue for us.

I suspect the behavior is related to changes made in response to this ticket: https://archivesspace.atlassian.net/browse/AR-751

This isn’t a long-term solution, but to achieve your desired result, you could import the XML output of Steady into Archivists’ Toolkit, then export EAD, then import to ArchivesSpace.  I just tested this and it works.

The AT Importer will create a single instance record for the box/folder and on export AT will assign @id and @parent attributes to the containers that communicate the parent/child relationship.  The ArchivesSpace importer will use those attributes to create a single parent/child instance for the box/folder.

Alternatively, you could somehow process the XML to assign @id and @parent to each container prior to import.

Something like:
<container id="cid1351138" type="Box" label="Mixed materials">1</container>
<container parent="cid1351138" type="Folder">4</container>
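
As a rough illustration of that pre-processing idea, here is a minimal Python sketch using only the standard library. The function name `link_containers` and the `cid` prefix are hypothetical, and the sketch assumes every container in a given <did> belongs to a single group; the generated values only need to be unique within the file.

```python
import xml.etree.ElementTree as ET


def link_containers(root, prefix="cid"):
    """Give the first unlinked <container> in each <did> an @id and point
    its unlinked siblings at it with @parent. Hypothetical helper."""
    # Collect existing @id values so generated ones never collide with them.
    used = {el.get("id") for el in root.iter() if el.get("id")}
    counter = 0
    for did in root.iter():
        # Match <did> with or without a namespace on the tag.
        if did.tag.split("}")[-1] != "did":
            continue
        unlinked = [
            c for c in did
            if c.tag.split("}")[-1] == "container"
            and "id" not in c.attrib
            and "parent" not in c.attrib
        ]
        if not unlinked:
            continue
        # Pick the next free identifier.
        counter += 1
        while "%s%d" % (prefix, counter) in used:
            counter += 1
        group_id = "%s%d" % (prefix, counter)
        used.add(group_id)
        unlinked[0].set("id", group_id)
        for sibling in unlinked[1:]:
            sibling.set("parent", group_id)
    return root


if __name__ == "__main__":
    sample = (
        '<c02 level="file"><did>'
        '<unittitle>Council Members Lists</unittitle>'
        '<container type="box" label="Text">1</container>'
        '<container type="folder" label="Text">20</container>'
        '</did></c02>'
    )
    print(ET.tostring(link_containers(ET.fromstring(sample)), encoding="unicode"))
```

For a real finding aid you would parse the file with `ET.parse(...)`, pass `tree.getroot()` to the function, and write the tree back out before importing.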

-Noah

================
Noah Huffman
David M. Rubenstein Rare Book & Manuscript Library
Duke University


From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of MATTHEW R FRANCIS
Sent: Tuesday, February 24, 2015 1:02 PM
To: Archivesspace Users Group
Subject: [Archivesspace_Users_Group] EAD instance issue

All,

Sending through the listserv as the "Send Feedback" link does not properly function for our instance of ArchivesSpace and apologies in advance if this has already been discussed.

We are currently running ASpace v1.1.2 and have recently run into an issue with our EAD imports that we were not experiencing under previous versions (for full context we were previously running v1.0.9 before migrating to the current version).  Historically some of our workflows have relied on converting Excel collection container listings into EAD XML files through Steady (http://steady2.herokuapp.com/) and then importing into AT/ASpace.  This is a workflow that we would like to remain an option moving into the future.

However, when importing one of these XML files in v1.1.2, we noticed that when there were two sets of container tags for an archival object in our XML, ASpace mapped the container data as two separate instances for the archival object instead of a single instance with two container types.  For example, here is the XML data for one of the archival objects:


<c02 level="file">
<did>
<unittitle>Council Members Lists</unittitle>
<unitdate>undated</unitdate>
<container type="box" label="Text">1</container>
<container type="folder" label="Text">20</container>
</did>
</c02>



And when imported into ASpace the instance is now displaying as:



[cid:image001.png at 01D05040.73508450]



We are not sure what has caused this change with the current version, but from our workflows/resources perspectives it presents a challenge to our collection management practices.  We would appreciate any information on what might be causing this, whether there is a recommended alternative approach to importing this type of EAD XML data, and/or whether this is something that is in the pipeline to be worked on.



Thanks for the help and information.



-Matt

Matt Francis
Archivist for Collection Management
Special Collections Library
Penn State University

Twitter: @archivingmatt
http://www.archivingmatt.com

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 11020 bytes
Desc: image001.png
URL: <http://lyralists.lyrasis.org/pipermail/archivesspace_users_group/attachments/20150225/1146fe1c/attachment.png>

