[Archivesspace_Users_Group] EAD instance issue
Kennedy, Nancy
KennedyN at si.edu
Wed Feb 25 10:21:26 EST 2015
Mark,
I just mean with regard to asking Jason R. to change the STEADY to ‘include generated @id and @parent values on the container elements’. If the STEADY files are generated with @id, that’s fine for ASpace, but creates problems for AT users that rely on the BYU plugins. We’d have to strip out the @id before import to our AT. And, for now, we need to keep our AT workflows working.
Using the XSLT temporary fix to add @id (i.e. outside of STEADY) is clearly preferable for us. Though Matt might have a different perspective on that : )
Agreed too on the logic to fix the ASpace importer. That seems necessary and like it ought to work.
Chris’s suggestion to incorporate Jason's STEADY into ASpace as a CSV importer is appealing on this end. Though, I think we would still need to fix the importer logic. STEADY does a lot for us, but we have cases where the spreadsheet exceeds STEADY’s model and so we’d still be going from spreadsheet -> EAD -> AT/ASpace. … unless this ASpaceMigrator tool can do it. Thanks, Nathan, for the heads up.
nancy
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Nathan Stevens
Sent: Wednesday, February 25, 2015 10:14 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD instance issue
To further muddy the issue, if you are looking to construct ASpace Resource records from Excel data, then you should also consider looking at the ASpaceMigrator<https://github.com/ns96/ASpaceMigrator> tool the ASpace team has been slowly working on. It is currently beta code, but it has been used successfully by a few groups to migrate data into ASpace.
On Wed, Feb 25, 2015 at 9:53 AM, Custer, Mark <mark.custer at yale.edu> wrote:
Nancy,
I’m not sure what you mean regarding the multiple date and extent plugins (although I do have experience using that, and I think that it was a great addition to the AT data model!). This shouldn’t cause any problem at all. The @id attribute is only applied to a container element with the stylesheet that I provided, and it should never be repeated in the file in this case.
Also, I agree that it should be addressed by the ASpace importer. I was just providing a temporary fix, which I thought would be much easier (not to mention much, much faster) than importing and exporting a file into the AT!
Lastly, I wasn’t clear in one of my earlier emails. To fix this in the ASpace importer, here’s what I think the logic should be:
1) If @id and @parent attributes are present on container elements, use that data to group and import multiple instances. Voilà, we can finally retain multiple instances, which we could only produce in the AT but not consume (this is now the current behavior in ASpace).
2) If @id and @parent attributes are NOT present on the container elements within any EAD did element, then treat all of those containers as a single instance, rather than multiple instances. In this case, if there are more than 3 sibling container elements (which is exceedingly rare), then only the first 3 will be imported into ASpace as a single container group (which follows the AT’s import behavior).
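The two rules above can be sketched in Python as follows. This is only an illustration of the proposed logic, not the actual ASpace importer code. The function name group_containers, the list-of-tuples representation of an instance, and the handling of a @parent value that matches no @id are all assumptions made for the example. It also assumes EAD without an XML namespace and only one level of @parent nesting.

```python
import xml.etree.ElementTree as ET

MAX_CONTAINERS_PER_INSTANCE = 3  # the AT-style cap described in rule 2


def group_containers(did):
    """Return a list of instances for one <did>; each instance is a list of (type, value)."""
    containers = did.findall("container")
    if not containers:
        return []

    has_grouping = any(c.get("id") or c.get("parent") for c in containers)
    if not has_grouping:
        # Rule 2: no @id/@parent anywhere, so all siblings form one instance,
        # and only the first three containers are kept.
        kept = containers[:MAX_CONTAINERS_PER_INSTANCE]
        return [[(c.get("type"), c.text) for c in kept]]

    # Rule 1: every container without @parent starts its own instance ...
    instances = {}  # dicts keep insertion order: key -> list of (type, value)
    for i, c in enumerate(containers):
        if c.get("parent") is None:
            instances[c.get("id") or f"_top{i}"] = [(c.get("type"), c.text)]
    # ... and every container with @parent joins the instance keyed by that value.
    # Assumption: children whose @parent matches no @id are grouped under that value.
    for c in containers:
        parent = c.get("parent")
        if parent is not None:
            instances.setdefault(parent, []).append((c.get("type"), c.text))
    return list(instances.values())


# Hypothetical sample data: one <did> without grouping attributes, one with them.
flat = ET.fromstring(
    '<did><container type="box">1</container>'
    '<container type="folder">20</container></did>'
)
grouped = ET.fromstring(
    '<did><container type="box" id="c1">1</container>'
    '<container type="folder" parent="c1">20</container>'
    '<container type="box" id="c2">5</container>'
    '<container type="folder" parent="c2">3</container></did>'
)
flat_instances = group_containers(flat)
grouped_instances = group_containers(grouped)
print(flat_instances)
print(grouped_instances)
```

With the sample data, the first <did> yields a single box/folder instance and the second yields two separate box/folder instances.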
Mark
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Kennedy, Nancy
Sent: Wednesday, February 25, 2015 9:34 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD instance issue
Mark,
Wouldn’t adding @id pairs to the STEADY output cause trouble for repositories that still need to use the AT multiple date and extent plugins (the BYU plugins)? For the near term, we will still be using AT and the BYU plugins. For us, @id pairs will throw errors on import. Granted, we could incorporate workarounds to strip any STEADY created @id pairs.
But, since the @id is not required by the EAD and is being considered here primarily to support a feature (and a GREAT one! I’m so looking forward to being able to import multiple instances) for ArchivesSpace, I’d much rather see it addressed within ArchivesSpace importers ... rather than as a change to STEADY outputs.
One of the nice things about STEADY is how (nearly) ready-to-go its EAD files are. We have other methods for importing from spreadsheets, particularly when round-tripping, but STEADY is by far the easier option for new users and student projects. I’d really like to keep it as simple as possible!
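The workaround mentioned above, stripping any @id and @parent attributes from container elements before an AT import, could look roughly like the Python sketch below. It is an illustration only: the function name strip_container_grouping is hypothetical, the sample XML is adapted from the examples in this thread, and it assumes EAD without an XML namespace.

```python
import xml.etree.ElementTree as ET


def strip_container_grouping(root):
    """Remove @id and @parent from every <container>; return how many attributes were dropped."""
    removed = 0
    for container in root.iter("container"):
        for attr in ("id", "parent"):
            if attr in container.attrib:
                del container.attrib[attr]
                removed += 1
    return removed


# Hypothetical sample based on the container examples in this thread.
sample = (
    '<did><container type="box" label="Text" id="c1">1</container>'
    '<container type="folder" label="Text" parent="c1">20</container></did>'
)
root = ET.fromstring(sample)
count = strip_container_grouping(root)
cleaned = ET.tostring(root, encoding="unicode")
print(count)
print(cleaned)
```

Other attributes such as @type and @label are left untouched, so the cleaned file should match what STEADY produces today.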
Nancy
Nancy Kennedy
EAD Coordinator
Smithsonian Institution
KennedyN at si.edu
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Custer, Mark
Sent: Tuesday, February 24, 2015 2:59 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD instance issue
All,
So I fear that this is caused by my request to allow grouped containers to be imported into ArchivesSpace, since that was never possible with Archivists’ Toolkit. I’ll forward this message to the creator of Steady, Jason Ronallo, to see if it would just be a simple change to include generated @id and @parent values on the container elements. So, in the example below, the EAD output could look like this, instead:
<container type="box" label="Text" id="c1">1</container>
<container type="folder" label="Text" parent="c1">20</container>
(alternatively, if the @label attribute was only output once in this example, then that value could be used to do the grouping by the EAD importer, but I personally prefer to rely on the @parent and @id attributes since those are the examples given in the EAD tag library, http://www.loc.gov/ead/tglib/elements/container.html)
Another option would be for the ASpace importer to only group containers when @id and @parent attributes are available. In other words, if those attributes aren’t part of the container elements, like they aren’t with the Steady-produced files, then those sibling containers should be considered a single group (and if there are more than 3 sibling container elements, only the first 3 would be imported, which also mimics the AT’s behavior). I haven’t looked into the ASpace EAD importer yet, but I’ll see if I can’t find the relevant pieces of code that govern this behavior.
All my best,
Mark
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of MATTHEW R FRANCIS
Sent: Tuesday, February 24, 2015 2:41 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD instance issue
Noah,
Thank you for the quick and helpful response; it is greatly appreciated. I think you're right about the cause for the change in behavior, and, more importantly from my perspective, your AT import/export workflow recommendation looks like it will work as a stop-gap process for us until there is a better way to proceed.
Cheers and thanks again.
-Matt
Matt Francis
Archivist for Collection Management
Special Collections Library
Penn State University
Twitter: @archivingmatt
http://www.archivingmatt.com
________________________________
From: "Noah Huffman" <noah.huffman at duke.edu>
To: "Archivesspace Users Group" <archivesspace_users_group at lyralists.lyrasis.org>
Sent: Tuesday, February 24, 2015 2:25:55 PM
Subject: Re: [Archivesspace_Users_Group] EAD instance issue
Hi Matthew,
I just tested and confirmed the issue you describe. We use the same Excel->Steady->EAD->AT/ASpace workflow for large container lists, so this is also an issue for us.
I suspect the behavior is related to changes made in response to this ticket: https://archivesspace.atlassian.net/browse/AR-751
This isn’t a long term solution, but to achieve your desired result, you could import the XML output of Steady into Archivists Toolkit, then export EAD, then import to ArchivesSpace. I just tested this and it works.
The AT Importer will create a single instance record for the box/folder and on export AT will assign @id and @parent attributes to the containers that communicate the parent/child relationship. The ArchivesSpace importer will use those attributes to create a single parent/child instance for the box/folder.
Alternatively, you could somehow process the XML to assign @id and @parent to each container prior to import.
Something like:
<container id="cid1351138" type="Box" label="Mixed materials">1</container>
<container parent="cid1351138" type="Folder">4</container>
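One way to do that processing is sketched below in Python. It is a minimal illustration, not a tested production script. The function name add_container_ids and the "cid" plus counter ID scheme are made up for the example, it assumes EAD without an XML namespace, and it assumes every sibling container should point at the first container in its <did> (which matches the two-container box/folder case shown here).

```python
import xml.etree.ElementTree as ET


def add_container_ids(root, prefix="cid"):
    """Give the first <container> in each <did> an @id and point its siblings at it via @parent."""
    counter = 0
    for did in root.iter("did"):
        containers = did.findall("container")
        if not containers:
            continue
        # Leave a <did> alone if it already carries grouping attributes.
        if any(c.get("id") or c.get("parent") for c in containers):
            continue
        counter += 1
        parent_id = f"{prefix}{counter}"
        containers[0].set("id", parent_id)
        for child in containers[1:]:
            child.set("parent", parent_id)
    return root


# Sample taken from the archival object quoted in this thread.
sample = """<c02 level="file">
  <did>
    <unittitle>Council Members Lists</unittitle>
    <unitdate>undated</unitdate>
    <container type="box" label="Text">1</container>
    <container type="folder" label="Text">20</container>
  </did>
</c02>"""

root = add_container_ids(ET.fromstring(sample))
result = ET.tostring(root, encoding="unicode")
print(result)
```

For a real finding aid you would read and write the file with ET.parse(...) and tree.write(...) instead of the inline sample string.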
-Noah
================
Noah Huffman
David M. Rubenstein Rare Book & Manuscript Library
Duke University
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of MATTHEW R FRANCIS
Sent: Tuesday, February 24, 2015 1:02 PM
To: Archivesspace Users Group
Subject: [Archivesspace_Users_Group] EAD instance issue
All,
Sending through the listserv as the "Send Feedback" link does not properly function for our instance of ArchivesSpace and apologies in advance if this has already been discussed.
We are currently running ASpace v1.1.2 and have recently run into an issue with our EAD imports that we were not experiencing under previous versions (for full context, we were previously running v1.0.9 before migrating to the current version). Historically, some of our workflows have relied on converting Excel collection container listings into EAD XML files through Steady (http://steady2.herokuapp.com/) and then importing into AT/ASpace. This is a workflow that we would like to remain an option moving into the future.
However, when importing one of these XML files in v1.1.2, we noticed that when there were two sets of container tags for an archival object in our XML, ASpace mapped the container data as two separate instances for the archival object instead of a single instance with two container types. For example, here is the XML data for one of the archival objects:
<c02 level="file">
<did>
<unittitle>Council Members Lists</unittitle>
<unitdate>undated</unitdate>
<container type="box" label="Text">1</container>
<container type="folder" label="Text">20</container>
</did>
</c02>
And when imported into ASpace the instance is now displaying as:
[cid:image001.png at 01D050E4.CFEFF700]
We are not sure what has caused this change with the current version, but from our workflows/resources perspective it presents a challenge to our collection management practices. We would appreciate any information on what might be causing this, whether there is a recommended alternative approach to importing this type of EAD XML data, and/or whether this is something that is in the pipeline to be worked on.
Thanks for the help and information.
-Matt
Matt Francis
Archivist for Collection Management
Special Collections Library
Penn State University
Twitter: @archivingmatt
http://www.archivingmatt.com
_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
--
Nathan Stevens
Programmer/Analyst
Digital Library Technology Services
New York University
1212-998-2653
ns96 at nyu.edu
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 11020 bytes
Desc: image001.png
URL: <http://lyralists.lyrasis.org/pipermail/archivesspace_users_group/attachments/20150225/843ca45d/attachment.png>