[Archivesspace_Users_Group] Mass export of EAD

brian brianjhoffman at gmail.com
Wed Jul 29 00:12:17 EDT 2015


It seems to me that changing the business rules for how a resource record's mtime gets updated might not be a great idea, but it wouldn't be too hard to add a new field to the resource that tracks the last component update. It also wouldn't be too hard for the services under discussion to query for component mtimes as well as resource mtimes.
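
For anyone who wants to try the second idea, here is a rough, untested-against-production sketch of a client that combines resource mtimes with component mtimes. It assumes the archival_objects listing accepts the same "modified_since" parameter as the resources listing, and that each archival object's JSON carries a "resource" ref pointing at its parent. BASE_URL, REPO_ID, TOKEN and all function names are placeholders made up for illustration.

```shell
# Placeholders (assumptions): adjust for your own backend and repository.
BASE_URL="${BASE_URL:-http://localhost:8089}"
REPO_ID="${REPO_ID:-2}"

# GET a backend path using the session token in $TOKEN.
as_get() {
  curl -s -H "X-ArchivesSpace-Session: $TOKEN" "$BASE_URL$1"
}

# Print the numeric id at the end of a ref such as /repositories/2/resources/5.
id_from_ref() {
  printf '%s\n' "${1##*/}"
}

# Read one archival object's JSON on stdin and print its parent resource ref.
# A text match is used here only to avoid extra dependencies.
resource_ref_from_component() {
  grep -oE '"resource"[[:space:]]*:[[:space:]]*\{[[:space:]]*"ref"[[:space:]]*:[[:space:]]*"[^"]+"' |
    head -n 1 |
    sed -E 's/.*"([^"]+)"$/\1/'
}

# Read ids (one per line) on stdin and print them sorted and de-duplicated.
merge_ids() {
  sort -n -u
}

# Print the ids of resources that own a component modified since epoch $1.
# Assumes the archival_objects listing honours modified_since.
resources_with_changed_components() {
  as_get "/repositories/$REPO_ID/archival_objects?all_ids=true&modified_since=$1" |
    grep -oE '[0-9]+' |
    while read -r ao_id; do
      ref=$(as_get "/repositories/$REPO_ID/archival_objects/$ao_id" | resource_ref_from_component)
      [ -n "$ref" ] && id_from_ref "$ref"
    done
}

# Usage (needs a running backend, a token in $TOKEN and a cutoff in $SINCE):
#   { as_get "/repositories/$REPO_ID/resources?all_ids=true&modified_since=$SINCE" | grep -oE '[0-9]+'
#     resources_with_changed_components "$SINCE"; } | merge_ids
```

This makes one request per changed component, so it could be slow after a large batch edit. If jq is available, `jq -r '.resource.ref'` would be a sturdier way to read the ref than the text match.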



-------- Original message --------
From: "Arnold, Hillel" <harnold at rockarch.org>
Date: 07/28/2015 4:23 PM (GMT-05:00)
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Mass export of EAD

Hi Mark,
Yup, you’re absolutely right. I made the (erroneous) assumption that changes to mtimes for descendant components would propagate in the resource record as well. This seems like something that would be best done in AS itself; I’m wondering if Brian or Chris have any thoughts about how this could be accomplished?

Hillel Arnold
Lead Digital Archivist
Rockefeller Archive Center

From: Mark Cooper <mark.cooper at lyrasis.org>
Reply-To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Date: Tuesday, July 28, 2015 at 3:41 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Mass export of EAD

In case you're interested, but haven't seen it, there is a doc for the export script:

https://github.com/archivesspace/archivesspace/blob/master/launcher/ead_export/REPO_EAD_EXPORT_README.md

Right now it just exports every EAD associated with a specified repo to a zip file and doesn't have date or incremental awareness. That could be added, as the resources endpoint accepts a "modified_since" parameter (a Unix timestamp in seconds). I just ran a rough test:

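# Epoch seconds for the cutoff (GNU date; the result depends on the local time zone)
date -d '2015-07-01 00:00:00' +'%s' # 1435734000
# $TOKEN holds a session token from a prior login
curl -H "X-ArchivesSpace-Session: $TOKEN" "http://localhost:8089/repositories/2/resources?all_ids=true&modified_since=1435734000"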

This returns what appears to be the correct set of results. The obvious problem is that it isn't descendant aware, so only direct changes to the topmost resource record count for the "modified_since" parameter. If the API also factored in descendant mtimes for record types that have them, that would be ideal =) Some workaround or solution for that limitation is going to be required for any time-based incremental export, assuming you need descendant / component updates to count as an update to the resource for what you're doing. In other words, you may not be able to rely on the resource mtime alone.
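
To make the "incremental awareness" part concrete, here is a minimal sketch of how a wrapper could remember when it last ran and feed that into "modified_since". The state file and both function names are hypothetical and not part of the export script; like the test above, it only sees direct changes to the resource record.

```shell
# Hypothetical state file that stores the epoch seconds of the last export.
STATE_FILE="${STATE_FILE:-./last_export_epoch}"

# Print the epoch of the previous run, or 0 if none has been recorded.
last_run_epoch() {
  if [ -s "$STATE_FILE" ]; then
    cat "$STATE_FILE"
  else
    echo 0
  fi
}

# Build the listing URL for resources modified since a given epoch.
# $1 = backend base URL, $2 = repository id, $3 = epoch seconds
changed_resources_url() {
  printf '%s/repositories/%s/resources?all_ids=true&modified_since=%s\n' "$1" "$2" "$3"
}

# Usage (needs a running backend and a session token in $TOKEN):
#   started=$(date +%s)
#   curl -s -H "X-ArchivesSpace-Session: $TOKEN" \
#     "$(changed_resources_url http://localhost:8089 2 "$(last_run_epoch)")"
#   ... export each returned id ...
#   echo "$started" > "$STATE_FILE"
```

Recording the start time rather than the finish time means a record edited while the export is running gets picked up again next time instead of being missed.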

Mark Cooper
Technical Lead, Hosting and Support
LYRASIS
email: mark.cooper at lyrasis.org
skype: mark_c_cooper

From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Suda, Phillip J <psuda1 at tulane.edu>
Sent: Tuesday, July 28, 2015 9:13 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] Mass export of EAD
 
Thanks all for your suggestions/scripts/help. This is a great start.
 
Thank you,
 
Phil
 
 
Phillip Suda
Systems Librarian
Howard-Tilton Memorial Library
Tulane University
psuda1 at tulane.edu
504-865-5607
 
 
 
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Kevin Clair
Sent: Tuesday, July 28, 2015 10:41 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] Mass export of EAD
 
I have a Perl script I run from the command line that handles every batch export I want or need at this point: https://github.com/duspeccoll/as_utils/blob/master/reports.pl
 
It grabs the JSON list of all the IDs for a given model, and then either dumps everything into a single JSON object or exports to some other format. The EAD export is lines 206-224. This is *extremely* customized for our environment, and I’ve made no effort yet to modify it for general use, but it’s an idea of how one could go about doing this.  -k
 
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Steven Majewski
Sent: Tuesday, July 28, 2015 9:34 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] Mass export of EAD
 
 
There is ead_export.sh in the scripts directory. 
It only exports published collections, but that can be changed in the code if needed. 
That script runs locally on the AS server and it writes into the archivesspace/data/
directory, so you need write access. 
 
resource ids will not necessarily be sequential after deletions and transfers, but you
can get a JSON list of all of the ids from /repositories/$REPO_ID/resources?all_ids=true
and then loop over those ids. 
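
A minimal sketch of that approach, for illustration only: it fetches the id list and requests the EAD for each id through the resource_descriptions endpoint mentioned elsewhere in this thread. BASE_URL, REPO_ID, TOKEN and the function names are placeholders, and the id parsing assumes the response is a flat JSON array of integers.

```shell
# Placeholders (assumptions): adjust for your own backend and repository.
BASE_URL="${BASE_URL:-http://localhost:8089}"
REPO_ID="${REPO_ID:-2}"

# Read a JSON array such as [1,2,5] on stdin and print one id per line.
ids_from_json() {
  grep -oE '[0-9]+'
}

# Print the EAD export URL for resource id $1.
ead_url() {
  printf '%s/repositories/%s/resource_descriptions/%s.xml\n' "$BASE_URL" "$REPO_ID" "$1"
}

# Read ids on stdin and write one aspace_<id>.xml file per resource.
export_all() {
  while read -r id || [ -n "$id" ]; do
    curl -s -H "X-ArchivesSpace-Session: $TOKEN" "$(ead_url "$id")" > "aspace_$id.xml"
  done
}

# Usage (needs a running backend and a session token in $TOKEN):
#   curl -s -H "X-ArchivesSpace-Session: $TOKEN" \
#     "$BASE_URL/repositories/$REPO_ID/resources?all_ids=true" | ids_from_json | export_all
```

Because the ids come from the listing itself, gaps left by deletions and transfers are skipped automatically.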
 
— Steve Majewski
 
 
On Jul 28, 2015, at 11:15 AM, Alexander Duryee <alexanderduryee at nypl.org> wrote:
 
Phil,
As far as I'm aware, there's no bulk EAD export functionality in ASpace.  However, since ASpace's resource identifiers are sequential integers, you can loop over each resource id in a repository and make an API call for its EAD record:
for x in {first..last}; do curl -H '[session token]' "https://[address]/repositories/[id]/resource_descriptions/${x}.xml" > aspace_${x}.xml; done
A loop like that should generate EAD records for each resource in your repository.
Regards,
--Alex
 
On Tue, Jul 28, 2015 at 10:27 AM, Suda, Phillip J <psuda1 at tulane.edu> wrote:
Greetings all,
 
Is there an API or mass export feature for exporting all EAD records from a repository, etc.? I am only seeing a collection level export feature.
 
Thanks,
 
Phil
 
Phillip Suda
Systems Librarian
Howard-Tilton Memorial Library
Tulane University
psuda1 at tulane.edu
504-865-5607
 

_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group



--
Alexander Duryee
Metadata Archivist
New York Public Library
(917)-229-9590
alexanderduryee at nypl.org