[Archivesspace_Users_Group] cURL for bulk export of EAD in xml?

Noah Huffman noah.huffman at duke.edu
Wed Mar 18 14:00:22 EDT 2015


Below are some instructions I wrote up (pasted from a txt file) for batch exporting EAD through the API using Curl in Windows Powershell.

You can export all the resources with one call; you just have to obtain the full list of resource IDs first and then include it in the call in this format: "{1,11,21,31,...}".

Hope this helps.


Steps for Batch Exporting EAD from ArchivesSpace using CURL (Windows Powershell) and REST API

1. Obtain Session Token from ASpace backend (port 9089 in this setup; the stock default is 8089) Using CURL

Command: curl -Fpassword=admin "[backend-url]/users/admin/login"

2. Copy Token from response and store as the variable $TOKEN

Command: $TOKEN = "8e5813109906328fd4ba1cf68be3435cb3b763b056f3d9ca2d992ccac9db794d"

3. Obtain a list of resource record identifiers in the appropriate ASpace repository and store as a variable $IDs

Command: $IDs= curl -H "X-ArchivesSpace-Session: $TOKEN" "[backend-url]/repositories/[repository number]/resources?all_ids=1"

4. Replace brackets in list with braces using Powershell regex find and replace and re-save as $IDs variable

Command: $IDs = $IDs -replace '^\[(.*)\]$', "{`$1}"

5. Batch Export EADs to current directory by passing list of resource IDs stored as $IDs variable.

Command: curl --output "resource_#1.xml" -H "X-ArchivesSpace-Session: $TOKEN" "[backend-url]/repositories/[repository number]/resource_descriptions/$IDs.xml?numbered_cs=true&include_daos=true&include_unpublished=true"

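The --output option writes each file to the current directory. The #1 in the filename is replaced by the current value from the {...} list, so each file in the batch is named after its resource ID (resource_1.xml, resource_11.xml, and so on).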
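
For anyone working in bash rather than Powershell, steps 4 and 5 could be sketched roughly as below. This is only an illustration, not something from my original notes: the function names ids_to_glob and export_url, the example backend URL, and the repository number are all made-up placeholders. It converts the JSON array of IDs into curl's {...} list syntax and builds the export URL with one "?" followed by "&" separators.

```shell
#!/usr/bin/env bash
# Sketch only: function names, backend URL, and repository number below are
# illustrative assumptions, not part of ArchivesSpace or curl.

# Turn the JSON array returned by resources?all_ids=1, e.g. [1, 11, 21],
# into a curl URL list, e.g. {1,11,21}. Whitespace is stripped so that no
# stray spaces end up inside the generated URLs.
ids_to_glob() {
  printf '%s' "$1" | tr -d '[:space:]' | sed -e 's/^\[/{/' -e 's/\]$/}/'
}

# Build the EAD export URL. The query string starts with a single "?" and
# every further parameter is joined with "&".
export_url() {
  local backend="$1" repo_id="$2" glob="$3"
  printf '%s/repositories/%s/resource_descriptions/%s.xml?numbered_cs=true&include_daos=true&include_unpublished=true' \
    "$backend" "$repo_id" "$glob"
}

# Dry run with made-up values; this only prints the URL and needs no network.
glob=$(ids_to_glob '[1, 11, 21]')
export_url "http://localhost:8089" 2 "$glob"
echo

# A real export would then look something like this (TOKEN, BACKEND and
# REPO_ID are assumed to be set already):
#   curl --output "resource_#1.xml" -H "X-ArchivesSpace-Session: $TOKEN" \
#        "$(export_url "$BACKEND" "$REPO_ID" "$glob")"
```

Keeping the URL in double quotes matters in bash, since an unquoted {1,11,21} would be expanded by the shell before curl ever sees it.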

-----Original Message-----
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Mary Willoughby
Sent: Wednesday, March 18, 2015 1:46 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] cURL for bulk export of EAD in xml?

Thanks! I'll try out the script approach first before wading further into cURL.

On 3/18/2015 1:24 PM, Steven Majewski wrote:
> See this thread from January: [Archivesspace_Users_Group] curl help
> <http://lyralists.lyrasis.org/pipermail/archivesspace_users_group/2015-January/001059.html>
> The API call you want is :
> GET  /repositories/$REPO_ID/resource_descriptions/${ID}.xml?${PARAMS}
> ( where PARAMS may be something like: 
> "include_daos=true&numbered_cs=true" )
> There isn't one call to export all resources: You have to first do a 
> call to GET  /repositories/$REPO_ID/resources?all_ids=true
> and loop thru the id's returned with something like:
> for ID in $( curl -s -H "X-ArchivesSpace-Session: $session" "$REPO/repositories/$REPO_ID/resources?all_ids=true" | tail -1 | tr '[],' ' ' )
> do
>   curl  [ . .  . ]
> done
> If you can directly login to the server, running the ead_export script 
> may be easier.
> I have seen problems with the script, though: if there is anything
> wrong with the exported EAD, you will get incomplete data, because
> Nokogiri silently chokes on it. If you use the API calls, you will
> get a complete copy of the bad XML. ( I saw this in the case I noted
> where ASpace inserts <p> tags incorrectly and exports malformed XML. )
> - Steve Majewski
> On Mar 18, 2015, at 12:52 PM, Mary Willoughby <smirk at uga.edu> wrote:
>> Hi everyone,
>> I'm trying to bulk export EAD as xml using cURL to communicate with 
>> the backend of our ArchivesSpace instance. I've gotten through the 
>> very basic steps-- can connect, get session token, export session 
>> token, login, and get details on specific repositories etc. What I'm 
>> a little confused about is the specific syntax required to do a bulk 
>> export of all the EAD from a given repository. Does anyone know of 
>> any documentation/examples of this, or has anybody tried it and had 
>> it work who would share the commands they used?  I've looked at the 
>> thread from back in January and the HM screencasts about the backend 
>> on youtube, and those have been a great help in getting this far, but 
>> unfortunately I don't know enough about cURL to come up with the 
>> string I need on my own. At least not so far.
>> Thanks,
>> Mary Willoughby
>> Digital Library of Georgia
>> _______________________________________________
>> Archivesspace_Users_Group mailing list 
>> Archivesspace_Users_Group at lyralists.lyrasis.org
>> <mailto:Archivesspace_Users_Group at lyralists.lyrasis.org>
>> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group