[Archivesspace_Users_Group] cURL for bulk export of EAD in xml?

Mary Willoughby smirk at uga.edu
Thu Mar 19 10:28:53 EDT 2015


Thanks for the encouragement! I'm going to try them out today.

Mary

On 3/19/2015 9:42 AM, Ben Goldman wrote:
> Mary,
>
> I have little new to add except to say that the directions Noah provided are spot on. As a result of that conversation back in January, we were able to mass export 1900 finding aids for republishing.
>
> Good luck!
>
> -Ben
>
>
>
> Ben Goldman
> Digital Records Archivist
> Penn State University Libraries
> University Park, PA
> 814-863-8333
> http://www.libraries.psu.edu/psul/speccolls.html
>
>
>
> ----- Original Message -----
> From: "Noah Huffman" <noah.huffman at duke.edu>
> To: "Archivesspace Users Group" <archivesspace_users_group at lyralists.lyrasis.org>
> Sent: Wednesday, March 18, 2015 2:00:22 PM
> Subject: Re: [Archivesspace_Users_Group] cURL for bulk export of EAD in xml?
>
> Mary,
>
> Below are some instructions I wrote up (pasted from a txt file) for batch exporting EAD through the API using Curl in Windows Powershell.
>
> You can export all the resources with one call; you just have to obtain the entire list of resource IDs first and include it in the call in this format: "{1, 11, 21, 31, ...}".
>
> Hope this helps.
>
> -Noah
>
> Steps for Batch Exporting EAD from ArchivesSpace using CURL (Windows Powershell) and REST API
>
> 1. Obtain Session Token from ASpace backend (9089) Using CURL
>
> Command: curl -Fpassword=admin "[backend-url]/users/admin/login"
>
> 2. Copy Token from response and store as the variable $TOKEN
>
> Command: $TOKEN = "8e5813109906328fd4ba1cf68be3435cb3b763b056f3d9ca2d992ccac9db794d"
>
> 3. Obtain a list of resource record identifiers in the appropriate ASpace repository and store as a variable $IDs
>
> Command: $IDs= curl -H "X-ArchivesSpace-Session: $TOKEN" "[backend-url]/repositories/[repository number]/resources?all_ids=1"
>
> 4. Replace brackets in list with braces using Powershell regex find and replace and re-save as $IDs variable
>
> Command: $IDs = $IDs -replace '^\[(.*)\]$', "{`$1}"
>
> 5. Batch Export EADs to current directory by passing list of resource IDs stored as $IDs variable.
>
> Command: curl --output "resource_#1.xml" -H "X-ArchivesSpace-Session: $TOKEN" "[backend-url]/repositories/[repository number]/resource_descriptions/$IDs.xml?numbered_cs=true&include_daos=true&include_unpublished=true"
>
> --output option will write filename to current location, #1 will use resource ID as filename for files in batch
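
For anyone working in bash rather than PowerShell, the five steps above might be sketched roughly as follows. This is only a sketch under assumptions: the function names and arguments are made up for illustration, the login response is assumed to carry the token under a "session" key, and the ID list is assumed to come back as a JSON array such as [1,11,21].

```shell
#!/usr/bin/env bash
# Sketch of steps 1-5 for bash. All names here are illustrative assumptions.

# Steps 1-2: pull the "session" value out of the JSON login response on stdin.
extract_session() {
  sed -n 's/.*"session"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Step 4: turn the JSON id list "[1, 11, 21]" into the curl glob "{1,11,21}".
ids_to_glob() {
  tr -d '[:space:]' | sed -e 's/^\[/{/' -e 's/]$/}/'
}

# Steps 1-5 chained together. Nothing runs until this function is called.
# Arguments: backend URL, repository number, admin password (all placeholders).
batch_export() {
  local backend="$1" repo_id="$2" password="$3" token ids
  token=$(curl -s -F "password=${password}" \
            "${backend}/users/admin/login" | extract_session)
  ids=$(curl -s -H "X-ArchivesSpace-Session: ${token}" \
          "${backend}/repositories/${repo_id}/resources?all_ids=true" | ids_to_glob)
  # curl expands the {..} list itself; #1 in --output is the matched id.
  curl --output "resource_#1.xml" -H "X-ArchivesSpace-Session: ${token}" \
    "${backend}/repositories/${repo_id}/resource_descriptions/${ids}.xml?numbered_cs=true&include_daos=true&include_unpublished=true"
}
```

It would be called as something like: batch_export "[backend-url]" "[repository number]" "[password]". Keeping the token and glob handling in small functions makes them easy to check by hand before pointing the script at a real backend.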
>
> -----Original Message-----
> From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Mary Willoughby
> Sent: Wednesday, March 18, 2015 1:46 PM
> To: Archivesspace Users Group
> Subject: Re: [Archivesspace_Users_Group] cURL for bulk export of EAD in xml?
>
> Thanks! I'll try out the script approach first before wading further into cURL.
>
> On 3/18/2015 1:24 PM, Steven Majewski wrote:
>>
>>
>> See this thread from January: [Archivesspace_Users_Group] curl help
>> <http://lyralists.lyrasis.org/pipermail/archivesspace_users_group/2015-January/001059.html>
>>
>> The API call you want is :
>>
>> GET  /repositories/$REPO_ID/resource_descriptions/${ID}.xml?${PARAMS}
>>
>> ( where PARAMS may be something like:
>> "include_daos=true&numbered_cs=true" )
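
As a small illustration of how that URL and its PARAMS fit together, here is a hedged bash sketch. The function name build_export_url and the example host are hypothetical; the point is only that the query string starts with a single "?" and the parameters are joined with "&".

```shell
#!/usr/bin/env bash
# Build the EAD export URL for one resource (illustrative helper, not part
# of ArchivesSpace). Arguments: backend URL, repository id, resource id,
# then zero or more key=value parameters.
build_export_url() {
  local backend="$1" repo_id="$2" id="$3" query
  shift 3
  # Join the remaining arguments with "&" (first character of IFS).
  query=$(IFS='&'; printf '%s' "$*")
  # Prefix the query with "?" only when there is at least one parameter.
  printf '%s/repositories/%s/resource_descriptions/%s.xml%s\n' \
    "$backend" "$repo_id" "$id" "${query:+?$query}"
}
```

For example, build_export_url "$REPO" "$REPO_ID" 42 include_daos=true numbered_cs=true would print the URL to pass to curl for resource 42.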
>>
>>
>> There isn't one call to export all resources: you first have to
>> call GET  /repositories/$REPO_ID/resources?all_ids=true
>> and loop through the IDs returned with something like:
>>
>>
>> for ID in $( curl -s -H "X-ArchivesSpace-Session: $session" "$REPO/repositories/$REPO_ID/resources?all_ids=true" | tail -1 | tr '[],' ' ' )
>> do
>> curl  [ . .  . ]
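
The loop above is left unfinished in the message. One way it might be completed in bash is sketched below; the body of the inner curl call is an assumption based on the resource_descriptions call quoted earlier in this message, and the function names and output filenames are illustrative only.

```shell
#!/usr/bin/env bash
# Sketch of a per-resource export loop. Names are illustrative assumptions.

# Split a JSON id list such as "[1, 11, 21]" into one id per line.
split_ids() {
  # Every run of non-digit characters becomes a single newline,
  # then the empty leading line is dropped.
  tr -cs '0-9' '\n' | sed '/^$/d'
}

# Fetch all resource ids, then export each one to resource_<id>.xml.
# Arguments: backend URL, repository id, session token.
export_each() {
  local backend="$1" repo_id="$2" token="$3" id
  curl -s -H "X-ArchivesSpace-Session: ${token}" \
    "${backend}/repositories/${repo_id}/resources?all_ids=true" | split_ids |
  while read -r id; do
    curl -s -H "X-ArchivesSpace-Session: ${token}" \
      -o "resource_${id}.xml" \
      "${backend}/repositories/${repo_id}/resource_descriptions/${id}.xml?include_daos=true&numbered_cs=true"
  done
}
```

Calling export_each "$REPO" "$REPO_ID" "$session" issues one request per resource, so a failure on one record does not stop the others from being written.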
>>
>>
>>
>> If you can directly login to the server, running the ead_export script
>> may be easier.
>> I have seen problems, though: if there is anything wrong with the
>> exported EAD, you will get incomplete data when Nokogiri silently
>> chokes on it. If you use the API calls, you will get a complete copy
>> of the bad XML.  ( I saw this in the case I noted where ASpace inserts
>> <p> tags incorrectly and exports malformed XML. )
>>
>> - Steve Majewski
>>
>>
>>
>> On Mar 18, 2015, at 12:52 PM, Mary Willoughby <smirk at uga.edu
>> <mailto:smirk at uga.edu>> wrote:
>>
>>> Hi everyone,
>>> I'm trying to bulk export EAD as xml using cURL to communicate with
>>> the backend of our ArchivesSpace instance. I've gotten through the
>>> very basic steps-- can connect, get session token, export session
>>> token, login, and get details on specific repositories etc. What I'm
>>> a little confused about is the specific syntax required to do a bulk
>>> export of all the EAD from a given repository. Does anyone know of
>>> any documentation/examples of this, or has anybody tried it and had
>>> it work who would share the commands they used?  I've looked at the
>>> thread from back in January and the HM screencasts about the backend
>>> on youtube, and those have been a great help in getting this far, but
>>> unfortunately I don't know enough about cURL to come up with the
>>> string I need on my own. At least not so far.
>>>
>>> Thanks,
>>> Mary Willoughby
>>>
>>> Digital Library of Georgia
>>> _______________________________________________
>>> Archivesspace_Users_Group mailing list
>>> Archivesspace_Users_Group at lyralists.lyrasis.org
>>> <mailto:Archivesspace_Users_Group at lyralists.lyrasis.org>
>>> http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>>
>>
>>
>


