[Archivesspace_Users_Group] Import EAD via the API?

Mark Cooper mark.cooper at lyrasis.org
Mon Apr 6 15:50:40 EDT 2015

Hi Dallas,

What you describe is along the lines of what I had in mind with this plugin:


It adds an endpoint to get an ArchivesSpace JSONModel from a "raw" file (such as an EAD XML file). You can then fire that JSONModel back at the imports endpoint to get it into ArchivesSpace. In terms of validation: if the JSONModel conversion fails, you wouldn't have anything to import, and you can log the failure. Use your preferred scripting language to do something like the examples for batch imports (in effect scripting a file-by-file import, rather than pushing a single job that may fail partway through).
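The file-by-file pattern described above might look roughly like this in Python. This is a sketch, not the plugin's documented interface: `convert_to_jsonmodel` stands in for whatever call your conversion step uses (the plugin's actual endpoint isn't shown here), and the backend URL is an assumption. `/repositories/:repo_id/batch_imports` is the stock backend endpoint that accepts a converted record batch.

```python
# Sketch of a file-by-file import loop. The backend URL and the
# convert_to_jsonmodel callable are placeholders/assumptions.
import json
import urllib.request

BACKEND = "http://localhost:8089"  # assumed backend address


def post_json(path, body, session):
    """POST a JSON-serialisable body to the backend; return the parsed reply."""
    req = urllib.request.Request(
        BACKEND + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-ArchivesSpace-Session": session},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def import_one(session, repo_id, ead_path, convert_to_jsonmodel):
    """Convert one EAD file, then push the batch; log failures and move on.

    convert_to_jsonmodel is a hypothetical stand-in for the plugin's
    conversion call: it should take a file path and return a JSONModel batch.
    """
    try:
        batch = convert_to_jsonmodel(ead_path)
    except Exception as err:
        # Conversion failed -- nothing to import; record it and continue.
        print(f"conversion failed for {ead_path}: {err}")
        return None
    return post_json(f"/repositories/{repo_id}/batch_imports", batch, session)
```

Because each file is its own conversion and its own import, one malformed EAD only costs you that file, which is the point of scripting this rather than queuing one big job.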

However, I haven't tested it with 1.2.0, so it may not be compatible with that version as is (but it is probably fine).



Mark Cooper
Technical Lead, Hosting and Support
email: mark.cooper at lyrasis.org
skype: mark_c_cooper
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Dallas Pillen <djpillen at umich.edu>
Sent: Monday, April 6, 2015 12:08 PM
To: Archivesspace Users Group
Subject: [Archivesspace_Users_Group] Import EAD via the API?

Hello all,

I was curious whether anyone has had any success starting EAD import jobs via the API.

I was thinking this could be done using POST /repositories/:repo_id/jobs_with_files described here: http://archivesspace.github.io/archivesspace/doc/file.API.html#post-repositoriesrepoidjobswithfiles

However, I am not entirely sure how the job and file parameters should be sent in the POST request, and I haven't seen anyone ask this question before or give an example of how it might work. I've tried sending the POST request several different ways and each time I am met with: {"error":{"job":["Parameter required but no value provided"],"files":["Parameter required but no value provided"]}}.
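For what it's worth, one guess at how that request might be formed, using the third-party `requests` library: the error message suggests the endpoint wants a multipart form with a `job` parameter (a JSON string) alongside the uploaded files. The parameter spelling (`files[]` vs. `files`), the job JSON shape, and the backend URL below are all assumptions to verify against your version's API documentation, not confirmed behavior.

```python
# Hypothetical sketch of POST /repositories/:repo_id/jobs_with_files.
# Parameter names and the job JSON structure are assumptions.
import json

BACKEND = "http://localhost:8089"  # assumed backend address


def make_import_job(filenames, import_type="ead_xml"):
    """Build the JSON string for the 'job' form parameter (assumed shape)."""
    return json.dumps({
        "jsonmodel_type": "job",
        "job_type": "import_job",
        "job": {
            "jsonmodel_type": "import_job",
            "import_type": import_type,
            "filenames": filenames,
        },
    })


def start_import(session_token, repo_id, ead_path):
    """POST one EAD file as its own import job; return the parsed response."""
    # Imported here so the pure helper above works without the dependency.
    import requests  # third-party: pip install requests

    with open(ead_path, "rb") as fh:
        resp = requests.post(
            f"{BACKEND}/repositories/{repo_id}/jobs_with_files",
            headers={"X-ArchivesSpace-Session": session_token},
            data={"job": make_import_job([ead_path])},
            files=[("files[]", fh)],  # spelling of this key is a guess
        )
    return resp.json()
```

Sending `job` in `data=` and the uploads in `files=` is what makes `requests` emit a multipart/form-data body, which is presumably why a plain form-encoded or JSON POST comes back with "Parameter required but no value provided".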

I suppose it's worth mentioning that the reason I want to do this is that, at some point, we will be importing several thousand EADs into ArchivesSpace. We're doing a lot of preliminary work to make our EADs import successfully, but we know some will likely fail. Right now, the only way to batch import EADs is to upload the whole batch as a single import job, and if one EAD in that job has an error, the entire job fails. For that reason, I would like to import each EAD as a separate job, so that the EADs that will import successfully do so without being impacted by the ones with errors. However, starting several thousand individual import jobs by hand would be very tedious, and I'm looking for a way to automate the process. If anyone else has come up with any creative solutions, or knows of a better way to do this than the API, I would be very interested to know.

The end goal would be to have a script that would batch start the import jobs, get the ID for each job, check up on the jobs every so often and, once there are no longer any active jobs, output some information about each of the jobs that failed. I've figured out how to do most of that using the API, but I'm stumped on how to get the whole process started.
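The polling step of that end goal might be sketched like this, assuming the standard paginated `GET /repositories/:repo_id/jobs` listing and a `status` field on each job record. The status values and the `results` key are assumptions about the response shape, so check them against your instance before relying on this.

```python
# Sketch of polling import jobs and splitting out failures.
# Backend URL, status values, and response keys are assumptions.
import json
import urllib.request

BACKEND = "http://localhost:8089"   # assumed backend address
ACTIVE = {"queued", "running"}      # assumed "still active" status values


def summarize(jobs):
    """Split job records into (still-active, failed) lists by status."""
    active = [j for j in jobs if j.get("status") in ACTIVE]
    failed = [j for j in jobs if j.get("status") == "failed"]
    return active, failed


def fetch_jobs(session_token, repo_id, page=1):
    """Fetch one page of job records from the backend."""
    req = urllib.request.Request(
        f"{BACKEND}/repositories/{repo_id}/jobs?page={page}",
        headers={"X-ArchivesSpace-Session": session_token},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read()).get("results", [])
```

A wrapper loop would call `fetch_jobs` on an interval, and once `summarize` reports no active jobs, print the failed list along with whatever job IDs were recorded when the imports were started.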



Dallas Pillen
Project Archivist

  Bentley Historical Library<http://bentley.umich.edu/>
  1150 Beal Avenue
  Ann Arbor, Michigan 48109-2113
  Twitter<https://twitter.com/umichBentley> Facebook <https://www.facebook.com/bentleyhistoricallibrary>
