[Archivesspace_Users_Group] Yet more tools: Ingest and analysis (of ingest) scripts now on Github

Mayo, Dave dave_mayo at harvard.edu
Fri Aug 19 10:28:32 EDT 2016


Hello all,

I just wanted to make known that, in addition to the EADChecker and our preprocessor, we have produced a set of Ruby scripts for doing bulk ingest that might be of general interest, and have made them available on GitHub.

https://github.com/harvard-library/aspace-utils

There are basically two scripts - an ingest script, which will do a bulk ingest of all XML finding aids in a directory, and an analysis script which will run over said finding aids and produce a summary of error causes.

The ingest script's primary advantage over the built-in bulk importer is that it doesn't fail the whole import on the first failed file, just the individual file.
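
As a rough illustration of that per-file behavior (this is not the code from the repository), a minimal Ruby sketch might look like the following. The method name `ingest_all` and the block it yields to are hypothetical; the block stands in for whatever call actually submits one finding aid to ArchivesSpace.

```ruby
# Hypothetical sketch: process each finding aid independently, so an
# error in one file is recorded and the loop moves on to the next.
def ingest_all(paths)
  results = { ok: [], failed: {} }
  paths.each do |path|
    begin
      yield path                      # assumed: submits a single file
      results[:ok] << path
    rescue StandardError => e
      results[:failed][path] = e.message  # keep the cause for later analysis
    end
  end
  results
end
```

Collecting the failure messages per file, rather than just logging them, is what would make a later summary of error causes straightforward.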

Right now, the scripts have one Harvard-ism: they assume that finding aids are named with a leading 3-character code, which is associated with a repository ID in the ingest script's configuration file.  This can probably be amended in a more general manner pretty easily - if you want to use/try the scripts, but don't know Ruby terribly well, please email me and I'll help you get it set up (or if there's enough demand, just take the time and generalize handling of repositories).
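
To make the naming convention concrete, here is a small sketch of how a 3-character filename prefix could be looked up against a repository ID. The codes, IDs, and the `repository_id_for` helper are all placeholders for illustration, and the real configuration file format may differ.

```ruby
# Hypothetical mapping of 3-character filename prefixes to repository IDs.
# These codes and IDs are made up; real values would come from configuration.
REPOSITORY_IDS = { "abc" => 2, "xyz" => 3 }.freeze

# Return the repository ID for a finding aid based on the first three
# characters of its filename; raise KeyError if the prefix is unknown.
def repository_id_for(path, mapping = REPOSITORY_IDS)
  code = File.basename(path)[0, 3].downcase
  mapping.fetch(code) do
    raise KeyError, "no repository configured for prefix #{code.inspect}"
  end
end
```

Under these assumptions, `repository_id_for("/data/abc00001.xml")` would look up the prefix "abc". Generalizing would mostly mean replacing the prefix extraction with some other rule for choosing a repository.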

Comments are welcome; please contact me with any questions, code or doc contributions, etc.

For reference, here's a list of our other ASpace-related EAD tools, which I'm ALSO very happy to help with/answer questions about.

EADChecker - Web service to check EAD files against Harvard's schematron (or your own, if you want to download the code and run your own instance)
Live site, globally accessible: http://eadchecker.lib.harvard.edu
Code: https://github.com/harvard-library/archivesspace-checker
Developer Docs: http://harvard-library.github.io/archivesspace-checker/

ASpace Preprocessor - Application to check finding aids in bulk and apply automated fixes to them
Code: https://github.com/harvard-library/archivesspace-preprocessor
AND: https://github.com/harvard-library/aspace-processor-fixes

Thanks very much for your time, I hope this is useful to someone out there!

- Dave Mayo