[Archivesspace_Users_Group] Schematron file to test for ASpace compatibility of EAD files prior to import

Custer, Mark mark.custer at yale.edu
Mon Apr 6 16:35:37 EDT 2015


All,

In case anyone might find this useful, I just started work on a Schematron file, https://github.com/fordmadox/schematrons/blob/master/ArchivesSpace-EAD-validator.sch, to test for common EAD incompatibilities with the ASpace JSON model. For instance, this file will make sure that an EAD file destined to become an ArchivesSpace resource record has the following required pieces of data, none of which are required by the EAD 2002 schema:


· @level attribute
· title
· date
· identifier (or archdesc/did/unitid, in EAD-speak)
· extent statement (though I still have to add a rule that will enforce that the resource-level extent statement has both an extent number and an extent type)
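
As a rough illustration of the checks listed above, here is a minimal sketch in Python (not the Schematron file itself). Only archdesc/did/unitid is named in this message; the other element paths (unittitle, unitdate, physdesc/extent), the function name, and the choice to test only the resource-level archdesc are assumptions made for the example, as is the presence of the EAD 2002 namespace on the input.

```python
import xml.etree.ElementTree as ET
from typing import List

# EAD 2002 schema namespace; files using the DTD without a namespace
# would need different paths (assumption: input is namespaced).
NS = {"ead": "urn:isbn:1-931666-22-9"}

# Assumed mapping from the required data to EAD paths, relative to
# <archdesc>. Only the unitid path is stated in the message above.
REQUIRED = {
    "title": "ead:did/ead:unittitle",
    "date": "ead:did//ead:unitdate",  # may sit inside <unittitle>
    "identifier": "ead:did/ead:unitid",
    "extent": "ead:did/ead:physdesc/ead:extent",
}


def missing_required_fields(ead_xml: str) -> List[str]:
    """Return labels of required resource-level data that are absent or empty."""
    root = ET.fromstring(ead_xml)
    archdesc = root.find("ead:archdesc", NS)
    if archdesc is None:
        return ["archdesc"]

    missing = []
    if not archdesc.get("level", "").strip():
        missing.append("@level")
    for label, path in REQUIRED.items():
        # Only the first matching element is inspected.
        element = archdesc.find(path, NS)
        if element is None or not "".join(element.itertext()).strip():
            missing.append(label)
    return missing
```

Calling missing_required_fields() on the text of an EAD file returns an empty list when all five pieces are present, and otherwise a list such as ["@level", "identifier"] that can be reported alongside the file name.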

Honestly, I haven’t really tested ASpace’s EAD importer just yet, but if I were batch importing EAD files, I’d imagine that I’d do the following:


1. Validate the files according to the EAD 2002 schema

2. Validate the files according to a Schematron file (or something similar)

3. If the files failed either validation step, look at the reported errors, including file names and line numbers, to see what needs to be fixed (or, better yet, automate everything so that I could tie specific fixes to specific errors; e.g., if there’s no resource-level identifier in the EAD file, then assign an identifier based on the current timestamp and send it off for import)

4. Upload the valid files

5. Repeat
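
The automated fix mentioned in step 3 (assigning a timestamp-based identifier when the resource-level one is missing) could look roughly like the following Python sketch. The function name ensure_unitid, the "temp-" prefix, and the timestamp format are all hypothetical, and the sketch assumes the file uses the EAD 2002 schema namespace.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

EAD_NS = "urn:isbn:1-931666-22-9"
# Serialize EAD elements without a prefix (xmlns="...") on output.
ET.register_namespace("", EAD_NS)


def ensure_unitid(ead_xml, now=None):
    """Return (xml_text, was_fixed); add a fallback unitid if none is present."""
    root = ET.fromstring(ead_xml)
    did = root.find(f"{{{EAD_NS}}}archdesc/{{{EAD_NS}}}did")
    if did is None:
        raise ValueError("no archdesc/did element found")

    unitid = did.find(f"{{{EAD_NS}}}unitid")
    if unitid is not None and (unitid.text or "").strip():
        return ead_xml, False  # already has an identifier; leave untouched

    if unitid is None:
        unitid = ET.SubElement(did, f"{{{EAD_NS}}}unitid")
    # Hypothetical identifier scheme: "temp-" plus a UTC timestamp.
    now = now or datetime.now(timezone.utc)
    unitid.text = now.strftime("temp-%Y%m%dT%H%M%SZ")
    return ET.tostring(root, encoding="unicode"), True
```

A batch script could call this only for files whose validation report flags a missing identifier, then re-validate the returned text before upload. Note that ElementTree does not preserve comments, the DOCTYPE, or the original formatting, so a production version might prefer a different XML library.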

All of that said, I’ve never written a Schematron file before, so at this point I just wanted to get a rough draft of most of the ASpace requirements. So far, this file is not exhaustive (for instance, it won’t report that you’ve put more than 255 characters in your container_1 indicator, if you’ve done such a thing!), but all of those rules could be added pretty easily. Is anyone else already doing something like this? I seem to remember that some folks had scripts for updating their EAD files to be compliant with ASpace prior to import, but I wasn’t sure if anyone had already created their own validation scenarios.

All my best,

Mark


