[Archivesspace_Users_Group] EAD Import - cryptic error messages

Steven Majewski sdm7g at virginia.edu
Thu Feb 20 13:25:36 EST 2014


On Feb 19, 2014, at 3:25 AM, Chris Fitzpatrick <Chris.Fitzpatrick at lyrasis.org> wrote:

> 
> Hey Noah,
> 
> Yes, the EAD import errors are rather uninformative. A big problem is that schema valid EAD is sometimes not compliant to the AS model. There's a big range of variance that is allowed in the EAD schema, so we're still working on getting all the import mappings right.
> 


Are these other (non-schema) restrictions documented somewhere?

What I’ve figured out so far (besides those character count limits) is:

<physdesc> must contain an <extent> element, and its contents must be a number followed by a unit
(although the unit seems to accept almost any phrase, as long as it’s preceded by a number).
Are there other places where <extent> is required? This seems to be the largest class of import error we’re seeing,
even after fixing some of the instances with a stylesheet.
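
For pre-screening files before import, the rule as I currently understand it can be sketched in Python. The function name check_physdesc and the regular expression are my own assumptions, not anything taken from ArchivesSpace; the importer's real test may be stricter or looser.

```python
import re
import xml.etree.ElementTree as ET

# Assumed shape of an acceptable extent: a number, whitespace, then any unit text.
EXTENT_RE = re.compile(r"^\s*\d+(?:\.\d+)?\s+\S.*$", re.DOTALL)


def local(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def check_physdesc(xml_text):
    """Return a list of problems found in <physdesc> elements (hypothetical helper)."""
    problems = []
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if not isinstance(el.tag, str) or local(el.tag) != "physdesc":
            continue
        extents = [child for child in el if local(child.tag) == "extent"]
        if not extents:
            problems.append("physdesc without extent")
        for ext in extents:
            text = "".join(ext.itertext())
            if not EXTENT_RE.match(text):
                problems.append("extent not 'number unit': %r" % text.strip())
    return problems
```

Running this over a file before import gives a list of the <physdesc> elements to fix by hand or with a stylesheet. It matches on local element names, so it should work with or without the EAD namespace.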


There are restrictions on dates, but we haven’t quite figured out those rules.
It clearly doesn't like “n.d.” (no date) or “c.a.” (circa) as a prefix. (Will it parse as a suffix, after a date?)
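
Since the actual date rules are still unknown to me, the most I can offer is a rough Python sketch that flags <unitdate> values starting with the two prefixes we have seen rejected. The function name and the prefix list are assumptions for illustration; the list is certainly incomplete.

```python
import xml.etree.ElementTree as ET

# Prefixes observed to be rejected so far; extend as more turn up.
SUSPECT_PREFIXES = ("n.d.", "c.a.")


def suspect_unitdates(xml_text):
    """Return the text of each <unitdate> that starts with a suspect prefix."""
    root = ET.fromstring(xml_text)
    hits = []
    for el in root.iter():
        # Compare on the local name so namespaced and plain EAD both work.
        if not isinstance(el.tag, str) or el.tag.rsplit("}", 1)[-1] != "unitdate":
            continue
        text = "".join(el.itertext()).strip()
        if text.lower().startswith(SUSPECT_PREFIXES):
            hits.append(text)
    return hits
```

This only locates candidates. It says nothing about what form of date the importer will accept instead.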





We’re also seeing truncated <unittitle>s on some of the records that have been successfully imported.

In some instances, these occur when a <unittitle> contains mixed content, for example:

<unittitle>Newspaper Clippings, <title xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" render="italic" xlink:href="">Richmond Times Dispatch,</title><unitdate type="inclusive" era="ce" calendar="gregorian">1967-1968</unitdate></unittitle>


shows up as “Newspaper Clippings,”.
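
To find the titles that might be affected, a small Python sketch can list every <unittitle> that has child elements, showing the text before the first child next to the full text. The function name is hypothetical. For the example above, the leading text is exactly the truncated string, but I have not confirmed that this is how the importer reads the element.

```python
import xml.etree.ElementTree as ET


def mixed_unittitles(xml_text):
    """Return (leading_text, full_text) for each <unittitle> with child elements."""
    root = ET.fromstring(xml_text)
    report = []
    for el in root.iter():
        if not isinstance(el.tag, str) or el.tag.rsplit("}", 1)[-1] != "unittitle":
            continue
        if len(el) == 0:
            continue  # plain text only, not mixed content
        leading = (el.text or "").strip()
        # Join all descendant text and normalize runs of whitespace.
        full = " ".join("".join(el.itertext()).split())
        report.append((leading, full))
    return report
```

Comparing the two columns against what shows up after import should tell us whether the truncation always happens at the first embedded tag.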


I don’t know whether the problem is specifically the mixed content and embedded tags, or the empty attributes
(xlink:href=""; those are added by the LOC dtd2schema.xsl conversion), but I have seen empty attributes in
other elements cause parse errors on import.
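
To rule the empty attributes in or out, they can be stripped before import. Here is a minimal Python sketch; the function name is my own, and treating whitespace-only values as empty is an assumption.

```python
import xml.etree.ElementTree as ET

# Keep the familiar prefix on output instead of an auto-generated one.
ET.register_namespace("xlink", "http://www.w3.org/1999/xlink")


def strip_empty_attributes(xml_text):
    """Remove attributes whose value is empty; return (cleaned_xml, count_removed)."""
    root = ET.fromstring(xml_text)
    removed = 0
    for el in root.iter():
        # Collect names first so the dict is not modified while iterating over it.
        empty = [name for name, value in el.attrib.items() if value.strip() == ""]
        for name in empty:
            del el.attrib[name]
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed
```

If a file imports cleanly after this but its titles are still truncated, that would point at the mixed content rather than the empty attributes. Note that ElementTree may move namespace declarations to the root element when it serializes, so the output will not be byte-for-byte identical apart from the removed attributes.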


(There appear to be other truncations that may not fit that specific pattern, which I haven’t traced yet.)


— Steve Majewski

