[Archivesspace_Users_Group] EAD Import - cryptic error messages
Steven Majewski
sdm7g at virginia.edu
Thu Feb 20 13:25:36 EST 2014
On Feb 19, 2014, at 3:25 AM, Chris Fitzpatrick <Chris.Fitzpatrick at lyrasis.org> wrote:
>
> Hey Noah,
>
> Yes, the EAD import errors are rather uninformative. A big problem is that schema valid EAD is sometimes not compliant to the AS model. There's a big range of variance that is allowed in the EAD schema, so we're still working on getting all the import mappings right.
>
Are these other (non-schema) restrictions documented somewhere ?
What I’ve figured out so far ( besides those character count limits ) is:
<physdesc> must contain an <extent> element, and its contents must be a number and a unit
( although unit seems to accept almost any phrase, as long as it’s preceded by a number. )
Are there other places where <extent> is required ? This seems to be the greatest class of import error we’re seeing,
even after fixing some of the instances with a stylesheet.
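
A rough pre-import check for the <extent> pattern described above could look like the following. This is only a sketch: the "number followed by a unit phrase" rule is inferred from failed imports rather than from any documentation, and the names EXTENT_PATTERN and check_physdesc are made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed rule (inferred from import failures, not documented):
# the extent text must start with a number followed by some unit phrase.
EXTENT_PATTERN = re.compile(r"^\s*\d+(?:\.\d+)?\s+\S")


def local_name(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def check_physdesc(root):
    """Return a list of problem descriptions for <physdesc> elements."""
    problems = []
    for elem in root.iter():
        if local_name(elem.tag) != "physdesc":
            continue
        extents = [c for c in elem.iter() if local_name(c.tag) == "extent"]
        if not extents:
            text = "".join(elem.itertext()).strip()
            problems.append("physdesc without extent: %r" % text)
        for extent in extents:
            text = "".join(extent.itertext())
            if not EXTENT_PATTERN.match(text):
                problems.append("extent not 'number unit': %r" % text.strip())
    return problems
```

Running check_physdesc(ET.parse(path).getroot()) over each finding aid gives a list of the <physdesc> elements worth fixing by hand or with a stylesheet before trying the import again.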
There are restrictions on dates, but we haven’t quite figured out those rules:
It clearly doesn't like “n.d.” (no date) or “c.a.” (circa) as a prefix. ( Will it parse as a suffix, after a date ? )
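
Until the actual date rules are known, one option is simply to flag the prefixes that have been seen to fail. The sketch below does only that. The pattern and the function name suspect_date are illustrative, and also matching the more common spelling "ca." is an assumption, not something confirmed against the importer.

```python
import re

# Prefixes observed to break the import: "n.d." and "c.a."
# Matching "ca." as well is an assumption, not a confirmed rule.
SUSPECT_PREFIX = re.compile(r"^\s*(n\.\s?d\.|c\.\s?a\.|ca\.)", re.IGNORECASE)


def suspect_date(text):
    """Return True if a date string starts with a prefix seen to fail."""
    return bool(SUSPECT_PREFIX.match(text))
```

This only looks at the start of the string, so a value such as "1967 ca." is not flagged; whether the importer accepts that form is still an open question.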
We’re also seeing truncated <unittitle>s on some of the files that have been imported successfully.
In some instances, these occur when a <unittitle> contains mixed content, for example:
<unittitle>Newspaper Clippings, <title xmlns:xlink="http://www.w3.org/1999/xlink" xlink:type="simple" render="italic" xlink:href="">Richmond Times Dispatch,</title><unitdate type="inclusive" era="ce" calendar="gregorian">1967-1968</unitdate></unittitle>
shows up as “Newspaper Clippings,” .
I don’t know whether the problem is specifically the mixed content and embedded tags, or the empty attributes
( xlink:href="", which is added by the LOC dtd2schema.xsl conversion ), but I have seen empty attributes in
other elements cause parse errors on import.
( There appear to be other truncations that may not fit that specific pattern, that I haven’t traced yet. )
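
To narrow down which of the two suspects is responsible, it may help to list every <unittitle> that has child elements or empty attributes, together with its full text, and compare that against what the import actually shows. A minimal sketch, with audit_unittitles as a made-up name:

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from a tag or attribute name."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def audit_unittitles(root):
    """List (full_text, has_child_elements, empty_attribute_names)
    for each <unittitle> with mixed content or empty attributes."""
    report = []
    for title in root.iter():
        if local_name(title.tag) != "unittitle":
            continue
        # Text of the element including all nested elements.
        full_text = "".join(title.itertext()).strip()
        has_children = len(title) > 0
        # Empty attributes on the unittitle itself or anything inside it.
        empty_attrs = [
            local_name(name)
            for elem in title.iter()
            for name, value in elem.attrib.items()
            if value == ""
        ]
        if has_children or empty_attrs:
            report.append((full_text, has_children, empty_attrs))
    return report
```

If titles with child elements but no empty attributes are still truncated after import, that would point at the mixed content rather than at the empty xlink:href.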
— Steve Majewski