[Archivesspace_Users_Group] EAD Import - cryptic error messages

Chris Fitzpatrick Chris.Fitzpatrick at lyrasis.org
Mon Mar 3 17:25:52 EST 2014


Hi Steven,


Wow, thanks for this. I'm going over it now, and it really helps with the improved error messaging we are trying to set up.


I definitely think it should be doable to strip out any empty XML tags and not have them create JSON nodes.


Also, looking at the diff you sent: it seems to cause some problems with the test suite, but I need to figure out what's going on there and why this is stripping out some of these error messages. Will update soon.


But yes, until then a good workaround would be to strip out empty EAD tags prior to import.


best, chris.



Chris Fitzpatrick | chris.fitzpatrick at lyrasis.org
Developer, ArchivesSpace
http://archivesspace.org/
________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of Steven Majewski <sdm7g at virginia.edu>
Sent: Saturday, March 01, 2014 5:14 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD Import - cryptic error messages


I found a way to get info on the EAD -> JSON_schema mappings, and I’ve managed to fix those notes/0/content errors as well as several others.

For debugging purposes, I made this temporary change to jsonmodel_wrap.rb to ignore all validation errors,
and then ran my command-line EAD import parser. (I haven’t tried running this code on the backend server, so I have no idea what it might break.)

--- a/backend/app/converters/lib/jsonmodel_wrap.rb
+++ b/backend/app/converters/lib/jsonmodel_wrap.rb
@@ -13,10 +13,10 @@ module ASpaceImport
         # TODO - speed things up by avoiding this another way
         rescue JSONModel::ValidationException => e



-          e.errors.reject! {|path, mssg|
-                            e.attribute_types &&
-                            e.attribute_types.has_key?(path) &&
-                            e.attribute_types[path] == 'ArchivesSpaceDynamicEnum'}
+          e.errors.reject! {|path, mssg| true }
+#                            e.attribute_types &&
+#                            e.attribute_types.has_key?(path) &&
+#                            e.attribute_types[path] == 'ArchivesSpaceDynamicEnum'}


This generates JSON files for almost all of the EAD files (except for about 30, which I assume are the ones with errors other
than #<:ValidationException…>). The ones that would not normally have validated will still generate validation
errors if POSTed to /repositories/$ID/batch_imports. However, I can pipe them through json_pp and search for the schema
property in the error message. So far, this has yielded enough context to identify the source of the problem
in the EAD file.
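
The json_pp-and-search step could also be scripted. The sketch below is only an assumption about how one might do it: find_key is a hypothetical helper that walks any parsed JSON document and reports where a given key occurs. It does not rely on the actual shape of the batch_imports response, which is not shown here.

```ruby
require 'json'

# Recursively collect every value stored under `key`, together with the
# path (hash keys and array indexes) that leads to it.
def find_key(node, key, path = [], hits = [])
  case node
  when Hash
    node.each do |k, v|
      hits << [path + [k], v] if k == key
      find_key(v, key, path + [k], hits)
    end
  when Array
    node.each_with_index { |v, i| find_key(v, key, path + [i], hits) }
  end
  hits
end

# Read a saved response body and print each match with its location.
def report(json_text, key)
  find_key(JSON.parse(json_text), key).each do |path, value|
    puts "#{path.join('/')}: #{value.inspect}"
  end
end
```

Calling report(File.read('response.json'), 'errors') would list every "errors" entry and where it sits in the document; the file name and the key are placeholders to adjust to whatever the server actually returns.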

Most of these problems seem to trace back to empty elements in the EAD file.

In a few cases, there is a missing required element (unitid, for example), but in most cases, removing the empty element
fixes the problem. Is this something that could be fixed in the parser, so that an empty element doesn’t create a JSON
property at all?
(For now, I’m adding templates for all of the glitches I’ve found to an AS fixup stylesheet that runs as a pre-process to AS import.)
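
For anyone who would rather not use XSLT, the same pre-processing idea can be sketched in Ruby with REXML. This is not the stylesheet described above, just an illustration under stated assumptions: "empty" is taken to mean no attributes, no child elements and no non-whitespace text, and the KEEP_EMPTY list of elements that are empty by design is an assumption to adjust for your own finding aids.

```ruby
require 'rexml/document'

# Elements that are legitimately empty in EAD and should be kept (assumed list).
KEEP_EMPTY = %w[lb ptr extptr].freeze

# True when the element has no attributes, no child elements,
# and no text other than whitespace.
def blank_element?(el)
  el.attributes.empty? &&
    !el.has_elements? &&
    el.texts.all? { |t| t.value.strip.empty? }
end

# Post-order walk: children are pruned first, so a parent that held only
# empty children is itself removed afterwards.
def prune_empty!(el)
  el.elements.to_a.each { |child| prune_empty!(child) }
  return if el.parent.is_a?(REXML::Document) # never remove the root element

  el.remove if blank_element?(el) && !KEEP_EMPTY.include?(el.name)
end

# Parse an EAD string, drop empty elements, and return the cleaned XML.
def strip_empty_elements(xml)
  doc = REXML::Document.new(xml)
  prune_empty!(doc.root)
  doc.to_s
end
```

Elements that carry attributes but no content (an empty container with only a type, say) are deliberately left alone here, since deciding whether to drop them is a judgment call about the data.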


— Steve Majewski


On Feb 28, 2014, at 8:43 AM, Brad Westbrook <brad.westbrook at lyrasis.org> wrote:

Hi, Steve,

I won’t be able to address your mapping request until next week.

We are working on a public release now which will address the LDAP security hole reported a couple of weeks ago and include a number of enhancements made since the 1.0.4 release on Jan. 20.  We are aiming to announce the release later today, but it might not be until Monday, depending on resolution of one item.

Brad

Bradley D. Westbrook
Program Manager
brad.westbrook at lyrasis.org
800.999.8558 x2910
678.235.2910
bradley_d_westbrook (Skype)





From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Steven Majewski
Sent: Friday, February 28, 2014 8:33 AM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD Import - cryptic error messages


Brad:

Can you express this requirement in terms of EAD elements instead of JSONModel schema types?
It’s that mapping that is giving me trouble: trying to turn the schema references in those error messages
into elements in the imported EAD that need to be addressed.

Any ETA for that next release?
I’ve managed to fix up some of the import problems with a stylesheet: I’m up to 2749 files out of 4074 parsing successfully (up from 0 and 300+ on my
initial efforts). That notes/0/content message is my greatest outstanding issue:

1210  #<:ValidationException: {:errors=>{"notes/0/content"=>["At least 1 item(s) is required"]}}>
  31  Unexpected Object Type in Queue: Expected archival_object got container
  30  #<:ValidationException: {:errors=>{"dates"=>["one or more required (or enter a Title)"], "title"=>["must not be an empty string (or enter a Date)"]}}>
  11  #<:ValidationException: {:errors=>{"instances/0/container/indicator_1"=>["Property is required but was missing"]}}>
  11  #<:ValidationException: {:errors=>{"id_0"=>["Property is required but was missing"]}}>
   8  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "notes/0/content"=>["At least 1 item(s) is required"]}}>
   6  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"]}}>
   5  #<:ValidationException: {:errors=>{"notes/0/content"=>["At least 1 item(s) is required"], "id_0"=>["Property is required but was missing"]}}>
   5  #<:ValidationException: {:errors=>{"ead_id"=>["Must be 255 characters or fewer"]}}>
   2  #<:ValidationException: {:errors=>{"instances/0/container/type_1"=>["Property is required but was missing"]}}>
   1  Invalid schema given: string
   1  #<:ValidationException: {:errors=>{"record"=>["Can't unambiguously match {:reference_text=>\"(In non correspondence -legal)\"} against schema types: [\"JSONModel(:note_index_item) object\"]. Resolve this by adding a 'jsonmodel_type' property to {:reference_text=>\"(In non correspondence -legal)\"}"]}}>
   1  #<:ValidationException: {:errors=>{"notes/7/subnotes/0/content"=>["Must be 65000 characters or fewer"]}}>
   1  #<:ValidationException: {:errors=>{"notes/0/content"=>["At least 1 item(s) is required"], "notes/8/subnotes/0/content"=>["Must be 65000 characters or fewer"]}}>
   1  #<:ValidationException: {:errors=>{"instances/0/container/type_1"=>["Property is required but was missing"], "instances/0/container/indicator_1"=>["Property is required but was missing"]}}>
   1  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "ead_id"=>["Must be 255 characters or fewer"]}}>
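
A count like the one above can be produced from a list of per-file error messages with a few lines of Ruby. This is a minimal sketch, not necessarily how the list was generated; the helper names are made up, and it assumes one message string per failed file.

```ruby
# Count identical messages and order them most frequent first
# (ties are broken alphabetically so the output is stable).
def summarize(messages)
  messages.tally.sort_by { |msg, count| [-count, msg] }
end

# Render each entry as a right-aligned count followed by the message.
def format_summary(messages)
  summarize(messages).map { |msg, count| format('%4d  %s', count, msg) }
end
```

For example, puts format_summary(File.readlines('errors.log', chomp: true)) would print the summary for a log file with one message per line; the file name is a placeholder.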



— Steve M.

