[Archivesspace_Users_Group] notes/*/subnotes/0/content"=>["Must be 65000 characters or fewer"]

Steven Majewski sdm7g at virginia.edu
Tue Feb 18 12:08:50 EST 2014


On Feb 18, 2014, at 6:51 AM, Brad Westbrook <brad.westbrook at lyrasis.org> wrote:

> Hi, Steve,
> 
> Your suggestion to portion the <scopecontent> into smaller <scopecontent>s is a reasonable workaround.  This would work for most notes that exceed the 65,000-character limit.  BTW, will you tell us what the source of the EADs is?
> 
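
A minimal sketch of that workaround, for illustration only: group the paragraphs of one long note so that each group stays within the limit, and write each group out as its own <scopecontent>. The function name and the plain character count are assumptions; how ArchivesSpace itself counts the 65000 characters (for example, whether markup is included) is not confirmed here.

```python
def group_paragraphs(paragraphs, limit=65000):
    """Greedily group paragraph strings so each group's total length is <= limit.

    A single paragraph longer than the limit still ends up alone in its own
    group and would need to be split by hand.
    """
    groups, current, size = [], [], 0
    for para in paragraphs:
        # Start a new group when adding this paragraph would exceed the limit.
        if current and size + len(para) > limit:
            groups.append(current)
            current, size = [], 0
        current.append(para)
        size += len(para)
    if current:
        groups.append(current)
    return groups
```

Each returned group would become the content of one smaller <scopecontent> element, in the original order.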

We have about 11,000 EAD guides in the Virginia Heritage site: 

	http://ead.lib.virginia.edu/
	http://vaheritage.org

Over 4000 of those are from UVA Special Collections.  ( Library of Virginia is next with almost 4000. )

We have not had great success importing UVA’s EAD into ArchivesSpace. 
All of them validate against the EAD 2002 RELAX NG schema. ( Actually an extended version
that allows the xsi:schemaLocation attribute that AT exports, and the xml:base that XInclude expansion adds. )

Initially none of them imported successfully. After running them through a stylesheet that adds a missing
<extent> to each <physdesc> that lacks one, a bit more than 300 out of 4000+ will import. 
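
For illustration, the same kind of fix can be sketched in Python instead of a stylesheet. This is not the stylesheet used here; the function name and the placeholder extent value ("1 item") are assumptions, and a real fix would need a sensible extent for each guide.

```python
import xml.etree.ElementTree as ET

def local(tag):
    """Strip any '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]

def add_missing_extents(root, placeholder="1 item"):
    """Append an <extent> to every <physdesc> that has none.

    Returns the number of <extent> elements added. The placeholder text is
    an assumed stand-in value, not something taken from the guides.
    """
    added = 0
    # Collect first so the tree is not modified while it is being walked.
    physdescs = [el for el in root.iter() if local(el.tag) == "physdesc"]
    for physdesc in physdescs:
        if any(local(child.tag) == "extent" for child in physdesc):
            continue
        # Reuse the parent's namespace (if any) for the new element.
        ns = physdesc.tag[: physdesc.tag.find("}") + 1] if physdesc.tag.startswith("{") else ""
        extent = ET.SubElement(physdesc, ns + "extent")
        extent.text = placeholder
        added += 1
    return added
```

Usage would be along the lines of parsing a guide with ET.parse(path), calling add_missing_extents(tree.getroot()), and writing the tree back out.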

Remaining errors, sorted by frequency, are:

2347  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"]}}>
 944  #<:ValidationException: {:errors=>{"notes/0/content"=>["At least 1 item(s) is required"]}}>
 182  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "notes/0/content"=>["At least 1 item(s) is required"]}}>
 111  Unexpected Object Type in Queue: Expected resource got date
  38  Unexpected Object Type in Queue: Expected archival_object got date
  31  #<:ValidationException: {:errors=>{"dates"=>["one or more required (or enter a Title)"], "title"=>["must not be an empty string (or enter a Date)"]}}>
  24  Unexpected Object Type in Queue: Expected archival_object got container
  10  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "id_0"=>["Property is required but was missing"]}}>
   8  #<:ValidationException: {:errors=>{"instances/0/container/indicator_1"=>["Property is required but was missing"]}}>
   4  #<:ValidationException: {:errors=>{"notes/0/content"=>["At least 1 item(s) is required"], "id_0"=>["Property is required but was missing"]}}>
   2  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "ead_id"=>["Must be 255 characters or fewer"]}}>
   1  Invalid schema given: string
   1  #<:ValidationException: {:errors=>{"record"=>["Can't unambiguously match {:reference_text=>\"(In non\\n               correspondence -legal)\"} against schema types: [\"JSONModel(:note_index_item) object\"]. Resolve this by adding a 'jsonmodel_type' property to {:reference_text=>\"(In non\\n               correspondence -legal)\"}"]}}>
   1  #<:ValidationException: {:errors=>{"instances/0/container/type_1"=>["Property is required but was missing"]}}>
   1  #<:ValidationException: {:errors=>{"instances/0/container/type_1"=>["Property is required but was missing"], "instances/0/container/indicator_1"=>["Property is required but was missing"]}}>
   1  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "notes/7/subnotes/0/content"=>["Must be 65000 characters or fewer"]}}>
   1  #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"], "notes/0/content"=>["At least 1 item(s) is required"], "notes/8/subnotes/0/content"=>["Must be 65000 characters or fewer"]}}>
   1  #<:ValidationException: {:errors=>{"ead_id"=>["Must be 255 characters or fewer"]}}>
   1  #<:ValidationException: {:errors=>{"dates/0/expression"=>["Must be 255 characters or fewer"]}}>

That last one was a markup error that’s been fixed. Most of the others have been difficult to locate. 
The parser error says what is missing, but it doesn’t say where it’s missing from. 

I believe that the majority of extents errors are due to the extent starting with text other than a number,
and the majority of date errors are likely <unitdate>s that contain non-date text, like “circa” , “c.a.” or “n.d.” (no date).
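
Since the error messages do not say where the problem is, one way to test that guess is to scan each guide for the suspect patterns directly. The sketch below is only an assumption about what to look for: it flags <extent> text that does not start with a digit and <unitdate> text containing a few non-date words. The word list and function name are illustrative, not a description of what the importer actually rejects.

```python
import re
import xml.etree.ElementTree as ET

# Assumed list of non-date markers; extend as needed.
NON_DATE = re.compile(r"\b(circa|ca\.|c\.a\.|n\.d\.|undated)", re.IGNORECASE)

def suspect_nodes(root):
    """Return (element name, normalized text) pairs that look likely to fail import."""
    hits = []
    for el in root.iter():
        name = el.tag.rsplit("}", 1)[-1]  # ignore any namespace prefix
        text = " ".join("".join(el.itertext()).split())
        if name == "extent" and not re.match(r"\d", text):
            hits.append((name, text))
        elif name == "unitdate" and NON_DATE.search(text):
            hits.append((name, text))
    return hits
```

Running this over each file and printing the file name alongside the hits would at least narrow down which guides, and which elements, to inspect.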


We have had a somewhat better success ratio with importing a sample of guides from other Virginia Heritage members. 


[ Numbers produced by hacking a new batch importer after the old one in migration went away. 
  The new batch importer admin interface is an improvement, but it stops on the first error it encounters, so it was not 
  very usable for processing this number of files. ]
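
The general shape of a batch run that keeps going past failures and tallies them can be sketched as follows. This is not the hacked importer itself: import_one is a hypothetical stand-in for whatever call actually imports one file, and errors are grouped simply by their message text.

```python
from collections import Counter

def batch_import(paths, import_one):
    """Try every file, continuing past failures.

    import_one is a hypothetical callable that imports one file and raises
    an exception on failure. Returns (tally, failed): a Counter of error
    messages and a dict mapping each message to the files that raised it.
    """
    tally = Counter()
    failed = {}
    for path in paths:
        try:
            import_one(path)
        except Exception as exc:
            message = str(exc)
            tally[message] += 1
            failed.setdefault(message, []).append(path)
    return tally, failed
```

tally.most_common() then gives the messages sorted by frequency, and failed records which files produced each one.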
 

> Also, can you elaborate on your parenthetical statement at the bottom, perhaps providing an example?
> 
> Thanks,
> 
> Brad W.  
> 
> 

Only a couple of guides with this error. ( and a couple with the scopecontent > 65000 error. )


Error message:
   #<:ValidationException: "ead_id"=>["Must be 255 characters or fewer"]


========== imports/uva-sc/viu01268.xml =========
Found 1 nodes:
-- NODE --
<eadid publicid="PUBLIC &#34;-//University of Virginia::Library::Special Collections Dept.//TEXT (US::ViU::viu01268::A Guide to the Memoirs of Mary Victoria Wesson Craw entitled What Every Little Girl Said to Her Mother &#34;What Was It Like When You Were Growing Up?&#34;: Scrapbook of a Lady of the Twentieth Century, 1983-1992)//EN&#34; &#34;viu01268.xml&#34;" countrycode="US" mainagencycode="US-ViU">PUBLIC
             "-//University of Virginia::Library::Special Collections
             Dept.//TEXT (US::ViU::viu01268::A Guide to the Memoirs of
             Mary Victoria Wesson Craw entitled What Every Little Girl
             Said to Her Mother "What Was It Like When You Were Growing
             Up?": Scrapbook of a Lady of the Twentieth Century,
             1983-1992)//EN" "viu01268.xml"</eadid>
========== imports/uva-sc/viu01600.xml =========
Found 1 nodes:
-- NODE --
<eadid publicid="PUBLIC &#34;-//University of Virginia::Library::Special Collections Dept.//TEXT (US::ViU::viu01600::A.F. Robinton, The trials and tribulations of Robert Aldolphus Ralston onetime recruit inthe U.S. Navy and at present, Boastwain's Mate, 1st Class)//EN&#34; &#34;viu01600.xml&#34;" countrycode="US" mainagencycode="US-ViU">PUBLIC
             "-//University of Virginia::Library::Special Collections
             Dept.//TEXT (US::ViU::viu01600::A.F. Robinton, The trials
             and tribulations of Robert Aldolphus Ralston onetime
             recruit in the U.S. Navy and at present, Boastwain's Mate,
             1st Class)//EN" "viu01600.xml"</eadid>


normalize-space() on the text to remove the extra spaces does not reduce it to 255 characters or fewer,
so I suppose I’ll have to truncate the title in the eadid text. 
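
A rough sketch of that truncation, under stated assumptions: whitespace is collapsed first (roughly what normalize-space() does), and if the result is still too long, the title is cut short while the trailing part starting at the last `)//` is kept. The function name, the `)//` marker, and the choice to preserve the tail are all assumptions for illustration.

```python
def shorten_eadid(text, limit=255, marker=")//"):
    """Collapse whitespace, then shorten the text to at most `limit` characters.

    If `marker` is found, everything from its last occurrence onward is kept
    and the text before it (assumed to end with the title) is truncated.
    Otherwise the text is simply cut at `limit`.
    """
    text = " ".join(text.split())
    if len(text) <= limit:
        return text
    cut = text.rfind(marker)
    if cut == -1:
        return text[:limit]
    head, tail = text[:cut], text[cut:]
    keep = limit - len(tail)
    if keep <= 0:
        return text[:limit]
    return head[:keep].rstrip() + tail
```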

— Steve Majewski / UVA Alderman Library
