[Archivesspace_Users_Group] EAD File Importing and Verification

Majewski, Steven Dennis (sdm7g) sdm7g at eservices.virginia.edu
Thu Jan 7 17:25:23 EST 2016


On Jan 7, 2016, at 4:44 PM, Flanagan, Patrick <PJFlanagan at ship.edu> wrote:

I think they're poorly conforming, as a number of tags were missing at one point -- such as <extent></extent>. They may have been generated by Archon? It's something of a mess.


If a missing <extent> were the problem, I’m sure you would get that message in the import log.
But on the couple of occasions when I’ve clicked on the wrong import type and tried importing EAD as MARC XML,
or the other way around, it has just silently failed, which is why I think the importer doesn’t recognize your file as EAD.
If a file parses as XML but none of the expected paths match, no actions are triggered and nothing happens.
If something does match and the resulting record doesn’t validate, the importer will complain.
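
One quick way to test the "not recognized as EAD" theory is to look at the root element and its namespace. The sketch below uses Ruby's REXML; classify_ead and its return symbols are made-up names for illustration, and this is not how the ArchivesSpace importer itself decides. The namespace URI is the EAD 2002 schema namespace, the same one that appears in the import_context of the validation error in the console session.

```ruby
require 'rexml/document'

# EAD 2002 schema namespace (DTD-based EAD files have no namespace at all).
EAD_2002_NS = 'urn:isbn:1-931666-22-9'

# Hypothetical helper: classify an XML string by its root element.
# Returns :schema_ead, :no_namespace_ead, :other_namespace_ead, or :not_ead.
def classify_ead(xml)
  root = REXML::Document.new(xml).root
  # root.name is the local name, without any prefix
  return :not_ead if root.nil? || root.name != 'ead'

  case root.namespace
  when EAD_2002_NS then :schema_ead
  when '', nil     then :no_namespace_ead   # likely DTD-based
  else                  :other_namespace_ead
  end
end

# Usage (the path is a placeholder):
#   puts classify_ead(File.read('/path/to/finding_aid.xml'))
```

A result other than :schema_ead would be consistent with the silent failure described above, though it does not prove that is the cause.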


If you want to try the JIRB method:

./scripts/jirb

Loading ArchivesSpace configuration file from path: /projects/Archivespace/dcs-archivesspace/common/config/config.rb

#   lots of messages …

irb(main):001:0> cnv = EADConverter.new( '/projects/from.edward/viu03244.xml' )
=> #<EADConverter:0x3af3c661 @input_file="/projects/from.edward/viu03244.xml", @batch=#<ASpaceImport::RecordBatch:0x127f90eb @working_area=[], @uri_remapping={}, @record_filter=#<Proc:0x66a17a8f@/projects/Archivespace/dcs-archivesspace/backend/app/converters/lib/parse_queue.rb:18 (lambda)>, @must_be_unique=["subject"], @working_file=#<Tempfile:/var/folders/yj/hv_tsy7j51l212dw7xf36lm40000gp/T/import_batch_working_file_145220516423220160107-7506-gmuck6>, @seen_records={}>>

irb(main):002:0> cnv.run
HI!
HI!
W, [2016-01-07T17:19:30.901000 #7506]  WARN -- : Thread-4734: Setting a property that has already been set
JSONModel::ValidationException: #<:ValidationException: {:errors=>{"extents"=>["At least 1 item(s) is required"]}, :import_context=>"<ead class=\"cdata\" id=\"viu03244\" xmlns=\"urn:isbn:1-931666-22-9\"> ... </ead>"}>

# lots of stack trace…

# JSON output is in:
irb(main):004:0* cnv.get_output_path
=> "/var/folders/yj/hv_tsy7j51l212dw7xf36lm40000gp/T/import_batch_result_145220520329120160107-7506-1andxuz"

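The "extents" validation error in the session above can be checked for ahead of time by looking for a collection-level extent in the EAD itself. This is a rough sketch using REXML, assuming the extent is expected at archdesc/did/physdesc/extent (where EAD 2002 puts it); collection_extents is a hypothetical helper, not part of ArchivesSpace, and a missing or empty extent is only one possible way to trigger that error.

```ruby
require 'rexml/document'

# Hypothetical helper: return the non-empty text of every
# ead/archdesc/did/physdesc/extent element in the given XML string.
# Elements are matched by local name, so this works with or without
# the EAD namespace.
def collection_extents(xml)
  root = REXML::Document.new(xml).root
  return [] if root.nil? || root.name != 'ead'

  level = [root]
  %w[archdesc did physdesc extent].each do |name|
    # Descend one step: keep only child elements with the wanted local name.
    level = level.flat_map { |el| el.elements.to_a.select { |c| c.name == name } }
  end
  level.map { |e| e.text.to_s.strip }.reject(&:empty?)
end

# Usage (the path is a placeholder):
#   extents = collection_extents(File.read('/path/to/finding_aid.xml'))
#   puts extents.empty? ? 'no collection-level extent found' : extents.inspect
```

An empty result means the file has no usable collection-level <extent>, which matches the "At least 1 item(s) is required" message.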


I have xmllint, but I hadn't found the EAD schema; thank you! I'll try both that and setting up schematron and see what it comes up with. This is exactly what I needed!

~Patrick
________________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [archivesspace_users_group-bounces at lyralists.lyrasis.org] on behalf of Majewski, Steven Dennis (sdm7g) [sdm7g at eservices.virginia.edu]
Sent: Thursday, January 07, 2016 4:30 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD File Importing and Verification

Are they namespaced, schema-conforming EAD, or are they based on the DTD?
I don’t think I’ve ever seen a completely empty import log, which makes me think it isn’t being recognized as EAD.
(And not being schema conforming is my first guess at a reason.)


Otherwise:


1. Check that the XML is well formed. I use xmllint or Oxygen Editor (or google for online validators).
2. Validate against the EAD schema. Again, I use xmllint or Oxygen.
  Get a copy from: http://www.loc.gov/ead/eadschema.html
3. Try validating against the schematron rules at: https://github.com/fordmadox/schematrons
  This may be a bit more difficult to manage. We had some discussion at the NYU workshop about
  setting this up as a supported service, so people don’t have to deal with figuring out how to run
  Schematron, but I haven’t had a chance to look at this. But if you get this far, ask and we’ll
  figure out how to help.
4. You can also run the EADConverter from the IRB console and output the JSON model.
  But if there’s nothing in the log file, I doubt you’re getting that far in the import.
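
If xmllint or Oxygen is not at hand, step 1 can be roughly approximated from the same Ruby console. This is only a sketch using the REXML parser: well_formedness_error is a made-up helper name, and REXML catches common problems such as mismatched tags but is not a strict conformance checker, so xmllint remains the better reference. It does not cover steps 2 and 3 (schema and Schematron validation).

```ruby
require 'rexml/document'

# Hypothetical helper: return nil when the XML string parses,
# otherwise the parser's error message.
def well_formedness_error(xml)
  REXML::Document.new(xml)
  nil
rescue REXML::ParseException => e
  e.message
end

# Usage (the path is a placeholder):
#   err = well_formedness_error(File.read('/path/to/finding_aid.xml'))
#   puts err.nil? ? 'parsed without errors' : "not well formed: #{err}"
```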


— Steve Majewski




On Jan 7, 2016, at 4:09 PM, Flanagan, Patrick <PJFlanagan at ship.edu> wrote:

Good afternoon,

I've been tasked with figuring out why a simple EAD file fails to be imported into ArchivesSpace. I suspect it's an error with the XML file's formatting, such as a missing tag, but I don't know enough about the file type to verify it by eye. I thought I'd ask if there's any tool archivists use to verify that their EAD files are correct. When imported into ArchivesSpace, the job fails and there is an empty error log; I don't have anything else to go on, unfortunately.

Thank you very much for your time,

~Patrick Flanagan
KLN Applications Administrator
Keystone Library Network Hub
_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
