[Archivesspace_Users_Group] EAD Import Issue...

Custer, Mark mark.custer at yale.edu
Fri Jul 31 13:12:35 EDT 2015

Interesting.  I just tried to change the encoding value, but that doesn’t work.  If you do a find and replace in the file to replace the single quotes, though, the record will import fine.  I’ve attached a copy of the record that I was able to import.
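
The find-and-replace step could also be scripted. The following is a minimal Python sketch, not the procedure actually used on the attached record; the name `straighten_quotes` and the character mapping are assumptions chosen for illustration.

```python
# Hypothetical helper: map Word-style "smart" quotes to plain ASCII quotes.
SMART_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as an apostrophe)
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

_TRANSLATION = str.maketrans(SMART_QUOTES)


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ASCII quotes."""
    return text.translate(_TRANSLATION)


if __name__ == "__main__":
    sample = "St. Vincent\u2019s steady growth"
    print(straighten_quotes(sample))
```

A straight double quote is safe in element text such as an abstract, but it would need escaping if the curly quote sat inside an attribute value delimited by double quotes, so the output is worth re-validating before import.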

For the record, using that type of single quote doesn’t invalidate the EAD file.  It’s still perfectly valid, but I don’t know if it’s fully UTF-8 compliant.

Is there any way to come up with a list of invalid characters?  If so, then that could be added to a Schematron file to test and make sure those values aren’t present before attempting to do the batch upload.
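
One way to start building such a list is to scan a file for every character outside plain ASCII and review the results by hand. The sketch below uses Python rather than Schematron, and the function name `find_non_ascii` is hypothetical. It only reports candidates: a non-ASCII character is not necessarily one that the importer rejects.

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, character, Unicode name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                # Fall back to "UNKNOWN" for code points without a Unicode name.
                hits.append((line_no, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    sample = "<abstract>St. Vincent\u2019s steady growth</abstract>"
    for line_no, col, ch, name in find_non_ascii(sample):
        print(f"line {line_no}, column {col}: U+{ord(ch):04X} {name}")
```

Characters that turn up repeatedly in failed imports could then be collected into the pattern that a Schematron rule tests for.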


From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Steven Majewski
Sent: Friday, July 31, 2015 12:31 PM
To: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] EAD Import Issue...

You might also try changing the encoding of the EAD file in the XML header.
If it’s not declared, by default it’s UTF-8.
Change the first line to:

            <?xml version="1.0" encoding="windows-1252"?>

( I don’t know for a fact if this will work for ArchivesSpace, but it works with most parsers and validators. )

Alternatively, if you have ‘iconv’ you can run the file through that program to change the encoding (the file names below are placeholders):

iconv -f WINDOWS-1252 -t UTF-8 input.xml > output.xml
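
Where iconv is not available, the same re-encoding can be sketched in Python. This is only an illustration: the function name `reencode` and its parameters are assumptions, and `cp1252` is Python's codec name for Windows-1252.

```python
def reencode(src_path, dst_path, from_enc="cp1252", to_enc="utf-8"):
    """Read src_path as from_enc and write the same text to dst_path as to_enc."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=from_enc, newline="") as src:
        text = src.read()  # raises UnicodeDecodeError on bytes undefined in from_enc
    with open(dst_path, "w", encoding=to_enc, newline="") as dst:
        dst.write(text)
```

After converting, any encoding named in the XML declaration on the first line of the file should be updated to match the new encoding. If the read step raises a UnicodeDecodeError, the file contains bytes that are not defined in the assumed source encoding, which suggests the file was not Windows-1252 to begin with.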

— Steve Majewski

On Jul 31, 2015, at 12:17 PM, Tomecek, Christy <christy.tomecek at yale.edu> wrote:


I think the issue is that there are Word “Smart Quotes” in your text fields (not the markup itself). The EAD won’t validate if they are present.

Example (Smart quote highlighted):

<abstract label="Abstract">Dating from 1918 to 2000, the History and Background Information series consists of written histories, newspaper clippings, and anniversary publications documenting St. Vincent’s steady growth in the Lincoln Park neighborhood.

There is a way to turn off Smart Quotes in Word, so that you don’t have to fix them line by line when you copy and paste from a Word document into ASpace.

• Open Word. Go to File (or if you are in Windows 8/8.1, go to the Windows logo button).
• Scroll to the bottom of the sidebar where things like "New," "Save," etc. are and click on "Options" at the bottom.
• Go to "Proofing," located on the sidebar.
• Go to "AutoCorrect Options" in the main panel.
• Go to the "AutoFormat As You Type" tab and uncheck the "'Straight quotes' with 'smart quotes'" option under "Replace As You Type."



Christy Tomecek
Archives Assistant
Yale University Library
Manuscripts and Archives
christy.tomecek at yale.edu

From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Rossetti, Dominic
Sent: Friday, July 31, 2015 11:58 AM
To: 'archivesspace_users_group at lyralists.lyrasis.org'
Subject: [Archivesspace_Users_Group] EAD Import Issue...

Hey all,

When trying to import EAD I get the following error message in the log file:


Error: #<Encoding::UndefinedConversionError: "\x9D" from Windows-1252 to UTF-8>

I’ve attached a file as an example. The EAD is valid and correct. Not sure what is causing the issue.
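
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis of what ArchivesSpace does internally: byte 0x9D is unassigned in Windows-1252, and it is also the last byte of the UTF-8 encoding of the right double "smart" quote (U+201D). A UTF-8 file that is read as Windows-1252 would therefore fail on exactly this byte. The Python sketch below demonstrates the byte-level facts only; the helper name `decodes_as_cp1252` is hypothetical.

```python
closing_quote = "\u201d"  # right double quotation mark
utf8_bytes = closing_quote.encode("utf-8")  # three bytes, the last one is 0x9D


def decodes_as_cp1252(data: bytes) -> bool:
    """Return True if every byte in data is defined in Windows-1252 (Python's cp1252)."""
    try:
        data.decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(utf8_bytes)                      # the UTF-8 bytes of the smart quote
    print(decodes_as_cp1252(utf8_bytes))   # 0x9D has no mapping in cp1252
    print(decodes_as_cp1252(b"\x94"))      # 0x94 is the cp1252 byte for the same quote
```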

-------------- next part --------------
A non-text attachment was scrubbed...
Name: dpu_ead_cm0001_stvincentchurchchi-edited.xml
Type: text/xml
Size: 54453 bytes
Desc: dpu_ead_cm0001_stvincentchurchchi-edited.xml
URL: <http://lyralists.lyrasis.org/pipermail/archivesspace_users_group/attachments/20150731/e466e224/attachment.xml>
