[Archivesspace_Users_Group] diacritics in Title and Filing Title fields

Zalduendo, Ines izalduendo at gsd.harvard.edu
Tue Dec 11 12:57:05 EST 2018

Thanks Benn for sending this along.
The same is going on with Japanese characters. They display correctly in ArchivesSpace but the PDF doesn’t display them.
Here’s an example: https://hollisarchives.lib.harvard.edu/repositories/7/resources/201 (top right button for PDF)
I never reported this to the users group, but am glad others are interested in this being looked into. I was told core developers already know about this.

Special Collections Archivist / Frances Loeb Library / Harvard University Graduate School of Design / 48 Quincy Street, Cambridge, MA 02138 / T. 617.496.1300

From: archivesspace_users_group-bounces at lyralists.lyrasis.org <archivesspace_users_group-bounces at lyralists.lyrasis.org> On Behalf Of Benn Joseph
Sent: Tuesday, December 11, 2018 11:19 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] diacritics in Title and Filing Title fields

Not sure if there’s a ticket for this, but we’re seeing some tricky behavior with diacritics in both the Title and Filing Title fields when trying to print a PDF as a background job.

Here’s an example: the collection name is “Camille Saint-Saëns correspondence”, and the umlaut displays correctly in the public interface.

If this text is input into the Title field without any character encoding, i.e. if the “ë” is just pasted in there, then when I print a PDF as a background job in the staff interface it shows up like this:

“Camille Saint-Sae#ns correspondence”
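
One possible explanation for the "e#" pattern, offered only as an assumption and not confirmed anywhere in this thread: the pasted "ë" may be stored in decomposed form (a plain "e" followed by U+0308 COMBINING DIAERESIS), and the PDF font may have no glyph for the combining mark. The Python sketch below uses a hypothetical title string to show how to check which form a string is in and how to compose it with NFC normalization. It does not touch ArchivesSpace itself.

```python
import unicodedata


def describe(text):
    """Return (character, Unicode name) pairs for every non-ASCII character."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text if ord(ch) > 127]


# Assumption: the pasted title holds "e" + U+0308 rather than a single U+00EB.
pasted = "Camille Saint-Sae\u0308ns correspondence"

# NFC composes the base letter and the combining mark into one code point.
composed = unicodedata.normalize("NFC", pasted)

print(describe(pasted))
print(describe(composed))
print(len(pasted), len(composed))
```

If the first list names a combining mark, the text is decomposed, and pasting the NFC form into the Title field may be worth trying.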

If I encode the character, whether as a named HTML entity or as a numeric character reference (&#235;), the title ends up looking like this in the PDF output:

“Camille Saint-Sa&#235;ns correspondence”

…because the ampersand gets converted to “&amp;” in the XML, so the reference ends up as “&amp;#235;”. I’m not seeing this behavior in any other fields though. Does this mean that no diacritics are allowed in the Title fields? Or am I just inputting this wrong? When generating a PDF from the public interface, it seems to remove the diacritic entirely, so the title fields end up as “Saint-Saens” in each case, although I understand that PDF creation process to be different from the one done as a background job.
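
The double-escaping effect described above can be reproduced outside ArchivesSpace. The following is a minimal Python sketch, assuming the export step XML-escapes whatever text is stored in the Title field. The variable names are hypothetical and this is not the actual ArchivesSpace export code.

```python
import html
from xml.sax.saxutils import escape

# Title typed with the literal character versus with a numeric reference.
title_literal = "Camille Saint-Sa\u00ebns correspondence"
title_encoded = "Camille Saint-Sa&#235;ns correspondence"

# XML escaping rewrites every "&" as "&amp;", including the one that
# starts an already-encoded reference.
escaped_literal = escape(title_literal)
escaped_encoded = escape(title_encoded)

print(escaped_literal)
print(escaped_encoded)

# A consumer decodes one level only, so the reference survives as visible text.
print(html.unescape(escaped_encoded))
```

Under that assumption the literal character passes through escaping unchanged, while the pre-encoded reference comes back as visible text after one round of decoding, which would match the PDF output quoted above.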


Benn Joseph
Head of Archival Processing
Northwestern University Libraries
Northwestern University
benn.joseph at northwestern.edu
