[Archivesspace_Users_Group] diacritics in Title and Filing Title fields

Alexander Duryee alexanderduryee at nypl.org
Wed Dec 12 09:39:13 EST 2018


Thanks for posting this!  I created a ticket for this issue some time ago -
https://archivesspace.atlassian.net/projects/ANW/issues/ANW-758.  The issue
appears to be that the base PDF font set is limited in its character
support, and does not handle diacritics/non-Latin characters well - it
either "flattens" them to ASCII, or replaces them with "#".

I'm unaware of any workarounds in the meantime, but it's entirely a PDF
rendering issue - your data should be fine as-is.
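
For anyone who wants to reproduce the two symptoms outside ArchivesSpace,
here is a minimal Python sketch. It is only an illustration under my own
assumptions, not ArchivesSpace's actual export code: flatten_to_ascii is a
hypothetical helper that mimics the "flattening" behavior, and the
escape() call mimics what any XML serializer does to a title that already
contains a character reference.

```python
import unicodedata
from xml.sax.saxutils import escape


def flatten_to_ascii(text: str) -> str:
    """Hypothetical helper: imitate a renderer that drops diacritics."""
    # NFKD splits "ë" into a plain "e" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", text)
    # Discarding everything outside ASCII removes the combining mark.
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Case 1: the character is pasted into the Title field as-is.
title = "Camille Saint-Saëns correspondence"
flattened = flatten_to_ascii(title)

# Case 2: the character is typed as a numeric character reference.
# XML escaping turns the leading "&" into "&amp;", so the reference is
# no longer a reference and shows up literally in the output.
pre_encoded = "Camille Saint-Sa&#235;ns correspondence"
escaped = escape(pre_encoded)

print(flattened)
print(escaped)
```

The second case is why pre-encoding the character in the Title field does
not help: the stored value is treated as literal text and escaped again on
export.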

Thanks,
--Alex

On Tue, Dec 11, 2018 at 12:57 PM Zalduendo, Ines <izalduendo at gsd.harvard.edu>
wrote:

> Thanks Benn for sending this along.
>
> The same is going on with Japanese characters. They display correctly in
> ArchivesSpace but the PDF doesn’t display them.
>
> Here’s an example:
> https://hollisarchives.lib.harvard.edu/repositories/7/resources/201 (top
> right button for PDF)
>
> I never reported this to the users group, but am glad others are
> interested in this being looked into. I was told core developers already
> know about this.
>
> Ines
>
>
>
> Special Collections Archivist / Frances Loeb Library / Harvard University
> Graduate School of Design / 48 Quincy Street, Cambridge, MA 02138 / T.
> 617.496.1300
>
>
>
> *From:* archivesspace_users_group-bounces at lyralists.lyrasis.org <
> archivesspace_users_group-bounces at lyralists.lyrasis.org> *On Behalf Of *Benn
> Joseph
> *Sent:* Tuesday, December 11, 2018 11:19 AM
> *To:* Archivesspace Users Group <
> archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* [Archivesspace_Users_Group] diacritics in Title and Filing
> Title fields
>
>
>
> Not sure if there’s a ticket for this, but we’re seeing some tricky
> behavior with diacritics in both the Title and Filing Title fields when
> trying to print a PDF as a background job.
>
>
>
> Here’s an example: the collection name is “Camille Saint-Saëns
> correspondence”, and the umlaut displays correctly in the public interface.
>
>
>
> If this text is input into the Title field without any character encoding,
> i.e. if the “ë” is just pasted in there, then when I print a PDF as a
> background job in the staff interface it shows up like this:
>
>
>
> “Camille Saint-Sae#ns correspondence”
>
>
>
> If I encode the character, whether as an HTML entity (&euml;) or as a
> numeric character reference (&#235;), the title ends up looking like this
> in the PDF output:
>
>
>
> “Camille Saint-Sa&#235;ns correspondence”
>
>
>
> …because the ampersand gets converted to “&amp;” in the XML and ends up as
> “&amp;#235;”. I’m not seeing this behavior in any other fields though.
> Does this mean that no diacritics are allowed in the Title fields? Or, am I
> just inputting this wrong? When generating a PDF from the public interface,
> it seems to remove the encoding entirely, so the title fields end up as
> “Saint-Saens” in each case--although I understand that PDF creation process
> to be different than the one done as a background job.
>
>
>
> Thanks!
>
> --Benn
>
>
>
> *Benn Joseph*
>
> Head of Archival Processing
>
> Northwestern University Libraries
>
> Northwestern University
>
> www.library.northwestern.edu
>
> benn.joseph at northwestern.edu
>
> 847.467.6581
>


-- 
Alexander Duryee
Metadata Archivist
New York Public Library
(917)-229-9590
alexanderduryee at nypl.org