[Archivesspace_Users_Group] PDF export issue
Custer, Mark
mark.custer at yale.edu
Mon Oct 5 10:27:46 EDT 2020
Dear Hitomi,
This process will only work for the staff-side PDFs, since the PUI PDFs do not use the XSL-FO process (the staff-side files are converted from EAD to PDF, whereas the public-side files are converted from HTML to PDF). I haven’t looked into how to update the public process to add new fonts, so I can’t help with that unfortunately.
For the staff side (where the “as-ead-pdf.xsl” and “fop-config.xml” files need to be updated if new fonts need to be added), it looks like you need to further update your fop-config.xml file so that it knows about the new font files. Right now, that file still references the NotoSerif fonts (lines 11-22) but doesn’t mention the font family that is referred to as “IPAex” in the updated XSL file. There should be other ways to set this up so that everything is auto-detected and you can set font preferences elsewhere, but if the font family is named in the XSL file that converts the EAD to XSL-FO, I believe that it needs to be explicitly declared in your config file as well (but it’s been a while since I’ve worked with those FOP config files).
Anyhow, here’s an example of what I added to my FOP config file to get things to work for the staff-side PDF, where I set the font-family name to “Ipam” in the as-ead-pdf.xsl file:
<font embed-url="ipam.ttf">
<font-triplet name="Ipam" style="normal" weight="normal"/>
</font>
I didn’t specify all of the font files, as I just wanted to confirm that it worked, but additional settings would likely be required depending on how those fonts work. For our PDF files, for instance, I’ve explicitly referenced a number of different font files. See: https://github.com/YaleArchivesSpace/EAD3-to-PDF-UA/blob/master/fop.xconf
If you add something like that to your fop-config.xml file, replacing the font name “Ipam” with “IPAex” (and pointing embed-url at your IPAex font file), then I believe it should work as expected for your staff-side PDFs.
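As a rough sketch of that substitution (not taken from the attached files): the entry below assumes the IPAex Mincho file from the CTAN package is named ipaexm.ttf, that it sits where the relative embed-url can be resolved, and that the XSL file uses the family name “IPAex”. The extra triplets that point at the same file are an optional assumption, in case text styled bold or italic would otherwise be substituted with a different font.

```xml
<!-- Goes inside the existing <fonts> element of fop-config.xml,
     alongside the NotoSerif entries.
     The file name and its location are assumptions. -->
<font embed-url="ipaexm.ttf">
  <!-- name must match the font-family value used in as-ead-pdf.xsl -->
  <font-triplet name="IPAex" style="normal" weight="normal"/>
  <!-- Optional: reuse the same file for the other style/weight combinations -->
  <font-triplet name="IPAex" style="normal" weight="bold"/>
  <font-triplet name="IPAex" style="italic" weight="normal"/>
  <font-triplet name="IPAex" style="italic" weight="bold"/>
</font>
```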
Mark
From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Hitomi Matsuyama
Sent: Monday, 05 October, 2020 1:15 AM
To: 'Archivesspace Users Group' <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] PDF export issue
Dear Mark and all,
After trying the FOP approach that Mark suggested, we are still stuck on this issue.
Attached are the as-ead-pdf.xsl file, the fop-config.xml file, and the PDFs generated from the PUI (public_print.pdf) and from the staff side (staff_print.pdf).
We are using ArchivesSpace v2.8.0 and have specified Japanese as the language of description.
Do you have any idea what we missed?
If you need further information on our settings, let me know.
Thank you in advance!
Hitomi Matsuyama, Audiovisual Archivist
Nakanoshima Museum of Art, Osaka
1-1-86-8F Noda, Fukushima-ku
Osaka 553-0005 JAPAN
tel. +81 (0)6 64 69 51 93
email. matsuyama at nak-osaka.jp
From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Hitomi Matsuyama
Sent: Wednesday, September 30, 2020 5:21 PM
To: 'Archivesspace Users Group' <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] PDF export issue
Thank you very much Mark, Steve, and Maura for your prompt responses!
We will follow Mark’s advice. Thanks a lot.
All the best,
Hitomi
From: archivesspace_users_group-bounces at lyralists.lyrasis.org On Behalf Of Custer, Mark
Sent: Wednesday, September 30, 2020 3:12 AM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] PDF export issue
Steve, all:
It looks like it does cause an issue for the PUI PDFs, as well, at least with an example that I just tested thanks to Maura Carbone, who provided a sample bit of text to try (hi, Maura!). See: http://test.archivesspace.org/repositories/2/archival_objects/3909
When I tried the PUI PDF option in that case, the contents of the note were entirely missing (whereas on the staff side, the process replaces any glyphs that are not found in the current font with the "#" character). The PUI PDF process is handled quite differently, though, since that process goes from HTML to PDF.
In both cases, it seems like there should be easier configuration options, since there's no font (or even font family) that's going to cover all character sets.
With Apache FOP, which is used on the staff side, you can configure FOP to auto-detect fonts but you'd still need to make sure to add the fonts where FOP can find them. That said, since the "stylesheets" directory in ASpace is not part of the WAR files, I'd think you could just update those files on the server without too much trouble (see https://xmlgraphics.apache.org/fop/2.1/fonts.html#bulk for some info on that). Then you'd just need to update the transformation file, which is also in that "stylesheets" directory.
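A minimal sketch of what that bulk registration could look like in a FOP configuration file, based on the FOP font documentation linked above. The directory path is hypothetical, and this is not the stock ArchivesSpace fop-config.xml. Fonts registered this way should be addressed by the family name stored inside each font file, so the font-family value in the transformation file would need to match that name.

```xml
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- Register every font file found under this directory
             (hypothetical path; adjust to the server's layout) -->
        <directory recursive="true">/path/to/archivesspace/stylesheets/fonts</directory>
        <!-- Also scan the fonts installed on the operating system -->
        <auto-detect/>
      </fonts>
    </renderer>
  </renderers>
</fop>
```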
I just did that to test things out and that worked. Example:
* Download new fonts, e.g. https://ctan.org/pkg/ipaex
* Add those to the fop-config.xml file
* Update the as-ead-pdf.xsl file to refer to the new fonts (and this last bit could be handled with a parameter or other means). That said, it would be ideal in ASpace to be able to specify the language contents of the description to clearly indicate the language and scripts that are in use, especially if you want to switch between fonts for different scripts, etc.
Example screenshot attached.
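The last step above, handled with a parameter, might look roughly like the standalone stylesheet below. It is only a sketch: the parameter name pdf-font-family is hypothetical, the page setup is a bare minimum, and the real as-ead-pdf.xsl is organized differently. The family names in the list must match names that FOP knows about through fop-config.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical parameter: font families to try, in order of preference -->
  <xsl:param name="pdf-font-family" select="'IPAex, NotoSerif'"/>

  <xsl:template match="/">
    <!-- font-family is inherited, so setting it on fo:root covers the whole document -->
    <fo:root font-family="{$pdf-font-family}">
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="11in" page-width="8.5in" margin="1in">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <!-- Dump the text of the source document into a single block -->
          <fo:block><xsl:value-of select="."/></fo:block>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
</xsl:stylesheet>
```

With a parameter like this, a different font list could be passed in at transformation time without editing the stylesheet itself.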
Mark
________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org on behalf of Majewski, Steven Dennis (sdm7g) <sdm7g at virginia.edu>
Sent: Tuesday, September 29, 2020 12:41 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: Re: [Archivesspace_Users_Group] PDF export issue
Yes, a sample would be useful to try to reproduce the issue. It would also be interesting to know if both the staff PDF export and the PUI PDF show the same problems. - Steve M.
On Sep 29, 2020, at 11:42 AM, Custer, Mark <mark.custer at yale.edu> wrote:
Dear Hitomi Matsuyama,
Which version of ArchivesSpace are you using?
I'm not familiar with those configuration settings, but I suspect that they might just be for the PDF formats of the Reports, not for the PDF format of the EAD conversion process.
On newer releases of ArchivesSpace, the default font for the EAD to PDF conversion has been updated to use the NotoSerif font family. See: https://github.com/archivesspace/archivesspace/blob/master/stylesheets/fop-config.xml, https://github.com/archivesspace/archivesspace/tree/master/stylesheets/fonts, and https://github.com/archivesspace/archivesspace/blob/5da6428562b65493fc087fe3543c4d292f10ff0e/stylesheets/as-ead-pdf.xsl#L124
Also, I am assuming that you are referring to the EAD to PDF process in the ArchivesSpace staff interface, i.e. Export --> Generate PDF. If that's not right, just let me know. If that is right, could you share a sample EAD file that could be used for testing with a different font?
All my best,
Mark
________________________________
From: archivesspace_users_group-bounces at lyralists.lyrasis.org on behalf of Hitomi Matsuyama <matsuyama at nak-osaka.jp>
Sent: Tuesday, September 29, 2020 4:26 AM
To: archivesspace_users_group at lyralists.lyrasis.org
Subject: [Archivesspace_Users_Group] PDF export issue
Hello all,
We've had some trouble with PDF export in Japanese.
Descriptive information written in Japanese characters (hiragana, katakana, and kanji) is not exported to the ArchivesSpace-formatted finding aids at all.
Our IT staff has tried modifying config.rb as follows to add some Japanese-specific fonts:
AppConfig[:report_pdf_font_paths] = proc { ["#{AppConfig[:backend_url]}/reports/static/fonts/ipa/ipag.ttf"] }
AppConfig[:report_pdf_font_family] = "IPAexゴシック, \"IPA Pゴシック\", \"ヒラギノ角ゴ ProN W3\", \"Hiragino Kaku Gothic ProN\", メイリオ, Meiryo, \"MS Pゴシック\", sans-serif"
However, this doesn’t work, and the Japanese description is still missing from the PDF.
Have any users working in non-alphabetic scripts faced the same problem?
If it has already been solved, please let us know how you got past it.
Thank you!
Hitomi Matsuyama, Audiovisual Archivist
Nakanoshima Museum of Art, Osaka
1-1-86-8F Noda, Fukushima-ku
Osaka 553-0005 JAPAN
tel. +81 (0)6 64 69 51 93
email. matsuyama at nak-osaka.jp
_______________________________________________
Archivesspace_Users_Group mailing list
Archivesspace_Users_Group at lyralists.lyrasis.org
http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group