[Archivesspace_Users_Group] Check for Broken URLs Report Plugin
Corey Schmidt
Corey.Schmidt at uga.edu
Tue Jul 27 14:59:05 EDT 2021
Dear all,
Hello, this is Corey, ArchivesSpace PM at the University of Georgia. I hope everyone is well, healthy, and staying cool!
I'm excited to say that we at UGA have created our first custom report plugin for ArchivesSpace and want to share it with the community. The report finds and returns broken URLs in note fields across all repositories in an ArchivesSpace instance. It checks notes from resources, archival objects, digital objects, and digital object components, as well as digital object file versions (URLs), subject scope and contents, and agents (person, corporate entity, family, and software). We've used it to find the standard 404 errors, but also other fun ones like 403s and malformed links.
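As a rough illustration of the kind of check described above, here is a minimal Python sketch. It is not the plugin's code (that lives in the GitHub repository linked in this message); the function names is_malformed, http_status, and check_url are hypothetical, and the sketch assumes that any HTTP status of 400 or above counts as broken.

```python
from urllib.parse import urlparse
import urllib.error
import urllib.request


def is_malformed(url):
    """Return True if the URL lacks an http(s) scheme or a host."""
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse rejects some inputs outright, e.g. a bad IPv6 literal.
        return True
    return parts.scheme not in ("http", "https") or not parts.netloc


def http_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 or 403
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, refused connection, etc.


def check_url(url, fetch_status=http_status):
    """Classify a URL as 'malformed', 'unreachable', 'broken (<code>)', or 'ok'.

    fetch_status is injectable so the logic can be tried without a network.
    """
    if is_malformed(url):
        return "malformed"
    status = fetch_status(url)
    if status is None:
        return "unreachable"
    if status >= 400:  # assumption: every 4xx/5xx response counts as broken
        return f"broken ({status})"
    return "ok"
```

Calling check_url on each URL pulled from a note would give one result per link. Some servers refuse HEAD requests, so a real checker might need to fall back to GET before reporting a link as broken.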
You can find the code for the plugin here; just download the check_urls folder: https://github.com/uga-libraries/uga-archivesspace-reports. Information on how to install an ArchivesSpace plugin is available here: https://archivesspace.github.io/tech-docs/customization/plugins.html.
The plugin isn't perfect: the report has to be exported in CSV format, so if you install and test it, please set the report format to CSV. Additionally, because it performs many lookups, expect the report to run for a long time. We have over 5,000 resources across five repositories, and it takes just under an hour to complete. Lastly, there is currently no way to limit which repositories or notes are checked. Filtering results is best done in Excel by clicking on the third header row and using the Data > Filter feature. If anyone has advice on how to do that in ASpace, I would greatly appreciate the feedback.
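For anyone who would rather filter the exported CSV with a script than in Excel, here is a minimal Python sketch. It assumes the column names sit on the third row of the file (matching the "third header row" mentioned above). The function name filter_rows, the file name, and the column name "repository" are hypothetical, so check them against your own export before relying on this.

```python
import csv


def filter_rows(lines, column, wanted, header_row_index=2):
    """Return rows (as dicts) whose `column` value equals `wanted`.

    lines is any iterable of CSV text lines, such as an open file.
    header_row_index is the zero-based row holding the column names
    (assumed here to be the third row); rows above it are skipped.
    """
    rows = list(csv.reader(lines))
    header = rows[header_row_index]
    matches = []
    for row in rows[header_row_index + 1:]:
        record = dict(zip(header, row))
        if record.get(column) == wanted:
            matches.append(record)
    return matches


# Example usage (file and column names are placeholders):
# with open("check_urls_report.csv", newline="", encoding="utf-8") as handle:
#     for record in filter_rows(handle, "repository", "Repo A"):
#         print(record)
```

Passing an open file rather than a path keeps the function easy to try on a few pasted lines before pointing it at a full report.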
A special thanks to Dallas Pillen, who helped us solve the last puzzle of exporting the data in a usable fashion, and Alicia Detelich for her awesome tutorial on how to make a custom reports plugin (https://www.youtube.com/watch?v=ruRWpOGaj1A) and general advice. For anyone else I missed, thank you for your advice and patience.
Please reach out if you have any questions or feedback on the plugin, or to let us know if you find it useful.
Thanks,
Corey
Corey Schmidt
University of Georgia Special Collections Libraries | ArchivesSpace Project Manager
706-542-8151 | Corey.Schmidt at uga.edu