[Archivesspace_Users_Group] Auto stemming search results on AS 2.5

Custer, Mark mark.custer at yale.edu
Mon Oct 8 15:07:37 EDT 2018


Hi, Aditi.

So, I don’t know how best to approach this, but I can say how we’ve approached it at Yale for the time being. We’ve updated our solr/schema.xml file to use the Krovetz stemmer, which is a non-aggressive stemmer that works well for English since it’s primarily dictionary-based. Here’s an example of how we changed that: https://github.com/fordmadox/archivesspace/blob/2.4.1.yale.hm.mm/solr/schema.xml#L338-L357 (the stemmer is used both at index time, at line 347, and at query time, at line 355).
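
As a rough sketch of the kind of schema.xml change described above (this is not a copy of the linked file; the field type name, tokenizer, and lowercase filter shown here are assumptions for illustration), a text field type that applies Solr’s Krovetz stemmer, solr.KStemFilterFactory, at both index and query time might look something like this:

```xml
<!-- Hypothetical field type name: adapt to the text field type already defined in your solr/schema.xml -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- KStem expects lowercased tokens, so lowercase first -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Krovetz stemmer applied when records are indexed -->
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Same stemmer applied to the user's query terms -->
    <filter class="solr.KStemFilterFactory"/>
  </analyzer>
</fieldType>
```

With the same stemmer in both analyzers, “invoice” and “invoices” should reduce to the same indexed term, so either query can match. A change to the index-time analyzer generally only takes effect for existing records after a full reindex.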

Please note that this tactic affects both the staff and public interfaces, though the one wrinkle we’ve discovered so far is in the staff interface. The typeahead feature there handles its query a bit differently, and that query is not stemmed (I’ve been meaning to look into what would need to change here, but I haven’t done that yet). So, if you use the typeahead in our staff interface to look for something like “scrapbooks” when trying to link to a heading, the match stops working as soon as you type that last “s”. The problem is that the typeahead query is not being stemmed, whereas it’s searching an index that has been stemmed. Our workaround is just to type “scrapbook” to find and link to “scrapbooks”. Not ideal, but it beats expecting users or staff to search for “invoice” OR “invoices” to get the set of results that they’d actually expect.

I’m definitely a proponent of the idea that ArchivesSpace (or any discovery service) should include some sort of stemming out of the box, but right now that’s not the case.

Anyhow, I’m curious to hear what sort of approach you take, so please let us know.

Mark



From: archivesspace_users_group-bounces at lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces at lyralists.lyrasis.org] On Behalf Of Aditi Worcester
Sent: Monday, 08 October, 2018 12:11 PM
To: Archivesspace Users Group <archivesspace_users_group at lyralists.lyrasis.org>
Subject: [Archivesspace_Users_Group] Auto stemming search results on AS 2.5

Hello,
We’re trying to auto stem our search results on AS 2.5 and don’t quite know how!
For instance, a search for “invoice” on the PUI returns “no results”. A search for “invoices” returns 231 results.
Any suggestions on how best to approach this?
Thank you in advance for your time and help.
Aditi
-----------
Aditi Worcester
Processing Archivist, Kellogg Library
California State University San Marcos
760-750-8359 | aworcester at csusm.edu

