<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:#0563C1;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:#954F72;
text-decoration:underline;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-margin-top-alt:auto;
margin-right:0in;
mso-margin-bottom-alt:auto;
margin-left:0in;
font-size:11.0pt;
font-family:"Calibri",sans-serif;}
span.EmailStyle18
{mso-style-type:personal;
font-family:"Calibri",sans-serif;}
span.EmailStyle19
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:#1F497D;}
span.EmailStyle20
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
font-weight:normal;
font-style:normal;
text-decoration:none none;}
span.EmailStyle21
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:#1F497D;}
span.EmailStyle22
{mso-style-type:personal;
font-family:"Calibri",sans-serif;}
span.EmailStyle23
{mso-style-type:personal;
font-family:"Calibri",sans-serif;
color:windowtext;}
span.EmailStyle24
{mso-style-type:personal-compose;
font-family:"Calibri",sans-serif;
color:windowtext;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="#0563C1" vlink="#954F72">
<div class="WordSection1">
<p class="MsoNormal">Hi, Aditi.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I don’t know the best way to approach this, but I can describe what we’ve done at Yale for the time being. We’ve updated our solr/schema.xml file to use the Krovetz stemmer, a non-aggressive stemmer that works well for
English because it’s primarily dictionary-based. Here’s an example of how we changed that:
<a href="https://github.com/fordmadox/archivesspace/blob/2.4.1.yale.hm.mm/solr/schema.xml#L338-L357">
https://github.com/fordmadox/archivesspace/blob/2.4.1.yale.hm.mm/solr/schema.xml#L338-L357</a> (the stemmer is applied both at index time, at line 347, and at query time, at line 355).
<o:p></o:p></p>
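<p class="MsoNormal"><o:p>&nbsp;</o:p></p>
<p class="MsoNormal">For anyone who can’t follow the link, here is a minimal sketch of what that kind of change can look like. It is not a copy of our file: the field type name, tokenizer, and stop-word filter shown here are assumptions for illustration, so check them against your own schema.xml. The only piece that matters is the solr.KStemFilterFactory line, which is Solr’s Krovetz stemmer, added to both the index and the query analyzer. It sits after the lowercase filter because KStem expects lowercased input.<o:p></o:p></p>
<pre>
&lt;!-- Illustrative only: the name "text_general" and the surrounding filters are assumed --&gt;
&lt;fieldType name="text_general" class="solr.TextField" positionIncrementGap="100"&gt;
  &lt;analyzer type="index"&gt;
    &lt;tokenizer class="solr.StandardTokenizerFactory"/&gt;
    &lt;filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
    &lt;!-- Krovetz stemmer at index time: "invoices" is indexed as "invoice" --&gt;
    &lt;filter class="solr.KStemFilterFactory"/&gt;
  &lt;/analyzer&gt;
  &lt;analyzer type="query"&gt;
    &lt;tokenizer class="solr.StandardTokenizerFactory"/&gt;
    &lt;filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/&gt;
    &lt;filter class="solr.LowerCaseFilterFactory"/&gt;
    &lt;!-- Same stemmer at query time, so query terms reduce to the same stems --&gt;
    &lt;filter class="solr.KStemFilterFactory"/&gt;
  &lt;/analyzer&gt;
&lt;/fieldType&gt;
</pre>
<p class="MsoNormal">Because the index-time analysis changes, existing records need to be reindexed before the stemmed terms show up in search results.<o:p></o:p></p>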
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Please note that this tactic affects both the staff and public interfaces. Well, I should say that it primarily affects the staff interface. One wrinkle that we’ve discovered is that the typeahead feature in the staff interface handles its
query a bit differently, and that query is not stemmed (I’ve been meaning to look into what would need to change here, but I haven’t done that yet). So, if you use the typeahead for something like “scrapbooks” in our staff interface when
trying to link to a heading, the lookup stops matching as soon as you type that last “s”. The problem is that the unstemmed typeahead query is searching an index that has been stemmed. Our workaround is just to type “scrapbook” to
find and link to “scrapbooks”. Not ideal, but it beats expecting users or staff to search for “invoice” OR “invoices” to get the set of results that they’d actually expect.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">I’m definitely of the opinion that ArchivesSpace (or any discovery service) should include some sort of stemming out of the box, but right now that’s not the case.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Anyhow, I’m curious to hear what sort of approach you take, so please let us know.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Mark<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b>From:</b> archivesspace_users_group-bounces@lyralists.lyrasis.org [mailto:archivesspace_users_group-bounces@lyralists.lyrasis.org]
<b>On Behalf Of </b>Aditi Worcester<br>
<b>Sent:</b> Monday, 08 October, 2018 12:11 PM<br>
<b>To:</b> Archivesspace Users Group &lt;archivesspace_users_group@lyralists.lyrasis.org&gt;<br>
<b>Subject:</b> [Archivesspace_Users_Group] Auto stemming search results on AS 2.5<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Hello,<o:p></o:p></p>
<p class="MsoNormal">We’re trying to auto stem our search results on AS 2.5 and don’t quite know how!<o:p></o:p></p>
<p class="MsoNormal">For instance, a search for “invoice” on the PUI returns “no results”. A search for “invoices” returns 231 results.
<o:p></o:p></p>
<p class="MsoNormal">Any suggestions on how best to approach this? <o:p></o:p></p>
<p class="MsoNormal">Thank you in advance for your time and help.<o:p></o:p></p>
<p class="MsoNormal">Aditi<o:p></o:p></p>
<p class="MsoNormal">-----------<o:p></o:p></p>
<p class="MsoNormal">Aditi Worcester<o:p></o:p></p>
<p class="MsoNormal">Processing Archivist, Kellogg Library<o:p></o:p></p>
<p class="MsoNormal">California State University San Marcos<o:p></o:p></p>
<p class="MsoNormal">760-750-8359 | <a href="mailto:aworcester@csusm.edu">aworcester@csusm.edu</a>
<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</body>
</html>