[Archivesspace_Users_Group] Bug in PUI search filtering?

Andrew Morrison andrew.morrison at bodleian.ox.ac.uk
Thu Mar 25 04:04:20 EDT 2021


This sounds similar to some observations I made last year in a comment 
on the fix that made defaulting to AND between multiple search terms 
work in the PUI:

https://github.com/archivesspace/archivesspace/pull/1861#issuecomment-620676731

Basically, what I believe is happening is that old versions of Solr,
including the one packaged with ArchivesSpace, ignore the choice of
implicit AND between two or more search terms (as set by q.op or mm in
AppConfig[:solr_params]) when they receive a query that includes
explicit Boolean logic. That is the case when you add date filters,
because the ArchivesSpace backend builds a query to send to Solr in
which whatever you enter as a user is AND'ed with the date range. The
number of hits goes up even further when you remove the date filter,
because the empty &filter_q[]= left in the URL means the backend is
AND'ing the user's query with a search for *. Newer versions of Solr
allow the q.op or mm parameters to apply within subqueries, which
removes this issue.
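
To make the explanation above concrete, here is a toy Python model of
it. It is only a sketch of my reading of the behaviour, not of the
ArchivesSpace backend or of Solr's parser: the function names, the
"years" field and the range syntax are all hypothetical, and the
"falls back to OR" rule is the hypothesis being described, not
something taken from Solr's source.

```python
def build_solr_query(user_query, filter_q=None, from_year="", to_year=""):
    """Compose a query string in the shape described above.

    The `years` field name and the range syntax are hypothetical
    placeholders; only the "user terms AND filter" shape comes from
    the description.
    """
    clauses = [f"({user_query})"]
    if filter_q is not None:
        # An empty filter_q[]= in the URL becomes a match-everything clause.
        clauses.append(f"({filter_q or '*'})")
    if from_year or to_year:
        clauses.append(f"years:[{from_year or '*'} TO {to_year or '*'}]")
    return " AND ".join(clauses)


def operator_between_user_terms(query, default_op="AND",
                                applies_in_subqueries=False):
    """Toy model: which operator ends up joining the user's bare terms."""
    has_explicit_boolean = " AND " in query or " OR " in query
    if has_explicit_boolean and not applies_in_subqueries:
        # Hypothesised old-Solr behaviour: the configured default is
        # ignored and the bare terms are joined with OR instead.
        return "OR"
    return default_op


plain = build_solr_query("fashion designers")
filtered = build_solr_query("fashion designers", filter_q="",
                            from_year="1950", to_year="1960")
print(plain, "->", operator_between_user_terms(plain))
print(filtered, "->", operator_between_user_terms(filtered))
```

In this model the plain search keeps the configured AND, while any
search that carries a filter clause (even an empty one) loses it unless
applies_in_subqueries is set, which stands in for the newer Solr
behaviour.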

To confirm, are the people who see this effect running ArchivesSpace 
2.8.0 or higher with internal Solr (i.e. is AppConfig[:enable_solr] set 
to true in config.rb)? Or, if using external Solr, is the version lower 
than 5.5?
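
For anyone checking an external Solr against that threshold, a small
Python helper along these lines would do the comparison. The function
names are hypothetical, the 5.5 cut-off is simply the one I suggest
above rather than something I have verified, and the version strings
in the example are only illustrative.

```python
def parse_version(version):
    """Turn a dotted version string such as '4.10.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def solr_predates_threshold(version, threshold="5.5"):
    """Return True when `version` is lower than `threshold`."""
    return parse_version(version) < parse_version(threshold)


for candidate in ("4.10.4", "5.5", "8.11.1"):
    print(candidate, solr_predates_threshold(candidate))
```

Comparing the parts as integers rather than as strings matters here,
because a plain string comparison would sort "4.10" before "4.9".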

I'm afraid I never got round to raising a JIRA ticket for this, partly
because it doesn't actually affect us (we run external Solr and have
also opted for implicit OR) and partly because there is probably
nothing that can be done to fix it, short of upgrading the version of
Solr that comes with ArchivesSpace; as I understand it, another change
made in Solr after 4.10 makes that difficult.

Andrew.


On 25/03/2021 01:13, Brian Hoffman wrote:
>
> Responding to Anna’s original message, I think this issue in JIRA 
> captures what is probably going on when a two-word search yields more 
> (rather than fewer) results after filtering by another word:
>
> https://archivesspace.atlassian.net/browse/ANW-1240 
>
> *From: *<archivesspace_users_group-bounces at lyralists.lyrasis.org> on 
> behalf of Blake Carver <blake.carver at lyrasis.org>
> *Reply-To: *Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Date: *Wednesday, March 24, 2021 at 4:39 PM
> *To: *Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject: *Re: [Archivesspace_Users_Group] Bug in PUI search filtering?
>
> > Blake, any insight?
>
> Nope, it does look like a bug to me, but I'm not sure on that. I'll 
> look into it, and get a JIRA in if that's the case.
>
> ------------------------------------------------------------------------
>
> *From:*archivesspace_users_group-bounces at lyralists.lyrasis.org 
> <archivesspace_users_group-bounces at lyralists.lyrasis.org> on behalf of 
> Anna Robinson-Sweet <robinsa1 at newschool.edu>
> *Sent:* Wednesday, March 24, 2021 4:03 PM
> *To:* Archivesspace Users Group 
> <archivesspace_users_group at lyralists.lyrasis.org>
> *Subject:* Re: [Archivesspace_Users_Group] Bug in PUI search filtering?
>
> That is indeed what seems to be happening. For us it's not only with 
> the date filter, but also the free text filter.
>
> Search for fashion designers (113 results): 
> https://findingaids.archives.newschool.edu/search?utf8=%E2%9C%93&op%5B%5D=&q%5B%5D=fashion+designers&limit=&field%5B%5D=&from_year%5B%5D=&to_year%5B%5D=&commit=Search 
>
> When I type critics into the filter search bar, 136 results: 
> https://findingaids.archives.newschool.edu/search?utf8=%E2%9C%93&op%5B%5D=&q%5B%5D=fashion+designers&limit=&field%5B%5D=&from_year%5B%5D=&to_year%5B%5D=&action=search&sort=&filter_q%5B%5D=critics&filter_from_year=&filter_to_year=&commit=Search 
>
> On Wed, Mar 24, 2021 at 4:00 PM Michelle Paquette
> <mpaquette at smith.edu> wrote:
>
>     Ha, and actually now I've tested it in my own system, and it
>     does seem that adding the date filter switches the search to OR.
>     It doesn't seem to happen with any other filters, though.
>     Blake, any insight?
>
>     On Wed, Mar 24, 2021 at 3:56 PM Michelle Paquette
>     <mpaquette at smith.edu> wrote:
>
>         Hi Anna,
>
>         I'm wondering if something is happening to the default AND
>         boolean - if you search fashion OR designers in the first
>         place you're getting 5998 results, so it seems when you remove
>         the date filter it switches from an AND to an OR search.
>
>         Michelle
>
>         On Wed, Mar 24, 2021 at 3:41 PM Anna Robinson-Sweet
>         <robinsa1 at newschool.edu> wrote:
>
>             Hello,
>
>             We just noticed what appears to be a bug in applying
>             filters to search results in the PUI. When we run a search
>             on a phrase that is more than one word, and then apply a
>             filter to those results, we get more results in the
>             "filtered" search than in the initial one. When we remove
>             the filter, there are even more results than either the
>             initial or filtered search. This only happens when the
>             initial search phrase is more than one word and is not in
>             quotes, suggesting that the filter turns the initial
>             search from an "and" query to an "or" query, or something
>             along these lines.
>
>             Here is an example of the problem:
>
>             First search for the phrase fashion designers (without
>             quotes) returns 113 results:
>             https://findingaids.archives.newschool.edu/search?utf8=%E2%9C%93&op%5B%5D=&q%5B%5D=fashion+designers&limit=&field%5B%5D=&from_year%5B%5D=&to_year%5B%5D=&commit=Search
>
>             When I add a date filter 1950-1960 I get 393 results:
>             https://findingaids.archives.newschool.edu/search?utf8=%E2%9C%93&op%5B%5D=&q%5B%5D=fashion+designers&limit=&field%5B%5D=&from_year%5B%5D=&to_year%5B%5D=&action=search&sort=&filter_q%5B%5D=&filter_from_year=1950&filter_to_year=1960&commit=Search
>
>             And when I remove the filter I get 5,998 results:
>             https://findingaids.archives.newschool.edu/search?q[]=fashion+designers&op[]=&field[]=keyword&from_year[]=&to_year[]=&filter_q[]=&sort=
>
>             Has anyone else encountered this or have any ideas what
>             may be causing the problem?
>
>             Thanks,
>
>             Anna
>
>             -- 
>
>             *ANNA ROBINSON-SWEET (she/her)*
>
>             ASSOCIATE ARCHIVIST
>             THE NEW SCHOOL
>             ARCHIVES & SPECIAL COLLECTIONS
>
>             66 FIFTH AVENUE, NEW YORK, NY 10011
>             robinsa1 at newschool.edu
>             *T* 212-229-2942 3375
>
>             archives.newschool.edu
>
>             _______________________________________________
>             Archivesspace_Users_Group mailing list
>             Archivesspace_Users_Group at lyralists.lyrasis.org
>             <mailto:Archivesspace_Users_Group at lyralists.lyrasis.org>
>             http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group
>             <http://lyralists.lyrasis.org/mailman/listinfo/archivesspace_users_group>
>
>
>         -- 
>
>         Michelle Paquette
>
>         (she/her)
>
>         Metadata & Technical Services Archivist
>
>         Special Collections
>
>         Smith College
>
>         413-585-7029
>
>         mpaquette at smith.edu
>
>         Please note: In light of COVID-19, the Libraries are offering
>         contactless pickup, and all other services will continue to be
>         offered remotely. Visit bit.ly/SCLcovid-19
>         <http://bit.ly/SCLcovid-19> for full details.
>
>         Please send any questions you may have to
>         libraryhelp at smith.edu and they will be answered as soon as
>         possible. Special Collections reference service
>         <mailto:specialcollections at smith.edu> is active, but limited.
>         For information about Smith College’s response to Covid-19,
>         please visit the college’s official website
>         <https://www.smith.edu/student-life/health-wellness/coronavirus>.
>
>
>