[Archivesspace_Users_Group] Indexer failing on restart of service - anyone else seen this?

Wed Oct 21 08:09:21 EDT 2015

For anyone else running a clustered setup, this was the solution to our issue.

Joshua

From: Mark Triggs
Date: Tuesday, October 13, 2015 at 6:47 PM
To: "joshua.d.shaw at dartmouth.edu"
Cc: Archivesspace Users Group
Subject: Re: [Archivesspace_Users_Group] Indexer failing on restart of service - anyone else seen this?

Hi Joshua,

Can I just check: do you have settings for the following in your config.rb?

 AppConfig[:search_user_secret] = "a"
 AppConfig[:public_user_secret] = "big"
 AppConfig[:staff_user_secret] = "secret"

You'll want to have strong passwords set for those in your `config/config.rb`, with the same values on each server. When ArchivesSpace starts up, it will make sure those system accounts have the right password set.
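
One possible way to produce such values is Ruby's standard `securerandom` library. This is only a sketch: `generate_secrets` and `config_lines` are hypothetical helpers for illustration, not part of ArchivesSpace. The idea would be to run it once and paste the same three lines into `config/config.rb` on every server.

```ruby
require 'securerandom'

# The three settings named above; every node must share the same values.
SECRET_KEYS = [:search_user_secret, :public_user_secret, :staff_user_secret].freeze

# Hypothetical helper: build one random hex secret per setting.
def generate_secrets(bytes = 32)
  SECRET_KEYS.each_with_object({}) do |key, secrets|
    secrets[key] = SecureRandom.hex(bytes) # yields 2 * bytes hex characters
  end
end

# Hypothetical helper: format the secrets as config.rb lines.
def config_lines(secrets)
  secrets.map { |key, value| "AppConfig[:#{key}] = \"#{value}\"" }
end

# Print the lines only when run directly as a script.
puts config_lines(generate_secrets) if __FILE__ == $PROGRAM_NAME
```

Generating the values once and copying them, rather than running the script on each node, is what keeps them identical across the cluster.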

Without these settings, each ArchivesSpace instance will generate its own random password upon startup and will change the password in the database for those accounts. When there are multiple nodes in the picture, that would have the effect of locking all but one of them out...
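
To make the lockout concrete, here is a toy model of that startup behaviour. It is not ArchivesSpace code: the `Node` class is invented for illustration, and a plain hash stands in for the shared database.

```ruby
require 'securerandom'

# Toy model only: each Node mimics an instance starting up against a
# shared database (here just a hash holding one account's password).
class Node
  def initialize(db, configured_secret = nil)
    @db = db
    # With no secret configured, pick a random password at startup...
    @password = configured_secret || SecureRandom.hex(16)
    # ...and overwrite the shared account's password with it.
    @db[:staff_system] = @password
  end

  # A login succeeds only if this node's password matches the database.
  def login_ok?
    @db[:staff_system] == @password
  end
end

# Four nodes, no shared secret: each start-up clobbers the previous password.
db = {}
nodes = Array.new(4) { Node.new(db) }
p nodes.map(&:login_ok?)   # => [false, false, false, true]

# Four nodes, same configured secret: every node can still log in.
shared_db = {}
shared = Array.new(4) { Node.new(shared_db, "same-secret-everywhere") }
p shared.map(&:login_ok?)  # => [true, true, true, true]
```

In this model only the last node to start keeps working, which matches the "all but one locked out" effect described above.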

I think this could be made clearer in the documentation. The example multi-tenant config file does have placeholders for the values you need to share across all nodes:

https://github.com/archivesspace/archivesspace/blob/master/clustering/files/archivesspace/tenants/_template/archivesspace/config/config.rb

but the CLUSTERING_README doesn't explicitly call them out (except for one cryptic remark :)

Cheers,

Mark

"Joshua D. Shaw" <Joshua.D.Shaw at dartmouth.edu> writes:

Hey All-

This may be related to some of the indexer problems that Maureen and others have noticed from time to time, but I'm throwing this out there in case others want to chime in with a solution (best!) or some insight.

Anyway, here's the scenario. Our setup for production is configured as shown below the fold. Basically 4 frontend servers behind an F5. We've noticed that when we do a stop/start of the frontends (for plugin updates, etc.), the indexer for one of the frontends sometimes fails. The logs for that server show a lot of errors of the following type:

 D, [2015-10-07T17:53:28.369000 #16298] DEBUG -- : Thread-9382: POST /users/staff_system/login [session: nil]
 D, [2015-10-07T17:53:28.377000 #16298] DEBUG -- : Thread-9382: Post-processed params: {:username=>"staff_system", :password=>"[FILTERED]", :expiring=>false}
 D, [2015-10-07T17:53:28.529000 #16298] DEBUG -- : Thread-9382: Responded with [403, {"Content-Type"=>"application/json", "Cache-Control"=>"private, must-revalidate, max-age=0", "Content-Length"=>"25"}, ["{\"error\":\"Login failed\"}\n"]]... in 161.0ms
 D, [2015-10-07T17:53:28.542000 #16298] DEBUG -- : Thread-9384: POST /update_monitor [session: nil]
 D, [2015-10-07T17:53:28.548000 #16298] DEBUG -- : Thread-9384: Post-processed params: {:active_edits=>#<JSONModel(:active_edits) {"jsonmodel_type"=>"active_edits", "active_edits"=>[]}>}
 D, [2015-10-07T17:53:28.550000 #16298] DEBUG -- : Thread-9384: Responded with [403, {"Content-Type"=>"application/json", "Cache-Control"=>"private, must-revalidate, max-age=0", "Content-Length"=>"26"}, ["{\"error\":\"Access denied\"}\n"]]... in 8.0ms
 D, [2015-10-07T17:53:28.558000 #16298] DEBUG -- : Thread-5526: POST /users/staff_system/login [session: nil]
 D, [2015-10-07T17:53:28.564000 #16298] DEBUG -- : Thread-5526: Post-processed params: {:username=>"staff_system", :password=>"[FILTERED]", :expiring=>false}
 D, [2015-10-07T17:53:28.723000 #16298] DEBUG -- : Thread-5526: Responded with [403, {"Content-Type"=>"application/json", "Cache-Control"=>"private, must-revalidate, max-age=0", "Content-Length"=>"25"}, ["{\"error\":\"Login failed\"}\n"]]... in 165.0ms
 D, [2015-10-07T17:53:32.879000 #16298] DEBUG -- : Thread-5518: POST /users/search_indexer/login [session: nil]
 D, [2015-10-07T17:53:32.885000 #16298] DEBUG -- : Thread-5518: Post-processed params: {:username=>"search_indexer", :password=>"[FILTERED]", :expiring=>false}
 D, [2015-10-07T17:53:33.050000 #16298] DEBUG -- : Thread-5518: Responded with [403, {"Content-Type"=>"application/json", "Cache-Control"=>"private, must-revalidate, max-age=0", "Content-Length"=>"25"}, ["{\"error\":\"Login failed\"}\n"]]... in 171.0ms
 #<RuntimeError: Authentication to backend failed: {"error":"Login failed"}

Sometimes a simple stop/start of that particular server solves the issue, but at other times we have had to replicate a known-good server over the bad one to clear things up.

Has anyone else seen something similar? Any thoughts on what a potential cause (and cure) might be?

--
Mark Triggs
<mark at dishevelled.net>
