Closed Bug 791783 Opened 12 years ago Closed 12 years ago

passenger dies on puppet dashboard

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

Symptoms:
 http://secure.pub.build.mozilla.org/puppetdash/
(and anything under it) returns a 404, and there's nothing in the error_log to speak of.  I did see

--------------------------------------
[ pid=26763 ] Backtrace with 34 frames:
PassengerHelperAgent[0x513f49]
/lib64/libpthread.so.0[0x33f600f500]
/lib64/libc.so.6(gsignal+0x35)[0x33f5c328a5]
/lib64/libc.so.6(abort+0x175)[0x33f5c34085]
/usr/lib64/libstdc++.so.6(_ZN9__gnu_cxx27__verbose_terminate_handlerEv+0x12d)[0x3f60cbea5d]
/usr/lib64/libstdc++.so.6[0x3f60cbcbe6]
/usr/lib64/libstdc++.so.6[0x3f60cbbb79]
/usr/lib64/libstdc++.so.6(__gxx_personality_v0+0x261)[0x3f60cbc5d1]
/lib64/libgcc_s.so.1[0x3f60810323]
/lib64/libgcc_s.so.1(_Unwind_Resume+0x77)[0x3f608103f7]
PassengerHelperAgent(_ZN5boost5mutex4lockEv+0x6b)[0x4a7865]
PassengerHelperAgent(_ZN5boost10lock_guardINS_5mutexEEC1ERS1_+0x2a)[0x4dfdac]
PassengerHelperAgent(_ZN9Passenger5Timer5startEv+0x23)[0x4ae6a9]
PassengerHelperAgent(_ZN18TimerUpdateHandler18clientDisconnectedERN9Passenger13MessageServer19CommonClientContextERN5boost10shared_ptrINS1_13ClientContextEEE+0x40)[0x4dbd1e]
PassengerHelperAgent(_ZN9Passenger13MessageServer29DisconnectEventBroadcastGuardD1Ev+0x91)[0x4b617f]
PassengerHelperAgent(_ZN9Passenger13MessageServer22clientHandlingMainLoopENS_14FileDescriptorE+0xc76)[0x4b7c3a]
PassengerHelperAgent(_ZNK5boost4_mfi3mf1IvN9Passenger13MessageServerENS2_14FileDescriptorEEclEPS3_S4_+0x7d)[0x509411]
PassengerHelperAgent(_ZN5boost3_bi5list2INS0_5valueIPN9Passenger13MessageServerEEENS2_INS3_14FileDescriptorEEEEclINS_4_mfi3mf1IvS4_S7_EENS0_5list0EEEvNS0_4typeIvEERT_RT0_i+0x79)[0x507a1f]
PassengerHelperAgent(_ZN5boost3_bi6bind_tIvNS_4_mfi3mf1IvN9Passenger13MessageServerENS4_14FileDescriptorEEENS0_5list2INS0_5valueIPS5_EENS9_IS6_EEEEEclEv+0x3f)[0x5056db]
PassengerHelperAgent(_ZN5boost6detail8function26void_function_obj_invoker0INS_3_bi6bind_tIvNS_4_mfi3mf1IvN9Passenger13MessageServerENS7_14FileDescriptorEEENS3_5list2INS3_5valueIPS8_EENSC_IS9_EEEEEEvE6invokeERNS1_15function_bufferE+0x23)[0x5000a9]
PassengerHelperAgent(_ZNK5boost9function0IvEclEv+0x73)[0x4e05eb]
PassengerHelperAgent(_ZN3oxt20dynamic_thread_group11thread_mainERN5boost8functionIFvvEEEPNS0_13thread_handleE+0x3a)[0x4acc48]
PassengerHelperAgent(_ZNK5boost4_mfi3mf2IvN3oxt20dynamic_thread_groupERNS_8functionIFvvEEEPNS3_13thread_handleEEclEPS3_S7_S9_+0x70)[0x5090b6]
PassengerHelperAgent(_ZN5boost3_bi5list3INS0_5valueIPN3oxt20dynamic_thread_groupEEENS2_INS_8functionIFvvEEEEENS2_IPNS4_13thread_handleEEEEclINS_4_mfi3mf2IvS4_RS9_SC_EENS0_5list0EEEvNS0_4typeIvEERT_RT0_i+0x88)[0x5074f4]
PassengerHelperAgent(_ZN5boost3_bi6bind_tIvNS_4_mfi3mf2IvN3oxt20dynamic_thread_groupERNS_8functionIFvvEEEPNS5_13thread_handleEEENS0_5list3INS0_5valueIPS5_EENSE_IS8_EENSE_ISB_EEEEEclEv+0x3f)[0x5051ab]
PassengerHelperAgent(_ZN5boost6detail8function26void_function_obj_invoker0INS_3_bi6bind_tIvNS_4_mfi3mf2IvN3oxt20dynamic_thread_groupERNS_8functionIFvvEEEPNS8_13thread_handleEEENS3_5list3INS3_5valueIPS8_EENSH_ISB_EENSH_ISE_EEEEEEvE6invokeERNS1_15function_bufferE+0x23)[0x4ff585]
PassengerHelperAgent(_ZNK5boost9function0IvEclEv+0x73)[0x4e05eb]
PassengerHelperAgent(_ZN3oxt6thread11thread_mainEN5boost8functionIFvvEEENS1_10shared_ptrINS0_11thread_dataEEE+0x64)[0x4a8cee]
PassengerHelperAgent(_ZN5boost3_bi5list2INS0_5valueINS_8functionIFvvEEEEENS2_INS_10shared_ptrIN3oxt6thread11thread_dataEEEEEEclIPFvS5_SB_ENS0_5list0EEEvNS0_4typeIvEERT_RT0_i+0x8c)[0x50c0c8]
PassengerHelperAgent(_ZN5boost3_bi6bind_tIvPFvNS_8functionIFvvEEENS_10shared_ptrIN3oxt6thread11thread_dataEEEENS0_5list2INS0_5valueIS4_EENSD_IS9_EEEEEclEv+0x3f)[0x50c033]
PassengerHelperAgent(_ZN5boost6detail11thread_dataINS_3_bi6bind_tIvPFvNS_8functionIFvvEEENS_10shared_ptrIN3oxt6thread11thread_dataEEEENS2_5list2INS2_5valueIS6_EENSF_ISB_EEEEEEE3runEv+0x1e)[0x50b838]
PassengerHelperAgent(thread_proxy+0x6d)[0x51cd76]
/lib64/libpthread.so.0[0x33f6007851]
/lib64/libc.so.6(clone+0x6d)[0x33f5ce76dd]
--------------------------------------

on one of the nodes here, but both nodes were 404'ing.
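
A backtrace like the one above is easier to read once each frame is split into its parts. The following is a minimal sketch, assuming the `binary(symbol+offset)[address]` layout seen in the frames above; `Frame`, `parse_frame` and `passenger_symbols` are hypothetical helper names, not part of Passenger.

```ruby
# Split one backtrace line into binary, mangled symbol, offset and address.
# Frames without symbol information (e.g. "PassengerHelperAgent[0x513f49]")
# parse with symbol and offset set to nil.
Frame = Struct.new(:binary, :symbol, :offset, :address)

FRAME_RE = /\A(?<binary>[^\s(\[]+)(?:\((?<symbol>[^+)]*)\+(?<offset>0x[0-9a-f]+)\))?\[(?<address>0x[0-9a-f]+)\]\z/

def parse_frame(line)
  m = FRAME_RE.match(line.strip)
  return nil unless m
  Frame.new(m[:binary], m[:symbol], m[:offset], m[:address])
end

# Return the mangled symbols of all frames that belong to PassengerHelperAgent,
# in the order they appear (innermost frame first).
def passenger_symbols(lines)
  lines.map { |l| parse_frame(l) }.compact
       .select { |f| f.symbol && f.binary == "PassengerHelperAgent" }
       .map(&:symbol)
end
```

The resulting symbols can be passed to a demangler such as `c++filt`. For example, `_ZN5boost5mutex4lockEv` is `boost::mutex::lock()` and `_ZN9Passenger5Timer5startEv` is `Passenger::Timer::start()`, so the abort happens while `Timer::start` is taking a mutex during client-disconnect handling.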

I was able to fix this by killing the PassengerHelperAgent process.
Upstream bug:
  https://projects.puppetlabs.com/issues/16538

I don't see more segfaults, so I'm guessing that's unrelated.
I had the DB VIPs reversed, so this was sometimes hitting a read-only database.  I've fixed the VIPs.  Let's see if that helps.
Nope, that didn't help.
I'm reliably seeing this in the production.log on each failure now.

> ActionController::RoutingError (No route matches "/puppetdash/" with {:method=>:get}):
>   sass (3.1.2) [v] rails/./lib/sass/plugin/rack.rb:54:in `call'
>   passenger (3.0.11) lib/phusion_passenger/rack/request_handler.rb:96:in `process_request'
>   passenger (3.0.11) lib/phusion_passenger/abstract_request_handler.rb:513:in `accept_and_process_next_request'
>   passenger (3.0.11) lib/phusion_passenger/abstract_request_handler.rb:274:in `main_loop'
>   passenger (3.0.11) lib/phusion_passenger/classic_rails/application_spawner.rb:321:in `start_request_handler'
>   passenger (3.0.11) lib/phusion_passenger/classic_rails/application_spawner.rb:275:in `send'
>   passenger (3.0.11) lib/phusion_passenger/classic_rails/application_spawner.rb:275:in `handle_spawn_application'
>   passenger (3.0.11) lib/phusion_passenger/utils.rb:479:in `safe_fork'
>   passenger (3.0.11) lib/phusion_passenger/classic_rails/application_spawner.rb:270:in `handle_spawn_application'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:357:in `__send__'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:180:in `start'
>   passenger (3.0.11) lib/phusion_passenger/classic_rails/application_spawner.rb:149:in `start'
>   passenger (3.0.11) lib/phusion_passenger/spawn_manager.rb:219:in `spawn_rails_application'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server_collection.rb:132:in `lookup_or_add'
>   passenger (3.0.11) lib/phusion_passenger/spawn_manager.rb:214:in `spawn_rails_application'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server_collection.rb:82:in `synchronize'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server_collection.rb:79:in `synchronize'
>   passenger (3.0.11) lib/phusion_passenger/spawn_manager.rb:213:in `spawn_rails_application'
>   passenger (3.0.11) lib/phusion_passenger/spawn_manager.rb:132:in `spawn_application'
>   passenger (3.0.11) lib/phusion_passenger/spawn_manager.rb:275:in `handle_spawn_application'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:357:in `__send__'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:357:in `server_main_loop'
>   passenger (3.0.11) lib/phusion_passenger/abstract_server.rb:206:in `start_synchronously'
>   passenger (3.0.11) helper-scripts/passenger-spawn-server:99
> 
> Rendering /usr/share/puppet-dashboard/public/404.html (404 Not Found)
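
To confirm that every failure produces the same error, the log can be tallied by request method and path. This is a rough sketch that assumes the message format shown in the excerpt above; `routing_failures` is a hypothetical helper name.

```ruby
# Matches lines such as:
#   ActionController::RoutingError (No route matches "/puppetdash/" with {:method=>:get}):
ROUTING_ERROR_RE = /ActionController::RoutingError \(No route matches "(?<path>[^"]*)" with \{:method=>:(?<verb>\w+)\}\)/

# Count routing failures per [verb, path] pair.
# log_lines can be any enumerable of strings, e.g. File.foreach(path).
def routing_failures(log_lines)
  counts = Hash.new(0)
  log_lines.each do |line|
    m = ROUTING_ERROR_RE.match(line)
    counts[[m[:verb], m[:path]]] += 1 if m
  end
  counts
end
```

For example, `routing_failures(File.foreach("production.log"))` would return a hash keyed by pairs such as `["get", "/puppetdash/"]`; the log path here is an assumption and depends on where the Dashboard is installed.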

That suggests this has something to do with the sub-URI hosting.
From http://blog.ashchan.com/archive/2008/12/10/passengers-railsbaseuri-not-working/#comment-3053 :

> It appears Passenger hasn't taken care of the relative url root for our projects app. To solve this problem, add this line to environment.rb:
>
> config.action_controller.relative_url_root = "/projects"
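
To illustrate why the RoutingError in the log names the full "/puppetdash/" path, here is a toy model of sub-URI routing. It is not Rails or Passenger code: the route list and the `app_path` and `route_for` helpers are hypothetical. The idea is that an app mounted under a prefix only matches its routes after that prefix has been stripped from the request path.

```ruby
# Hypothetical routes, as the app would see them when mounted at "/".
ROUTES = ["/", "/reports", "/nodes"].freeze

# Strip the sub-URI prefix from the request path.
# Returns nil if the request lies outside the prefix.
def app_path(request_path, relative_url_root)
  root = relative_url_root.to_s.chomp("/")
  return request_path if root.empty?
  return nil unless request_path == root || request_path.start_with?(root + "/")
  rest = request_path[root.length..-1]
  rest.empty? ? "/" : rest
end

# Look up a route; nil stands in for "No route matches" (a 404).
def route_for(request_path, relative_url_root = nil)
  path = app_path(request_path, relative_url_root)
  return nil if path.nil?
  normalized = path.length > 1 ? path.chomp("/") : path
  ROUTES.include?(normalized) ? normalized : nil
end
```

In this model, `route_for("/puppetdash/")` finds nothing because the app does not know about its prefix, while `route_for("/puppetdash/", "/puppetdash")` resolves to the root route.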
So, we're going to move this to its own SSL vhost, so it can run at /.  We'll use the Mozilla CA to generate the cert.
(The underlying problem here is that Passenger can't run the same app at a sub-URI *and* at / on the same servers.  So it breaks once a report is submitted on /, and Passenger caches that state.)
OK, I moved it to https://puppetdash.pvt.build.mozilla.org, which uses the Mozilla root cert.  So you can either accept it as an exception, or add the cert to Firefox.
  https://wiki.mozilla.org/MozillaRootCertificate

Access is limited to releng and relops via LDAP.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations