小山千晶
koyac****@gmail*****
Wed, 5 Nov 2014 20:16:18 JST
Hello, my name is Koyama.

[Question]
I am testing Apache failover, but it is not working. Starting from a
two-node cluster (nodeA and nodeB), I stop apache on nodeA to force a
failover to nodeB, but apache ends up stopped on both nodes.

Specifically, the Resource Group transitions as follows.

pacemaker (ocf::heartbeat:Dummy): Started nodeA
failover (lsb:failover): Started nodeA
apache (lsb:httpd): Started nodeA
zabbix-server (lsb:zabbix-server): Started nodeA
↓↓↓
pacemaker (ocf::heartbeat:Dummy): Started nodeA
failover (lsb:failover): Started nodeA
apache (lsb:httpd): Started [ nodeB nodeA]
zabbix-server (lsb:zabbix-server): Started nodeA
↓↓↓
pacemaker (ocf::heartbeat:Dummy): Started nodeA
failover (lsb:failover): Started nodeA
apache (lsb:httpd): Started nodeA FAILED
zabbix-server (lsb:zabbix-server): nodeA
↓↓↓
pacemaker (ocf::heartbeat:Dummy): Started nodeB
failover (lsb:failover): Started nodeB
apache (lsb:httpd): Started nodeB FAILED
zabbix-server (lsb:zabbix-server): nodeB
↓↓↓
pacemaker (ocf::heartbeat:Dummy): Started nodeB
failover (lsb:failover): Started nodeB
apache (lsb:httpd): Stopped
zabbix-server (lsb:zabbix-server): Stopped

I checked the logs but could not make sense of them. I have attached the
log from around the failover (I was not sure which parts could be omitted,
so it is attached in full). I apologize for handing over the whole problem,
but could you tell me what the likely causes are?

Thank you in advance.

[Environment]
Amazon Linux AMI 2014.09.1 x 2
pacemaker 1.0.13-2.el5
corosync 1.4.1-17.6.amzn1

→ I want to make zabbix redundant behind an ELB, so I built an
active/standby configuration based on the following article.

ELBとHeartbeatでアクティブ/スタンバイ構成 (Active/standby configuration with ELB and Heartbeat)
http://blog.cloudpack.jp/2012/12/13/aws-news-cdp-floating-ip-heartbeat-pacemaker/

The pacemaker configuration follows.

--- Configuration ---
### Cluster Option ###
property no-quorum-policy="ignore" \
    stonith-enabled="false" \
    crmd-transition-delay="2s"

### Resource Defaults ###
rsc_defaults resource-stickiness="INFINITY" \
    migration-threshold="1"

### Primitive Configuration ###
primitive pacemaker ocf:heartbeat:Dummy \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

primitive apache lsb:httpd \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="30s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

primitive failover lsb:failover \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

primitive zabbix-server lsb:zabbix-server \
    op start interval="0s" timeout="300s" on-fail="restart" \
    op monitor interval="10s" timeout="60s" on-fail="restart" \
    op stop interval="0s" timeout="300s" on-fail="block"

group Cluster pacemaker failover apache zabbix-server
------
-------------- next part --------------
Nov 05 10:48:53 nodeA cib: [16023]: info: cib_process_request: Operation complete: op cib_delete for section //node_state[@uname='nodeB']//lrm_resource[@id='apache'] (origin=nodeB/crmd/99, version=0.23.51): ok (rc=0)
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - <cib admin_epoch="0" epoch="23" num_updates="51" >
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - <configuration >
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - <crm_config >
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - <cluster_property_set id="cib-bootstrap-options" >
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - <nvpair value="1415183365" id="cib-bootstrap-options-last-lrm-refresh" />
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - </cluster_property_set>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - </crm_config>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - </configuration>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: - </cib>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + <cib admin_epoch="0" epoch="24" num_updates="1" >
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + <configuration >
Nov 05 10:48:53 nodeA crmd: [16027]: info: abort_transition_graph: te_update_diff:274 - Triggered transition abort (complete=1, tag=lrm_rsc_op, id=apache_monitor_0, magic=0:0;6:45:7:5a1e1fe5-3401-43eb-bce2-505b51bdf2de, cib=0.23.51) : Resource op removal
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + <crm_config >
Nov 05 10:48:53 nodeA crmd: [16027]: info: abort_transition_graph: need_abort:59 - Triggered transition abort (complete=1) : Non-status change
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + <cluster_property_set id="cib-bootstrap-options" >
Nov 05 10:48:53 nodeA crmd: [16027]: info: need_abort: Aborting on change to admin_epoch
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + <nvpair value="1415184533" id="cib-bootstrap-options-last-lrm-refresh" />
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + </cluster_property_set>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + </crm_config>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + </configuration>
Nov 05 10:48:53 nodeA cib: [16023]: info: log_data_element: cib:diff: + </cib>
Nov 05 10:48:53 nodeA cib: [16023]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=nodeB/crmd/101, version=0.24.1): ok (rc=0)
Nov 05 10:48:53 nodeA crmd: [16027]: info: abort_transition_graph: te_update_diff:164 - Triggered transition abort (complete=1, tag=transient_attributes, id=nodeB, magic=NA, cib=0.24.2) : Transient attribute: removal
Nov 05 10:48:53 nodeA crmd: [16027]: info: config_query_callback: Checking for expired actions every 900000ms
Nov 05 10:48:53 nodeA crmd: [16027]: info: config_query_callback: Sending expected-votes=2 to corosync
Nov 05 10:48:53 nodeA crmd: [16027]: info: ais_dispatch: Membership 164: quorum retained
Nov 05 10:48:53 nodeA crmd: [16027]: info: crm_ais_dispatch: Setting expected votes to 2
Nov 05 10:48:53 nodeA cib: [16023]: info: cib_process_request: Operation complete: op cib_modify for section crm_config (origin=local/crmd/307, version=0.24.2): ok (rc=0)
Nov 05 10:48:53 nodeA cib: [21148]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-5.raw
Nov 05 10:48:53 nodeA cib: [21148]: info: write_cib_contents: Wrote version 0.24.0 of the CIB to disk (digest: b980ee2be9da116ae329104eaae9a3a9)
Nov 05 10:48:53 nodeA cib: [21148]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.ohOhPD (digest: /var/lib/heartbeat/crm/cib.pXZc1L)
Nov 05 10:48:55 nodeA crmd: [16027]: info: crm_timer_popped: New Transition Timer (I_PE_CALC) just popped!
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_pe_invoke: Query 308: Requesting the current CIB: S_POLICY_ENGINE
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_pe_invoke_callback: Invoking the PE: query=308, ref=pe_calc-dc-1415184535-323, seq=164, quorate=1
Nov 05 10:48:55 nodeA pengine: [16026]: notice: unpack_config: On loss of CCM Quorum: Ignore
Nov 05 10:48:55 nodeA pengine: [16026]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Nov 05 10:48:55 nodeA pengine: [16026]: info: determine_online_status: Node nodeB is online
Nov 05 10:48:55 nodeA pengine: [16026]: info: determine_online_status: Node nodeA is online
Nov 05 10:48:55 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeA
Nov 05 10:48:55 nodeA pengine: [16026]: notice: group_print: Resource Group: Cluster
Nov 05 10:48:55 nodeA pengine: [16026]: notice: native_print: pacemaker (ocf::heartbeat:Dummy): Started nodeA
Nov 05 10:48:55 nodeA pengine: [16026]: notice: native_print: failover (lsb:failover): Started nodeA
Nov 05 10:48:55 nodeA pengine: [16026]: notice: native_print: apache (lsb:httpd): Started nodeA
Nov 05 10:48:55 nodeA pengine: [16026]: notice: native_print: zabbix-server (lsb:zabbix-server): Started nodeA
Nov 05 10:48:55 nodeA pengine: [16026]: notice: LogActions: Leave resource pacemaker (Started nodeA)
Nov 05 10:48:55 nodeA pengine: [16026]: notice: LogActions: Leave resource failover (Started nodeA)
Nov 05 10:48:55 nodeA pengine: [16026]: notice: LogActions: Leave resource apache (Started nodeA)
Nov 05 10:48:55 nodeA pengine: [16026]: notice: LogActions: Leave resource zabbix-server (Started nodeA)
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Nov 05 10:48:55 nodeA crmd: [16027]: info: unpack_graph: Unpacked transition 56: 3 actions in 3 synapses
Nov 05 10:48:55 nodeA crmd: [16027]: info: do_te_invoke: Processing graph 56 (ref=pe_calc-dc-1415184535-323) derived from /var/lib/pengine/pe-input-319.bz2
Nov 05 10:48:55 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 8: monitor apache_monitor_0 on nodeB
Nov 05 10:48:55 nodeA pengine: [16026]: info: process_pe_message: Transition 56: PEngine Input stored in: /var/lib/pengine/pe-input-319.bz2
Nov 05 10:48:55 nodeA crmd: [16027]: WARN: status_from_rc: Action 8 (apache_monitor_0) on nodeB failed (target: 7 vs. rc: 0): Error
Nov 05 10:48:55 nodeA crmd: [16027]: info: abort_transition_graph: match_graph_event:299 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=apache_monitor_0, magic=0:0;8:56:7:5a1e1fe5-3401-43eb-bce2-505b51bdf2de, cib=0.24.3) : Event failed
Nov 05 10:48:55 nodeA crmd: [16027]: info: update_abort_priority: Abort priority upgraded from 0 to 1
Nov 05 10:48:55 nodeA crmd: [16027]: info: update_abort_priority: Abort action done superceeded by restart
Nov 05 10:48:55 nodeA crmd: [16027]: info: match_graph_event: Action apache_monitor_0 (8) confirmed on nodeB (rc=4)
Nov 05 10:48:55 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 7: probe_complete probe_complete on nodeB - no waiting
Nov 05 10:48:55 nodeA crmd: [16027]: info: run_graph: ====================================================
Nov 05 10:48:55 nodeA crmd: [16027]: notice: run_graph: Transition 56 (Complete=2, Pending=0, Fired=0, Skipped=1, Incomplete=0, Source=/var/lib/pengine/pe-input-319.bz2): Stopped
Nov 05 10:48:55 nodeA crmd: [16027]: info: te_graph_trigger: Transition 56 is now complete
Nov 05 10:48:57 nodeA crmd: [16027]: info: crm_timer_popped: New Transition Timer (I_PE_CALC) just popped!
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_pe_invoke: Query 309: Requesting the current CIB: S_POLICY_ENGINE
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_pe_invoke_callback: Invoking the PE: query=309, ref=pe_calc-dc-1415184537-326, seq=164, quorate=1
Nov 05 10:48:57 nodeA pengine: [16026]: notice: unpack_config: On loss of CCM Quorum: Ignore
Nov 05 10:48:57 nodeA pengine: [16026]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Nov 05 10:48:57 nodeA pengine: [16026]: info: determine_online_status: Node nodeB is online
Nov 05 10:48:57 nodeA pengine: [16026]: info: determine_online_status: Node nodeA is online
Nov 05 10:48:57 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeB
Nov 05 10:48:57 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeA
Nov 05 10:48:57 nodeA pengine: [16026]: ERROR: native_add_running: Resource lsb::httpd:apache appears to be active on 2 nodes.
Nov 05 10:48:57 nodeA pengine: [16026]: WARN: See http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active for more information.
Nov 05 10:48:57 nodeA pengine: [16026]: notice: group_print: Resource Group: Cluster
Nov 05 10:48:57 nodeA pengine: [16026]: notice: native_print: pacemaker (ocf::heartbeat:Dummy): Started nodeA
Nov 05 10:48:57 nodeA pengine: [16026]: notice: native_print: failover (lsb:failover): Started nodeA
Nov 05 10:48:57 nodeA pengine: [16026]: notice: native_print: apache (lsb:httpd): Started
Nov 05 10:48:57 nodeA pengine: [16026]: notice: native_print: 0 : nodeB
Nov 05 10:48:57 nodeA pengine: [16026]: notice: native_print: 1 : nodeA
Nov 05 10:48:57 nodeA pengine: [16026]: notice: native_print: zabbix-server (lsb:zabbix-server): Started nodeA
Nov 05 10:48:57 nodeA pengine: [16026]: WARN: native_create_actions: Attempting recovery of resource apache
Nov 05 10:48:57 nodeA pengine: [16026]: notice: RecurringOp: Start recurring monitor (30s) for apache on nodeA
Nov 05 10:48:57 nodeA pengine: [16026]: notice: LogActions: Leave resource pacemaker (Started nodeA)
Nov 05 10:48:57 nodeA pengine: [16026]: notice: LogActions: Leave resource failover (Started nodeA)
Nov 05 10:48:57 nodeA pengine: [16026]: notice: LogActions: Move resource apache (Started nodeB -> nodeA)
Nov 05 10:48:57 nodeA pengine: [16026]: notice: LogActions: Restart resource zabbix-server (Started nodeA)
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Nov 05 10:48:57 nodeA crmd: [16027]: info: unpack_graph: Unpacked transition 57: 12 actions in 12 synapses
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_te_invoke: Processing graph 57 (ref=pe_calc-dc-1415184537-326) derived from /var/lib/pengine/pe-error-17.bz2
Nov 05 10:48:57 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 20 fired and confirmed
Nov 05 10:48:57 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 16: stop zabbix-server_stop_0 on nodeA (local)
Nov 05 10:48:57 nodeA lrmd: [16024]: info: cancel_op: operation monitor[91] on lsb::zabbix-server::zabbix-server for client 16027, its parameters: CRM_meta_on_fail=[restart] crm_feature_set=[3.0.1] CRM_meta_name=[monitor] CRM_meta_interval=[10000] CRM_meta_timeout=[60000] cancelled
Nov 05 10:48:57 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=16:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=zabbix-server_stop_0 )
Nov 05 10:48:57 nodeA lrmd: [16024]: info: rsc:zabbix-server:92: stop
Nov 05 10:48:57 nodeA crmd: [16027]: info: process_lrm_event: LRM operation zabbix-server_monitor_10000 (call=91, status=1, cib-update=0, confirmed=true) Cancelled
Nov 05 10:48:57 nodeA lrmd: [21166]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:48:57 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) Shutting down Zabbix server:
Nov 05 10:48:57 nodeA pengine: [16026]: ERROR: process_pe_message: Transition 57: ERRORs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-error-17.bz2
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) [
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) OK
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) ]
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) ^M
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout)
Nov 05 10:48:59 nodeA crmd: [16027]: info: process_lrm_event: LRM operation zabbix-server_stop_0 (call=92, rc=0, cib-update=310, confirmed=true) ok
Nov 05 10:48:59 nodeA crmd: [16027]: info: match_graph_event: Action zabbix-server_stop_0 (16) confirmed on nodeA (rc=0)
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 13: stop apache_stop_0 on nodeB
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 14: stop apache_stop_0 on nodeA (local)
Nov 05 10:48:59 nodeA lrmd: [16024]: info: cancel_op: operation monitor[89] on lsb::httpd::apache for client 16027, its parameters: CRM_meta_on_fail=[restart] crm_feature_set=[3.0.1] CRM_meta_name=[monitor] CRM_meta_interval=[30000] CRM_meta_timeout=[60000] cancelled
Nov 05 10:48:59 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=14:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_stop_0 )
Nov 05 10:48:59 nodeA lrmd: [16024]: info: rsc:apache:93: stop
Nov 05 10:48:59 nodeA crmd: [16027]: info: process_lrm_event: LRM operation apache_monitor_30000 (call=89, status=1, cib-update=0, confirmed=true) Cancelled
Nov 05 10:48:59 nodeA lrmd: [21176]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) Stopping httpd:
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) [
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) OK
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) ]
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) ^M
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout)
Nov 05 10:48:59 nodeA crmd: [16027]: info: process_lrm_event: LRM operation apache_stop_0 (call=93, rc=0, cib-update=311, confirmed=true) ok
Nov 05 10:48:59 nodeA crmd: [16027]: info: match_graph_event: Action apache_stop_0 (14) confirmed on nodeA (rc=0)
Nov 05 10:48:59 nodeA crmd: [16027]: info: match_graph_event: Action apache_stop_0 (13) confirmed on nodeB (rc=0)
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 21 fired and confirmed
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 5 fired and confirmed
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 18 fired and confirmed
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 15: start apache_start_0 on nodeA (local)
Nov 05 10:48:59 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=15:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_start_0 )
Nov 05 10:48:59 nodeA lrmd: [16024]: info: rsc:apache:94: start
Nov 05 10:48:59 nodeA lrmd: [21188]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:start:stdout) Starting httpd:
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:start:stdout) [
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:start:stdout) OK
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:start:stdout) ]
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:start:stdout) ^M
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (apache:start:stdout)
Nov 05 10:48:59 nodeA crmd: [16027]: info: process_lrm_event: LRM operation apache_start_0 (call=94, rc=0, cib-update=312, confirmed=true) ok
Nov 05 10:48:59 nodeA crmd: [16027]: info: match_graph_event: Action apache_start_0 (15) confirmed on nodeA (rc=0)
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 4: monitor apache_monitor_30000 on nodeA (local)
Nov 05 10:48:59 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=4:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_monitor_30000 )
Nov 05 10:48:59 nodeA lrmd: [16024]: info: rsc:apache:95: monitor
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 17: start zabbix-server_start_0 on nodeA (local)
Nov 05 10:48:59 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=17:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=zabbix-server_start_0 )
Nov 05 10:48:59 nodeA lrmd: [16024]: info: rsc:zabbix-server:96: start
Nov 05 10:48:59 nodeA lrmd: [21200]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:start:stdout) Starting Zabbix server:
Nov 05 10:48:59 nodeA crmd: [16027]: info: process_lrm_event: LRM operation apache_monitor_30000 (call=95, rc=7, cib-update=313, confirmed=false) not running
Nov 05 10:48:59 nodeA crmd: [16027]: WARN: status_from_rc: Action 4 (apache_monitor_30000) on nodeA failed (target: 0 vs. rc: 7): Error
Nov 05 10:48:59 nodeA crmd: [16027]: WARN: update_failcount: Updating failcount for apache on nodeA after failed monitor: rc=7 (update=value++, time=1415184539)
Nov 05 10:48:59 nodeA crmd: [16027]: info: abort_transition_graph: match_graph_event:299 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=apache_monitor_30000, magic=0:7;4:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de, cib=0.24.8) : Event failed
Nov 05 10:48:59 nodeA crmd: [16027]: info: update_abort_priority: Abort priority upgraded from 0 to 1
Nov 05 10:48:59 nodeA crmd: [16027]: info: update_abort_priority: Abort action done superceeded by restart
Nov 05 10:48:59 nodeA crmd: [16027]: info: match_graph_event: Action apache_monitor_30000 (4) confirmed on nodeA (rc=4)
Nov 05 10:48:59 nodeA attrd: [16025]: info: attrd_local_callback: Expanded fail-count-apache=value++ to 1
Nov 05 10:48:59 nodeA attrd: [16025]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-apache (1)
Nov 05 10:48:59 nodeA attrd: [16025]: info: attrd_perform_update: Sent update 144: fail-count-apache=1
Nov 05 10:48:59 nodeA attrd: [16025]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-apache (1415184539)
Nov 05 10:48:59 nodeA crmd: [16027]: info: abort_transition_graph: te_update_diff:150 - Triggered transition abort (complete=0, tag=nvpair, id=status-nodeA-fail-count-apache, name=fail-count-apache, value=1, magic=NA, cib=0.24.9) : Transient attribute: update
Nov 05 10:48:59 nodeA attrd: [16025]: info: attrd_perform_update: Sent update 146: last-failure-apache=1415184539
Nov 05 10:48:59 nodeA crmd: [16027]: info: update_abort_priority: Abort priority upgraded from 1 to 1000000
Nov 05 10:48:59 nodeA crmd: [16027]: info: update_abort_priority: 'Event failed' abort superceeded
Nov 05 10:48:59 nodeA crmd: [16027]: info: abort_transition_graph: te_update_diff:150 - Triggered transition abort (complete=0, tag=nvpair, id=status-nodeA-last-failure-apache, name=NA, value=1415184539, magic=NA, cib=0.24.10) : Transient attribute: update
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:start:stdout) [
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:start:stdout) OK
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:start:stdout) ]
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:start:stdout) ^M
Nov 05 10:48:59 nodeA lrmd: [16024]: info: RA output: (zabbix-server:start:stdout)
Nov 05 10:48:59 nodeA crmd: [16027]: info: process_lrm_event: LRM operation zabbix-server_start_0 (call=96, rc=0, cib-update=314, confirmed=true) ok
Nov 05 10:48:59 nodeA crmd: [16027]: info: match_graph_event: Action zabbix-server_start_0 (17) confirmed on nodeA (rc=0)
Nov 05 10:48:59 nodeA crmd: [16027]: info: run_graph: ====================================================
Nov 05 10:48:59 nodeA crmd: [16027]: notice: run_graph: Transition 57 (Complete=10, Pending=0, Fired=0, Skipped=2, Incomplete=0, Source=/var/lib/pengine/pe-error-17.bz2): Stopped
Nov 05 10:48:59 nodeA crmd: [16027]: info: te_graph_trigger: Transition 57 is now complete
Nov 05 10:49:01 nodeA crmd: [16027]: info: crm_timer_popped: New Transition Timer (I_PE_CALC) just popped!
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_pe_invoke: Query 315: Requesting the current CIB: S_POLICY_ENGINE
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_pe_invoke_callback: Invoking the PE: query=315, ref=pe_calc-dc-1415184541-333, seq=164, quorate=1
Nov 05 10:49:01 nodeA pengine: [16026]: notice: unpack_config: On loss of CCM Quorum: Ignore
Nov 05 10:49:01 nodeA pengine: [16026]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Nov 05 10:49:01 nodeA pengine: [16026]: info: determine_online_status: Node nodeB is online
Nov 05 10:49:01 nodeA pengine: [16026]: info: determine_online_status: Node nodeA is online
Nov 05 10:49:01 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeB
Nov 05 10:49:01 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeA
Nov 05 10:49:01 nodeA pengine: [16026]: WARN: unpack_rsc_op: Processing failed op apache_monitor_30000 on nodeA: not running (7)
Nov 05 10:49:01 nodeA pengine: [16026]: notice: group_print: Resource Group: Cluster
Nov 05 10:49:01 nodeA pengine: [16026]: notice: native_print: pacemaker (ocf::heartbeat:Dummy): Started nodeA
Nov 05 10:49:01 nodeA pengine: [16026]: notice: native_print: failover (lsb:failover): Started nodeA
Nov 05 10:49:01 nodeA pengine: [16026]: notice: native_print: apache (lsb:httpd): Started nodeA FAILED
Nov 05 10:49:01 nodeA pengine: [16026]: notice: native_print: zabbix-server (lsb:zabbix-server): Started nodeA
Nov 05 10:49:01 nodeA pengine: [16026]: info: get_failcount: apache has failed 1 times on nodeA
Nov 05 10:49:01 nodeA pengine: [16026]: WARN: common_apply_stickiness: Forcing apache away from nodeA after 1 failures (max=1)
Nov 05 10:49:01 nodeA pengine: [16026]: notice: RecurringOp: Start recurring monitor (10s) for pacemaker on nodeB
Nov 05 10:49:01 nodeA pengine: [16026]: notice: RecurringOp: Start recurring monitor (10s) for failover on nodeB
Nov 05 10:49:01 nodeA pengine: [16026]: notice: RecurringOp: Start recurring monitor (30s) for apache on nodeB
Nov 05 10:49:01 nodeA pengine: [16026]: notice: RecurringOp: Start recurring monitor (10s) for zabbix-server on nodeB
Nov 05 10:49:01 nodeA pengine: [16026]: notice: LogActions: Move resource pacemaker (Started nodeA -> nodeB)
Nov 05 10:49:01 nodeA pengine: [16026]: notice: LogActions: Move resource failover (Started nodeA -> nodeB)
Nov 05 10:49:01 nodeA pengine: [16026]: notice: LogActions: Move resource apache (Started nodeA -> nodeB)
Nov 05 10:49:01 nodeA pengine: [16026]: notice: LogActions: Move resource zabbix-server (Started nodeA -> nodeB)
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Nov 05 10:49:01 nodeA crmd: [16027]: info: unpack_graph: Unpacked transition 58: 17 actions in 17 synapses
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_te_invoke: Processing graph 58 (ref=pe_calc-dc-1415184541-333) derived from /var/lib/pengine/pe-input-320.bz2
Nov 05 10:49:01 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 22 fired and confirmed
Nov 05 10:49:01 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 17: stop zabbix-server_stop_0 on nodeA (local)
Nov 05 10:49:01 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=17:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=zabbix-server_stop_0 )
Nov 05 10:49:01 nodeA lrmd: [16024]: info: rsc:zabbix-server:97: stop
Nov 05 10:49:01 nodeA lrmd: [21245]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:01 nodeA pengine: [16026]: info: process_pe_message: Transition 58: PEngine Input stored in: /var/lib/pengine/pe-input-320.bz2
Nov 05 10:49:01 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) Shutting down Zabbix server:
Nov 05 10:49:03 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) [
Nov 05 10:49:03 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) OK
Nov 05 10:49:03 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) ]
Nov 05 10:49:03 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout) ^M
Nov 05 10:49:03 nodeA lrmd: [16024]: info: RA output: (zabbix-server:stop:stdout)
Nov 05 10:49:03 nodeA crmd: [16027]: info: process_lrm_event: LRM operation zabbix-server_stop_0 (call=97, rc=0, cib-update=316, confirmed=true) ok
Nov 05 10:49:03 nodeA crmd: [16027]: info: match_graph_event: Action zabbix-server_stop_0 (17) confirmed on nodeA (rc=0)
Nov 05 10:49:03 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 4: stop apache_stop_0 on nodeA (local)
Nov 05 10:49:03 nodeA lrmd: [16024]: info: cancel_op: operation monitor[95] on lsb::httpd::apache for client 16027, its parameters: CRM_meta_on_fail=[restart] crm_feature_set=[3.0.1] CRM_meta_name=[monitor] CRM_meta_interval=[30000] CRM_meta_timeout=[60000] cancelled
Nov 05 10:49:03 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=4:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_stop_0 )
Nov 05 10:49:03 nodeA lrmd: [16024]: info: rsc:apache:98: stop
Nov 05 10:49:03 nodeA crmd: [16027]: info: process_lrm_event: LRM operation apache_monitor_30000 (call=95, status=1, cib-update=0, confirmed=true) Cancelled
Nov 05 10:49:03 nodeA lrmd: [21257]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:03 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) Stopping httpd:
Nov 05 10:49:04 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) [
Nov 05 10:49:04 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) OK
Nov 05 10:49:04 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) ]
Nov 05 10:49:04 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout) ^M
Nov 05 10:49:04 nodeA lrmd: [16024]: info: RA output: (apache:stop:stdout)
Nov 05 10:49:04 nodeA crmd: [16027]: info: process_lrm_event: LRM operation apache_stop_0 (call=98, rc=0, cib-update=317, confirmed=true) ok
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action apache_stop_0 (4) confirmed on nodeA (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 12: stop failover_stop_0 on nodeA (local)
Nov 05 10:49:04 nodeA lrmd: [16024]: info: cancel_op: operation monitor[87] on lsb::failover::failover for client 16027, its parameters: CRM_meta_on_fail=[restart] crm_feature_set=[3.0.1] CRM_meta_name=[monitor] CRM_meta_interval=[10000] CRM_meta_timeout=[60000] cancelled
Nov 05 10:49:04 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=12:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=failover_stop_0 )
Nov 05 10:49:04 nodeA lrmd: [16024]: info: rsc:failover:99: stop
Nov 05 10:49:04 nodeA crmd: [16027]: info: process_lrm_event: LRM operation failover_monitor_10000 (call=87, status=1, cib-update=0, confirmed=true) Cancelled
Nov 05 10:49:04 nodeA lrmd: [21269]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:04 nodeA lrmd: [16024]: info: RA output: (failover:stop:stderr) failover[21272]: stop
Nov 05 10:49:04 nodeA crmd: [16027]: info: process_lrm_event: LRM operation failover_stop_0 (call=99, rc=0, cib-update=318, confirmed=true) ok
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action failover_stop_0 (12) confirmed on nodeA (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 9: stop pacemaker_stop_0 on nodeA (local)
Nov 05 10:49:04 nodeA lrmd: [16024]: info: cancel_op: operation monitor[85] on ocf::Dummy::pacemaker for client 16027, its parameters: CRM_meta_on_fail=[restart] crm_feature_set=[3.0.1] CRM_meta_name=[monitor] CRM_meta_interval=[10000] CRM_meta_timeout=[60000] cancelled
Nov 05 10:49:04 nodeA crmd: [16027]: info: do_lrm_rsc_op: Performing key=9:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=pacemaker_stop_0 )
Nov 05 10:49:04 nodeA lrmd: [16024]: info: rsc:pacemaker:100: stop
Nov 05 10:49:04 nodeA crmd: [16027]: info: process_lrm_event: LRM operation pacemaker_monitor_10000 (call=85, status=1, cib-update=0, confirmed=true) Cancelled
Nov 05 10:49:04 nodeA crmd: [16027]: info: process_lrm_event: LRM operation pacemaker_stop_0 (call=100, rc=0, cib-update=319, confirmed=true) ok
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action pacemaker_stop_0 (9) confirmed on nodeA (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 23 fired and confirmed
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 5 fired and confirmed
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 20 fired and confirmed
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 10: start pacemaker_start_0 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action pacemaker_start_0 (10) confirmed on nodeB (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 11: monitor pacemaker_monitor_10000 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 13: start failover_start_0 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action pacemaker_monitor_10000 (11) confirmed on nodeB (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action failover_start_0 (13) confirmed on nodeB (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 14: monitor failover_monitor_10000 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 15: start apache_start_0 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action failover_monitor_10000 (14) confirmed on nodeB (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action apache_start_0 (15) confirmed on nodeB (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 16: monitor apache_monitor_30000 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 18: start zabbix-server_start_0 on nodeB
Nov 05 10:49:04 nodeA crmd: [16027]: WARN: status_from_rc: Action 16 (apache_monitor_30000) on nodeB failed (target: 0 vs. rc: 7): Error
Nov 05 10:49:04 nodeA crmd: [16027]: WARN: update_failcount: Updating failcount for apache on nodeB after failed monitor: rc=7 (update=value++, time=1415184544)
Nov 05 10:49:04 nodeA crmd: [16027]: info: abort_transition_graph: match_graph_event:299 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=apache_monitor_30000, magic=0:7;16:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de, cib=0.24.21) : Event failed
Nov 05 10:49:04 nodeA crmd: [16027]: info: update_abort_priority: Abort priority upgraded from 0 to 1
Nov 05 10:49:04 nodeA crmd: [16027]: info: update_abort_priority: Abort action done superceeded by restart
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action apache_monitor_30000 (16) confirmed on nodeB (rc=4)
Nov 05 10:49:04 nodeA crmd: [16027]: info: abort_transition_graph: te_update_diff:150 - Triggered transition abort (complete=0, tag=nvpair, id=status-nodeB-fail-count-apache, name=fail-count-apache, value=1, magic=NA, cib=0.24.22) : Transient attribute: update
Nov 05 10:49:04 nodeA crmd: [16027]: info: update_abort_priority: Abort priority upgraded from 1 to 1000000
Nov 05 10:49:04 nodeA crmd: [16027]: info: update_abort_priority: 'Event failed' abort superceeded
Nov 05 10:49:04 nodeA crmd: [16027]: info: abort_transition_graph: te_update_diff:150 - Triggered transition abort (complete=0, tag=nvpair, id=status-nodeB-last-failure-apache, name=NA, value=1415184544, magic=NA, cib=0.24.23) : Transient attribute: update
Nov 05 10:49:04 nodeA crmd: [16027]: info: match_graph_event: Action zabbix-server_start_0 (18) confirmed on nodeB (rc=0)
Nov 05 10:49:04 nodeA crmd: [16027]: info: run_graph: ====================================================
Nov 05 10:49:04 nodeA crmd: [16027]: notice: run_graph: Transition 58 (Complete=15, Pending=0, Fired=0, Skipped=2, Incomplete=0, Source=/var/lib/pengine/pe-input-320.bz2): Stopped
Nov 05 10:49:04 nodeA crmd: [16027]: info: te_graph_trigger: Transition 58 is now complete
Nov 05 10:49:06 nodeA crmd: [16027]: info: crm_timer_popped: New Transition Timer (I_PE_CALC) just popped!
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_TIMER_POPPED origin=crm_timer_popped ]
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_state_transition: Progressed to state S_POLICY_ENGINE after C_TIMER_POPPED
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_pe_invoke: Query 320: Requesting the current CIB: S_POLICY_ENGINE
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_pe_invoke_callback: Invoking the PE: query=320, ref=pe_calc-dc-1415184546-345, seq=164, quorate=1
Nov 05 10:49:06 nodeA pengine: [16026]: notice: unpack_config: On loss of CCM Quorum: Ignore
Nov 05 10:49:06 nodeA pengine: [16026]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
Nov 05 10:49:06 nodeA pengine: [16026]: info: determine_online_status: Node nodeB is online
Nov 05 10:49:06 nodeA pengine: [16026]: info: determine_online_status: Node nodeA is online
Nov 05 10:49:06 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeB
Nov 05 10:49:06 nodeA pengine: [16026]: WARN: unpack_rsc_op: Processing failed op apache_monitor_30000 on nodeB: not running (7)
Nov 05 10:49:06 nodeA pengine: [16026]: notice: unpack_rsc_op: Operation apache_monitor_0 found resource apache active on nodeA
Nov 05 10:49:06 nodeA pengine: [16026]: WARN: unpack_rsc_op: Processing failed op apache_monitor_30000 on nodeA: not running (7)
Nov 05 10:49:06 nodeA pengine: [16026]: notice: group_print: Resource Group: Cluster
Nov 05 10:49:06 nodeA pengine: [16026]: notice: native_print: pacemaker (ocf::heartbeat:Dummy): Started nodeB
Nov 05 10:49:06 nodeA pengine: [16026]: notice: native_print: failover (lsb:failover): Started nodeB
Nov 05 10:49:06 nodeA pengine: [16026]: notice: native_print: apache (lsb:httpd): Started nodeB FAILED
Nov 05 10:49:06 nodeA pengine: [16026]: notice: native_print: zabbix-server (lsb:zabbix-server): Started nodeB
Nov 05 10:49:06 nodeA pengine: [16026]: info: get_failcount: apache has failed 1 times on nodeB
Nov 05 10:49:06 nodeA pengine: [16026]: WARN: common_apply_stickiness: Forcing apache away from nodeB after 1 failures (max=1)
Nov 05 10:49:06 nodeA pengine: [16026]: info: get_failcount: apache has failed 1 times on nodeA
Nov 05 10:49:06 nodeA pengine: [16026]: WARN: common_apply_stickiness: Forcing apache away from nodeA after 1 failures (max=1)
Nov 05 10:49:06 nodeA pengine: [16026]: info: rsc_merge_weights: pacemaker: Rolling back scores from apache
Nov 05 10:49:06 nodeA pengine: [16026]: info: rsc_merge_weights: failover: Rolling back scores from apache
Nov 05 10:49:06 nodeA pengine: [16026]: info: rsc_merge_weights: apache: Rolling back scores from zabbix-server
Nov 05 10:49:06 nodeA pengine: [16026]: info: native_color: Resource apache cannot run anywhere
Nov 05 10:49:06 nodeA pengine: [16026]: info: native_color: Resource zabbix-server cannot run anywhere
Nov 05 10:49:06 nodeA pengine: [16026]: notice: LogActions: Leave resource pacemaker (Started nodeB)
Nov 05 10:49:06 nodeA pengine: [16026]: notice: LogActions: Leave resource failover (Started nodeB)
Nov 05 10:49:06 nodeA pengine: [16026]: notice: LogActions: Stop resource apache (nodeB)
Nov 05 10:49:06 nodeA pengine: [16026]: notice: LogActions: Stop resource zabbix-server (nodeB)
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_state_transition: State transition S_POLICY_ENGINE -> S_TRANSITION_ENGINE [ input=I_PE_SUCCESS cause=C_IPC_MESSAGE origin=handle_response ]
Nov 05 10:49:06 nodeA crmd: [16027]: info: unpack_graph: Unpacked transition 59: 5 actions in 5 synapses
Nov 05 10:49:06 nodeA crmd: [16027]: info: do_te_invoke: Processing graph 59 (ref=pe_calc-dc-1415184546-345) derived from /var/lib/pengine/pe-input-321.bz2
Nov 05 10:49:06 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 16 fired and confirmed
Nov 05 10:49:06 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 13: stop zabbix-server_stop_0 on nodeB
Nov 05 10:49:06 nodeA pengine: [16026]: info: process_pe_message: Transition 59: PEngine Input stored in: /var/lib/pengine/pe-input-321.bz2
Nov 05 10:49:08 nodeA crmd: [16027]: info: match_graph_event: Action zabbix-server_stop_0 (13) confirmed on nodeB (rc=0)
Nov 05 10:49:08 nodeA crmd: [16027]: info: te_rsc_command: Initiating action 4: stop apache_stop_0 on nodeB
Nov 05 10:49:08 nodeA crmd: [16027]: info: match_graph_event: Action apache_stop_0 (4) confirmed on nodeB (rc=0)
Nov 05 10:49:08 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 17 fired and confirmed
Nov 05 10:49:08 nodeA crmd: [16027]: info: te_pseudo_action: Pseudo action 5 fired and confirmed
Nov 05 10:49:08 nodeA crmd: [16027]: info: run_graph: ====================================================
Nov 05 10:49:08 nodeA crmd: [16027]: notice: run_graph: Transition 59 (Complete=5, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pengine/pe-input-321.bz2): Complete
Nov 05 10:49:08 nodeA crmd: [16027]: info: te_graph_trigger: Transition 59 is now complete
Nov 05 10:49:08 nodeA crmd: [16027]: info: notify_crmd: Transition 59 status: done - <null>
Nov 05 10:49:08 nodeA crmd: [16027]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_IDLE [ input=I_TE_SUCCESS cause=C_FSA_INTERNAL origin=notify_crmd ]
Nov 05 10:49:08 nodeA crmd: [16027]: info: do_state_transition: Starting PEngine Recheck Timer
-------------- next part --------------
Nov 05 10:48:53 nodeB crmd: [13932]: info: do_lrm_invoke: Removing resource apache from the LRM
Nov 05 10:48:53 nodeB crmd: [13932]: info: do_lrm_invoke: Resource 'apache' deleted for 15837_crm_resource on nodeB
Nov 05 10:48:53 nodeB crmd: [13932]: info: notify_deleted: Notifying 15837_crm_resource on nodeB that apache was deleted
Nov 05 10:48:53 nodeB crmd: [13932]: info: send_direct_ack: ACK'ing resource op apache_delete_60000 from 0:0:crm-resource-15837: lrm_invoke-lrmd-1415184533-12
Nov 05 10:48:53 nodeB attrd: [13930]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-apache (<null>)
Nov 05 10:48:53 nodeB attrd: [13930]: info: attrd_perform_update: Sent delete 107: node=nodeB, attr=fail-count-apache, id=<n/a>, set=(null), section=status
Nov 05 10:48:53 nodeB attrd: [13930]: info: attrd_perform_update: Sent delete 109: node=nodeB, attr=fail-count-apache, id=<n/a>, set=(null), section=status
Nov 05 10:48:53 nodeB crmd: [13932]: info: config_query_callback: Checking for expired actions every 900000ms
Nov 05 10:48:53 nodeB crmd: [13932]: info: config_query_callback: Sending expected-votes=2 to corosync
Nov 05 10:48:53 nodeB crmd: [13932]: info: ais_dispatch: Membership 164: quorum retained
Nov 05 10:48:53 nodeB cib: [15838]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-6.raw
Nov 05 10:48:53 nodeB cib: [15838]: info: write_cib_contents: Wrote version 0.24.0 of the CIB to disk (digest: a2620e6ff70b79e8d8b07b6fe5f15238)
Nov 05 10:48:53 nodeB cib: [15838]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.FtBDrf (digest: /var/lib/heartbeat/crm/cib.dTUseV)
Nov 05 10:48:55 nodeB lrmd: [13929]: notice: lrmd_rsc_new(): No lrm_rprovider field in message
Nov 05 10:48:55 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=8:56:7:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_monitor_0 )
Nov 05 10:48:55 nodeB lrmd: [13929]: info: rsc:apache:80: probe
Nov 05 10:48:55 nodeB crmd: [13932]: info: process_lrm_event: LRM operation apache_monitor_0 (call=80, rc=0, cib-update=103, confirmed=true) ok
Nov 05 10:48:59 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=13:57:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_stop_0 )
Nov 05 10:48:59 nodeB lrmd: [13929]: info: rsc:apache:81: stop
Nov 05 10:48:59 nodeB lrmd: [15845]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:48:59 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) Stopping httpd:
Nov 05 10:48:59 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) [
Nov 05 10:48:59 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) OK
Nov 05 10:48:59 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) ]
Nov 05 10:48:59 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) ^M
Nov 05 10:48:59 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout)
Nov 05 10:48:59 nodeB crmd: [13932]: info: process_lrm_event: LRM operation apache_stop_0 (call=81, rc=0, cib-update=104, confirmed=true) ok
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=10:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=pacemaker_start_0 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:pacemaker:82: start
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation pacemaker_start_0 (call=82, rc=0, cib-update=105, confirmed=true) ok
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=11:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=pacemaker_monitor_10000 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:pacemaker:83: monitor
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=13:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=failover_start_0 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:failover:84: start
Nov 05 10:49:04 nodeB lrmd: [15866]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation pacemaker_monitor_10000 (call=83, rc=0, cib-update=106, confirmed=false) ok
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (failover:start:stderr) failover[15871]: start
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation failover_start_0 (call=84, rc=0, cib-update=107, confirmed=true) ok
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=14:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=failover_monitor_10000 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:failover:85: monitor
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=15:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_start_0 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:apache:86: start
Nov 05 10:49:04 nodeB lrmd: [15876]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation failover_monitor_10000 (call=85, rc=0, cib-update=108, confirmed=false) ok
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (apache:start:stdout) Starting httpd:
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (apache:start:stdout) [
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (apache:start:stdout) OK
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (apache:start:stdout) ]
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (apache:start:stdout) ^M
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (apache:start:stdout)
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation apache_start_0 (call=86, rc=0, cib-update=109, confirmed=true) ok
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=16:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_monitor_30000 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:apache:87: monitor
Nov 05 10:49:04 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=18:58:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=zabbix-server_start_0 )
Nov 05 10:49:04 nodeB lrmd: [13929]: info: rsc:zabbix-server:88: start
Nov 05 10:49:04 nodeB lrmd: [15890]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (zabbix-server:start:stdout) Starting Zabbix server:
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation apache_monitor_30000 (call=87, rc=7, cib-update=110, confirmed=false) not running
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_ais_dispatch: Update relayed from nodeA
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_local_callback: Expanded fail-count-apache=value++ to 1
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_trigger_update: Sending flush op to all hosts for: fail-count-apache (1)
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_perform_update: Sent update 115: fail-count-apache=1
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_ais_dispatch: Update relayed from nodeA
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_trigger_update: Sending flush op to all hosts for: last-failure-apache (1415184544)
Nov 05 10:49:04 nodeB attrd: [13930]: info: attrd_perform_update: Sent update 117: last-failure-apache=1415184544
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (zabbix-server:start:stdout) [
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (zabbix-server:start:stdout) OK
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (zabbix-server:start:stdout) ]
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (zabbix-server:start:stdout) ^M
Nov 05 10:49:04 nodeB lrmd: [13929]: info: RA output: (zabbix-server:start:stdout)
Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation zabbix-server_start_0 (call=88, rc=0, cib-update=111, confirmed=true) ok
Nov 05 10:49:06 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=13:59:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=zabbix-server_stop_0 )
Nov 05 10:49:06 nodeB lrmd: [13929]: info: rsc:zabbix-server:89: stop
Nov 05 10:49:06 nodeB lrmd: [15908]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:06 nodeB lrmd: [13929]: info: RA output: (zabbix-server:stop:stdout) Shutting down Zabbix server:
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (zabbix-server:stop:stdout) [
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (zabbix-server:stop:stdout) OK
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (zabbix-server:stop:stdout) ]
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (zabbix-server:stop:stdout) ^M
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (zabbix-server:stop:stdout)
Nov 05 10:49:08 nodeB crmd: [13932]: info: process_lrm_event: LRM operation zabbix-server_stop_0 (call=89, rc=0, cib-update=112, confirmed=true) ok
Nov 05 10:49:08 nodeB lrmd: [13929]: info: cancel_op: operation monitor[87] on lsb::httpd::apache for client 13932, its parameters: CRM_meta_on_fail=[restart] crm_feature_set=[3.0.1] CRM_meta_name=[monitor] CRM_meta_interval=[30000] CRM_meta_timeout=[60000] cancelled
Nov 05 10:49:08 nodeB crmd: [13932]: info: do_lrm_rsc_op: Performing key=4:59:0:5a1e1fe5-3401-43eb-bce2-505b51bdf2de op=apache_stop_0 )
Nov 05 10:49:08 nodeB lrmd: [13929]: info: rsc:apache:90: stop
Nov 05 10:49:08 nodeB crmd: [13932]: info: process_lrm_event: LRM operation apache_monitor_30000 (call=87, status=1, cib-update=0, confirmed=true) Cancelled
Nov 05 10:49:08 nodeB lrmd: [15918]: WARN: For LSB init script, no additional parameters are needed.
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) Stopping httpd:
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) [
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) OK
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) ]
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout) ^M
Nov 05 10:49:08 nodeB lrmd: [13929]: info: RA output: (apache:stop:stdout)
Nov 05 10:49:08 nodeB crmd: [13932]: info: process_lrm_event: LRM operation apache_stop_0 (call=90, rc=0, cib-update=113, confirmed=true) ok
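
Since the attached logs are long, it may help to pull out only the LRM operations that returned a non-zero rc. The following is a minimal sketch, not part of the original post: the function name `failed_operations` and the regular expression are hypothetical and written only against the crmd `process_lrm_event` line format seen in the logs above. On the nodeB excerpt it would surface `apache_monitor_30000` with rc=7 ("not running"), which the log records in the same second (10:49:04) as the successful `apache_start_0`.

```python
import re

# Assumption: matches only crmd "process_lrm_event" entries shaped like those
# in the attached logs. Whitespace is matched loosely so the pattern also works
# on text where several entries have been wrapped or joined together.
LRM_EVENT = re.compile(
    r"(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+(?P<node>\S+)\s+crmd:\s+\[\d+\]:\s+"
    r"info:\s+process_lrm_event:\s+LRM\s+operation\s+(?P<op>\S+)\s+"
    r"\(call=\d+,\s+rc=(?P<rc>\d+),"
)


def failed_operations(text):
    """Return (timestamp, node, operation, rc) for each LRM operation with rc != 0."""
    failures = []
    for m in LRM_EVENT.finditer(text):
        rc = int(m.group("rc"))
        if rc != 0:
            # Collapse any internal whitespace in the timestamp to single spaces.
            ts = " ".join(m.group("ts").split())
            failures.append((ts, m.group("node"), m.group("op"), rc))
    return failures


# Two entries copied from the nodeB log above.
sample = (
    "Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation "
    "apache_start_0 (call=86, rc=0, cib-update=109, confirmed=true) ok "
    "Nov 05 10:49:04 nodeB crmd: [13932]: info: process_lrm_event: LRM operation "
    "apache_monitor_30000 (call=87, rc=7, cib-update=110, confirmed=false) not running"
)

for failure in failed_operations(sample):
    print(failure)
```

Cancelled operations are logged with `status=1` instead of an `rc=` field, so this pattern skips them. To scan a whole file, the same function can be given the file's contents as one string.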