The pcsd daemon is running on all the nodes that are supposed to be part of the cluster, but an attempt to authenticate them (generate tokens) fails with the following error.
[root@pcs1 ~]# pcs cluster auth cs1.internal cs2.internal cs3.internal -u hacluster
Password:
Error: Unable to communicate with cs1.internal
Error: Unable to communicate with cs3.internal
Error: Unable to communicate with cs2.internal
Why to authenticate?
The pcs daemon (pcsd) is responsible for keeping the corosync configuration files synchronized across all the nodes and for starting/stopping cluster services on them. Each node in the cluster must therefore be authorized to every other node; this is what allows one node to perform actions on the others.
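When authentication succeeds, pcsd stores the tokens it received from the peer nodes. A quick way to see whether a node already holds any tokens is to look at pcsd's token store; the path below is the stock location on RHEL/Oracle Linux 7 and is an assumption for other builds:
# /var/lib/pcsd/tokens is the default pcsd token store on EL7 (assumed path);
# an empty or missing file means no peer has been authenticated from this node yet.
cat /var/lib/pcsd/tokens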
Debug
[root@pcs1 ~]# pcs cluster auth cs1.internal cs2.internal cs3.internal -u hacluster --debug
XDG_SESSION_ID=54
_=/usr/sbin/pcs
http_proxy=http://proxy.test:80
https_proxy=http://proxy.test:80
--Debug Input Start--
{"username": "hacluster", "local": false, "nodes": {"cs1.internal": null, "cs3.internal": null, "cs2.internal": null}, "password": "hapassword", "force": false}
--Debug Input End--
Finished running: /usr/bin/ruby -I/usr/lib/pcsd/ /usr/lib/pcsd/pcsd-cli.rb auth
Return value: 0
--Debug Stdout Start--
{
  "status": "ok",
  "data": {
    "auth_responses": {
      "cs3.internal": {
        "status": "noresponse"
      },
      "cs2.internal": {
        "status": "noresponse"
      },
      "cs1.internal": {
        "status": "noresponse"
      }
    },
    "sync_successful": true,
    "sync_nodes_err": [
    ],
    "sync_responses": {
    }
  },
.....
"I, [2018-05-29T16:54:06.303538 #13362] INFO -- : SRWT Node: cs1.internal Request: check_auth\n",
"E, [2018-05-29T16:54:06.303538 #13362] ERROR -- : Unable to connect to node cs1.internal, no token available\n",
"I, [2018-05-29T16:54:06.303538 #13362] INFO -- : SRWT Node: cs3.internal Request: check_auth\n",
"E, [2018-05-29T16:54:06.303538 #13362] ERROR -- : Unable to connect to node cs3.internal, no token available\n",
"I, [2018-05-29T16:54:06.303579 #13362] INFO -- : SRWT Node: cs2.internal Request: check_auth\n",
"E, [2018-05-29T16:54:06.303628 #13362] ERROR -- : Unable to connect to node cs2.internal, no token available\n"
The error message indicates a connection failure to the cluster nodes, so let's go through the common blocking factors.
- The firewall is running, but the high-availability service is allowed (this includes the pcsd, Corosync and Pacemaker ports).
[root@pcs1 ~]# firewall-cmd --state
running
[root@pcs1 ~]#
[root@pcs1 ~]# firewall-cmd --zone=public --list-services
dhcpv6-client ssh high-availability
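If the high-availability service had not been in that list, the standard firewalld commands below would allow it (a sketch only; adjust the zone if your nodes are not in the default/public zone):
# The predefined high-availability service covers pcsd (2224/tcp) plus the
# corosync/pacemaker ports; --permanent writes the rule, --reload activates it.
firewall-cmd --permanent --add-service=high-availability
firewall-cmd --reload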
- PCSD daemon status on all the nodes.
[root@pcs1 ~]# systemctl status pcsd
● pcsd.service - PCS GUI and remote configuration interface
Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
Active: active (running) since Sun 2018-05-27 08:53:25 IST; 2 days ago
Docs: man:pcsd(8)
man:pcs(8)
Main PID: 612 (pcsd)
CGroup: /system.slice/pcsd.service
└─612 /usr/bin/ruby /usr/lib/pcsd/pcsd > /dev/null &
May 27 08:53:05 pcs1.tux systemd[1]: Starting PCS GUI and remote configuration interface...
May 27 08:53:25 pcs1.tux systemd[1]: Started PCS GUI and remote configuration interface.
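Had pcsd not been active, it would need to be enabled and started on every node before authentication could work (plain systemd commands, run on each node):
# pcs cluster auth talks to pcsd on every node, so each node must have it running.
systemctl enable pcsd
systemctl start pcsd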
- pcsd is listening on port 2224.
[root@pcs1 ~]# lsof -Pi :2224
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
pcsd 612 root 7u IPv4 16876 0t0 TCP *:2224 (LISTEN)
Note: IPv6 is disabled on this system, so pcsd binds to IPv4 by default; otherwise it would show up as an IPv6 listener.
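Listening locally is not the same as being reachable from the peers, so it is also worth probing port 2224 from one of the other nodes. A simple curl probe is enough, since pcsd serves HTTPS on that port (hostnames as in the examples above):
# Run from cs2.internal or cs3.internal; -k skips certificate verification.
# Any HTTP status code coming back proves the TCP path to pcsd is open.
curl -k -s -o /dev/null -w '%{http_code}\n' https://cs1.internal:2224/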
What else? Most of the usual blocking factors check out. That made me think about other network factors, and then I noticed the proxy lines in the debug output.
Yes, I had exported the proxy variables in the .bashrc profile, and pcsd didn't like that!
[root@pcs1 ~]# echo $http_proxy
http://proxy.test:80
[root@pcs1 ~]# echo $https_proxy
http://proxy.test:80
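A quick way to see the effect, assuming curl is available: with these variables exported, an HTTPS client tries to tunnel through proxy.test:80 instead of talking to the node directly, which is exactly what happens to the pcs authentication traffic:
# With https_proxy set, the verbose output shows a CONNECT to proxy.test:80;
# with the proxy bypassed, it connects straight to the node on port 2224.
curl -kv https://cs2.internal:2224/ 2>&1 | grep -i 'connect'
curl -kv --noproxy '*' https://cs2.internal:2224/ 2>&1 | grep -i 'connect'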
Further research revealed a bug, and it looks like a regression. Let's leave that to the OS vendor (Red Hat/Oracle?!): https://bugzilla.redhat.com/show_bug.cgi?id=1315627
Workaround
Unset the environment variables.
[root@pcs1 ~]# unset http_proxy;unset https_proxy
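unset only fixes the current shell; since the variables were exported from the .bashrc profile, the export lines should also be removed or commented out there, otherwise the problem returns at the next login. Locating them first (the profile.d path is simply another common place such exports live):
# Show where the proxy variables are defined so they can be removed or commented out.
grep -n -i 'proxy' ~/.bashrc /etc/profile.d/*.sh 2>/dev/null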
What next?
Try to authenticate.
[root@pcs1 ~]# pcs cluster auth cs1.internal cs2.internal cs3.internal -u hacluster
The above dissertation applies to Oracle Linux 7 and Red Hat Enterprise Linux 7.