Troubleshooting: Cluster initialization problems in AWS/Azure
Cannot SSH to the first node of the cluster
SSH connection fails
- Double-check the IP address.
- If the connection is refused, wait until at least 2 minutes have passed since the VM was launched, then try again.
- If authentication fails, make sure that you connect as user fg@ and use the private key that corresponds to the public key provided when creating the cluster.
- In the security group settings, check that TCP port 22 is open for inbound connections from the system running the SSH client.
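To tell a refused connection apart from a silently dropped one before changing any settings, a quick TCP probe from the SSH client machine can help. The following is a minimal sketch, not part of the FlashGrid tooling; the function name `check_ssh_port` and the 5-second timeout are assumptions chosen for illustration.

```python
import socket


def check_ssh_port(host: str, port: int = 22, timeout: float = 5.0) -> str:
    """Classify TCP reachability of the SSH port.

    Returns one of: "open", "refused", "timeout", "error".
    """
    try:
        # A completed TCP handshake means something is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered but nothing is listening yet,
        # which is usual shortly after the VM is launched.
        return "refused"
    except socket.timeout:
        # No answer at all; often a sign that traffic is being filtered,
        # for example by a missing inbound rule for TCP port 22.
        return "timeout"
    except OSError:
        # Anything else: bad address, no route, name resolution failure.
        return "error"
```

Calling `check_ssh_port("<node IP address>")` from the machine running the SSH client returns "refused" in the case where waiting is the likely fix, and "timeout" in the case where the IP address or the security group rule is the more likely culprit. An "open" result with a failing login points to the user name or key instead.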
"Cluster initialization failed" message
After logging in to the first node, you see the "Cluster initialization failed" message.
- On the first node, open the /tmp/fg_setup.log file and check for an error message at the end of the file.
- If the DNS server check has failed, verify that the DNS server IP address you provided in the Cloud Provisioning tool is correct. A DNS server is required for downloading Oracle installation files. You can check whether DNS works by running nslookup flashgrid.io on the first node.
- If some other error is reported, contact FlashGrid support.
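The two checks above (reading the end of the log and testing DNS) can be sketched as a small script to run on the first node. This is an illustrative sketch only: the helper names `tail_lines` and `dns_resolves` are made up here, and `dns_resolves` goes through the system resolver rather than invoking nslookup, so it also honors entries in /etc/hosts.

```python
import socket
from collections import deque
from typing import List

LOG_PATH = "/tmp/fg_setup.log"  # log file named in the steps above


def tail_lines(path: str = LOG_PATH, count: int = 20) -> List[str]:
    """Return the last `count` lines of the file, without trailing newlines."""
    with open(path, errors="replace") as log:
        # deque with maxlen keeps only the most recent lines while reading.
        return [line.rstrip("\n") for line in deque(log, maxlen=count)]


def dns_resolves(hostname: str = "flashgrid.io") -> bool:
    """Return True if the system resolver can resolve `hostname`."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False
```

Printing `"\n".join(tail_lines())` shows the end of the setup log, and a `False` result from `dns_resolves()` suggests rechecking the DNS server IP address entered in the Cloud Provisioning tool.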
Cluster initialization not completing
More than 2 hours have passed since the cluster was created, but when logging in to the first node you still see the "Cluster initialization in progress" message.
On the first node, open the /tmp/fg_setup.log file and scroll to the bottom. If you see repeated SSH errors from attempts to connect to another node, the security group settings might be incorrect. If you are creating the cluster in an existing VPC/VNet, check that the corresponding security group has UDP ports 4801, 4802, and 4803 open between the group members.
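Instead of scrolling by hand, the recent part of the log can be filtered for lines that mention SSH. The sketch below is a rough heuristic under stated assumptions: the exact wording of the errors in /tmp/fg_setup.log is not documented here, so it simply matches the substring "ssh" case-insensitively, and both the function name `ssh_mentions` and the 200-line window are hypothetical choices.

```python
from collections import deque
from typing import List


def ssh_mentions(path: str = "/tmp/fg_setup.log", window: int = 200) -> List[str]:
    """Return the lines among the last `window` lines that mention SSH."""
    with open(path, errors="replace") as log:
        # Keep only the most recent `window` lines of the log.
        recent = deque(log, maxlen=window)
    # Assumption: relevant errors contain the text "ssh" in some letter case.
    return [line.rstrip("\n") for line in recent if "ssh" in line.lower()]
```

If `ssh_mentions()` returns many near-identical lines that name another node, that matches the repeated-error pattern described above and the security group rules are worth reviewing first.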