The PATROL Agent error log shows the connection with the Integration Service being established and lost repeatedly:

Wed Feb 01 18:42:29 2017: ID 1021fd: I: Stats Streaming Restart : Success
Wed Feb 01 18:42:47 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183
Wed Feb 01 18:47:02 2017: ID 1021fb: W: Connection with Integration Service ISN with port 3183 lost
Wed Feb 01 18:47:37 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183
Wed Feb 01 18:48:48 2017: ID 1021fb: W: Connection with Integration Service ISN with port 3183 lost
Wed Feb 01 18:49:23 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183
Wed Feb 01 18:50:32 2017: ID 1021fb: W: Connection with Integration Service ISN with port 3183 lost
Wed Feb 01 18:51:42 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183
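To quantify how often the connection drops, the log entries shown above can be paired up programmatically. The following is a minimal sketch, not a BMC-provided tool: it assumes the timestamp layout and the message IDs (1021fa for "established", 1021fb for "lost") seen in the sample, and the function name `connection_sessions` is hypothetical.

```python
import re
from datetime import datetime

# Assumed line layout, taken from the sample log:
# "Wed Feb 01 18:47:02 2017: ID 1021fb: W: Connection with ... lost"
LINE_RE = re.compile(
    r"(?P<ts>\w{3} \w{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4}): "
    r"ID (?P<id>[0-9a-f]+): (?P<sev>[A-Z]): (?P<msg>.*)"
)

ESTABLISHED_ID = "1021fa"  # "Connection established" in the sample log
LOST_ID = "1021fb"         # "Connection ... lost" in the sample log


def connection_sessions(lines):
    """Return (established, lost, seconds_up) for each established/lost pair."""
    sessions = []
    up_since = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
        if match.group("id") == ESTABLISHED_ID:
            up_since = ts
        elif match.group("id") == LOST_ID and up_since is not None:
            sessions.append((up_since, ts, (ts - up_since).total_seconds()))
            up_since = None
    return sessions


SAMPLE = [
    "Wed Feb 01 18:42:29 2017: ID 1021fd: I: Stats Streaming Restart : Success",
    "Wed Feb 01 18:42:47 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183",
    "Wed Feb 01 18:47:02 2017: ID 1021fb: W: Connection with Integration Service ISN with port 3183 lost",
    "Wed Feb 01 18:47:37 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183",
    "Wed Feb 01 18:48:48 2017: ID 1021fb: W: Connection with Integration Service ISN with port 3183 lost",
    "Wed Feb 01 18:49:23 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183",
    "Wed Feb 01 18:50:32 2017: ID 1021fb: W: Connection with Integration Service ISN with port 3183 lost",
    "Wed Feb 01 18:51:42 2017: ID 1021fa: I: Connection established with Integration Service ISN on port 3183",
]

for established, lost, seconds_up in connection_sessions(SAMPLE):
    print(f"up {established:%H:%M:%S} -> lost {lost:%H:%M:%S} ({seconds_up:.0f} s)")
```

Short, similar session lengths across many entries point to a recurring disconnect rather than a one-off event. The final "established" entry in the sample has no matching "lost" entry, so it is not reported as a completed session.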
First check whether the disconnections affect individual PATROL Agents or several PATROL Agents disconnecting from the ISN at the same time.

If only individual PATROL Agents are affected, the PATROL Agent may be hit by a CPU spike that causes it to disconnect from the Integration Service Node (ISN); the connection then recovers on its own. Enable process monitoring on the PATROL Agent servers to determine which process is consuming the resources. If the CPU spikes are considered normal and expected, apply one of the concepts described at this link to delay the alert: https://communities.bmc.com/ideas/8607

If the problem is observed with several PATROL Agents at the same time, the ISN might be overloaded or there might be network problems (particularly if the ISN is on a different subnet from the affected PATROL Agents). Running a continuous ping from the PATROL Agent to the Integration Service will confirm whether any network packets are being dropped.

For the ISN overload scenario, compare the MaxHeap value for the ISN (e.g. .\custom\conf\pnagent.conf) with the actual amount of memory used by the ISN (e.g. Task Manager on Windows, top on Linux).

Load Balancers can cause similar behavior. As a test, connect a few of the PATROL Agents directly to an ISN, rather than through a Load Balancer, to see whether they still disconnect.
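The distinction between a single agent disconnecting and several agents disconnecting together can be checked by comparing "connection lost" timestamps across agents. The sketch below is illustrative only: the function name `agents_disconnecting_together`, the agent names, the 30-second window, and the timestamps for agent-b and agent-c are all assumptions; only the agent-a timestamps come from the sample log.

```python
from datetime import datetime, timedelta


def agents_disconnecting_together(lost_by_agent, window=timedelta(seconds=30)):
    """Return the agents that lost their ISN connection within `window`
    of a disconnect on a different agent.

    lost_by_agent maps an agent name to a list of "connection lost" datetimes.
    The 30-second default window is an arbitrary assumption.
    """
    events = sorted(
        (ts, agent) for agent, stamps in lost_by_agent.items() for ts in stamps
    )
    together = set()
    for i, (ts, agent) in enumerate(events):
        for other_ts, other_agent in events[i + 1:]:
            if other_ts - ts > window:
                break  # events are sorted, so later ones are further away
            if other_agent != agent:
                together.update((agent, other_agent))
    return together


# agent-a uses the "lost" timestamps from the sample log;
# agent-b and agent-c are hypothetical data for illustration.
EXAMPLE = {
    "agent-a": [
        datetime(2017, 2, 1, 18, 47, 2),
        datetime(2017, 2, 1, 18, 48, 48),
        datetime(2017, 2, 1, 18, 50, 32),
    ],
    "agent-b": [datetime(2017, 2, 1, 18, 47, 10)],
    "agent-c": [datetime(2017, 2, 1, 19, 30, 0)],
}

print(sorted(agents_disconnecting_together(EXAMPLE)))
```

An empty result suggests the agent-side (CPU) scenario; a result that includes many agents points toward the ISN, the network, or a Load Balancer.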