BMC Helix Service Monitoring (AIOPs) OnPrem: Deployment manager failing during upgrade from 21.3.03.001 to 22.2.01 for prometheus-entity-reconciler-service with "Error creating: pods "prometheus-entity-reconciler" is forbidden: exceeded quota"

The following errors are observed in "prometheus-entity-reconciler-service" replica set events:

Command to check logs:

kubectl describe replicaset <prometheus-entity-reconciler-service-name> -n <ade-platform-namespace>

replicaset-controller Error creating: pods "prometheus-entity-reconciler-service-6c4d78b695-5qrbm" is forbidden: exceeded quota: compute-resources-bmc, requested: limits.cpu=4,limits.memory=12Gi, used: limits.cpu=327412m,limits.memory=658382Mi, limited: limits.cpu=328250m,limits.memory=660658Mi

Warning FailedCreate replicaset-controller Error creating: pods "prometheus-entity-reconciler-service-6c4d78b695-pb76q" is forbidden: exceeded quota: compute-resources-bmc, requested: limits.cpu=4,limits.memory=12Gi, used: limits.
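To read the numbers in the event, compare "used" plus "requested" against "limited" for each resource. The following is a minimal Python sketch of that arithmetic, not part of the product. The helper names (`cpu_to_millicores`, `memory_to_mi`, `parse_section`, `quota_shortfall`) are hypothetical, and the parser only handles the quantity suffixes that appear in this message.

```python
import re

# The FailedCreate message from the replica set events (whitespace cleaned up).
MESSAGE = (
    'pods "prometheus-entity-reconciler-service-6c4d78b695-5qrbm" is forbidden: '
    "exceeded quota: compute-resources-bmc, "
    "requested: limits.cpu=4,limits.memory=12Gi, "
    "used: limits.cpu=327412m,limits.memory=658382Mi, "
    "limited: limits.cpu=328250m,limits.memory=660658Mi"
)

# Binary suffixes of the Kubernetes quantity notation, expressed in Mi.
# Only the suffixes needed here are handled; this is not a full parser.
_MI_PER_UNIT = {"Mi": 1, "Gi": 1024, "Ti": 1024 * 1024}


def cpu_to_millicores(quantity):
    """Convert a CPU quantity such as '4' or '327412m' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_to_mi(quantity):
    """Convert a memory quantity such as '12Gi' or '658382Mi' to Mi."""
    suffix = quantity[-2:]
    if suffix not in _MI_PER_UNIT:
        raise ValueError("unsupported memory suffix in " + quantity)
    return int(quantity[:-2]) * _MI_PER_UNIT[suffix]


def parse_section(message, section):
    """Return (cpu_millicores, memory_mi) for 'requested', 'used' or 'limited'."""
    pattern = section + r": limits\.cpu=([^,\s]+),limits\.memory=([^,\s]+)"
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("section not found: " + section)
    return cpu_to_millicores(match.group(1)), memory_to_mi(match.group(2))


def quota_shortfall(message):
    """Return how far 'used + requested' exceeds 'limited' for each resource."""
    req_cpu, req_mem = parse_section(message, "requested")
    used_cpu, used_mem = parse_section(message, "used")
    lim_cpu, lim_mem = parse_section(message, "limited")
    return {
        "cpu_millicores": max(0, used_cpu + req_cpu - lim_cpu),
        "memory_mi": max(0, used_mem + req_mem - lim_mem),
    }


if __name__ == "__main__":
    print(quota_shortfall(MESSAGE))
```

For the event above, the quota has only 838m of CPU and 2276Mi of memory left, while the pod asks for 4000m and 12288Mi. The shortfall is therefore 3162m of CPU and 10012Mi of memory for this one pod.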
The issue is caused by insufficient resources for the ade-platform-namespace namespace on the cluster. In the events above, the compute-resources-bmc quota is almost fully used (limits.cpu 327412m of 328250m, limits.memory 658382Mi of 660658Mi), so the pod's request of limits.cpu=4 and limits.memory=12Gi does not fit. Because the namespace does not have the required resources (CPU cores and RAM), the prometheus-entity-reconciler-service pod cannot be created.

Ensure that the cluster has enough resources available to allocate the required CPU cores and RAM to the namespace in which BMC Helix Service Monitoring (AIOPs) on-prem is deployed (ade-platform-namespace).

Refer to the "Namespace resource requirements" documentation and update the namespace resources accordingly. To update the resource quota, use "kubectl edit resourcequota".

Once the resource requirements are met, perform the upgrade again.

If the issue persists, open a support case and share the "helix-on-prem-deployment-manager/logs/deployment.log" file.
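As a non-interactive alternative to "kubectl edit resourcequota", the quota could also be raised with "kubectl patch". The Python sketch below only builds and prints such a command; it does not run it. The function name `build_quota_patch_command` is hypothetical, and the example limits are placeholders equal to "used" plus "requested" from the event above, which is an assumption and only the bare minimum for this one pod. The values to apply should come from the "Namespace resource requirements" documentation.

```python
import json
import shlex


def build_quota_patch_command(namespace, quota_name, cpu_limit, memory_limit):
    """Build a 'kubectl patch' argument list that sets limits.cpu and limits.memory."""
    patch = {
        "spec": {
            "hard": {
                "limits.cpu": cpu_limit,
                "limits.memory": memory_limit,
            }
        }
    }
    return [
        "kubectl", "patch", "resourcequota", quota_name,
        "-n", namespace,
        "--type", "merge",
        "-p", json.dumps(patch),
    ]


if __name__ == "__main__":
    # Placeholder values (assumption): replace the namespace and limits
    # with the ones that apply to your deployment.
    command = build_quota_patch_command(
        namespace="<ade-platform-namespace>",
        quota_name="compute-resources-bmc",
        cpu_limit="331412m",
        memory_limit="670670Mi",
    )
    # Print the command for review instead of executing it.
    print(shlex.join(command))
```

Printing the command rather than executing it keeps the change reviewable before it touches the cluster. After the quota is updated, "kubectl describe resourcequota compute-resources-bmc -n <ade-platform-namespace>" shows the new limits alongside current usage.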