List of logs required to troubleshoot reconciliation engine issues:
1. Debug-level reconciliation engine process log [arrecond.log] and reconciliation job logs.
2. From the Remedy side: arerror.log and armonitor.log.
3. AR API+SQL log captured in a single file, starting from the time the issue occurred. Note exactly when the issue occurred, because these logs can be huge and the timestamp makes them easier to read.
4. If there is a crash or hang, crash/hang dumps may be required. On Windows, ADPlus can be used to capture dumps (see "Configuring ADPlus for capturing arrecond.exe crash/hang dumps"). On non-Windows systems, the OS itself generates the core dump for the arrecond process.
5. If the issue occurs on a server group, try to isolate the problem to just one server for easier logging. Otherwise, you will have to collect the logs from all the servers.
Logging
Change the log level from Configuration Manager dashboard > Configuration > Core Configuration > Reconciliation. Select the server for RE in the server drop-down list and go to the logs section. By default, the logging level is Error. For debug logs:
1. Make sure Logging Level = Debug.
2. Restart only the arrecond process. To do this, you can kill the process; ARMonitor will start it again.
3. Usually these logs are located in %Arsystem home%/db/re on Windows and in /opt/bmc/ARSystem/db/re on non-Windows OS.
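Once debug logging is on, the files listed earlier usually have to be gathered from more than one directory. The following is only a minimal Python sketch of that collection step, not a BMC tool: `collect_logs` is a hypothetical helper, and the directories and file-name patterns passed to it are assumptions that must be adjusted to the actual installation.

```python
import fnmatch
import zipfile
from pathlib import Path


def collect_logs(log_dirs, patterns, archive_path):
    """Zip every file in log_dirs whose name matches one of the patterns.

    Hypothetical helper: returns the archive member names that were added.
    """
    added = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for directory in map(Path, log_dirs):
            if not directory.is_dir():
                continue  # skip locations that do not exist on this host
            for path in sorted(directory.iterdir()):
                if path.is_file() and any(
                    fnmatch.fnmatch(path.name, p) for p in patterns
                ):
                    # Prefix with the directory name so that files with the
                    # same name from different directories do not collide.
                    arcname = f"{directory.name}/{path.name}"
                    zf.write(path, arcname)
                    added.append(arcname)
    return added
```

For example, `collect_logs(["/opt/bmc/ARSystem/db/re", "/opt/bmc/ARSystem/db"], ["*.log"], "recon_logs.zip")` would bundle every .log file found directly in those two directories. Write the archive outside the scanned directories.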


To troubleshoot recon engine issues, along with the debug-level recon engine process log [arrecond.log] and recon engine job logs, we also need arerror.log, armonitor.log, and the AR server API+SQL log captured in a single file for the duration of the recon job run.
1. Jobs are not getting started/not running/queued
Search the “RE:Job_Runs” form to find the status of the job. If the job is in a queued state:
- Make sure that the recon job is triggered only after the recon engine process is completely up and running. Armonitor.log and arrecond.log indicate whether the recon engine process [arrecond.exe/.sh] has started and is fully up. If the job is triggered before the service is up, the job goes into the queued state.
- Check the entries in the RE:Job_Runs form with the status “Queued”. Delete these queued entries and then trigger the job.
- Search the “Application pending” form and check whether there are entries with the “TriggerJob” command. These trigger entries are supposed to be picked up by the AR Dispatcher and delivered to the RE process. If they are stuck there, it could be for multiple reasons.
To investigate stuck trigger entries:
1. Check whether the AR Dispatcher is running. When the recon job is triggered, an entry is created in the “Application pending” form. This entry is then processed by the AR Dispatcher and forwarded to the reconciliation engine.
2. If the AR Dispatcher goes down, the entry in “Application pending” will not be picked up by the recon engine until the polling interval [defined in reconciliation settings] has elapsed. This interval is 5 hours OOTB.
3. If entries remain in the “Application pending” form, download the ardispatcher log from /opt/bmc/ARSystem/db, or from the db directory under the BMC software installation path.
4. If the log shows the details below, the issue is not with the dispatcher; RE has to be restarted and the RE logs need to be checked.
Dispatch Encountered XXXX commands in the category Reconciliation
Dispatch Waking server Reconciliation Engine
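The dispatcher log check described above can be sketched in Python as follows. This is an illustration under assumptions: the two patterns come from the messages quoted above, but the surrounding layout of a real dispatcher log line (timestamps, prefixes) is not known here, and `dispatcher_handed_off` is a hypothetical name.

```python
import re

# Built from the two messages quoted in the text. XXXX is assumed to be a number.
ENCOUNTERED = re.compile(
    r"Encountered\s+(\d+)\s+commands in the category Reconciliation"
)
WAKING = "Waking server Reconciliation Engine"


def dispatcher_handed_off(log_text):
    """Return True if the dispatcher saw Reconciliation commands and then
    woke the Reconciliation Engine, which points at RE rather than the
    dispatcher as the place to investigate."""
    saw_commands = False
    for line in log_text.splitlines():
        if ENCOUNTERED.search(line):
            saw_commands = True
        elif saw_commands and WAKING in line:
            return True
    return False
```

If the function returns False for the period in which the job was triggered, the dispatcher itself is the first thing to check.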
If more help is needed in troubleshooting the AR Dispatcher, please contact support and log an AR Server case.
NOTE: In a single-server setup, ensure that reconciliation is ranked 1 in the Server Group Operation Ranking. If RE is not ranked in a single-server setup, it won't pick up the TriggerJob “Application pending” record and the RE job will stay in a queued status.
2. CIs are not getting processed by the recon engine [identified/merged]
If you think that some CIs are not reconciled, check the following:
- Whether the recon job was executed.
- If yes, whether there was any error. In case of an error, check the relevant job log file.
- If there were no errors, whether any qualification is used in the activities of the recon job. If so, check whether the CI in question satisfies the qualification.
- Also, capture the AR API+SQL logs in a single file for the duration of the recon job run. These logs, when mapped against the recon debug job log, help in understanding the issue.
- Check the recon job log file for more details about job processing.
During ID activity, the recon engine processes all the CIs from the source dataset that have Recon ID = 0 and satisfy the qualification mentioned in the ID activity. During merge activity, the recon engine processes all the CIs from the source dataset that have Recon ID != 0, have the Ready to Merge flag (ReconciliationMergeStatus) set if the merge order is set to 'by class in a separate transaction', and satisfy the qualification mentioned in the merge activity.
3. How to know how many CIs and which CIs will qualify for Identification Activity
The recon engine picks up all the CIs that have reconciliationID as 0 for identification activity. If any qualification is used in the identification activity, it is applied to all the CIs that have reconciliationID as 0. So, to know which CIs will be processed in identification, search the source dataset for CIs with reconID as 0. Use the following queries to find the total number of CIs that will be identified.
- select schemaid from arschema where name = 'BMC.CORE:BMC_BaseElement';
- select count(*) from T[result from above query] where C400127400 = 'Source Data set ID' and C400129200 = 0;
- select schemaid from arschema where name = 'BMC.CORE:BMC_BaseRelationship';
- select count(*) from T[result from above query] where C400127400 = 'Source Data set ID' and C400129200 = 0;
- Add results from 2nd and 4th queries to get the total number of CIs that will be identified.
You can use the above queries to follow the processing even after the identification activity has started. In that case, the queries give the number of CIs that are yet to be processed in identification.
Note: If any qualification is used in the identification activity, append a clause with that qualification to the 2nd and 4th queries when calculating the number.
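When the counts are checked repeatedly, it can help to generate the 2nd and 4th queries instead of editing them by hand. The Python sketch below only builds the SQL text and does not connect to any database. `identification_count_query` is a hypothetical helper, and the optional qualification is assumed to be already written as a SQL clause over the table's column IDs.

```python
def identification_count_query(schema_id, dataset_id, qualification=None):
    """Build the count query for CIs still waiting for identification.

    schema_id     -- schemaid returned by the arschema lookup (1st or 3rd query)
    dataset_id    -- source dataset ID
    qualification -- optional extra SQL clause (assumed to be in column-ID form)
    """
    if not isinstance(schema_id, int):
        raise TypeError("schema_id must be the integer schemaid from arschema")
    safe_dataset = dataset_id.replace("'", "''")  # escape single quotes
    sql = (
        f"select count(*) from T{schema_id} "
        f"where C400127400 = '{safe_dataset}' and C400129200 = 0"
    )
    if qualification:
        sql += f" and ({qualification})"
    return sql + ";"
```

Calling it once with the BMC_BaseElement schemaid and once with the BMC_BaseRelationship schemaid gives the two count queries whose results are added together.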
4. How to know which and how many CIs will qualify for merge activity
Up to version 7.6 patch 2, the recon engine picks up all the CIs that were modified after the last recon job run [based on the modified date of the CI]. From 7.6.03 onwards, the flag “ReconciliationEngineMergeStatus” is maintained by the CMDB engine to track changes in CIs. For any change in a CI, the CMDB engine resets this flag to “ReadyToMerge”. The recon engine picks up all the CIs that have “ReconciliationEngineMergeStatus” as “ReadyToMerge”. When a CI is merged to the target dataset, the recon engine sets this flag to “MergeDone”. Use the following queries to find the total number of CIs that will get merged [before starting the merge activity].
- select schemaid from arschema where name = 'BMC.CORE:BMC_BaseElement';
- select count(*) from T[result from above query] where C530060100 = 40 and C400127400 = 'Source Data set ID' and C400129200 != 0;
- select schemaid from arschema where name = 'BMC.CORE:BMC_BaseRelationship';
- select count(*) from T[result from above query] where C530060100 = 40 and C400127400 = 'Source Data set ID' and C400129200 != 0;
- Add results from the 2nd and 4th queries to get the total number of CIs that will get merged.
If the merge activity has already started, you can still use the above queries. In that case, the queries give the number of CIs that are yet to be processed in the merge.
Note: If any qualification is used in the merge activity, append a clause with that qualification to the 2nd and 4th queries when calculating the number.
5. How to exclude some CIs during Identification/merge activity
To exclude CIs during any activity in the recon engine, we can set qualifications in the activity. A qualification is built on a class basis, e.g. BMC_ComputerSystem with name = %abc%. If you want to set qualifications across classes, put the qualification on the BMC_BaseElement class. For example:
1. If the user wants to process ALL types [classes] of CIs but, within computer system CIs, wants to process only desktops, then the qualification should be:
On BaseElement class where 1=1 (to include all CIs)
On ComputerSystem class where Type = Desktop
NOTE: When you include a qualification on a class and you want all other classes to participate, you need to include BaseElement. Also, remember that qualifications apply to the current class and its subclasses.
2. If the user wants to process only computer system CIs out of all the CIs, that is, exclude all other classes, then the qualification should be on the BMC_BaseElement class as classID = BMC_ComputerSystem.
Note: The recon engine implicitly identifies the related child CIs and the child relationships between them. This means that in the first example, along with the Desktop CIs, all their related child CIs and the child relationships between them will also be identified.
6. Getting multi-match or “dataset ID and reconciliationID combination is not unique in <dataset>” errors during Identification Activity
See Troubleshooting Reconciliation based on error messages.
7. Duplicate CIs
We first need to define clearly how we decide whether two CIs are duplicates of each other. For example, if the serial number is the same for two CIs and we say that those two are the same, then the duplication is on the serial number attribute. So, to avoid duplication, we should use the serial number in the identification rule for that class as first priority. This way the recon engine will consider the two CIs to be the same and will assign them the same recon ID. If you already have CIs that are identified and duplicated, change the identification rule first, then reset the recon ID for all duplicated CIs, remove those CIs from the asset dataset, and re-run the recon job.
Duplicates based on Recon ID: The CMDB engine has a check that prevents any application/user from creating a CI with a duplicate Recon ID in the same dataset. While creating a CI, it checks whether the dataset ID and Recon ID combination is unique. So duplicates on ReconID are possible only if the CMDB is bypassed in one of the operations, for example if CIs are imported directly into the ASSET dataset using some tool, or if an operation such as import or modify is performed directly in the database. We have also observed cases where two recon engines were running in parallel and causing duplicate CIs. In a server group, the recon engine should run on only one server. The ar.cfg file has an option, reconciliation-engine-suspended: T or F. This option should be F on only one server and T on the rest of the servers.
The following database queries show whether there are duplicate CIs based on ReconID in the ASSET dataset.
- select schemaid from arschema where name like 'BMC.CORE:BMC_BaseElement';
Use the result of the above query in place of 471 in the following query:
- select C179, C200000020 from T471 where C400127400 = 'BMC.ASSET' and C400129200 in (select C400129200 from T471 where C400127400 = 'BMC.ASSET' group by C400129200 having count(*) > 1) order by C3;
This query will list all the CIs which are duplicated on ReconId in a dataset.
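To see how the duplicate check behaves, the query can be tried against a throwaway table. The Python sketch below uses an in-memory SQLite database as a stand-in for the real AR System database. The table layout, column types, and all row values are made up for illustration, and the ORDER BY clause is placed last, as SQL requires.

```python
import sqlite3

# Duplicate-ReconID query; T471 stands for the table found via arschema.
DUPLICATES_SQL = """
select C179, C200000020 from T471
where C400127400 = 'BMC.ASSET'
  and C400129200 in (
      select C400129200 from T471
      where C400127400 = 'BMC.ASSET'
      group by C400129200 having count(*) > 1)
order by C3
"""


def demo_duplicates():
    """Run the duplicate query on a tiny made-up data set."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table T471 (C3 integer, C179 text, C200000020 text, "
        "C400127400 text, C400129200 text)"
    )
    rows = [
        (1, "OI-1", "host-a", "BMC.ASSET", "RE-100"),
        (2, "OI-2", "host-a-copy", "BMC.ASSET", "RE-100"),  # same ReconID, same dataset
        (3, "OI-3", "host-b", "BMC.ASSET", "RE-200"),
        (4, "OI-4", "host-c", "BMC.SAMPLE", "RE-100"),  # same ReconID, other dataset
    ]
    con.executemany("insert into T471 values (?, ?, ?, ?, ?)", rows)
    result = con.execute(DUPLICATES_SQL).fetchall()
    con.close()
    return result
```

In this sample only the first two rows are reported: the fourth row shares the ReconID but sits in a different dataset, so the dataset ID and ReconID combination is still unique for it.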
8. Getting warnings like “Skipping instance during merge/Identification”
This warning indicates that some CIs were skipped during reconciliation. The usual reason for skipping is that the parent CI is not reconciled. For weak CIs, it is mandatory to reconcile the parent CI first. In the example below, the log snippet shows that when the computer system CI was not identified due to data issues, the processing of the sub-tree was canceled.
[2014/09/25 04:23:30.1460] [ INFO ] [TID: 0000003048] : Cancelled the processing of rest of the sub-tree for instance of class = BMC.CORE:BMC_ComputerSystem and instance id = OI-4e4b0bcfc3fa4d3ea37e534d34a819eb
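In a large job log it can be useful to list every parent CI whose sub-tree was cancelled, since those are the CIs to fix first. The Python sketch below is modeled only on the single log line quoted above. Other log lines may be laid out differently, and `cancelled_subtrees` is a hypothetical helper name.

```python
import re

# Pattern modeled on the one sample line in the text.
CANCELLED = re.compile(
    r"\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s*"
    r"\[\s*(?P<level>\w+)\s*\]\s*"
    r"\[TID:\s*(?P<tid>\d+)\]\s*:\s*"
    r"Cancelled the processing of rest of the sub-tree for instance of "
    r"class = (?P<cls>\S+) and instance id = (?P<instance_id>\S+)"
)


def cancelled_subtrees(log_text):
    """Return (class name, instance id) pairs for every cancelled sub-tree."""
    return [(m["cls"], m["instance_id"]) for m in CANCELLED.finditer(log_text)]
```

The returned instance IDs can then be looked up in the source dataset to find out why the parent CI was not identified.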
9. Reconciliation engine performance issues
See: Reconciliation Best Practices and Improving Reconciliation Performance.
For reconciliation performance issues, the best approach is to start with a data study followed by data cleanup, which includes purging MAD CIs and deleting very old CIs. You can do this on the basis of different qualification rules, such as class-wise or year-wise, and make sure you have backups ready beforehand.
10. Some of the CIs are not considered for Identification in customized class
- CMDB does not work on CIs that exist only in the BaseElement class, because it is a root class. CIs must exist in actual classes derived from BaseElement. Applications built on CMDB, such as Reconciliation and Normalization, work only on classes derived from BaseElement and BaseRelationship.
- If CIs are present in BMC.CORE:BMC_BaseElement but the corresponding records are missing from "BMC.CORE:BMC_Test_" (regular form) and BMC.CORE:BMC_Tes (join), the CIs are orphan records and are not eligible for reconciliation.
See also BEST FAQ on BMC CMDB Suite - Reconciliation