TrueSight Infrastructure Management (TSIM) cell will not start and/or there are many xact files present in the cell var directory
The StateBuilder (statbld process) creates an mcdb.lock file when it runs. It creates an mcdb.0 file from the existing mcdb file, then creates a new mcdb file from a combination of the data in the mcdb.0, xact and xact.n files. Each time StateBuilder fails to complete successfully, another xact.n file is created as a copy of the current xact file.

To check the consistency of the state files (mcdb and xact), run:

mlogchk -n <cell>

This will advise you on how to resolve any inconsistencies.

IMPORTANT* Get a directory listing of MCELL_HOME\var\<cell> and check whether the xact files form a continuous sequence from xact.1 -> xact.n

NOTE* mcdb.0 and mcdb.lock files only appear while StateBuilder is executing or after it has crashed unexpectedly.

A) If the xact file sequence is continuous (starting with xact.1), use this option:

- First CHECK the timestamps: xact.1 must be older than the mcdb.0 file !!

To make StateBuilder process all the transactions and events contained in the xact files, do the following:

1. Stop the cell from restarting
2. Make sure StateBuilder is not already running
3. Make a backup copy of the MCELL_HOME\var\<cell> directory
4. Remove the mcdb.0 file and the mcdb.lock file
5. Run StateBuilder manually: statbld -n <cell>
6. If successful, restart the cell

B) If the xact file sequence is not continuous (xact exists but there is no xact.1), use this option:

1. Stop the cell from restarting
2. Make sure StateBuilder is not already running
3. Make a backup copy of the MCELL_HOME\var\<cell> directory
4. Remove the mcdb file and the mcdb.lock file
5. Rename mcdb.0 to mcdb
6. Rename xact to xact.1
7. Run StateBuilder: statbld -n <cell>
8. The xact.n files should be deleted once StateBuilder has finished
9. Restart the cell

C) If you have too many xact.n files, it may take hours or days for StateBuilder to process them all.
If you are happy to lose all the events and transactions in the xact files, you can avoid this by restarting using only the data held in mcdb. With this method you will still keep your service model. To do this, do the following:

1. Stop the cell from restarting
2. Make sure StateBuilder is not already running
3. Make a backup copy of the MCELL_HOME\var\<cell> directory
4. Remove everything in the MCELL_HOME\var\<cell> directory EXCEPT the mcdb and xact.1 files !! Then rename xact.1 to xact. The xact.1 file should be the oldest of the xact.n files, and its timestamp should not differ much from that of the mcdb file.
5. Restart the cell

D) Another option is to clear all events from your cell. To do this, restart the cell using the -ie option, for example:

mcell -n <cell> -ie

==============================

You can preempt this issue by writing a rule to monitor the exit_status slot of the MC_CELL_STATBLD_STOP event. This slot contains the exit status of the last StateBuilder process. If the StateBuilder process exits because an mcdb.0 file was left behind, the slot will contain a non-zero exit status. A rule monitoring this slot can then send a notification to alert you. In that case you just need to remove the mcdb.0 and mcdb.lock files, provided no statbld.exe process is running at the time, and the next statbld.exe run will then complete successfully.

e.g.

refine statbld_failed : MC_CELL_STATBLD_STOP($SS)
    where [ $SS.exit_status != 0 ]
    {
        execute($SS, "mc_sendmail.cmd", ["bmc@bmc.com"], YES);
    }
END

Replace the email address with whichever address you need. Then add the rule to your .load file, recompile and restart the cell.

You will also need to run the script mc_setup_mail.cmd, which you will find under %MCELL_HOME%\etc\<cell>\kb\bin\w4. This script will prompt you for an SMTP server address and a login (usually your own email address).
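
The directory-listing check that decides between options A and B can also be scripted. The following is a minimal Python sketch, not part of the product: the function names (xact_numbers, is_continuous, suggest_option, suggest_for_directory) are hypothetical helpers introduced for illustration. It looks only at file names, so it does not replace mlogchk, and it does not perform the timestamp check required for option A or confirm that mcdb.0 exists for option B.

```python
import os
import re

# Matches numbered transaction files such as xact.1 or xact.12 (but not plain "xact").
XACT_N = re.compile(r"^xact\.(\d+)$")


def xact_numbers(names):
    """Return the sorted numeric suffixes of the xact.n files in a list of file names."""
    return sorted(int(m.group(1)) for m in map(XACT_N.match, names) if m)


def is_continuous(numbers):
    """True when the suffixes are exactly 1, 2, ..., n with no gaps."""
    return bool(numbers) and numbers == list(range(1, len(numbers) + 1))


def suggest_option(names):
    """Map a directory listing to option "A" or "B" from the article.

    Returns None when neither condition matches. This is only a hint:
    assumption: file names alone are inspected, timestamps are not checked.
    """
    numbers = xact_numbers(names)
    if is_continuous(numbers):
        return "A"  # continuous sequence starting with xact.1
    if "xact" in names and 1 not in numbers:
        return "B"  # xact exists but there is no xact.1
    return None


def suggest_for_directory(cell_var_dir):
    """Convenience wrapper: list the cell var directory and classify it."""
    return suggest_option(os.listdir(cell_var_dir))
```

Calling suggest_for_directory with the path of your MCELL_HOME\var\<cell> directory returns "A", "B" or None. Treat the result as a starting point and still take the backup copy before removing or renaming any file.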