October 21, 2010

Process Monitoring in OMU

Use the following script to monitor a process on a managed node from OMU.

To use the script, create a measurement threshold policy and enter the program name as: "script_name" "monitor_name" "process_name"

The script reports 1 when the process is not running and 0 when it is running, so set the threshold to 1 for the not-running condition and 0 for the running condition.

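#!/bin/sh
### ENV ###
MON_NAME=$1                        # monitor name, as defined in the policy
PROCESS=$2                         # string to look for in the process list
OPCMON="/opt/OV/bin/OpC/opcmon"
CMD="/usr/bin/ps"
PROGNAME=`basename $0`
### ENV ###

# Both arguments are required.
if [ -z "$MON_NAME" ] || [ -z "$PROCESS" ]
then
echo "usage: $PROGNAME monitor_name process_name" >&2
exit 2
fi

############### Check whether the process is running ###############
# Count the ps lines that mention the process, excluding the grep itself.
CMD_OUTPUT=`$CMD -ef | grep "$PROCESS" | grep -v grep | wc -l`
# Left unquoted on purpose: some wc implementations pad the count with spaces.
if [ $CMD_OUTPUT -eq 0 ]
then
STATUS=1    # not running
else
STATUS=0    # running
fi
# Report the value to the monitor agent.
$OPCMON "$MON_NAME=$STATUS"

exit 0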
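
The grep pipeline in the script matches any `ps -ef` line that contains the string, including command-line arguments of unrelated processes. The following is a minimal sketch of a stricter count, assuming the process can be identified by its exact command name. The helper name `count_exact` is hypothetical, and the sketch assumes a ps that supports the POSIX `-o comm=` option.

```shell
#!/bin/sh
# Sketch only: count the lines on stdin that are exactly equal to the name
# given as $1. count_exact is a hypothetical helper, not part of OMU.
count_exact() {
    awk -v name="$1" '$0 == name { n++ } END { print n + 0 }'
}

# Intended use on a managed node (assumes ps supports -o comm=):
#   COUNT=`ps -e -o comm= | count_exact "$PROCESS"`
#   if [ $COUNT -eq 0 ]; then STATUS=1; else STATUS=0; fi
```

Because the comparison is on the whole command name, a process called `sshd-helper` is not counted when looking for `sshd`.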

October 8, 2010

How to remove and recreate all certificates in OMU

This procedure is very long and involves manual steps on all agents and redeployment of
policies to all agents. It should only be used as a last resort when no other option is
available. For instance, this procedure may be considered if the private key of the
certificate authority has been lost or compromised.
This procedure consists of several subprocedures:
Remove all certificates on the management server
Recreate the trusted certificate on the management server
Recreate the server and node certificate on the management server
Backup the certificates and private keys on the management server
Prepare the management server for certificate and policy deployment
Redeploy policies to the management server
Redeploy policies to the agent on the management server
Recreate the certificates and redeploy policies on all the agents
These subprocedures are designed to be run in sequence. It is not safe to jump directly to
a subprocedure until you have completed all previous subprocedures. Once you have
started with the first subprocedure, you must complete all subprocedures to recover a
fully operational OVO setup.

Remove all certificates on the management server
All steps in this subprocess should be taken on the management server.
If the OVO management server runs on a cluster as a package or resource group, first put
the package or resource group into maintenance mode to prevent it from switching to
another node.
Stop all OVO management server, agent and L-core processes:
mgmtsv# ovstop opc ovoacomm
mgmtsv# ovc -kill
mgmtsv# ps -ef | grep ov
mgmtsv# ps -ef | grep opc
mgmtsv# ps -ef | grep coda
Ensure that all OVO and L-core processes have stopped. It is quite common that some
processes will not stop or that “ovc” will report an error. This is due to the fact that some
processes communicate locally through HTTPS and you are currently resolving a
problem with certificates that may adversely affect HTTPS communication. You will
have to kill these processes manually. Use “kill -9” if necessary.
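
The check for leftover processes can be sketched as a single filter over `ps -ef` output. This is only an illustration: the helper name `leftover_pids` is hypothetical, the patterns are the same three strings used in the grep commands above, and the sketch assumes that the PID is the second column of `ps -ef`.

```shell
#!/bin/sh
# Sketch only: print the PIDs of ps -ef lines that mention ov, opc or coda.
# leftover_pids is a hypothetical helper; it reads `ps -ef` output on stdin.
leftover_pids() {
    # Column 2 of `ps -ef` is the PID; the header line is skipped.
    # The bracket form ([o]v and so on) keeps this awk command from
    # matching its own entry in the process list.
    awk '$2 != "PID" && /[o]v|[o]pc|[c]oda/ { print $2 }'
}

# Intended use on the management server:
#   ps -ef | leftover_pids
# Review the list before killing anything: the patterns are broad and can
# match unrelated processes.
```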
Now remove all certificates on the management server:

OV Certificates Cookbook Version 1.0
Chapter 4 How to remove and recreate all certificates


NOTE: after taking the following steps the OVO setup will not be fully operational
until you proceed with all steps up to and including Recreate the certificates and
redeploy policies on all the agents, which implies manual steps on all agents and
redeployment of policies to all agents.

mgmtsv# ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
| dcd0c94c-cb7d-7506-079a-9cc1b0282993 (*) |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993 |
+---------------------------------------------------------+
+---------------------------------------------------------+
| Keystore Content (OVRG: server) |
+---------------------------------------------------------+
| Certificates: |
| dcd0c94c-cb7d-7506-079a-9cc1b0282993 (*) |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993 (*) |
+---------------------------------------------------------+

mgmtsv# ovcert -remove dcd0c94c-cb7d-7506-079a-9cc1b0282993
* Do you really want to remove the certificate with alias
'dcd0c94c-cb7d-7506-079a-9cc1b0282993' (yes(y)/no(n))? y
INFO: Certificate has been successfully removed.
mgmtsv# ovcert -remove CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993
* Do you really want to remove the certificate with alias
'CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993' (yes(y)/no(n))? y
INFO: Certificate has been successfully removed.
mgmtsv# ovcert -remove dcd0c94c-cb7d-7506-079a-9cc1b0282993 -ovrg server
* Do you really want to remove the certificate with alias
'dcd0c94c-cb7d-7506-079a-9cc1b0282993' (yes(y)/no(n))? y
INFO: Certificate has been successfully removed.
mgmtsv# ovcert -remove CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993 -ovrg server
* Do you really want to remove the certificate with alias
'CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993' (yes(y)/no(n))? y
INFO: Certificate has been successfully removed.


You should now see the following:

mgmtsv# ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
+---------------------------------------------------------+
+---------------------------------------------------------+
| Keystore Content (OVRG: server) |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
+---------------------------------------------------------+
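
A quick way to confirm that nothing is left is to count the alias lines in the listing. The sketch below assumes the `ovcert -list` layout shown in the listings here (rows that start with `|`, with the headings "Keystore Content", "Certificates:" and "Trusted Certificates:"). The helper name `count_cert_aliases` is hypothetical.

```shell
#!/bin/sh
# Sketch only: count the alias rows in `ovcert -list` output read from stdin.
# Every row that starts with "|" and is not one of the heading rows is
# treated as a certificate alias. count_cert_aliases is a hypothetical helper.
count_cert_aliases() {
    awk '/^[|]/ && !/Keystore Content/ && !/Certificates:/ { n++ }
         END { print n + 0 }'
}

# Intended use on the management server:
#   ovcert -list | count_cert_aliases
# A result of 0 means both keystores are empty.
```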


You must now proceed with step Recreate the trusted certificate on the management
server.

4.2 Recreate the trusted certificate on the management server

All steps in this subprocess should be taken on the management server.
Since all generated certificates must be signed by the certificate authority, as a first step
we must recreate the trusted certificate, also referred to as the root certificate or the CA
certificate.
To recreate the trusted certificate on the server:
mgmtsv# ovcm -newcacert
INFO: Generating a new CA key pair...
INFO: Installing...
INFO: Installation was successful.
You should now see the following:
mgmtsv# ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
+---------------------------------------------------------+
+---------------------------------------------------------+
| Keystore Content (OVRG: server) |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993 (*) |
+---------------------------------------------------------+
Now you can export the trusted certificate from the server side and import it on the node
side:
mgmtsv# ovcert -exporttrusted -file /tmp/trustedcertif -ovrg server
INFO: Trusted certificates have been successfully exported to file '/tmp/trustedcertif'.
mgmtsv# ovcert -importtrusted -file /tmp/trustedcertif
INFO: Import operation was successful.
You should now see the following:
mgmtsv# ovcert -list
+---------------------------------------------------------+
| Keystore Content |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993 |
+---------------------------------------------------------+
+---------------------------------------------------------+
| Keystore Content (OVRG: server) |
+---------------------------------------------------------+
| Certificates: |
+---------------------------------------------------------+
| Trusted Certificates: |
| CA_dcd0c94c-cb7d-7506-079a-9cc1b0282993 (*) |
+---------------------------------------------------------+
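
To check that the trusted certificate is present in both keystores, the listing can be reduced to one line per trusted alias. This is a sketch under the same layout assumption as the listings above. The helper name `trusted_by_keystore` is hypothetical, and the sketch labels any keystore whose heading contains "OVRG" as "server", which assumes that "server" is the only resource group in use.

```shell
#!/bin/sh
# Sketch only: read `ovcert -list` output on stdin and print
# "<keystore> <alias>" for every trusted certificate.
# trusted_by_keystore is a hypothetical helper.
trusted_by_keystore() {
    awk '
        # A new keystore block starts; remember which one it is.
        /Keystore Content/      { ks = ($0 ~ /OVRG/) ? "server" : "node"; sect = "" }
        # Track which section of the block the following rows belong to.
        /Trusted Certificates:/ { sect = "trusted"; next }
        /Certificates:/         { sect = "own"; next }
        # Alias rows inside a trusted section: field 2 is the alias.
        /^[|]/ && sect == "trusted" && !/Keystore Content/ { print ks, $2 }
    '
}

# Intended use on the management server:
#   ovcert -list | trusted_by_keystore
# After the import, the same CA_ alias should appear once for "node" and
# once for "server".
```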
You must now proceed with step Recreate the server and node certificate on the
management server.

Recreate the server and node certificate on the management
server

All steps in this subprocess should be taken on the management server, but they depend
on whether the OVO management server runs standalone or as a package or resource
group on a cluster.


If the OVO management server runs on a cluster as a package or
resource group


Issue and import a new server certificate:
mgmtsv# ovcm -issue -file /tmp/certif -name <host name of the package/virtual node> -pass mypwd -coreid $(ovcoreid -ovrg server)
INFO: Issued certificate was written to file '/tmp/certif'.
mgmtsv# ovcert -importcert -file /tmp/certif -pass mypwd -ovrg server
INFO: Import operation was successful.
mgmtsv# rm /tmp/certif
Issue and import a new node certificate:
mgmtsv# ovcm -issue -file /tmp/certif -name <host name of the active cluster node> -pass mypwd -coreid $(ovcoreid)
INFO: Issued certificate was written to file '/tmp/certif'.
mgmtsv# ovcert -importcert -file /tmp/certif -pass mypwd
INFO: Import operation was successful.
mgmtsv# rm /tmp/certif
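
The server and node certificates follow the same issue, import, and clean-up sequence and differ only in the name, the core ID, and the resource group. The sketch below prints that sequence for review instead of running it. The helper name `print_issue_cmds` is hypothetical, the ovcm and ovcert options are only the ones used in the steps above, and the file name and password are the example values from those steps.

```shell
#!/bin/sh
# Sketch only: print (do not run) the issue/import/clean-up commands for one
# certificate. print_issue_cmds is a hypothetical helper.
#   $1 = certificate name (host name)
#   $2 = core ID
#   $3 = optional resource group (for example "server")
print_issue_cmds() {
    name=$1
    coreid=$2
    ovrg=${3:+ -ovrg $3}     # empty when no resource group is given
    file=/tmp/certif         # example file name from the steps above
    echo "ovcm -issue -file $file -name $name -pass mypwd -coreid $coreid"
    echo "ovcert -importcert -file $file -pass mypwd$ovrg"
    echo "rm $file"
}

# Example with placeholder values:
#   print_issue_cmds virtualnode.example.com "$(ovcoreid -ovrg server)" server
```

Printing the commands first makes it easy to check the host name and core ID before anything is written to the keystore.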

October 2, 2010

OMU 9 Configuration Backup

In OMU 9, a configuration backup is taken through the GUI using the option Tasks -> Download all. By default, the backup is stored in the following location:
/opt/OV/OMU/adminUI/data/clipboard/

January 15, 2010

OMW to OMU Message forwarding

Problem -
My environment has a two-management-server architecture, in which OMU acts as Manager of Managers and OMW monitors all the servers and applications.
I want to forward all alerts from OMW to OMU.

Solution -
Whenever you add a node in OMW, add the same node in OMU at the same time.
To add a node in OMU, follow the steps below -
(1) Add the node in OMU through Actions -> Node -> Add, and specify the node name, Fully Qualified Node Name, server type, etc.
Uncheck Automatic Software De-installation, and check in the communication options that the core ID is 00000000000000000

(2) Once the node has been added successfully, get the ovcoreid of the same node from the OMW server.

(3) Execute the following command on OMU -
/opt/OV/bin/OpC/utils/opcnode -chg_id
for example : /opt/OV/bin/OpC/utils/opcnode -chg_id abc.vtg BBC_SOL_SPARC solaris COMM_BBC NETWORK_IP unix 034fere34fd343fer

Once the command has executed successfully, move the node to the appropriate node group.

(4) Restart the OMU services:
/opt/OV/bin/opcsv -stop
/opt/OV/bin/opcsv -start

Switch management server from OMW to OMU in case of OMW failure

Problem -
My environment has a two-management-server architecture. Operations Manager for Unix (OMU) acts as Manager of Managers and receives alerts from OMW, NNM, BAC, etc. The other management server is OMW, which monitors servers and applications.

I want all alerts to be sent to OMU instead of OMW if the OMW services go down or the OMW server is not available.

Solution -
To achieve this, perform the following actions -

(1) Create a failover policy in OMW.
The failover policy should be of type Flexible Management Policy with the following content -

#
# Configuration file
# defines two action-allow managers
# messages are always sent to the primary manager
#
TIMETEMPLATES
#none
RESPMGRCONFIGS
RESPMGRCONFIG
DESCRIPTION "responsible mgrs for Nodes in VAA"
#Secondary managers have to be specified,
#but this section is not used by OpenView Operations for Windows
SECONDARYMANAGERS
SECONDARYMANAGER
NODE IP 0.0.0.0 ""
DESCRIPTION "HP OpenView Operations for Unix Management Server"
SECONDARYMANAGER
NODE IP 0.0.0.0 """
DESCRIPTION "HP OpenView Operations for Windows Management Server"
ACTIONALLOWMANAGERS
ACTIONALLOWMANAGER
NODE IP 0.0.0.0 """
DESCRIPTION "HP OpenView Operations for Unix Management Server"
ACTIONALLOWMANAGER
NODE IP 0.0.0.0 """
DESCRIPTION "HP OpenView Operations for Windows Management Server"
# Recommended: include the primary manager into the list of action-allow managers
# even if all possible managers are already listed
ACTIONALLOWMANAGER
NODE IP 0.0.0.0 "$OPC_PRIMARY_MGR"
DESCRIPTION "current primary manager"
MSGTARGETRULES
MSGTARGETRULE
DESCRIPTION "always send all messages to current primary manager"
MSGTARGETRULECONDS
MSGTARGETMANAGERS
MSGTARGETMANAGER
TIMETEMPLATE "$OPC_ALWAYS"
OPCMGR IP 0.0.0.0 "$OPC_PRIMARY_MGR"

(2) Deploy the policy to all of the managed nodes.
(3) When you want to switch the management server from OMW to OMU, run the following command on the OMU server:
opcragt -primmgr
Wait until the command completes and check whether it executed successfully. If it did, the management server switch is done.
(4) If you want to revert the management server back to OMW, issue the same command on the OMW server:
opcragt -primmgr

Unable to get proper severity symbol against node icon

Problem -
The node icons in OMW do not show the proper severity symbol.
For example, if server A has received an alert of severity critical, the corresponding node icon should show a red symbol.

Solution -
Restart the OMW services using the command vpstat -3 -r. Once the command has completed successfully, verify that all the services are running using the following command:
vpstat -3

January 12, 2010

Unable to do service discovery of nodes in OMW

Problem -

I am unable to get the service map of the nodes that have been added in OMW. The agent services are running fine and the discovery policies have been deployed, but I am still not able to get the service map of the nodes.

Solution -

Perform the following steps and check whether you are able to get the service map of the nodes -

a. Remove all the policies from the agent
b. Confirm that no policies remain on the agent with the command ("ovpolicy -list")
c. Kill the agent ("ovc -kill")
d. Rename the file system.txt to system.txt.old
e. Delete all the files in the folder "C:\Documents and Settings\All Users\Application Data\HP\HP BTO Software\tmp\agtrep"
f. Restart the agent
g. Redeploy the service discovery policies

January 7, 2010

OvEpMsgActSrv Service in OMW is aborted

Problem -
In OMW 8.10, the OvEpMsgActSrv service aborted abnormally, and it did not start even after restarting the services and rebooting the system.

Solution - Stopped the OMW services using vpstat -3 -r stop and removed the following folder

$OvShareDir\server\datafiles\bbc\snf\OvEpMessageActionServer

and then restarted the services using vpstat -3 -r start

which finally resolved the issue.