NetView-TBSM Integration: Passing correlated events to TBSM through the common listener.

The TBSM adapter for NetView is relatively simple to configure out of the box. After an initial bulk upload, subsequent topology changes and traps are reflected in the console.

That's fine, but how do I get correlated events directly through the Common Listener without going through T/EC? Unfortunately, in our enterprise network, ICMP timeouts are quite common (unless a 30+ second timeout is specified). With 1500+ remote sites, that timeout would stretch our polling interval well beyond requirements, yet we want to avoid false negatives. Since the CL listens to trapd.trace, nvcorrd is out of the picture. The main reason for going direct is that Resource Objects created through T/EC lack the details provided by NetView's bulk upload, as well as the detailed NetView topology.

Here is a distributed MLM/NetView configuration which addresses such needs. Mind you, this is what seemed logical to me at the time. The TBSM development team weren't too interested in modifying their NetView CL to allow for correlated events from nvcorrd, let alone traps from other enterprises. Leslie Clark (and others) have described methods of trap re-generation (taking a trap and generating an nv6k enterprise event like ifDown/Up), so I let MLM filters and actions create nv6000 events which will flow through the CL. As long as the trap ID appears in topxtrapgate.conf (uncommented), it will get to TBSM.
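
Here is a minimal sketch of such a re-generation command, as an MLM filter action might call it. This is an assumption-laden illustration, not the script I actually run: the host name, varbind OID, and message are placeholders, and you should verify the nv6000 specific-trap number (58916867 is the usual Interface Down ID) against your trapd.conf and make sure it is uncommented in topxtrapgate.conf.

    #!/bin/ksh
    # Hypothetical MLM filter action: re-generate a received MLM
    # Status/Session event as an nv6000 (netView6000) trap so that it
    # lands in trapd and the Common Listener passes it on to TBSM.
    # AGENT is the address of the node the remote MLM reported down.
    NVSERVER=nvhub                   # placeholder corporate NetView host
    AGENT=$1
    MSG=${2:-"Interface down reported by remote MLM"}

    # NetView's snmptrap arguments: host enterprise agent-addr generic
    # specific timestamp [oid type value]... ("" lets snmptrap supply
    # the timestamp).
    /usr/OV/bin/snmptrap $NVSERVER \
        .1.3.6.1.4.1.2.6.3.1 \
        $AGENT 6 58916867 "" \
        .1.3.6.1.4.1.2.6.3.1.7.1.0 octetstringascii "$MSG"

The exact varbind layout matters less than the trap ID: whatever lands in trapd.trace with an ID that topxtrapgate.conf passes through will reach TBSM.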

* Each remote site has an MLM (NT) polling all required LAN nodes, and its router's interfaces from within (the trap destination is the Corporate NetView/MLM server).

* The Corporate NetView server runs MLM for receiving and processing MLM enterprise events from each downstream site, much like the attended-MLM scenario used on the NT platform.

* For the initial bulk upload, everything should be in a managed state so it appears managed in the TBSM resource view. After the bulk upload completes, set all nodes being polled by the hub MLM or a remote MLM to unmanaged. This prevents netmon from needlessly re-pinging a down interface/node when an MLM-sourced Status or Session event is entered in trapd.trace. This is what the filters and their action scripts are for.

* Set up alias group(s) on the Hub MLM consisting of all remote MLM hosts, and an smMonitor group to poll those nodes. If a remote MLM goes down, we are blind for that site. Upon a failed poll for a remote MLM, we determine the break point by pinging the switch the MLM is plugged into (we don't have ITSA). If that succeeds, at least we've narrowed it down to the MLM's switch port or something involving the MLM's NIC.

* If not successful, we try the router's interfaces until we get a return code of 0, and so on (see the sketch after the diagram below).

* Backbone devices are polled using netmon; there's not much of a timeout issue there.

* snmpCollect threshold events can go directly through the CL, or MLM threshold events can be turned into nv6k threshold events. Here is a modest attempt at a diagram:

Remote MLM ----> Corporate MLM | filter, command(snmptrap nv6k trap ID) -------> trapd -----> TBSM
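
As a companion to the failed-poll bullets above, here is a minimal ksh sketch of that break-point escalation. All hostnames and router interface addresses below are placeholders, and the ping flags (-c count, -w deadline) may need adjusting for your platform.

    #!/bin/ksh
    # Hypothetical break-point escalation after the Hub MLM fails to
    # poll a remote site's MLM: first ping the switch the MLM is
    # plugged into, then walk the site router's interface addresses
    # until one answers (return code 0).
    SITE=$1
    SWITCH=${SITE}-sw1                  # placeholder switch name
    ROUTER_IFS="10.1.1.1 10.1.2.1"      # placeholder interface addresses

    if ping -c 1 -w 30 $SWITCH >/dev/null 2>&1; then
        echo "$SITE: switch answers; suspect the MLM NIC or its switch port"
        exit 0
    fi

    for IF in $ROUTER_IFS; do
        if ping -c 1 -w 30 $IF >/dev/null 2>&1; then
            echo "$SITE: break point is between $IF and the switch"
            exit 0
        fi
    done

    echo "$SITE: no router interface answers; WAN link or router is down"
    exit 1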

Feel free to review Jane's sample snmptrap script for generating traps.

-- ThomasCaputo - 23 Feb 2005

Attachment: TBSMMLM.doc (517120 bytes, 28 Feb 2005 - 20:12, ThomasCaputo) - Technical Paper in more detail.