OSSEC Agent to Server Connection Issues

So naturally, as of late, I have found myself doing more than I probably need to on my servers, and in the process causing more headaches than required. One of those issues has been the communication between my agents and the mothership (command and control) server in my OSSEC installs.

The first thing to understand is how to check the status of your agents, and the easiest way to do that is to run the following on the server install (my mothership):

# /var/ossec/bin/agent_control -lc

This lists the agents that are currently connected, and each one will read Active. Agents that are not connected don’t read inactive, unfortunately; with -lc they just don’t show up (-l on its own lists every agent regardless of status).
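To see which agents have dropped off, one option is to pipe the full -l listing through a small filter. This is only a rough sketch: the function name inactive_agents is my own, and it assumes each agent line looks like "ID: 001, Name: Agent001, IP: any, Active", which may differ between OSSEC versions.

```shell
# Hypothetical helper: read `agent_control -l` output on stdin and print
# only the agent lines whose status is not Active.
# Assumed line format: "ID: 001, Name: Agent001, IP: any, Active"
inactive_agents() {
    grep 'ID:' | grep -v 'Active' | sed 's/^[[:space:]]*//'
}

# Usage on the mothership (as root):
#   /var/ossec/bin/agent_control -l | inactive_agents
```

Anything this prints is an agent the mothership knows about but is not currently hearing from.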

The next thing is to check your logs, and in a default installation this is where they’ll be:

# tail -F /var/ossec/logs/ossec.log

If you have a connection issue you’re likely to see something like the following in the client log:

2012/10/09 03:39:33 ossec-agentd(4101): WARN: Waiting for server reply (not started). Tried: '[mothership IP]'.
2012/10/09 03:39:35 ossec-agentd: INFO: Trying to connect to server ([mothership IP]:1514).
2012/10/09 03:39:35 ossec-agentd: INFO: Using IPv4 for: [mothership IP] .
2012/10/09 03:39:56 ossec-agentd(4101): WARN: Waiting for server reply (not started). Tried: '[mothership IP]'.
2012/10/09 03:40:16 ossec-agentd: INFO: Trying to connect to server ([mothership IP]:1514).
2012/10/09 03:40:16 ossec-agentd: INFO: Using IPv4 for: [mothership IP] .
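If you want a quick sense of how long an agent has been stuck, counting those warnings is a cheap check. A minimal sketch, assuming the default log path; count_reply_warnings is a name I made up, not an OSSEC tool.

```shell
# Hypothetical helper: count "Waiting for server reply" warnings in an agent log.
# $1 is the log file; it defaults to the standard OSSEC location.
count_reply_warnings() {
    grep -c 'Waiting for server reply' "${1:-/var/ossec/logs/ossec.log}"
}

# Usage on the agent (as root):
#   count_reply_warnings
```

A count that keeps climbing between runs means the agent is still retrying and getting no answer.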

As you are probably thinking, this isn’t exactly the most helpful of warnings; it tells you nothing about the issue beyond the fact that you can’t connect. A couple of things will help you troubleshoot from the client box:

First, check your iptables rules:

# iptables -nL

If you have a number of rules and policies, you might want to try temporarily disabling everything to see if you can establish a connection. To verify that the traffic is actually reaching the mothership, though, you’ll want to run tcpdump on the mothership and see if any packets are arriving. The easiest way is to do the following:

# tcpdump -i eth0 port 1514
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes

Note that eth0 is the network interface to listen on. On a *nix box you can run ifconfig and look for the interface that shows your IP address next to inet addr:. It’ll look something like this:

# ifconfig
eth0 Link encap:Ethernet HWaddr G3:4P:91:CD:5A:6B
inet addr:100.1.5.68 Bcast:100.1.5.255 Mask:255.255.255.0

Once you have identified the interface, that is what you pass to -i. The port is the UDP port the agents use to talk to the server; if you didn’t change it during setup, it’ll be 1514. If things are working, you’ll start seeing traffic coming into the box as the agents kick into gear. The easiest way to get them talking is to restart OSSEC on the agent boxes, which you can do with:

# /var/ossec/bin/ossec-control restart

If you have ruled out the firewall and the agents still won’t connect, take a look at the ossec.log file on the mothership to see what might be going on. If you see the following, you’re in luck:

# tail -F /var/ossec/logs/ossec.log
2012/10/09 03:47:17 ossec-remoted: WARN: Duplicate error: global: 0, local: 51, saved global: 5, saved local:7563
2012/10/09 03:47:17 ossec-remoted(1407): ERROR: Duplicated counter for 'Agent001'.
2012/10/09 03:47:23 ossec-remoted: WARN: Duplicate error: global: 0, local: 52, saved global: 5, saved local:7563
2012/10/09 03:47:23 ossec-remoted(1407): ERROR: Duplicated counter for 'Agent001'.
2012/10/09 03:47:27 ossec-remoted: WARN: Duplicate error: global: 0, local: 53, saved global: 5, saved local:7563
2012/10/09 03:47:27 ossec-remoted(1407): ERROR: Duplicated counter for 'Agent001'.
2012/10/09 03:47:32 ossec-remoted: WARN: Duplicate error: global: 0, local: 54, saved global: 5, saved local:7563
2012/10/09 03:47:32 ossec-remoted(1407): ERROR: Duplicated counter for 'Agent001'.
2012/10/09 03:47:38 ossec-remoted: WARN: Duplicate error: global: 0, local: 55, saved global: 5, saved local:7563
2012/10/09 03:47:38 ossec-remoted(1407): ERROR: Duplicated counter for 'Agent001'.
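With more than a handful of agents it helps to pull out just the names that are throwing this error. A rough sketch, assuming the agent name is wrapped in plain single quotes and followed by a period, as it is in the log file itself; duplicated_counter_agents is my own name for it.

```shell
# Hypothetical helper: list each agent named in a "Duplicated counter" error.
# $1 is the log file; it defaults to the standard OSSEC location.
duplicated_counter_agents() {
    grep 'Duplicated counter for' "${1:-/var/ossec/logs/ossec.log}" |
        sed "s/.*Duplicated counter for '\(.*\)'\.\$/\1/" |
        sort -u
}

# Usage on the mothership (as root):
#   duplicated_counter_agents
```

For the log above, this would print a single line with the agent name.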

This actually helped me out a lot. A quick Google search gets us to http://www.ossec.net/doc/faq/unexpected.html, and that is where everything became clear. This section specifically helped me out:

This normally happens when you restore the ossec files from a backup or you reinstall server or agents without performing an upgrade.

Here is the catch, though: the error only showed up for one agent, but following the instructions and applying them to all agents actually fixed all the issues. The fix is simple if you’re not looking to read the page. Do this on both the agent(s) and the mothership, starting with the mothership:

# /var/ossec/bin/ossec-control stop
Killing ossec-monitord ..
Killing ossec-logcollector ..
Killing ossec-remoted ..
Killing ossec-syscheckd ..
Killing ossec-analysisd ..
Killing ossec-maild ..
Killing ossec-execd ..
# rm -rf /var/ossec/queue/rids/*
# /var/ossec/bin/ossec-control start
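If you have a lot of boxes to run this on, the three steps above can be wrapped in a small function. This is a sketch rather than anything OSSEC ships: reset_rids is a name I made up, and it assumes the default /var/ossec layout unless you pass another install directory.

```shell
# Hypothetical wrapper around the stop / clear rids / start steps.
# $1 is the OSSEC install directory; it defaults to /var/ossec.
reset_rids() {
    ossec_dir="${1:-/var/ossec}"
    # Bail out if the stop command reports a failure, so the rids files
    # are not removed underneath running daemons.
    "$ossec_dir/bin/ossec-control" stop || return 1
    # Clear the rids files (the stored counters the error complains about).
    rm -f "$ossec_dir"/queue/rids/*
    "$ossec_dir/bin/ossec-control" start
}

# Usage (as root), mothership first and then each agent:
#   reset_rids
```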

Remember to apply the same thing on all boxes and, surprisingly, everything should start talking to each other again.

Cheers.