Zabbix: Difference between revisions

From DWIKI
mNo edit summary
(47 intermediate revisions by the same user not shown)
Line 20: Line 20:
*[[Grafana|Grafana]]  
*[[Grafana|Grafana]]  
*[https://blog.zabbix.com/zabbix-ha-cluster-setups/8264/ Zabbix HA cluster]
*[https://blog.zabbix.com/zabbix-ha-cluster-setups/8264/ Zabbix HA cluster]
*[https://blog.zabbix.com/zabbix-agent-active-vs-passive/9207/ Active vs Passive]
*[https://geofrogger.net/review/zabbix20.svg Very old network diagram]
*[https://blog.zabbix.com/fighting-notification-floods-and-misleading-alerts-in-distributed-zabbix-deployments/11600/ Fighting zabbix alert floods]
==Installation==
*[https://repo.zabbix.com/ Zabbix repositories]
== Installing Zabbix from git ==
git clone [https://github.com/zabbix/zabbix.git https://github.com/zabbix/zabbix.git]
cd zabbix
./bootstrap.sh




== Zabbix API ==
== Zabbix API ==


*[https://www.zabbix.com/documentation/current/manual/api The Zabbix API]
*[https://www.zabbix.com/documentation/current/manual/api The Zabbix API]  
*[https://www.zabbix.com/integrations/python API and python]
*[https://www.zabbix.com/integrations/python API and python]  
 
 
 
 
= Zabbix error codes =
 
== Z3005 ==
 
Database issue
 
= Items =
*[https://www.zabbix.com/documentation/current/nl/manual/config/items/itemtypes/zabbix_agent#supported-item-keys Supported Item Keys]
==Item dialog==
*[https://www.zabbix.com/documentation/5.4/en/manual/config/items/item Item documentation]
*[https://www.zabbix.com/documentation/5.4/en/manual/config/items/itemtypes/zabbix_agent Zabbix agent items]
===Units===
 
*B
*uptime
*unixtime
*s
 
== proc.mem ==
 
proc.mem[<name>,<user>,<mode>,<cmdline>,<memtype>]
 
=== name ===
 
??
 
=== cmdline ===


=Installing from git=
regex like php-fpm:
git clone https://github.com/zabbix/zabbix.git
cd zabbix
./bootstrap.sh


===memtype===
*[https://www.zabbix.com/documentation/5.0/en/manual/appendix/items/proc_mem_notes Notes on proc.mem memtypes]


= Templates =


=Templates=
*[https://github.com/zabbix/community-templates/ Community templates]
==Mysql template==


= Configuration =


== Zabbix agent active ==


=== On client ===


=== On server ===


Set Agent IP to 0.0.0.0


&nbsp;
=Zabbix and SQL=
==Find hosts with hostmacro defined==
select h.host, m.macro, m.value from hosts h, hostmacro m where macro like '%FOO%' and h.hostid = m.hostid;


= FAQ =
= FAQ =
Line 51: Line 101:
  zabbix_server --runtime-control log_level_increase=trapper
  zabbix_server --runtime-control log_level_increase=trapper
        
        
&nbsp;


=== Reload zabbix server configuration ===
=== Reload zabbix server configuration ===
You can't reload the configuration file without a restart, but you might want to reload the configuration cache:


  zabbix_server -c /etc/zabbix/zabbix_server.conf -R config_cache_reload
  zabbix_server -c /etc/zabbix/zabbix_server.conf -R config_cache_reload
Line 59: Line 113:


=== No media defined for user ===
=== No media defined for user ===
=== The frontend does not match Zabbix database. ===
Probably version conflict between frontend and server
&nbsp;
=== value cache working in low memory mode ===
Increase ValueCacheSize
&nbsp;
&nbsp;


== PROXY ==
== PROXY ==


=== Zabbix does not support SQLite3 database upgrade. ===
[[ Zabbix Proxy ]]
 
== Front end ==
 
=== Visible name vs hostname ===


Stop the proxy and (re)move the .db file
Visible name: {HOST.NAME}


=== [Z3002] cannot create database 'zabbix_proxy': [0] unable to open database file ===
Hostname: {HOST.HOST}


fix path
Host IP: (as defined in Interface->IP/DNS) {HOST.CONN}


&nbsp;
&nbsp;


=== Cannot parse heartbeat from active proxy ===
=== Acknowledge multiple items ===


TBD, probably name issue
Monitoring->Problems: apply filters, select all, then use mass update


=== Troubleshoot high queue on proxy ===
&nbsp;


*[https://blog.zabbix.com/how-to-troubleshoot-zabbix-proxy-high-queue/12244/ https://blog.zabbix.com/how-to-troubleshoot-zabbix-proxy-high-queue/12244/]
=== No permissions to referred object or it does not exist! ===


&nbsp;
Graph no longer exists. Probably items no longer discovered


=== cannot obtain data from proxy "proxybox": ZBX_TCP_READ() failed: [104] Connection reset by peer ===
=== Cannot add host ===


?????
??


== SNMP ==
== SNMP ==


=== Cannot find host interface on "esxhost" for item key foo ===
=== Cannot find host interface on "xxxhost" for item key foo ===


Might mean you're trying to import an SNMP template before configuring SNMP for the host
Might mean you're trying to import an SNMP template before configuring SNMP for the host


&nbsp;
&nbsp;
&nbsp;
=== No SNMP data ===
=== snmp_parse_oid(): cannot parse OID "IF-MIB::ifSpeed.3 ===


== Agent side ping check ==
== Agent side ping check ==
Line 101: Line 179:
&nbsp;
&nbsp;


== IPMI errors ==
&nbsp;
 
 
==LLD/Discovery==
=== Discover: value must be a JSON object ===
 
Could mean you need to escape slashes, check output with zabbix_get
 
 
===Cannot create item: item with the same key===
Make sure the key contains "{#MACRONAME}"
 
== Discovery data example ==
 
Output of a discovery script should look like:
 
{"data":[
  {"{#VAR1}":"value11","{#VAR2}":"value12"},
  {"{#VAR1}":"value21","{#VAR2}":"value22"}
]}
 
 
 
 
== IPMI ==
 
=== IPMI Monitoring account for zabbix ===
 
[https://www.thomas-krenn.com/en/wiki/Configuring_IPMI_under_Linux_using_ipmitool https://www.thomas-krenn.com/en/wiki/Configuring_IPMI_under_Linux_using_ipmitool]
 
&nbsp;


=== cannot connect to IPMI host: [22] Operation canceled ===
=== cannot connect to IPMI host: [22] Operation canceled ===


Usually temporary because of broken ipmi lib, ignore it
Usually temporary because of broken ipmi lib, ignore it
&nbsp;


=== cannot connect to IPMI host: [16777411] Unknown error 16777411 ===
=== cannot connect to IPMI host: [16777411] Unknown error 16777411 ===


another classic
classic, probably authentication problem
 
=== cannot connect to IPMI host: [22] Invalid argument ===
 
== zabbix_sender ==
 
=== processed: 0; failed: 1 ===
 
Possible causes:
 
*incorrect hostname
*incorrect item key
*item not in the server configuration cache yet
*Allowed hosts in trapper item
*phase of moon
*aliens
 
&nbsp;
 
=== Testing zabbix_sender ===
 
zabbix_sender stuff


== Filters ==
== Filters ==
Line 119: Line 250:
possibly authentication method issue
possibly authentication method issue


== Discover: value must be a JSON object ==
Could mean you need to escape slashes


&nbsp;
&nbsp;
 
=== Calculated items ===
== Cannot create item: Invalid first parameter ==
== Cannot create item: Invalid first parameter ==
 
== Cannot create item, error in formula ==
Probably a calculated item, try double-quoting the item key:
Probably a calculated item, try double-quoting the item key:


Line 138: Line 266:
  yum install zabbix-agent
  yum install zabbix-agent


== Discovery data example ==
Output of a discovery script should look like:
{"data":[
  {"{#VAR1}":"value11","{#VAR2}":"value12"},
  {"{#VAR1}":"value21","{#VAR2}":"value22"}
]}


== Backing up tables ==
== Backing up tables ==


[https://www.zabbix.org/wiki/Docs/howto/mysql_backup_script https://www.zabbix.org/wiki/Docs/howto/mysql_backup_script]
[https://www.zabbix.org/wiki/Docs/howto/mysql_backup_script https://www.zabbix.org/wiki/Docs/howto/mysql_backup_script]
&nbsp;


== cannot send list of active checks ==
== cannot send list of active checks ==


Most likely ServerActive is defined in agent config, while not used at all
If in agent log: most likely ServerActive is defined in agent config, while not used at all
 
It is also possible agent is sending some active check to server while host is monitored via proxy.
 
== active check configuration update started to fail ==
 
??


== Latest 20 issues ==
== Latest 20 issues ==


DEFAULT_LATEST_ISSUES_CNT in /usr/share/zabbix/include/defines.inc.php
DEFAULT_LATEST_ISSUES_CNT in /usr/share/zabbix/include/defines.inc.php
&nbsp;


== Zabbix unreachable poller processes more than 75% busy ==
== Zabbix unreachable poller processes more than 75% busy ==


Increase '''StartPollersUnreachable'''
Increase '''StartPollersUnreachable'''
&nbsp;
== Zabbix poller processes more than 75% busy ==
another mystery


== More than 100 items having missing data for more than 10 minutes ==
== More than 100 items having missing data for more than 10 minutes ==
Line 229: Line 365:


Check StartVMwareCollectors on server or proxy
Check StartVMwareCollectors on server or proxy
&nbsp;


== unsupported item key ==
== unsupported item key ==
Line 235: Line 373:


  echo 1
  echo 1
=== became not supported: Not supported by Zabbix Agent ===
probably output by userparameter/script


== ansible or API not showing host groups ==
== ansible or API not showing host groups ==
Line 262: Line 404:
&nbsp;
&nbsp;


&nbsp;


== another network error, wait for 8 seconds ==
== another network error, wait for 8 seconds ==


'''UnreachableDelay'''=8
'''UnreachableDelay'''=8
&nbsp;


== failed: first network error ==
== failed: first network error ==


Setting '''Timeout '''in server configuration
Setting '''Timeout '''in server configuration
also Timeout in agents?
&nbsp;
== no active checks on server ==
???
&nbsp;
&nbsp;
== show cpu utilization ==
Monitoring->host->graphs
[[Category:Monitoring]]

Revision as of 11:01, 14 April 2022

Links

Installation

Installing Zabbix from git

git clone https://github.com/zabbix/zabbix.git
cd zabbix 
./bootstrap.sh


Zabbix API


 

Zabbix error codes

Z3005

Database issue

Items

Item dialog

Units

  • B
  • uptime
  • unixtime
  • s

proc.mem

proc.mem[<name>,<user>,<mode>,<cmdline>,<memtype>]

name

??

cmdline

regex like php-fpm:

memtype

Templates

Configuration

Zabbix agent active

On client

On server

Set Agent IP to 0.0.0.0

 

Zabbix and SQL

Find hosts with hostmacro defined

select h.host, m.macro, m.value from hosts h, hostmacro m where macro like '%FOO%' and h.hostid = m.hostid;
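
The query above can be tried out safely before it goes near the real database. The sketch below is only an illustration: it uses an in-memory SQLite database with a minimal stand-in for the hosts and hostmacro tables (the real Zabbix schema has many more columns and normally runs on MySQL or PostgreSQL), and the host names and macros are made up. It writes the same join with an explicit JOIN and a bound parameter.

```python
import sqlite3

# Minimal stand-in for the two Zabbix tables the query touches (assumed columns only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hosts (hostid INTEGER PRIMARY KEY, host TEXT);
    CREATE TABLE hostmacro (hostmacroid INTEGER PRIMARY KEY, hostid INTEGER, macro TEXT, value TEXT);
    INSERT INTO hosts VALUES (1, 'web01'), (2, 'db01');
    INSERT INTO hostmacro VALUES (1, 1, '{$FOO_PORT}', '8080'), (2, 2, '{$BAR}', 'x');
""")


def hosts_with_macro(connection, pattern):
    """Return (host, macro, value) rows whose macro name matches a LIKE pattern."""
    query = """
        SELECT h.host, m.macro, m.value
        FROM hosts h
        JOIN hostmacro m ON h.hostid = m.hostid
        WHERE m.macro LIKE ?
        ORDER BY h.host
    """
    return connection.execute(query, (pattern,)).fetchall()


rows = hosts_with_macro(conn, "%FOO%")
print(rows)
```

Binding the pattern as a parameter keeps the quoting out of the SQL text, which helps when the macro name contains characters such as $ or braces.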

FAQ

SERVER

Adjust loglevel

zabbix_server --runtime-control log_level_increase=trapper
      

 

Reload zabbix server configuration

You can't reload the configuration file without a restart, but you might want to reload the configuration cache:

zabbix_server -c /etc/zabbix/zabbix_server.conf -R config_cache_reload

 

No media defined for user

The frontend does not match Zabbix database.

Probably version conflict between frontend and server

 

value cache working in low memory mode

Increase ValueCacheSize

 

 

PROXY

Zabbix Proxy

Front end

Visible name vs hostname

Visible name: {HOST.NAME}

Hostname: {HOST.HOST}

Host IP: (as defined in Interface->IP/DNS) {HOST.CONN}

 

Acknowledge multiple items

Monitoring->Problems: apply filters, select all, then use mass update

 

No permissions to referred object or it does not exist!

Graph no longer exists. Probably items no longer discovered

Cannot add host

??

SNMP

Cannot find host interface on "xxxhost" for item key foo

Might mean you're trying to import an SNMP template before configuring SNMP for the host

 

 

No SNMP data

snmp_parse_oid(): cannot parse OID "IF-MIB::ifSpeed.3

Agent side ping check

UserParameter=pingtime[*],fping -e $1|sed 's/^.*(\([0-9].*\) ms).*$/\1/g'
UserParameter=pingalive[*],fping $1|grep -q alive;echo $?
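
To see what the two UserParameters extract, here is a rough Python equivalent of the sed and grep steps. It assumes fping -e prints lines of the form "host is alive (0.52 ms)"; the sample lines and function names are made up for illustration, and the regular expression is slightly stricter than the sed one.

```python
import re

# Same idea as the sed expression: capture what sits between "(" and " ms)".
FPING_TIME = re.compile(r"\(([0-9][0-9.]*) ms\)")


def ping_time_ms(fping_line):
    """Return the round-trip time in ms from one `fping -e` output line, or None."""
    match = FPING_TIME.search(fping_line)
    return float(match.group(1)) if match else None


def ping_alive(fping_line):
    """Mimic `grep -q alive; echo $?`: 0 when the line says alive, 1 otherwise."""
    return 0 if "alive" in fping_line else 1


print(ping_time_ms("192.0.2.10 is alive (0.52 ms)"))
print(ping_alive("192.0.2.11 is unreachable"))
```

Note that pingalive returns the shell exit status, so 0 means the host answered.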

 

 


LLD/Discovery

Discover: value must be a JSON object

Could mean you need to escape slashes, check output with zabbix_get


Cannot create item: item with the same key

Make sure the key contains "{#MACRONAME}"

Discovery data example

Output of a discovery script should look like:

{"data":[
  {"{#VAR1}":"value11","{#VAR2}":"value12"},
  {"{#VAR1}":"value21","{#VAR2}":"value22"}
]}
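
Hand-written JSON is easy to get wrong (a misplaced brace in a macro name, an unescaped backslash or quote). A minimal sketch of a discovery script that builds the structure with Python's json module instead; the function name and the sample values are illustrative only:

```python
import json


def build_lld(rows):
    """Wrap a list of {NAME: value} dicts in the {"data": [...]} envelope."""
    data = []
    for row in rows:
        # LLD macro names take the form {#NAME}; add the braces here
        # so callers cannot mistype them.
        data.append({"{#%s}" % name.upper(): str(value) for name, value in row.items()})
    return json.dumps({"data": data})


output = build_lld([
    {"VAR1": "value11", "VAR2": "value12"},
    {"VAR1": "value21", "VAR2": "value22"},
])
print(output)
```

json.dumps takes care of escaping quotes and backslashes in the values, which is one likely source of the "value must be a JSON object" error.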



IPMI

IPMI Monitoring account for zabbix

https://www.thomas-krenn.com/en/wiki/Configuring_IPMI_under_Linux_using_ipmitool

 

cannot connect to IPMI host: [22] Operation canceled

Usually temporary because of broken ipmi lib, ignore it

 

cannot connect to IPMI host: [16777411] Unknown error 16777411

classic, probably authentication problem

cannot connect to IPMI host: [22] Invalid argument

zabbix_sender

processed: 0; failed: 1

Possible causes:

  • incorrect hostname
  • incorrect item key
  • item not in the server configuration cache yet
  • Allowed hosts in trapper item
  • phase of moon
  • aliens

 

Testing zabbix_sender

zabbix_sender stuff
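
A test send needs the server, the host name exactly as configured in the frontend, the key of a trapper item and a value. The sketch below only assembles the command line and does not run it; the server name, host and key are placeholders, and 10051 is assumed as the trapper port.

```python
import shlex


def sender_command(server, host, key, value, port=10051, verbose=True):
    """Build the argv list for a single-value zabbix_sender test."""
    argv = ["zabbix_sender", "-z", server, "-p", str(port),
            "-s", host, "-k", key, "-o", str(value)]
    if verbose:
        # -vv prints the server response, including the processed/failed counters.
        argv.append("-vv")
    return argv


cmd = sender_command("zabbix.example.com", "web01", "test.trap", 42)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run as is; if the response shows "processed: 0; failed: 1", go through the possible causes listed above.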

Filters

The regular expressions referred to in discovery are found under Administration->General, and then "Regular expressions" in the dropdown at top right of the page

cannot connect to IPMI host: [125] Operation canceled

possibly authentication method issue


 

Calculated items

Cannot create item: Invalid first parameter

Cannot create item, error in formula

Probably a calculated item, try double-quoting the item key:

last("foo[bar]")

 

Install recent zabbix on CentOS/RHEL

rpm -ivh https://repo.zabbix.com/zabbix/3.4/rhel/7/x86_64/zabbix-release-3.4-2.el7.noarch.rpm
yum install zabbix-agent


Backing up tables

https://www.zabbix.org/wiki/Docs/howto/mysql_backup_script

 

cannot send list of active checks

If in agent log: most likely ServerActive is defined in agent config, while not used at all

It is also possible agent is sending some active check to server while host is monitored via proxy.

active check configuration update started to fail

??

Latest 20 issues

DEFAULT_LATEST_ISSUES_CNT in /usr/share/zabbix/include/defines.inc.php

 

Zabbix unreachable poller processes more than 75% busy

Increase StartPollersUnreachable

 

Zabbix poller processes more than 75% busy

another mystery

More than 100 items having missing data for more than 10 minutes

Could be high load. Also check Administration->Queue

Zabbix escalator processes more than 75% busy

probably high system load overall

Check agent

zabbix_get -s my.host.com -k agent.version

ZBX_NOTSUPPORTED

Could be anything, enable logging on agent. It could be version mismatch. Check

zabbix_get -s yourhost -k agent.version

If that works, you're calling for an undefined or unsupported key.

Incorrect trigger expression. Host "xx" does not exist or you have no access to this host.

Means there's no related item.

zabbix_get returns nothing

best look at log on agent side

run playbook on single host

ansible-playbook -l somehost someplay.yml

Category:Monitoring

 

Zabbix server is not running: the information displayed may not be current

Might be selinux: http://sysads.co.uk/2013/11/zabbix-server-running-alert/

 

 

 

Monitoring vmware

vmware.hv.cpu.usage[{$URL},{HOST.HOST}]" became not supported: Couldn't resolve host name

Set macro {$URL} to https://your.ip/sdk/ (shouldn't discovery figure that out from {$HOST}?)
      

Couldn't resolve host name

Sometimes it's a matter of waiting a few hours

 

vmware events collector returned empty result

???

No "vmware collector" processes started.

Check StartVMwareCollectors on server or proxy

 

unsupported item key

This might mean it's expecting a value from the script you're calling.

echo 1

became not supported: Not supported by Zabbix Agent

probably output by userparameter/script

ansible or API not showing host groups

Permissions!! See Administration->User Groups
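
To check what the API user can actually see, query the host groups directly. A minimal sketch of the JSON-RPC request body for hostgroup.get, assuming a Zabbix version that still accepts the session token in the "auth" field (newer versions expect it in an Authorization header instead); the token is a placeholder and the function name is made up:

```python
import json


def hostgroup_get_request(auth_token, request_id=1):
    """Build the JSON-RPC body for listing host groups visible to the token's user."""
    return {
        "jsonrpc": "2.0",
        "method": "hostgroup.get",
        "params": {"output": ["groupid", "name"]},
        "auth": auth_token,  # assumption: token passed in the body
        "id": request_id,
    }


payload = json.dumps(hostgroup_get_request("PLACEHOLDER_TOKEN"))
print(payload)
```

POST the payload to api_jsonrpc.php on the frontend with Content-Type application/json-rpc. An empty result without an error usually points at the user group permissions rather than at ansible.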

 

failed to update local proxy configuration copy: invalid field name "items.lastlogsize"

check everything :)

Received value [11] is not suitable for value type [Numeric (unsigned)] and data type [Decimal]

This probably means the agent returned 1\n1
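
A quick way to catch this before Zabbix does is to run the script output through the same kind of check. The helper below is a hypothetical sketch, not how the server validates values: it accepts the output only if it is a single non-negative integer once surrounding whitespace is stripped.

```python
def is_unsigned_integer(raw_output):
    """True when the output is exactly one non-negative integer (outer whitespace ignored)."""
    text = raw_output.strip()
    # isdigit() is False for the empty string and for anything containing
    # an inner newline, a sign or a decimal point.
    return text.isascii() and text.isdigit()


print(is_unsigned_integer("11\n"))
print(is_unsigned_integer("1\n1"))
```

A script that prints two lines, such as one echo too many, fails the check even though each line on its own looks fine.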

 

database is down: retrying in 10 seconds

try upping max_connections

Incorrect key file for table 'items'; try to repair it

Could be something /tmp related

 

 

another network error, wait for 8 seconds

UnreachableDelay=8

 

failed: first network error

Try adjusting Timeout in the server configuration

also Timeout in agents?

 

no active checks on server

???

 

 

show cpu utilization

Monitoring->host->graphs