1 # Arborist-SNMP |
1 |
|
2 Arborist-SNMP |
|
3 ============= |
2 |
4 |
3 home |
5 home |
4 : http://bitbucket.org/mahlon/Arborist-SNMP |
6 : http://bitbucket.org/mahlon/Arborist-SNMP |
5 |
7 |
6 code |
8 code |
7 : http://code.martini.nu/Arborist-SNMP |
9 : http://code.martini.nu/Arborist-SNMP |
8 |
10 |
9 |
11 |
10 ## Description |
12 Description |
|
13 ----------- |
11 |
14 |
12 Arborist is a monitoring toolkit that follows the UNIX philosophy |
15 Arborist is a monitoring toolkit that follows the UNIX philosophy |
13 of small parts and loose coupling for stability, reliability, and |
16 of small parts and loose coupling for stability, reliability, and |
14 customizability. |
17 customizability. |
15 |
18 |
16 This adds SNMP support to Arborist's monitoring, for things such as: |
19 This adds SNMP support to Arborist's monitoring, specifically |
|
20 for OIDs involving: |
17 |
21 |
18 - Disk space capacity |
22 - Disk space capacity |
19 - System load |
23 - System CPU utilization |
20 - Free memory |
24 - Memory and swap usage |
21 - Swap in use |
|
22 - Running process checks |
25 - Running process checks |
23 |
26 |
24 |
27 It tries to provide sane defaults, while allowing fine-grained settings |
25 ## Prerequisites |
28 per resource node. Both Windows and UCD-SNMP systems are supported. |
26 |
29 |
27 * Ruby 2.2 or better |
30 |
28 |
31 Prerequisites |
29 |
32 ------------- |
30 ## Installation |
33 |
|
34 * Ruby 2.3 or better |
|
35 * Net-SNMP libraries |
|
36 |
|
37 |
|
38 Installation |
|
39 ------------ |
31 |
40 |
32 $ gem install arborist-snmp |
41 $ gem install arborist-snmp |
33 |
42 |
34 |
43 |
35 ## Usage |
44 Configuration |
36 |
45 ------------- |
37 In this example, we've created a resource node under an existing host, like so: |
46 |
38 |
47 Global configuration overrides can be added to the Arborist config file, |
39 Arborist::Host( 'example' ) do |
48 under the `snmp` key. |
40 description "Example host" |
49 |
41 address '10.6.0.169' |
50 The defaults are as follows: |
42 resource 'load', description: 'machine load' |
51 |
43 resource 'disk' do |
52 arborist: |
44 include: [ '/', '/mnt' ] |
53 snmp: |
45 end |
54 timeout: 2 |
46 end |
55 retries: 1 |
47 |
56 community: public |
48 |
57 version: 2c |
49 From a monitor file, require this library, and create an snmp instance. |
58 port: 161 |
50 You can reuse a single instance, or create individual ones per monitor. |
59 batchsize: 25 |
51 |
60 cpu: |
52 require 'arborist/monitor/snmp' |
61 warn_at: 80 |
53 |
62 disk: |
54 Arborist::Monitor '5 minute load average check' do |
63 warn_at: 90 |
|
64 include: ~ |
|
65 exclude: |
|
66 - "^/dev(/.+)?$" |
|
67 - "^/net(/.+)?$" |
|
68 - "^/proc$" |
|
69 - "^/run$" |
|
70 - "^/sys/" |
|
71 memory: |
|
72 physical_warn_at: ~ |
|
73 swap_warn_at: 60 |
|
74 processes: |
|
75 check: [] |
|
76 |
|
77 |
|
78 The `warn_at` keys express usage capacity as a percentage, e.g. "Warn me |
|
79 when a disk mount point is at 90 percent utilization." |
|
80 |
|
81 |
|
82 ### Library Options |
|
83 |
|
84 * **timeout**: How long to wait for an SNMP response, in seconds. |
|
85 * **retries**: If an error occurs during SNMP communication, try again this many times before giving up. |
|
86 * **community**: The SNMP community name for reading data. |
|
87 * **version**: The SNMP protocol version. 1 and 2c are supported. |
|
88 * **port**: The UDP port SNMP is listening on. |
|
89 * **batchsize**: How many hosts to gather SNMP data on simultaneously. |
|
90 |
|
91 |
|
92 ### Category Options and Behavior |
|
93 |
|
94 #### CPU |
|
95 |
|
96 * **warn_at**: Set the node to a `warning` state when utilization is at or over this percentage. |
|
97 |
|
98 Utilization takes into account CPU core counts, and uses the 5 minute |
|
99 load average to calculate a percentage of current CPU use. |
|
100 |
|
101 Two properties are set on the node. `cpu` contains the detected CPU count |
|
102 and current utilization. `load` contains the 1, 5, and 15 minute load |
|
103 averages of the machine. |
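
As a rough illustration of the calculation described above, the sketch below normalizes a 5 minute load average by core count and compares the result to `warn_at`. The method names and the exact formula are assumptions made for this example; they are not taken from the library's source.

```ruby
# Hypothetical helpers, not part of arborist-snmp. The formula is an
# assumption: load average divided by core count, as a percentage.
def cpu_utilization( load5, cpu_count )
    return ( load5 / cpu_count.to_f * 100 ).round( 1 )
end

# Map utilization onto a node state, using the documented default of 80.
def cpu_status( load5, cpu_count, warn_at: 80 )
    usage = cpu_utilization( load5, cpu_count )
    return usage >= warn_at ? :warning : :ok
end

cpu_utilization( 2.0, 4 )   # a load of 2.0 on 4 cores is half capacity
cpu_status( 3.6, 4 )        # 90% of 4 cores, above the default threshold
```

Under this reading, a load average of 4.0 means something quite different on a single-core host than on a 16-core host, which is why the core count is part of the calculation.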
|
104 |
|
105 |
|
106 #### Disk |
|
107 |
|
108 * **warn_at**: Set the node to a `warning` state when disk capacity is at or over this amount. |
|
109 You can also set this to a Hash, keyed on mount name, if you want differing |
|
110 warning values per mount point. A mount point that is at 100% capacity will |
|
111 be explicitly set to `down`, as the resource it represents has been exhausted. |
|
112 * **include**: String or Array of Strings. If present, only matching mount points are |
|
113 considered while performing checks. These are treated as regular expressions. |
|
114 * **exclude**: String or Array of Strings. If present, matching mount points are removed |
|
115 from evaluation. These are treated as regular expressions. |
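
The interaction between `include`, `exclude`, and a Hash-valued `warn_at` can be sketched as follows. This is a minimal model of the behavior described above, not the library's implementation: `evaluate_mounts` is a hypothetical method, the input Hash of mount point to percent used stands in for values gathered over SNMP, and treating a mount that is missing from the `warn_at` Hash as "never warn" is an assumption.

```ruby
# Hypothetical sketch, not part of arborist-snmp.
# mounts: { mount point => percent used }
# config: the documented disk keys (:warn_at, :include, :exclude)
def evaluate_mounts( mounts, config = {} )
    warn_at  = config.fetch( :warn_at, 90 )
    includes = Array( config[:include] ).map {|pat| Regexp.new(pat) }
    excludes = Array( config[:exclude] ).map {|pat| Regexp.new(pat) }

    mounts.each_with_object( {} ) do |(mount, used), results|
        # With an include list, only matching mount points are considered.
        next if !includes.empty? && includes.none? {|re| mount =~ re }
        # Matching excludes are dropped from evaluation.
        next if excludes.any? {|re| mount =~ re }

        # A Hash allows a different threshold per mount point.
        limit = warn_at.is_a?( Hash ) ? warn_at[ mount ] : warn_at

        results[ mount ] =
            if used >= 100 then :down
            elsif limit && used >= limit then :warning
            else :ok
            end
    end
end

mounts = { '/' => 40, '/tmp' => 55, '/var' => 100, '/proc' => 0 }
evaluate_mounts( mounts,
    include: [ '^/tmp', '^/var' ],
    warn_at: { '/tmp' => 50, '/var' => 80 } )
```

Because the patterns are regular expressions, an unanchored pattern such as `/var` would also match `/srv/var`; anchoring with `^` and `$`, as the default excludes do, avoids that.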
|
116 |
|
117 |
|
118 #### Memory |
|
119 |
|
120 * **physical_warn_at**: Set the node to a `warning` state when RAM utilization is at or over this percentage. |
|
121 * **swap_warn_at**: Set the node to a `warning` state when swap utilization is at or over this percentage. |
|
122 |
|
123 Warnings are only set for swap by default, since that is usually a |
|
124 better indication of an impending problem. |
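
A small sketch of how the two thresholds might combine, assuming that an unset (`~`, i.e. `nil`) threshold simply disables that check. The constant and method names are hypothetical and exist only for this example.

```ruby
# Hypothetical sketch, not part of arborist-snmp.
# Mirrors the documented defaults: no physical threshold, swap at 60%.
MEMORY_DEFAULTS = { physical_warn_at: nil, swap_warn_at: 60 }

# Returns the list of memory types that are at or over their threshold.
def memory_warnings( physical_pct, swap_pct, config = {} )
    opts     = MEMORY_DEFAULTS.merge( config )
    warnings = []

    # Assumption: a nil threshold means "never warn" for that memory type.
    if opts[:physical_warn_at] && physical_pct >= opts[:physical_warn_at]
        warnings << 'physical'
    end
    if opts[:swap_warn_at] && swap_pct >= opts[:swap_warn_at]
        warnings << 'swap'
    end

    return warnings
end

memory_warnings( 99, 10 )                        # RAM is ignored by default
memory_warnings( 96, 5, physical_warn_at: 95 )   # opt in to RAM warnings
```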
|
125 |
|
126 |
|
127 #### Processes |
|
128 |
|
129 * **check**: String or Array of Strings. A list of processes to check if running. These are |
|
130 treated as regular expressions, and include process arguments. |
|
131 |
|
132 If any process in the list is not found in the process table, the |
|
133 resource is set to a `down` state. |
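
The matching rule can be illustrated with a short sketch. The process table here is a plain Array of command lines (name plus arguments) standing in for what would be read over SNMP, and both method names are hypothetical.

```ruby
# Hypothetical sketch, not part of arborist-snmp.
# process_table: Array of command lines, including arguments.
# checks: String or Array of Strings, each treated as a regular expression.
def missing_processes( process_table, checks )
    patterns = Array( checks ).map {|pat| Regexp.new(pat) }
    missing  = patterns.reject do |re|
        process_table.any? {|cmd| cmd =~ re }
    end
    return missing.map( &:source )
end

# Any missing process puts the resource into a `down` state.
def process_status( process_table, checks )
    return missing_processes( process_table, checks ).empty? ? :ok : :down
end

table = [ '/usr/sbin/sshd -D', 'important --production' ]
process_status( table, 'important --production' )
missing_processes( table, [ 'sshd', 'nginx' ] )
```

Since arguments are part of the match, a check such as `important --production` will not be satisfied by an `important --staging` process.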
|
134 |
|
135 |
|
136 Examples |
|
137 -------- |
|
138 |
|
139 In the simplest form, using default behaviors and settings, here's an |
|
140 example Monitor configuration: |
|
141 |
|
142 require 'arborist/snmp' |
|
143 |
|
144 Arborist::Monitor 'cpu load check', :cpu do |
|
145 every 1.minute |
|
146 match type: 'resource', category: 'cpu' |
|
147 exec( Arborist::Monitor::SNMP::CPU ) |
|
148 end |
|
149 |
|
150 Arborist::Monitor 'partition capacity', :disk do |
|
151 every 1.minute |
|
152 match type: 'resource', category: 'disk' |
|
153 exec( Arborist::Monitor::SNMP::Disk ) |
|
154 end |
|
155 |
|
156 Arborist::Monitor 'process checks', :proc do |
|
157 every 1.minute |
|
158 match type: 'resource', category: 'process' |
|
159 exec( Arborist::Monitor::SNMP::Process ) |
|
160 end |
|
161 |
|
162 Arborist::Monitor 'memory', :memory do |
|
163 every 1.minute |
|
164 match type: 'resource', category: 'memory' |
|
165 exec( Arborist::Monitor::SNMP::Memory ) |
|
166 end |
|
167 |
|
168 |
|
169 Additionally, if you'd like these SNMP monitors to rely on the SNMP |
|
170 service itself, you can add a UDP check for that. |
|
171 |
|
172 Arborist::Monitor 'udp service checks', :udp do |
55 every 30.seconds |
173 every 30.seconds |
56 match type: 'resource', category: 'load' |
174 match type: 'service', protocol: 'udp' |
57 include_down true |
175 exec( Arborist::Monitor::Socket::UDP ) |
58 use :addresses |
176 end |
59 |
177 |
60 snmp = Arborist::Monitor::SNMP::Load( error_at: 10 ) |
178 |
61 exec( snmp ) |
179 And a default node declaration: |
62 end |
180 |
63 |
181 Arborist::Host 'example' do |
64 Arborist::Monitor 'mount capacity check' do |
182 description 'An example host' |
65 every 30.seconds |
183 address 'demo.example.com' |
66 match type: 'resource', category: 'disk' |
184 |
67 include_down true |
185 resource 'cpu' |
68 use :addresses, :config |
186 resource 'memory' |
69 |
187 resource 'disk' |
70 exec( Arborist::Monitor::SNMP::Disk ) |
188 end |
71 end |
189 |
72 |
190 |
73 |
191 |
74 Please see the rdoc for all the mode types and error_at options. Per |
192 All configuration can be overridden from the defaults using the `config` |
75 node "config" vars override global defaults when instantiating the |
193 pragma, per node. Here's a more elaborate example that performs the following: |
76 monitor. |
194 |
|
195 * All SNMP monitored resources are quieted if the SNMP service itself is unavailable. |
|
196 * Only monitor specific disk partitions, warning at different capacities. |
|
197 * Ensure the 'important' process is running with the '--production' flag. |
|
198 * Warn at 95% memory utilization OR 10% swap. |
|
199 |
|
200 |
|
201 Arborist::Host 'example' do |
|
202 description 'An example host' |
|
203 address 'demo.example.com' |
|
204 |
|
205 service 'snmp', protocol: 'udp' |
|
206 |
|
207 resource 'cpu', description: 'machine cpu load' do |
|
208 depends_on 'example-snmp' |
|
209 end |
|
210 |
|
211 resource 'memory', description: 'machine ram and swap' do |
|
212 depends_on 'example-snmp' |
|
213 config physical_warn_at: 95, swap_warn_at: 10 |
|
214 end |
|
215 |
|
216 resource 'disk', description: 'partition capacity' do |
|
217 depends_on 'example-snmp' |
|
218 config \ |
|
219 include: [ |
|
220 '^/tmp', |
|
221 '^/var' |
|
222 ], |
|
223 warn_at: { |
|
224 '/tmp' => 50, |
|
225 '/var' => 80 |
|
226 } |
|
227 end |
|
228 |
|
229 resource 'process' do |
|
230 depends_on 'example-snmp' |
|
231 config check: 'important --production' |
|
232 end |
|
233 end |
77 |
234 |
78 |
235 |
79 |
236 |
80 ## License |
237 ## License |
81 |
238 |
82 Copyright (c) 2016, Michael Granger and Mahlon E. Smith |
239 Copyright (c) 2016-2018 Michael Granger and Mahlon E. Smith |
83 All rights reserved. |
240 All rights reserved. |
84 |
241 |
85 Redistribution and use in source and binary forms, with or without |
242 Redistribution and use in source and binary forms, with or without |
86 modification, are permitted provided that the following conditions are met: |
243 modification, are permitted provided that the following conditions are met: |
87 |
244 |