Volta
=====

What is volta?
--------------

Volta is a high-performance, low-resource URI rewriter for use with the
Squid caching proxy server (http://www.squid-cache.org/). With it, you
can dynamically alter URI requests that pass through Squid based on
various criteria.

It uses a state machine to parse URIs and rules, and a constant database
to store and access those rules. It can then perform conditional
rewrites either internally or by evaluating Lua scripts.


Why is it called "volta"?
-------------------------

It's a type of old Italian music written in triple-time. Quick!


How fast is it?
---------------

On a 2GHz Xeon 5130, it can process a million Squid requests against
10000 rules in less than 8 seconds, using about 800k of RAM. On a
1.8GHz Intel E4300, it can do it in 3 seconds.

Your mileage may vary, but for nearly all intents and purposes the answer
is "definitely fast enough."


Configuring squid
-----------------

You must enable URL rewriting from within the squid.conf file.

    url_rewrite_program /usr/local/bin/volta

... and that's it. You may need some additional customization, like where
the volta database is stored on disk:

    url_rewrite_program /usr/local/bin/volta -f /var/db/squid/volta.db

Busy servers:

While Volta is lightweight enough to simply increase the number of
rewriter children, it also supports Squid's rewrite_concurrency format
if you find that to be more efficient for your environment. Adjust to
taste.

    url_rewrite_children 5 startup=1 idle=2 concurrency=50


Using volta
-----------

See the INSTALL file for instructions on how to compile volta.

Volta reads its rewrite rules from a local database. You can create the
rules in a text editor, then convert them to the database like so:

    % volta -c rules.txt

You'll be left with a "volta.db" file in the current directory. Put it
wherever you please, and use the -f flag to point to it.


Rule file syntax
----------------

Volta's rule syntax is designed to be easy to parse by humans and
machines. Blank lines are skipped, as is any line that starts with the
'#' character, so you can keep the ASCII version of your rules well
documented and in version control. There is no practical limit on the
number of rules in this database.

When compiling the ruleset into the database format, volta detects
malformed rules and stops if there are any problems, leaving your
original database intact. You can change the ruleset at any time while
volta is running, and the new rules will take effect within about 10
seconds. No need to restart Squid!

There are two types of rules -- positive matches, and negative matches.
Positive matches cause the rewrite, negative matches intentionally allow
the original request to pass. Rule order is consistent, top-down, first
match wins. Fields are separated by any amount of whitespace (spaces or
tabs).


### Positive matches:

**First field**: *the hostname to match*

You can use an exact hostname (www.example.com), or the bare domain
(example.com, which volta refers to as the tld) if you want to match
every host under it. You can also use a single '*' to match every
request, though this bypasses much of what makes volta quick; it is
included for completeness. You may have an unlimited number of
rules per hostname. Hostnames are compared without case sensitivity.

**Second field**: *the path to match*

This can be an exact match ('/path/to/something.html'), a regular
expression ('\.(jpg|gif|png)$'), or a single '*' to match any
path. Regular expressions are matched without case sensitivity. There
is currently no internal support for captures, though you can use
a Lua rule (see below) for more complex processing.

**Third field**: *the redirect code and url to rewrite to*

Any pieces of a url that are omitted are automatically replaced
with the original request's element -- the exception is a hostname,
which is required. If you omit a redirect code, the URL rewrite is
transparent to the client. Attach a 301: or 302: prefix to send a
permanent or temporary redirect, respectively, instead.

If you require more complex processing than what volta provides
internally, you can also specify a path to a Lua script (prefixed
with 'lua:'). See the 'Lua rules' section of this README for more
information.


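Because the path field has no built-in capture support, a Lua rule is the place to reuse part of the matched path. The following is only a sketch of how that might look. The script name, the rule line in its header comment, the `/old/<id>` path layout, and the `new.example.com` target are all made up for illustration. It also assumes that the standard Lua string library is available inside volta, and that a returned redirect prefix takes the same `301:` form used in rule files.

```lua
-- capture.lua: hypothetical script, loaded by a rule such as
--   example.com  ^/old/  lua:/path/to/capture.lua

-- Every variable is declared 'local'; volta disables global declarations.
local path = request.path or ""

-- Capture the numeric id from a path like "/old/1234".
local id = string.match(path, "^/old/(%d+)")

if id then
    -- Permanent redirect to the (made-up) new location for that id.
    return "301:http://new.example.com/articles/" .. id
end

-- Nothing captured: returning nil leaves the request untouched.
return nil
```

Doing the capture in Lua keeps the rule file itself simple: the rule only decides which requests reach the script, and the script decides where each one goes.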
### Negative matches:

**First field**: *the hostname to match*

See above -- all the same rules apply.


**Second field**: *the path to match*

See above -- all the same rules apply.


**Third field**: *the 'negative' marker*

This is simply the '-' character, which signals to volta that this is
a negative matching rule.


You can easily test your rules by running volta on the command line, and
pasting URLs into it. Boost the debug level (-d4) if you're having any issues.


Examples
--------

Rewrite all requests to Google to the SSL version:

    google.com * 302:https://www.google.com

This will redirect the request "http://www.google.com/search?q=test" to
"https://www.google.com/search?q=test".


Transparently alter all uploaded images on imgur to be my face: :)

    i.imgur.com \.(gif|png|jpg)$ http://www.martini.nu/images/mahlon.jpg


Expand a local, unqualified hostname to a FQDN (useful alongside the
'dns_defnames' squid setting to enforce browser proxy behaviors):

    local-example * local-example.company.com


Cause all blog content except for 2011 posts to permanently redirect to
an archival page:

    martini.nu /blog/2011 -
    martini.nu /blog 301:martini.nu/content-archived.html


Send all requests to reddit/r/WTF/* through a Lua script for further processing:

    reddit.com /r/wtf lua:/path/to/a/lua-script.lua


Turn off rewriting for a specific network segment or IP address:

Squid has this ability built in -- see the 'url_rewrite_access' setting.
Alternatively, do the checks in Lua.



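The Lua alternative for exempting a network segment might look like the sketch below. It relies on the `request` table described in the 'Lua Rules' section of this README. The script name, the rule line in its header comment, the `192.168.10.` prefix, and the redirect target are invented for illustration. It also assumes `request.client_ip` holds a plain dotted IPv4 string. The prefix is compared with `string.sub` rather than a Lua pattern, so the dots are matched literally.

```lua
-- skip-local.lua: hypothetical script, loaded by a rule such as
--   example.com  *  lua:/path/to/skip-local.lua

-- Made-up network segment whose requests should never be rewritten.
local exempt_prefix = "192.168.10."

local ip = request.client_ip or ""

-- Literal prefix comparison (no pattern matching involved).
if string.sub(ip, 1, #exempt_prefix) == exempt_prefix then
    return nil  -- same effect as a negative rule match
end

-- Everyone else gets a temporary redirect to a made-up page.
return "302:http://intranet.example.com/blocked.html"
```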
Lua Rules
---------

Volta has an embedded Lua interpreter that you can use to perform all
kinds of conditional rewrites. Read more about the syntax of the Lua
language here: http://www.lua.org/manual/5.1/

### Loading a script

To use a Lua script, prefix the rewrite target of a volta rule with
'lua:'. The rest of the target is then treated as a path to the script.
(You can find an example in the Examples section of this README.)

You can specify a path to either an ASCII file, or Lua bytecode. (If
speed is at an absolute premium, I'm seeing around a 25% performance
increase by using Lua bytecode files.)

You can use different scripts for different rules, or use the same
script across any number of separate rules.

There is no need to restart Squid when modifying Lua rules. Changes are
seen immediately.


### Environment

* Global variable declarations are disabled, so scripts can't accidentally stomp on each other. All variables must be declared with the 'local' keyword.
* There is a global table called 'shared' you may use if you want to share data between separate scripts, or remember things between rule evaluations.
* The details of the request can be found in a table, appropriately named 'request'. HTTP scheme, host, path, port, method, client_ip, and domain are all available by default from the request table.
* Calling Lua's print() function emits debug information to stderr. Use a debug level of 2 or higher to see it.


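As a rough illustration of the points above, the following sketch touches each part of the environment: local declarations, the `shared` table, the `request` table, and `print()`. The script name and the per-host counter are invented for this example. It assumes `shared` behaves like an ordinary Lua table and that the standard string library is available. Each rewriter child presumably keeps its own copy of `shared`, so the counts would be per process.

```lua
-- counter.lua: hypothetical script illustrating the script environment.

-- 'local' is mandatory; a bare assignment such as `hits = 1` would be a
-- global declaration, which volta disables.
local host = request.host or "unknown"

-- 'shared' is remembered between evaluations, so it can hold a
-- per-host request counter.
shared.hits = shared.hits or {}
shared.hits[host] = (shared.hits[host] or 0) + 1

-- print() output goes to stderr at debug level 2 or higher.
print(string.format("%s %s%s from %s (seen %d times)",
    request.method or "-", host, request.path or "",
    request.client_ip or "-", shared.hits[host]))

-- No return statement: the request passes through without a rewrite.
```

Writing into a field of `shared` is not a global declaration, which is why the counter can live there while everything else stays local.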
### Return value

The return value of the script is sent unmodified to Squid. It should
be the URL the request is rewritten to, with an optional redirect code
prefix (301 or 302).

Omitting a return value, or returning 'nil', has the same effect as a negative
rule match -- the original request is allowed through without any rewrite.


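The three possible outcomes can be put side by side in one short script. This is a sketch only: the script name, the paths, and `static.example.com` are made up. It assumes `request.scheme` holds a bare scheme name such as "http", and that an unprefixed return value is transparent to the client, as it is for an unprefixed target in a rule file.

```lua
-- returns.lua: hypothetical script showing the three possible outcomes.

local host = request.host
local path = request.path or ""

-- 1. Redirect with a code: the client is sent a 302 to the HTTPS URL.
if host and request.scheme == "http" and path == "/login" then
    return "302:https://" .. host .. path
end

-- 2. Transparent rewrite: no prefix, so the client never sees the change.
if path == "/favicon.ico" then
    return "http://static.example.com/favicon.ico"
end

-- 3. No rewrite: nil (or no return at all) lets the request through.
return nil
```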
An extremely simple Lua rule script can be found in the 'examples'
directory, distributed with volta.