diff -r bd746609ba46 -r d4ce82194b64 README
--- a/README	Mon Nov 07 10:43:09 2011 -0800
+++ b/README	Wed Nov 09 15:54:37 2011 -0800
@@ -3,14 +3,173 @@
 =====
 
 What is volta?
-	- high performance / low resource redirector
+--------------
+
+Volta is a high performance, low resource URI rewriter for use with the
+Squid caching proxy server (http://www.squid-cache.org/). With it, you
+can dynamically alter URI requests that pass through Squid based on
+various criteria.
+
+It uses a state machine to parse URIs and rules, and a constant database
+to store and access those rules.
+
+
+Why is it called "volta"?
+-------------------------
 
-Why "volta"?
-	- latin term, turn
+It's a type of old Italian music written in triple-time. Quick!
+
+
+How fast is it?
+---------------
+
+On a 2GHz Xeon 5130, it can process a million squid requests against
+10000 rules in less than 8 seconds, using about 800k of RAM. On a
+1.8GHz Intel E4300, it can do it in 3 seconds.
+
+Your mileage may vary, but for almost all intents and purposes the answer
+is "definitely fast enough."
+
 
 Configuring squid
+-----------------
+
+You must enable URL rewriting from within the squid.conf file.
+
+    url_rewrite_program /usr/local/bin/volta
+
+... and that's it. You may need some additional customization, like where
+the volta database is stored on disk:
+
+    url_rewrite_program /usr/local/bin/volta -f /var/db/squid/volta.db
+
+Busy servers:
+
+Make sure url_rewrite_concurrency is disabled; volta is single threaded.
+Instead, just add more volta children. They are lightweight, so load 'em
+up. A proxy at my $DAYJOB is in use by around 450 people, and we get by
+nicely with 10 volta children.
+
+    url_rewrite_concurrency 0
+    url_rewrite_children 10
+
 
 Using volta
+-----------
 
-How to
+See the INSTALL file for instructions on how to compile volta.
+
+Volta reads its rewrite rules from a local database. You can create the
+rules in a text editor, then convert them to the database like so:
+
+    % volta -c rules.txt
+
+You'll be left with a "volta.db" file in the current directory. Put it
+wherever you please, and use the -f flag to point to it.
+
+
+Rule file syntax
+----------------
+
+Volta's rule syntax is designed to be easy to parse by humans and
+machines. Blank lines are skipped, as is any line that starts with the
+'#' character, so you can keep the ASCII version of your rules well
+documented and in version control.
+
+When compiling the ruleset into the database format, volta detects
+malformed rules and stops if there are any problems, leaving your
+original database intact. You can change the ruleset at any time while
+volta is running, and the new rules will take effect within about 10
+seconds. No need to restart squid!
+
+There are two types of rules -- positive matches, and negative matches.
+Positive matches cause the rewrite, negative matches allow the original
+request to pass. Rules are evaluated top-down, and the first match wins.
+Fields are separated by any amount of whitespace (spaces or tabs).
+
+
+### Positive matches:
+
+  First field: the hostname to match.
+
+    You can use an exact hostname (www.example.com), or the top level
+    domain (tld) if you want to match everything under a specific host
+    (example.com). You can also use a single '*' to match every request,
+    though this essentially bypasses a lot of what makes volta quick; it
+    is included for completeness. You may have an unlimited number of
+    rules per hostname. Hostnames are compared without case sensitivity.
+
+
+  Second field: the path to match.
+
+    This can be an exact match ('/path/to/something.html'), a regular
+    expression ('\.(jpg|gif|png)$'), or a single '*' to match any
+    path. Regular expressions are matched without case sensitivity. There
+    is currently no support for capturing, though this may be added in
+    a future release.
+
+  Third field: the redirect code and URL to rewrite to.
+    Any pieces of a URL that are omitted are automatically replaced
+    with the corresponding element of the original request -- the
+    exception is the hostname, which is required. If you omit a redirect
+    code, the URL rewrite is transparent to the client. You can attach a
+    301: or 302: prefix to send a permanent or temporary redirect instead.
+
+
+### Negative matches:
+
+  First field: the hostname to match.
+
+    See above -- all the same rules apply.
+
+
+  Second field: the path to match.
+
+    See above -- all the same rules apply.
+
+
+  Third field: the 'negative' marker.
+
+    This is simply the '-' character, which signals to volta that this is
+    a negative matching rule.
+
+
+You can easily test your rules by running volta on the command line and
+pasting URLs into it. Boost the debug level (-d4) if you're having any issues.
+
+
+Examples
+--------
+
+Redirect all requests to Google to the SSL version:
+
+    google.com * 302:https://www.google.com
+
+  This will redirect the request "http://www.google.com/search?q=test" to
+  "https://www.google.com/search?q=test".
+
+
+Transparently alter all uploaded images on imgur to be my face: :)
+
+    i.imgur.com \.(gif|png|jpg)$ http://www.martini.nu/images/mahlon.jpg
+
+
+Expand a local, unqualified hostname to a FQDN (useful alongside the
+'dns_defnames' squid setting to enforce browser proxy behaviors):
+
+    local-example * local-example.company.com
+
+
+Cause all blog content except for 2011 posts to permanently redirect to
+an archival page:
+
+    martini.nu /blog/2011 -
+    martini.nu /blog      301:martini.nu/content-archived.html
+
+
+Turn off rewriting for a specific network segment or IP address:
+
+    Squid has this ability built in -- see the 'url_rewrite_access' setting.
+
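
Note on the rule semantics documented in this change: the behaviour the
README describes (whitespace-separated fields, '#' comments, top-down
evaluation where the first match wins, and '-' as the negative marker) can
be sketched in a few lines of Python. This is only an illustration under
stated assumptions, not volta's implementation: volta itself uses a state
machine and a constant database, the function names `parse_rules`,
`host_matches`, `path_matches` and `rewrite` are hypothetical, every path
field is treated as a case-insensitive regular expression search, and the
filling-in of omitted URL pieces is not modelled.

```python
import re


def parse_rules(text):
    """Parse rule text into (host, path, target) tuples.

    Blank lines and lines starting with '#' are skipped. A malformed
    line raises ValueError (an assumption standing in for volta's
    "stop on malformed rules" behaviour).
    """
    rules = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()  # any amount of spaces or tabs
        if len(fields) != 3:
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        rules.append(tuple(fields))
    return rules


def host_matches(rule_host, host):
    """Case-insensitive match on '*', the exact host, or any host under it."""
    rule_host, host = rule_host.lower(), host.lower()
    return rule_host == "*" or host == rule_host or host.endswith("." + rule_host)


def path_matches(rule_path, path):
    """Match on '*', an exact path, or a case-insensitive regex search."""
    if rule_path == "*" or rule_path == path:
        return True
    return re.search(rule_path, path, re.IGNORECASE) is not None


def rewrite(rules, host, path):
    """Return the target of the first matching rule.

    Returns None when no rule matches or when the first match is a
    negative ('-') rule, meaning the request passes through unchanged.
    """
    for rule_host, rule_path, target in rules:
        if host_matches(rule_host, host) and path_matches(rule_path, path):
            return None if target == "-" else target
    return None


# The blog example from the README: 2011 posts pass, the rest are archived.
RULES = parse_rules("""
# keep 2011 posts, archive everything else
martini.nu  /blog/2011  -
martini.nu  /blog       301:martini.nu/content-archived.html
""")
```

With these rules, `rewrite(RULES, "martini.nu", "/blog/2011/hello")` hits
the negative rule first and returns None, while a 2010 post falls through
to the second rule. Swapping the two rule lines would archive the 2011
posts as well, which is why rule order matters.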