# HG changeset patch
# User Mahlon E. Smith
# Date 1452581899 28800
# Node ID c24dbd004cbcbec3630796c0c0529da47fefeac4
# Parent fbfdaa5215a32ff48f566ce143aa1799bf51e081
Rename INSTALL and README for Bitbucket formatting.

diff -r fbfdaa5215a3 -r c24dbd004cbc INSTALL
--- a/INSTALL Tue Jun 09 16:40:09 2015 -0700
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,66 +0,0 @@
-
-Building volta
-===============
-
-Volta should build with minimal effort. First, make sure you've got the
-dependencies installed.
-
-
-Dependencies
-------------
-
- - TinyCDB (http://www.corpit.ru/mjt/tinycdb.html)
- - GNU make (http://www.gnu.org/software/make/)
- - Lua (http://www.lua.org/)
-
-If available, install these dependencies from your OS packaging system
-of choice.
-
-
-Compiling
----------
-
-Just type 'make'. Depending on your platform, GNU make may have been
-installed as 'gmake' If you get any errors, try 'gmake' first.
-
-It should build without warnings.
-
-
-Installation
-------------
-
-Volta doesn't contain any installation targets. You can put the binary
-wherever makes sense on your system. Note that it should be in a
-directory that is writable to Squid, unless you plan to store the
-database separately. An example:
-
- mv volta /usr/local/bin
- mkdir -p /var/db/squid
- chown squid:squid /var/db/squid
- volta -f /var/db/squid/volta.db
-
-I usually just drop it into the squid configuration directory and run it
-from there.
-
-For usage information, see the README.
-
-
-Development
------------
-
-Volta source can be cloned via Mercurial. The repo is found at:
-
- http://code.martini.nu/volta
-
-You can use the 'debug' make target to compile a (very noisy) binary
-that contains gdb symbols and perftool hooks. Set the CPUPROFILE
-environment variable to "cpu.prof" to generate a profile.
-
-If you're hacking, you'll want these dependencies installed too.
-
- - ragel (http://www.complang.org/ragel/)
- - mercurial (http://mercurial.selenic.com/)
- - google-perftools (http://code.google.com/p/google-perftools/)
- - graphviz (http://www.graphviz.org/)
- - ctags (http://ctags.sourceforge.net/)
-
diff -r fbfdaa5215a3 -r c24dbd004cbc INSTALL.md
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/INSTALL.md Mon Jan 11 22:58:19 2016 -0800
@@ -0,0 +1,67 @@
+
+Building volta
+===============
+
+Volta should build with minimal effort. First, make sure you've got the
+dependencies installed.
+
+
+Dependencies
+------------
+
+ - TinyCDB (http://www.corpit.ru/mjt/tinycdb.html)
+ - GNU make (http://www.gnu.org/software/make/)
+ - Lua (http://www.lua.org/)
+
+If available, install these dependencies from your OS packaging system
+of choice.
+
+
+Compiling
+---------
+
+Just type 'make'. Depending on your platform, GNU make may have been
+installed as 'gmake' If you get any errors, try 'gmake' first.
+
+It should build without warnings.
+
+
+Installation
+------------
+
+Volta doesn't contain any installation targets. You can put the binary
+wherever makes sense on your system. Note that it should be in a
+directory that is writable to Squid, unless you plan to store the
+database separately. An example:
+
+ mv volta /usr/local/bin
+ mkdir -p /var/db/squid
+ chown squid:squid /var/db/squid
+ volta -f /var/db/squid/volta.db
+
+I usually just drop it into the squid configuration directory and run it
+from there.
+
+For usage information, see the README.
+
+
+Development
+-----------
+
+Volta source can be cloned via Mercurial. The repo can be found at:
+
+ http://code.martini.nu/volta (primary)
+ https://bitbucket.org/mahlon/volta (secondary)
+
+You can use the 'debug' make target to compile a (very noisy) binary
+that contains gdb symbols and perftool hooks. Set the CPUPROFILE
+environment variable to "cpu.prof" to generate a profile.
+
+If you're hacking, you'll want these dependencies installed too.
+
+ - ragel (http://www.complang.org/ragel/)
+ - mercurial (http://mercurial.selenic.com/)
+ - google-perftools (http://code.google.com/p/google-perftools/)
+ - graphviz (http://www.graphviz.org/)
+ - ctags (http://ctags.sourceforge.net/)
+
diff -r fbfdaa5215a3 -r c24dbd004cbc README
--- a/README Tue Jun 09 16:40:09 2015 -0700
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,236 +0,0 @@
-
-Volta
-=====
-
-What is volta?
---------------
-
-Volta is a high performance, low resource URI rewriter for use with the
-Squid caching proxy server (http://www.squid-cache.org/.) With it, you
-can dynamically alter URI requests that pass through Squid based on
-various criteria.
-
-It uses a state machine to parse URIs and rules, and a constant database
-to store and access those rules. It can then either perform conditional
-rewrites internally, or by evaluating Lua scripts.
-
-
-Why is it called "volta"?
--------------------------
-
-It's a type of old Italian music written in triple-time. Quick!
-
-
-How fast is it?
----------------
-
-On a 2Ghz Xeon 5130, it can process a million squid requests against
-10000 rules in less than 8 seconds, using about 800k of ram. On an
-1.8Ghz Intel E4300, it can do it in 3 seconds.
-
-Your mileage may vary, but for most all intents and purposes the answer
-is "definitely fast enough."
-
-
-Configuring squid
------------------
-
-You must enable url rewriting from within the squid.conf file.
-
- url_rewrite_program /usr/local/bin/volta
-
-... and that's it. You may need some additional customization, like where
-the volta database is stored on disk:
-
- url_rewrite_program /usr/local/bin/volta -f /var/db/squid/volta.db
-
-Busy servers:
-
-While Volta is lightweight enough to simply increase the amount of
-rewriter children, it also supports Squid's rewrite_concurrency format
-if you find that to be more efficient for your environment. Adjust to
-taste.
-
- url_rewrite_children 5 startup=1 idle=2 concurrency=50
-
-
-Using volta
------------
-
-See the INSTALL file for instructions on how to compile volta.
-
-Volta reads its rewrite rules from a local database. You can create the
-rules in a text editor, then convert it to the database like so:
-
- % volta -c rules.txt
-
-You'll be left with a "volta.db" file in the current directory. Put it
-wherever you please, and use the -f flag to point to it.
-
-
-Rule file syntax
-----------------
-
-Volta's rule syntax is designed to be easy to parse by humans and
-machines. Blank lines are skipped, as is any line that starts with the
-'#' character, so you can keep the ascii version of your rules well
-documented and in version control. There is no practical limit on the
-number of rules in this database.
-
-When compiling the ruleset into the database format, volta detects
-malformed rules and stops if there are any problems, leaving your
-original database intact. You can change the ruleset at any time while
-volta is running, and the new rules will take affect within about 10
-seconds. No need to restart squid!
-
-There are two types of rules -- positive matches, and negative matches.
-Positive matches cause the rewrite, negative matches intentionally allow
-the original request to pass. Rule order is consistent, top-down, first
-match wins. Fields are separated by any amount of whitespace (spaces or
-tabs.)
-
-
-### Positive matches:
-
- First field: the hostname to match.
-
- You can use an exact hostname (www.example.com), or the top level
- domain (tld) if you want to match everything under a specific host
- (example.com.) You can also use a single '*' to match every request,
- though this essentially bypasses a lot of what makes volta quick, it
- is included for completeness. You may have an unlimited amount of
- rules per hostname. Hostnames are compared without case sensitivity.
-
- Second field: the path to match.
-
- This can be an exact match ('/path/to/something.html'), a regular
- expression ('\.(jpg|gif|png)$'), or a single '*' to match for any
- path. Regular expressions are matched without case sensitivity. There
- is currently no internal support for captures, though you can use
- a Lua rule (see below) for more complex processing.
-
-
- Third field: The redirect code and url to rewrite to.
-
- Any pieces of a url that are omitted are automatically replaced
- with the original request's element -- the exception is a hostname,
- which is required. If you omit a redirect code, the URL rewrite is
- transparent to the client. You can attach a 301: or 302: prefix to
- cause a permanent or temporary code to be respectively sent, instead.
-
- If you require more complex processing than what volta provides
- internally, you can also specify a path to a Lua script (prefixed
- with 'lua:'.) See the 'Lua rules' section of this README for more
- information.
-
-
-### Negative matches:
-
- First field: the hostname to match.
-
- See above -- all the same rules apply.
-
-
- Second field: the path to match.
-
- See above -- all the same rules apply.
-
-
- Third field: the 'negative' marker.
-
- This is simply the '-' character, that signals to volta that this is
- a negative matching rule.
-
-
-You can easily test your rules by running volta on the command line, and
-pasting URLs into it. Boost the debug level (-d4) if you're having any issues.
-
-
-Examples
---------
-
-Rewrite all requests to Google to the SSL version:
-
- google.com * 302:https://www.google.com
-
- This will redirect the request "http://www.google.com/search?q=test" to
- "https://www.google.com/search?q=test".
-
-
-Transparently alter all uploaded images on imgur to be my face: :)
-
- i.imgur.com \.(gif|png|jpg)$ http://www.martini.nu/images/mahlon.jpg
-
-
-Expand a local, non qualified hostname to a FQDN (useful alongside the
-'dns_defnames' squid setting to enforce browser proxy behaviors):
-
- local-example * local-example.company.com
-
-
-Cause all blog content except for 2011 posts to permanently redirect to
-an archival page:
-
- martini.nu /blog/2011 -
- martini.nu /blog 301:martini.nu/content-archived.html
-
-
-Send all requests to reddit/r/WTF/* through a lua script for further processing.
-
- reddit.com /r/wtf lua:/path/to/a/lua-script.lua
-
-
-Turn off rewriting for specific network segment or IP address:
-
- Squid has this ability built in -- see the 'url_rewrite_access' setting.
- Alternatively, do the checks in lua.
-
-
-
-Lua Rules
----------
-
-Volta has an embedded Lua interpreter that you can use to perform all
-kinds of conditional rewrites. Read more about the syntax of the Lua
-language here: http://www.lua.org/manual/5.1/
-
-### Loading a script
-
-To use a Lua script, prefix the rewrite target of a volta rule with
-'lua:'. The rest of the target is then treated as a path to the script.
-(You can find an example in the Examples section of this README.)
-
-You can specify a path to either an ascii file, or Lua bytecode. (If
-speed is an absolute premium, I'm seeing around a 25% performance
-increase by using Lua bytecode files.)
-
-You can use different scripts for different rules, or use the same
-script across any number of separate rules.
-
-There is no need to restart squid when modifying Lua rules. Changes are
-seen immediately.
-
-
-### Environment
-
-* Global variable declarations are disabled, so scripts can't accidently stomp on each other. All variables must be declared with the 'local' keyword.
-* There is a global table called 'shared' you may use if you want to share data between separate scripts, or remember things in-between rule evaluations.
-* The details of the request can be found in a table, appropriately named 'request'. HTTP scheme, host, path, port, method, client_ip, and domain are all available by default from the request table.
-* Calling Lua's print() function emits debug information to stderr. Use a debug level of 2 or higher to see it.
-
-
-### Return value
-
-The return value of the script is sent unmodified to squid, which should
-be a URL the request is rewritten to, with an optional redirect code
-prefix (301 or 302.)
-
-Omitting a return value, or returning 'nil' has the same effect as a negative
-rule match -- the original request is allowed through without any rewrite.
-
-
-An extremely simple Lua rule script can be found in the 'examples'
-directory, distributed with volta.
-
-
diff -r fbfdaa5215a3 -r c24dbd004cbc README.md
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Mon Jan 11 22:58:19 2016 -0800
@@ -0,0 +1,233 @@
+
+Volta
+=====
+
+What is volta?
+--------------
+
+Volta is a high performance, low resource URI rewriter for use with the
+Squid caching proxy server (http://www.squid-cache.org/.) With it, you
+can dynamically alter URI requests that pass through Squid based on
+various criteria.
+
+It uses a state machine to parse URIs and rules, and a constant database
+to store and access those rules. It can then either perform conditional
+rewrites internally, or by evaluating Lua scripts.
+
+
+Why is it called "volta"?
+-------------------------
+
+It's a type of old Italian music written in triple-time. Quick!
+
+
+How fast is it?
+---------------
+
+On a 2Ghz Xeon 5130, it can process a million squid requests against
+10000 rules in less than 8 seconds, using about 800k of ram. On an
+1.8Ghz Intel E4300, it can do it in 3 seconds.
+
+Your mileage may vary, but for most all intents and purposes the answer
+is "definitely fast enough."
+
+
+Configuring squid
+-----------------
+
+You must enable url rewriting from within the squid.conf file.
+
+ url_rewrite_program /usr/local/bin/volta
+
+... and that's it. You may need some additional customization, like where
+the volta database is stored on disk:
+
+ url_rewrite_program /usr/local/bin/volta -f /var/db/squid/volta.db
+
+Busy servers:
+
+While Volta is lightweight enough to simply increase the amount of
+rewriter children, it also supports Squid's rewrite_concurrency format
+if you find that to be more efficient for your environment. Adjust to
+taste.
+
+ url_rewrite_children 5 startup=1 idle=2 concurrency=50
+
+
+Using volta
+-----------
+
+See the INSTALL file for instructions on how to compile volta.
+
+Volta reads its rewrite rules from a local database. You can create the
+rules in a text editor, then convert it to the database like so:
+
+ % volta -c rules.txt
+
+You'll be left with a "volta.db" file in the current directory. Put it
+wherever you please, and use the -f flag to point to it.
+
+
+Rule file syntax
+----------------
+
+Volta's rule syntax is designed to be easy to parse by humans and
+machines. Blank lines are skipped, as is any line that starts with the
+'#' character, so you can keep the ascii version of your rules well
+documented and in version control. There is no practical limit on the
+number of rules in this database.
+
+When compiling the ruleset into the database format, volta detects
+malformed rules and stops if there are any problems, leaving your
+original database intact. You can change the ruleset at any time while
+volta is running, and the new rules will take affect within about 10
+seconds. No need to restart squid!
+
+There are two types of rules -- positive matches, and negative matches.
+Positive matches cause the rewrite, negative matches intentionally allow
+the original request to pass. Rule order is consistent, top-down, first
+match wins. Fields are separated by any amount of whitespace (spaces or
+tabs.)
+
+
+### Positive matches:
+
+**First field**: *the hostname to match*
+
+ You can use an exact hostname (www.example.com), or the top level
+ domain (tld) if you want to match everything under a specific host
+ (example.com.) You can also use a single '*' to match every request,
+ though this essentially bypasses a lot of what makes volta quick, it
+ is included for completeness. You may have an unlimited amount of
+ rules per hostname. Hostnames are compared without case sensitivity.
+
+**Second field**: *the path to match*
+
+ This can be an exact match ('/path/to/something.html'), a regular
+ expression ('\.(jpg|gif|png)$'), or a single '*' to match for any
+ path. Regular expressions are matched without case sensitivity. There
+ is currently no internal support for captures, though you can use
+ a Lua rule (see below) for more complex processing.
+
+**Third field**: *the redirect code and url to rewrite to*
+
+ Any pieces of a url that are omitted are automatically replaced
+ with the original request's element -- the exception is a hostname,
+ which is required. If you omit a redirect code, the URL rewrite is
+ transparent to the client. You can attach a 301: or 302: prefix to
+ cause a permanent or temporary code to be respectively sent, instead.
+
+ If you require more complex processing than what volta provides
+ internally, you can also specify a path to a Lua script (prefixed
+ with 'lua:'.) See the 'Lua rules' section of this README for more
+ information.
+
+
+### Negative matches:
+
+**First field**: *the hostname to match*
+
+See above -- all the same rules apply.
+
+
+**Second field**: *the path to match*
+
+See above -- all the same rules apply.
+
+
+**Third field**: *the 'negative' marker*
+
+This is simply the '-' character, that signals to volta that this is
+a negative matching rule.
+
+
+You can easily test your rules by running volta on the command line, and
+pasting URLs into it. Boost the debug level (-d4) if you're having any issues.
+
+
+Examples
+--------
+
+Rewrite all requests to Google to the SSL version:
+
+ google.com * 302:https://www.google.com
+
+ This will redirect the request "http://www.google.com/search?q=test" to
+ "https://www.google.com/search?q=test".
+
+
+Transparently alter all uploaded images on imgur to be my face: :)
+
+ i.imgur.com \.(gif|png|jpg)$ http://www.martini.nu/images/mahlon.jpg
+
+
+Expand a local, non qualified hostname to a FQDN (useful alongside the
+'dns_defnames' squid setting to enforce browser proxy behaviors):
+
+ local-example * local-example.company.com
+
+
+Cause all blog content except for 2011 posts to permanently redirect to
+an archival page:
+
+ martini.nu /blog/2011 -
+ martini.nu /blog 301:martini.nu/content-archived.html
+
+
+Send all requests to reddit/r/WTF/* through a lua script for further processing.
+
+ reddit.com /r/wtf lua:/path/to/a/lua-script.lua
+
+
+Turn off rewriting for specific network segment or IP address:
+
+ Squid has this ability built in -- see the 'url_rewrite_access' setting.
+ Alternatively, do the checks in lua.
+
+
+
+Lua Rules
+---------
+
+Volta has an embedded Lua interpreter that you can use to perform all
+kinds of conditional rewrites. Read more about the syntax of the Lua
+language here: http://www.lua.org/manual/5.1/
+
+### Loading a script
+
+To use a Lua script, prefix the rewrite target of a volta rule with
+'lua:'. The rest of the target is then treated as a path to the script.
+(You can find an example in the Examples section of this README.)
+
+You can specify a path to either an ascii file, or Lua bytecode. (If
+speed is an absolute premium, I'm seeing around a 25% performance
+increase by using Lua bytecode files.)
+
+You can use different scripts for different rules, or use the same
+script across any number of separate rules.
+
+There is no need to restart squid when modifying Lua rules. Changes are
+seen immediately.
+
+
+### Environment
+
+* Global variable declarations are disabled, so scripts can't accidently stomp on each other. All variables must be declared with the 'local' keyword.
+* There is a global table called 'shared' you may use if you want to share data between separate scripts, or remember things in-between rule evaluations.
+* The details of the request can be found in a table, appropriately named 'request'. HTTP scheme, host, path, port, method, client_ip, and domain are all available by default from the request table.
+* Calling Lua's print() function emits debug information to stderr. Use a debug level of 2 or higher to see it.
+
+
+### Return value
+
+The return value of the script is sent unmodified to squid, which should
+be a URL the request is rewritten to, with an optional redirect code
+prefix (301 or 302.)
+
+Omitting a return value, or returning 'nil' has the same effect as a negative
+rule match -- the original request is allowed through without any rewrite.
+
+
+An extremely simple Lua rule script can be found in the 'examples'
+directory, distributed with volta.
+
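
The rule semantics described in the README.md hunk (blank lines and '#' lines skipped, whitespace-separated fields, top-down evaluation where the first match wins, '-' as the negative marker) can be sketched as a small Python model. This is an illustration only, not volta's implementation: the names `Rule`, `parse_rules`, `host_matches`, `path_matches`, and `first_match` are hypothetical. Matching a request host against a parent domain by suffix, and treating every non-'*' path (exact paths included) as a case-insensitive pattern for Python's `re`, are assumptions about behaviour the README only outlines. Filling omitted URL pieces from the original request and the 'lua:' targets are not modelled.

```python
import re
from typing import List, NamedTuple, Optional


class Rule(NamedTuple):
    host: str    # exact hostname, parent domain, or '*'
    path: str    # '*' or a pattern (exact paths are treated as patterns here)
    target: str  # rewrite target, or '-' for a negative rule


def parse_rules(text: str) -> List[Rule]:
    """Skip blank lines and '#' comment lines; split fields on whitespace."""
    rules = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 3:
            # Mirrors the README's "stops if there are any problems".
            raise ValueError(f"malformed rule: {line!r}")
        rules.append(Rule(*fields))
    return rules


def host_matches(rule_host: str, request_host: str) -> bool:
    """Case-insensitive host match; the suffix test is an assumption."""
    rule_host, request_host = rule_host.lower(), request_host.lower()
    return (
        rule_host == "*"
        or request_host == rule_host
        or request_host.endswith("." + rule_host)
    )


def path_matches(rule_path: str, request_path: str) -> bool:
    """'*' matches any path; anything else is a case-insensitive search."""
    if rule_path == "*":
        return True
    return re.search(rule_path, request_path, re.IGNORECASE) is not None


def first_match(rules: List[Rule], host: str, path: str) -> Optional[str]:
    """Return the first matching rule's target, or None to pass through."""
    for rule in rules:
        if host_matches(rule.host, host) and path_matches(rule.path, path):
            return None if rule.target == "-" else rule.target
    return None


# The "all blog content except for 2011 posts" example from the README.
RULES = parse_rules("""\
# 2011 posts pass through; the rest of the blog is archived.
martini.nu  /blog/2011  -
martini.nu  /blog       301:martini.nu/content-archived.html
""")

print(first_match(RULES, "martini.nu", "/blog/2011/a-post"))  # None
print(first_match(RULES, "martini.nu", "/blog/2012/a-post"))  # 301:martini.nu/content-archived.html
```

The order of the two rules matters in this model: swapping them would let the broader `/blog` rule win for 2011 posts as well, which is why the negative rule is listed first in the README's example.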