README
changeset 18 d4ce82194b64
parent 15 2706fc514dea
child 22 822094314703
child 26 7b28fb383da2
--- a/README	Mon Nov 07 10:43:09 2011 -0800
+++ b/README	Wed Nov 09 15:54:37 2011 -0800
@@ -3,14 +3,173 @@
 =====
 
 What is volta?
-	- high performance / low resource redirector
+--------------
+
+Volta is a high performance, low resource URI rewriter for use with the
+Squid caching proxy server (http://www.squid-cache.org/).  With it, you
+can dynamically alter URI requests that pass through Squid based on
+various criteria.
+
+It uses a state machine to parse URIs and rules, and a constant database
+to store and access those rules.
+
+
+Why is it called "volta"?
+-------------------------
 
-Why "volta"?
-	- latin term, turn
+It's a type of old Italian music written in triple-time.  Quick!
+
+
+How fast is it?
+---------------
+
+On a 2GHz Xeon 5130, it can process a million Squid requests against
+10000 rules in less than 8 seconds, using about 800k of RAM.  On a
+1.8GHz Intel E4300, it can do it in 3 seconds.
+
+Your mileage may vary, but for most intents and purposes the answer
+is "definitely fast enough."
+
 
 Configuring squid
+-----------------
+
+You must enable URL rewriting from within the squid.conf file.
+
+	url_rewrite_program /usr/local/bin/volta
+
+... and that's it.  You may need some additional customization, like where
+the volta database is stored on disk:
+
+	url_rewrite_program /usr/local/bin/volta -f /var/db/squid/volta.db
+
+Busy servers:
+
+Make sure url_rewrite_concurrency is disabled, since volta is single threaded.
+Instead, just add more volta children.  They are lightweight, so load 'em
+up.  A proxy at my $DAYJOB is in use by around 450 people, and we get by
+nicely with 10 volta children.
+
+	url_rewrite_concurrency 0
+	url_rewrite_children 10
+
 
 Using volta
+-----------
 
-How to
+See the INSTALL file for instructions on how to compile volta.
+
+Volta reads its rewrite rules from a local database.  You can create the
+rules in a text editor, then convert them to the database like so:
+
+	% volta -c rules.txt
+
+You'll be left with a "volta.db" file in the current directory.  Put it
+wherever you please, and use the -f flag to point to it.
+
+
+Rule file syntax
+----------------
+
+Volta's rule syntax is designed to be easy to parse by humans and
+machines.  Blank lines are skipped, as is any line that starts with the
+'#' character, so you can keep the ascii version of your rules well
+documented and in version control.
+
+When compiling the ruleset into the database format, volta detects
+malformed rules and stops if there are any problems, leaving your
+original database intact.  You can change the ruleset at any time while
+volta is running, and the new rules will take effect within about 10
+seconds.  No need to restart squid!
+
+There are two types of rules -- positive matches, and negative matches.
+Positive matches cause the rewrite, negative matches allow the original
+request to pass.  Rule order is consistent, top-down, first match wins.
+Fields are separated by any amount of whitespace (spaces or tabs).
+
+
+### Positive matches:
+
+    First field: the hostname to match.
+
+      You can use an exact hostname (www.example.com), or a parent domain
+      (example.com) if you want to match every host beneath it.  You can
+      also use a single '*' to match every request; this essentially
+      bypasses a lot of what makes volta quick, but it is included for
+      completeness.  You may have an unlimited number of rules per
+      hostname.  Hostnames are compared without case sensitivity.
+
+
+    Second field: the path to match.
+
+	  This can be an exact match ('/path/to/something.html'), a regular
+	  expression ('\.(jpg|gif|png)$'), or a single '*' to match for any
+	  path. Regular expressions are matched without case sensitivity.  There
+	  is currently no support for capturing, though this may be added in
+	  a future release.
+
+
+    Third field: The redirect code and url to rewrite to.
 
+      Any pieces of a url that are omitted are automatically replaced
+      with the original request's element -- the exception is a hostname,
+      which is required.  If you omit a redirect code, the URL rewrite is
+      transparent to the client.  You can attach a 301: or 302: prefix to
+      send a permanent or temporary redirect, respectively, instead.
+
+
+### Negative matches:
+
+    First field: the hostname to match.
+
+	  See above -- all the same rules apply.
+
+
+    Second field: the path to match.
+
+	  See above -- all the same rules apply.
+
+
+    Third field: the 'negative' marker.
+
+	  This is simply the '-' character, which signals to volta that this is
+	  a negative matching rule.
+
+
+You can easily test your rules by running volta on the command line, and
+pasting URLs into it.  Boost the debug level (-d4) if you're having any issues.
+
+
+Examples
+--------
+
+Rewrite all requests to Google to the SSL version:
+
+    google.com * 302:https://www.google.com
+
+	This will redirect the request "http://www.google.com/search?q=test" to
+	"https://www.google.com/search?q=test".
+
+
+Transparently alter all uploaded images on imgur to be my face:  :)
+
+	i.imgur.com \.(gif|png|jpg)$ http://www.martini.nu/images/mahlon.jpg
+
+
+Expand a local, unqualified hostname to a FQDN (useful alongside the
+'dns_defnames' squid setting to enforce browser proxy behaviors):
+
+	local-example * local-example.company.com
+
+
+Cause all blog content except for 2011 posts to permanently redirect to
+an archival page:
+
+	martini.nu /blog/2011 -
+	martini.nu /blog 301:martini.nu/content-archived.html
+
+
+Turn off rewriting for a specific network segment or IP address:
+
+	Squid has this ability built in -- see the 'url_rewrite_access' setting.
+
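
The rule semantics this README describes (comments and blank lines skipped, three whitespace-separated fields, top-down evaluation, first match wins, '-' as the negative marker) can be illustrated with a rough Python sketch. This is not volta's implementation, which uses a state machine and a constant database. The function names are hypothetical, and two details are assumptions the README does not spell out: that a domain rule matches the host itself plus anything ending in '.' + domain, and that a path field is tried as an exact string first and then as a case-insensitive regular expression searched anywhere in the path.

```python
import re
from urllib.parse import urlsplit


def parse_rules(text):
    """Return (host, path, target) tuples, skipping blank and '#' lines."""
    rules = []
    for lineno, line in enumerate(text.splitlines(), 1):
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        fields = stripped.split()  # any amount of spaces or tabs
        if len(fields) != 3:
            # Mirror the README: a malformed rule stops the whole compile.
            raise ValueError(f"line {lineno}: expected 3 fields, got {len(fields)}")
        rules.append(tuple(fields))
    return rules


def host_matches(rule_host, host):
    # Assumption: a domain rule covers the host itself and its subdomains.
    rule_host, host = rule_host.lower(), host.lower()
    return rule_host == '*' or host == rule_host or host.endswith('.' + rule_host)


def path_matches(rule_path, path):
    # Assumption: exact comparison first, then a case-insensitive regex search.
    if rule_path == '*' or rule_path == path:
        return True
    return re.search(rule_path, path, re.IGNORECASE) is not None


def first_match(rules, url):
    """Return the rewrite target of the first matching rule, or None when
    nothing matches or the first match is a negative ('-') rule."""
    parts = urlsplit(url)
    for rule_host, rule_path, target in rules:
        if host_matches(rule_host, parts.hostname or '') and path_matches(rule_path, parts.path):
            return None if target == '-' else target
    return None


RULES = parse_rules("""
# Keep 2011 posts, archive everything else under /blog.
martini.nu /blog/2011 -
martini.nu /blog 301:martini.nu/content-archived.html
""")
```

With the blog rules from the Examples section, first_match(RULES, "http://martini.nu/blog/2011/post") returns None because the negative rule comes first, while a request for /blog/2010/post falls through to the 301 rule. The sketch returns the third field untouched; filling in omitted URL pieces from the original request is left out.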