parser.rl
author Mahlon E. Smith <mahlon@laika.com>
Tue, 24 Jul 2012 12:12:07 -0700
changeset 29 c5d00a24af56
parent 22 822094314703
child 33 ba41bfbe87a2
permissions -rw-r--r--
Add support for Squid's 'url_rewriter_concurrency' pipelining.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
0
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
     1
/* vim: set noet nosta sw=4 ts=4 ft=ragel : */
2
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     2
/*
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     3
Copyright (c) 2011, Mahlon E. Smith <mahlon@martini.nu>
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     4
All rights reserved.
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     5
Redistribution and use in source and binary forms, with or without
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     6
modification, are permitted provided that the following conditions are met:
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     7
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     8
    * Redistributions of source code must retain the above copyright
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
     9
      notice, this list of conditions and the following disclaimer.
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    10
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    11
    * Redistributions in binary form must reproduce the above copyright
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    12
      notice, this list of conditions and the following disclaimer in the
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    13
      documentation and/or other materials provided with the distribution.
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    14
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    15
    * Neither the name of Mahlon E. Smith nor the names of his
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    16
      contributors may be used to endorse or promote products derived
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    17
      from this software without specific prior written permission.
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    18
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    19
THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND ANY
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    20
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    21
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    22
DISCLAIMED. IN NO EVENT SHALL THE REGENTS AND CONTRIBUTORS BE LIABLE FOR ANY
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    23
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    24
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    25
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    26
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    27
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    28
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    29
*/
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    30
0
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
    31
#include "volta.h"
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
    32
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    33
#define MARK_S( LBL ) p_parsed->tokens.LBL ## _start = p;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    34
#define MARK_E( LBL ) p_parsed->tokens.LBL ## _length = p - ( *pe + p_parsed->tokens.LBL ## _start );
1
823d42546cea Dial in the Makefile and command line option parsing. Better debug
Mahlon E. Smith <mahlon@laika.com>
parents: 0
diff changeset
    35
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
    36
#define COPY_STR( LBL ) copy_string_token( p_parsed->tokens.LBL ## _start, p_parsed->tokens.LBL ## _length )
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
    37
/* #define COPY_IP4( LBL ) copy_ipv4_token(   p_request->tokens.LBL ## _start, p_request->tokens.LBL ## _length ) */
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
    38
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    39
/* 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    40
 * Tokenize an incoming line from squid, returning a parsed and populated
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    41
 * structure to make redirection decisions against.  This pointer should
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    42
 * be freed using finish_parsed() after use.
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    43
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    44
 * Squid documentation about redirectors:
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    45
 * ---------------------------------------------------------------------------
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    46
 * TAG: url_rewrite_program
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    47
 * Specify the location of the executable for the URL rewriter.
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    48
 * Since they can perform almost any function there isn't one included.
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    49
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    50
 * For each requested URL rewriter will receive on line with the format
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    51
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    52
 * URL <SP> client_ip "/" fqdn <SP> user <SP> method [<SP> kvpairs]<NL>
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    53
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    54
 * In the future, the rewriter interface will be extended with
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    55
 * key=value pairs ("kvpairs" shown above).  Rewriter programs
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    56
 * should be prepared to receive and possibly ignore additional
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    57
 * whitespace-separated tokens on each input line.
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    58
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    59
 * And the rewriter may return a rewritten URL. The other components of
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    60
 * the request line does not need to be returned (ignored if they are).
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    61
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    62
 * The rewriter can also indicate that a client-side redirect should
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    63
 * be performed to the new URL. This is done by prefixing the returned
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    64
 * URL with "301:" (moved permanently) or 302: (moved temporarily).
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    65
 * 
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    66
 * By default, a URL rewriter is not used.
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    67
 * ---------------------------------------------------------------------------
1
823d42546cea Dial in the Makefile and command line option parsing. Better debug
Mahlon E. Smith <mahlon@laika.com>
parents: 0
diff changeset
    68
*/
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    69
parsed *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    70
parse_request( char *line )
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    71
{
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
    72
	/* machine required vars */
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    73
	unsigned short int cs = 1;
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    74
	char *p   = line;
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    75
	char *pe  = p + strlen(p);
13
23a242d7b7fa 1st iteration of volta actually doing something. Process the request,
Mahlon E. Smith <mahlon@laika.com>
parents: 12
diff changeset
    76
	char *eof = pe;
2
8c88756f81b0 Ensure that parsing can't be subverted by requests larger than the
Mahlon E. Smith <mahlon@martini.nu>
parents: 1
diff changeset
    77
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
    78
	/* the client request pointer */
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    79
	parsed *p_parsed = init_parsed();
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    80
	p_parsed->type   = REQUEST;
0
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
    81
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    82
%%{
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    83
	machine request_parser;
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    84
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
    85
	action chid_start    { MARK_S(chid) }
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
    86
	action chid_finish   { MARK_E(chid) }
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    87
	action scheme_start  { MARK_S(scheme) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    88
	action scheme_finish { MARK_E(scheme) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    89
	action host_start    { MARK_S(host) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    90
	action host_finish   { MARK_E(host) }
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
    91
	action port_start    { p_parsed->tokens.port_start = p+1; } # strip leading colon
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    92
	action port_finish   { MARK_E(port) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    93
	action path_start    { MARK_S(path) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    94
	action path_finish   { MARK_E(path) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    95
	action meth_start    { MARK_S(meth) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    96
	action meth_finish   { MARK_E(meth) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    97
	action c_ip_start    { MARK_S(c_ip) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    98
	action c_ip_finish   { MARK_E(c_ip) }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
    99
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   100
	action host_error   { debug( 3, LOC, "Unable to parse the request hostname.\n" ); }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   101
	action scheme_error { debug( 3, LOC, "Unable to parse the request scheme.\n" ); }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   102
	action meth_error   { debug( 3, LOC, "Unable to parse the request method.\n" ); }
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   103
	action c_ip_error   { debug( 3, LOC, "Unable to parse the client IP address.\n" ); }
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   104
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   105
	# 
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   106
	# Squid line: URL <SP> client_ip "/" fqdn <SP> user <SP> method [<SP> kvpairs]<NL>
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   107
	# 
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   108
	# URI Syntax (RFC 3986) misc notes:
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   109
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   110
	# - Scheme isn't passed to redirectors on CONNECT method requests
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   111
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   112
	# - Hostname segments aren't supposed to be individually greater than 63 chars,
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   113
	#   and the hostname in total shouldn't exceed 255.  They also shouldn't be entirely
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   114
	#   made up of digits, or contain underscores.  In practice, these rules appear to
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   115
	#   be violated constantly by some pretty big sites.
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   116
	#   (( alnum ) | ( alnum . [a-zA-Z0-9\-]{0,63} . alnum )) & !( digit+ );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   117
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   118
	# - ipv6 has some utterly insane rules (RFC 5952) in the name of "shortcuts", which
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   119
	#   only seem like shortcuts to someone writing IP addresses by hand.  Anyone that
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   120
	#   has to parse (or even just read) them has a bunch of seemingly arbitrary work
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   121
	#   dumped in their lap.  Heck, it's impossible to even search for an ipv6 address
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   122
	#   that contains zeros in a text editor, because you have no idea what how it might
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   123
	#   be represented.  Rad!
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   124
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   125
	#   The parser just trusts any ipv6 squid hands us as being valid, without
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   126
	#   any real parsing/validation, other than it consists of hex digits and colons.
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   127
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   128
	# - This parser originally validated path/query/fragment as well, but there were
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   129
	#   enough inconsistencies with unescaped chars and other real-life RFC deviations
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   130
	#   that I opted to just accept what we get from squid.
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   131
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   132
	# - Redirectors aren't handed any userinfo (http://mahlon:password@example.com),
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   133
	#   so no need to check for that.
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   134
	#
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   135
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   136
	host_component = alnum | ( alnum [a-zA-Z0-9\-_]* alnum );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   137
	pchar          = ( alnum | [\-._~!$%&'()*+,;=] );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   138
	path_segment   = '/' ( any - space )*;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   139
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   140
	hostname       = host_component ( '.' host_component )* '.'?;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   141
	ipv4           = digit{1,3} '.' digit{1,3} '.' digit{1,3} '.' digit{1,3};
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   142
	ipv6           = ( xdigit | ':' )+;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   143
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   144
	channel_id     = ( digit+ space )      >chid_start   %chid_finish;
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   145
	scheme         = ( alpha{3,5} '://' )  >scheme_start %scheme_finish @!scheme_error;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   146
	host           = ( hostname | ipv4 )   >host_start   %host_finish   @!host_error;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   147
	port           = ( ':' digit{1,5} )    >port_start   %port_finish;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   148
	path           = path_segment*         >path_start   %path_finish;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   149
	client_ip      = ipv4                  >c_ip_start   %c_ip_finish   @!c_ip_error;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   150
	method         = upper+                >meth_start   %meth_finish   @!meth_error;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   151
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   152
	URL = (
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   153
		start:   channel_id?                            -> Url,
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   154
		Url:     scheme? host port? path?               -> final
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   155
 	);
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   156
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   157
	Squidvars = (
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   158
		start:
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   159
		Client:  space client_ip '/' ( hostname | '-' ) -> User,
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   160
		User:    space pchar+                           -> Method,
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   161
		Method:  space method                           -> KVPairs,
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   162
		KVPairs: ( space any+ )?                        -> final
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   163
 	);
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   164
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   165
	main := URL Squidvars? '\n';
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   166
}%%
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   167
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   168
	/* state machine */
0
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
   169
	%% write exec;
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
   170
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   171
	/*
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   172
	 * If we were given an invalid line, bail early after remembering
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   173
	 * the channel ID.
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   174
	 *
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   175
	 */
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   176
	if ( cs < %%{ write first_final; }%% ) {
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   177
		debug( 3, LOC, "Invalid request line (%d), skipped\n", v.timer.lines + 1 );
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   178
		debug( 4, LOC, "%s", line );
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   179
		p_parsed->chid = COPY_STR( chid );
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   180
		return( p_parsed );
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   181
	}
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   182
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   183
	debug( 6, LOC, "%s", line );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   184
	(void)populate_parsed( p_parsed );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   185
	return( p_parsed );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   186
}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   187
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   188
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   189
/*
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   190
 * Tokenize a value string from a successful database lookup, returning a parsed
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   191
 * and populated structure. This pointer should be freed using finish_parsed() after use.
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   192
 *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   193
 */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   194
parsed *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   195
parse_rule( char *rewrite )
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   196
{
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   197
	/* machine required vars */
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   198
	unsigned short int cs = 1;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   199
	char *p   = rewrite;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   200
	char *pe  = p + strlen(p);
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   201
	char *eof = pe;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   202
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   203
	/* the client rule pointer */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   204
	parsed *p_parsed = init_parsed();
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   205
	p_parsed->type   = RULE;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   206
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   207
%%{
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   208
	machine rule_parser;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   209
15
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   210
	action match_start   { MARK_S(path_re) }
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   211
	action match_finish  { MARK_E(path_re) }
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   212
	action redir_start   { MARK_S(redir) }
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   213
	action redir_finish  { p_parsed->tokens.redir_length = 3; } # strip trailing colon
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   214
	action negate_finish { p_parsed->negate = 1; }
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   215
	action luapath_start { p_parsed->lua = 1; MARK_S(luapath) }
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   216
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   217
	action scheme_start   { MARK_S(scheme) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   218
	action scheme_finish  { MARK_E(scheme) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   219
	action host_start     { MARK_S(host) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   220
	action host_finish    { MARK_E(host) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   221
	action port_start     { p_parsed->tokens.port_start = p+1; } # strip leading colon
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   222
	action port_finish    { MARK_E(port) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   223
	action path_start     { MARK_S(path) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   224
	action path_finish    { MARK_E(path) }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   225
	action luapath_finish { MARK_E(luapath) }
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   226
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   227
	action match_error   { debug( 3, LOC, "Unable to parse the rule path matcher.\n" ); }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   228
	action host_error    { debug( 3, LOC, "Unable to parse the rule hostname.\n" ); }
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   229
	action luapath_error { debug( 3, LOC, "Unable to parse the lua path.\n" ); }
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   230
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   231
	host_component  = alnum | ( alnum [a-zA-Z0-9\-_]* alnum );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   232
	path_segment    = '/' ( any - space )*;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   233
15
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   234
	sep       = space+;
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   235
	hostname  = host_component ( '.' host_component )* '.'?;
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   236
	ipv4      = digit{1,3} '.' digit{1,3} '.' digit{1,3} '.' digit{1,3};
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   237
	ipv6      = ( xdigit | ':' )+;
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   238
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   239
	negate    = ( '-' )                 %negate_finish;
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   240
	path_re   = ( any - space )+        >match_start    %match_finish   @!match_error;
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   241
	luapath   = ( any - space )+        >luapath_start  %luapath_finish @!luapath_error;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   242
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   243
	redir     = ( ('301' | '302') ':' ) >redir_start  %redir_finish;
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   244
	scheme    = ( alpha{3,5} '://' )    >scheme_start %scheme_finish;
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   245
	host      = ( hostname | ipv4 )     >host_start   %host_finish   @!host_error;
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   246
	port      = ( ':' digit{1,5} )      >port_start   %port_finish;
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   247
	path      = path_segment*           >path_start   %path_finish;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   248
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   249
	rewrite   = ( redir? scheme? host port? path? );
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   250
	luarule   = ( 'lua:' luapath );
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   251
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   252
	main := path_re sep ( rewrite | negate | luarule );
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   253
}%%
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   254
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   255
	/* state machine */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   256
	%% write exec;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   257
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   258
	/* If we were given an invalid rule, bail early */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   259
	if ( cs < %%{ write first_final; }%% ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   260
		free( p_parsed ), p_parsed = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   261
		debug( 3, LOC, "Invalid rule\n" );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   262
		debug( 4, LOC, "%s\n", rewrite );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   263
		return( NULL );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   264
	}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   265
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   266
	(void)populate_parsed( p_parsed );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   267
	return( p_parsed );
0
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
   268
}
eac7211fe522 Initial commit of an experimental little Squid redirector.
Mahlon E. Smith <mahlon@martini.nu>
parents:
diff changeset
   269
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   270
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   271
/*
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   272
 * Tokenize a line from an ascii representation of the database, returning
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   273
 * a pointer to a parsed struct.  Used for creation of a new cdb file,
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   274
 * validating data prior to use.
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   275
 *
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   276
 */
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   277
struct db_input *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   278
parse_dbinput( char *line )
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   279
{
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   280
	/* machine required vars */
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   281
	unsigned short int cs = 1;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   282
	char *p   = line;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   283
	char *pe  = p + strlen(p);
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   284
	char *eof = pe;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   285
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   286
	/* the db line input pointer */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   287
	struct db_input *dbline = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   288
	if ( (dbline = malloc( sizeof(struct db_input) )) == NULL ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   289
		debug( 5, LOC, "Unable to allocate memory for db input: %s\n", strerror(errno) );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   290
		return( NULL );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   291
	}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   292
	dbline->klen   = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   293
	dbline->vlen   = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   294
	dbline->kstart = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   295
	dbline->key    = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   296
	dbline->vstart = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   297
	dbline->val    = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   298
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   299
%%{
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   300
	machine dbinput_parser;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   301
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   302
	action key_start  { dbline->kstart = p; }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   303
	action key_finish { dbline->klen = p - ( *pe + dbline->kstart ); }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   304
	action key_error  { debug( 0, LOC, "Invalid key format\n" ); }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   305
	action val_start  { dbline->vstart = p; }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   306
	action val_finish { dbline->vlen = p - ( *pe + dbline->vstart ); }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   307
	action val_error  { debug( 0, LOC, "Invalid rewrite value\n" ); }
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   308
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   309
	sep            = space+;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   310
	host_component = alnum | ( alnum [a-zA-Z0-9\-_]* alnum );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   311
	hostname       = host_component ( '.' host_component )* '.'?;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   312
	ipv4           = digit{1,3} '.' digit{1,3} '.' digit{1,3} '.' digit{1,3};
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   313
	token          = ( any - space )+;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   314
	host           = ( hostname | ipv4 );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   315
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   316
	key = ( host | '*' )      >key_start %key_finish @!key_error;
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   317
	val = ( token sep token ) >val_start %val_finish @!val_error;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   318
	
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   319
	main:= key sep val '\n';
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   320
}%%
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   321
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   322
	/* state machine */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   323
	%% write exec;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   324
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   325
	/* If the input line was invalid, bail early */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   326
	if ( cs < %%{ write first_final; }%% ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   327
		free( dbline ), dbline = NULL;
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   328
		return( NULL );
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   329
	}
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   330
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   331
	/* populate struct */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   332
	dbline->key = copy_string_token( dbline->kstart, dbline->klen );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   333
	dbline->val = copy_string_token( dbline->vstart, dbline->vlen );
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   334
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   335
	/* check the val to make sure it is a valid rewrite rule */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   336
	parsed *valstr = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   337
	valstr = parse_rule( dbline->val );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   338
	if ( valstr == NULL ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   339
		free( dbline->key );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   340
		free( dbline->val );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   341
		free( dbline );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   342
		return( NULL );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   343
	}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   344
	finish_parsed( valstr );
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   345
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   346
	return( dbline );
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   347
}
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   348
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   349
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   350
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   351
/*
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   352
 * Initialize and return a pointer to a new parser object.
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   353
 *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   354
 */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   355
parsed *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   356
init_parsed( void )
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   357
{
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   358
	parsed *p_parsed = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   359
	if ( (p_parsed = malloc( sizeof(parsed) )) == NULL ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   360
		debug( 5, LOC, "Unable to allocate memory for parsed struct: %s\n", strerror(errno) );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   361
		return( NULL );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   362
	}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   363
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   364
	p_parsed->valid     = 0;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   365
	p_parsed->type      = 0;
18
d4ce82194b64 - 1st pass at documentation with the README.
Mahlon E. Smith <mahlon@laika.com>
parents: 15
diff changeset
   366
	p_parsed->negate    = 0;
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   367
	p_parsed->lua       = 0;
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   368
	p_parsed->chid      = NULL;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   369
	p_parsed->path_re   = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   370
	p_parsed->redir     = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   371
	p_parsed->scheme    = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   372
	p_parsed->host      = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   373
	p_parsed->tld       = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   374
	p_parsed->port      = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   375
	p_parsed->path      = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   376
	p_parsed->user      = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   377
	p_parsed->method    = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   378
	p_parsed->client_ip = NULL;
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   379
	p_parsed->luapath   = NULL;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   380
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   381
	p_parsed->tokens.chid_start     = NULL;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   382
	p_parsed->tokens.path_re_start  = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   383
	p_parsed->tokens.redir_start    = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   384
	p_parsed->tokens.scheme_start   = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   385
	p_parsed->tokens.host_start     = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   386
	p_parsed->tokens.port_start     = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   387
	p_parsed->tokens.path_start     = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   388
	p_parsed->tokens.meth_start     = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   389
	p_parsed->tokens.c_ip_start     = NULL;
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   390
	p_parsed->tokens.luapath_start  = NULL;
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   391
	p_parsed->tokens.chid_length    = 0;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   392
	p_parsed->tokens.path_re_length = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   393
	p_parsed->tokens.redir_length   = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   394
	p_parsed->tokens.scheme_length  = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   395
	p_parsed->tokens.host_length    = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   396
	p_parsed->tokens.port_length    = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   397
	p_parsed->tokens.path_length    = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   398
	p_parsed->tokens.meth_length    = 0;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   399
	p_parsed->tokens.c_ip_length    = 0;
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   400
	p_parsed->tokens.luapath_length = 0;
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   401
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   402
	return p_parsed;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   403
}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   404
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   405
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   406
/*
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   407
 * Release memory used by the parsed struct.
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   408
 *
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   409
 */
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   410
void
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   411
finish_parsed( parsed *p_parsed )
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   412
{
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   413
	if ( p_parsed == NULL ) return;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   414
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   415
	free( p_parsed->scheme );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   416
	free( p_parsed->host );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   417
	free( p_parsed->path );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   418
	free( p_parsed->port );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   419
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   420
	if ( p_parsed->type == REQUEST ) {
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   421
		free( p_parsed->chid );
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   422
		free( p_parsed->tld );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   423
		free( p_parsed->method );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   424
		free( p_parsed->client_ip );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   425
	}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   426
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   427
	if ( p_parsed->type == RULE ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   428
		free( p_parsed->path_re );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   429
		free( p_parsed->redir );
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   430
		free( p_parsed->luapath );
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   431
	}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   432
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   433
	free( p_parsed ), p_parsed = NULL;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   434
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   435
	return;
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   436
}
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   437
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   438
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   439
/*
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   440
 * Take the previously parsed token locations and copy them into the request struct.
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   441
 *
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   442
 */
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   443
void
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   444
populate_parsed( parsed *p_parsed )
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   445
{
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   446
	p_parsed->scheme = COPY_STR( scheme );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   447
	p_parsed->host   = COPY_STR( host );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   448
	p_parsed->path   = COPY_STR( path );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   449
	p_parsed->port   = COPY_STR( port );
13
23a242d7b7fa 1st iteration of volta actually doing something. Process the request,
Mahlon E. Smith <mahlon@laika.com>
parents: 12
diff changeset
   450
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   451
	if ( p_parsed->type == REQUEST ) {
29
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   452
		p_parsed->valid     = 1;
c5d00a24af56 Add support for Squid's 'url_rewriter_concurrency' pipelining.
Mahlon E. Smith <mahlon@laika.com>
parents: 22
diff changeset
   453
		p_parsed->chid      = COPY_STR( chid );
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   454
		p_parsed->method    = COPY_STR( meth );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   455
		p_parsed->client_ip = COPY_STR( c_ip );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   456
		/* p_request->client_ip = COPY_IP4( c_ip ); */
13
23a242d7b7fa 1st iteration of volta actually doing something. Process the request,
Mahlon E. Smith <mahlon@laika.com>
parents: 12
diff changeset
   457
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   458
		(void)lowercase_str( p_parsed->host, p_parsed->tokens.host_length );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   459
		(void)parse_tld( p_parsed );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   460
	}
13
23a242d7b7fa 1st iteration of volta actually doing something. Process the request,
Mahlon E. Smith <mahlon@laika.com>
parents: 12
diff changeset
   461
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   462
	if ( p_parsed->type == RULE ) {
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   463
		p_parsed->path_re = COPY_STR( path_re );
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   464
		p_parsed->redir   = COPY_STR( redir );
22
822094314703 Add the ability to optionally script rewrite logic using Lua.
Mahlon E. Smith <mahlon@martini.nu>
parents: 18
diff changeset
   465
		p_parsed->luapath = COPY_STR( luapath );
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   466
	}
10
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   467
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   468
	return;
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   469
}
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   470
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   471
d07309450285 Get the ragel line parser properly tokenizing the input lines. Add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 2
diff changeset
   472
/*
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   473
 * Pull the top level domain out of the requested hostname.
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   474
 *
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   475
 */
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   476
void
14
51eb85ae4de4 There isn't a fast way to look up ( value exists or null ) for every
Mahlon E. Smith <mahlon@martini.nu>
parents: 13
diff changeset
   477
parse_tld( parsed *p_request )
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   478
{
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   479
	unsigned short int cs = 5, mark = 0;
15
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   480
	char *p  = p_request->host;
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   481
	char *pe = p + p_request->tokens.host_length;
2706fc514dea Add whitelisting rules, to negate other matches if they come first in
Mahlon E. Smith <mahlon@martini.nu>
parents: 14
diff changeset
   482
	char *ts = 0, *te = 0, *eof = pe;
12
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   483
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   484
%%{
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   485
    machine tld_parser;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   486
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   487
	host_component = alnum | ( alnum [a-zA-Z0-9\-_]* alnum );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   488
	tld = ( host_component '.' host_component );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   489
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   490
	main := |*
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   491
		tld => { mark = ( p_request->tokens.host_length - (int)strlen(te) ); };
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   492
	*|;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   493
}%%
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   494
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   495
	/* It's far easier (and faster) to scan from left to right rather than
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   496
	   backtrack, so start by reversing the requested host string. */
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   497
	reverse_str( p_request->host );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   498
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   499
	/* scanner */
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   500
	%% write exec;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   501
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   502
	/* If there was a mark, then allocate memory and copy. */
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   503
	if ( mark != 0 ) {
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   504
		if ( (p_request->tld = calloc( mark + 1, sizeof(char) )) == NULL ) {
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   505
			debug( 5, LOC, "Unable to allocate memory for tld token: %s\n", strerror(errno) );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   506
			reverse_str( p_request->host );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   507
			return;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   508
		}
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   509
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   510
		memcpy( p_request->tld, p_request->host, mark );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   511
		reverse_str( p_request->tld );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   512
	}
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   513
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   514
	/* restore the hostname. */
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   515
	reverse_str( p_request->host );
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   516
	return;
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   517
}
191b3c25974a Clean up redundant parser actions via preprocessor macros, add a
Mahlon E. Smith <mahlon@martini.nu>
parents: 10
diff changeset
   518