* Rename 'markers' to 'token'

* Fix up the Rakefile's gem generation
* Add LICENSE
* Add a real README
mahlon 2008-11-09 00:27:36 +00:00
parent f4051c5a35
commit 194fadda98
5 changed files with 183 additions and 138 deletions

chunker/LICENSE Normal file

@ -0,0 +1,29 @@
Copyright (c) 2008, Mahlon E. Smith
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are
permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
* Neither the name of the author, nor the names of contributors may be used to
endorse or promote products derived from this software without specific prior
written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@ -1,7 +1,59 @@
Preface:
Ruby provides an automatic constant called DATA, which is an IO object
that references all text in the current file under an __END__ token.
I find it convenient to use the __END__ area to store all sorts of
stuff, rather than have to worry about distributing separate files.
The problem:
The DATA constant is determined from whatever ruby believes $0 to be.
It doesn't work inside of other required libraries, so you'll see stuff
like this all the time:
END = File.open( __FILE__ ).read.split( /^__END__/, 2 ).last
It works, but it's more work than I want to do.
A workaround:
Chunker solves this by parsing __END__ tokens for you, and making it
available in the form of a 'DATA_END' constant. It installs this
constant into the class that includes Chunker, so you can use it again
and again, assuming you use a different file for each class.
It also automatically parses out other things that look like tokens, so
you can easily have multiple, distinct documents all embedded into the
__END__ block.
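The token splitting described above can be sketched roughly as follows. This is a minimal illustration, not Chunker's actual implementation: `split_chunks` is a hypothetical helper, and the two regular expressions are modelled on the END_TOKEN and CHUNK_TOKEN patterns in lib/chunker.rb. It returns a plain Hash of label to text instead of installing constants.

```ruby
require 'strscan'

# Patterns modelled on those in lib/chunker.rb.
END_TOKEN   = /^__END__\r?\n/
CHUNK_TOKEN = /^__([A-Z_0-9]+)__\r?\n/

# Hypothetical helper (not part of Chunker): split everything after
# __END__ into a Hash keyed by token label. Text before the first
# sub-token is stored under 'END'.
def split_chunks( source )
	parts = source.split( END_TOKEN, 2 )
	return {} if parts.length < 2    # no __END__ token in the source

	scanner = StringScanner.new( parts.last )
	chunks  = {}
	label   = 'END'

	until scanner.eos?
		if ( data = scanner.scan_until(CHUNK_TOKEN) )
			# Strip the matched token line off the end of the chunk text.
			chunks[ label ] = data[ 0, data.length - scanner[0].length ]
			label = scanner[1]
		else
			# No further tokens: the remainder belongs to the current label.
			chunks[ label ] = scanner.rest
			scanner.terminate
		end
	end

	chunks[ label ] ||= ''           # a trailing token with no content
	return chunks
end
```

Given the data section from the example in this README, the sketch yields the keys 'END', 'WOW', and 'WICKED', each mapped to the text that follows its token.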
Usage:
There is no direct interface to Chunker. Just include it from a
class to have that file's __END__ data blocks magically become DATA_*
IO constants within that class.
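To make the "DATA_* IO constants" idea concrete, here is a small sketch of how chunks could be installed into a class. `TinyChunker` and its `install` method are hypothetical names used only for illustration. Unlike Chunker, which locates and reads the including file by itself, this sketch is handed the chunk text explicitly.

```ruby
require 'stringio'

# Hypothetical stand-in, not Chunker's real code.
module TinyChunker
	# Install one DATA_<LABEL> constant per chunk into +klass+.
	# Each constant is a StringIO, so it can be read like the DATA constant.
	def self.install( klass, chunks )
		chunks.each do |label, text|
			klass.const_set( "DATA_#{label}", StringIO.new(text) )
		end
		return klass
	end
end

class Foom; end

TinyChunker.install( Foom,
	'END'    => "Stuff in the END block!\n",
	'WICKED' => "Yep.\n" )

puts Foom::DATA_WICKED.read
```

Because the constants are IO-like objects, a second read returns an empty string unless the constant is rewound first.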
Example:
This produces the string "Yep.\n".
require 'chunker'
class Foom
include Chunker
end
puts Foom.new.class.const_get( :DATA_WICKED ).read
__END__
Stuff in the END block!
__WOW__
Ultimate success!
__WICKED__
Yep.


@ -7,19 +7,30 @@ require 'rubygems'
require 'pathname' require 'pathname'
require 'rake' require 'rake'
require 'rake/packagetask'
require 'rake/gempackagetask' require 'rake/gempackagetask'
require 'spec/rake/spectask' require 'spec/rake/spectask'
require 'rubygems/installer'
require 'rubygems/uninstaller'
###################################################################### ######################################################################
### P A T H S ### P A T H S A N D F I L E S
###################################################################### ######################################################################
BASEDIR = Pathname.new( __FILE__ ).expand_path.dirname.relative_path_from( Pathname.getwd ) BASEDIR = Pathname.new( __FILE__ ).expand_path.dirname.relative_path_from( Pathname.getwd )
TEXT_FILES = %w{ Rakefile README LICENSE }.collect {|f| BASEDIR + f }
SPECDIR = BASEDIR + 'spec' SPECDIR = BASEDIR + 'spec'
LIBDIR = BASEDIR + 'lib'
SPEC_FILES = Pathname.glob( SPECDIR + '**/*_spec.rb' ).reject {|f| f =~ /^\.svn/ } SPEC_FILES = Pathname.glob( SPECDIR + '**/*_spec.rb' ).reject {|f| f =~ /^\.svn/ }
LIBDIR = BASEDIR + 'lib'
LIB_FILES = Pathname.glob( LIBDIR + '**/*.rb').reject {|i| i =~ /\.svn/ }
RELEASE_FILES = TEXT_FILES + LIB_FILES + SPEC_FILES
###################################################################### ######################################################################
### H E L P E R S ### H E L P E R S
###################################################################### ######################################################################
@ -37,6 +48,7 @@ def find_pattern( file, pattern )
return ver.is_a?( String ) ? ver : 'UNKNOWN' return ver.is_a?( String ) ? ver : 'UNKNOWN'
end end
###################################################################### ######################################################################
### P A C K A G E C O N S T A N T S ### P A C K A G E C O N S T A N T S
###################################################################### ######################################################################
@ -44,31 +56,21 @@ end
PKG_NAME = 'chunker' PKG_NAME = 'chunker'
PKG_VERSION = find_pattern( LIBDIR + 'chunker.rb', /VERSION = ['"](\d\.\d(?:\/\d)?)['"]/ ) PKG_VERSION = find_pattern( LIBDIR + 'chunker.rb', /VERSION = ['"](\d\.\d(?:\/\d)?)['"]/ )
PKG_REVISION = find_pattern( LIBDIR + 'chunker.rb', /SVNRev = .+Rev: (\d+)/ ) PKG_REVISION = find_pattern( LIBDIR + 'chunker.rb', /SVNRev = .+Rev: (\d+)/ )
PKG_VERSION = begin PKG_FILE_NAME = "#{PKG_NAME}-#{PKG_VERSION}.#{PKG_REVISION}"
ver = nil
File.open( LIBDIR + 'chunker.rb' ) do |f|
ver = f.each do |line|
break $1 if line =~ /VERSION = ['"](\d\.\d(?:\/\d)?)['"]/
end
end
ver.is_a?( String ) ? ver : 'UNKNOWN'
end
RELEASE_NAME = "REL #{PKG_VERSION}"
PKG_FILE_NAME = "#{PKG_NAME}-#{PKG_VERSION}"
###################################################################### ######################################################################
### T A S K S ### T A S K S
###################################################################### ######################################################################
task :default => [:test] task :default => [ :test, :package ]
### Task: run rspec tests ### Task: run rspec tests
### ###
desc "Run tests" desc "Run tests"
Spec::Rake::SpecTask.new('test') do |task| Spec::Rake::SpecTask.new('test') do |task|
task.spec_files = FileList['spec/**/*.rb'] task.spec_files = SPEC_FILES
task.spec_opts = %w{ -c -fs } task.spec_opts = %w{ -c -fs }
end end
@ -85,118 +87,70 @@ end
### Task: Create gem from source ### Task: Create gem from source
### ###
gem = Gem::Specification.new do |gem| gem = Gem::Specification.new do |gem|
end pkg_build = PKG_REVISION || 0
Rake::GemPackageTask.new( gem ) do |pkg|
pkg.need_zip = true
pkg.need_tar = true
end
__END__
spec = Gem::Specification.new do |s|
s.platform = Gem::Platform::RUBY
s.summary = "Ruby based make-like utility."
s.name = 'rake'
s.version = PKG_VERSION
s.requirements << 'none'
s.require_path = 'lib'
s.autorequire = 'rake'
s.files = PKG_FILES
s.description = <<EOF
Rake is a Make-like program implemented in Ruby. Tasks
and dependencies are specified in standard Ruby syntax.
EOF
end
Rake::GemPackageTask.new(spec) do |pkg|
pkg.need_zip = true
pkg.need_tar = true
end
require 'rake/packagetask'
require 'rake/gempackagetask'
### Task: gem
gemspec = Gem::Specification.new do |gem|
pkg_build = get_svn_rev( BASEDIR ) || 0
gem.summary = "A convenience library for parsing __END__ tokens consistently."
gem.name = PKG_NAME gem.name = PKG_NAME
gem.version = "%s.%s" % [ PKG_VERSION, pkg_build ] gem.version = "%s.%s" % [ PKG_VERSION, pkg_build ]
gem.author = 'Mahlon E. Smith'
gem.summary = "ThingFish - A highly-accessable network datastore" gem.email = 'mahlon@martini.nu'
gem.description = "ThingFish is a network-accessable, searchable, extensible " + gem.homepage = 'http://projects.martini.nu/ruby-modules/wiki/chunker'
"datastore. It can be used to store chunks of data on the " + gem.rubyforge_project = 'mahlon'
"network in an application-independent way, associate the chunks " +
"with other chunks through metadata, and then search for the chunk " +
"you need later and fetch it again, all through a REST API over HTTP."
gem.authors = "Michael Granger and Mahlon E. Smith"
gem.email = "mgranger@laika.com, mahlon@laika.com"
gem.homepage = "http://opensource.laika.com/wiki/ThingFish"
gem.rubyforge_project = 'laika'
gem.has_rdoc = true gem.has_rdoc = true
gem.files = RELEASE_FILES. gem.files = RELEASE_FILES.
collect {|f| f.relative_path_from(BASEDIR).to_s } collect {|f| f.relative_path_from(BASEDIR).to_s }
gem.test_files = SPEC_FILES. gem.test_files = SPEC_FILES.
collect {|f| f.relative_path_from(BASEDIR).to_s } collect {|f| f.relative_path_from(BASEDIR).to_s }
gem.executables = BIN_FILES .
collect {|f| f.relative_path_from(BINDIR).to_s }
gem.add_dependency( 'uuidtools', '>= 1.0.0' ) gem.description = <<-EOF
gem.add_dependency( 'pluginfactory', '>= 1.0.3' ) Ruby provides an automatic constant called DATA, which is an IO object
end that references all text in the current file under an __END__ token.
Rake::GemPackageTask.new( gemspec ) do |task|
task.gem_spec = gemspec I find it convenient to use the __END__ area to store all sorts of
task.need_tar = false stuff, rather than have to worry about distributing separate files.
task.need_tar_gz = true
task.need_tar_bz2 = true The DATA constant is determined from whatever ruby believes $0 to be.
task.need_zip = true It doesn't work inside of other required libraries, so you'll see stuff
like this all the time:
END = File.open( __FILE__ ).read.split( /^__END__/, 2 ).last
It works, but it's more work than I want to do.
Chunker solves this by parsing __END__ tokens for you, and making it
available in the form of a 'DATA_END' constant. It installs this
constant into the class that includes Chunker, so you can use it again
and again, assuming you use a different file for each class.
It also automatically parses out other things that look like tokens, so
you can easily have multiple, distinct documents all embedded into the
__END__ block.
EOF
end end
Rake::GemPackageTask.new( gem ) do |pkg|
desc "Build the ThingFish gem and gems for all the standard plugins" pkg.need_zip = true
task :gems => [:gem] do pkg.need_tar = true
log "Building gems for plugins in: %s" % [PLUGINS.join(', ')] pkg.need_tar_bz2 = true
PLUGINS.each do |plugindir|
log plugindir.basename
cp BASEDIR + 'LICENSE', plugindir
Dir.chdir( plugindir ) do
system 'rake', 'gem'
end
fail unless $?.success?
pkgdir = plugindir + 'pkg'
gems = Pathname.glob( pkgdir + '*.gem' )
cp gems, PKGDIR
end
end end
### Task: install ### Task: install
###
task :install_gem => [ :package ] do task :install_gem => [ :package ] do
$stderr.puts $stderr.puts
installer = Gem::Installer.new( %{pkg/#{PKG_FILE_NAME}.gem} ) installer = Gem::Installer.new( "pkg/#{PKG_FILE_NAME}.gem" )
installer.install installer.install
end end
task :install => [ :install_gem ]
### Task: uninstall ### Task: uninstall
task :uninstall_gem => [:clean] do ###
uninstaller = Gem::Uninstaller.new( PKG_FILE_NAME ) task :uninstall_gem do
uninstaller = Gem::Uninstaller.new( PKG_NAME )
uninstaller.uninstall uninstaller.uninstall
end end
task :uninstall => [ :uninstall_gem ]


@ -1,9 +1,17 @@
#!/usr/bin/ruby
# #
# Chunker! # Chunker: A convenience library for parsing __END__ tokens consistently.
# #
# Mahlon E. Smith <mahlon@martini.nu> # == Version
#
# $Id$
#
# == Author
#
# * Mahlon E. Smith <mahlon@martini.nu>
#
# :include: LICENSE
# #
### Namespace for the datablock parser. ### Namespace for the datablock parser.
### ###
@ -26,18 +34,18 @@ module Chunker
### Parser class for __END__ data blocks. ### Parser class for __END__ data blocks.
### Find each __MARKER__ within the __END__, and put each into a ### Find each __TOKEN__ within the __END__, and put each into a
### DATA_MARKER constant within the namespace that included us. ### DATA_TOKEN constant within the namespace that included us.
### ###
class DataParser class DataParser
# The mark for a DATA block. # The mark for a DATA block.
# #
END_MARKER = /^__END__\r?\n/ END_TOKEN = /^__END__\r?\n/
# The mark for a 'sub' block. # The mark for a 'sub' block.
# #
CHUNK_MARKER = /^__([A-Z\_0-9]+)__\r?\n/ CHUNK_TOKEN = /^__([A-Z\_0-9]+)__\r?\n/
### Constructor: Given a +klass+ and an +io+ to the class file, ### Constructor: Given a +klass+ and an +io+ to the class file,
@ -45,13 +53,13 @@ module Chunker
### ###
def initialize( klass, io ) def initialize( klass, io )
io.open if io.closed? io.open if io.closed?
end_string = io.read.split( END_MARKER, 2 ).last end_string = io.read.split( END_TOKEN, 2 ).last
@klass = klass @klass = klass
@scanner = StringScanner.new( end_string ) @scanner = StringScanner.new( end_string )
io.close io.close
if @scanner.check_until( CHUNK_MARKER ) if @scanner.check_until( CHUNK_TOKEN )
# put each chunk into its own constant # put each chunk into its own constant
self.extract_blocks self.extract_blocks
else else
@ -71,10 +79,10 @@ module Chunker
def extract_blocks def extract_blocks
label = nil label = nil
while @scanner.scan_until( CHUNK_MARKER ) and ! @scanner.eos? while @scanner.scan_until( CHUNK_TOKEN ) and ! @scanner.eos?
data = '' data = ''
# First pass, __END__ contents (until next marker, instead # First pass, __END__ contents (until next token, instead
# of entire data block.) # of entire data block.)
# #
if label.nil? if label.nil?
@ -85,8 +93,8 @@ module Chunker
else else
label = @scanner[1] label = @scanner[1]
if data = @scanner.scan_until( CHUNK_MARKER ) if data = @scanner.scan_until( CHUNK_TOKEN )
# Pull the next marker text out of the data, set up the next pass # Pull the next token text out of the data, set up the next pass
# #
data = data[ 0, data.length - @scanner[0].length ] data = data[ 0, data.length - @scanner[0].length ]
@scanner.pos = self.next_position @scanner.pos = self.next_position
@ -112,13 +120,15 @@ module Chunker
end end
### Included hook: Find the file path for how we arrived here, and open ### Hook included: Find the file path for how we arrived here, and open
### it as an IO object. __FILE__ won't work, so we find it via caller(). ### it as an IO object. Parse the IO for data block tokens.
### Start parsing this file for data blocks.
### ###
def self.included( klass ) def self.included( klass )
# klass.instance_eval{ __FILE__ } awww, nope. # klass.instance_eval{ __FILE__ } awww, nope.
# __FILE__ won't work here, so we find the filename via caller().
#
io = File.open( caller(1).last.sub(/:.*?$/, ''), 'r' ) io = File.open( caller(1).last.sub(/:.*?$/, ''), 'r' )
DataParser.new( klass, io ) DataParser.new( klass, io )
end end
end end


@ -67,14 +67,14 @@ EO_FILE_TEXT
describe Chunker::DataParser do describe Chunker::DataParser do
it "doesn't include content above the __END__ marker" do it "doesn't include content above the __END__ token" do
klass = Class.new klass = Class.new
dp = Chunker::DataParser.new( klass, StringIO.new( FILE_TEXT_MULTIPLE )) dp = Chunker::DataParser.new( klass, StringIO.new( FILE_TEXT_MULTIPLE ))
dp.instance_variable_get( :@scanner ).string. dp.instance_variable_get( :@scanner ).string.
should_not =~ /This is stuff we shouldn't see/ should_not =~ /This is stuff we shouldn't see/
end end
it "doesn't contain the __END__ marker itself" do it "doesn't contain the __END__ token itself" do
klass = Class.new klass = Class.new
dp = Chunker::DataParser.new( klass, StringIO.new( FILE_TEXT )) dp = Chunker::DataParser.new( klass, StringIO.new( FILE_TEXT ))
dp.instance_variable_get( :@scanner ).string.should_not =~ /^__END__/ dp.instance_variable_get( :@scanner ).string.should_not =~ /^__END__/