* Rename 'markers' to 'token'

* Fix up the Rakefile's gem generation
* Add LICENSE
* Add a real README
mahlon 2008-11-09 00:27:36 +00:00
parent f4051c5a35
commit 194fadda98
5 changed files with 183 additions and 138 deletions

chunker/LICENSE Normal file

@@ -0,0 +1,29 @@
Copyright (c) 2008, Mahlon E. Smith
All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are
permitted provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this
list of conditions and the following disclaimer in the documentation and/or
other materials provided with the distribution.
* Neither the name of the author, nor the names of contributors may be used to
endorse or promote products derived from this software without specific prior
written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING
NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


@@ -1,7 +1,59 @@
The DATA constant
Preface:
The problem
Ruby provides an automatic constant called DATA, which is an IO object
that references all text in the current file under an __END__ token.
A workaround
I find it convenient to use the __END__ area to store all sorts of
stuff, rather than have to worry about distributing separate files.
The problem:
The DATA constant is determined from whatever ruby believes $0 to be.
It doesn't work inside of other required libraries, so you'll see stuff
like this all the time:
END = File.open( __FILE__ ).read.split( /^__END__/, 2 ).last
It works, but it's more work than I want to do.
A workaround:
Chunker solves this by parsing __END__ tokens for you, and making it
available in the form of a 'DATA_END' constant. It installs this
constant into the class that includes Chunker, so you can use it again
and again, assuming you use a different file for each class.
It also automatically parses out other things that look like tokens, so
you can easily have multiple, distinct documents all embedded into the
__END__ block.
Usage:
There is no direct interface to Chunker. Just include it from a
class to have that file's __END__ data blocks magically become DATA_*
IO constants within that class.
Example:
This produces the string "Yep.\n".
require 'chunker'
class Foom
include Chunker
end
puts Foom.new.class.const_get( :DATA_WICKED ).read
__END__
Stuff in the END block!
__WOW__
Ultimate success!
__WICKED__
Yep.
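Note on the README hunk above: a minimal sketch of the manual idiom it describes, run against an in-memory string instead of a real file so that it stays self-contained. The names end_text_of and sample are made up for illustration and are not part of Chunker. The result is not assigned to a constant called END, because END is a reserved word in Ruby.

```ruby
# Return everything after the first __END__ line of +source+.
# The limit of 2 stops split() after the first token, so any later
# tokens stay inside the returned text. If there is no token at all,
# the whole text comes back unchanged.
def end_text_of( source )
  source.split( /^__END__\r?\n/, 2 ).last
end

# Hypothetical stand-in for File.read( __FILE__ ). The heredoc body is
# indented so the embedded token cannot be taken for this script's own.
sample = <<~EO_SOURCE
  puts "code above the token"
  __END__
  Stuff in the END block!
  __WICKED__
  Yep.
EO_SOURCE

puts end_text_of( sample )
```

This is the step that every library would otherwise repeat for itself, which is the duplication the README says Chunker is meant to remove.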


@@ -7,19 +7,30 @@ require 'rubygems'
require 'pathname'
require 'rake'
require 'rake/packagetask'
require 'rake/gempackagetask'
require 'spec/rake/spectask'
require 'rubygems/installer'
require 'rubygems/uninstaller'
######################################################################
### P A T H S
### P A T H S A N D F I L E S
######################################################################
BASEDIR = Pathname.new( __FILE__ ).expand_path.dirname.relative_path_from( Pathname.getwd )
TEXT_FILES = %w{ Rakefile README LICENSE }.collect {|f| BASEDIR + f }
SPECDIR = BASEDIR + 'spec'
LIBDIR = BASEDIR + 'lib'
SPEC_FILES = Pathname.glob( SPECDIR + '**/*_spec.rb' ).reject {|f| f =~ /^\.svn/ }
LIBDIR = BASEDIR + 'lib'
LIB_FILES = Pathname.glob( LIBDIR + '**/*.rb').reject {|i| i =~ /\.svn/ }
RELEASE_FILES = TEXT_FILES + LIB_FILES + SPEC_FILES
######################################################################
### H E L P E R S
######################################################################
@@ -37,6 +48,7 @@ def find_pattern( file, pattern )
return ver.is_a?( String ) ? ver : 'UNKNOWN'
end
######################################################################
### P A C K A G E C O N S T A N T S
######################################################################
@@ -44,31 +56,21 @@ end
PKG_NAME = 'chunker'
PKG_VERSION = find_pattern( LIBDIR + 'chunker.rb', /VERSION = ['"](\d\.\d(?:\/\d)?)['"]/ )
PKG_REVISION = find_pattern( LIBDIR + 'chunker.rb', /SVNRev = .+Rev: (\d+)/ )
PKG_VERSION = begin
ver = nil
File.open( LIBDIR + 'chunker.rb' ) do |f|
ver = f.each do |line|
break $1 if line =~ /VERSION = ['"](\d\.\d(?:\/\d)?)['"]/
end
end
ver.is_a?( String ) ? ver : 'UNKNOWN'
end
RELEASE_NAME = "REL #{PKG_VERSION}"
PKG_FILE_NAME = "#{PKG_NAME}-#{PKG_VERSION}"
PKG_FILE_NAME = "#{PKG_NAME}-#{PKG_VERSION}.#{PKG_REVISION}"
######################################################################
### T A S K S
######################################################################
task :default => [:test]
task :default => [ :test, :package ]
### Task: run rspec tests
###
desc "Run tests"
Spec::Rake::SpecTask.new('test') do |task|
task.spec_files = FileList['spec/**/*.rb']
task.spec_files = SPEC_FILES
task.spec_opts = %w{ -c -fs }
end
@@ -85,118 +87,70 @@ end
### Task: Create gem from source
###
gem = Gem::Specification.new do |gem|
end
Rake::GemPackageTask.new( gem ) do |pkg|
pkg.need_zip = true
pkg.need_tar = true
end
__END__
spec = Gem::Specification.new do |s|
s.platform = Gem::Platform::RUBY
s.summary = "Ruby based make-like utility."
s.name = 'rake'
s.version = PKG_VERSION
s.requirements << 'none'
s.require_path = 'lib'
s.autorequire = 'rake'
s.files = PKG_FILES
s.description = <<EOF
Rake is a Make-like program implemented in Ruby. Tasks
and dependencies are specified in standard Ruby syntax.
EOF
end
Rake::GemPackageTask.new(spec) do |pkg|
pkg.need_zip = true
pkg.need_tar = true
end
require 'rake/packagetask'
require 'rake/gempackagetask'
### Task: gem
gemspec = Gem::Specification.new do |gem|
pkg_build = get_svn_rev( BASEDIR ) || 0
pkg_build = PKG_REVISION || 0
gem.summary = "A convenience library for parsing __END__ tokens consistently."
gem.name = PKG_NAME
gem.version = "%s.%s" % [ PKG_VERSION, pkg_build ]
gem.summary = "ThingFish - A highly-accessable network datastore"
gem.description = "ThingFish is a network-accessable, searchable, extensible " +
"datastore. It can be used to store chunks of data on the " +
"network in an application-independent way, associate the chunks " +
"with other chunks through metadata, and then search for the chunk " +
"you need later and fetch it again, all through a REST API over HTTP."
gem.authors = "Michael Granger and Mahlon E. Smith"
gem.email = "mgranger@laika.com, mahlon@laika.com"
gem.homepage = "http://opensource.laika.com/wiki/ThingFish"
gem.rubyforge_project = 'laika'
gem.author = 'Mahlon E. Smith'
gem.email = 'mahlon@martini.nu'
gem.homepage = 'http://projects.martini.nu/ruby-modules/wiki/chunker'
gem.rubyforge_project = 'mahlon'
gem.has_rdoc = true
gem.files = RELEASE_FILES.
collect {|f| f.relative_path_from(BASEDIR).to_s }
gem.test_files = SPEC_FILES.
collect {|f| f.relative_path_from(BASEDIR).to_s }
gem.executables = BIN_FILES .
collect {|f| f.relative_path_from(BINDIR).to_s }
gem.add_dependency( 'uuidtools', '>= 1.0.0' )
gem.add_dependency( 'pluginfactory', '>= 1.0.3' )
end
Rake::GemPackageTask.new( gemspec ) do |task|
task.gem_spec = gemspec
task.need_tar = false
task.need_tar_gz = true
task.need_tar_bz2 = true
task.need_zip = true
gem.description = <<-EOF
Ruby provides an automatic constant called DATA, which is an IO object
that references all text in the current file under an __END__ token.
I find it convenient to use the __END__ area to store all sorts of
stuff, rather than have to worry about distributing separate files.
The DATA constant is determined from whatever ruby believes $0 to be.
It doesn't work inside of other required libraries, so you'll see stuff
like this all the time:
END = File.open( __FILE__ ).read.split( /^__END__/, 2 ).last
It works, but it's more work than I want to do.
Chunker solves this by parsing __END__ tokens for you, and making it
available in the form of a 'DATA_END' constant. It installs this
constant into the class that includes Chunker, so you can use it again
and again, assuming you use a different file for each class.
It also automatically parses out other things that look like tokens, so
you can easily have multiple, distinct documents all embedded into the
__END__ block.
EOF
end
desc "Build the ThingFish gem and gems for all the standard plugins"
task :gems => [:gem] do
log "Building gems for plugins in: %s" % [PLUGINS.join(', ')]
PLUGINS.each do |plugindir|
log plugindir.basename
cp BASEDIR + 'LICENSE', plugindir
Dir.chdir( plugindir ) do
system 'rake', 'gem'
end
fail unless $?.success?
pkgdir = plugindir + 'pkg'
gems = Pathname.glob( pkgdir + '*.gem' )
cp gems, PKGDIR
end
Rake::GemPackageTask.new( gem ) do |pkg|
pkg.need_zip = true
pkg.need_tar = true
pkg.need_tar_bz2 = true
end
### Task: install
###
task :install_gem => [ :package ] do
$stderr.puts
installer = Gem::Installer.new( %{pkg/#{PKG_FILE_NAME}.gem} )
installer = Gem::Installer.new( "pkg/#{PKG_FILE_NAME}.gem" )
installer.install
end
task :install => [ :install_gem ]
### Task: uninstall
task :uninstall_gem => [:clean] do
uninstaller = Gem::Uninstaller.new( PKG_FILE_NAME )
###
task :uninstall_gem do
uninstaller = Gem::Uninstaller.new( PKG_NAME )
uninstaller.uninstall
end
task :uninstall => [ :uninstall_gem ]
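Note on the Rakefile hunks above: the diff shows only the last line of find_pattern, so the body below is a guess at its shape and not the author's code. It is a sketch assuming the helper scans a file line by line and returns the first capture group, falling back to 'UNKNOWN' as the visible return line does. The name find_pattern_sketch is hypothetical, and it takes an IO rather than a path so that it can run without any files on disk.

```ruby
require 'stringio'

# Hypothetical reconstruction of the Rakefile helper. Returns the first
# capture group of +pattern+ from the first matching line of +io+, or
# 'UNKNOWN' when no line matches.
def find_pattern_sketch( io, pattern )
  ver = nil
  io.each_line do |line|
    if line =~ pattern
      ver = $1
      break
    end
  end
  return ver.is_a?( String ) ? ver : 'UNKNOWN'
end

# Usage with the version pattern from the Rakefile and a made-up source file.
source = StringIO.new( "module Chunker\n\tVERSION = '0.1'\nend\n" )
puts find_pattern_sketch( source, /VERSION = ['"](\d\.\d(?:\/\d)?)['"]/ )
```

Pulling the scan into one helper lets PKG_VERSION and PKG_REVISION share the same fallback, which is what replacing the inline begin/end block achieves in this commit.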


@@ -1,9 +1,17 @@
#!/usr/bin/ruby
#
# Chunker!
# Chunker: A convenience library for parsing __END__ tokens consistently.
#
# Mahlon E. Smith <mahlon@martini.nu>
# == Version
#
# $Id$
#
# == Author
#
# * Mahlon E. Smith <mahlon@martini.nu>
#
# :include: LICENSE
#
### Namespace for the datablock parser.
###
@@ -26,18 +34,18 @@ module Chunker
### Parser class for __END__ data blocks.
### Find each __MARKER__ within the __END__, and put each into a
### DATA_MARKER constant within the namespace that included us.
### Find each __TOKEN__ within the __END__, and put each into a
### DATA_TOKEN constant within the namespace that included us.
###
class DataParser
# The mark for a DATA block.
#
END_MARKER = /^__END__\r?\n/
END_TOKEN = /^__END__\r?\n/
# The mark for a 'sub' block.
#
CHUNK_MARKER = /^__([A-Z\_0-9]+)__\r?\n/
CHUNK_TOKEN = /^__([A-Z\_0-9]+)__\r?\n/
### Constructor: Given a +klass+ and an +io+ to the class file,
@@ -45,13 +53,13 @@ module Chunker
###
def initialize( klass, io )
io.open if io.closed?
end_string = io.read.split( END_MARKER, 2 ).last
end_string = io.read.split( END_TOKEN, 2 ).last
@klass = klass
@scanner = StringScanner.new( end_string )
io.close
if @scanner.check_until( CHUNK_MARKER )
if @scanner.check_until( CHUNK_TOKEN )
# put each chunk into its own constant
self.extract_blocks
else
@@ -71,10 +79,10 @@ module Chunker
def extract_blocks
label = nil
while @scanner.scan_until( CHUNK_MARKER ) and ! @scanner.eos?
while @scanner.scan_until( CHUNK_TOKEN ) and ! @scanner.eos?
data = ''
# First pass, __END__ contents (until next marker, instead
# First pass, __END__ contents (until next token, instead
# of entire data block.)
#
if label.nil?
@@ -85,8 +93,8 @@ module Chunker
else
label = @scanner[1]
if data = @scanner.scan_until( CHUNK_MARKER )
# Pull the next marker text out of the data, set up the next pass
if data = @scanner.scan_until( CHUNK_TOKEN )
# Pull the next token text out of the data, set up the next pass
#
data = data[ 0, data.length - @scanner[0].length ]
@scanner.pos = self.next_position
@@ -112,13 +120,15 @@ module Chunker
end
### Included hook: Find the file path for how we arrived here, and open
### it as an IO object. __FILE__ won't work, so we find it via caller().
### Start parsing this file for data blocks.
### Hook included: Find the file path for how we arrived here, and open
### it as an IO object. Parse the IO for data block tokens.
###
def self.included( klass )
# klass.instance_eval{ __FILE__ } awww, nope.
# __FILE__ won't work here, so we find the filename via caller().
#
io = File.open( caller(1).last.sub(/:.*?$/, ''), 'r' )
DataParser.new( klass, io )
end
end
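Note on the DataParser hunks above: a simplified sketch of splitting __END__ text on __TOKEN__ lines with StringScanner. It is not the library's implementation. It returns a Hash of strings keyed by label, where Chunker installs DATA_* IO constants into the including class. The method name split_chunks and the local chunk_token are assumptions made for illustration; the pattern itself is copied from CHUNK_TOKEN.

```ruby
require 'strscan'

# Split +end_text+ (everything after the __END__ line) into labelled
# chunks. Text before the first token is stored under the label 'END'.
def split_chunks( end_text )
  chunk_token = /^__([A-Z_0-9]+)__\r?\n/
  chunks      = {}
  label       = 'END'
  scanner     = StringScanner.new( end_text )

  until scanner.eos?
    if ( data = scanner.scan_until( chunk_token ) )
      # scan_until returns the text up to and including the token line,
      # so trim the token off before storing the chunk.
      chunks[ label ] = data[ 0, data.length - scanner[0].length ]
      label = scanner[1]
    else
      # No further tokens: the remainder belongs to the current label.
      chunks[ label ] = scanner.rest
      scanner.terminate
    end
  end

  return chunks
end

text = "Stuff in the END block!\n__WOW__\nUltimate success!\n__WICKED__\nYep.\n"
split_chunks( text ).each do |name, body|
  puts "DATA_#{name}: #{body.inspect}"
end
```

One simplification to be aware of: a token that is the very last line of the text gets no entry in this sketch, since nothing follows it.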


@@ -67,14 +67,14 @@ EO_FILE_TEXT
describe Chunker::DataParser do
it "doesn't include content above the __END__ marker" do
it "doesn't include content above the __END__ token" do
klass = Class.new
dp = Chunker::DataParser.new( klass, StringIO.new( FILE_TEXT_MULTIPLE ))
dp.instance_variable_get( :@scanner ).string.
should_not =~ /This is stuff we shouldn't see/
end
it "doesn't contain the __END__ marker itself" do
it "doesn't contain the __END__ token itself" do
klass = Class.new
dp = Chunker::DataParser.new( klass, StringIO.new( FILE_TEXT ))
dp.instance_variable_get( :@scanner ).string.should_not =~ /^__END__/