From c0cd35e2606f3a5d04b1bdb78bd5715c762a80ce Mon Sep 17 00:00:00 2001 From: "mahlon@martini.nu" Date: Sun, 14 Mar 2021 23:19:41 +0000 Subject: [PATCH] Multiple changes. - Complete first round of documentation. - Complete first round of tests and coverage. - Expand the thread benchmarker for testing metasync. - Add enumerators (each_key/each_value/each_pair) using cursors. - Remove keys() implementation in favor of using the emumerable. - Make deserialization more DRY. - Add an efficient length() method. - Add various Hash-alike methods. - General code cleanup for release. FossilOrigin-Name: 0d2bd3995f203c9ac1734ac3da32dd2f09efda9a394e554e6006e44dd07a33b0 --- .pryrc | 2 +- README.md | 460 ++++++++++++++++++++++++------------ experiments/thread_usage.rb | 82 ++++--- ext/mdbx_ext/database.c | 206 ++++++++++++---- lib/mdbx/database.rb | 135 ++++++++++- spec/lib/helper.rb | 1 + spec/mdbx/database_spec.rb | 179 +++++++++++++- spec/mdbx/stats_spec.rb | 34 +-- 8 files changed, 828 insertions(+), 271 deletions(-) diff --git a/.pryrc b/.pryrc index 4a3bcfc..d884a04 100644 --- a/.pryrc +++ b/.pryrc @@ -11,6 +11,6 @@ rescue Exception => e e.backtrace.join( "\n\t" ) end -# db = MDBX::Database.open( 'tmp/testdb' ) +db = MDBX::Database.open( 'tmp/testdb', max_collections: 100 ) diff --git a/README.md b/README.md index 2b7da8f..90757a7 100644 --- a/README.md +++ b/README.md @@ -25,64 +25,333 @@ sourcehut: This is a Ruby (MRI) binding for the libmdbx database library. libmdbx is an extremely fast, compact, powerful, embedded, transactional -key-value database, with permissive license. libmdbx has a specific set +key-value database, with a permissive license. libmdbx has a specific set of properties and capabilities, focused on creating unique lightweight solutions. - - Allows a swarm of multi-threaded processes to ACIDly read and update - several key-value maps and multimaps in a locally-shared database. 
- - - Provides extraordinary performance, minimal overhead through - Memory-Mapping and Olog(N) operations costs by virtue of B+ tree. - - - Requires no maintenance and no crash recovery since it doesn't use - WAL, but that might be a caveat for write-intensive workloads with - durability requirements. - - - Compact and friendly for fully embedding. Only ≈25KLOC of C11, - ≈64K x86 binary code of core, no internal threads neither server - process(es), but implements a simplified variant of the Berkeley DB - and dbm API. - - - Enforces serializability for writers just by single mutex and - affords wait-free for parallel readers without atomic/interlocked - operations, while writing and reading transactions do not block each - other. - - - Guarantee data integrity after crash unless this was explicitly - neglected in favour of write performance. - - - Supports Linux, Windows, MacOS, Android, iOS, FreeBSD, DragonFly, - Solaris, OpenSolaris, OpenIndiana, NetBSD, OpenBSD and other systems - compliant with POSIX.1-2008. - - - Historically, libmdbx is a deeply revised and extended descendant - of the amazing Lightning Memory-Mapped Database. libmdbx inherits - all benefits from LMDB, but resolves some issues and adds a set of - improvements. - - -### Examples - -[forthcoming] +For more information about libmdbx (features, limitations, etc), see the +[introduction](https://erthink.github.io/libmdbx/intro.html). ## Prerequisites * Ruby 2.6+ -* libmdbx (https://github.com/erthink/libmdbx) +* [libmdbx](https://github.com/erthink/libmdbx) ## Installation $ gem install mdbx +You may need to be specific if the libmdbx headers are located in a +nonstandard location for your operating system: + + $ gem install mdbx -- --with-opt-dir=/usr/local + + +## Usage + +Some quick concepts: + + - A **database** is contained in a file, normally contained in a directory + with its associated lockfile.
+ - Each database can optionally contain multiple named **collections**, + which can be thought of as distinct namespaces. + - Each collection can contain any number of **keys**, and their associated + **values**. + - A **snapshot** is a self-consistent read-only view of the database. + It remains consistent even if another thread or process writes changes. + - A **transaction** is a writable snapshot. Changes made within a + transaction are not seen by other snapshots until committed. + +### Open (and close) a database handle + +Open a database handle, creating an empty one if not already present. + +```ruby +db = MDBX::Database.open( "/path/to/file", options ) +db.close +``` + +In block form, the handle is automatically closed. + +```ruby +MDBX::Database.open( 'database' ) do |db| + puts db[ 'key1' ] +end # closed database +``` + + +### Read data + +You can use the database handle as a hash. Reading a value automatically +creates a snapshot, retrieves the value, and closes the snapshot before +returning it. + +```ruby +db[ 'key1' ] #=> val +``` + +All data reads require a snapshot (or transaction). + +The `snapshot` method creates a long-running snapshot manually. In +block form, the snapshot is automatically closed when the block exits. +Sharing a snapshot between reads is significantly faster when fetching +many values or in tight loops. + +```ruby +# read-only block +db.snapshot do + db[ 'key1' ] #=> val + ... +end # snapshot closed +``` + +You can also open and close a snapshot manually. + +```ruby +db.snapshot +db.values_at( 'key1', 'key2' ) #=> [ value, value ] +db.rollback +``` + +Technically, `snapshot` just sets the internal state and returns the +database handle - the handle is also yielded when using blocks. The +following 3 examples are identical, use whatever form you prefer. 
+ +```ruby +snap = db.snapshot +snap[ 'key1' ] +snap.abort + +db.snapshot do |snap| + snap[ 'key1' ] +end + +db.snapshot do + db[ 'key1' ] +end +``` + +Attempting writes while within an open snapshot is an exception. + + +### Write data + +Writing data is also hash-like. Assigning a value to a key +automatically opens a writable transaction, stores the value, and +commits the transaction before returning. + +All keys are strings, or converted to a string automatically. + +```ruby +db[ 'key1' ] = val +db[ :key1 ] == db[ 'key1' ] #=> true +``` + +All data writes require a transaction. + +The `transaction` method creates a long-running transaction manually. In +block form, the transaction is automatically closed when the block exits. +Sharing a transaction between writes is significantly faster when +storing many values or in tight loops. + +```ruby +# read/write block +db.transaction do + db[ 'key1' ] = val +end # transaction committed and closed +``` + +You can also open and close a transaction manually. + +```ruby +db.transaction +db[ 'key1' ] = val +db.commit +``` + +Like snapshots, `transaction` just sets the internal state and returns +the database handle - the handle is also yielded when using blocks. The +following 3 examples are identical, use whatever form you prefer. + +```ruby +txn = db.transaction +txn[ 'key1' ] = true +txn.save + +db.transaction do |txn| + txn[ 'key1' ] = true +end + +db.transaction do + db[ 'key1' ] = true +end +``` + +### Delete data + +Just write a `nil` value to remove a key entirely, or like Hash, use the +`#delete` method: + +```ruby +db[ 'key1' ] = nil +``` + +```ruby +oldval = db.delete( 'key1' ) +``` + + +### Transactions + +Transactions are largely modelled after the +[Sequel](https://sequel.jeremyevans.net/rdoc/files/doc/transactions_rdoc.html) +transaction basics. + +While in a transaction block, if no exception is raised, the +transaction is automatically committed and closed when the block exits. 
+ +```ruby +db[ 'key' ] = false + +db.transaction do # BEGIN + db[ 'key' ] = true +end # COMMIT + +db[ 'key' ] #=> true +``` + +If the block raises a MDBX::Rollback exception, the transaction is +rolled back, but no exception is raised outside the block: + +```ruby +db[ 'key' ] = false + +db.transaction do # BEGIN + db[ 'key' ] = true + raise MDBX::Rollback +end # ROLLBACK + +db[ 'key' ] #=> false +``` + +If any other exception is raised, the transaction is rolled back, and +the exception is raised outside the block: + +```ruby +db[ 'key' ] = false + +db.transaction do # BEGIN + db[ 'key' ] = true + raise ArgumentError +end # ROLLBACK + +# ArgumentError raised +``` + + +If you want to check whether you are currently in a transaction, use the +Database#in_transaction? method: + +```ruby +db.in_transaction? #=> false +db.transaction do + db.in_transaction? #=> true +end +``` + +MDBX writes are strongly serialized, and an open transaction blocks +other writers until it has completed. Snapshots have no such +serialization, and readers from separate processes do not interfere with +each other. Be aware of libmdbx behaviors while in open transactions. + + +### Collections + +A MDBX collection is a sub-database, or a namespace. In order to use +this feature, the database must be opened with the `max_collections` +option: + +```ruby +db = MDBX::Database.open( "/path/to/file", max_collections: 10 ) +``` + +Afterwards, you can switch collections at will. + +```ruby +db.collection( 'sub' ) +db.collection #=> 'sub' +db[ :key ] = true +db.main # switch to the top level +db[ :key ] #=> nil +``` + +In block form, the collection is reverted to the current collection when +the block was started: + +```ruby +db.collection( 'sub1' ) +db.collection( 'sub2' ) do + db[ :key ] = true +end # the collection is reverted to 'sub1' +``` + +Collections cannot be switched while a snapshot or transaction is open. + +Collection names are stored in the top-level database as keys. 
Attempts +to use these keys as regular values, or to switch to a key that is not +a collection, will result in an incompatibility error. While using +collections, it's probably wise not to store regular key/value data in a +top-level database to avoid this ambiguity. + + +### Value Serialization + +By default, all values are stored as Marshal data - this is the most +"Ruby" behavior, as you can store any Ruby object directly that supports +`Marshal.dump`. + +```ruby +db.serializer = ->( v ) { Marshal.dump( v ) } +db.deserializer = ->( v ) { Marshal.load( v ) } +``` + +For compatibility with databases used by other languages, or if your +needs are more specific, you can disable or override the default +serialization behaviors after opening the database. + +```ruby +# All values are JSON strings +db.serializer = ->( v ) { JSON.generate( v ) } +db.deserializer = ->( v ) { JSON.parse( v ) } +``` + +```ruby +# Disable all automatic serialization +db.serializer = nil +db.deserializer = nil +``` + +### Introspection + +Calling `statistics` on a database handle will provide a subset of +information about the build environment, the database environment, and +the currently connected clients. + + +## TODO + + - Expose more database/collection information to statistics + - Support libmdbx multiple values per key DUPSORT via `put`, `get` + Enumerators, and a 'value' argument for `delete`. + ## Contributing You can check out the current development source with Mercurial via its [home repo](https://code.martini.nu/ruby-mdbx), or with Git at its -[project page](https://gitlab.com/mahlon/ruby-mdbx). +[project mirror](https://gitlab.com/mahlon/ruby-mdbx). After checking out the source, run: @@ -128,116 +397,3 @@ CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- - - -Ruby MDBX -========= - -https://erthink.github.io/libmdbx/intro.html - -Notes on the libmdbx environment for ruby: - - - A **database** is contained in a file, normally wrapped in directory for it's associated lock. - - Each database can contain multiple named **collections**. - - Each collection can contain any number of **keys**, and their associated **values**. A collection may optionally support multiple values per key (or duplicate keys, which is the same thing). - - A **cursor** lets you iterate a collection's keys and values in order. - (Note, this should be enumerable and built in to the Ruby interface) - - A **snapshot** is a self-consistent read-only view of the database. It stays the same even if some other thread or process makes changes. *The only way to access keys and values is within a snapshot*. - - A **transaction** is a writable snapshot. Changes made within a transaction are private until committed. *The only way to modify the database is within a transaction*. - - -Example Usage ----------------- - -### Create database handle - -```ruby -db = MDBX::Database.create( "/path/to/file", options ) -db = MDBX::Database.open( "/path/to/file", options ) - -# perhaps a block mode that yields the handle, closing on block exit? -MDBX::Database.open( 'database' ) do |db| - puts db[ 'key1' ] -end -``` - -### Access data - -```ruby -db[ 'key1' ] #=> val -# In the backend, automatically creates the snapshot, retrieves the value, and removes the snapshot before returning. - -# read-only block -db.snapshot do - db[ 'key1' ] #=> val - ... -end -# This is much faster for retrieving many values - -# Maybe have a snapshot object that acts like a DB while it exists? -snap = db.snapshot -snap[ 'whatever' ] #=> data -snap.close -``` - -### Write data - -```ruby -db[ 'key1' ] = val -# In the backend, automatically creates a transaction, stores the value, and closes the transaction before returning. 
- -# writable block -db.transaction do - db[ 'key1' ] = val -end -# Much faster for writing many values, should commit on success or abort on any exception - -# Maybe have a transaction object that acts like a DB while it exists? -# ALL OTHER TRANSACTIONS will block until this is closed -txn = db.transaction -txn[ 'whatever' ] = data -txn.commit # or txn.abort -``` - -### Collections - -Identical interface to top-level databases. Just have to pull the collection first. - -```ruby -collection = db.collection( 'stuff' ) # raise if nonexistent -# This now works just like the main db object - -collection.transaction.do - ... -end -``` - -### Cleaning up - -```ruby -db.close -``` - - - -### Stats! - - -TODO -------- - - gem install mdbx -- --with-opt-dir=/usr/local - - - [ ] Multiple value per key -- .insert, .delete? iterator for multi-val keys - - [ ] each_pair? - - [ ] document how serialization works - - [ ] document everything, really - - [x] transaction/snapshot blocks - - [ ] Arbitrary keys instead of forcing to strings? - - [ ] Disallow collection switching if there is an open transaction - - - - - diff --git a/experiments/thread_usage.rb b/experiments/thread_usage.rb index 3b4d53b..bac77ff 100755 --- a/experiments/thread_usage.rb +++ b/experiments/thread_usage.rb @@ -7,10 +7,6 @@ require 'fileutils' include FileUtils - -db = MDBX::Database.open( 'tmpdb' ) -mtx = Mutex.new - at_exit do rm_r 'tmpdb' end @@ -20,31 +16,17 @@ WRITES_PER = 1000 puts "#{THREAD_COUNT} simultaneous threads, #{WRITES_PER} writes each:" -Benchmark.bm( 10 ) do |x| - x.report( " txn per write:" ) do - threads = [] - THREAD_COUNT.times do |i| - threads << Thread.new do - mtx.synchronize do - WRITES_PER.times do - key = "%02d-%d" % [ i, rand(1000) ] - db[ key ] = rand(1000) - end - end - end - end - threads.map( &:join ) - end +def run_bench( db, msg ) + mtx = Mutex.new + Benchmark.bm( 10 ) do |x| + puts msg + puts '-' * 60 - - # Long running transactions require a mutex across threads. 
- # - x.report( "txn per thread:" ) do - threads = [] - THREAD_COUNT.times do |i| - threads << Thread.new do - mtx.synchronize do - db.transaction do + x.report( " txn per write:" ) do + threads = [] + THREAD_COUNT.times do |i| + threads << Thread.new do + mtx.synchronize do WRITES_PER.times do key = "%02d-%d" % [ i, rand(1000) ] db[ key ] = rand(1000) @@ -52,18 +34,48 @@ Benchmark.bm( 10 ) do |x| end end end + threads.map( &:join ) end - threads.map( &:join ) - end - x.report( " unthreaded:" ) do - db.transaction do - ( THREAD_COUNT * WRITES_PER ).times do - key = "000-%d" % [ rand(1000) ] - db[ key ] = rand(1000) + + # Long running transactions require a mutex across threads. + # + x.report( "txn per thread:" ) do + threads = [] + THREAD_COUNT.times do |i| + threads << Thread.new do + mtx.synchronize do + db.transaction do + WRITES_PER.times do + key = "%02d-%d" % [ i, rand(1000) ] + db[ key ] = rand(1000) + end + end + end + end + end + threads.map( &:join ) + end + + x.report( " unthreaded:" ) do + db.transaction do + ( THREAD_COUNT * WRITES_PER ).times do + key = "000-%d" % [ rand(1000) ] + db[ key ] = rand(1000) + end end end end + + db.close + puts end +db = MDBX::Database.open( 'tmpdb' ) +run_bench( db, "Default database flags:" ) + +db = MDBX::Database.open( 'tmpdb', no_metasync: true ) +run_bench( db, "Disabled metasync:" ) + + diff --git a/ext/mdbx_ext/database.c b/ext/mdbx_ext/database.c index 56cab5f..332cb40 100644 --- a/ext/mdbx_ext/database.c +++ b/ext/mdbx_ext/database.c @@ -75,7 +75,7 @@ rmdbx_free( void *db ) /* - * Cleanly close an opened database from Ruby. + * Cleanly close an opened database. */ VALUE rmdbx_close( VALUE self ) @@ -154,6 +154,25 @@ rmdbx_open_env( VALUE self ) } +/* + * Open a cursor for iteration. + */ +void +rmdbx_open_cursor( rmdbx_db_t *db ) +{ + if ( ! db->state.open ) rb_raise( rmdbx_eDatabaseError, "Closed database." ); + if ( ! db->txn ) rb_raise( rmdbx_eDatabaseError, "No snapshot or transaction currently open." 
); + + int rc = mdbx_cursor_open( db->txn, db->dbi, &db->cursor ); + if ( rc != MDBX_SUCCESS ) { + rmdbx_close_all( db ); + rb_raise( rmdbx_eDatabaseError, "Unable to open cursor: (%d) %s", rc, mdbx_strerror(rc) ); + } + + return; +} + + /* * Open a new database transaction. If a transaction is already * open, this is a no-op. @@ -165,7 +184,7 @@ rmdbx_open_txn( rmdbx_db_t *db, int rwflag ) { if ( db->txn ) return; - int rc = mdbx_txn_begin( db->env, NULL, rwflag, &db->txn); + int rc = mdbx_txn_begin( db->env, NULL, rwflag, &db->txn ); if ( rc != MDBX_SUCCESS ) { rmdbx_close_all( db ); rb_raise( rmdbx_eDatabaseError, "mdbx_txn_begin: (%d) %s", rc, mdbx_strerror(rc) ); @@ -254,7 +273,7 @@ rmdbx_rb_closetxn( VALUE self, VALUE write ) * * Empty the current collection on disk. If collections are not enabled * or the database handle is set to the top-level (main) db - this - * deletes *all data* on disk. Fair warning, this is not recoverable! + * deletes *all records* from the database. This is not recoverable! */ VALUE rmdbx_clear( VALUE self ) @@ -315,40 +334,133 @@ rmdbx_val_for( VALUE self, VALUE arg ) } -/* call-seq: - * db.keys => [ 'key1', 'key2', ... ] - * - * Return an array of all keys in the current collection. +/* + * Deserialize and return a value. */ VALUE -rmdbx_keys( VALUE self ) +rmdbx_deserialize( VALUE self, VALUE val ) +{ + VALUE deserialize_proc = rb_iv_get( self, "@deserializer" ); + if ( ! NIL_P( deserialize_proc ) ) + val = rb_funcall( deserialize_proc, rb_intern("call"), 1, val ); + + return val; +} + + +/* call-seq: + * db.each_key {|key| block } => self + * + * Calls the block once for each key, returning self. + * A transaction must be opened prior to use. + */ +VALUE +rmdbx_each_key( VALUE self ) { UNWRAP_DB( self, db ); - VALUE rv = rb_ary_new(); MDBX_val key, data; - int rc; - if ( ! db->state.open ) rb_raise( rmdbx_eDatabaseError, "Closed database." 
); + rmdbx_open_cursor( db ); + RETURN_ENUMERATOR( self, 0, 0 ); - rmdbx_open_txn( db, MDBX_TXN_RDONLY ); - rc = mdbx_cursor_open( db->txn, db->dbi, &db->cursor); - - if ( rc != MDBX_SUCCESS ) { - rmdbx_close( self ); - rb_raise( rmdbx_eDatabaseError, "Unable to open cursor: (%d) %s", rc, mdbx_strerror(rc) ); - } - - rc = mdbx_cursor_get( db->cursor, &key, &data, MDBX_FIRST ); - if ( rc == MDBX_SUCCESS ) { - rb_ary_push( rv, rb_str_new( key.iov_base, key.iov_len ) ); - while ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_NEXT ) == 0 ) { - rb_ary_push( rv, rb_str_new( key.iov_base, key.iov_len ) ); + if ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_FIRST ) == MDBX_SUCCESS ) { + rb_yield( rb_str_new( key.iov_base, key.iov_len ) ); + while ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_NEXT ) == MDBX_SUCCESS ) { + rb_yield( rb_str_new( key.iov_base, key.iov_len ) ); } } mdbx_cursor_close( db->cursor ); db->cursor = NULL; + return self; +} + + +/* call-seq: + * db.each_value {|value| block } => self + * + * Calls the block once for each value, returning self. + * A transaction must be opened prior to use. + */ +VALUE +rmdbx_each_value( VALUE self ) +{ + UNWRAP_DB( self, db ); + MDBX_val key, data; + + rmdbx_open_cursor( db ); + RETURN_ENUMERATOR( self, 0, 0 ); + + if ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_FIRST ) == MDBX_SUCCESS ) { + VALUE rv = rb_str_new( data.iov_base, data.iov_len ); + rb_yield( rmdbx_deserialize( self, rv ) ); + + while ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_NEXT ) == MDBX_SUCCESS ) { + rv = rb_str_new( data.iov_base, data.iov_len ); + rb_yield( rmdbx_deserialize( self, rv ) ); + } + } + + mdbx_cursor_close( db->cursor ); + db->cursor = NULL; + return self; +} + + +/* call-seq: + * db.each_pair {|key, value| block } => self + * + * Calls the block once for each key and value, returning self. + * A transaction must be opened prior to use. 
+ */ +VALUE +rmdbx_each_pair( VALUE self ) +{ + UNWRAP_DB( self, db ); + MDBX_val key, data; + + rmdbx_open_cursor( db ); + RETURN_ENUMERATOR( self, 0, 0 ); + + if ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_FIRST ) == MDBX_SUCCESS ) { + VALUE rkey = rb_str_new( key.iov_base, key.iov_len ); + VALUE rval = rb_str_new( data.iov_base, data.iov_len ); + rb_yield( rb_assoc_new( rkey, rmdbx_deserialize( self, rval ) ) ); + + while ( mdbx_cursor_get( db->cursor, &key, &data, MDBX_NEXT ) == MDBX_SUCCESS ) { + rkey = rb_str_new( key.iov_base, key.iov_len ); + rval = rb_str_new( data.iov_base, data.iov_len ); + rb_yield( rb_assoc_new( rkey, rmdbx_deserialize( self, rval ) ) ); + } + } + + mdbx_cursor_close( db->cursor ); + db->cursor = NULL; + return self; +} + + +/* call-seq: + * db.length -> Integer + * + * Returns the count of keys in the currently selected collection. + */ +VALUE +rmdbx_length( VALUE self ) +{ + UNWRAP_DB( self, db ); + MDBX_stat mstat; + + if ( ! db->state.open ) rb_raise( rmdbx_eDatabaseError, "Closed database." ); + rmdbx_open_txn( db, MDBX_TXN_RDONLY ); + + int rc = mdbx_dbi_stat( db->txn, db->dbi, &mstat, sizeof(mstat) ); + if ( rc != MDBX_SUCCESS ) + rb_raise( rmdbx_eDatabaseError, "mdbx_dbi_stat: (%d) %s", rc, mdbx_strerror(rc) ); + + VALUE rv = LONG2FIX( mstat.ms_entries ); rmdbx_close_txn( db, RMDBX_TXN_ROLLBACK ); + return rv; } @@ -356,31 +468,27 @@ rmdbx_keys( VALUE self ) /* call-seq: * db[ 'key' ] => value * - * Convenience method: return a single value for +key+ immediately. + * Return a single value for +key+ immediately. */ VALUE rmdbx_get_val( VALUE self, VALUE key ) { int rc; - VALUE deserialize_proc; UNWRAP_DB( self, db ); if ( ! db->state.open ) rb_raise( rmdbx_eDatabaseError, "Closed database." 
); - rmdbx_open_txn( db, MDBX_TXN_RDONLY ); MDBX_val ckey = rmdbx_key_for( key ); MDBX_val data; + VALUE rv; rc = mdbx_get( db->txn, db->dbi, &ckey, &data ); rmdbx_close_txn( db, RMDBX_TXN_ROLLBACK ); switch ( rc ) { case MDBX_SUCCESS: - deserialize_proc = rb_iv_get( self, "@deserializer" ); - VALUE rv = rb_str_new( data.iov_base, data.iov_len ); - if ( ! NIL_P( deserialize_proc ) ) - return rb_funcall( deserialize_proc, rb_intern("call"), 1, rv ); - return rv; + rv = rb_str_new( data.iov_base, data.iov_len ); + return rmdbx_deserialize( self, rv ); case MDBX_NOTFOUND: return Qnil; @@ -395,7 +503,7 @@ rmdbx_get_val( VALUE self, VALUE key ) /* call-seq: * db[ 'key' ] = value * - * Convenience method: set a single value for +key+ + * Set a single value for +key+. */ VALUE rmdbx_put_val( VALUE self, VALUE key, VALUE val ) @@ -404,7 +512,6 @@ rmdbx_put_val( VALUE self, VALUE key, VALUE val ) UNWRAP_DB( self, db ); if ( ! db->state.open ) rb_raise( rmdbx_eDatabaseError, "Closed database." ); - rmdbx_open_txn( db, MDBX_TXN_READWRITE ); MDBX_val ckey = rmdbx_key_for( key ); @@ -453,8 +560,9 @@ rmdbx_stats( VALUE self ) /* * call-seq: - * db.collection( 'collection_name' ) => db - * db.collection( nil ) => db (main) + * db.collection -> (collection name, or nil if in main) + * db.collection( 'collection_name' ) -> db + * db.collection( nil ) -> db (main) * * Gets or sets the sub-database "collection" that read/write * operations apply to. @@ -464,7 +572,7 @@ rmdbx_stats( VALUE self ) * * db.collection( 'collection_name' ) do * [ ... ] - * end => reverts to the previous collection name + * end # reverts to the previous collection name * */ VALUE @@ -482,22 +590,19 @@ rmdbx_set_subdb( int argc, VALUE *argv, VALUE self ) /* Provide a friendlier error message if max_collections is 0. */ if ( db->settings.max_collections == 0 ) - rb_raise( rmdbx_eDatabaseError, "Unable to change collection: collections are not enabled." 
); + rb_raise( rmdbx_eDatabaseError, "Unable to change collection: collections are not enabled." ); /* All transactions must be closed when switching database handles. */ if ( db->txn ) rb_raise( rmdbx_eDatabaseError, "Unable to change collection: transaction open" ); - /* Retain the prior database collection if a - * block was passed. */ - if ( rb_block_given_p() ) { - if ( db->subdb != NULL ) { - prev_db = (char *) malloc( strlen(db->subdb) + 1 ); - strcpy( prev_db, db->subdb ); - } + /* Retain the prior database collection if a block was passed. + */ + if ( rb_block_given_p() && db->subdb != NULL ) { + prev_db = (char *) malloc( strlen(db->subdb) + 1 ); + strcpy( prev_db, db->subdb ); } - rb_iv_set( self, "@collection", subdb ); db->subdb = NIL_P( subdb ) ? NULL : StringValueCStr( subdb ); rmdbx_close_dbi( db ); @@ -507,11 +612,11 @@ rmdbx_set_subdb( int argc, VALUE *argv, VALUE self ) haven't written anything to the new collection yet. */ - /* Revert to the previous collection after the block is done. */ + /* Revert to the previous collection after the block is done. + */ if ( rb_block_given_p() ) { rb_yield( self ); if ( db->subdb != prev_db ) { - rb_iv_set( self, "@collection", prev_db ? 
rb_str_new_cstr(prev_db) : Qnil ); db->subdb = prev_db; rmdbx_close_dbi( db ); } @@ -633,7 +738,10 @@ rmdbx_init_database() rb_define_method( rmdbx_cDatabase, "closed?", rmdbx_closed_p, 0 ); rb_define_method( rmdbx_cDatabase, "in_transaction?", rmdbx_in_transaction_p, 0 ); rb_define_method( rmdbx_cDatabase, "clear", rmdbx_clear, 0 ); - rb_define_method( rmdbx_cDatabase, "keys", rmdbx_keys, 0 ); + rb_define_method( rmdbx_cDatabase, "each_key", rmdbx_each_key, 0 ); + rb_define_method( rmdbx_cDatabase, "each_value", rmdbx_each_value, 0 ); + rb_define_method( rmdbx_cDatabase, "each_pair", rmdbx_each_pair, 0 ); + rb_define_method( rmdbx_cDatabase, "length", rmdbx_length, 0 ); rb_define_method( rmdbx_cDatabase, "[]", rmdbx_get_val, 1 ); rb_define_method( rmdbx_cDatabase, "[]=", rmdbx_put_val, 2 ); diff --git a/lib/mdbx/database.rb b/lib/mdbx/database.rb index 6604171..f1cacff 100644 --- a/lib/mdbx/database.rb +++ b/lib/mdbx/database.rb @@ -19,13 +19,16 @@ class MDBX::Database ### ### MDBX::Database.open( path, options ) do |db| ### db[ 'key' ] = value - ### end + ### end # closed! ### ### Passing options modify various database behaviors. See the libmdbx ### documentation for detailed information. ### ### ==== Options ### + ### Unless otherwise mentioned, option keys are symbols, and values + ### are boolean. + ### ### [:mode] ### Whe creating a new database, set permissions to this 4 digit ### octal number. Defaults to `0644`. Set to `0` to never automatically @@ -50,11 +53,11 @@ class MDBX::Database ### Reject any write attempts while using this database handle. ### ### [:exclusive] - ### Access is restricted to this process handle. Other attempts + ### Access is restricted to the first opening process. Other attempts ### to use this database (even in readonly mode) are denied. 
### ### [:compat] - ### Avoid incompatibility errors when opening an in-use database with + ### Skip compatibility checks when opening an in-use database with ### unknown or mismatched flag values. ### ### [:writemap] @@ -131,12 +134,17 @@ class MDBX::Database return self.collection( nil ) end - # Allow for some common nomenclature. alias_method :namespace, :collection + alias_method :size, :length + alias_method :each, :each_pair + # + # Transaction methods + # + ### Open a new mdbx read/write transaction. In block form, - ### the transaction is automatically committed. + ### the transaction is automatically committed when the block ends. ### ### Raising a MDBX::Rollback exception from within the block ### automatically rolls the transaction back. @@ -162,14 +170,14 @@ class MDBX::Database ### Open a new mdbx read only snapshot. In block form, - ### the snapshot is automatically closed. + ### the snapshot is automatically closed when the block ends. ### def snapshot( &block ) self.transaction( commit: false, &block ) end - ### Close any open transactions, abandoning all changes. + ### Close any open transaction, abandoning all changes. ### def rollback return self.close_transaction( false ) @@ -177,7 +185,7 @@ class MDBX::Database alias_method :abort, :rollback - ### Close any open transactions, writing all changes. + ### Close any open transaction, writing all changes. ### def commit return self.close_transaction( true ) @@ -185,6 +193,117 @@ class MDBX::Database alias_method :save, :commit + # + # Hash-alike methods + # + + ### Return the entirety of database contents as an Array of array + ### pairs. + ### + def to_a + self.snapshot do + return self.each_pair.to_a + end + end + + + ### Return the entirety of database contents as a Hash. + ### + def to_h + self.snapshot do + return self.each_pair.to_h + end + end + + + ### Returns +true+ if the current collection has no data. + ### + def empty? + return self.size.zero? 
+ end + + + ### Returns the value for the given key, if found. + ### If key is not found and no block was given, raises KeyError. + ### If key is not found and a block was given, yields key to the + ### block and returns the block's return value. + ### + def fetch( key, &block ) + val = self[ key ] + if val.nil? + return block.call( key ) if block_given? + raise KeyError, "key not found: %p" % [ key ] + else + return val + end + end + + + ### Deletes the entry for the given key and returns its associated + ### value. If no block is given and key is found, deletes the entry + ### and returns the associated value. If no block given and key is + ### not found, returns nil. + ### + ### If a block is given and key is found, ignores the block, deletes + ### the entry, and returns the associated value. If a block is given + ### and key is not found, calls the block and returns the block's + ### return value. + ### + def delete( key, &block ) + val = self[ key ] + return block.call( key ) if block_given? && val.nil? + + self[ key ] = nil + return val + end + + + ### Returns a new Array containing all keys in the collection. + ### + def keys + self.snapshot do + return self.each_key.to_a + end + end + + + ### Returns a new Hash object containing the entries for the given + ### keys. Any given keys that are not found are ignored. + ### + def slice( *keys ) + self.snapshot do + return keys.each_with_object( {} ) do |key, acc| + val = self[ key ] + acc[ key ] = val unless val.nil? + end + end + end + + + ### Returns a new Array containing all values in the collection. + ### + def values + self.snapshot do + return self.each_value.to_a + end + end + + + ### Returns a new Array containing values for the given +keys+. + ### + def values_at( *keys ) + self.snapshot do + return keys.each_with_object( [] ) do |key, acc| + acc << self[ key ] + end + end + end + + + # + # Utility methods + # + ### Return a hash of various metadata for the current database.
### def statistics diff --git a/spec/lib/helper.rb b/spec/lib/helper.rb index f5198d2..02cd46c 100644 --- a/spec/lib/helper.rb +++ b/spec/lib/helper.rb @@ -21,6 +21,7 @@ end require 'pathname' require 'rspec' +require 'json' require 'mdbx' diff --git a/spec/mdbx/database_spec.rb b/spec/mdbx/database_spec.rb index 4093243..dc67144 100644 --- a/spec/mdbx/database_spec.rb +++ b/spec/mdbx/database_spec.rb @@ -59,8 +59,23 @@ RSpec.describe( MDBX::Database ) do }. to raise_exception( MDBX::DatabaseError, /environment is already used/ ) end + end - it "can remove a key by setting its value to nil" do + + context 'hash-alike methods' do + + let!( :db ) { described_class.open( TEST_DATABASE.to_s ) } + + before( :each ) do + db.clear + end + + after( :each ) do + db.close + end + + + it "can remove an entry by setting a key's value to nil" do db[ 'test' ] = "hi" expect( db['test'] ).to eq( 'hi' ) @@ -68,12 +83,75 @@ RSpec.describe( MDBX::Database ) do expect( db['test'] ).to be_nil end + it 'can remove an entry via delete()' do + val = 'hi' + db[ 'test' ] = val + expect( db['test'] ).to eq( val ) + + oldval = db.delete( 'test' ) + expect( oldval ).to eq( val ) + expect( db['test'] ).to be_nil + end + + it 'returns the result of the delete() block if a key is not found' do + db.clear + expect( db.delete( 'test' ) ).to be_nil + rv = db.delete( 'test' ) {|key| "Couldn't find %p key!" % [ key ] } + expect( rv ).to eq( "Couldn't find \"test\" key!" ) + end + it "can return an array of its keys" do db[ 'key1' ] = true db[ 'key2' ] = true db[ 'key3' ] = true expect( db.keys ).to include( 'key1', 'key2', 'key3' ) end + + it 'knows when there is data present' do + expect( db.empty? ).to be_truthy + db[ 'bloop' ] = 1 + expect( db.empty?
).to be_falsey + end + + it "can convert to an array" do + 3.times{|i| db[i] = i } + expect( db.to_a ).to eq([ ["0",0], ["1",1], ["2",2] ]) + end + + it "can convert to a hash" do + 3.times{|i| db[i] = i } + expect( db.to_h ).to eq({ "0"=>0, "1"=>1, "2"=>2 }) + end + + it "retrieves a value via fetch()" do + db[ 'test' ] = true + expect( db.fetch('test') ).to be_truthy + end + + it "executes a fetch() block if the key was not found" do + rv = false + db.fetch( 'nopenopenope' ) { rv = true } + expect( rv ).to be_truthy + end + + it "raises KeyError if fetch()ing without a block to a nonexistent key" do + expect{ db.fetch(:nopenopenope) }.to raise_exception( KeyError, /key not found/ ) + end + + it "can return a sliced hash" do + ( 'a'..'z' ).each{|c| db[c] = c } + expect( db.slice( 'a', 'f' ) ).to eq( 'a' => 'a', 'f' => 'f' ) + end + + it "can return an array of specific values" do + ( 'a'..'z' ).each{|c| db[c] = c * 3 } + expect( db.values_at('e', 'nopenopenope', 'g') ).to eq( ['eee', nil, 'ggg'] ) + end + + it "can return an array of all values" do + ( 'a'..'z' ).each{|c| db[c] = c * 2 } + expect( db.values ).to include( 'aa', 'hh', 'tt' ) + end end @@ -93,6 +171,18 @@ RSpec.describe( MDBX::Database ) do } .to raise_exception( /not enabled/ ) end + it "knows its length" do + db.collection( 'size1' ) + 10.times {|i| db[i] = true } + db.collection( 'size2' ) + 25.times {|i| db[i] = true } + + db.collection( 'size1' ) + expect( db.length ).to be( 10 ) + db.collection( 'size2' ) + expect( db.length ).to be( 25 ) + end + it "disallows regular key/val storage for namespace keys" do db.collection( 'bucket' ) db[ 'okay' ] = 1 @@ -223,5 +313,92 @@ RSpec.describe( MDBX::Database ) do expect( db[ 1 ] ).to be_falsey end end + + + context "iterators" do + + let( :db ) { + described_class.open( TEST_DATABASE.to_s, max_collections: 5 ).collection( 'iter' ) + } + + before( :each ) do + 3.times {|i| db[i] = "#{i}-val" } + end + + after( :each ) do + db.close + end + + it "raises 
an exception if the caller didn't open a transaction first" do + expect{ db.each_key }.to raise_exception( MDBX::DatabaseError, /no .*currently open/i ) + expect{ db.each_value }.to raise_exception( MDBX::DatabaseError, /no .*currently open/i ) + expect{ db.each_pair }.to raise_exception( MDBX::DatabaseError, /no .*currently open/i ) + end + + context "(with a transaction)" do + + before( :each ) { db.snapshot } + after( :each ) { db.abort } + + it "returns an iterator without a block" do + iter = db.each_key + expect( iter ).to be_a( Enumerator ) + expect( iter.to_a.size ).to be( 3 ) + end + + it "can iterate through keys" do + rv = db.each_key.with_object([]){|k, acc| acc << k } + expect( db.each_key.to_a ).to eq( rv ) + end + + it "can iterate through values" do + rv = db.each_value.with_object([]){|v, acc| acc << v } + expect( rv ).to eq( %w[ 0-val 1-val 2-val ] ) + end + + it "can iterate through key/value pairs" do + expect( db.each_pair.to_a.first ).to eq([ "0", "0-val" ]) + expect( db.each_pair.to_a.last ).to eq([ "2", "2-val" ]) + end + end + end + + + context "serialization" do + + let( :db ) { + described_class.open( TEST_DATABASE.to_s ) + } + + after( :each ) do + db.close + end + + it "uses Marshalling as default" do + db.deserializer = nil + hash = { a_hash: true } + db[ 'test' ] = hash + expect( db['test'] ).to eq( Marshal.dump( hash ) ) + end + + it "can be disabled completely" do + db.serializer = nil + db.deserializer = nil + + db[ 'test' ] = "doot" + db[ 'test2' ] = [1,2,3].to_s + expect( db['test'] ).to eq( "doot" ) + expect( db['test2'] ).to eq( "[1, 2, 3]" ) + end + + it "can be arbitrarily changed" do + db.serializer = ->( v ) { JSON.generate(v) } + db.deserializer = ->( v ) { JSON.parse(v) } + + hash = { "a_hash" => true } + db[ 'test' ] = hash + expect( db['test'] ).to eq( hash ) + end + end end diff --git a/spec/mdbx/stats_spec.rb b/spec/mdbx/stats_spec.rb index f0f9635..53e96ac 100644 --- a/spec/mdbx/stats_spec.rb +++ 
b/spec/mdbx/stats_spec.rb @@ -28,30 +28,14 @@ RSpec.describe( MDBX::Database ) do expect( build[:options] ).to be_a( Hash ) end + it "returns readers in use" do + readers = stats[ :readers ] + expect( stats.dig(:environment, :readers_in_use) ).to eq( readers.size ) + expect( readers.first[:pid] ).to eq( $$ ) + end + + it "returns datafile attributes" do + expect( stats.dig(:environment, :datafile, :type) ).to eq( "dynamic" ) + end end - -__END__ -{:environment=> - {:pagesize=>4096, - :last_txnid=>125, - :last_reader_txnid=>125, - :maximum_readers=>122, - :readers_in_use=>1, - :datafile=> - {:size_current=>65536, - :pages=>16, - :type=>"dynamic", - :size_lower=>12288, - :size_upper=>1048576, - :growth_step=>65536, - :shrink_threshold=>131072}}, - :readers=> - [{:slot=>0, - :pid=>45436, - :thread=>34374651904, - :txnid=>0, - :lag=>0, - :bytes_used=>0, - :bytes_retired=>0}]} -}