Age | Commit message (Collapse) | Author |
|
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.
Implementation details:
* Core Set uses a modified version of `st_table`, named `set_table`.
than `s/st_/set_/`, the main difference is that the stored records
do not have values, making them 1/3 smaller. `st_table_entry` stores
`hash`, `key`, and `record` (value), while `set_table_entry` only
stores `hash` and `key`. This results in large sets using ~33% less
memory compared to stdlib Set. For small sets, core Set uses 12% more
memory (160 byte object slot and 64 malloc bytes, while stdlib set
uses 40 for Set and 160 for Hash). More memory is used because
the set_table is embedded and 72 bytes in the object slot are
currently wasted. Hopefully we can make this more efficient and have
it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print
methods, which were moved to `lib/pp.rb` (which is where the
pretty_print methods for other core classes are defined). As is
typical for core classes, internal calls call C functions and
not Ruby methods. For example, to check if something is a Set,
`rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
related object.
* Almost all methods use the same algorithm that the pure-Ruby
implementation used. The exception is when calling `Set#divide` with a
block with 2-arity. The pure-Ruby method used tsort to implement this.
I developed an algorithm that only allocates a single intermediate
hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it
is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in
`inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.
This includes a comprehensive benchmark suite for all public Set
methods. As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:
* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)
These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call. I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).
The rbs and repl_type_completor bundled gems will need updates to
support core Set. The pull request marks them as allowed failures.
This passes all set tests with no changes. The following specs
needed modification:
* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
object as both the first and second argument (this seems like an issue
with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` return
`true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
function).
* `Set#flatten_merge` protected method is not implemented.
Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.
This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting that switch to a core class. This does not move the spec
files, as I'm not sure how they should be handled.
Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.
For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table. To do that, include
internal/set_table.h. To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.
The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.
Implements [Feature #21216]
Co-authored-by: Jean Boussier <[email protected]>
Co-authored-by: Oliver Nutter <[email protected]>
|
|
https://github.com/ruby/set/commit/f88ecdef6b
|
|
Improved performance by ensuring that identical `Set` objects are
processed only once.
https://github.com/ruby/set/commit/cadb686e93
|
|
https://github.com/ruby/set/commit/1c3cded76a
|
|
|
|
https://github.com/ruby/set/commit/ea95c5a3d2
|
|
ref. https://github.com/ruby/ruby/pull/11453
https://github.com/ruby/set/commit/3cf6d11bd2
|
|
https://github.com/ruby/set/commit/d6cab5bcc8
|
|
https://github.com/ruby/set/commit/c63047c2ce
|
|
https://github.com/ruby/set/commit/32a9689696
|
|
https://github.com/ruby/set/commit/64dad673d8
|
|
https://github.com/ruby/set/commit/efc8c8c9f5
|
|
https://github.com/ruby/set/commit/bc59f85f2f
|
|
https://github.com/ruby/set/commit/d39b33f463
|
|
https://github.com/ruby/set/commit/ca1c9532a9
|
|
https://github.com/ruby/set/commit/becaca994d
|
|
remove redundant paragraph at set.rb
Notes:
Merged-By: hsbt <[email protected]>
|
|
Using the block param in a boolean context like this cause it to be
allocated.
Using it with an `if` or `unless` was optimized in 3.2
(https://github.com/ruby/ruby/pull/6286) but using it with `or`
or `and` wasn't.
```ruby
def foo(&block)
block or return 1
end
puts RubyVM::InstructionSequence.of(method(:foo)).disasm
== disasm: #<ISeq:foo@(irb):11 (11,0)-(13,3)> (catch: false)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: 0, kw: -1@-1, kwrest: -1])
[ 1] block@0<Block>
0000 getblockparam block@0, 0 ( 12)[LiCa]
0003 dup
0004 branchif 10
0006 pop
0007 putobject_INT2FIX_1_
0008 leave [Re]
0009 putnil
0010 leave
```
versus
```
def foo(&block)
return 1 if block
end
puts RubyVM::InstructionSequence.of(method(:foo)).disasm
== disasm: #<ISeq:foo@(irb):15 (15,0)-(17,3)> (catch: false)
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: 0, kw: -1@-1, kwrest: -1])
[ 1] block@0<Block>
0000 getblockparamproxy block@0, 0 ( 16)[LiCa]
0003 branchunless 7
0005 putobject_INT2FIX_1_
0006 leave ( 17)[Re]
0007 putnil ( 16)
0008 leave
```
https://github.com/ruby/set/commit/e89da977d4
|
|
|
|
https://github.com/ruby/set/commit/71a876ae81
|
|
https://github.com/ruby/set/commit/292baacb60
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/5563
|
|
https://github.com/ruby/set/commit/f467028cdb
|
|
https://github.com/ruby/set/commit/76b056c3b9
|
|
https://github.com/ruby/set/commit/dd787a3988
|
|
https://github.com/ruby/set/commit/35b69e9d69
|
|
https://github.com/ruby/set/commit/1a73ab9047
|
|
Implements [Feature #17838]
https://github.com/ruby/set/commit/d9b389bafa
|
|
https://github.com/ruby/set/commit/257dc452a7
|
|
https://github.com/ruby/set/commit/8f4c62768d
|
|
https://github.com/ruby/set/commit/254d927c8c
|
|
https://github.com/ruby/set/commit/ab81354de1
|
|
- Eliminate warnings
- Convert rdoc to markdown
|
|
- SortedSet has been removed for dependency and performance reasons.
- Set#join is added as a shorthand for `.to_a.join`.
- Set#<=> is added.
https://github.com/ruby/set/blob/v1.0.0/CHANGELOG.md
|
|
https://github.com/ruby/set/commit/447974a374
|
|
It required RBTree to perform decently and the external dependency was
not suitable for a standard library. The pure ruby fallback
implementation was originally meant to be an example of how to write a
subclass of Set, and its poor performance was not suitable for use in
production.
I decided it should be distributed as an external library instead of
bundling it with Set.
https://github.com/ruby/set/commit/dfcc8e568b
|
|
In Ruby 2.x, initialize_copy does not take a freeze option.
https://github.com/ruby/set/commit/3da6c309df
|
|
Fixes [Bug #15834]
Notes:
Merged: https://github.com/ruby/ruby/pull/3215
|
|
This freezes the clone even if the receiver is not frozen. It
is only for consistency with freeze: false not freezing the clone
even if the receiver is frozen.
Because Object#clone is now partially implemented in Ruby and
not fully implemented in C, freeze: nil must be supported to
provide the default behavior of only freezing the clone if the
receiver is frozen.
This requires modifying delegate and set, to set freeze: nil
instead of freeze: true as the keyword parameter for
initialize_clone. Those are the two libraries in stdlib that
override initialize_clone.
Implements [Feature #16175]
Notes:
Merged: https://github.com/ruby/ruby/pull/2969
|
|
This makes it possible to initialize_clone to correctly not freeze
internal state if the freeze: false keyword is passed to clone.
If clone is called with freeze: true or no keyword, do not pass
a second argument to initialize_clone to keep backwards
compatibility.
This makes it so that external libraries that override
initialize_clone but do not support the freeze keyword will fail
with ArgumentError if passing freeze: false to clone. I think that
is better than the current behavior, which succeeds but results in
an unfrozen object with frozen internals.
Fix related issues in set and delegate in stdlib.
Fixes [Bug #14266]
Notes:
Merged: https://github.com/ruby/ruby/pull/2816
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/2003
|
|
This removes the related tests, and puts the related specs behind
version guards. This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
Notes:
Merged: https://github.com/ruby/ruby/pull/2476
|
|
The first time SortedSet#initialize is called, it overwrites
itself, then recalls #initialize, which results in calling the
subclass's initialize, not the current initialize.
Just inline the default initialize behavior to avoid this issue.
No test for this as it can only be triggered the very first time
that SortedSet#initialize is called.
Fixes [Bug #15830]
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66875 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* In Enumerable, Enumerator::Lazy, Array, Hash and Set
[Feature #13784] [ruby-core:82285]
* Share specs for the various #select#select! methods and
reuse them for #filter/#filter!.
* Add corresponding filter tests for select tests.
* Update NEWS.
[Fix GH-1824]
From: Alexander Patrick <[email protected]>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@62575 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60881 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* lib/set.rb: [DOC] add examples for Set#replace,
add examples for creating a set from a hash with duplicates,
simplify and fix style of some other examples, fix typos.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60879 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
GitHub PR: https://github.com/ruby/ruby/pull/1752 [Fix GH-1752]
Submitted by: @Ana06 <[email protected]>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60824 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* lib/set.rb: [DOC] improve description and examples for Set#===.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60573 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
This method resets the internal state of a set after modification to
existing elements, reindexing and deduplicating them. [Feature #6589]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60360 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|