Age | Commit message | Author |
|
Set has been an autoloaded standard library since Ruby 3.2.
The standard library Set is less efficient than it could be, as it
uses Hash for storage, which stores unnecessary values for each key.
Implementation details:
* Core Set uses a modified version of `st_table`, named `set_table`.
Other than `s/st_/set_/`, the main difference is that the stored records
do not have values, making them 1/3 smaller. `st_table_entry` stores
`hash`, `key`, and `record` (value), while `set_table_entry` only
stores `hash` and `key`. This results in large sets using ~33% less
memory compared to stdlib Set. For small sets, core Set uses 12% more
memory (160 byte object slot and 64 malloc bytes, while stdlib set
uses 40 for Set and 160 for Hash). More memory is used because
the set_table is embedded and 72 bytes in the object slot are
currently wasted. Hopefully we can make this more efficient and have
it stored in an 80 byte object slot in the future.
* All methods are implemented as cfuncs, except the pretty_print
methods, which were moved to `lib/pp.rb` (which is where the
pretty_print methods for other core classes are defined). As is
typical for core classes, internal calls call C functions and
not Ruby methods. For example, to check if something is a Set,
`rb_obj_is_kind_of` is used, instead of calling `is_a?(Set)` on the
related object.
* Almost all methods use the same algorithm that the pure-Ruby
implementation used. The exception is when calling `Set#divide` with a
block with 2-arity. The pure-Ruby method used tsort to implement this.
I developed an algorithm that only allocates a single intermediate
hash and does not need tsort.
* The `flatten_merge` protected method is no longer necessary, so it
is not implemented (it could be).
* Similar to Hash/Array, subclasses of Set are no longer reflected in
`inspect` output.
* RDoc from stdlib Set was moved to core Set, with minor updates.
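As a quick check of the memory figures quoted above, the arithmetic can be reproduced in a few lines of Ruby. This is only a sketch: `percent_change` is a made-up helper name, and the 8-byte field size is an assumption about a typical 64-bit build, not something stated in this commit.

```ruby
# Hypothetical helper: percentage change going from `from` to `to`.
def percent_change(from, to)
  (to - from) * 100.0 / from
end

# Small sets, using the byte counts quoted in the commit message.
core_small   = 160 + 64   # object slot + malloc bytes
stdlib_small = 40 + 160   # Set object + Hash object
puts percent_change(stdlib_small, core_small)        # 12.0, i.e. 12% more

# Entry size, ASSUMING three vs. two 8-byte fields per entry.
st_entry  = 3 * 8   # hash, key, record
set_entry = 2 * 8   # hash, key
puts percent_change(st_entry, set_entry).round(1)    # -33.3, i.e. 1/3 smaller
```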
This includes a comprehensive benchmark suite for all public Set
methods. As you would expect, the native version is faster in the
vast majority of cases, and multiple times faster in many cases.
There are a few cases where it is significantly slower:
* Set.new with no arguments (~1.6x)
* Set#compare_by_identity for small sets (~1.3x)
* Set#clone for small sets (~1.5x)
* Set#dup for small sets (~1.7x)
These are slower as Set does not currently use the AR table
optimization that Hash does, so a new set_table is initialized for
each call. I'm not sure it's worth the complexity to have an AR
table-like optimization for small sets (for hashes it makes sense,
as small hashes are used everywhere in Ruby).
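For anyone who wants to observe these four cases locally, a minimal timing sketch might look like the following. It is not the benchmark suite included in the commit: `seconds_for` and `small_set_timings` are hypothetical helpers, and the absolute numbers will vary by machine and Ruby version.

```ruby
require "set"

# Hypothetical helper: wall-clock seconds taken to run the block `iterations` times.
def seconds_for(iterations)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Hypothetical helper: times the four operations listed above on a small set.
def small_set_timings(iterations = 100_000)
  small = Set[1, 2, 3]
  {
    "Set.new"                 => seconds_for(iterations) { Set.new },
    # Note: this timing also includes building the three-element set.
    "Set#compare_by_identity" => seconds_for(iterations) { Set[1, 2, 3].compare_by_identity },
    "Set#clone"               => seconds_for(iterations) { small.clone },
    "Set#dup"                 => seconds_for(iterations) { small.dup },
  }
end

small_set_timings.each do |label, seconds|
  puts format("%-24s %.4fs", label, seconds)
end
```

Running the same script under a Ruby with stdlib Set and one with core Set gives a rough side-by-side comparison.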
The rbs and repl_type_completor bundled gems will need updates to
support core Set. The pull request marks them as allowed failures.
This passes all set tests with no changes. The following specs
needed modification:
* Modifying frozen set error message (changed for the better)
* `Set#divide` when passed a 2-arity block no longer yields the same
object as both the first and second argument (this seems like an issue
with the previous implementation).
* Set-like objects that override `is_a?` such that `is_a?(Set)` returns
`true` are no longer treated as Set instances.
* `Set.allocate.hash` is no longer the same as `nil.hash`
* `Set#join` no longer calls `Set#to_a` (it calls the underlying C
function).
* `Set#flatten_merge` protected method is not implemented.
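To illustrate what the 2-arity form of `Set#divide` computes, here is a pure-Ruby sketch that groups elements without tsort. It is not the C algorithm from this commit: `divide_by_relation` is a hypothetical helper, and it assumes the block describes a symmetric relation, which the built-in method does not require.

```ruby
require "set"

# Hypothetical helper: splits `set` into groups of elements connected
# through the relation given by the block (ASSUMED to be symmetric).
def divide_by_relation(set)
  group_of = {}   # element => the Set holding its group
  set.each do |start|
    next if group_of.key?(start)
    group = Set[start]
    group_of[start] = group
    queue = [start]
    until queue.empty?
      current = queue.shift
      set.each do |other|
        next if group_of.key?(other)
        next unless yield(current, other)
        group_of[other] = group
        group << other
        queue << other
      end
    end
  end
  Set.new(group_of.values)   # duplicate references to a group collapse here
end

numbers = Set[1, 3, 4, 6, 9, 10, 11]
groups = divide_by_relation(numbers) { |a, b| (a - b).abs == 1 }
p groups
p groups == numbers.divide { |a, b| (a - b).abs == 1 }
```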
Previously, `set.rb` added a `SortedSet` autoload, which loads
`set/sorted_set.rb`. This replaces the `Set` autoload in `prelude.rb`
with a `SortedSet` autoload, but I recommend removing it and
`set/sorted_set.rb`.
This moves `test/set/test_set.rb` to `test/ruby/test_set.rb`,
reflecting the switch to a core class. This does not move the spec
files, as I'm not sure how they should be handled.
Internally, this uses the st_* types and functions as much as
possible, and only adds set_* types and functions as needed.
The underlying set_table implementation is stored in st.c, but
there is no public C-API for it, nor is there one planned, in
order to keep the ability to change the internals going forward.
For internal uses of st_table with Qtrue values, those can
probably be replaced with set_table. To do that, include
internal/set_table.h. To handle symbol visibility (rb_ prefix),
internal/set_table.h uses the same macro approach that
include/ruby/st.h uses.
The Set class (rb_cSet) and all methods are defined in set.c.
There isn't currently a C-API for the Set class, though C-API
functions can be added as needed going forward.
Implements [Feature #21216]
Co-authored-by: Jean Boussier <[email protected]>
Co-authored-by: Oliver Nutter <[email protected]>
|
|
https://github.com/ruby/pp/commit/efe5bc878f
|
|
The array allocation was because the keyword splat expression is
not recognized as safe by the compiler. Also avoid unnecessary
>= method call per element. This uses a private constant to
avoid unnecessary work at runtime.
I assume the only reason this code is needed is because v may
end with a ruby2_keywords hash that we do not want to treat as
keywords.
This issue was found by the performance warning in Ruby feature
21274.
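The idea of building a value once and keeping it in a private constant, rather than recomputing it for every element, can be sketched as follows. This is an illustration only, not the code from the linked commit: the class `ElementYielder`, the constant `EMPTY_KWARGS`, and the method `each_element` are hypothetical names.

```ruby
class ElementYielder
  # Built once when the class is loaded rather than on every call.
  # Hypothetical name; frozen so it can be shared safely.
  EMPTY_KWARGS = {}.freeze
  private_constant :EMPTY_KWARGS

  # Yields each element of `list`, passing an explicit (empty) keyword splat.
  def each_element(list)
    kwargs = EMPTY_KWARGS
    list.each { |*values| yield(*values, **kwargs) }
  end
end

collected = []
ElementYielder.new.each_element([1, "two", :three]) { |element| collected << element }
p collected
```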
https://github.com/ruby/pp/commit/3bf6df0e5c
|
|
(https://github.com/ruby/pp/pull/38)
https://github.com/ruby/pp/commit/5b5d483ac2
|
|
https://github.com/ruby/pp/commit/979f9d972d
|
|
https://github.com/ruby/pp/commit/3e4b7c03b0
Co-authored-by: Nobuyoshi Nakada <[email protected]>
|
|
https://github.com/ruby/pp/commit/6d9c0f255a
|
|
https://github.com/ruby/pp/commit/e787cd9139
|
|
https://github.com/ruby/pp/commit/dbf177d0fc
|
|
https://github.com/ruby/pp/commit/812933668d
|
|
|
|
https://github.com/ruby/pp/commit/af2229e8e6
|
|
Right now attempting to pretty print a BasicObject or any other
object lacking a few core Object methods will result in an error
```
Error: test_basic_object(PPTestModule::PPInspectTest): NoMethodError: undefined method `is_a?' for an instance of BasicObject
lib/pp.rb:192:in `pp'
lib/pp.rb:97:in `block in pp'
lib/pp.rb:158:in `guard_inspect_key'
lib/pp.rb:97:in `pp'
test/test_pp.rb:131:in `test_basic_object'
128:
129: def test_basic_object
130: a = BasicObject.new
=> 131: assert_match(/\A#<BasicObject:0x[\da-f]+>\n\z/, PP.pp(a, ''.dup))
132: end
133: end
134:
```
With some fairly small changes we can fall back to `Object#inspect`,
which is better than an error.
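One way such a fallback can work, sketched under the assumption that the object may not respond to `inspect` at all, is to bind `Kernel#inspect` to it directly. The helper name `inspect_or_fallback` is hypothetical and this is not necessarily how the linked commit does it.

```ruby
# Hypothetical helper: inspect any object, even one without #inspect.
def inspect_or_fallback(obj)
  obj.inspect
rescue NoMethodError
  # Kernel#inspect can be bound to objects that do not include Kernel.
  Kernel.instance_method(:inspect).bind_call(obj)
end

puts inspect_or_fallback([1, 2])
puts inspect_or_fallback(BasicObject.new)   # something like #<BasicObject:0x...>
```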
https://github.com/ruby/pp/commit/4e9f6c2de0
|
|
[Bug #20808]
The previous implementation assumed all members are accessible,
but it's possible for users to change the visibility of members or
to entirely remove the accessor.
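The situation can be reproduced with a small example. This sketch assumes a Struct for illustration; `Reading` and `member_pairs` are hypothetical names, and reading values by member name is just one way to avoid depending on the accessors.

```ruby
# A Struct whose accessors have been tampered with (illustrative).
Reading = Struct.new(:sensor, :value) do
  private :sensor       # accessor still exists, but cannot be called from outside
  undef_method :value   # accessor removed entirely
end

# Hypothetical helper: collect member/value pairs without calling accessors.
def member_pairs(struct)
  struct.members.map { |name| [name, struct[name]] }
end

reading = Reading.new("t1", 21.5)
p member_pairs(reading)   # [[:sensor, "t1"], [:value, 21.5]]
```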
https://github.com/ruby/pp/commit/fb19501434
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/10924
|
|
This method prints a single pair of a hash, apart from the Hash
construct itself, which makes pretty printing of Hash easier to extend.
https://github.com/ruby/pp/commit/3fcf2d1142
|
|
So that the `pp` method can work in inherited classes with that
class.
https://github.com/ruby/pp/commit/f204df3aad
|
|
Instead of displaying the start of the range as nil
https://github.com/ruby/pp/commit/1df210d903
|
|
anonymous
* It would be "#<data  a=42>" (double space) instead of "#<data a=42>" (like #inspect).
https://github.com/ruby/pp/commit/bed72bfcb8
|
|
* Data#members might not be defined, instead it might be defined
on Data subclasses or a module included there. This is notably the
case on TruffleRuby which defines it there for optimization purposes.
In fact the mere presence of Data#members implies a megamorphic call
inside, so it seems best to avoid relying on its existence.
https://github.com/ruby/pp/commit/6a97d36fbb
|
|
https://github.com/ruby/pp/commit/ed602b9f2b
|
|
https://github.com/ruby/pp/commit/6e086e6df9
|
|
* Remove mention of `require 'pp'` for `pretty_inspect`
* Mention the need to add `require 'pp'` to customize
`#pretty_print(pp)` method
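For reference, a customized `pretty_print` method might look like this minimal sketch. The `Point` class is a made-up example; `group`, `breakable`, `comma_breakable`, `text`, and `pp` are methods of the PP object passed in.

```ruby
require "pp"

class Point
  def initialize(x, y)
    @x = x
    @y = y
  end

  # Called by pp with a PP instance (q).
  def pretty_print(q)
    q.group(1, "#<Point", ">") do
      q.breakable
      q.text "x="
      q.pp @x
      q.comma_breakable
      q.text "y="
      q.pp @y
    end
  end
end

pp Point.new(1, 2)   # prints #<Point x=1, y=2>
```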
|
|
https://github.com/ruby/pp/commit/3d0e65e79f
|
|
https://github.com/ruby/pp/commit/343a20d721
|
|
https://github.com/ruby/pp/commit/cad3cc762c
|
|
The use of `etc.so` here requires that etc is always implemented
as a C extension on-disk. However at least one impl – JRuby –
currently implements it as an internal extension, loaded via a
Ruby script. This require should simply use the base name of the
library, `etc`, to allow Ruby-based implementations to load as
well.
https://github.com/ruby/pp/commit/2061f994e0
|
|
This class does not exist in any implementation except CRuby.
I would recommend moving this code somewhere else, like a separate
file loaded only on CRuby or into CRuby itself. For now this
change is sufficient to load the library on other implementations.
https://github.com/ruby/pp/commit/7d5a220f64
|
|
According to nobu, Errno::EBADF is raised on Windows.
|
|
The error is raised on Solaris
http://rubyci.s3.amazonaws.com/solaris10-gcc/ruby-master/log/20211130T030003Z.fail.html.gz
```
1) Failure:
TestRubyOptions#test_require [/export/home/users/chkbuild/cb-gcc/tmp/build/20211130T030003Z/ruby/test/ruby/test_rubyoptions.rb:265]:
pid 7386 exit 1
| /export/home/users/chkbuild/cb-gcc/tmp/build/20211130T030003Z/ruby/lib/pp.rb:67:in `winsize': Invalid argument - <STDOUT> (Errno::EINVAL)
```
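A defensive way to query the terminal width, sketched here with a hypothetical helper `width_or_default`, is to rescue the whole family of system call errors rather than individual `Errno` classes. This is an illustration under that assumption, not a copy of the code in `lib/pp.rb`.

```ruby
require "stringio"

# Hypothetical helper: terminal width of `io`, or `default` when it
# cannot be determined.
def width_or_default(io, default = 80)
  require "io/console"
  _rows, columns = io.winsize
  columns
rescue LoadError, NoMethodError, SystemCallError
  # io/console unavailable, `io` has no #winsize, or an Errno::* error
  default
end

puts width_or_default(StringIO.new)   # 80, since StringIO has no #winsize
puts width_or_default($stdout)
```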
|
|
[Feature #12913]
|
|
https://github.com/ruby/pp/commit/3ee131ae92
|
|
Before:
```
$ ri sharing_detection=
= .sharing_detection=
(from ruby core)
=== Implementation from PP
------------------------------------------------------------------------
sharing_detection=(b)
------------------------------------------------------------------------
Returns the sharing detection flag as a boolean value. It is false by
default.
```
After:
```
$ ri sharing_detection=
= .sharing_detection=
(from ruby core)
=== Implementation from PP
------------------------------------------------------------------------
sharing_detection=(b)
------------------------------------------------------------------------
Sets the sharing detection flag to b.
```
|
|
`@sharing_detection` is the only obstruction to supporting pp on
non-main ractors, so make it ractor-local.
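A ractor-local flag can be sketched with `Ractor.current[]`, which stores a value per ractor. The module `PrinterSettings` and its key are hypothetical stand-ins, and the sketch assumes Ruby 3.0 or later, where `Ractor` is available.

```ruby
module PrinterSettings   # hypothetical stand-in, not the PP module
  KEY = :printer_settings_sharing_detection

  # Reads the flag for the current ractor (nil, hence falsy, until set).
  def self.sharing_detection
    Ractor.current[KEY]
  end

  # Sets the flag for the current ractor only.
  def self.sharing_detection=(flag)
    Ractor.current[KEY] = flag
  end
end

PrinterSettings.sharing_detection = true
p PrinterSettings.sharing_detection
```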
Notes:
Merged: https://github.com/ruby/ruby/pull/3973
|
|
This causes problems because the hash is passed to a block not
accepting keywords. Because the hash is empty and keyword flagged,
it is removed before calling the block. This doesn't cause an
ArgumentError because it is a block and not a lambda. Just like
any other block not passed required arguments, arguments not
passed are set to nil.
Issues like this are a strong reason not to have ruby2_keywords
by default.
Fixes [Bug #16519]
Notes:
Merged: https://github.com/ruby/ruby/pull/2855
|
|
Fixes [Bug #13144]
Co-Authored-By: Nobuyoshi Nakada <[email protected]>
|
|
This removes the related tests, and puts the related specs behind
version guards. This affects all code in lib, including some
libraries that may want to support older versions of Ruby.
Notes:
Merged: https://github.com/ruby/ruby/pull/2476
|
|
We track recursion in order to not infinite loop in ==, inspect, and
similar methods by keeping a thread-local 1 or 2 level hash. This allows
us to track when we have seen the same object (ex. using inspect) or
same pair of objects (ex. using ==) in this stack before and to treat
that differently.
Previously both levels of this Hash used the object's memory_id as a key
(using object_id would be slow and wasteful). Unfortunately, prettyprint
(pp.rb) uses this thread local variable to "pretend" to be inspect and
inherit its same recursion behaviour.
This commit changes the top-level hash to be an identity hash and to use
objects as keys instead of their object_ids.
I'd like to have also converted the 2nd level hash to an ident hash, but
it would have prevented an optimization which avoids allocating a 2nd
level hash for only a single element, which we want to keep because it's
by far the most common case.
So the new format of this hash is:
{ object => true } (not paired)
{ lhs_object => rhs_object_memory_id } (paired, single object)
{ lhs_object => { rhs_object_memory_id => true, ... } } (paired, many objects)
We must also update pp.rb to match this (using identity hashes).
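The unpaired case can be sketched at the Ruby level as follows. This is a simplified illustration, not the C implementation: `SEEN_KEY`, `guard_recursion`, and `render` are hypothetical names, and only the `{ object => true }` form is modelled.

```ruby
# Hypothetical key and helpers, for illustration only.
SEEN_KEY = :__sketch_recursion_guard__

# Yields true if `obj` is already being processed further up the stack,
# false otherwise. Objects are tracked by identity, not by object_id.
def guard_recursion(obj)
  seen = (Thread.current[SEEN_KEY] ||= {}.compare_by_identity)
  return yield(true) if seen.key?(obj)
  seen[obj] = true
  begin
    yield(false)
  ensure
    seen.delete(obj)
  end
end

# Renders nested arrays, replacing a recursive reference with "[...]".
def render(obj)
  guard_recursion(obj) do |recursive|
    next "[...]" if recursive
    if obj.is_a?(Array)
      "[" + obj.map { |element| render(element) }.join(", ") + "]"
    else
      obj.inspect
    end
  end
end

list = [1, 2]
list << list
puts render(list)   # [1, 2, [...]]
```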
Notes:
Merged: https://github.com/ruby/ruby/pull/2644
|
|
Related to [Feature #15955].
|
|
[ruby-core:87945] [Feature #14912]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@67586 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
`pp(1..)` should print `"(1..)"` instead of `"(1..nil)"`.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66143 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* ast.c (rb_ast_node_type): simplified to return a Symbol without
"NODE_" prefix.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66142 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@66140 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
- Followup of https://bugs.ruby-lang.org/issues/14123
From: Prathamesh Sonpatki <[email protected]>
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61310 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Because there is now the same guard in prelude.rb (alias pp pp).
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61111 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61082 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* lib/pp.rb (pp): move pp alias before its rdoc, not to prevent
parsing.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61080 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
Avoid a race condition in which a context switch
occurs after replacing Kernel#pp but before
defining the PP class.
The following patch, which inserts a sleep, makes
this problem reproducible.
```
Index: lib/pp.rb
===================================================================
--- lib/pp.rb (revision 60960)
+++ lib/pp.rb (working copy)
@@ -26,6 +26,7 @@ module Kernel
end
undef __pp_backup__ if method_defined?(:__pp_backup__)
module_function :pp
+ sleep 1 # thread context switch
end
##
```
With the above patch, "uninitialized constant Kernel::PP" can
happen as follows.
```
% ./ruby -w -Ilib -e '
t1 = Thread.new {
Thread.current.report_on_exception = true
pp :foo1
}
t2 = Thread.new {
Thread.current.report_on_exception = true
sleep 0.5
pp :foo2
}
t1.join rescue nil
t2.join rescue nil
'
#<Thread:0x000055dbf926eaa0@-e:6 run> terminated with exception:
Traceback (most recent call last):
3: from -e:9:in `block in <main>'
2: from /home/ruby/tst2/ruby/lib/pp.rb:22:in `pp'
1: from /home/ruby/tst2/ruby/lib/pp.rb:22:in `each'
/home/ruby/tst2/ruby/lib/pp.rb:23:in `block in pp': uninitialized constant Kernel::PP (NameError)
:foo1
```
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60961 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60948 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
[Feature #14123]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60944 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|