Age | Commit message | Author |
|
Fixes [Bug #21201]
This change addresses a performance regression where defining methods
inside `refine` blocks caused severe slowdowns. The issue was due to
`rb_clear_all_refinement_method_cache()` triggering a full object
space scan via `rb_objspace_each_objects` to find and invalidate
affected callcaches, which is very inefficient.
To fix this, I introduce `vm->cc_refinement_table` to track
callcaches related to refinements. This allows us to invalidate
only the necessary callcaches without scanning the entire heap,
resulting in significant performance improvement.
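The difference between the two invalidation strategies can be sketched with a toy model in Ruby. This is not the VM's C implementation: `ToyVM`, `CallCache`, and every method name below are hypothetical, and a plain array stands in for the object space. It only illustrates why a dedicated table avoids visiting unrelated objects.

```ruby
require "set"

# Hypothetical stand-in for a callcache; compared by identity.
class CallCache
  attr_reader :name, :refinement
  attr_accessor :valid

  def initialize(name, refinement)
    @name = name
    @refinement = refinement
    @valid = true
  end
end

class ToyVM
  def initialize
    @heap = []                      # stands in for the whole object space
    @cc_refinement_table = Set.new  # only caches related to refinements
  end

  def allocate(obj)
    @heap << obj
    obj
  end

  def new_call_cache(name, refinement: false)
    cc = allocate(CallCache.new(name, refinement))
    @cc_refinement_table << cc if refinement
    cc
  end

  # Old strategy: visit every object to find the caches to invalidate.
  def invalidate_by_full_scan
    visited = 0
    @heap.each do |obj|
      visited += 1
      obj.valid = false if obj.is_a?(CallCache) && obj.refinement
    end
    visited
  end

  # New strategy: visit only the tracked caches.
  def invalidate_by_table
    @cc_refinement_table.each { |cc| cc.valid = false }
    @cc_refinement_table.size
  end
end

vm = ToyVM.new
1_000.times { |i| vm.allocate("object #{i}") }
plain   = vm.new_call_cache(:plain)
refined = vm.new_call_cache(:refined, refinement: true)

scan_visits = vm.invalidate_by_full_scan  # walks every object on the toy heap
refined.valid = true                      # reset so the second strategy has work to do
table_visits = vm.invalidate_by_table     # walks only the tracked cache
```

In this model the full scan grows with the heap size, while the table walk grows only with the number of refinement-related caches.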
Notes:
Merged: https://github.com/ruby/ruby/pull/13077
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13458
|
|
Whenever we run into an inline cache miss when we try to set
an ivar, we may need to take the global lock, just to be able to
look up inside `shape->edges`.
To solve that, when we're in multi-ractor mode, we can treat
the `shape->edges` as immutable. When we need to add a new
edge, we first copy the table, and then replace it with
CAS.
This increases memory allocations; however, we expect that
creating new transitions becomes increasingly rare over time.
```ruby
class A
  def initialize(bool)
    @a = 1
    if bool
      @b = 2
    else
      @c = 3
    end
  end

  def test
    @d = 4
  end
end

def bench(iterations)
  i = iterations
  while i > 0
    A.new(true).test
    A.new(false).test
    i -= 1
  end
end

if ARGV.first == "ractor"
  ractors = 8.times.map do
    Ractor.new do
      bench(20_000_000 / 8)
    end
  end
  ractors.each(&:take)
else
  bench(20_000_000)
end
```
The above benchmark takes 27 seconds in Ractor mode on Ruby 3.4,
and only 1.7s with this branch.
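The copy-then-CAS update of an immutable edge table can be sketched in Ruby as follows. This is a toy model under stated assumptions: `AtomicRef` and `ToyShape` are hypothetical names, and a `Mutex` emulates the atomic compare-and-swap that the C code would perform directly on a pointer.

```ruby
# Hypothetical atomic reference; the Mutex only emulates an atomic load and CAS.
class AtomicRef
  def initialize(value)
    @value = value
    @lock = Mutex.new
  end

  def get
    @lock.synchronize { @value }
  end

  # Swap in new_value only if the current value is still the object we read.
  def compare_and_set(expected, new_value)
    @lock.synchronize do
      return false unless @value.equal?(expected)
      @value = new_value
      true
    end
  end
end

class ToyShape
  def initialize
    @edges = AtomicRef.new({}.freeze)
  end

  # A published table is frozen, so a reader can use it without further locking.
  def lookup(ivar)
    @edges.get[ivar]
  end

  def add_edge(ivar)
    loop do
      current = @edges.get
      existing = current[ivar]
      return existing if existing

      child = ToyShape.new
      updated = current.merge(ivar => child).freeze  # copy, never mutate in place
      return child if @edges.compare_and_set(current, updated)
      # Another thread published a newer table first: retry against it.
    end
  end
end

root = ToyShape.new
results = 8.times.map do
  Thread.new { [:@a, :@b, :@c].map { |ivar| root.add_edge(ivar) } }
end.map(&:value)
```

Every thread ends up with the same child shape for a given ivar, because a losing CAS discards its copy and re-reads the published table.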
Co-Authored-By: Étienne Barrié <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/13441
|
|
* Added `Ractor::Port`
  * `Ractor::Port#receive` (support multi-threads)
  * `Ractor::Port#close`
  * `Ractor::Port#closed?`
* Added some methods
  * `Ractor#join`
  * `Ractor#value`
  * `Ractor#monitor`
  * `Ractor#unmonitor`
* Removed some methods
  * `Ractor#take`
  * `Ractor.yield`
* Change the spec
  * `Ractor.select`
You can wait for multiple sequences of messages with `Ractor::Port`.
```ruby
ports = 3.times.map{ Ractor::Port.new }
ports.map.with_index do |port, ri|
  Ractor.new port,ri do |port, ri|
    3.times{|i| port << "r#{ri}-#{i}"}
  end
end
p ports.each{|port| pp 3.times.map{port.receive}}
```
In this example, we use 3 ports, and 3 Ractors send messages to them respectively.
We can receive a series of messages from each port.
You can use `Ractor#value` to get the last value of a Ractor's block:
```ruby
result = Ractor.new do
  heavy_task()
end.value
```
You can wait for the termination of a Ractor with `Ractor#join` like this:
```ruby
Ractor.new do
  some_task()
end.join
```
`#value` and `#join` are similar to `Thread#value` and `Thread#join`.
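For comparison, this is how the long-standing `Thread` counterparts behave. The snippet uses only `Thread`, so it runs on Ruby versions without the new Ractor methods; the variable names are illustrative.

```ruby
# Thread#value waits for the thread and returns the last value of its block.
worker = Thread.new { 6 * 7 }
answer = worker.value

# Thread#join only waits for termination and returns the Thread itself.
done = []
joined = Thread.new { done << :finished }.join
```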
To implement `#join`, `Ractor#monitor` (and `Ractor#unmonitor`) are introduced.
This commit changes the `Ractor.select()` method.
It now only accepts ports or Ractors, and returns when a port receives a message or a Ractor terminates.
We remove `Ractor.yield` and `Ractor#take` because:
* `Ractor::Port` supports most of similar use cases in a simpler manner.
* Removing them significantly simplifies the code.
We also change the internal thread scheduler code (thread_pthread.c):
* During barrier synchronization, we keep the `ractor_sched` lock to avoid deadlocks.
This lock is released by `rb_ractor_sched_barrier_end()`
which is called at the end of operations that require the barrier.
* Fix potential deadlock issues by checking interrupts just before setting UBF.
https://bugs.ruby-lang.org/issues/21262
Notes:
Merged: https://github.com/ruby/ruby/pull/13445
|
|
We don't free the method table for FrozenCore since it is converted to
an iclass and doesn't have the iclass_is_origin flag set. This causes a
memory leak to be reported during RUBY_FREE_AT_EXIT:
14 dyld 0x19f13ab98 start + 6076
13 miniruby 0x100644928 main + 96 main.c:62
12 miniruby 0x10064498c rb_main + 48 main.c:42
11 miniruby 0x10073be0c ruby_init + 16 eval.c:98
10 miniruby 0x10073bc6c ruby_setup + 232 eval.c:87
9 miniruby 0x100786b98 rb_call_inits + 168 inits.c:63
8 miniruby 0x1009b5010 Init_VM + 212 vm.c:4017
7 miniruby 0x10067aae8 rb_class_new + 44 class.c:834
6 miniruby 0x10067a04c rb_class_boot + 48 class.c:748
5 miniruby 0x10067a250 class_initialize_method_table + 32 class.c:721
4 miniruby 0x1009412a8 rb_id_table_create + 24 id_table.c:98
3 miniruby 0x100759fac ruby_xmalloc + 24 gc.c:5201
2 miniruby 0x10075fc14 ruby_xmalloc_body + 52 gc.c:5211
1 miniruby 0x1007726b4 rb_gc_impl_malloc + 92 default.c:8141
0 libsystem_malloc.dylib 0x19f30d12c _malloc_zone_malloc_instrumented_or_legacy + 152
Notes:
Merged: https://github.com/ruby/ruby/pull/13457
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13439
|
|
[Bug #21352]
`rb_objspace_free_objects` may need to check objects' shapes
to know how to free them.
|
|
This commit allows building YJIT and ZJIT simultaneously, a "combo
build". Previously, `./configure --enable-yjit --enable-zjit` failed. At
runtime, though, only one of the two can be enabled at a time.
Add a root Cargo workspace that contains both the yjit and zjit crates.
The common Rust build integration mechanisms are factored out into
defs/jit.mk.
Combo YJIT+ZJIT dev builds are supported; if either JIT uses
`--enable-*=dev`, both of them are built in dev mode.
The combo build requires Cargo, but building one JIT at a time with only
rustc in release build remains supported.
Notes:
Merged: https://github.com/ruby/ruby/pull/13262
|
|
in same ractor
Rework ractors so that any ractor action (Ractor.receive, Ractor#send, Ractor.yield, Ractor#take,
Ractor.select) operates on the thread that called the action. A blocking action puts that thread to
sleep when it has to wait, and the awakening action (Ractor.yield, Ractor#send) wakes up the
blocked thread.
Before this change, every blocking ractor action was associated with the ractor struct and its fields.
If a ractor called Ractor.receive, its wait status was wait_receiving, and when another ractor called
r.send on it, the sender looked for that status in the ractor struct fields and woke it up. This breaks
down when 2 threads call blocking ractor actions in the same ractor. Imagine that 1 thread has called
Ractor.receive and another r.take. When a different ractor calls r.send, it doesn't know which Ruby
thread is associated with which ractor action, so which Ruby thread should it schedule? This change
moves some fields onto the Ruby thread itself, so that Ruby threads are the ones that hold ractor
blocking statuses, and threads are then specifically scheduled when unblocked.
Fixes [#17624]
Fixes [#21037]
Notes:
Merged: https://github.com/ruby/ruby/pull/12633
|
|
- `rb_thread_fd_close` is deprecated and now a no-op.
- IO operations (including close) no longer take a vm-wide lock.
Notes:
Merged-By: ioquatix <[email protected]>
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13283
|
|
Ivars will no longer be the only thing stored inline
via shapes, so keeping the `iv_index` and `ivptr` names
would be confusing.
Instance variables won't be the only thing stored inline
via shapes, so keeping the `ivptr` name would be confusing.
`field` encompasses anything that can be stored in a VALUE array.
Similarly, `gen_ivtbl` becomes `gen_fields_tbl`.
Notes:
Merged: https://github.com/ruby/ruby/pull/13159
|
|
Now that we have a hash-set implementation we can use that
instead of a hash-table with a static value.
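The same idea can be shown at the Ruby level, as an analogy only: the commit concerns the VM's internal C tables, and the variable names below are illustrative.

```ruby
require "set"

# Hash-table with a static value: the value carries no information.
seen_table = {}
seen_table[:foo] = true
seen_table[:bar] = true

# Hash-set: membership is all that is stored.
seen_set = Set.new
seen_set << :foo
seen_set << :bar

same_members = seen_table.keys.sort == seen_set.to_a.sort
```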
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13142
|
|
(https://github.com/Shopify/zjit/pull/99)
* Disable ZJIT profiling at call-threshold
* Stop referencing ZJIT instructions in codegen
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
* Add --zjit-profile-interval option
* Fix min to max
* Avoid rewriting instructions for --zjit-call-threshold=1
* Rename the option to --zjit-num-profiles
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
(https://github.com/Shopify/zjit/pull/39)
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
(https://github.com/Shopify/zjit/pull/30)
* Implement FixnumAdd and stub PatchPoint/GuardType
Co-authored-by: Max Bernstein <[email protected]>
Co-authored-by: Maxime Chevalier-Boisvert <[email protected]>
* Clone Target for arm64
* Use $create instead of use create
Co-authored-by: Alan Wu <[email protected]>
* Fix misindentation from suggested changes
* Drop an unneeded variable for mut
* Load operand into a register only if necessary
---------
Co-authored-by: Max Bernstein <[email protected]>
Co-authored-by: Maxime Chevalier-Boisvert <[email protected]>
Co-authored-by: Alan Wu <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
(https://github.com/Shopify/zjit/pull/16)
* Add zjit_* instructions to profile the interpreter
* Rename FixnumPlus to FixnumAdd
* Update a comment about Invalidate
* Rename Guard to GuardType
* Rename Invalidate to PatchPoint
* Drop unneeded debug!()
* Plan on profiling the types
* Use the output of GuardType as type refined outputs
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
As a preparation for introducing a profiling layer, we need to be able
to raise the threshold to run a few cycles for profiling.
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/13131
|
|
This implements a hash set which is wait-free for lookup and lock-free
for insert (unless resizing) to use for fstring de-duplication.
As highlighted in https://bugs.ruby-lang.org/issues/19288, heavy use of
fstrings (frozen interned strings) can significantly reduce the
parallelism of Ractors.
I tried a few other approaches first: using an RWLock, striping a series
of RWlocks (partitioning the hash N-ways to reduce lock contention), and
putting a cache in front of it. All of these improved the situation, but
were unsatisfying as all still required locks for writes (and granular
locks are awkward, since we run the risk of needing to reach a vm
barrier) and this table is somewhat write-heavy.
My main reference for this was Cliff Click's talk on a lock-free
hash table for Java https://www.youtube.com/watch?v=HJ-719EGIts. It
turns out this lock-free hash set is made easier to implement by a few
properties:
* We only need a hash set rather than a hash table (we only need keys,
not values), and so the full entry can be written as a single VALUE
* As a set we only need lookup/insert/delete, no update
* Delete is only run inside GC so does not need to be atomic (It could
be made concurrent)
* I use rb_vm_barrier for the (rare) table rebuilds (It could be made
concurrent) We VM lock (but don't require other threads to stop) for
table rebuilds, as those are rare
* The conservative garbage collector makes deferred replication easy,
using a T_DATA object
Another benefit of having a table specific to fstrings is that we
compare by value on lookup/insert, but by identity on delete, as we only
want to remove the exact string which is being freed. This is faster and
provides a second way to avoid the race condition in
https://bugs.ruby-lang.org/issues/21172.
This is a pretty standard open-addressing hash table with quadratic
probing. Similar to our existing st_table or id_table. Deletes (which
happen on GC) replace existing keys with a tombstone, which is the only
type of update which can occur. Tombstones are only cleared out on
resize.
Unlike st_table, the VALUEs are stored in the hash table itself
(st_table's bins) rather than as a compact index. This avoids an extra
pointer dereference and is possible because we don't need to preserve
insertion order. The table targets a load factor of 0.5 (it is enlarged
once it is half full).
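The table layout described above can be sketched as a single-threaded toy in Ruby. It models only the open addressing, quadratic probing, tombstones, and the value-versus-identity comparison; it has none of the atomics of the real C implementation, and `ToyFstringSet` and its method names are hypothetical.

```ruby
class ToyFstringSet
  EMPTY = Object.new.freeze
  TOMBSTONE = Object.new.freeze

  attr_reader :capacity

  def initialize(capacity = 8)
    @capacity = capacity  # kept a power of two so probing reaches every slot
    @slots = Array.new(capacity, EMPTY)
    @used = 0             # live keys plus tombstones
  end

  # Lookup/insert compares by value: return the stored equal string, or store str.
  def find_or_insert(str)
    resize if (@used + 1) * 2 > @capacity  # enlarge once the table is half full
    each_probe(str.hash) do |idx|
      slot = @slots[idx]
      if slot.equal?(EMPTY)
        @slots[idx] = str
        @used += 1
        return str
      elsif !slot.equal?(TOMBSTONE) && slot == str
        return slot
      end
    end
  end

  # Delete compares by identity: only the exact object passed in is removed.
  def delete(str)
    each_probe(str.hash) do |idx|
      slot = @slots[idx]
      return false if slot.equal?(EMPTY)
      if slot.equal?(str)
        @slots[idx] = TOMBSTONE
        return true
      end
    end
    false
  end

  private

  # Quadratic probing with triangular offsets: 0, 1, 3, 6, ...
  def each_probe(hash)
    mask = @capacity - 1
    idx = hash & mask
    @capacity.times do |step|
      yield idx
      idx = (idx + step + 1) & mask
    end
  end

  # Rebuild into a table twice the size; tombstones are dropped here.
  def resize
    live = @slots.reject { |s| s.equal?(EMPTY) || s.equal?(TOMBSTONE) }
    @capacity *= 2
    @slots = Array.new(@capacity, EMPTY)
    @used = 0
    live.each { |s| find_or_insert(s) }
  end
end

set = ToyFstringSet.new
a = String.new("hello")
b = String.new("hello")             # equal to a, but a different object
first  = set.find_or_insert(a)
second = set.find_or_insert(b)      # finds a by value
removed_copy = set.delete(b)        # b itself was never stored
removed_orig = set.delete(a)
after_delete = set.find_or_insert(b)
keys = 20.times.map { |i| set.find_or_insert(String.new("s#{i}")) }
```

Because delete matches on identity, freeing one of two equal strings cannot evict the other one from the table.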
Notes:
Merged: https://github.com/ruby/ruby/pull/12921
|
|
It looks like stat_insn_usage was introduced with YARV, but as far as I
can tell the field has never been used. I think we should remove the
field since we don't use it.
Notes:
Merged: https://github.com/ruby/ruby/pull/13100
|
|
Proc objects are now traversed like other objects when making them
shareable.
Fixes [Bug #19372]
Fixes [Bug #19374]
Notes:
Merged: https://github.com/ruby/ruby/pull/12977
|
|
Previously, vm_make_env_each() (used during proc
creation and for the debug inspector C API) picked up the
non-GC-allocated iseq that rb_vm_push_frame_fname() creates,
which led to a SEGV when the GC tried to mark the non GC object.
Put a real iseq imemo instead. Speed should be about the same since
the old code also did an imemo allocation and a malloc allocation.
A real iseq allows ironing out the special-casing of dummy frames in
rb_execution_context_mark() and rb_execution_context_update(). A check
is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy
iseqs.
[Bug #21180]
Co-authored-by: Aaron Patterson <[email protected]>
Notes:
Merged: https://github.com/ruby/ruby/pull/12898
|
|
Also, Binding#local_variable_get and #local_variable_set reject
access to numbered parameters.
[Bug #20965] [Bug #21049]
Notes:
Merged: https://github.com/ruby/ruby/pull/12746
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12740
|
|
We can just always return the jit_entry since it will be initialized to
NULL. There is no reason to specifically return NULL if yjit / rjit are
disabled
Notes:
Merged: https://github.com/ruby/ruby/pull/12729
|
|
We shouldn't directly set the flags of an object because there could be
other flags set that would be erased. Instead, we can unset T_MASK and
set T_ICLASS instead.
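The bit manipulation can be illustrated with plain integers in Ruby. The numeric values below are assumptions chosen for illustration (they mirror the type tags I believe CRuby uses, but nothing here depends on that), and `other_flag` is a hypothetical unrelated flag bit.

```ruby
t_module   = 0x03     # assumed type tag of the original object
t_iclass   = 0x1c     # assumed type tag to switch to
t_mask     = 0x1f     # assumed mask covering the type bits
other_flag = 1 << 12  # hypothetical flag that must survive the change

flags = t_module | other_flag

# Direct assignment: every bit outside the type tag is lost.
overwritten = t_iclass

# Clear only the type bits, then set the new type.
updated = (flags & ~t_mask) | t_iclass
```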
Notes:
Merged: https://github.com/ruby/ruby/pull/12667
|
|
We can use rb_gc_vm_weak_table_foreach for reference updating of weak tables
in the default GC.
Notes:
Merged: https://github.com/ruby/ruby/pull/12629
|
|
The TLS across .so issue seems related to Arm64, but not Darwin.
Notes:
Merged: https://github.com/ruby/ruby/pull/12593
|
|
Frames with VM_FRAME_MAGIC_DUMMY pushed by rb_vm_push_frame_fname have
allocated iseq, so we should not reference update it.
Notes:
Merged: https://github.com/ruby/ruby/pull/12371
|
|
[Bug #20950]
ifunc proc has the ep allocated in the cfunc_proc_t which is the data of
the TypedData object. If an ifunc proc is duplicated, the ep points to
the ep of the source object. If the source object is freed, then the ep
of the duplicated object now points to a freed memory region. If we try
to use the ep we could crash.
For example, the following script crashes:
    p = { a: 1 }.to_proc
    100.times do
      p = p.dup
      GC.start
      p.call
    rescue ArgumentError
    end
This commit changes ifunc proc to also duplicate the ep when it is duplicated.
Notes:
Merged: https://github.com/ruby/ruby/pull/12319
|
|
The macro provided by symbol.h uses STATIC_ID2SYM
when it can which speeds up methods that declare keyword args.
Co-authored-by: Alan Wu <[email protected]>
Co-authored-by: Takashi Kokubun (k0kubun) <[email protected]>
Co-authored-by: Maxime Chevalier-Boisvert <[email protected]>
Co-authored-by: Aaron Patterson <[email protected]>
Notes:
Merged-By: maximecb <[email protected]>
|
|
In this context, `th` must not be NULL
Notes:
Merged: https://github.com/ruby/ruby/pull/12253
|
|
* Add opt_duparray_send insn to skip the allocation on `#include?`
If the method isn't going to modify the array we don't need to copy it.
This avoids the allocation / array copy for things like `[:a, :b].include?(x)`.
This adds a BOP for include? and tracks redefinition for it on Array.
Co-authored-by: Andrew Novoselac <[email protected]>
* YJIT: Implement opt_duparray_send include_p
Co-authored-by: Andrew Novoselac <[email protected]>
* Update opt_newarray_send to support simple forms of include?(arg)
Similar to opt_duparray_send but for non-static arrays.
* YJIT: Implement opt_newarray_send include_p
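The two call shapes these instructions target look like the following. The method names are illustrative, and whether a given Ruby build actually emits `opt_duparray_send` or `opt_newarray_send` for them depends on its version.

```ruby
# Literal array receiver: the shape described for opt_duparray_send.
def weekend?(day)
  [:sat, :sun].include?(day)
end

# Array built from local variables: the shape described for opt_newarray_send.
def one_of?(x, a, b)
  [a, b].include?(x)
end

weekend_hit  = weekend?(:sat)
weekend_miss = weekend?(:mon)
pair_hit     = one_of?(2, 1, 2)
pair_miss    = one_of?(3, 1, 2)
```

On CRuby, `RubyVM::InstructionSequence.of(method(:weekend?)).disasm` prints the compiled instructions, which is one way to check what a particular build generates.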
---------
Co-authored-by: Andrew Novoselac <[email protected]>
Notes:
Merged-By: maximecb <[email protected]>
|
|
https://learn.microsoft.com/en-us/cpp/build/reference/zc-inline-remove-unreferenced-comdat?view=msvc-140
> If `/Zc:inline` is specified, the compiler enforces the C++11
> requirement that all functions declared inline must have a definition
> available in the same translation unit if they're used.
Notes:
Merged: https://github.com/ruby/ruby/pull/12107
|
|
When we run with RUBY_FREE_AT_EXIT, there's a false-positive memory leak
reported in YJIT because the METHOD_CODEGEN_TABLE is never freed. This
commit adds rb_yjit_free_at_exit that is called at shutdown when
RUBY_FREE_AT_EXIT is set.
Reported memory leak:
==699816== 1,104 bytes in 1 blocks are possibly lost in loss record 1 of 1
==699816== at 0x484680F: malloc (vg_replace_malloc.c:446)
==699816== by 0x155B3E: UnknownInlinedFun (unix.rs:14)
==699816== by 0x155B3E: UnknownInlinedFun (stats.rs:36)
==699816== by 0x155B3E: UnknownInlinedFun (stats.rs:27)
==699816== by 0x155B3E: alloc (alloc.rs:98)
==699816== by 0x155B3E: alloc_impl (alloc.rs:181)
==699816== by 0x155B3E: allocate (alloc.rs:241)
==699816== by 0x155B3E: do_alloc<alloc::alloc::Global> (alloc.rs:15)
==699816== by 0x155B3E: new_uninitialized<alloc::alloc::Global> (mod.rs:1750)
==699816== by 0x155B3E: fallible_with_capacity<alloc::alloc::Global> (mod.rs:1788)
==699816== by 0x155B3E: prepare_resize<alloc::alloc::Global> (mod.rs:2864)
==699816== by 0x155B3E: resize_inner<alloc::alloc::Global> (mod.rs:3060)
==699816== by 0x155B3E: reserve_rehash_inner<alloc::alloc::Global> (mod.rs:2950)
==699816== by 0x155B3E: hashbrown::raw::RawTable<T,A>::reserve_rehash (mod.rs:1231)
==699816== by 0x5BC39F: UnknownInlinedFun (mod.rs:1179)
==699816== by 0x5BC39F: find_or_find_insert_slot<(usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool), alloc::alloc::Global, hashbrown::map::equivalent_key::{closure_env#0}<usize, usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool>, hashbrown::map::make_hasher::{closure_env#0}<usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool, std::hash::random::RandomState>> (mod.rs:1413)
==699816== by 0x5BC39F: hashbrown::map::HashMap<K,V,S,A>::insert (map.rs:1754)
==699816== by 0x57C5C6: insert<usize, fn(&mut yjit::codegen::JITState, &mut yjit::backend::ir::Assembler, *const yjit::cruby::autogened::rb_callinfo, *const yjit::cruby::autogened::rb_callable_method_entry_struct, core::option::Option<yjit::codegen::BlockHandler>, i32, core::option::Option<yjit::cruby::VALUE>) -> bool, std::hash::random::RandomState> (map.rs:1104)
==699816== by 0x57C5C6: yjit::codegen::reg_method_codegen (codegen.rs:10521)
==699816== by 0x57C295: yjit::codegen::yjit_reg_method_codegen_fns (codegen.rs:10464)
==699816== by 0x5C6B07: rb_yjit_init (yjit.rs:40)
==699816== by 0x393723: ruby_opt_init (ruby.c:1820)
==699816== by 0x393723: ruby_opt_init (ruby.c:1767)
==699816== by 0x3957D4: prism_script (ruby.c:2215)
==699816== by 0x3957D4: process_options (ruby.c:2538)
==699816== by 0x396065: ruby_process_options (ruby.c:3166)
==699816== by 0x236E56: ruby_options (eval.c:117)
==699816== by 0x15BAED: rb_main (main.c:43)
==699816== by 0x15BAED: main (main.c:62)
After this patch, there are no more memory leaks reported when running
RUBY_FREE_AT_EXIT with Valgrind on an empty Ruby script:
$ RUBY_FREE_AT_EXIT=1 valgrind --leak-check=full ruby -e ""
...
==700357== HEAP SUMMARY:
==700357== in use at exit: 0 bytes in 0 blocks
==700357== total heap usage: 36,559 allocs, 36,559 frees, 6,064,783 bytes allocated
==700357==
==700357== All heap blocks were freed -- no leaks are possible
Notes:
Merged-By: maximecb <[email protected]>
|
|
Many libraries should be loaded on the main ractor because they
set constants with unshareable objects and so on.
This patch allows calling `require` on non-main Ractors by
asking the main ractor to perform the `require`. The calling ractor
waits for the result of `require` from the main ractor.
If the `require` call fails for some reason, an exception
object is delivered from the main ractor to the calling ractor
if it is copyable.
The same applies to `require_relative` and to `require` triggered by `autoload`.
Now `Ractor.new{pp obj}` works well (the first call of `pp` requires
the `pp` library implicitly).
[Feature #20627]
Notes:
Merged: https://github.com/ruby/ruby/pull/11142
|
|
introduce
- rb_threadptr_interrupt_exec
- rb_ractor_interrupt_exec
to intercept the thread/ractor execution.
Notes:
Merged: https://github.com/ruby/ruby/pull/11142
|
|
to show unused block warning strictly.
```ruby
class C
  def f = nil
end
class D
  def f = yield
end
[C.new, D.new].each{|obj| obj.f{}}
```
In this case, `D#f` accepts a block, but `C#f` does not.
Some call sites pass a block with `obj.f{}` where `obj` may be
a `C` or a `D`. To avoid warnings in such cases, the "unused block
warning" is emitted only if no method with the same name
accepts a block.
In the above example, `C.new.f{}` doesn't show any warning
because there is a method with the same name, `D#f`, which accepts a block.
We call this default behavior "relax mode".
The new `strict_unused_block` warning category switches from
"relax mode" to "strict mode": same-name methods are not checked,
and `C.new.f{}` triggers a warning.
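The decision rule for the two modes can be written down as a toy model. This sketches the rule as described above, not how the VM records the information: the lookup table and `warn_unused_block?` are hypothetical, and `C#g` is an extra method added for illustration.

```ruby
# Hypothetical record of which methods use the block they are given.
block_accepting = {
  [:C, :f] => false,
  [:D, :f] => true,
  [:C, :g] => false,
}

warn_unused_block = lambda do |klass, name, strict|
  next false if block_accepting.fetch([klass, name])  # the method uses its block
  next true if strict                                 # strict mode: no name-based exemption
  # Relax mode: stay quiet if any method with the same name accepts a block.
  block_accepting.none? { |key, accepts| key[1] == name && accepts }
end

relaxed_c_f = warn_unused_block.call(:C, :f, false)  # exempt, because D#f accepts a block
strict_c_f  = warn_unused_block.call(:C, :f, true)
relaxed_c_g = warn_unused_block.call(:C, :g, false)  # no same-name method accepts a block
strict_d_f  = warn_unused_block.call(:D, :f, true)
```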
[Feature #15554]
Notes:
Merged: https://github.com/ruby/ruby/pull/12005
|
|
* YJIT: Replace Array#each only when YJIT is enabled
* Add comments about BUILTIN_ATTR_C_TRACE
* Make Ruby Array#each available with --yjit as well
* Fix all paths that expect a C location
* Use method_basic_definition_p to detect patches
* Copy a comment about C_TRACE flag to compilers
* Rephrase a comment about add_yjit_hook
* Give METHOD_ENTRY_BASIC flag to Array#each
* Add --yjit-c-builtin option
* Allow inconsistent source_location in test-spec
* Refactor a check of BUILTIN_ATTR_C_TRACE
* Set METHOD_ENTRY_BASIC without touching vm->running
Notes:
Merged-By: maximecb <[email protected]>
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/11970
|
|
|
|
|