path: root/vm.c
Age | Commit message | Author
2023-12-22 | Free default_rand_key after freeing Ractors | John Hawthorn
Ractor's free iterates through its TLS keys so we need to keep this memory available until after Ractors are freed. Minimal reproduction: RUBY_FREE_AT_EXIT=1 ./miniruby -e rand
2023-12-20 | Correct free_on_exit env var to free_at_exit | HParker
2023-12-18 | [DOC] No document for internal or debug methods | Nobuyoshi Nakada
2023-12-15 | Introduce --parser runtime flag | HParker
Introduce runtime flag for specifying the parser,

```
ruby --parser=prism
```

also update the description:

```
$ ruby --parser=prism --version
ruby 3.3.0dev (2023-12-08T04:47:14Z add-parser-runtime.. 0616384c9f) +PRISM [x86_64-darwin23]
```

[Bug #20044]
2023-12-15 | free ractors with ractor_free | HParker
Previously with RUBY_FREE_ON_EXIT, ractors were being xfree-ed, which is incorrect since they are not xmalloced. Instead we can free ractors with ractor_free during shutdown. This change only affects main ractor freeing when RUBY_FREE_ON_EXIT is set.

Co-authored-by: John Hawthorn <[email protected]>
2023-12-10 | Change the semantics of rb_postponed_job_register | KJ Tsanaktsidis
Our current implementation of rb_postponed_job_register suffers from some safety issues that can lead to interpreter crashes (see bug #1991). Essentially, the issue is that jobs can be called with the wrong arguments.

We made two attempts to fix this whilst keeping the promised semantics, but:

* The first one involved masking/unmasking when flushing jobs, which was believed to be too expensive
* The second one involved a lock-free, multi-producer, single-consumer ringbuffer, which was too complex

The critical insight behind this third solution is that essentially the only users of these APIs are a) internal, or b) profiling gems.

For a), none of the usages actually require variable data; they will work just fine with the preregistration interface.

For b), generally profiling gems only call a single callback with a single piece of data (which is actually usually just zero) for the life of the program. The ringbuffer is complex because it needs to support multi-word inserts of job & data (which can't be atomic); but nobody actually even needs that functionality, really.

So, this commit:

* Introduces a pre-registration API for jobs, with a GVL-requiring rb_postponed_job_preregister, which returns a handle which can be used with an async-signal-safe rb_postponed_job_trigger.
* Deprecates rb_postponed_job_register (and re-implements it on top of the preregister function for compatibility)
* Moves all the internal usages of postponed job register to pre-registration
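The pre-registration idea can be illustrated with a toy model in Ruby. This is only a sketch and not the CRuby implementation, which is written in C; `JobTable`, `preregister`, `trigger` and `flush` are hypothetical names chosen for illustration. The point it tries to show is that the function and its data are fixed at registration time, so triggering a job only has to set one flag.

```ruby
# Toy model of a preregistered job table. All names here are hypothetical;
# the real mechanism lives in C inside CRuby and is not reproduced here.
class JobTable
  Job = Struct.new(:func, :data)

  def initialize
    @jobs = []    # fixed (func, data) pairs, filled in ahead of time
    @pending = 0  # bitmask: bit i set means job i has been triggered
  end

  # Registration step, done ahead of time. Returns an integer handle.
  def preregister(data = nil, &func)
    @jobs << Job.new(func, data)
    @jobs.size - 1
  end

  # Triggering only flips one bit; no function pointer or data is passed,
  # so there is no multi-word state that could be observed half-written.
  def trigger(handle)
    @pending |= (1 << handle)
  end

  # Run every triggered job once and clear the pending set.
  def flush
    pending, @pending = @pending, 0
    @jobs.each_with_index do |job, i|
      job.func.call(job.data) if pending[i] == 1
    end
  end
end

table = JobTable.new
samples = []
handle = table.preregister(:profiler) { |data| samples << data }

table.trigger(handle)
table.trigger(handle) # in this model, a second trigger before flush is a no-op
table.flush
```

In this model, a job triggered several times between flushes runs once, which is sufficient for the "single callback with a single piece of data" usage described above.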
2023-12-08 | Thread specific storage APIs | Koichi Sasada
This patch introduces thread specific storage APIs for tools which use `rb_internal_thread_event_hook` APIs.

* `rb_internal_thread_specific_key_create()` creates a tool specific thread local storage key and allocates the storage if not available.
* `rb_internal_thread_specific_set()` sets data in thread and tool specific storage.
* `rb_internal_thread_specific_get()` gets data from thread and tool specific storage.

Note that `rb_internal_thread_specific_get|set(thread_val, key)` can be called without the GVL, is async-signal-safe, and is safe for multi-threading (native threads). So you can call it in any internal thread event hook. Furthermore, you can call it from other native threads. Of course, `thread_val` must be alive while the data is accessed through these functions.

Note that you should not forget to clean up the set data.
2023-12-07 | Free everything at shutdown | Adam Hess
When the RUBY_FREE_ON_SHUTDOWN environment variable is set, manually free memory at shutdown.

Co-authored-by: Nobuyoshi Nakada <[email protected]>
Co-authored-by: Peter Zhu <[email protected]>
2023-12-07 | Fix potential compaction issue in env_copy() | Alan Wu
`src_ep[VM_ENV_DATA_INDEX_ME_CREF]` was read out and held without marking across the allocation in vm_env_new(). In case vm_env_new() ran compaction, an invalid reference could have been written into `copied_env`. It might've been hard to actually produce a crash with this issue due to the pinning marking of the field in rb_execution_context_mark().
2023-12-07 | Add missing write barrier to env_copy() | Alan Wu
Previously, the following crashed with `vm_assert_env:imemo_type_p(obj, imemo_env)` due to a missing WB:

    o = Object.new
    def o.foo(n)
      freeze

      GC.stress = 1
      # inflate block nesting to get an imemo_env for each level
      n.tap do |i|
        i.tap do |local|
          return Ractor.make_shareable(-> do
            local + i + n
          end)
        end
      end
    ensure
      GC.stress = false
      GC.verify_internal_consistency
    end

    p o.foo(1)[]

By the time the recursive env_copy() call returns, `copied_env` could have aged or turned grey, so we need a WB for the `ep[VM_ENV_DATA_INDEX_SPECVAL]` assignment, which adds an edge.

Fix: 674eb7df7f409099f33da77293d9658e09b470d6
2023-12-06 | Revert "allow enabling Prism via flag or env var" | HParker
This reverts commit 9b76c7fc89460ed8e9be40e4037c1d68395c0f6d.
2023-12-05 | allow enabling Prism via flag or env var | HParker
Enable Prism using either --prism

    ruby --prism test.rb

or via env var

    RUBY_PRISM=1 ruby test.rb
2023-12-05 | Make env_copy compaction safe | Peter Zhu
The original order of events is:

1. Allocate env_body.
2. Fill env_body using elements in src_env, and it performs operations that can trigger a GC.
3. Create the copied_env using vm_env_new.

However, if GC compaction runs during step 2, then copied_env would not have yet been created and objects on env_body could move but it would not be reference updated.

This commit changes the order to be (1), (3), (2).
2023-11-30 | Fix imemo_env corruption under auto compaction | Alan Wu
Previously, vm_make_env_each() did:

1. ALLOC env_body
2. Copy locals into env_body
3. Allocate imemo_env
4. Set up imemo_env with env_body

If compaction runs during (3), locals copied to env_body could be moved and the imemo_env could end up with invalid references. Move (2) down so it reads references after potential movement.
2023-11-15 | Adjust spaces [ci skip] | Nobuyoshi Nakada
2023-11-15 | Remove invariant condition | Nobuyoshi Nakada
The `while` loop condition dereferences `cfp` and there is no `break` in the loop, so `cfp` cannot be NULL just after the loop.
2023-11-13 | [wasm] allocate Asyncify setjmp buffer in heap | Yuta Saito
The `rb_jmpbuf_t` type is considerably large due to the inline-allocated Asyncify buffer, and it leads to stack overflow even with a small number of C-method call frames. This commit allocates the Asyncify buffer used by `rb_wasm_setjmp` on the heap to mitigate the issue.

This patch introduces a new type `rb_vm_tag_jmpbuf_t` to abstract the representation of a jump buffer, and init/deinit hook points to manage the lifetime of the buffer. These changes are effectively NFC for non-wasm platforms.
2023-10-24 | Use a functional red-black tree for indexing the shapes | Aaron Patterson
This is an experimental commit that uses a functional red-black tree to create an index of the ancestor shapes. It uses an Okasaki style functional red-black tree:

https://www.cs.tufts.edu/comp/150FP/archive/chris-okasaki/redblack99.pdf

This tree is advantageous because:

* It offers O(log n) insertions and O(log n) lookups.
* It shares memory with previous "versions" of the tree

When we insert a node in the tree, only the parts of the tree that need to be rebalanced are newly allocated. Parts of the tree that don't need to be rebalanced are not reallocated, so "new trees" are able to share memory with old trees. This is in contrast to a sorted set where we would have to duplicate the set, and also re-sort the set on each insertion.

I've added a new stat to RubyVM.stat so we can understand how the red-black tree increases.
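The memory-sharing property can be sketched with a small persistent red-black tree in Ruby that follows Okasaki's insertion scheme. This is an illustration only: CRuby's shape index is implemented in C and differs in detail, and the names `Node`, `rb_insert`, `rb_lookup` and `balance` are hypothetical.

```ruby
# Minimal persistent red-black tree (insert and lookup only), after Okasaki.
# Nodes are never mutated; inserting returns a new root.
Node = Struct.new(:color, :left, :key, :value, :right)

def rb_lookup(node, key)
  while node
    return node.value if key == node.key
    node = key < node.key ? node.left : node.right
  end
  nil
end

def red?(n)
  !n.nil? && n.color == :red
end

# Rewrite each of the four "black node with a red child and a red
# grandchild" shapes into a red node with two black children.
def balance(color, left, key, value, right)
  if color == :black
    if red?(left) && red?(left.left)
      a = left.left
      return Node.new(:red,
                      Node.new(:black, a.left, a.key, a.value, a.right),
                      left.key, left.value,
                      Node.new(:black, left.right, key, value, right))
    elsif red?(left) && red?(left.right)
      b = left.right
      return Node.new(:red,
                      Node.new(:black, left.left, left.key, left.value, b.left),
                      b.key, b.value,
                      Node.new(:black, b.right, key, value, right))
    elsif red?(right) && red?(right.left)
      b = right.left
      return Node.new(:red,
                      Node.new(:black, left, key, value, b.left),
                      b.key, b.value,
                      Node.new(:black, b.right, right.key, right.value, right.right))
    elsif red?(right) && red?(right.right)
      c = right.right
      return Node.new(:red,
                      Node.new(:black, left, key, value, right.left),
                      right.key, right.value,
                      Node.new(:black, c.left, c.key, c.value, c.right))
    end
  end
  Node.new(color, left, key, value, right)
end

def ins(node, key, value)
  return Node.new(:red, nil, key, value, nil) if node.nil?
  if key < node.key
    balance(node.color, ins(node.left, key, value), node.key, node.value, node.right)
  elsif key > node.key
    balance(node.color, node.left, node.key, node.value, ins(node.right, key, value))
  else
    Node.new(node.color, node.left, key, value, node.right)
  end
end

# Insert and force the new root to be black.
def rb_insert(root, key, value)
  n = ins(root, key, value)
  Node.new(:black, n.left, n.key, n.value, n.right)
end

v1 = nil
(1..7).each { |i| v1 = rb_insert(v1, i, "shape#{i}") }
v2 = rb_insert(v1, 8, "shape8") # v1 is untouched and still usable
```

Only the nodes on the path from the root to the insertion point are reallocated; here the whole left subtree of `v2` is the same object as the left subtree of `v1`.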
2023-10-19 | Partly revert a change in #8705 | Takashi Kokubun
Having this variable actually helps the performance of non-JITed calls.

-----  -----------  ----------  ----------  ----------  -------------  ------------
bench  before (ms)  stddev (%)  after (ms)  stddev (%)  after 1st itr  before/after
fib    241.9        0.5         225.4       1.0         1.06           1.07
-----  -----------  ----------  ----------  ----------  -------------  ------------

(benchmarked with --yjit-cold-threshold=0)
2023-10-19 | Call rb_jit_cont_init() even earlier | Takashi Kokubun
To fix https://github.com/ruby/ruby/actions/runs/6581593578/job/17881779994
2023-10-19 | YJIT: Add RubyVM::YJIT.enable (#8705) | Takashi Kokubun
2023-10-12 | YJIT: port call threshold logic from Rust to C for performance (#8628) | Maxime Chevalier-Boisvert
* Port call threshold logic from Rust to C for performance
* Prefix global/field names with yjit_
* Fix linker error
* Fix preprocessor condition for rb_yjit_threshold_hit
* Fix third linker issue
* Exclude yjit_calls_at_interv from RJIT bindgen

---------

Co-authored-by: Takashi Kokubun <[email protected]>
2023-10-12 | M:N thread scheduler for Ractors | Koichi Sasada
This patch introduces an M:N thread scheduler for the Ractor system.

In general, an M:N thread scheduler employs N native threads (OS threads) to manage M user-level threads (Ruby threads in this case). On the Ruby interpreter, 1 native thread is provided for 1 Ractor and all Ruby threads are managed by the native thread.

Since Ruby 1.9, the interpreter has used a 1:1 thread scheduler, which means 1 Ruby thread has 1 native thread. The M:N scheduler changes this strategy.

Because of compatibility issues (and stability issues of the implementation), the main Ractor doesn't use the M:N scheduler by default. In other words, threads on the main Ractor will be managed with the 1:1 thread scheduler.

There are additional settings by environment variables:

`RUBY_MN_THREADS=1` enables the M:N thread scheduler on the main ractor. Note that non-main ractors use the M:N scheduler without this configuration. With this configuration, single ractor applications run threads on an M:1 thread scheduler (green threads, user-level threads).

`RUBY_MAX_CPU=n` specifies the maximum number of native threads for the M:N scheduler (default: 8).

This patch will be reverted soon if non-easy issues are found.

[Bug #19842]
2023-10-03 | YJIT: add heuristic to avoid compiling cold ISEQs (#8522) | Maxime Chevalier-Boisvert
* YJIT: Add counter to measure how often we compile "cold" ISEQs (#535)

  Fix counter name in DEFAULT_COUNTERS
  YJIT: add --yjit-cold-threshold, don't compile cold ISEQs
  YJIT: increase default cold threshold to 200_000
  Remove rb_yjit_call_threshold()
  Remove conflict markers
  Fix compilation errors
  Threshold 1 should compile immediately
  Debug deadlock issue with test_ractor
  Fix call threshold issue with tests

* Revert exception threshold logic. Document option in yjit.md
* (void) for 0 parameter functions in C99
* Rename iseq_entry_cold => cold_iseq_entry
* Document --yjit-cold-threshold in ruby.c
* Update doc/yjit/yjit.md

  Co-authored-by: Jean byroot Boussier <[email protected]>

* Shorten help string to appease test
* Address bug found by Kokubun. Reorder logic.

---------

Co-authored-by: Alan Wu <[email protected]>
Co-authored-by: Jean byroot Boussier <[email protected]>
2023-09-28 | Move IO#readline to Ruby | Aaron Patterson
This commit moves IO#readline to Ruby. In order to call C functions, keyword arguments must be converted to hashes. Prior to this commit, code like `io.readline(chomp: true)` would allocate a hash. This commit moves the keyword "denaturing" to Ruby, allowing us to send positional arguments to the C API and avoiding the hash allocation.

Here is an allocation benchmark for the method:

```
x = GC.stat(:total_allocated_objects)
File.open("/usr/share/dict/words") do |f|
  f.readline(chomp: true) until f.eof?
end
p ALLOCATIONS: GC.stat(:total_allocated_objects) - x
```

Before this commit, the output was this:

```
$ make run
./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin22-fake ./test.rb
{:ALLOCATIONS=>707939}
```

Now it is this:

```
$ make run
./miniruby -I./lib -I. -I.ext/common -r./arm64-darwin22-fake ./test.rb
{:ALLOCATIONS=>471962}
```

[Bug #19890] [ruby-core:114803]
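The keyword "denaturing" pattern described above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the code from the commit: `readline_positional` is a hypothetical stand-in for a C function that only takes positional arguments, and `readline_kw` is a hypothetical Ruby-level wrapper that accepts the keyword.

```ruby
require "stringio"

# Hypothetical stand-in for a C-implemented function: it receives `chomp`
# as a plain positional true/false value rather than inside an options Hash.
def readline_positional(io, sep, chomp)
  line = io.gets(sep)
  raise EOFError, "end of file reached" if line.nil?
  chomp ? line.chomp(sep) : line
end

# Ruby-level wrapper: the keyword argument is unpacked here, in Ruby,
# and forwarded positionally.
def readline_kw(io, sep = $/, chomp: false)
  readline_positional(io, sep, chomp)
end

io = StringIO.new("alpha\nbeta\n")
first = readline_kw(io, chomp: true)  # keyword given by the caller
second = readline_kw(io)              # default: keep the separator
```

The caller-facing signature keeps the `chomp:` keyword, while the lower-level function never sees a Hash.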
2023-09-28 | Change RNode structure from union to struct | yui-knk
All kinds of AST nodes use the same struct RNode, which has u1, u2, u3 union members for holding different kinds of data. This has two problems.

1. Low flexibility of data structure

   Some nodes, for example NODE_TRUE, don't use u1, u2, u3. On the other hand, NODE_OP_ASGN2 needs more than three union members. However, because they use the same structure definition, we need to allocate three union members for NODE_TRUE and need to separate NODE_OP_ASGN2 into another node. This change removes the restriction, making it possible to change the data structure for each node type.

2. No compile time check for union member access

   It's the developer's responsibility to use the correct member for each node type when it's a union. This change clarifies which node has which type of fields and enables compile time checks.

This commit also changes node_buffer_elem_struct buf management to handle different size data with alignment.
2023-09-25 | Dump backtraces to an arbitrary stream | Nobuyoshi Nakada
2023-09-19 | [Bug #18257] Register the class path of FrozenCore to mark | Nobuyoshi Nakada
ICLASS does not have the path usually, so it needs to be registered separately.
2023-09-19 | Stop exposing FrozenCore in headers | Nobuyoshi Nakada
Revert commit "Directly allocate FrozenCore as an ICLASS", 813a5f4fc46a24ca1695d23c159250b9e1080ac7.
2023-08-31 | Prevent rb_gc_mark_values from pinning objects | Matt Valentine-House
This is an internal-only function not exposed to the C extension API. Its only use so far is from rb_vm_mark, where it's used to mark the values in the vm->trap_list.cmd array.

There shouldn't be any reason why these cannot move. This commit allows them to move by updating their references during the reference updating step of compaction.

To do this we've introduced another internal function rb_gc_update_values as a partner to rb_gc_mark_values.

This allows us to refactor rb_gc_mark_values to not pin.

Notes: Merged: https://github.com/ruby/ruby/pull/8341
2023-08-24 | Remove cfp parameter from hook_before_rewind() | Alan Wu
It's only used once, and it has to equal `ec->cfp`, so just use that.
2023-08-24 | Make cfp constant and use it more in vm_exec_handle_exception() | Alan Wu
For writing THROW_DATA_VAL, being able to see that it's writing to the same frame after modifying PC and SP is nice.
2023-08-23 | Stop incrementing `jit_entry_calls` once threshold is hit | Jean Boussier
Otherwise the ISeq page will constantly be written into preventing it from being shared. Notes: Merged: https://github.com/ruby/ruby/pull/8259
2023-08-08 | YJIT: Compile exception handlers (#8171) | Takashi Kokubun
Co-authored-by: Maxime Chevalier-Boisvert <[email protected]> Notes: Merged-By: k0kubun <[email protected]>
2023-08-08 | Share duplicate code between Wasm and the others | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/8182
2023-08-06 | Turn `jit_exec` and `jit_compile` into macros if disabled | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/8181
2023-08-04 | Just suppress a warning for non-Emscripten Wasm build | Takashi Kokubun
Revert "Revert "Skip calling jit_exec on Wasm"" This reverts commit 2e94610f70baca4af004202f288a6b5dd10889ca. It's not about whether it's optimized away or not. I just don't want to leave and maintain the callsite (e.g. signature) in the path where YJIT is never built.
2023-08-05 | Revert "Skip calling jit_exec on Wasm" | Nobuyoshi Nakada
This reverts commit e80752f9bbc5228dba3066cd95a81e2e496bd9d7. RJIT and YJIT are never enabled on Wasm. When both are disabled, `jit_exec` is defined to return `Qundef` constantly, and is optimized away. Notes: Merged: https://github.com/ruby/ruby/pull/8176
2023-08-04 | Skip calling jit_exec on Wasm | Takashi Kokubun
We often break Wasm build when we modify how jit_exec works. I'm planning to modify it again soon. We actually don't support running Ruby JIT on Wasm, so it doesn't seem worth the maintenance effort.
2023-07-27 | Remove an obsoleted initialization from Wasm | Takashi Kokubun
2023-07-27 | Remove an unused argument in vm_exec_core | Takashi Kokubun
2023-07-27 | Clean up OPT_STACK_CACHING (#8132) | Takashi Kokubun
Notes: Merged-By: k0kubun <[email protected]>
2023-07-25 | Adjust brace nesting | Nobuyoshi Nakada
2023-07-17 | Remove __bp__ and speed-up bmethod calls (#8060) | Alan Wu
Remove rb_control_frame_t::__bp__ and optimize bmethod calls

This commit removes the __bp__ field from rb_control_frame_t. It was introduced to help MJIT, but since MJIT was replaced by RJIT, we can use vm_base_ptr() to compute it from the SP of the previous control frame instead. Removing the field avoids needing to set it up when pushing new frames.

Simply removing __bp__ would cause crashes since RJIT and YJIT used a slightly different stack layout for bmethod calls than the interpreter. At the moment of the call, the two layouts looked as follows:

               ┌────────────┐  ┌────────────┐
               │ frame_base │  │ frame_base │
               ├────────────┤  ├────────────┤
               │    ...     │  │    ...     │
               ├────────────┤  ├────────────┤
               │    args    │  │    args    │
               ├────────────┤  └────────────┘<─prev_frame_sp
               │  receiver  │
prev_frame_sp─>└────────────┘
                RJIT & YJIT     interpreter

Essentially, vm_base_ptr() needs to compute the address to frame_base given prev_frame_sp in the diagrams. The presence of the receiver created an off-by-one situation.

Make the interpreter use the layout the JITs use for iseq-to-iseq bmethod calls. Doing so removes unnecessary argument shifting and vm_exec_core() re-entry from the interpreter, yielding a speed improvement visible through `benchmark/vm_defined_method.yml`:

    patched: 7578743.1 i/s
     master: 4796596.3 i/s - 1.58x slower

C-to-iseq bmethod calls now store one more VALUE than before, but that should have negligible impact on overall performance.

Note that re-entering vm_exec_core() used to be necessary for firing TracePoint events, but that's no longer the case since 9121e57a5f50bc91bae48b3b91edb283bf96cb6b.

Closes ruby/ruby#6428
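For readers unfamiliar with the term, a "bmethod" is a method whose body is a block, as created by `define_method`. The sketch below only shows the kind of call the commit is about; the `Counter` class is hypothetical and no timing is claimed for it.

```ruby
class Counter
  def initialize
    @count = 0
  end

  # Ordinary method defined with `def`.
  def increment_def(by)
    @count += by
  end

  # bmethod: the body is a block turned into a method by define_method.
  define_method(:increment_block) do |by|
    @count += by
  end
end

c = Counter.new
c.increment_def(2)
total = c.increment_block(3) # a bmethod call; self inside the block is `c`
```

Both methods are called the same way from Ruby code; the difference is in how the VM sets up the frame for the call.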
2023-07-17 | YJIT: refactoring to allow for fancier call threshold logic (#8078) | Maxime Chevalier-Boisvert
* YJIT: refactoring to allow for fancier call threshold logic
* Avoid potentially compiling functions multiple times.
* Update vm.c

  Co-authored-by: Alan Wu <[email protected]>

---------

Co-authored-by: Alan Wu <[email protected]>

Notes: Merged-By: maximecb <[email protected]>
2023-07-08 | macos: symbols for `rb_execution_context_t` should be internal | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/8040
2023-06-17 | Replace parser & node compile_option from Hash to bit field | yui-knk
This commit reduces dependency to CRuby object. Notes: Merged: https://github.com/ruby/ruby/pull/7950
2023-06-14 | Directly allocate FrozenCore as an ICLASS | Peter Zhu
It's a bad idea to overwrite the flags as the garbage collector may have set other flags. Notes: Merged: https://github.com/ruby/ruby/pull/7940
2023-06-12 | [Feature #19719] Universal Parser | yui-knk
Introduce Universal Parser mode for the parser. This commit includes these changes:

* Introduce `UNIVERSAL_PARSER` macro. All of CRuby related functions are passed via `struct rb_parser_config_struct` when this macro is enabled.
* Add CI task with 'cppflags=-DUNIVERSAL_PARSER' for ubuntu.

Notes: Merged: https://github.com/ruby/ruby/pull/7927
2023-05-10 | [Bug #19597] Freeze script name | Nobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/7709