path: root/iseq.c
Age  Commit message  Author
12 daysFix memory leak in Prism's RubyVM::InstructionSequence.newPeter Zhu
[Bug #21394]

There are two ways to make RubyVM::InstructionSequence.new raise which would cause the options->scopes to leak memory:

1. Passing in any (non T_FILE) object where the to_str raises.
2. Passing in a T_FILE object where String#initialize_dup raises. This is because rb_io_path dups the string.

Example 1:

    10.times do
      100_000.times do
        RubyVM::InstructionSequence.new(nil)
      rescue TypeError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    13392 17104 20256 23920 27264 30432 33584 36752 40032 43232

After:

    9392 11072 11648 11648 11648 11712 11712 11712 11744 11744

Example 2:

    require "tempfile"

    MyError = Class.new(StandardError)
    String.prepend(Module.new do
      def initialize_dup(_)
        if $raise_on_dup
          raise MyError
        else
          super
        end
      end
    end)

    Tempfile.create do |f|
      10.times do
        100_000.times do
          $raise_on_dup = true
          RubyVM::InstructionSequence.new(f)
        rescue MyError
        else
          raise "MyError was not raised during RubyVM::InstructionSequence.new"
        end

        puts `ps -o rss= -p #{$$}`
      ensure
        $raise_on_dup = false
      end
    end

Before:

    14080 18512 22000 25184 28320 31600 34736 37904 41088 44256

After:

    12016 12464 12880 12880 12880 12912 12912 12912 12912 12912

Notes: Merged: https://github.com/ruby/ruby/pull/13496
2025-05-29Read {max_iv,variation}_count from prime classextJohn Hawthorn
MAX_IV_COUNT is a hint which determines the size of variable width allocation we should use for a given class. We don't need to scope this by namespace: if we end up with larger builtin objects on some namespaces, that isn't a user-visible problem, just extra memory use. Similarly, variation_count is used to track whether a given object has had too many branches in the shapes it has used, and to use too_complex when that happens. That's also just a hint, so we can use the same value across namespaces without it being visible to users. Previously variation_count was being incremented (written to) on the RCLASS_EXT_READABLE ext, which seems incorrect if we wanted it to be different across namespaces. Notes: Merged: https://github.com/ruby/ruby/pull/13434
2025-05-15Ensure shape_id is never used on T_IMEMOJean Boussier
It doesn't make sense to set ivars or anything shape related on a T_IMEMO. Co-Authored-By: John Hawthorn <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/13347
2025-05-12Cast up `int` instruction code to `VALUE`Nobuyoshi Nakada
Fix Visual C warnings: ``` iseq.c(3793): warning C4312: 'type cast': conversion from 'int' to 'void *' of greater size iseq.c(3794): warning C4312: 'type cast': conversion from 'int' to 'void *' of greater size ``` Notes: Merged: https://github.com/ruby/ruby/pull/13304
2025-05-11namespace on readSatoshi Tagomori
2025-04-28Add comments for cryptic functions in iseq.cTakashi Kokubun
2025-04-28ZJIT: Drop trace_zjit_* instructions (#13189)Takashi Kokubun
Notes: Merged-By: k0kubun <[email protected]>
2025-04-26Use `set_table` to track const cachesJean Boussier
Now that we have a `set_table` implementation, we can use it to track const caches and save some memory. We could even save some more memory if `numtable` didn't store a copy of the `hash` and instead recomputed it every time, but this is a quick win. Notes: Merged: https://github.com/ruby/ruby/pull/13184
2025-04-19Fix style [ci skip]Nobuyoshi Nakada
2025-03-17Avoid pinning `storage_head` in `iseq_mark_and_move` (#12880)Eileen M. Uchitelle
* Avoid pinning `storage_head` in `iseq_mark_and_move`

  This refactor changes the behavior of `iseq_mark_and_move` to avoid pinning the `storage_head`. Previously, pinning was required because the objects could be GC'd during `iseq_set_sequence`, which made it possible to end up with a half-built array of instructions. However, in order to implement a moving immix algorithm we can't pin these objects, so this refactoring changes the code to mark and move. To accomplish this, it was required to add `iseq_size`, `iseq_encoded`, and the `mark_bits` union to the `iseq_compile_data` struct. In addition, `iseq_compile_data` sets a bool for whether there is a single word or a list of mark bits. While this change is needed for moving immix, it should be better for Ruby's GC as well.

* Don't allocate mark_offset_bits for one word

  If only one word is needed, we don't need to allocate mark_offset_bits and can instead directly write to it.

--------- Co-authored-by: Peter Zhu <[email protected]> Notes: Merged-By: eileencodes <[email protected]>
2025-03-12Push a real iseq in rb_vm_push_frame_fname()Alan Wu
Previously, vm_make_env_each() (used during proc creation and for the debug inspector C API) picked up the non-GC-allocated iseq that rb_vm_push_frame_fname() creates, which led to a SEGV when the GC tried to mark the non-GC object. Put a real iseq imemo instead. Speed should be about the same since the old code also did an imemo allocation and a malloc allocation. Real iseq allows ironing out the special-casing of dummy frames in rb_execution_context_mark() and rb_execution_context_update(). A check is added to RubyVM::ISeq#eval, though, to stop attempts to run dummy iseqs. [Bug #21180] Co-authored-by: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12898
2025-03-12Have `ast` live longer in ISeq.compile_file to fix GC stress crashAlan Wu
Previously, the live range of `ast_value` ended on the call right before rb_ast_dispose(), which led to premature collection and use-after-free. We observed this crashing on -O3, -DVM_CHECK_MODE, with GCC 11.4.0 on Ubuntu. Co-authored-by: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12898
2025-02-13[Feature #21116] Extract RJIT as a third-party gemNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/12740
2025-01-13Proc#parameters: Show anonymous optionals as `[:opt]`Alan Wu
Have this for lead parameters as well as parameters after rest ("post"). [Bug #20974] Notes: Merged: https://github.com/ruby/ruby/pull/12547
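A small sketch of how one might inspect this from Ruby. It assumes that "anonymous optionals" refers to destructuring parameters of a proc, which have no name of their own; the helper `parameter_kinds` is hypothetical and only reduces each entry to its kind, because the exact shape of an anonymous entry (`[:opt]` versus an entry with a `nil` name) depends on the Ruby version and parser in use.

```ruby
# Hypothetical helper: keep only the kind (:req, :opt, :rest, ...) of each
# entry reported by Proc#parameters, ignoring the (possibly missing) name.
def parameter_kinds(callable)
  callable.parameters.map(&:first)
end

# A destructuring lead parameter followed by a named one.
lead = proc { |(a, b), c| [a, b, c] }
# A destructuring parameter after the rest parameter ("post").
post = proc { |*rest, (x, y)| [rest, x, y] }

p lead.parameters        # full entries; the anonymous one has no name
p post.parameters
p parameter_kinds(lead)  # kinds only
```

Comparing kinds rather than whole entries keeps such a check stable across versions on either side of this change.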
2025-01-07Correctly set node_id on iseq locationAaron Patterson
The iseq location object has a slot for node ids. parse.y was correctly populating that field but Prism was not. This commit populates the field with the ast node id for that iseq [Bug #21014] Notes: Merged: https://github.com/ruby/ruby/pull/12527
2025-01-02[DOC] Exclude 'Method' from RDoc's autolinkingNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/12496
2024-12-19Prefix asan_poison_object with rbPeter Zhu
Notes: Merged: https://github.com/ruby/ruby/pull/12385
2024-11-29Fix use-after-free in constant cachePeter Zhu
[Bug #20921]

When we create a cache entry for a constant, the following sequence of events could happen:

- vm_track_constant_cache is called to insert a constant cache.
- In vm_track_constant_cache, we first look up the ST table for the ID of the constant. Assume the ST table exists because another iseq also holds a cache entry for this ID.
- We then insert into this ST table with the iseq_inline_constant_cache.
- However, while inserting into this ST table, it allocates memory, which could trigger a GC. Assume that it does trigger a GC.
- The GC frees the one and only other iseq that holds a cache entry for this ID.
- In remove_from_constant_cache, it will appear that the ST table is now empty because there are no more iseq with cache entries for this ID, so we free the ST table.
- We complete GC and continue our st_insert. However, this ST table has been freed so we now have a use-after-free.

This issue is very hard to reproduce, because it requires that the GC runs at a very specific time.
However, we can make it show up by applying this patch which runs GC right before the st_insert to mimic the st_insert triggering a GC: diff --git a/vm_insnhelper.c b/vm_insnhelper.c index 3cb23f06f0..a93998136a 100644 --- a/vm_insnhelper.c +++ b/vm_insnhelper.c @@ -6338,6 +6338,10 @@ vm_track_constant_cache(ID id, void *ic) rb_id_table_insert(const_cache, id, (VALUE)ics); } + if (id == rb_intern("MyConstant")) rb_gc(); + st_insert(ics, (st_data_t) ic, (st_data_t) Qtrue); } And if we run this script: Object.const_set("MyConstant", "Hello!") my_proc = eval("-> { MyConstant }") my_proc.call my_proc = eval("-> { MyConstant }") my_proc.call We can see that ASAN outputs a use-after-free error: ==36540==ERROR: AddressSanitizer: heap-use-after-free on address 0x606000049528 at pc 0x000102f3ceac bp 0x00016d607a70 sp 0x00016d607a68 READ of size 8 at 0x606000049528 thread T0 #0 0x102f3cea8 in do_hash st.c:321 #1 0x102f3ddd0 in rb_st_insert st.c:1132 #2 0x103140700 in vm_track_constant_cache vm_insnhelper.c:6345 #3 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #4 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #5 0x1030bc1e0 in vm_exec_core insns.def:263 #6 0x1030b55fc in rb_vm_exec vm.c:2585 #7 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #8 0x102a82588 in rb_ec_exec_node eval.c:281 #9 0x102a81fe0 in ruby_run_node eval.c:319 #10 0x1027f3db4 in rb_main main.c:43 #11 0x1027f3bd4 in main main.c:68 #12 0x183900270 (<unknown module>) 0x606000049528 is located 8 bytes inside of 56-byte region [0x606000049520,0x606000049558) freed by thread T0 here: #0 0x104174d40 in free+0x98 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54d40) #1 0x102ada89c in rb_gc_impl_free default.c:8183 #2 0x102ada7dc in ruby_sized_xfree gc.c:4507 #3 0x102ac4d34 in ruby_xfree gc.c:4518 #4 0x102f3cb34 in rb_st_free_table st.c:663 #5 0x102bd52d8 in remove_from_constant_cache iseq.c:119 #6 0x102bbe2cc in iseq_clear_ic_references iseq.c:153 #7 0x102bbd2a0 in rb_iseq_free iseq.c:166 #8 
0x102b32ed0 in rb_imemo_free imemo.c:564 #9 0x102ac4b44 in rb_gc_obj_free gc.c:1407 #10 0x102af4290 in gc_sweep_plane default.c:3546 #11 0x102af3bdc in gc_sweep_page default.c:3634 #12 0x102aeb140 in gc_sweep_step default.c:3906 #13 0x102aeadf0 in gc_sweep_rest default.c:3978 #14 0x102ae4714 in gc_sweep default.c:4155 #15 0x102af8474 in gc_start default.c:6484 #16 0x102afbe30 in garbage_collect default.c:6363 #17 0x102ad37f0 in rb_gc_impl_start default.c:6816 #18 0x102ad3634 in rb_gc gc.c:3624 #19 0x1031406ec in vm_track_constant_cache vm_insnhelper.c:6342 #20 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #21 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #22 0x1030bc1e0 in vm_exec_core insns.def:263 #23 0x1030b55fc in rb_vm_exec vm.c:2585 #24 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #25 0x102a82588 in rb_ec_exec_node eval.c:281 #26 0x102a81fe0 in ruby_run_node eval.c:319 #27 0x1027f3db4 in rb_main main.c:43 #28 0x1027f3bd4 in main main.c:68 #29 0x183900270 (<unknown module>) previously allocated by thread T0 here: #0 0x104174c04 in malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x54c04) #1 0x102ada0ec in rb_gc_impl_malloc default.c:8198 #2 0x102acee44 in ruby_xmalloc gc.c:4438 #3 0x102f3c85c in rb_st_init_table_with_size st.c:571 #4 0x102f3c900 in rb_st_init_table st.c:600 #5 0x102f3c920 in rb_st_init_numtable st.c:608 #6 0x103140698 in vm_track_constant_cache vm_insnhelper.c:6337 #7 0x1030b91d8 in vm_ic_track_const_chain vm_insnhelper.c:6356 #8 0x1030b8cf8 in rb_vm_opt_getconstant_path vm_insnhelper.c:6424 #9 0x1030bc1e0 in vm_exec_core insns.def:263 #10 0x1030b55fc in rb_vm_exec vm.c:2585 #11 0x1030fe0ac in rb_iseq_eval_main vm.c:2851 #12 0x102a82588 in rb_ec_exec_node eval.c:281 #13 0x102a81fe0 in ruby_run_node eval.c:319 #14 0x1027f3db4 in rb_main main.c:43 #15 0x1027f3bd4 in main main.c:68 #16 0x183900270 (<unknown module>) This commit fixes this bug by adding a inserting_constant_cache_id field to the VM, which stores 
the ID that is currently being inserted and, in remove_from_constant_cache, we don't free the ST table for ID equal to this one. Co-Authored-By: Alan Wu <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12203
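The guard described above can be sketched as a toy model in Ruby. This is not CRuby's code (the real fix lives in C): the class `ConstCacheModel`, its methods, and the use of a block to stand in for a GC run are all hypothetical, chosen only to show why remembering the ID being inserted prevents the table from being dropped mid-insert.

```ruby
require "set"

# Toy model (hypothetical, not CRuby's implementation): maps a constant ID to
# the set of inline caches that reference it.
class ConstCacheModel
  attr_reader :tables

  def initialize
    @tables = {}         # constant ID => Set of cache entries
    @inserting_id = nil  # stands in for the VM's inserting_constant_cache_id
  end

  # Plays the role of vm_track_constant_cache. The optional block stands in
  # for a GC that runs while the insertion is in progress.
  # Returns true if the table written to is still the registered one.
  def track(id, ic)
    set = (@tables[id] ||= Set.new)
    @inserting_id = id
    yield if block_given?  # the simulated GC may call #remove here
    set.add(ic)
    @tables[id].equal?(set)
  ensure
    @inserting_id = nil
  end

  # Plays the role of remove_from_constant_cache.
  def remove(id, ic)
    set = @tables[id]
    return unless set

    set.delete(ic)
    # The guard: never drop a table that an in-progress insert is about to use.
    @tables.delete(id) if set.empty? && id != @inserting_id
  end
end

model = ConstCacheModel.new
model.track(:MyConstant, :ic1)
survived = model.track(:MyConstant, :ic2) { model.remove(:MyConstant, :ic1) }
p survived
```

Without the `id != @inserting_id` condition, the simulated GC would unregister the table while `track` still holds a reference to it, which is the model's analogue of the use-after-free.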
2024-11-28Avoid an operation on a pointer after freeYusuke Endoh
A follow-up to ef59175a68c448fe334125824b477a9e1d5629bc. That commit uses `&body->local_table[...]` but `body->local_table` is already freed. I think it is undefined behavior to calculate a pointer that exceeds the bound by more than 1. This change moves the free of `body->local_table` after the calculation. Coverity Scan found this issue. Notes: Merged: https://github.com/ruby/ruby/pull/12194
2024-11-13Move Array#map to RubyTakashi Kokubun
Co-Authored-By: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12074
2024-11-08Fix memory leak in prism when syntax error in iseq compilationPeter Zhu
If there's a syntax error during iseq compilation then prism would leak memory because it would not free the pm_parse_result_t.

This commit changes pm_iseq_new_with_opt to have a rb_protect to catch when an error is raised, and return NULL and set error_state to a value that can be raised by calling rb_jump_tag after memory has been freed.

For example:

    10.times do
      10_000.times do
        eval("/[/=~s")
      rescue SyntaxError
      end

      puts `ps -o rss= -p #{$$}`
    end

Before:

    39280 68736 99232 128864 158896 188208 217344 246304 275376 304592

After:

    12192 13200 14256 14848 16000 16000 16000 16064 17232 17952

Notes: Merged: https://github.com/ruby/ruby/pull/12036
2024-10-16RubyVM::InstructionSequence.of Thread::Backtrace::LocationKevin Newton
This would be useful for debugging. Notes: Merged: https://github.com/ruby/ruby/pull/11896
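A minimal sketch of the debugging use case, assuming a CRuby new enough to include this change. The helper names `iseq_for_location` and `sample_method` are hypothetical; on Rubies that predate the change the lookup is expected to yield `nil`, so the sketch falls back to a message instead of failing.

```ruby
# Hypothetical helper: resolve a Thread::Backtrace::Location to its
# instruction sequence, returning nil when that is not possible.
def iseq_for_location(location)
  RubyVM::InstructionSequence.of(location)
rescue TypeError, ArgumentError
  nil
end

# Ask for the location of the currently running frame and look it up.
def sample_method
  iseq_for_location(caller_locations(0, 1).first)
end

iseq = sample_method
puts(iseq ? iseq.disasm : "ISeq.of(location) is not supported on this Ruby")
```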
2024-10-04Fix intermediate array off-by-one errorKevin Newton
Co-authored-by: Adam Hess <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/11800
2024-10-02Mark iseq keyword default values during compilationPeter Zhu
During compilation, we write keyword default values into the iseq, so we should mark it to ensure it does not get GC'd. This might fix issues on ASAN like http://ci.rvm.jp/logfiles/brlog.trunk_asan.20240927-194923 ==805257==ERROR: AddressSanitizer: use-after-poison on address 0x7b7e5e3e2828 at pc 0x5e09ac4822f8 bp 0x7ffde56b0140 sp 0x7ffde56b0138 READ of size 8 at 0x7b7e5e3e2828 thread T0 #0 0x5e09ac4822f7 in RB_BUILTIN_TYPE include/ruby/internal/value_type.h:191:30 #1 0x5e09ac4822f7 in rbimpl_RB_TYPE_P_fastpath include/ruby/internal/value_type.h:352:19 #2 0x5e09ac4822f7 in gc_mark gc/default.c:4488:9 #3 0x5e09ac51011e in rb_iseq_mark_and_move iseq.c:361:17 #4 0x5e09ac4b85c4 in rb_imemo_mark_and_move imemo.c:386:9 #5 0x5e09ac467544 in rb_gc_mark_children gc.c:2508:9 #6 0x5e09ac482c24 in gc_mark_children gc/default.c:4673:5 #7 0x5e09ac482c24 in gc_mark_stacked_objects gc/default.c:4694:9 #8 0x5e09ac482c24 in gc_mark_stacked_objects_all gc/default.c:4732:12 #9 0x5e09ac48c7f9 in gc_marks_rest gc/default.c:5755:9 #10 0x5e09ac48c7f9 in gc_marks gc/default.c:5870:9 #11 0x5e09ac48c7f9 in gc_start gc/default.c:6517:13 Notes: Merged: https://github.com/ruby/ruby/pull/11755
2024-10-02Make default parser enum and define getter/setterNobuyoshi Nakada
Notes: Merged: https://github.com/ruby/ruby/pull/11761
2024-09-16[PRISM] Assume an eval context for RubyVM::ISEQ compileKevin Newton
Fixes [Bug #20741] Notes: Merged: https://github.com/ruby/ruby/pull/11632
2024-08-29[PRISM] Handle RubyVM.keep_script_linesKevin Newton
Notes: Merged: https://github.com/ruby/ruby/pull/11501
2024-08-21[PRISM] Implement unused block warningeileencodes
Related: ruby/prism#2935 Notes: Merged: https://github.com/ruby/ruby/pull/11415
2024-08-15Show anonymous and ambiguous params in ISeq disassemblyKevin Newton
Previously, in the disassembly for ISeqs, there was no way to know if the anon_rest, anon_kwrest, or ambiguous_param0 flags were set. This commit extends the names of the rest, kwrest, and lead params to display this information. They are relevant for the ISeqs' runtime behavior. Notes: Merged: https://github.com/ruby/ruby/pull/11237 Merged-By: XrXr
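To look at the local table this commit talks about, one can disassemble small snippets. The sketch below is only a way to produce that output: the helper `disasm_for` is hypothetical, the sample sources are chosen on the assumption that an anonymous `*` and `**` and a single-parameter block are the relevant cases, and how the flags are rendered depends on the Ruby version.

```ruby
# Hypothetical helper: compile a source string and return its disassembly,
# which includes the local table of every nested instruction sequence.
def disasm_for(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# A method with anonymous rest and keyword-rest parameters.
puts disasm_for("def collect(*, **) = nil")

# A block with a single lead parameter.
puts disasm_for("proc { |a| a }")
```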
2024-08-11compile.c: don't allocate empty default values listJean Boussier
It just wastes memory. Notes: Merged: https://github.com/ruby/ruby/pull/11361
2024-07-02Resize arrays in `rb_ary_freeze` and use it for freezing arrayseileencodes
While working on a separate issue we found that in some cases `ary_heap_realloc` was being called on frozen arrays. To fix this, this change does the following:

1) Updates `rb_ary_freeze` to assert the type is an array, return if already frozen, and shrink the capacity if it is not embedded, shared or a shared root.
2) Replaces `rb_obj_freeze` with `rb_ary_freeze` when the object is always an array.
3) In `ary_heap_realloc`, ensure the new capa is set with `ARY_SET_CAPA`. Previously the change in capa was not set.
4) Adds an assertion to `ary_heap_realloc` that the array is not frozen.

Some of this work was originally done in https://github.com/ruby/ruby/pull/2640, referencing this issue https://bugs.ruby-lang.org/issues/16291. There didn't appear to be any objections to this PR; it appears to have simply lost traction. The original PR made changes to arrays and strings at the same time; this PR only does arrays. Also, it was old enough that rather than revive that branch I've made a new one. I added Lourens as co-author in addition to Aaron, who helped me with this patch.

The original PR made this change for performance reasons, and while that's still true for this PR, the goal of this PR is to avoid calling `ary_heap_realloc` on frozen arrays. The capacity should be shrunk _before_ the array is frozen, not after.

Co-authored-by: Aaron Patterson <[email protected]> Co-Authored-By: methodmissing <[email protected]>
2024-06-30Add RB_GC_GUARD for ast_valueyui-knk
I think this change fixes the following assertion failure: ``` [BUG] unexpected rb_parser_ary_data_type (2114076960) for script lines ``` It seems that `ast_value` is collected and then `rb_parser_build_script_lines_from` touches an invalid memory address. This change uses RB_GC_GUARD to prevent `ast_value` from being collected.
2024-06-18Optimized forwarding callers and calleesAaron Patterson
This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a

def foo(...)
  list = [1, 2]
  bar(*list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
           -> 9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
           -> 4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2).

Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
           -> 5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forwarding splat args, kwargs, etc.

Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`.

I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation.

Co-Authored-By: John Hawthorn <[email protected]> Co-Authored-By: Alan Wu <[email protected]>
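The optimization described above should be invisible at the language level, so a behavior check is a natural companion to it. The sketch below says nothing about the VM internals; the method names `target`, `forward_all`, and `forward_with_lead` are hypothetical and only confirm that `...` forwards positional arguments, keyword arguments, and the block unchanged, with or without extra leading arguments.

```ruby
# Hypothetical callee: reports everything it received.
def target(*args, **kwargs, &block)
  [args, kwargs, block&.call]
end

# Forward everything as-is.
def forward_all(...) = target(...)

# Forward everything after an extra leading argument.
def forward_with_lead(...) = target(:lead, ...)

p forward_all(1, 2, k: 3) { :from_block }
p forward_with_lead(1, k: 2)
```

A check like this stays valid whether or not the running Ruby includes the patch, which makes it usable as a regression test when comparing allocation counts across builds.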
2024-06-03Avoid unnecessary writes to ISEQ during GCJohn Hawthorn
On mark we check whether a callcache has been invalidated and if it has we replace it with the empty callcache, rb_vm_empty_cc(). However we also consider the empty callcache to not be active, and so previously would overwrite it with itself. These additional writes are problematic because they may force Copy-on-Write to occur on the memory page, increasing system memory use.
2024-05-20[PRISM] Respect eval coverage settingKevin Newton
2024-05-03Rename `vast` to `ast_value`yui-knk
There is an English word "vast". This commit changes the name to a clearer one to avoid confusion.
2024-05-01[PRISM] Respect frozen_string_literal option in RubyVM::InstructionSequence.compileKevin Newton
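A short sketch of what respecting the option means in practice, assuming CRuby. The helper `compiled_literal_frozen?` is hypothetical; it compiles a one-literal program with the `frozen_string_literal` compile option set either way and reports whether the evaluated string is frozen.

```ruby
# Hypothetical helper: compile a program consisting of one string literal with
# the given frozen_string_literal setting and check the resulting object.
def compiled_literal_frozen?(frozen)
  iseq = RubyVM::InstructionSequence.compile(
    '"literal"', nil, nil, 1, frozen_string_literal: frozen
  )
  iseq.eval.frozen?
end

p compiled_literal_frozen?(true)
p compiled_literal_frozen?(false)
```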
2024-04-27Add line_count field to rb_ast_body_tHASUMI Hitoshi
This patch adds an `int line_count` field to the `rb_ast_body_t` structure. In exchange, we no longer cast `script_lines` to Fixnum.

## Background

Ref https://github.com/ruby/ruby/pull/10618

In the PR above, we have decoupled IMEMO from `rb_ast_t`. This means we could lift the five-words-restriction of the structure that forced us to unionize `rb_ast_t *` and `FIXNUM` in one field.

## Related refactor

- Remove the second parameter of `rb_ruby_ast_new()` function

## Attention

I will remove the code that assigns -1 to line_count, in `rb_binding_add_dynavars()` of vm.c, because I don't think it is necessary. But I will make another PR for this so that we can atomically revert in case I was wrong (see the comment on the code).
2024-04-26[PRISM] Enable coverage in eval ISEQsKevin Newton
2024-04-26[PRISM] Enable coverage in top and main iseqsKevin Newton
2024-04-26[Universal parser] Decouple IMEMO from rb_ast_tHASUMI Hitoshi
This patch removes the `VALUE flags` member from the `rb_ast_t` structure, making `rb_ast_t` no longer an IMEMO object.

## Background

We are trying to make the Ruby parser generated from parse.y a universal parser that can be used by other implementations such as mruby. To achieve this, it is necessary to exclude VALUE and IMEMO from parse.y, AST, and NODE.

## Summary (file by file)

- `rubyparser.h`
  - Remove the `VALUE flags` member from `rb_ast_t`
- `ruby_parser.c` and `internal/ruby_parser.h`
  - Use TypedData_Make_Struct VALUE which wraps `rb_ast_t` in `ast_alloc()` so that GC can manage it
  - You can retrieve `rb_ast_t` from the VALUE by `rb_ruby_ast_data_get()`
  - Change the return type of `rb_parser_compile_XXXX()` functions from `rb_ast_t *` to `VALUE`
  - `rb_ruby_ast_new()`, which internally calls `ast_alloc()`, is to create VALUE vast outside ruby_parser.c
- `iseq.c` and `vm_core.h`
  - Amend the first parameter of `rb_iseq_new_XXXX()` functions from `rb_ast_body_t *` to `VALUE`
  - This keeps the VALUE of AST on the machine stack to prevent being removed by GC
- `ast.c`
  - Almost all changes replace `rb_ast_t *ast` with `VALUE vast` (sorry for the big diff)
  - Fix `node_memsize()`
    - Now it includes `rb_ast_local_table_link`, `tokens` and script_lines
- `compile.c`, `load.c`, `node.c`, `parse.y`, `proc.c`, `ruby.c`, `template/prelude.c.tmpl`, `vm.c` and `vm_eval.c`
  - Follow-up due to the above changes
- `imemo.{c|h}`
  - If an object with `imemo_ast` appears, consider it a bug

Co-authored-by: Nobuyoshi Nakada <[email protected]>
2024-04-25YJIT: Optimize local variables when EP == BP (take 2) (#10607)Takashi Kokubun
* Revert "Revert "YJIT: Optimize local variables when EP == BP" (#10584)" This reverts commit c8783441952217c18e523749c821f82cd7e5d222. * YJIT: Take care of GC references in ISEQ invariants Co-authored-by: Alan Wu <[email protected]> --------- Co-authored-by: Alan Wu <[email protected]>
2024-04-18Don't mark empty singleton cc'seileencodes
These cc's aren't managed by the garbage collector so we shouldn't try to mark and move them.
2024-04-17`ISeq#to_a` respects `use_block` statusKoichi Sasada
```ruby b = RubyVM::InstructionSequence.compile('def f = yield; def g = nil').to_a pp b #=> ... {:use_block=>true}, ... ```
2024-04-15[Universal parser] DeVALUE of p->debug_lines and ast->body.script_linesHASUMI Hitoshi
This patch is part of universal parser work.

## Summary

- Decouple VALUE from members below:
  - `(struct parser_params *)->debug_lines`
  - `(rb_ast_t *)->body.script_lines`
- Instead, they are now `rb_parser_ary_t *`
  - They can also be a `(VALUE)FIXNUM` as before to hold line count
- `ISEQ_BODY(iseq)->variable.script_lines` remains VALUE
  - In order to do this,
    - Add `VALUE script_lines` param to `rb_iseq_new_with_opt()`
    - Introduce `rb_parser_build_script_lines_from()` to convert `rb_parser_ary_t *` into `VALUE`

## Other details

- Extend `rb_parser_ary_t *`. It previously could only store `rb_parser_ast_token *`, now can store script_lines, too
- Change tactics of building the top-level `SCRIPT_LINES__` in `yycompile0()`
  - Before: While parsing, each line of the script is added to `SCRIPT_LINES__[path]`
  - After: After `yyparse(p)`, `SCRIPT_LINES__[path]` will be built from `p->debug_lines`
- Remove the second parameter of `rb_parser_set_script_lines()` to make it simple
- Introduce `script_lines_free()` to be called from `rb_ast_free()` because the GC no longer takes care of the script_lines
- Introduce `rb_parser_string_deep_copy()` in parse.y to maintain script_lines when `rb_ruby_parser_free()` is called
  - With regard to this, please see *Future tasks* below

## Future tasks

- Decouple IMEMO from `rb_ast_t *`
  - This lifts the five-members-restriction of Ruby object,
  - So we will be able to move the ownership of the `lex.string_buffer` from parser to AST
  - Then we remove `rb_parser_string_deep_copy()` to make the whole thing simple
2024-04-15show warning for unused blockKoichi Sasada
With verbose mode (-w), the interpreter shows a warning if a block is passed to a method which does not use the given block.

Warning on:

* the invoked method is written in C
* the invoked method is not `initialize`
* not invoked with `super`
* the first time on the call-site with the invoked method (`obj.foo{}` will be warned once if `foo` is same method)

[Feature #15554]

`Primitive.attr! :use_block` is introduced to declare that primitive functions (written in C) will use passed block.

For minitest, test needs some tweak, so use https://github.com/minitest/minitest/commit/ea9caafc0754b1d6236a490d59e624b53209734a for `test-bundled-gems`.
2024-04-03Reapply "Mark iseq structs with rb_gc_mark_movable"Peter Zhu
This reverts commit 16c18eafb579cf2263c7e0057c4c81358fe62075.
2024-04-02[PRISM] Fix ISEQ loadKevin Newton
2024-03-29[PRISM] Have RubyVM::InstructionSequence.compile respect --parser=prismKevin Newton
2024-03-27[PRISM] Pass --enable-frozen-string-literal through to evalsKevin Newton