ruby.git - The Ruby Programming Language

Age	Commit message	Author
2025-02-13	[Feature #21116] Extract RJIT as a third-party gem	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/12740
2024-09-05	Optimized instruction for Hash#freeze	Étienne Barrié
	If a Hash which is empty or only using literals is frozen, we detect this as a peephole optimization and change the instructions to be `opt_hash_freeze`. [Feature #20684] Co-authored-by: Jean Boussier Notes: Merged: https://github.com/ruby/ruby/pull/11406
2024-09-05	Optimized instruction for Array#freeze	Étienne Barrié
	If an Array which is empty or only using literals is frozen, we detect this as a peephole optimization and change the instructions to be `opt_ary_freeze`. [Feature #20684] Co-authored-by: Jean Boussier Notes: Merged: https://github.com/ruby/ruby/pull/11406
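As a rough illustration of the two peephole optimizations above, the sketch below freezes an empty Array and an empty Hash literal and dumps the compiled bytecode with `RubyVM::InstructionSequence`. Whether `opt_ary_freeze` or `opt_hash_freeze` appears in the listing depends on the Ruby build that runs it (the assumption is a CRuby that contains these commits), so the code prints the disassembly rather than asserting on instruction names. The helper name `disasm_for` is hypothetical.

```ruby
# Hypothetical helper: compile a source snippet and return its bytecode listing.
def disasm_for(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# The observable behavior is the same with or without the optimization:
# both literals end up frozen.
empty_ary  = [].freeze
empty_hash = {}.freeze

# Inspect which instructions the compiler chose on this Ruby build.
puts disasm_for("[].freeze")
puts disasm_for("{}.freeze")
```

On builds without the optimization, the listing is expected to show a literal-construction instruction followed by an ordinary `freeze` call.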
2024-08-13	Delete newarraykwsplat	Alan Wu
	The pushtoarraykwsplat instruction was designed to replace newarraykwsplat, and we now meet the condition for deletion mentioned in 77c1233f79a0f96a081b70da533fbbde4f3037fa. Notes: Merged: https://github.com/ruby/ruby/pull/11371 Merged-By: XrXr
2024-06-18	Optimized forwarding callers and callees	Aaron Patterson
	This patch optimizes forwarding callers and callees. It only optimizes methods that only take `...` as their parameter, and then pass `...` to other calls.

Calls it optimizes look like this:

```ruby
def bar(a) = a
def foo(...) = bar(...) # optimized
foo(123)
```

```ruby
def bar(*a) = a
def foo(...) = bar(1, 2, ...) # optimized
foo(123)
```

```ruby
def bar(*a) = a
def foo(...)
  list = [1, 2]
  bar(list, ...) # optimized
end
foo(123)
```

All variants of the above but using `super` are also optimized, including a bare super like this:

```ruby
def foo(...)
  super
end
```

This patch eliminates intermediate allocations made when calling methods that accept `...`. We can observe allocation elimination like this:

```ruby
def m
  x = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - x
end

def bar(a) = a
def foo(...) = bar(...)

def test
  m { foo(123) }
end

test
p test # allocates 1 object on master, but 0 objects with this patch
```

```ruby
def bar(a, b:) = a + b
def foo(...) = bar(...)

def test
  m { foo(1, b: 2) }
end

test
p test # allocates 2 objects on master, but 0 objects with this patch
```

How does it work?
-----------------

This patch works by using a dynamic stack size when passing forwarded parameters to callees. The caller's info object (known as the "CI") contains the stack size of the parameters, so we pass the CI object itself as a parameter to the callee. When forwarding parameters, the forwarding ISeq uses the caller's CI to determine how much stack to copy, then copies the caller's stack before calling the callee. The CI at the forwarded call site is adjusted using information from the caller's CI.

I think this description is kind of confusing, so let's walk through an example with code.

```ruby
def delegatee(a, b) = a + b

def delegator(...)
  delegatee(...)  # CI2 (FORWARDING)
end

def caller
  delegator(1, 2) # CI1 (argc: 2)
end
```

Before we call the delegator method, the stack looks like this:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   |
              5|   delegatee(...)  # CI2 (FORWARDING)  |
              6| end                                   |
              7|                                       |
              8| def caller                            |
          ->  9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The ISeq for `delegator` is tagged as "forwardable", so when `caller` calls in to `delegator`, it writes `CI1` on to the stack as a local variable for the `delegator` method. The `delegator` method has a special local called `...` that holds the caller's CI object.

Here is the ISeq disasm for `delegator`:

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

The local called `...` will contain the caller's CI: CI1.

Here is the stack when we enter `delegator`:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
          ->  4|   #                                   | CI1 (argc: 2)
              5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            |
              9|   delegator(1, 2) # CI1 (argc: 2)     |
             10| end                                   |
```

The CI at `delegatee` on line 5 is tagged as "FORWARDING", so it knows to memcopy the caller's stack before calling `delegatee`. In this case, it will memcopy self, 1, and 2 to the stack before calling `delegatee`. It knows how much memory to copy from the caller because `CI1` contains stack size information (argc: 2).

Before executing the `send` instruction, we push `...` on the stack. The `send` instruction pops `...`, and because it is tagged with `FORWARDING`, it knows to memcopy (using the information in the CI it just popped):

```
== disasm: #<ISeq:delegator@-e:1 (1,0)-(1,39)>
local table (size: 1, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 1] "..."@0
0000 putself                                                          (   1)[LiCa]
0001 getlocal_WC_0                          "..."@0
0003 send                                   <calldata!mid:delegatee, argc:0, FCALL|FORWARDING>, nil
0006 leave                                  [Re]
```

Instruction 0001 puts the caller's CI on the stack. `send` is tagged with FORWARDING, so it reads the CI and _copies_ the caller's stack to this stack:

```
Executing Line | Code                                  | Stack
---------------+---------------------------------------+--------
              1| def delegatee(a, b) = a + b           | self
              2|                                       | 1
              3| def delegator(...)                    | 2
              4|   #                                   | CI1 (argc: 2)
          ->  5|   delegatee(...)  # CI2 (FORWARDING)  | cref_or_me
              6| end                                   | specval
              7|                                       | type
              8| def caller                            | self
              9|   delegator(1, 2) # CI1 (argc: 2)     | 1
             10| end                                   | 2
```

The "FORWARDING" call site combines information from CI1 with CI2 in order to support passing other values in addition to the `...` value, as well as perfectly forward splat args, kwargs, etc.

Since we're able to copy the stack from `caller` in to `delegator`'s stack, we can avoid allocating objects.

I want to do this to eliminate object allocations for delegate methods. My long term goal is to implement `Class#new` in Ruby and it uses `...`.

I was able to implement `Class#new` in Ruby [here](https://github.com/ruby/ruby/pull/9289). If we adopt the technique in this patch, then we can optimize allocating objects that take keyword parameters for `initialize`.

For example, this code will allocate 2 objects: one for `SomeObject`, and one for the kwargs:

```ruby
SomeObject.new(foo: 1)
```

If we combine this technique, plus implement `Class#new` in Ruby, then we can reduce allocations for this common operation.

Co-Authored-By: John Hawthorn
Co-Authored-By: Alan Wu
2024-06-05	Improve YJIT performance warning regression test	Jean Boussier
	[Bug #20522]
2024-05-01	YJIT: Fix `Struct` accessors not firing tracing events (#10690)	Alan Wu
	* YJIT: Fix `Struct` accessors not firing tracing events. Reading and writing to structs should fire `c_call` and `c_return`, but YJIT wasn't correctly dropping those calls when tracing. This has been missing since this functionality was added in 3081c83169c, but the added test only fails when run in isolation with `--yjit-call-threshold=1`. The test sometimes failed on CI.
	* RJIT: Fix `Struct` readers not firing tracing events. Same issue as YJIT, but it looks like RJIT doesn't support writing to structs, so only reading needs changing.
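To see what the fix above is about, a minimal sketch can record `c_call` and `c_return` events around a `Struct` reader and writer. The names `Point`, `events`, and `trace` are made up for illustration. Which events actually appear depends on the Ruby version and on whether a JIT is enabled, so the sketch prints them and makes no claim about the exact list.

```ruby
Point = Struct.new(:x)
point = Point.new(1)

events = []
# Record only the events that belong to the struct accessors.
trace = TracePoint.new(:c_call, :c_return) do |tp|
  events << [tp.event, tp.method_id] if %i[x x=].include?(tp.method_id)
end

# Tracing is active only inside the block.
trace.enable do
  point.x       # reader
  point.x = 2   # writer
end

p events
```

Running the same script with and without a JIT option is one way to compare the behavior the commit message describes.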
2024-03-19	Implement chilled strings	Étienne Barrié
	[Feature #20205] As a path toward enabling frozen string literals by default in the future, this commit introduces "chilled strings". From a user perspective chilled strings pretend to be frozen, but on the first attempt to mutate them, they lose their frozen status and emit a warning rather than raising a `FrozenError`. Implementation-wise, `rb_compile_option_struct.frozen_string_literal` is no longer a boolean but a tri-state of `enabled/disabled/unset`. When code is compiled with frozen string literals neither explicitly enabled nor disabled, string literals are compiled with a new `putchilledstring` instruction. This instruction is identical to `putstring` except it marks the String with the `STR_CHILLED (FL_USER3)` and `FL_FREEZE` flags. Chilled strings have the `FL_FREEZE` flag so as to minimize the need to check for chilled strings across the codebase, and to improve compatibility with C extensions.
Notes:
- `String#freeze`: clears the chilled flag.
- `String#-@`: acts as if the string was mutable.
- `String#+@`: acts as if the string was mutable.
- `String#clone`: copies the chilled flag.
Co-authored-by: Jean Boussier
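A small sketch of the user-visible difference described above, assuming the file has no `frozen_string_literal` magic comment and Ruby is started without a frozen-string-literal option. On a Ruby with chilled strings, the first mutation of the literal may emit a deprecation warning when such warnings are enabled; on older versions it is silent. The variable names are illustrative only.

```ruby
# A literal compiled in the default ("unset") mode: mutation still succeeds.
greeting = "hello"
greeting << "!"

# An explicitly frozen string is different: mutation raises FrozenError.
fixed = "fixed".freeze
outcome =
  begin
    fixed << "?"
    :mutated
  rescue FrozenError
    :frozen_error
  end

puts greeting
puts outcome
```

The sketch deliberately avoids printing `greeting.frozen?`, since what that returns for a chilled string is a version-specific detail.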
2024-03-06	Move FL_SINGLETON to FL_USER1	Jean Boussier
	This frees FL_USER0 on both T_MODULE and T_CLASS. Note: prior to this, FL_SINGLETON was never set on T_MODULE, so checking for `FL_SINGLETON` without first checking that `FL_TYPE` was `T_CLASS` was valid. That's no longer the case.
2024-01-18	RJIT: Properly reject keyword splat with `yield`	Alan Wu
	See the fix for YJIT.
2024-01-16	Drop obsoleted BUILTIN_ATTR_NO_GC attribute	Takashi Kokubun
	The thing that has used this in the past was very buggy, and we've never revisited it. Let's remove it until we need it again.
2023-12-25	Typofix under lib and test, tool directories	Hiroshi SHIBATA

2023-12-21	RJIT: Avoid retaining unrelated local variables in memory	Takashi Kokubun

2023-12-21	RJIT: Minimize string allocations in InsnCompiler	Takashi Kokubun

2023-12-21	RJIT: Convert opt_case_dispatch keys with #to_value	Takashi Kokubun
	comptime_key is a Ruby object and the value is not valid in machine code. This PR also implements `CMP r/m64, imm32 (Mod 01: [reg]+disp8)` that is now needed for running mail.gem benchmark.
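For readers unfamiliar with the encoding named above, here is a minimal sketch of how `CMP r/m64, imm32` with a `[reg]+disp8` operand is laid out in x86-64 machine code: a REX.W prefix, opcode `0x81` with `/7` in the ModRM reg field, Mod `01`, an 8-bit displacement, and a little-endian 32-bit immediate. This is not RJIT's assembler code; the method name and register table are hypothetical, and the sketch leaves out `rsp` (which needs a SIB byte) and `r8` to `r15` (which need REX.B).

```ruby
# Register numbers for the ModRM r/m field (subset; see the caveats above).
REGS = { rax: 0, rcx: 1, rdx: 2, rbx: 3, rbp: 5, rsi: 6, rdi: 7 }.freeze

# Hypothetical encoder for: cmp qword [base + disp8], imm32
def encode_cmp_m64_disp8_imm32(base, disp, imm)
  rm = REGS.fetch(base)
  raise ArgumentError, "disp8 out of range" unless (-128..127).cover?(disp)
  raise ArgumentError, "imm32 out of range" unless (-2**31...2**31).cover?(imm)

  rex_w  = 0x48                           # REX prefix with W=1 (64-bit operand)
  opcode = 0x81                           # group 1 opcode taking an imm32
  modrm  = (0b01 << 6) | (7 << 3) | rm    # Mod=01, reg=/7 (CMP), r/m=base
  [rex_w, opcode, modrm, disp & 0xff].pack("C*") + [imm].pack("l<")
end

bytes = encode_cmp_m64_disp8_imm32(:rax, 8, 0x10)
puts bytes.unpack1("H*") # cmp qword [rax+8], 0x10
```

The immediate is sign-extended to 64 bits by the CPU, which is why the sketch accepts it as a signed 32-bit value.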
2023-11-08	Refactor rb_shape_transition_shape_capa out	Jean Boussier
	Right now the caller of `rb_shape_get_next` needs to first check if there is capacity left, and if not call `rb_shape_transition_shape_capa` before it can call `rb_shape_get_next`. And on each of these it needs to check if we got a TOO_COMPLEX back. All this logic is duplicated in the interpreter, YJIT and RJIT. Instead we can have `rb_shape_get_next` do the capacity transition when needed. The caller can compare the old and new shapes' capacity to know if resizing is needed. It also can check for TOO_COMPLEX only once.
2023-10-10	Refactor rb_shape_transition_shape_capa to not accept capacity	Jean Boussier
	This way the growth factor is encapsulated, which allows rb_shape_transition_shape_capa to be smarter about ideal sizes.
2023-08-28	RJIT: Remove Type::CArray and limit use of Type::CString	Alan Wu
	See previous similar YJIT commit. Notes: Merged: https://github.com/ruby/ruby/pull/8299
2023-07-17	Remove __bp__ and speed-up bmethod calls (#8060)	Alan Wu
	Remove rb_control_frame_t::__bp__ and optimize bmethod calls

This commit removes the __bp__ field from rb_control_frame_t. It was introduced to help MJIT, but since MJIT was replaced by RJIT, we can use vm_base_ptr() to compute it from the SP of the previous control frame instead. Removing the field avoids needing to set it up when pushing new frames.

Simply removing __bp__ would cause crashes since RJIT and YJIT used a slightly different stack layout for bmethod calls than the interpreter. At the moment of the call, the two layouts looked as follows:

                ┌────────────┐    ┌────────────┐
                │ frame_base │    │ frame_base │
                ├────────────┤    ├────────────┤
                │    ...     │    │    ...     │
                ├────────────┤    ├────────────┤
                │    args    │    │    args    │
                ├────────────┤    └────────────┘<─prev_frame_sp
                │  receiver  │
 prev_frame_sp─>└────────────┘
                  RJIT & YJIT       interpreter

Essentially, vm_base_ptr() needs to compute the address to frame_base given prev_frame_sp in the diagrams. The presence of the receiver created an off-by-one situation.

Make the interpreter use the layout the JITs use for iseq-to-iseq bmethod calls. Doing so removes unnecessary argument shifting and vm_exec_core() re-entry from the interpreter, yielding a speed improvement visible through `benchmark/vm_defined_method.yml`:

    patched:   7578743.1 i/s
     master:   4796596.3 i/s - 1.58x slower

C-to-iseq bmethod calls now store one more VALUE than before, but that should have negligible impact on overall performance.

Note that re-entering vm_exec_core() used to be necessary for firing TracePoint events, but that's no longer the case since 9121e57a5f50bc91bae48b3b91edb283bf96cb6b.

Closes ruby/ruby#6428
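For context, a "bmethod" is a method whose body is a block, which is what `define_method` with a block produces. The sketch below defines the same logic once as a bmethod and once as a plain `def` method so the two call paths can be compared. The class and method names are made up for illustration, and the sketch does not reproduce the benchmark cited in the commit.

```ruby
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  # A method whose body is a block: this is the bmethod case.
  define_method(:add_via_block) { |by = 1| @count += by }

  # The same logic as an ordinary `def` method, for comparison.
  def add_via_def(by = 1)
    @count += by
  end
end

counter = Counter.new
counter.add_via_block(2)
counter.add_via_def(3)
puts counter.count
```

Both calls behave identically from Ruby code; the commit only changes how the VM lays out the stack when it invokes the block-backed one.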
2023-07-13	Remove RARRAY_CONST_PTR_TRANSIENT	Peter Zhu
	RARRAY_CONST_PTR now does the same things as RARRAY_CONST_PTR_TRANSIENT. Notes: Merged: https://github.com/ruby/ruby/pull/8071
2023-07-04	YJIT: Fix autosplat miscomp for blocks with optionals (#8006)	Alan Wu
	* YJIT: Fix autosplat miscomp for blocks with optionals. When passing an array as the sole argument to `yield`, and the yieldee takes more than 1 optional parameter, the array is expanded similar to `*array` splat calls. This is called "autosplat" in `setup_parameters_complex()`. Previously, YJIT did not detect this autosplat condition. It passed the array without expanding it, deviating from interpreter behavior. Detect this condition and refuse to compile it. Fixes: Shopify/yjit#313
	* RJIT: Fix autosplat miscomp for blocks with optionals. This mirrors the same issue as YJIT. See previous commit.
	Notes: Merged-By: maximecb
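The autosplat behavior described above can be observed from plain Ruby. In this minimal sketch (method and variable names are illustrative), a block with two optional parameters receives a single array through `yield` and has it spread across its parameters, while a lambda with the same parameter list keeps the array intact.

```ruby
def yield_pair
  yield [10, 20] # a single array argument
end

# Block (non-lambda) semantics: the array is spread over a and b.
splatted = yield_pair { |a = 1, b = 2| [a, b] }

# Lambda semantics: no autosplat, so the array lands in a and b keeps its default.
strict = lambda { |a = 1, b = 2| [a, b] }.call([10, 20])

p splatted # => [10, 20]
p strict   # => [[10, 20], 2]
```

A JIT that passes the array through unexpanded in the first case would produce the second result instead, which is the miscompilation the commit guards against.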
2023-06-06	Unify length field for embedded and heap strings (#7908)	Peter Zhu
	* Unify length field for embedded and heap strings The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it. * Remove RSTRING_EMBED_LEN Notes: Merged-By: maximecb
2023-04-26	RJIT: Fix unspecified_bits with locals	Takashi Kokubun

2023-04-18	Update RJIT to support newarray_send	Aaron Patterson
	This also adds max / hash support Notes: Merged: https://github.com/ruby/ruby/pull/6090
2023-04-12	RJIT: argc check in known cfuncs	John Hawthorn
	Notes: Merged: https://github.com/ruby/ruby/pull/7697
2023-04-05	RJIT: Skip a class guard if known to be T_STRING	Takashi Kokubun

2023-04-05	RJIT: Handle include_all argument of respond_to?	Takashi Kokubun

2023-04-04	RJIT: Remove unused variables	Takashi Kokubun

2023-04-04	RJIT: Always use guard_two_fixnums	Takashi Kokubun

2023-04-04	RJIT: Eliminate known-result guards for blockarg	Takashi Kokubun

2023-04-04	RJIT: Eliminate known-result branches	Takashi Kokubun

2023-04-04	RJIT: Propagate argument types on method calls	Takashi Kokubun

2023-04-04	RJIT: Fix mapping offsets in stack_swap	Takashi Kokubun

2023-04-04	[Feature #19579] Remove !USE_RVARGC code (#7655)	Peter Zhu
	Remove !USE_RVARGC code [Feature #19579] The Variable Width Allocation feature was turned on by default in Ruby 3.2. Since then, we haven't received bug reports or backports to the non-Variable Width Allocation code paths, so we assume that nobody is using it. We also don't plan on maintaining the non-Variable Width Allocation code, so we are going to remove it. Notes: Merged-By: maximecb
2023-04-04	RJIT: Fix the argument of shift_stack	Takashi Kokubun

2023-04-04	RJIT: Fix the argument for defined	Takashi Kokubun

2023-04-04	RJIT: Add --rjit-verify-ctx option	Takashi Kokubun

2023-04-04	RJIT: Fix arguments to SPECIAL_CONST_P	Takashi Kokubun

2023-04-03	RJIT: Update type information on setlocal	Takashi Kokubun

2023-04-03	RJIT: Fix arguments for shift_stack	Takashi Kokubun

2023-04-03	Fix a test_rubyoptions failure	Takashi Kokubun

2023-04-03	RJIT: Propagate self's type information	Takashi Kokubun

2023-04-03	RJIT: Upgrade type on jit_guard_known_class	Takashi Kokubun

2023-04-03	RJIT: Upgrade type to Fixnum after guard	Takashi Kokubun

2023-04-02	RJIT: Upgrade type to String after guard	Takashi Kokubun

2023-04-02	RJIT: Upgrade type to Array after guard	Takashi Kokubun

2023-04-02	RJIT: Upgrade type to UnknownHeap after guard	Takashi Kokubun

2023-04-02	RJIT: Update type information on setn insn	Takashi Kokubun

2023-04-02	RJIT: Swap type information on swap insn	Takashi Kokubun

2023-04-02	RJIT: Store type information in Context	Takashi Kokubun