ruby.git - The Ruby Programming Language

Age	Commit message (Collapse)	Author
14 hours	Turn `rb_classext_t.fields` into a T_IMEMO/class_fields	Jean Boussier
	This behave almost exactly as a T_OBJECT, the layout is entirely compatible. This aims to solve two problems. First, it solves the problem of namspaced classes having a single `shape_id`. Now each namespaced classext has an object that can hold the namespace specific shape. Second, it open the door to later make class instance variable writes atomics, hence be able to read class variables without locking the VM. In the future, in multi-ractor mode, we can do the write on a copy of the `fields_obj` and then atomically swap it. Considerations: - Right now the `RClass` shape_id is always synchronized, but with namespace we should likely mark classes that have multiple namespace with a specific shape flag. Notes: Merged: https://github.com/ruby/ruby/pull/13411
2025-04-25	Inline Class#new.	Aaron Patterson
	This commit inlines instructions for Class#new. To make this work, we added a new YARV instructions, `opt_new`. `opt_new` checks whether or not the `new` method is the default allocator method. If it is, it allocates the object, and pushes the instance on the stack. If not, the instruction jumps to the "slow path" method call instructions. Old instructions: ``` > ruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0004 leave ``` New instructions: ``` > ./miniruby --dump=insns -e'Object.new' == disasm: #<ISeq:<main>@-e:1 (1,0)-(1,10)> 0000 opt_getconstant_path <ic:0 Object> ( 1)[Li] 0002 putnil 0003 swap 0004 opt_new <calldata!mid:new, argc:0, ARGS_SIMPLE>, 11 0007 opt_send_without_block <calldata!mid:initialize, argc:0, FCALL\|ARGS_SIMPLE> 0009 jump 14 0011 opt_send_without_block <calldata!mid:new, argc:0, ARGS_SIMPLE> 0013 swap 0014 pop 0015 leave ``` This commit speeds up basic object allocation (`Foo.new`) by 60%, but classes that take keyword parameters see an even bigger benefit because no hash is allocated when instantiating the object (3x to 6x faster). Here is an example that uses `Hash.new(capacity: 0)`: ``` > hyperfine "ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" "./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end'" Benchmark 1: ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 1.082 s ± 0.004 s [User: 1.074 s, System: 0.008 s] Range (min … max): 1.076 s … 1.088 s 10 runs Benchmark 2: ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' Time (mean ± σ): 627.9 ms ± 3.5 ms [User: 622.7 ms, System: 4.8 ms] Range (min … max): 622.7 ms … 633.2 ms 10 runs Summary ./ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ran 1.72 ± 0.01 times faster than ruby --disable-gems -e'i = 0; while i < 10_000_000; Hash.new(capacity: 0); i += 1; end' ``` This commit changes the backtrace for `initialize`: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb class Foo def initialize puts caller end end def hello Foo.new end hello aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] test.rb:8:in 'Class#new' test.rb:8:in 'Object#hello' test.rb:11:in '<main>' aaron@tc ~/g/ruby (inline-new)> ./miniruby -v test.rb ruby 3.5.0dev (2025-03-28T23:59:40Z inline-new c4157884e4) +PRISM [arm64-darwin24] test.rb:8:in 'Object#hello' test.rb:11:in '<main>' ``` It also increases memory usage for calls to `new` by 122 bytes: ``` aaron@tc ~/g/ruby (inline-new)> cat test.rb require "objspace" class Foo def initialize puts caller end end def hello Foo.new end puts ObjectSpace.memsize_of(RubyVM::InstructionSequence.of(method(:hello))) aaron@tc ~/g/ruby (inline-new)> make runruby RUBY_ON_BUG='gdb -x ./.gdbinit -p' ./miniruby -I./lib -I. -I.ext/common ./tool/runruby.rb --extout=.ext -- --disable-gems ./test.rb 656 aaron@tc ~/g/ruby (inline-new)> ruby -v test.rb ruby 3.4.2 (2025-02-15 revision d2930f8e7a) +PRISM [arm64-darwin24] 544 ``` Thanks to @ko1 for coming up with this idea! Co-Authored-By: John Hawthorn <[email protected]>
2024-06-02	Use real filename instead of `__FILE__`	Daisuke Fujimura (fd0)

2023-08-25	Add Missing Counters to `rb_debug_counter_type` enum (#8297)	Zack Deveau
	Add missing counters to rb_debug_counter_type enum On master we have calls to the RB_DEBUG_COUNTER_INC macro for counters that are not getting defined in the rb_debug_counter_type enum. This commit adds those that are missing in order for compilation to pass with -DUSE_RUBY_DEBUG_LOG. Notes: Merged-By: k0kubun <[email protected]>
2023-07-13	Remove unused references to the transient heap	Peter Zhu
	Notes: Merged: https://github.com/ruby/ruby/pull/8071
2023-07-13	[Feature #19730] Remove transient heap	Peter Zhu
	Notes: Merged: https://github.com/ruby/ruby/pull/7942
2023-02-21	Refactor to separate marking and sweeping phases	Peter Zhu
	This commit separates the marking and sweeping phases so that marking functions do not directly call sweeping functions. Notes: Merged: https://github.com/ruby/ruby/pull/7304
2023-02-15	Refactor / document instance variable debug counters	Aaron Patterson
	This commit is refactoring and documenting the debug counters related to instance variables. Notes: Merged: https://github.com/ruby/ruby/pull/7305
2022-12-15	Transition complex objects to "too complex" shape	Jemma Issroff
	When an object becomes "too complex" (in other words it has too many variations in the shape tree), we transition it to use a "too complex" shape and use a hash for storing instance variables. Without this patch, there were rare cases where shape tree growth could "explode" and cause performance degradation on what would otherwise have been cached fast paths. This patch puts a limit on shape tree growth, and gracefully degrades in the rare case where there could be a factorial growth in the shape tree. For example: ```ruby class NG; end HUGE_NUMBER.times do NG.new.instance_variable_set(:"@unique_ivar_#{_1}", 1) end ``` We consider objects to be "too complex" when the object's class has more than SHAPE_MAX_VARIATIONS (currently 8) leaf nodes in the shape tree and the object introduces a new variation (a new leaf node) associated with that class. For example, new variations on instances of the following class would be considered "too complex" because those instances create more than 8 leaves in the shape tree: ```ruby class Foo; end 9.times { Foo.new.instance_variable_set(":@uniq_#{_1}", 1) } ``` However, the following class is not too complex because it only has one leaf in the shape tree: ```ruby class Foo def initialize @a = @b = @c = @d = @e = @f = @g = @h = @i = nil end end 9.times { Foo.new } `` This case is rare, so we don't expect this change to impact performance of most applications, but it needs to be handled. Co-Authored-By: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/6931
2022-12-06	MJIT: Remove obsoleted MJIT counters	Takashi Kokubun

2022-12-06	MJIT: Remove an unused argument and unused counters	Takashi Kokubun
	I plan to rebuild MJIT metrics later, not using debug counters.
2022-11-13	Remove unused debug counters	Takashi Kokubun
	The structure and readability of jit_exec is messed up right now. I'd like to help the current situation by this for now. I'll resurrect them when I need it again.
2022-10-11	Revert "Revert "This commit implements the Object Shapes technique in CRuby.""	Jemma Issroff
	This reverts commit 9a6803c90b817f70389cae10d60b50ad752da48f.
2022-09-30	Revert "This commit implements the Object Shapes technique in CRuby."	Aaron Patterson
	This reverts commit 68bc9e2e97d12f80df0d113e284864e225f771c2.
2022-09-28	This commit implements the Object Shapes technique in CRuby.	Jemma Issroff
	Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <[email protected]> Co-Authored-By: Eileen M. Uchitelle <[email protected]> Co-Authored-By: John Hawthorn <[email protected]>
2022-09-26	Revert this until we can figure out WB issues or remove shapes from GC	Aaron Patterson
	Revert "* expand tabs. [ci skip]" This reverts commit 830b5b5c351c5c6efa5ad461ae4ec5085e5f0275. Revert "This commit implements the Object Shapes technique in CRuby." This reverts commit 9ddfd2ca004d1952be79cf1b84c52c79a55978f4.
2022-09-26	This commit implements the Object Shapes technique in CRuby.	Jemma Issroff
	Object Shapes is used for accessing instance variables and representing the "frozenness" of objects. Object instances have a "shape" and the shape represents some attributes of the object (currently which instance variables are set and the "frozenness"). Shapes form a tree data structure, and when a new instance variable is set on an object, that object "transitions" to a new shape in the shape tree. Each shape has an ID that is used for caching. The shape structure is independent of class, so objects of different types can have the same shape. For example: ```ruby class Foo def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end class Bar def initialize # Starts with shape id 0 @a = 1 # transitions to shape id 1 @b = 1 # transitions to shape id 2 end end foo = Foo.new # `foo` has shape id 2 bar = Bar.new # `bar` has shape id 2 ``` Both `foo` and `bar` instances have the same shape because they both set instance variables of the same name in the same order. This technique can help to improve inline cache hits as well as generate more efficient machine code in JIT compilers. This commit also adds some methods for debugging shapes on objects. See `RubyVM::Shape` for more details. For more context on Object Shapes, see [Feature: #18776] Co-Authored-By: Aaron Patterson <[email protected]> Co-Authored-By: Eileen M. Uchitelle <[email protected]> Co-Authored-By: John Hawthorn <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/6386
2022-08-19	Rename mjit_exec to jit_exec (#6262)	Takashi Kokubun
	* Rename mjit_exec to jit_exec * Rename mjit_exec_slowpath to mjit_check_iseq * Remove mjit_exec references from comments Notes: Merged-By: k0kubun <[email protected]>
2021-11-19	optimize `Struct` getter/setter	Koichi Sasada
	Introduce new optimized method type `OPTIMIZED_METHOD_TYPE_STRUCT_AREF/ASET` with index information. Notes: Merged: https://github.com/ruby/ruby/pull/5131
2021-06-18	Add a cache for class variables	eileencodes
	Redo of 34a2acdac788602c14bf05fb616215187badd504 and 931138b00696419945dc03e10f033b1f53cd50f3 which were reverted. GitHub PR #4340. This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105c) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be009) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/4544
2021-05-11	Revert "Filling cache values on cvar write"	Aaron Patterson
	This reverts commit 08de37f9fa3469365e6b5c964689ae2bae0eb9f3. This reverts commit e8ae922b62adb00a80d3d4c49f7d7b0e6026eaba.
2021-05-11	Filling cache values on cvar write	eileencodes
	Instead of on read. Once it's in the inline cache we never have to make one again. We want to eventually put the value into the cache, and the best opportunity to do that is when you write the value. Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-05-11	Add a cache for class variables	eileencodes
	This change implements a cache for class variables. Previously there was no cache for cvars. Cvar access is slow due to needing to travel all the way up th ancestor tree before returning the cvar value. The deeper the ancestor tree the slower cvar access will be. The benefits of the cache are more visible with a higher number of included modules due to the way Ruby looks up class variables. The benchmark here includes 26 modules and shows with the cache, this branch is 6.5x faster when accessing class variables. ``` compare-ruby: ruby 3.1.0dev (2021-03-15T06:22:34Z master 9e5105ca45) [x86_64-darwin19] built-ruby: ruby 3.1.0dev (2021-03-15T12:12:44Z add-cache-for-clas.. c6be0093ae) [x86_64-darwin19] \| \|compare-ruby\|built-ruby\| \|:--------\|-----------:\|---------:\| \|vm_cvar \| 5.681M\| 36.980M\| \| \| -\| 6.51x\| ``` Benchmark.ips calling `ActiveRecord::Base.logger` from within a Rails application. ActiveRecord::Base.logger has 71 ancestors. The more ancestors a tree has, the more clear the speed increase. IE if Base had only one ancestor we'd see no improvement. This benchmark is run on a vanilla Rails application. Benchmark code: ```ruby require "benchmark/ips" require_relative "config/environment" Benchmark.ips do \|x\| x.report "logger" do ActiveRecord::Base.logger end end ``` Ruby 3.0 master / Rails 6.1: ``` Warming up -------------------------------------- logger 155.251k i/100ms Calculating ------------------------------------- ``` Ruby 3.0 with cvar cache / Rails 6.1: ``` Warming up -------------------------------------- logger 1.546M i/100ms Calculating ------------------------------------- logger 14.857M (± 4.8%) i/s - 74.198M in 5.006202s ``` Lastly we ran a benchmark to demonstate the difference between master and our cache when the number of modules increases. This benchmark measures 1 ancestor, 30 ancestors, and 100 ancestors. Ruby 3.0 master: ``` Warming up -------------------------------------- 1 module 1.231M i/100ms 30 modules 432.020k i/100ms 100 modules 145.399k i/100ms Calculating ------------------------------------- 1 module 12.210M (± 2.1%) i/s - 61.553M in 5.043400s 30 modules 4.354M (± 2.7%) i/s - 22.033M in 5.063839s 100 modules 1.434M (± 2.9%) i/s - 7.270M in 5.072531s Comparison: 1 module: 12209958.3 i/s 30 modules: 4354217.8 i/s - 2.80x (± 0.00) slower 100 modules: 1434447.3 i/s - 8.51x (± 0.00) slower ``` Ruby 3.0 with cvar cache: ``` Warming up -------------------------------------- 1 module 1.641M i/100ms 30 modules 1.655M i/100ms 100 modules 1.620M i/100ms Calculating ------------------------------------- 1 module 16.279M (± 3.8%) i/s - 82.038M in 5.046923s 30 modules 15.891M (± 3.9%) i/s - 79.459M in 5.007958s 100 modules 16.087M (± 3.6%) i/s - 81.005M in 5.041931s Comparison: 1 module: 16279458.0 i/s 100 modules: 16087484.6 i/s - same-ish: difference falls within error 30 modules: 15891406.2 i/s - same-ish: difference falls within error ``` Co-authored-by: Aaron Patterson <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/4340
2021-04-26	Fix some typos by spell checker	Ryuta Kamizono
	Notes: Merged: https://github.com/ruby/ruby/pull/4414
2021-01-29	global call-cache cache table for rb_funcall*	Koichi Sasada
	rb_funcall* (rb_funcall(), rb_funcallv(), ...) functions invokes Ruby's method with given receiver. Ruby 2.7 introduced inline method cache with static memory area. However, Ruby 3.0 reimplemented the method cache data structures and the inline cache was removed. Without inline cache, rb_funcall* searched methods everytime. Most of cases per-Class Method Cache (pCMC) will be helped but pCMC requires VM-wide locking and it hurts performance on multi-Ractor execution, especially all Ractors calls methods with rb_funcall. This patch introduced Global Call-Cache Cache Table (gccct) for rb_funcall. Call-Cache was introduced from Ruby 3.0 to manage method cache entry atomically and gccct enables method-caching without VM-wide locking. This table solves the performance issue on multi-ractor execution. [Bug #17497] Ruby-level method invocation does not use gccct because it has inline-method-cache and the table size is limited. Basically rb_funcall* is not used frequently, so 1023 entries can be enough. We will revisit the table size if it is not enough. Notes: Merged: https://github.com/ruby/ruby/pull/4129
2021-01-05	enable constant cache on ractors	Koichi Sasada
	constant cache `IC` is accessed by non-atomic manner and there are thread-safety issues, so Ruby 3.0 disables to use const cache on non-main ractors. This patch enables it by introducing `imemo_constcache` and allocates it by every re-fill of const cache like `imemo_callcache`. [Bug #17510] Now `IC` only has one entry `IC::entry` and it points to `iseq_inline_constant_cache_entry`, managed by T_IMEMO object. `IC` is atomic data structure so `rb_mjit_before_vm_ic_update()` and `rb_mjit_after_vm_ic_update()` is not needed. Notes: Merged: https://github.com/ruby/ruby/pull/4022
2020-12-17	add debug counters for gc start events	Koichi Sasada

2020-12-17	make RB_DEBUG_COUNTER_INC()_thread-safe	Koichi Sasada
	Notes: Merged: https://github.com/ruby/ruby/pull/3915
2020-12-16	add vm_sync debug counters	Koichi Sasada
	* vm_sync_lock * vm_sync_lock_enter * vm_sync_lock_enter_nb * vm_sync_lock_enter_cr * vm_sync_barrier Notes: Merged: https://github.com/ruby/ruby/pull/3910
2020-12-15	add several debug counters	Koichi Sasada
	add cc_found_in_ccs (renamed from cc_found_ccs), cc_not_found_in_ccs, call0_public, call0_other debug counters to measure more details. also it contains several modification. Notes: Merged: https://github.com/ruby/ruby/pull/3903
2020-12-14	fix condition and add another debug counter	Koichi Sasada
	mc_inline_miss_same_def is added to check same method or not. Also the mc_inline_miss_same_cc calculation was fixed.
2020-12-14	add ccs_not_found debug counter	Koichi Sasada
	ccs_not_found to count not found in ccs table.
2020-12-14	add debug counters to survey the IMC miss	Koichi Sasada

2020-12-14	add cc_invalidate_negative debug counter	Koichi Sasada
	counts for invalidating negative cache. Notes: Merged: https://github.com/ruby/ruby/pull/3892
2020-11-09	Add debug counter for ivar inline cache misses that could hit	Aaron Patterson
	This commit adds a debug counter for the case where the inline cache missed but the ivar index table has an entry for that ivar. This is a case where a polymorphic cache could help Notes: Merged: https://github.com/ruby/ruby/pull/3750
2020-11-09	remove unused debug counter	Aaron Patterson
	Notes: Merged: https://github.com/ruby/ruby/pull/3740
2020-07-10	Explicit conversion to boolean to suppress shorten-64-to-32 warnings	Nobuyoshi Nakada

2020-05-28	Add a debug_counter for JIT cancel on leave	Takashi Kokubun

2020-05-11	sed -i 's\|ruby/impl\|ruby/internal\|'	卜部昌平
	To fix build failures. Notes: Merged: https://github.com/ruby/ruby/pull/3079
2020-05-11	sed -i s\|ruby/3\|ruby/impl\|g	卜部昌平
	This shall fix compile errors. Notes: Merged: https://github.com/ruby/ruby/pull/3079
2020-04-13	Make vm_call_cfunc_with_frame a fastpath (#3027)	Takashi Kokubun
	when there's no need to call CALLER_SETUP_ARG and CALLER_REMOVE_EMPTY_KW_SPLAT (i.e. !rb_splat_or_kwargs_p(ci) && !calling->kw_splat). Micro benchmark: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark/vm_send_cfunc.yml --repeat-count=4 before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after vm_send_cfunc 69.585M 88.724M i/s - 100.000M times in 1.437097s 1.127096s Comparison: vm_send_cfunc after: 88723605.2 i/s before: 69584737.1 i/s - 1.28x slower ``` Optcarrot: ``` $ benchmark-driver -v --rbenv 'before;after' benchmark.yml --repeat-count=12 --output=all before: ruby 2.8.0dev (2020-04-13T23:45:05Z master b9d3ceee8f) [x86_64-linux] after: ruby 2.8.0dev (2020-04-14T00:48:52Z no-splat-fastpath 418d363722) [x86_64-linux] Calculating ------------------------------------- before after Optcarrot Lan_Master.nes 50.76119601545175 42.73858236484051 fps 50.76388649761503 51.04211379912850 50.80930672252514 51.39455790755538 50.90236000778749 51.75656936556145 51.01744746340430 51.86875277356489 51.06495279015112 51.88692482485558 51.07785337168974 51.93429603190578 51.20163525187862 51.95768145071314 51.34671771913112 52.45577266040274 51.35918340835583 52.53163888762858 51.46641337418146 52.62172484121034 51.50835463462257 52.85064021113239 ``` Notes: Merged-By: k0kubun <[email protected]>
2020-04-08	Merge pull request #2991 from shyouhei/ruby.h	卜部昌平
	Split ruby.h Notes: Merged-By: shyouhei <[email protected]>
2020-03-30	Optimize exivar access on JIT-ed getivar	Takashi Kokubun
	JIT support of dd723771c11. $ benchmark-driver -v --rbenv 'before;before --jit;after --jit' benchmark/mjit_exivar.yml --repeat-count=4 before: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) [x86_64-linux] before --jit: ruby 2.8.0dev (2020-03-30T12:32:26Z master e5db3da9d3) +JIT [x86_64-linux] after --jit: ruby 2.8.0dev (2020-03-31T05:57:24Z mjit-exivar 128625baec) +JIT [x86_64-linux] Calculating ------------------------------------- before before --jit after --jit mjit_exivar 57.944M 53.579M 54.471M i/s - 200.000M times in 3.451588s 3.732772s 3.671687s Comparison: mjit_exivar before: 57944345.1 i/s after --jit: 54470876.7 i/s - 1.06x slower before --jit: 53579483.4 i/s - 1.08x slower
2020-03-16	Fix typos [ci skip]	Kazuhiro NISHIYAMA

2020-03-15	Add debug counter for unload_units	Takashi Kokubun
	changing add_iseq_to_process's debug counter name as well for comparison
2020-02-22	Introduce disposable call-cache.	Koichi Sasada
	This patch contains several ideas: (1) Disposable inline method cache (IMC) for race-free inline method cache * Making call-cache (CC) as a RVALUE (GC target object) and allocate new CC on cache miss. * This technique allows race-free access from parallel processing elements like RCU. (2) Introduce per-Class method cache (pCMC) * Instead of fixed-size global method cache (GMC), pCMC allows flexible cache size. * Caching CCs reduces CC allocation and allow sharing CC's fast-path between same call-info (CI) call-sites. (3) Invalidate an inline method cache by invalidating corresponding method entries (MEs) * Instead of using class serials, we set "invalidated" flag for method entry itself to represent cache invalidation. * Compare with using class serials, the impact of method modification (add/overwrite/delete) is small. * Updating class serials invalidate all method caches of the class and sub-classes. * Proposed approach only invalidate the method cache of only one ME. See [Feature #16614] for more details. Notes: Merged: https://github.com/ruby/ruby/pull/2888
2020-02-22	VALUE size packed callinfo (ci).	Koichi Sasada
	Now, rb_call_info contains how to call the method with tuple of (mid, orig_argc, flags, kwarg). Most of cases, kwarg == NULL and mid+argc+flags only requires 64bits. So this patch packed rb_call_info to VALUE (1 word) on such cases. If we can not represent it in VALUE, then use imemo_callinfo which contains conventional callinfo (rb_callinfo, renamed from rb_call_info). iseq->body->ci_kw_size is removed because all of callinfo is VALUE size (packed ci or a pointer to imemo_callinfo). To access ci information, we need to use these functions: vm_ci_mid(ci), _flag(ci), _argc(ci), _kwarg(ci). struct rb_call_info_kw_arg is renamed to rb_callinfo_kwarg. rb_funcallv_with_cc() and rb_method_basic_definition_p_with_cc() is temporary removed because cd->ci should be marked. Notes: Merged: https://github.com/ruby/ruby/pull/2888
2019-12-26	debug_counter.h must be self-contained	卜部昌平
	Include what is necessary. Notes: Merged: https://github.com/ruby/ruby/pull/2711
2019-12-25	add debug_counter access functions.	Koichi Sasada
	These functions are enabled only on USE_DEBUG_COUNTER=1.
2019-12-23	add more debug counters to count numeric objects.	Koichi Sasada