[ruby-core:98660] [Ruby master Feature#16897] Can a Ruby 3.0 compatible general purpose memoizer be written in such a way that it matches Ruby 2 performance?
From: merch-redmine@...
Date: 2020-06-05 19:01:19 UTC
List: ruby-core #98660
Issue #16897 has been updated by jeremyevans0 (Jeremy Evans).
As @Eregon mentioned, the `***a` approach is likely to be the same speed as or slower than the `*args, **kw` approach on CRuby, as it has to allocate at least as many objects. It could be theoretically possible to increase performance by reducing allocations in the following cases:
* 0-3 arguments with no keywords
* 0-1 arguments with 1 keyword
You could do this by storing the arguments inside the object, similar to how arrays and hashes are optimized internally. Those cases would allow you to get by with a single object allocation instead of allocating two objects (array+hash), assuming that the caller side is not doing any object allocation. All other cases would be as slow or slower.
This approach would only be faster if you never needed to access the arguments or keywords passed. As soon as you need access to the arguments or keywords, it would likely be slower as it would have to allocate an array or hash for them. This limits the usefulness of the approach to specific cases.
When you compare `***a` to `ruby2_keywords`, which is currently the fastest approach, I believe the cases where it could theoretically be faster are limited to 0-1 arguments with 1 keyword.
This approach will increase complexity in an already complex system. It would be a significant undertaking to implement, and it's not clear it would provide a net performance improvement.
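A rough way to check allocation claims like these on a particular Ruby build is to count allocated objects around each delegation style. The sketch below is only an illustration: `AllocationDemo` and `allocations_per_call` are made-up names, and the numbers it prints depend on the Ruby version, so none are quoted here.

```ruby
# Hypothetical helper classes for illustration; not part of the issue.
class AllocationDemo
  def target(*args, **kw); end

  # Explicit delegation: the callee receives an array and a hash.
  def splat_delegate(*args, **kw)
    target(*args, **kw)
  end

  # ruby2_keywords delegation: keywords travel as a flagged trailing hash.
  ruby2_keywords def flagged_delegate(*args)
    target(*args)
  end
end

# Average number of objects allocated per call of the given block.
def allocations_per_call(iterations = 10_000)
  yield # warm up call sites before measuring
  before = GC.stat(:total_allocated_objects)
  iterations.times { yield }
  (GC.stat(:total_allocated_objects) - before).fdiv(iterations)
end

demo = AllocationDemo.new
splat_allocs   = allocations_per_call { demo.splat_delegate(1, k: 1) }
flagged_allocs = allocations_per_call { demo.flagged_delegate(1, k: 1) }

puts "*args, **kw:    #{splat_allocs.round(2)} objects/call"
puts "ruby2_keywords: #{flagged_allocs.round(2)} objects/call"
```

This measures both the delegator and the callee together, so it shows relative differences rather than an exact per-method cost.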
It is true that supporting `ruby2_keywords` makes `*args` calls without keywords slower. I think the maximum slowdown was around 10%, and that was when the callee did not accept a splat or keywords. When the callee accepted a splat or keywords, I think the slowdown was around 1%. However, as `ruby2_keywords` greatly speeds up delegation (see below), `ruby2_keywords` results in a net increase in performance in the majority of cases. Until `ruby2_keywords` no longer results in a net increase in performance in the majority of cases, I believe it should stay.
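For readers unfamiliar with the mechanism, here is a minimal sketch of what `ruby2_keywords` does to a `*args` method, assuming Ruby 3.0 or later. The class and method names are hypothetical; `Hash.ruby2_keywords_hash?` is the core method that reports whether a hash carries the keyword flag.

```ruby
# Hypothetical demo class; assumes Ruby 3.0+ keyword semantics.
class KeywordDemo
  def target(*args, **kw)
    [args, kw]
  end

  # With ruby2_keywords, keywords passed by the caller are collected into a
  # flagged trailing hash, and splatting that hash turns it back into keywords.
  ruby2_keywords def flagged(*args)
    [Hash.ruby2_keywords_hash?(args.last), target(*args)]
  end

  # Without the flag, the trailing hash stays a positional argument.
  def unflagged(*args)
    [Hash.ruby2_keywords_hash?(args.last), target(*args)]
  end
end

demo = KeywordDemo.new
flagged_result   = demo.flagged(1, k: 1)
unflagged_result = demo.unflagged(1, k: 1)

p flagged_result   # flag is true; target sees args [1] and keywords k: 1
p unflagged_result # flag is false; on Ruby 3.0+ target sees the hash as a positional argument
```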
Here's a benchmark showing a 160% improvement in delegation performance in master by using `ruby2_keywords` instead of `*args, **kw`:
```ruby
# Target methods covering the different parameter shapes.
def m1(arg) end
def m2(*args) end
def m3(arg, k: 1) end
def m4(*args, k: 1) end
def m5(arg, **kw) end
def m6(*args, **kw) end

# Delegation through a flagged *args (ruby2_keywords).
ruby2_keywords def d1(*args)
  m2(*args);m2(*args);m2(*args);m2(*args);m2(*args);
  m3(*args);m3(*args);m3(*args);m3(*args);m3(*args);
  m4(*args);m4(*args);m4(*args);m4(*args);m4(*args);
  m5(*args);m5(*args);m5(*args);m5(*args);m5(*args);
  m6(*args);m6(*args);m6(*args);m6(*args);m6(*args);
end

ruby2_keywords def d1a(*args)
  m1(*args);m1(*args);m1(*args);m1(*args);m1(*args);
end

# Delegation through explicit *args, **kw.
def d2(*args, **kw)
  m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);m2(*args, **kw);
  m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);m3(*args, **kw);
  m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);m4(*args, **kw);
  m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);m5(*args, **kw);
  m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);m6(*args, **kw);
end

def d2a(*args, **kw)
  m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);m1(*args, **kw);
end

require 'benchmark'

print "ruby2_keywords: "
puts(Benchmark.measure do
  100000.times do
    d1a(1)
    d1(1, k: 1)
  end
end)

print " *args, **kw: "
puts(Benchmark.measure do
  100000.times do
    d2a(1)
    d2(1, k: 1)
  end
end)
```
Results:
```
ruby2_keywords: 1.350000 0.000000 1.350000 ( 1.395517)
*args, **kw: 3.630000 0.000000 3.630000 ( 3.693702)
```
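Applied to the memoizer from the issue description, the `ruby2_keywords` approach might look roughly like the sketch below. This is an assumption-laden illustration rather than a tested replacement: `memoize_r2k` and `Greeter` are hypothetical names, it relies on `ruby2_keywords` accepting methods created with `define_method`, and it assumes Ruby 2.7 or later.

```ruby
module Memoizer
  # Hypothetical variant of the issue's memoizer using ruby2_keywords.
  def memoize_r2k(method_name)
    cache = {}
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name

    # A single *args array (with a flagged trailing hash when keywords are
    # passed) serves as both the cache key and the delegated arguments.
    define_method(method_name) do |*args|
      cache.fetch(args) { cache[args] = send(uncached, *args) }
    end
    ruby2_keywords(method_name)
  end
end

class Greeter
  extend Memoizer

  attr_reader :calls

  def greet(name, punctuation: "!")
    @calls = (@calls || 0) + 1
    "hello #{name}#{punctuation}"
  end
  memoize_r2k :greet
end

greeter = Greeter.new
first  = greeter.greet("matz", punctuation: "?")
second = greeter.greet("matz", punctuation: "?")
puts first
puts "underlying method ran #{greeter.calls} time(s)"
```

One caveat with this design: the keyword flag does not take part in hash equality, so a call that passes a trailing positional hash and a call that passes the same pairs as keywords would share a cache entry. As in the original memoizer, the cache is shared across all instances of the class.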
----------------------------------------
Feature #16897: Can a Ruby 3.0 compatible general purpose memoizer be written in such a way that it matches Ruby 2 performance?
https://bugs.ruby-lang.org/issues/16897#change-86000
* Author: sam.saffron (Sam Saffron)
* Status: Open
* Priority: Normal
----------------------------------------
```ruby
require 'benchmark/ips'
module Memoizer
  # Ruby 2.6 style: a single *arguments array is both cache key and call args.
  def memoize_26(method_name)
    cache = {}
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name
    define_method(method_name) do |*arguments|
      found = true
      data = cache.fetch(arguments) { found = false }
      unless found
        cache[arguments] = data = public_send(uncached, *arguments)
      end
      data
    end
  end

  # Ruby 2.7 style: separate args and kwargs, combined into a new key array.
  def memoize_27(method_name)
    cache = {}
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name
    define_method(method_name) do |*args, **kwargs|
      found = true
      all_args = [args, kwargs]
      data = cache.fetch(all_args) { found = false }
      unless found
        cache[all_args] = data = public_send(uncached, *args, **kwargs)
      end
      data
    end
  end

  # Generates a method whose signature only includes what the original accepts.
  def memoize_27_v2(method_name)
    uncached = "#{method_name}_without_cache"
    alias_method uncached, method_name
    cache = "MEMOIZE_#{method_name}"
    params = instance_method(method_name).parameters
    has_kwargs = params.any? {|t, name| "#{t}".start_with? "key"}
    has_args = params.any? {|t, name| !"#{t}".start_with? "key"}
    args = []
    args << "args" if has_args
    args << "kwargs" if has_kwargs
    args_text = args.map do |n|
      n == "args" ? "*args" : "**kwargs"
    end.join(",")
    class_eval <<~RUBY
      #{cache} = {}
      def #{method_name}(#{args_text})
        found = true
        all_args = #{args.length === 2 ? "[args, kwargs]" : args[0]}
        data = #{cache}.fetch(all_args) { found = false }
        unless found
          #{cache}[all_args] = data = public_send(:#{uncached} #{args.empty? ? "" : ", #{args_text}"})
        end
        data
      end
    RUBY
  end
end
module Methods
  def args_only(a, b)
    sleep 0.1
    "#{a} #{b}"
  end

  def kwargs_only(a:, b: nil)
    sleep 0.1
    "#{a} #{b}"
  end

  def args_and_kwargs(a, b:)
    sleep 0.1
    "#{a} #{b}"
  end
end

class OldMethod
  extend Memoizer
  include Methods
  memoize_26 :args_and_kwargs
  memoize_26 :args_only
  memoize_26 :kwargs_only
end

class NewMethod
  extend Memoizer
  include Methods
  memoize_27 :args_and_kwargs
  memoize_27 :args_only
  memoize_27 :kwargs_only
end

class OptimizedMethod
  extend Memoizer
  include Methods
  memoize_27_v2 :args_and_kwargs
  memoize_27_v2 :args_only
  memoize_27_v2 :kwargs_only
end

OptimizedMethod.new.args_only(1,2)

methods = [
  OldMethod.new,
  NewMethod.new,
  OptimizedMethod.new
]

Benchmark.ips do |x|
  x.warmup = 1
  x.time = 2
  methods.each do |m|
    x.report("#{m.class} args only") do |times|
      while times > 0
        m.args_only(10, b: 10)
        times -= 1
      end
    end
    x.report("#{m.class} kwargs only") do |times|
      while times > 0
        m.kwargs_only(a: 10, b: 10)
        times -= 1
      end
    end
    x.report("#{m.class} args and kwargs") do |times|
      while times > 0
        m.args_and_kwargs(10, b: 10)
        times -= 1
      end
    end
  end
  x.compare!
end
# # Ruby 2.6.5
# #
# OptimizedMethod args only: 974266.9 i/s
# OldMethod args only: 949344.9 i/s - 1.03x slower
# OldMethod args and kwargs: 945951.5 i/s - 1.03x slower
# OptimizedMethod kwargs only: 939160.2 i/s - 1.04x slower
# OldMethod kwargs only: 868229.3 i/s - 1.12x slower
# OptimizedMethod args and kwargs: 751797.0 i/s - 1.30x slower
# NewMethod args only: 730594.4 i/s - 1.33x slower
# NewMethod args and kwargs: 727300.5 i/s - 1.34x slower
# NewMethod kwargs only: 665003.8 i/s - 1.47x slower
#
# #
# # Ruby 2.7.1
#
# OptimizedMethod kwargs only: 1021707.6 i/s
# OptimizedMethod args only: 955694.6 i/s - 1.07x (± 0.00) slower
# OldMethod args and kwargs: 940911.3 i/s - 1.09x (± 0.00) slower
# OldMethod args only: 930446.1 i/s - 1.10x (± 0.00) slower
# OldMethod kwargs only: 858238.5 i/s - 1.19x (± 0.00) slower
# OptimizedMethod args and kwargs: 773773.5 i/s - 1.32x (± 0.00) slower
# NewMethod args and kwargs: 772653.3 i/s - 1.32x (± 0.00) slower
# NewMethod args only: 771253.2 i/s - 1.32x (± 0.00) slower
# NewMethod kwargs only: 700604.1 i/s - 1.46x (± 0.00) slower
```
The bottom line is that a generic delegator often needs to make use of all the arguments provided to a method.
```ruby
def count(*args, **kwargs)
counter[[args, kwargs]] += 1
orig_count(*args, **kwargs)
end
```
The old pattern meant we could get away with one fewer array allocation per call:
```ruby
def count(*args)
counter[args] += 1
orig_count(*args)
end
```
I would like to propose some changes to Ruby 3 that would allow this performance to be recovered.
Perhaps:
```ruby
def count(...)
args = ...
counter[args] += 1
orig_count(...)
end
```
Or:
```ruby
def count(***args)
counter[args] += 1
orig_count(***args)
end
```
Thoughts?
--
https://bugs.ruby-lang.org/