[#106355] [Ruby master Bug#18373] RBS build failure: '/include/x86_64-linux/ruby/config.h', needed by 'constants.o'. — "vo.x (Vit Ondruch)" <noreply@...>

Issue #18373 has been reported by vo.x (Vit Ondruch).

28 messages 2021/12/01

[#106356] [Ruby master Bug#18374] make: Circular spec/ruby/optional/capi/ext/array_spec.c <- spec/ruby/optional/capi/ext/array_spec.c dependency dropped. — "vo.x (Vit Ondruch)" <noreply@...>

Issue #18374 has been reported by vo.x (Vit Ondruch).

8 messages 2021/12/01

[#106360] [Ruby master Feature#18376] Version comparison API — "vo.x (Vit Ondruch)" <noreply@...>

Issue #18376 has been reported by vo.x (Vit Ondruch).

28 messages 2021/12/01

[#106543] [Ruby master Bug#18396] An unexpected "hash value omission" syntax error when parentheses call expr follows — "koic (Koichi ITO)" <noreply@...>

Issue #18396 has been reported by koic (Koichi ITO).

10 messages 2021/12/08

[#106596] [Ruby master Misc#18399] DevMeeting-2022-01-13 — "mame (Yusuke Endoh)" <noreply@...>

Issue #18399 has been reported by mame (Yusuke Endoh).

11 messages 2021/12/09

[#106621] [Ruby master Misc#18404] 3.1 documentation problems tracking ticket — "zverok (Victor Shepelev)" <noreply@...>

Issue #18404 has been reported by zverok (Victor Shepelev).

16 messages 2021/12/11

[#106634] [Ruby master Bug#18407] Behavior difference between integer and string flags to File creation — deivid <noreply@...>

Issue #18407 has been reported by deivid (David Rodríguez).

12 messages 2021/12/13

[#106644] [Ruby master Bug#18408] Rightward assignment into instance variable — "Dan0042 (Daniel DeLorme)" <noreply@...>

Issue #18408 has been reported by Dan0042 (Daniel DeLorme).

23 messages 2021/12/13

[#106686] [Ruby master Bug#18409] Crash (free(): invalid pointer) if LD_PRELOAD doesn't explicitly include libjemalloc.so.2 — "itay-grudev (Itay Grudev)" <noreply@...>

Issue #18409 has been reported by itay-grudev (Itay Grudev).

7 messages 2021/12/15

[#106730] [Ruby master Bug#18417] IO::Buffer problems — "zverok (Victor Shepelev)" <noreply@...>

Issue #18417 has been reported by zverok (Victor Shepelev).

9 messages 2021/12/19

[#106784] [CommonRuby Feature#18429] Configure ruby-3.0.3 on Solaris 10 Unknown keyword 'URL' in './ruby.tmp.pc' — "dklein (Dmitri Klein)" <noreply@...>

Issue #18429 has been reported by dklein (Dmitri Klein).

32 messages 2021/12/23

[#106828] [Ruby master Bug#18435] Calling `protected` on ancestor method changes result of `instance_methods(false)` — "ufuk (Ufuk Kayserilioglu)" <noreply@...>

Issue #18435 has been reported by ufuk (Ufuk Kayserilioglu).

23 messages 2021/12/26

[#106833] [Ruby master Feature#18438] Add `Exception#additional_message` to show additional error information — "mame (Yusuke Endoh)" <noreply@...>

Issue #18438 has been reported by mame (Yusuke Endoh).

30 messages 2021/12/27

[#106834] [Ruby master Bug#18439] Support YJIT for VC++ — "usa (Usaku NAKAMURA)" <noreply@...>

Issue #18439 has been reported by usa (Usaku NAKAMURA).

11 messages 2021/12/27

[#106851] [Ruby master Bug#18442] Make Ruby 3.0.3 on Solaris 10 with "The following command caused the error: cc -D_STDC_C99= " — "dklein (Dmitri Klein)" <noreply@...>

Issue #18442 has been reported by dklein (Dmitri Klein).

8 messages 2021/12/27

[#106928] [Ruby master Bug#18454] YJIT slowing down key Discourse benchmarks — "sam.saffron (Sam Saffron)" <noreply@...>

Issue #18454 has been reported by sam.saffron (Sam Saffron).

8 messages 2021/12/31

[ruby-core:106910] [Ruby master Bug#18447] Potential performance regression with String#lines in large strings

From: "ttilberg (Tim Tilberg)" <noreply@...>
Date: 2021-12-29 21:54:49 UTC
List: ruby-core #106910
Issue #18447 has been updated by ttilberg (Tim Tilberg).


Thanks everyone! I wondered if it wasn't due to memory allocations, but the "build a massive string" operation was successful and quite fast. I looked through `rb_str_enumerate_lines` in the C source, and got lost rather quickly, aside from seeing that if the string wasn't frozen, we would create a new one that _is_ frozen. I'm curious if someone would bother to take a moment to share where this was manifested from my example (`String#lines`)? I'm working my way up towards understanding Ruby at a deeper level.

Regardless, thank you everyone for triaging and creating a patch so quickly!

----------------------------------------
Bug #18447: Potential performance regression with String#lines in large strings
https://bugs.ruby-lang.org/issues/18447#change-95730

* Author: ttilberg (Tim Tilberg)
* Status: Closed
* Priority: Normal
* Assignee: peterzhu2118 (Peter Zhu)
* ruby -v: ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]
* Backport: 2.6: UNKNOWN, 2.7: UNKNOWN, 3.0: UNKNOWN, 3.1: UNKNOWN
----------------------------------------
I believe there may be a potential performance regression regarding `String#lines` worth noting between 3.0.3 and 3.1.0. This came about in [this discussion](https://www.reddit.com/r/ruby/comments/rpje0g/fast_to_way_to_parse_csv/hqadqsd/) regarding large file parsing performance. We were benchmarking various ways to parse a 10 million row CSV file. Slurping the file took significantly longer than streaming in version 3.1.0, even though the data was able to fit in memory. After further research, we started to feel that there may be something to speak up about here, and I think it's pinned down to `String#lines`.

I'm running macOS Big Sur 11.6.1 on a 13" 2020 MBP with 32 GB RAM. Specific Ruby versions are included in the comparison examples below.

The simplest reproduction seems to be:

Ruby 3.0.3: ~1.5 seconds

```
▶ time ruby -ve '("\n" * 10_000_000).lines'
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-darwin20]
ruby -ve '("\n" * 10_000_000).lines'  1.38s user 0.39s system 100% cpu 1.756 total
```

Ruby 3.1.0: ~11.5 seconds

```
▶ time ruby -ve '("\n" * 10_000_000).lines'
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]
ruby -ve '("\n" * 10_000_000).lines'  1.52s user 10.01s system 99% cpu 11.579 total
```
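
For anyone who wants to compare versions without the shell's `time`, the reproduction above could be wrapped in a small script. This is only a sketch: `time_lines` is a hypothetical helper name, not something from the report, and the count is reduced so it finishes quickly on an affected build. It separates building the string from calling `String#lines`, so only the latter is timed.

```ruby
require "benchmark"

# Hypothetical helper (not part of the original report): builds a string of
# `count` newline characters and times only the String#lines call.
def time_lines(count)
  str = "\n" * count
  result = nil
  elapsed = Benchmark.realtime { result = str.lines }
  [elapsed, result.size]
end

# Assumption: 1_000_000 lines is enough to show a difference between builds;
# the report itself used 10_000_000.
elapsed, line_count = time_lines(1_000_000)
puts format("Ruby %s: %d lines in %.3fs", RUBY_VERSION, line_count, elapsed)
```

Running the same file under each Ruby version gives directly comparable wall-clock numbers for the `lines` call alone.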

Some other observations:

An earlier script that I ran ruby-prof against looked like:

```
puts File.read("sample-data.csv").lines.sum { 1 }
```

- It appeared that the time increase stemmed from `String#lines`, as all other methods had similar time taken between the versions:

```
 Ruby 3.1.0
  %self      total      self      wait     child     calls  name
 89.93     12.728    12.728     0.000     0.000        1   String#lines
  9.01      1.275     1.275     0.000     0.000        1   Array#sum
  1.05      0.149     0.149     0.000     0.000        1   <Class::IO>#read

  Ruby 3.0.3
   %self      total      self      wait     child     calls  name
 74.91      3.773     3.773     0.000     0.000        1   String#lines
 22.15      1.116     1.116     0.000     0.000        1   Array#sum
  2.93      0.148     0.148     0.000     0.000        1   <Class::IO>#read
```

- A similar enumerator without `String#lines` does not appear to cause this:

```
▶ time ruby -ve '10_000_000.times.map { nil }'
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-darwin20]
ruby -ve '10_000_000.times.map { nil }'  0.57s user 0.16s system 102% cpu 0.710 total
```

```
▶ time ruby -ve '10_000_000.times.map { nil }'
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]
ruby -ve '10_000_000.times.map { nil }'  0.61s user 0.16s system 102% cpu 0.753 total
```

- It doesn't seem related to string generation:

```
▶ time ruby -ve '("\n" * 10_000_000)'
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x86_64-darwin20]
-e:1: warning: possibly useless use of * in void context
ruby -ve '("\n" * 10_000_000)'  0.13s user 0.14s system 107% cpu 0.246 total
```

```
▶ time ruby -ve '("\n" * 10_000_000)'
ruby 3.1.0p0 (2021-12-25 revision fb4df44d16) [x86_64-darwin20]
-e:1: warning: possibly useless use of * in void context
ruby -ve '("\n" * 10_000_000)'  0.13s user 0.14s system 107% cpu 0.245 total
```
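
The per-method breakdown from the ruby-prof output above could be roughly approximated with only the standard library by timing each step of the `File.read(...).lines.sum { 1 }` pipeline separately. This is a sketch under assumptions: `time_steps` is a hypothetical helper, and the generated temp file merely stands in for `sample-data.csv`, whose real contents are not shown in the report.

```ruby
require "benchmark"
require "tempfile"

# Hypothetical helper (not from the report): times the read, lines and sum
# steps of the earlier script one by one and returns the timings plus the
# final line count.
def time_steps(path)
  timings = {}
  data = lines = total = nil
  timings[:read]  = Benchmark.realtime { data = File.read(path) }
  timings[:lines] = Benchmark.realtime { lines = data.lines }
  timings[:sum]   = Benchmark.realtime { total = lines.sum { 1 } }
  [timings, total]
end

# Assumption: a small generated file is used in place of the 10 million row
# CSV from the report.
Tempfile.create("sample-data") do |file|
  file.write("a,b,c\n" * 100_000)
  file.flush
  timings, total = time_steps(file.path)
  puts "Ruby #{RUBY_VERSION}, #{total} lines"
  timings.each { |step, secs| puts format("  %-6s %.3fs", step, secs) }
end
```

Wall-clock timings like these are coarser than a profiler's, but they should be enough to see whether the `lines` step dominates on a given build.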

(Thanks to simpl1g for the discussion on Reddit, and help detecting this potential issue)



-- 
https://bugs.ruby-lang.org/

