[ruby-core:99026] [Ruby master Feature#17002] Extend heap pages to exactly 16KiB

From: ko1@...
Date: 2020-07-02 05:51:06 UTC
List: ruby-core #99026
Issue #17002 has been updated by ko1 (Koichi Sasada).


Thank you for your survey. Seems fine!

----------------------------------------
Feature #17002: Extend heap pages to exactly 16KiB
https://bugs.ruby-lang.org/issues/17002#change-86401

* Author: tenderlovemaking (Aaron Patterson)
* Status: Open
* Priority: Normal
----------------------------------------
Hi,

I would like to extend heap pages to be exactly 16KiB.  Currently, `struct heap_page_body` is 16KiB - `(sizeof(size_t) * 5)`.

Before I list my reasons for this change, there are two important facts to note.  First, OS pages are 4KiB on the platforms I tested (macOS, Ubuntu, Windows).  Second, when the GC allocates pages, it first allocates `struct heap_page_body`, immediately followed by `struct heap_page`:

https://github.com/ruby/ruby/blob/289a28e68f30e879760fd000833b512d506a0805/gc.c#L1756-L1767

I want to make this change for a few reasons:

1. I would like `struct heap_page_body` to be a multiple of OS pages so that we can use `mprotect` on it (I want to implement read barriers on heap pages with `mprotect`, so this is my selfish reason)
2. Some allocators (specifically glibc) will put `struct heap_page` on the same OS page as `struct heap_page_body`.  `struct heap_page` is frequently modified, so that OS page (including Ruby objects) will be copied.  Extending `struct heap_page_body` to 16KiB can help prevent CoW faults. (see Note 1)
3. Allocating 16KiB can reduce overall memory consumption.  Some allocators (specifically jemalloc) will round requested chunks to bin sizes.  jemalloc has a 16KiB bin size, so our request for `16KiB - (sizeof(size_t) * 5)` is rounded up to 16KiB anyway, and `(sizeof(size_t) * 5)` is wasted.  `(sizeof(size_t) * 5)` is enough room to fit one more Ruby object, so if we use that space for one more object, then we don't need to allocate as many pages, and memory usage can actually decrease.

My hypothesis is that this patch will either not change overall memory usage, or decrease overall memory usage.  But in either case it will allow us to use `mprotect`, and improve CoW.
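The size arithmetic behind reasons 1 and 3 can be sketched in a few lines of Ruby.  This is only an illustration: `SIZEOF_SIZE_T = 8` assumes a 64-bit platform, and all constant and method names here are made up for the example rather than taken from `gc.c`.

```ruby
# Illustrative arithmetic only. The constants are assumptions for a typical
# 64-bit build; none of these names come from gc.c.
KIB           = 1024
OS_PAGE       = 4 * KIB    # OS page size on the tested platforms
SIZEOF_SIZE_T = 8          # assumption: 64-bit platform
JEMALLOC_BIN  = 16 * KIB   # the 16KiB jemalloc bin mentioned in reason 3

CURRENT_BODY  = 16 * KIB - SIZEOF_SIZE_T * 5   # heap_page_body size today
PROPOSED_BODY = 16 * KIB                       # heap_page_body size with the patch

# Bytes jemalloc hands out in its 16KiB bin that the current layout cannot use.
WASTED = JEMALLOC_BIN - CURRENT_BODY

# mprotect works on whole OS pages, so the body size must be a multiple of one.
def multiple_of_os_page?(size)
  (size % OS_PAGE).zero?
end

puts "current body:  #{CURRENT_BODY} bytes, page multiple? #{multiple_of_os_page?(CURRENT_BODY)}"
puts "proposed body: #{PROPOSED_BODY} bytes, page multiple? #{multiple_of_os_page?(PROPOSED_BODY)}"
puts "wasted per page under jemalloc today: #{WASTED} bytes"
```

Under these assumptions the current body is 16344 bytes, which is not a multiple of 4KiB, and the 40 bytes left over in the jemalloc bin are the space the patch reclaims for one more object.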

Tests
===

I tested this patch on an Ubuntu machine with jemalloc and glibc.  Here is my system information:

Linux version:

```
aaron@whiteclaw ~> uname -a
Linux whiteclaw 5.4.0-37-generic #41-Ubuntu SMP Wed Jun 3 18:57:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
aaron@whiteclaw ~> lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04 LTS
Release:	20.04
Codename:	focal
```

GLIBC version:

```
aaron@whiteclaw ~> ldd --version
ldd (Ubuntu GLIBC 2.31-0ubuntu9) 2.31
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
```

jemalloc version:

```
aaron@whiteclaw ~/git> apt list --installed | grep jemalloc

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libjemalloc-dev/focal,now 5.2.1-1ubuntu1 amd64 [installed]
libjemalloc2/focal,now 5.2.1-1ubuntu1 amd64 [installed,automatic]
```

To test memory usage, I used this tool: https://github.com/bpowers/mstat

`mstat` is a sampling profiler that will report memory usage over time.  I generated RDoc for Ruby and took samples while documentation was generated.  Here is the Ruby command I used:

```
./ruby --disable-gems "./libexec/rdoc" --root "." --encoding=UTF-8 --all --ri --op ".ext/rdoc" --page-dir "./doc" --no-force-update  "."
```

I made 50 samples for each allocator and branch like this:

```
for x in (seq 50)
  sudo rm -rf .ext/rdoc; sudo ../src/mstat/mstat -o glibc-branch_$x.tsv ./ruby --disable-gems "./libexec/rdoc" --root "." --encoding=UTF-8 --all --ri --op ".ext/rdoc" --page-dir "./doc" --no-force-update  "."
end
```

In other words I made 200 samples total (50 jemalloc + master, 50 jemalloc + branch, 50 glibc + master, 50 glibc + branch).

glibc
====

Here is a comparison of glibc over time (lower is better):

![glibc changes](https://user-images.githubusercontent.com/3124/86180732-947f4c80-bae1-11ea-8388-0d1ab121a270.png)

From this graph it looks like glibc is mostly the same, but sometimes lower.  It looks like there are some outlier samples that go higher.  I made a box plot to compare maximum RSS:

![glibc max boxplot](https://user-images.githubusercontent.com/3124/86180925-ef18a880-bae1-11ea-978a-e45a0b3d116f.png)

The box plot shows the max RSS is usually lower with some outliers that are higher.
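For readers who want to reproduce this kind of comparison, the five numbers behind a box plot can be computed as sketched below.  This is not the tooling used for the plots in this report, which is not stated; the method names are hypothetical, the quartile convention (linear interpolation) is one of several, and the sample values are placeholders, not measurements.

```ruby
# Minimal sketch of a five-number summary (min, Q1, median, Q3, max).
# Quartiles use linear interpolation between closest ranks; plotting tools
# may use a different convention. Expects a non-empty array of numbers.
def quantile(sorted, q)
  pos = (sorted.length - 1) * q
  lo  = pos.floor
  hi  = pos.ceil
  sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo)
end

def five_number_summary(values)
  sorted = values.sort
  {
    min:    sorted.first,
    q1:     quantile(sorted, 0.25),
    median: quantile(sorted, 0.5),
    q3:     quantile(sorted, 0.75),
    max:    sorted.last
  }
end

# Placeholder values only (NOT measurements from this report): one max-RSS
# figure per run, with a single high outlier.
placeholder_max_rss = [100, 102, 98, 101, 130]
p five_number_summary(placeholder_max_rss)
```

Computing one summary per allocator and branch (for example, one maximum RSS value from each of the 50 runs) gives the quantities a box plot draws, and makes the outliers visible as values far above Q3.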

jemalloc
====

Here is a comparison of jemalloc over time (lower is better):

![jemalloc over time](https://user-images.githubusercontent.com/3124/86181071-31da8080-bae2-11ea-9e9e-3c8dd868be4d.png)

According to this graph jemalloc is usually lower.  I made another box plot to compare maximum RSS on jemalloc:

![jemalloc max RSS](https://user-images.githubusercontent.com/3124/86181149-5b93a780-bae2-11ea-88c8-ed17d34ab758.png)

The box plot shows that max RSS is typically lower on jemalloc.

CoW Performance
====

I didn't find a good way to measure CoW performance, but I don't see how this patch could degrade it.

Summary
===

I would like to merge this patch because there are a few good points (ability to use mprotect, memory savings, possible CoW improvements), and I can't find any downsides.

Thanks John Hawthorn for helping me get the math right on the "end pointer" part.

Note 1:  I was able to prove that `struct heap_page` will exist on the same OS page as `struct heap_page_body` here: https://github.com/ruby/ruby/pull/3253/commits/33390d15e7a6f803823efcb41205167c8b126fbb


---Files--------------------------------
0001-Expand-heap-pages-to-be-exactly-16kb.patch (4.51 KB)


