[ruby-core:105451] [Ruby master Feature#18228] Add a `timeout` option to `IO.copy_stream`

From: "Eregon (Benoit Daloze)" <noreply@...>
Date: 2021-09-27 16:57:44 UTC
List: ruby-core #105451
Issue #18228 has been updated by Eregon (Benoit Daloze).


I wonder: can `sendfile(2)` be interrupted by a signal such as SIGVTALRM, the way read/write can?
That might be another strategy to implement a timeout for it.

----------------------------------------
Feature #18228: Add a `timeout` option to `IO.copy_stream`
https://bugs.ruby-lang.org/issues/18228#change-93902

* Author: byroot (Jean Boussier)
* Status: Open
* Priority: Normal
----------------------------------------
### Context

In many situations dealing with large files, `IO.copy_stream`, when usable, brings major performance gains (often at least twice as fast). More importantly, when the copying is deferred to the kernel, performance is much more consistent because it is less affected by CPU utilization on the machine.

However, it is often unsafe to use because it doesn't have a timeout, so you can only use it if both the source and destination IOs are trusted. Otherwise, it is trivial for an attacker to DoS the service by reading the response very slowly.
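
For reference, here is a minimal sketch of the existing call, without any timeout, using temporary files. The helper names `copy_whole` and `copy_range` are illustrative only and not part of Ruby; the only real API used is `IO.copy_stream` with its optional `copy_length` and `src_offset` arguments.

```ruby
require "tempfile"

# Copies the file at src_path to dst_path and returns the byte count.
# Nothing here bounds how long the call may block.
def copy_whole(src_path, dst_path)
  IO.copy_stream(src_path, dst_path)
end

# Copies `length` bytes starting at byte `offset` of the source.
def copy_range(src_path, dst_path, length, offset)
  IO.copy_stream(src_path, dst_path, length, offset)
end

Tempfile.create("copy_src") do |src|
  src.write("0123456789" * 1_000)
  src.flush

  Tempfile.create("copy_dst") do |dst|
    puts copy_whole(src.path, dst.path)        # bytes copied for the whole file
    puts copy_range(src.path, dst.path, 10, 5) # bytes copied for the slice
  end
end
```

With regular files this returns promptly. The concern described above arises when one side is a socket controlled by an untrusted peer.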

### Some examples

- It is [used by `webrick`](https://github.com/ruby/webrick/commit/54be684da9d993ad6c237e2e9853eb98bcbaae6e).
- `Net::HTTP` uses it to send request bodies when they are IOs, but [it is used with a "fake IO" to allow for timeouts](https://github.com/ruby/net-http/pull/27), so `sendfile(2)` and co. are never used.
- [A proof of concept of integrating it into puma shows a 2x speedup](https://github.com/puma/puma/pull/2703).
- [Various other HTTP clients could use it as well](https://github.com/nahi/httpclient/pull/383).
- I used it in private projects to download and upload large archives in and out of Google Cloud Storage to great effect.

### Possible implementation

The main difficulty is that the underlying syscalls don't have a timeout either.

The main syscall used in these scenarios is `sendfile(2)`. It doesn't have a timeout parameter; however, when called on file descriptors with `O_NONBLOCK` set, it returns early, which allows for a `select`/`poll` loop. I did a very quick and dirty experiment with this, and it does seem to work.
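
The `select`/`poll` loop idea can be approximated in plain Ruby, as a rough sketch. Ruby does not expose `sendfile(2)` directly, so this version assumes a userspace copy with `read_nonblock`/`write_nonblock` and `IO.select` rather than the kernel-assisted path. `copy_with_deadline`, `wait_until_ready`, `monotonic_now`, and `CHUNK_SIZE` are hypothetical names, and returning a `[bytes_copied, completed]` pair is just one possible convention.

```ruby
CHUNK_SIZE = 16 * 1024 # arbitrary buffer size chosen for this sketch

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Blocks until io is ready for the given mode or the deadline passes.
# Returns true when ready, false on timeout.
def wait_until_ready(io, mode, deadline)
  remaining = deadline - monotonic_now
  return false if remaining <= 0

  ready =
    if mode == :read
      IO.select([io], nil, nil, remaining)
    else
      IO.select(nil, [io], nil, remaining)
    end
  !ready.nil?
end

# Copies src to dst until EOF or until `timeout` seconds have elapsed.
# Returns [bytes_copied, completed].
def copy_with_deadline(src, dst, timeout:)
  deadline = monotonic_now + timeout
  copied = 0

  loop do
    return [copied, false] if monotonic_now >= deadline

    chunk = src.read_nonblock(CHUNK_SIZE, exception: false)
    return [copied, true] if chunk.nil? # EOF

    if chunk == :wait_readable
      return [copied, false] unless wait_until_ready(src, :read, deadline)
      next
    end

    until chunk.empty?
      written = dst.write_nonblock(chunk, exception: false)
      if written == :wait_writable
        return [copied, false] unless wait_until_ready(dst, :write, deadline)
      else
        copied += written
        chunk = chunk.byteslice(written..-1)
      end
    end
  end
end
```

One caveat of this sketch: if the deadline passes while a chunk is only partially written, the bytes already read from `src` but not yet written are dropped, so a caller that wants to retry would need a seekable source or its own buffering.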

The other two accelerating calls are [`copy_file_range(2)`](https://man7.org/linux/man-pages/man2/copy_file_range.2.html) (Linux) and [`fcopyfile(3)`](https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man3/fcopyfile.3.html) (macOS). Neither has a timeout, and neither man page documents an `EAGAIN` / `EWOULDBLOCK` error. However, these calls are limited to real file copies, and timeouts are generally less critical for real files, so it would be possible to simply not use them when a timeout is provided.

### Interface

`copy_stream(src, dst, copy_length, src_offset, timeout)`
or `copy_stream(src, dst, copy_length, src_offset, timeout: nil)`

As for the return value in case of a timeout, it is important to convey both that a timeout happened and the number of bytes that were copied; otherwise retries are impossible.

- It could simply return the number of bytes, and let the caller compare it to the expected number of bytes copied, but that wouldn't work in cases where the size of `src` isn't known.
- It could return `-1 - bytes_copied`, not particularly elegant but would work.
- It could return multiple values or some kind of result object when a timeout is provided.
- It could raise an error, with `bytes_copied` as an attribute on the error.
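
To make the trade-offs concrete, here is a hedged sketch of three of the conventions listed above: the `-1 - bytes_copied` encoding, a result object, and an error carrying `bytes_copied`. Every name in it (`encode_timeout`, `decode_result`, `CopyResult`, `CopyTimeoutError`, `copy_with_retries`) is hypothetical and exists only for illustration; none of it is a proposed or existing Ruby API.

```ruby
# Negative return value: -1 - bytes_copied is always negative, so a timeout
# after 0 bytes (-1) stays distinguishable from a completed 0-byte copy (0).
def encode_timeout(bytes_copied)
  -1 - bytes_copied
end

def decode_result(value)
  if value.negative?
    { timed_out: true, bytes_copied: -1 - value }
  else
    { timed_out: false, bytes_copied: value }
  end
end

# Result object: both facts are explicit, at the cost of a new return type.
CopyResult = Struct.new(:bytes_copied, :timed_out)

# Error with the progress attached as an attribute.
class CopyTimeoutError < StandardError
  attr_reader :bytes_copied

  def initialize(bytes_copied)
    @bytes_copied = bytes_copied
    super("timed out after copying #{bytes_copied} bytes")
  end
end

# Retry helper built on the error convention. The block stands in for a
# timeout-aware copy: it receives the offset to resume from and returns
# the number of bytes copied, or raises CopyTimeoutError.
def copy_with_retries(max_attempts)
  offset = 0
  attempts = 0
  begin
    attempts += 1
    offset + yield(offset)
  rescue CopyTimeoutError => e
    offset += e.bytes_copied
    retry if attempts < max_attempts
    raise
  end
end
```

In each case, the caller can resume by adding the reported byte count to `src_offset`, which is what makes retries possible.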

Alternatively, `copy_stream` could be left without a timeout, and some kind of `copy_stream2` could be introduced so that the return value of `copy_stream` isn't made inconsistent.






-- 
https://bugs.ruby-lang.org/
