Age | Commit message (Collapse) | Author |
|
* Add `open_timeout` as an overall timeout option for `Socket.tcp`
[Background]
Currently, `TCPSocket.new` and `Socket.tcp` accept two kinds of timeout options:
- `resolv_timeout`, which controls the timeout for DNS resolution
- `connect_timeout`, which controls the timeout for the connection attempt
With the introduction of Happy Eyeballs Version 2 (as per [RFC 8305](https://datatracker.ietf.org/doc/html/rfc8305)) in [Feature #20108](https://bugs.ruby-lang.org/issues/20108) and [Feature #20782](https://bugs.ruby-lang.org/issues/20782), both address resolution and connection attempts are now parallelized.
As a result, the sum of `resolv_timeout` and `connect_timeout` no longer represents the total timeout duration. This is because, in HEv2, name resolution and connection attempts are performed concurrently, causing the two timeouts to overlap.
Example:
When `resolv_timeout: 200ms` and `connect_timeout: 100ms` are set:
1. An IPv6 address is resolved immediately after the method starts (IPv4 is still being resolved).
2. A connection attempt is initiated to the IPv6 address
3. After 100ms, `connect_timeout` is exceeded. However, since `resolv_timeout` still has 100ms left, the IPv4 resolution continues.
4. After 200ms from the start, the method raises a `resolv_timeout` error.
In this case, the total elapsed time before a timeout is 200ms, not the expected 300ms (100ms + 200ms).
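The arithmetic in this example can be checked with a small sketch. It models only the scenario above, where both timers effectively start when the method starts; the method names are illustrative and are not part of `Socket.tcp`.

```ruby
# Sequential expectation: the connection timer starts only after
# name resolution has finished or timed out.
def sequential_total_ms(resolv_timeout_ms, connect_timeout_ms)
  resolv_timeout_ms + connect_timeout_ms
end

# Simplified model of the example: both timers run concurrently from
# the start, so the method gives up at the later of the two deadlines.
def overlapping_total_ms(resolv_timeout_ms, connect_timeout_ms)
  [resolv_timeout_ms, connect_timeout_ms].max
end

puts sequential_total_ms(200, 100)  # 300
puts overlapping_total_ms(200, 100) # 200
```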
Furthermore, in HEv2, connection attempts are also parallelized: a new connection attempt is started every 250ms (the Connection Attempt Delay) for each candidate address. This makes the definition of `connect_timeout` even more ambiguous, because it becomes unclear from which point in time the timeout is measured.
Currently, a `connect_timeout` is raised only after the last connection attempt exceeds the timeout.
Example:
When `connect_timeout: 1000ms` is set and there are 3 address candidates:
1. Start a connection attempt to the address `a`
2. 250ms after step 1, start a new connection attempt to the address `b`
3. 500ms after step 1, start a new connection attempt to the address `c`
4. 1000ms after step 3 (1000ms after starting the connection to `c`, 1250ms after starting the connection to `b`, and 1500ms after starting the connection to `a`), `connect_timeout` is raised
This behavior aims to favor successful connections by allowing more time for each attempt, but it results in a timeout model that is difficult to reason about.
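The timeline above can be reproduced with a short calculation. This is a model of the described behavior, not the implementation; it assumes every candidate is already resolved, that attempts start exactly one Connection Attempt Delay apart, and a 1000ms `connect_timeout` (the value consistent with the elapsed times listed in step 4). The method name is hypothetical.

```ruby
CONNECTION_ATTEMPT_DELAY_MS = 250

# Time (measured from the first attempt) at which connect_timeout is raised,
# given that the timer is counted from the start of the LAST attempt.
def connect_timeout_raised_at_ms(candidate_count, connect_timeout_ms)
  last_attempt_start_ms = (candidate_count - 1) * CONNECTION_ATTEMPT_DELAY_MS
  last_attempt_start_ms + connect_timeout_ms
end

puts connect_timeout_raised_at_ms(3, 1000) # 1500
```

With a single candidate the result is simply `connect_timeout`; each additional candidate pushes the error back by another 250ms, which is why the effective limit is hard to predict from the option value alone.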
These methods have supported `resolv_timeout` and `connect_timeout` options even before the introduction of HEv2. However, in many use cases, it would be more convenient if a timeout occurred after a specified duration from the start of the method. Similar functions in other languages (such as PHP, Python, and Go) typically allow specifying only an overall timeout.
[Proposal]
I propose adding an `open_timeout` option to `Socket.tcp` in this PR, which triggers a timeout after a specified duration has elapsed from the start of the method.
The name `open_timeout` aligns with the existing accessor used in `Net::HTTP`.
If `open_timeout` is specified together with `resolv_timeout` and `connect_timeout`, I propose that only `open_timeout` be used and the others be ignored. While it is possible to support combinations of `open_timeout`, `resolv_timeout`, and `connect_timeout`, doing so would require defining which timeout takes precedence in which situations. In this case, I believe it is more valuable to keep the behavior simple and easy to understand, rather than supporting more complex use cases.
If this proposal is accepted, I also plan to extend `open_timeout` support to `TCPSocket.new`.
While the long-term future of `resolv_timeout` and `connect_timeout` may warrant further discussion, I believe the immediate priority is to offer a straightforward way to specify an overall timeout.
[Outcome]
If `open_timeout` is also supported by `TCPSocket.new`, users would be able to manage total connection timeouts directly in `Net::HTTP#connect` without relying on `Timeout.timeout`.
https://github.com/ruby/ruby/blob/aa0f689bf45352c4a592e7f1a044912c40435266/lib/net/http.rb#L1657
---
* Raise an exception if `open_timeout` is specified together with other timeout options
> If open_timeout is specified together with resolv_timeout and connect_timeout, I propose that only open_timeout be used and the others be ignored.
Since this approach may be unclear to users, I’ve decided to explicitly raise an `ArgumentError` if these options are specified together.
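A minimal sketch of such a check follows. The helper name and the error message are hypothetical, and the sketch assumes that combining `open_timeout` with either of the other two options is rejected; the actual check lives inside `Socket.tcp`.

```ruby
# Hypothetical helper, not the code in socket.rb: reject ambiguous
# combinations up front instead of silently ignoring some options.
def check_timeout_options(open_timeout: nil, resolv_timeout: nil, connect_timeout: nil)
  if open_timeout && (resolv_timeout || connect_timeout)
    raise ArgumentError,
          "open_timeout cannot be combined with resolv_timeout or connect_timeout"
  end
  open_timeout
end

check_timeout_options(open_timeout: 5)                       # accepted
check_timeout_options(resolv_timeout: 1, connect_timeout: 2) # accepted
begin
  check_timeout_options(open_timeout: 5, connect_timeout: 2)
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```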
* Add doc
* Fix: open_timeout error should be raised even if there are still addresses that have not been tried
Notes:
Merged-By: shioimm <[email protected]>
|
|
We no longer execute those files on Solaris platforms.
Notes:
Merged: https://github.com/ruby/ruby/pull/13037
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/12740
|
|
(#12678)
* Do not save ResolutionError if resolution succeeds for any address family
Socket with Happy Eyeballs Version 2 performs connection attempts and name resolution in parallel.
In the existing implementation, if a connection attempt failed for one address family while name resolution was still in progress for the other, and that name resolution later failed, the method would terminate with a name resolution error.
This behavior was intended to ensure that the final error reflected the most recent failure, potentially overriding an earlier error.
However, [Bug #21088](https://bugs.ruby-lang.org/issues/21088) made me realize that terminating with a name resolution error is unnatural when name resolution succeeded for at least one address family.
This PR modifies the behavior so that if name resolution succeeds for one address family, any name resolution error from the other is not saved.
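The rule can be pictured with a small bookkeeping model. Everything here is hypothetical (the class, its method names, and the stand-in `ResolutionFailure` error); it only illustrates "do not save a resolution error once any family has resolved" and is not the code in `Socket.tcp`.

```ruby
# Stand-in for a name resolution error in this sketch.
ResolutionFailure = Class.new(StandardError)

# Tracks which error should be raised if every attempt fails.
class LastErrorTracker
  attr_reader :last_error

  def initialize
    @any_family_resolved = false
    @last_error = nil
  end

  def resolution_succeeded
    @any_family_resolved = true
  end

  # A resolution error is only recorded while no family has resolved.
  def resolution_failed(error)
    @last_error = error unless @any_family_resolved
  end

  def connection_failed(error)
    @last_error = error
  end
end

tracker = LastErrorTracker.new
tracker.resolution_succeeded                                   # e.g. AAAA resolved
tracker.connection_failed(Errno::ECONNREFUSED.new)             # IPv6 connect failed
tracker.resolution_failed(ResolutionFailure.new("A lookup failed"))
puts tracker.last_error.class # Errno::ECONNREFUSED
```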
This PR includes the following changes:
* Do not display select(2) as the system call that caused the raised error, as it is for internal processing
* Fix bug: Get errno with Socket::SO_ERROR in Windows environment with a workaround for tests not passing
Notes:
Merged-By: shioimm <[email protected]>
|
|
* TCPSocket.new: Close resources in ensure
* TCPSocket.new: Remove unnecessary comments
* Socket.tcp: Make assert_separately in TestSocket more readable
* Socket.tcp: Returning instead of exiting
* Socket.tcp: Close resources in ensure
* Socket.tcp: Avoid test failures on hosts that only support IPv4
Notes:
Merged-By: shioimm <[email protected]>
|
|
This change includes the following updates:
- Added an environment variable `RUBY_TCP_NO_FAST_FALLBACK` to control enabling/disabling fast_fallback
- Updated documentation and man pages
- Revised the implementation of Socket.tcp_fast_fallback= and Socket.tcp_fast_fallback, which previously performed dynamic name resolution of constants and variables. As a result, the following performance improvements were achieved:
(Case of 1000 executions of `TCPSocket.new` to the local host)
Rehearsal -----------------------------------------
before 0.031462 0.147946 0.179408 ( 0.249279)
after 0.031164 0.146839 0.178003 ( 0.346935)
-------------------------------- total: 0.178003sec
user system total real
before 0.027584 0.138712 0.166296 ( 0.233356)
after 0.025953 0.127608 0.153561 ( 0.237971)
Notes:
Merged-By: shioimm <[email protected]>
|
|
|
|
|
|
* Introduction of Happy Eyeballs Version 2 (RFC8305) in Socket.tcp
This is an implementation of Happy Eyeballs version 2 (RFC 8305) in Socket.tcp.
[Background]
Currently, `Socket.tcp` synchronously resolves names and makes connection attempts with `Addrinfo::foreach`.
This implementation has the following two problems.
1. In name resolution, the program blocks until the DNS server has responded to all DNS queries.
2. In a connection attempt, while a connection to one IP address is in progress and taking time, the program blocks, and connections to the other resolved IP addresses cannot be attempted.
[Proposal]
"Happy Eyeballs" ([RFC 8305](https://datatracker.ietf.org/doc/html/rfc8305)) is an algorithm to solve this kind of problem. It avoids delays to the user whenever possible and also uses IPv6 preferentially.
I implemented it into `Socket.tcp` by using `Addrinfo.getaddrinfo` in each thread spawned per address family to resolve the hostname asynchronously, and using `Socket::connect_nonblock` to try to connect with multiple addrinfo in parallel.
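The per-family resolution idea can be sketched as follows. This is a simplified illustration rather than the implementation in `Socket.tcp`: the method name is hypothetical, and the `resolver:` parameter is an assumption added here so the sketch can be exercised without DNS. It uses `Addrinfo.getaddrinfo` by default.

```ruby
require 'socket'

# Resolve +host+ once per address family, each in its own thread, and
# return [family_name, result] pairs in the order they complete.
# A result is either an array of addresses or the exception raised.
def resolve_per_family(host, port, resolver: Addrinfo.method(:getaddrinfo))
  families = { ipv6: Socket::AF_INET6, ipv4: Socket::AF_INET }
  queue = Queue.new

  threads = families.map do |name, family|
    Thread.new do
      begin
        queue << [name, resolver.call(host, port, family, :STREAM)]
      rescue StandardError => e
        queue << [name, e] # a failure in one family must not block the other
      end
    end
  end

  results = Array.new(families.size) { queue.pop }
  threads.each(&:join)
  results
end
```

Because results are consumed from the queue as they arrive, a caller could start connecting to the first family that resolves instead of waiting for both, which is the property HEv2 relies on.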
[Outcome]
This change eliminates a fatal defect in the following cases.
Case 1. One of the A or AAAA DNS queries does not return
---
require 'socket'

class Addrinfo
  class << self
    # Current Socket.tcp depends on foreach
    def foreach(nodename, service, family=nil, socktype=nil, protocol=nil, flags=nil, timeout: nil, &block)
      getaddrinfo(nodename, service, Socket::AF_INET6, socktype, protocol, flags, timeout: timeout)
        .concat(getaddrinfo(nodename, service, Socket::AF_INET, socktype, protocol, flags, timeout: timeout))
        .each(&block)
    end

    # Simulate an AAAA query that never returns while the A query succeeds
    def getaddrinfo(_, _, family, *_)
      case family
      when Socket::AF_INET6 then sleep
      when Socket::AF_INET then [Addrinfo.tcp("127.0.0.1", 4567)]
      end
    end
  end
end

Socket.tcp("localhost", 4567)
---
Because the IPv6 name resolution never returns, the current `Socket.tcp` blocks in this case and cannot start connecting to the IPv4 address.
`Socket.tcp` with HEv2, however, can promptly start a connection attempt to the IPv4 address in this case.
Case 2. Server does not promptly return ack for syn of either IPv4 / IPv6 address family
---
require 'socket'

# IPv6 server: binds but sleeps before listening, so the SYN is never acknowledged
fork do
  socket = Socket.new(Socket::AF_INET6, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '::1'))
  sleep
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

# IPv4 server: accepts connections normally
fork do
  socket = Socket.new(Socket::AF_INET, :STREAM)
  socket.setsockopt(:SOCKET, :REUSEADDR, true)
  socket.bind(Socket.pack_sockaddr_in(4567, '127.0.0.1'))
  socket.listen(1)
  connection, _ = socket.accept
  connection.close
  socket.close
end

Socket.tcp("localhost", 4567)
---
The current `Socket.tcp` tries to connect serially, so when name resolution returns an IPv6 address first and a connection to the IPv6 server is initiated, the server does not return an ACK and the program blocks.
`Socket.tcp` with HEv2, by contrast, starts connection attempts one after another and runs them in parallel, so a connection can be established promptly on the socket that attempted to connect to the IPv4 server.
In exchange, the performance of `Socket.tcp` with HEv2 will be degraded.
---
100.times { Socket.tcp("www.ruby-lang.org", 80) }
---
This is due to the addition of the creation of IO objects, Thread objects, etc., and calls to `IO::select` in the implementation.
* Avoid NameError of Socket::EAI_ADDRFAMILY in MinGW
* Support Windows with SO_CONNECT_TIME
* Improve performance
I have additionally implemented the following patterns:
- If the host is single-stack, name resolution is performed in the main thread. This reduces the cost of creating threads.
- If an IP address is specified, name resolution is performed in the main thread. This also reduces the cost of creating threads.
- If only one IP address is resolved, connect is executed in blocking mode. This reduces the cost of calling IO::select.
Also, I have added a `fast_fallback` option for users who do not wish to use HEv2.
Here are the results of each performance test.
```ruby
require 'socket'
require 'benchmark'

HOSTNAME = "www.ruby-lang.org"
PORT = 80
ai = Addrinfo.tcp(HOSTNAME, PORT)

Benchmark.bmbm do |x|
  x.report("Domain name") do
    30.times { Socket.tcp(HOSTNAME, PORT).close }
  end
  x.report("IP Address") do
    30.times { Socket.tcp(ai.ip_address, PORT).close }
  end
  x.report("fast_fallback: false") do
    30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close }
  end
end
```
```
user system total real
Domain name 0.015567 0.032511 0.048078 ( 0.325284)
IP Address 0.004458 0.014219 0.018677 ( 0.284361)
fast_fallback: false 0.005869 0.021511 0.027380 ( 0.321891)
```
And this is the measurement result when executed in a single stack environment.
```
user system total real
Domain name 0.007062 0.019276 0.026338 ( 1.905775)
IP Address 0.004527 0.012176 0.016703 ( 3.051192)
fast_fallback: false 0.005546 0.019426 0.024972 ( 1.775798)
```
The following is the result of the run on Ruby 3.3.0.
(on Dual stack environment)
```
user system total real
Ruby 3.3.0 0.007271 0.027410 0.034681 ( 0.472510)
```
(on Single stack environment)
```
user system total real
Ruby 3.3.0 0.005353 0.018898 0.024251 ( 1.774535)
```
* Do not cache `Socket.ip_address_list`
As mentioned in the comment at https://github.com/ruby/ruby/pull/9374#discussion_r1482269186, caching Socket.ip_address_list does not follow changes in network configuration.
But if we stop caching, it becomes necessary to check every time `Socket.tcp` is called whether it's a single stack or not, which could further degrade performance in the case of a dual stack.
From this, I've changed the approach so that when a domain name is passed, it doesn't check whether it's a single stack or not and resolves names in parallel each time.
The performance measurement results are as follows.
require 'socket'
require 'benchmark'
HOSTNAME = "www.ruby-lang.org"
PORT = 80
ai = Addrinfo.tcp(HOSTNAME, PORT)
Benchmark.bmbm do |x|
  x.report("Domain name") do
    30.times { Socket.tcp(HOSTNAME, PORT).close }
  end
  x.report("IP Address") do
    30.times { Socket.tcp(ai.ip_address, PORT).close }
  end
  x.report("fast_fallback: false") do
    30.times { Socket.tcp(HOSTNAME, PORT, fast_fallback: false).close }
  end
end
user system total real
Domain name 0.004085 0.011873 0.015958 ( 0.330097)
IP Address 0.000993 0.004400 0.005393 ( 0.257286)
fast_fallback: false 0.001348 0.008266 0.009614 ( 0.298626)
* Wait forever if fallback addresses are unresolved, unless resolv_timeout
Changed from waiting only 3 seconds for name resolution when no fallback address is available, to waiting indefinitely unless `resolv_timeout` is set.
This is in accordance with the current `Socket.tcp` specification.
* Use an exact pattern to match the IPv6 address format when specifying the address family
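For illustration, the sketch below shows one way to tell an IP literal from a hostname and pick its address family. It deliberately swaps in the standard `ipaddr` library instead of the pattern the commit uses inside `socket.rb`, and the method name is hypothetical.

```ruby
require 'ipaddr'
require 'socket'

# Returns Socket::AF_INET or Socket::AF_INET6 when +host+ is an IP literal,
# or nil when it is not (for example, a hostname that still needs resolving).
def family_of_ip_literal(host)
  address = IPAddr.new(host)
  address.ipv6? ? Socket::AF_INET6 : Socket::AF_INET
rescue IPAddr::Error
  nil
end

p family_of_ip_literal("127.0.0.1") == Socket::AF_INET  # true
p family_of_ip_literal("::1") == Socket::AF_INET6       # true
p family_of_ip_literal("www.ruby-lang.org")             # nil
```

Knowing the family up front is what lets an IP-literal argument skip the per-family resolver threads.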
|
|
https://github.com/ruby/ruby/pull/9088#discussion_r1411490445
|
|
The test for Socket::ResolutionError#error_code fails in the FreeBSD environment with this test condition, because Socket::ResolutionError#error_code returns Socket::EAI_FAIL instead of Socket::EAI_FAMILY.
https://rubyci.s3.amazonaws.com/freebsd12/ruby-master/log/20231130T103002Z.fail.html.gz
This PR avoids the test failure by relaxing the condition.
Also changed the domain for testing to `example.com`.
|
|
https://rubyci.s3.amazonaws.com/freebsd12/ruby-master/log/20231130T103002Z.fail.html.gz
|
|
|
|
rsock_raise_socket_error is called only when getaddrinfo or getnameinfo fails
|
|
|
|
It seems like it never succeeds on this CI.
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/7462
|
|
We should handle the ensure block when omitting this test
|
|
http://rubyci.s3.amazonaws.com/freebsd13/ruby-master/log/20220906T043002Z.fail.html.gz
http://rubyci.s3.amazonaws.com/freebsd13/ruby-master/log/20220905T103002Z.fail.html.gz
|
|
|
|
Notes:
Merged-By: k0kubun <[email protected]>
|
|
This reverts commit 27fb9d272daaae89089dfb61849ebe8e7aa6c833.
The test failure on Solaris 10 was due to an incomplete IPv6 configuration
on the CI server, which has already been fixed.
Reference for the fix: https://centrify.force.com/support/Article/KB-1179-X11-Forwarding-fails-with-Centrify-OpenSSH-5-0-Solaris/
|
|
The test fails on Solaris 10. Maybe due to the IPv6 configuration on the
server, but I have no idea at all. I've asked @ngoto to investigate the
issue, so will tentatively skip the tests on Solaris
http://rubyci.s3.amazonaws.com/solaris10-gcc/ruby-master/log/20210729T040002Z.fail.html.gz
|
|
|
|
Notes:
Merged: https://github.com/ruby/ruby/pull/4513
|
|
because the name "MJIT" is an internal code name, it's inconsistent with
--jit while they are related to each other, and I want to discourage future
JIT implementation-specific (e.g. MJIT-specific) APIs by this rename.
[Feature #17490]
|
|
|
|
getaddrinfo_a() gets stuck after fork().
To avoid this, we need 1 second sleep to wait for internal
worker threads of getaddrinfo_a() to be finished, but that is unacceptable.
[Bug #17220] [Feature #17134] [Feature #17187]
|
|
|
|
* Otherwise those tests, etc cannot run on alternative Ruby implementations.
|
|
We need to stop worker threads in getaddrinfo_a() before fork().
This change adds a hook before fork() that cancels all outstanding requests
and waits for all ongoing requests. Then, it waits for all worker
threads to be finished.
Fixes [Bug #17220]
|
|
When interfaces do not include localhost,
some other tests may fail.
|
|
hoping to stabilize:
https://app.wercker.com/ruby/ruby/runs/mjit-test1/5d6df8a8a952c20008acf75b?step=5d6df90e4971a6000714c627
|
|
Fix following error on `utun*`:
```
1) Error:
TestSocket#test_udp_server:
Errno::ECONNREFUSED: Connection refused - recvmsg(2)
```
|
|
|
|
CI failures are still happening from these tests, but try
to break out of it earlier instead of holding up the job.
[Bug #14898]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64484 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
It looks like we need to retry test_timestampns in addition
to test_timestamp; so share some code while we're at it.
cf. http://ci.rvm.jp/results/trunk-test@frontier/1153126
[ruby-core:88104] [Bug #14898]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@64157 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
I theorize there can be UDP packet loss even over loopback if
the kernel is under memory pressure. Retry sending periodically
until recvmsg succeeds.
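The retry idea can be sketched as below. This is an illustration under assumptions, not the test code from the commit: the method name, the retry interval, and the retry limit are all made up for the example, and it uses `recvfrom` on a loopback `UDPSocket` pair.

```ruby
require 'socket'

# Re-send +message+ over loopback UDP until the receiver has it, or give up
# after +max_tries+ attempts. Returns the received data, or nil on give-up.
def send_until_received(message, interval: 0.1, max_tries: 50)
  receiver = UDPSocket.new
  receiver.bind("127.0.0.1", 0)   # port 0: let the OS pick a free port
  port = receiver.addr[1]
  sender = UDPSocket.new

  max_tries.times do
    sender.send(message, 0, "127.0.0.1", port)
    # Wait up to +interval+ seconds for the datagram before re-sending.
    if IO.select([receiver], nil, nil, interval)
      data, _sender_info = receiver.recvfrom(message.bytesize)
      return data
    end
  end
  nil
ensure
  sender&.close
  receiver&.close
end

puts send_until_received("ping").inspect
```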
[ruby-core:87842] [Bug #14898]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@63872 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* The warnings are shown by Thread.report_on_exception defaulting to
true. [Feature #14143] [ruby-core:83979]
* Improves tests by narrowing down the scope where an exception
is expected.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@61188 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* ext/socket/socket.c (sock_s_getnameinfo): check null byte.
patched by tommy (Masahiro Tomita) in [ruby-dev:50286].
[Bug #13994]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@60162 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
AIX does not set the MSG_TRUNC flag for a message partially read
by recvmsg(2) with the MSG_PEEK flag set.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@54073 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
some environments disable IPv6 even if they have IPv6 addresses.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@53294 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
when Socket::SO_TIMESTAMP is not defined.
Fix error on Solaris 10. [Bug #11728] [ruby-dev:49377]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52701 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
nil on Solaris 10, which has no HAVE_STRUCT_MSGHDR_MSG_CONTROL.
Reported by Naohisa Goto. [ruby-core:71557] [Bug #11709]
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52657 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52647 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* ext/socket/ancdata.c (bsock_recvmsg_internal): grow buffer
on unspecified maxdatlen
[ruby-core:71517] [Bug #11701]
* ext/socket/lib/socket.rb (Socket#recvmsg): nil default for dlen
(Socket#recvmsg_nonblock): ditto
* test/socket/test_socket.rb (test_recvmsg_udp_no_arg): new test
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52625 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@52576 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@50727 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
after path length check.
This fixes a fd leak by TestSocket_UNIXSocket#test_too_long_path.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46218 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|
|
* test/socket/test_socket.rb (test_udp_server): ignore interface
with no address assigned.
git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@46204 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
|