Details
Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
3.5.5, 4.0.0
Description
On glibc-based Linux systems, select() can monitor only file descriptors numbered below FD_SETSIZE (1024).
This is an unreasonably low limit for many modern applications.
When running via pyspark we frequently observe:
Exception occurred during processing of request from ('127.0.0.1', 46334)
Traceback (most recent call last):
  File "/usr/lib/python3.11/socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/lib/python3.11/socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
  File "/usr/lib/python3.11/socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python3.11/socketserver.py", line 755, in __init__
    self.handle()
  File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 293, in handle
    poll(authenticate_and_accum_updates)
  File "/usr/lib/python3.11/site-packages/pyspark/accumulators.py", line 266, in poll
    r, _, _ = select.select([self.rfile], [], [], 1)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: filedescriptor out of range in select()
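On POSIX systems, poll() should be used instead of select(), because poll() takes an array of descriptors rather than a fixed-size bit set and is therefore not bound by FD_SETSIZE.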
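A minimal sketch of what such a replacement could look like in Python, for illustration only. It is not the actual patch applied to pyspark/accumulators.py; the helper name wait_readable and its timeout_s parameter are hypothetical. It uses select.poll() where the platform provides it and falls back to select.select() elsewhere (select.poll() is not available on Windows, for example).

```python
import select
import socket


def wait_readable(sock, timeout_s=1.0):
    """Return True if `sock` has an event pending within `timeout_s` seconds.

    Hypothetical helper: prefers poll(), which is not limited by FD_SETSIZE,
    and falls back to select() on platforms without poll().
    """
    if hasattr(select, "poll"):
        poller = select.poll()
        # POLLIN: data available to read. Hang-up and error conditions are
        # reported by poll() even though they are not requested here.
        poller.register(sock, select.POLLIN)
        # poll() expects its timeout in milliseconds, unlike select().
        events = poller.poll(int(timeout_s * 1000))
        return bool(events)

    # Fallback path: still subject to the FD_SETSIZE limit where it applies.
    readable, _, _ = select.select([sock], [], [], timeout_s)
    return bool(readable)


if __name__ == "__main__":
    a, b = socket.socketpair()
    try:
        print(wait_readable(a, 0.05))  # nothing has been sent yet
        b.send(b"x")
        print(wait_readable(a, 0.05))  # one byte is now pending
    finally:
        a.close()
        b.close()
```

The poll() branch returns true for any reported event, including a hang-up, so a caller would still need to handle an empty read as end-of-stream, much as it would after select() reports a closed socket as readable.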