ruby.git - The Ruby Programming Language

Age	Commit message	Author
2 days	merge revision(s) fa85d23ff4a02985ebfe0716b0ff768f5b4fe13d: [Backport #21380] (ruby_3_3)	nagachika
	[Bug #21380] Prohibit modification in String#split block. Reported at https://hackerone.com/reports/3163876
2025-03-16	merge revision(s) c224ca4feaff20cab03d76439bcbfb35d4e2f6b1: [Backport #21172]	nagachika
	Fix a race condition with interned strings sweeping. [Bug #21172] This fixes a rare CI failure. The timeline of the race condition is:
	- A `"foo" oid=1` string is interned.
	- `"foo" oid=1` is no longer referenced and will be swept in the future.
	- Another `"foo" oid=2` string is interned.
	- `register_fstring` finds `"foo" oid=1`, but since it is about to be swept, removes it from `fstring_table` and inserts `"foo" oid=2` instead.
	- `"foo" oid=1` is swept; since it has the `RSTRING_FSTR` flag, a `st_delete` is issued in `fstring_table`, which removes `"foo" oid=2`.
	I don't know how to reproduce this bug consistently in a single test case.
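The timeline above can be replayed with a toy model. This is only a sketch under stated assumptions: a plain Hash stands in for `fstring_table`, the `Interned` struct and `run_timeline` method are hypothetical names, and the identity check in the guarded branch is one illustrative way to avoid the wrong deletion, not necessarily what the commit itself does.

```ruby
# Toy model only: a Hash keyed by content stands in for fstring_table,
# and a Struct with an oid stands in for an interned string object.
Interned = Struct.new(:content, :oid)

def run_timeline(guarded:)
  table = {}

  old = Interned.new("foo", 1)
  table[old.content] = old            # "foo" oid=1 is interned

  # oid=1 is now unreferenced; a second "foo" is interned and replaces it.
  fresh = Interned.new("foo", 2)
  table[fresh.content] = fresh

  # Sweeping oid=1: an unguarded delete goes by content and hits oid=2.
  # The guarded variant (an assumption for illustration) checks identity first.
  if guarded
    table.delete(old.content) if table[old.content].equal?(old)
  else
    table.delete(old.content)
  end

  table["foo"]
end

p run_timeline(guarded: false)   # the live entry has been lost
p run_timeline(guarded: true)    # the oid=2 entry survives
```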
2025-01-14	merge revision(s) 02b70256b5171d4b85ea7eeab836d3d7cfb3dbfc, 6b4f8945d600168bf530d21395da8293fbd5e8ba: [Backport #20909]	Takashi Kokubun
	Check negative integer underflow. Many Oniguruma functions need valid encoding strings.
2024-06-20	String.new(capacity:) don't subtract termlen (#11027)	Jean byroot Boussier
	[Bug #20585] This was changed in 36a06efdd9f0604093dccbaf96d4e2cb17874dc8 because `String.new(1024)` would end up allocating `1025` bytes, but the problem with this change is that the caller may be trying to right size a String. So instead, we should just better document the behavior of `capacity:`. Co-authored-by: Jean Boussier
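As a small illustration of what "right sizing" means for a caller, the sketch below pre-sizes a buffer with `capacity:`. The value 1024 and the variable names are arbitrary; the exact number of bytes allocated is an implementation detail and is deliberately not asserted here.

```ruby
# capacity: is a hint for the initial allocation, not a length:
# the new string is still empty.
buf = String.new(capacity: 1024)
puts buf.bytesize          # 0, nothing has been written yet

# Appending within the reserved capacity should not need to grow the buffer.
buf << "header\n"
buf << "body\n"
puts buf.bytesize          # bytes actually written, independent of capacity
```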
2024-05-29	merge revision(s) 7e4b1f8e1935a10df3c41ee60ca0987d73281126: [Backport #20322]	Takashi Kokubun
	[Bug #20322] Fix rb_enc_interned_str_cstr null encoding The documentation for `rb_enc_interned_str_cstr` notes that `enc` can be a null pointer, but this currently causes a segmentation fault when trying to autoload the encoding. This commit fixes the issue by checking for NULL before calling `rb_enc_autoload`.
2024-05-29	merge revision(s) e04146129ec6898dd6a9739dad2983c6e9b68056: [Backport #20292]	Takashi Kokubun
	[Bug #20292] Truncate embedded string to new capacity
2024-05-28	merge revision(s) 5e0c17145131e073814c7e5b15227d0b4e73cabe: [Backport #20169]	Takashi Kokubun
	Make io_fwrite safe for compaction [Bug #20169] Embedded strings are not safe for system calls without the GVL because compaction can cause pages to be locked causing the operation to fail with EFAULT. This commit changes io_fwrite to use rb_str_tmp_frozen_no_embed_acquire, which guarantees that the return string is not embedded.
2024-03-20	merge revision(s) ade56737e2273847426214035c0ff2340b43799a: [Backport #20190] (#10300)	NARUSE, Yui
	Fix coderange of invalid_encoding_string.<<(ord). Appending a valid encoding character can change the coderange from invalid to valid. Example: `"\x95".force_encoding('sjis')<<0x5C` will be a valid string "\x{955C}".
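The example in this message can be spelled out step by step. This is a hedged sketch: the variable names are made up, and the final validity check is done on a string rebuilt from the raw bytes so that it does not depend on whether the running Ruby already contains this fix.

```ruby
# A lone Shift_JIS lead byte is not a complete character.
str = "\x95".dup.force_encoding("sjis")
before = str.valid_encoding?

# Append the trail byte as an integer, as in the commit message.
str << 0x5C

# Rebuild from raw bytes to get a string with no cached coderange.
rebuilt = str.bytes.pack("C*").force_encoding("sjis")

puts before                    # the lone lead byte is invalid
puts rebuilt.valid_encoding?   # the two bytes together form one character
```

On a Ruby that includes the fix, `str.valid_encoding?` is expected to agree with the rebuilt string.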
2024-03-14	merge revision(s) b3d612804946e841e47d14e09b6839224a79c1a4: [Backport #20150] (#10253)	NARUSE, Yui
	Fix memory leak in grapheme clusters [Bug #20150] String#grapheme_clusters and String#each_grapheme_cluster leak memory because if the string is not UTF-8, then the created regex will not be freed. For example:
	    str = "hello world".encode(Encoding::UTF_32LE)
	    10.times do
	      1_000.times do
	        str.grapheme_clusters
	      end
	      puts `ps -o rss= -p #{$$}`
	    end
	Before: 26000 42256 59008 75792 92528 109232 125936 142672 159392 176160
	After: 9264 9504 9808 10000 10128 10224 10352 10544 10704 10896
	---
	string.c | 98 +++++++++++++++++++++++++++++++-----------------
	test/ruby/test_string.rb | 11 ++++++
	2 files changed, 75 insertions(+), 34 deletions(-)
2023-12-24	Fix Symbol#inspect for GC compaction	Peter Zhu
	The test fails when RGENGC_CHECK_MODE is turned on: 1) Failure: TestSymbol#test_inspect_under_gc_compact_stress [test/ruby/test_symbol.rb:123]: <":testing"> expected but was <":\x00\x00\x00\x00\x00\x00\x00">.
2023-12-23	Fix String#sub for GC compaction	Peter Zhu
	The test fails when RGENGC_CHECK_MODE is turned on: TestString#test_sub_gc_compact_stress = 9.42 s 1) Failure: TestString#test_sub_gc_compact_stress [test/ruby/test_string.rb:2089]: <"aaa [amp] yyy"> expected but was <"aaa [] yyy">.
2023-12-17	Stir the hash value more with encoding index	Nobuyoshi Nakada

2023-12-16	[Bug #20068] Encoding does not matter to empty strings	Nobuyoshi Nakada

2023-12-13	Make String#chomp! raise ArgumentError for 2+ arguments if string is empty	Jeremy Evans
	String#chomp! returned nil without checking the number of passed arguments in this case.
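A quick way to observe the argument check is a small probe method. `chomp_bang_outcome` is a hypothetical helper written for this note; the result of the last call depends on whether the running Ruby includes this change, so no expected output is given for it.

```ruby
# Hypothetical helper: returns the result of chomp!, or the exception
# class if the call is rejected with ArgumentError.
def chomp_bang_outcome(str, *args)
  str.chomp!(*args)
rescue ArgumentError => e
  e.class
end

p chomp_bang_outcome("abc\n".dup)         # trailing newline removed
p chomp_bang_outcome("abc".dup, "a", "b") # too many arguments
p chomp_bang_outcome("".dup, "a", "b")    # the case this commit changes
```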
2023-12-01	Make String#undump compaction safe	Peter Zhu

2023-12-01	Pin embedded shared strings	Peter Zhu
	Embedded shared strings cannot be moved because strings point into the slot of the shared string. There may be code using the RSTRING_PTR on the stack, which would pin the string but not pin the shared string, causing it to move.
2023-11-29	Guard match from GC in String#gsub	Peter Zhu
	We need to guard match from GC because otherwise it could end up being reclaimed or moved in compaction.
2023-11-27	Guard match from GC when scanning string	Peter Zhu
	We need to guard match from GC because otherwise it could end up being reclaimed or moved in compaction.
2023-11-20	Specialize String#dup	Jean Boussier
	`String#+@` is 2-3 times faster than `String#dup` because it can directly go through `rb_str_dup` instead of using the generic, much slower `rb_obj_dup`. This fact led to the existence of the ugly `Performance/UnfreezeString` rubocop performance rule that encourages users to rewrite the much more readable and convenient `"foo".dup` into the ugly `(+"foo")`. Let's make that rubocop rule useless.
	```
	compare-ruby: ruby 3.3.0dev (2023-11-20T02:02:55Z master 701b0650de) [arm64-darwin22]
	last_commit=[ruby/prism] feat: add encoding for IBM865 (https://github.com/ruby/prism/pull/1884)
	built-ruby: ruby 3.3.0dev (2023-11-20T12:51:45Z faster-str-lit-dup 6b745bbc5d) [arm64-darwin22]
	warming up..

	|       |compare-ruby|built-ruby|
	|:------|-----------:|---------:|
	|uplus  |     16.312M|   16.332M|
	|       |           -|     1.00x|
	|dup    |      5.912M|   16.329M|
	|       |           -|     2.76x|
	```
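For readers unfamiliar with the two spellings being compared, the sketch below shows that both yield an unfrozen copy of a frozen string. The variable names are illustrative, and nothing here measures speed.

```ruby
literal = "foo".freeze

via_dup   = literal.dup   # the readable spelling
via_uplus = +literal      # unary plus: returns an unfrozen copy of a frozen string

puts via_dup.frozen?      # false
puts via_uplus.frozen?    # false
puts via_dup == via_uplus # true

# Both copies can be mutated without touching the original.
via_dup << "!"
puts literal              # foo
```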
2023-11-09	String#force_encoding don't clear coderange if encoding is unchanged	Jean Boussier
	Some code out there blindly calls `force_encoding` without checking what the original encoding was, which clears the coderange uselessly. If the String is big, it can be a rather costly mistake. For instance the `rack-utf8_sanitizer` gem does this on request bodies.
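Before this change, callers could avoid the cost themselves with an explicit check. The helper below is a hypothetical sketch of that caller-side guard (`ensure_encoding` is not a Ruby or Rack API); with the commit applied, the plain `force_encoding` call should be just as cheap when the encoding is unchanged.

```ruby
# Hypothetical caller-side guard: only retag the string when the
# encoding actually differs.
def ensure_encoding(str, encoding)
  str.force_encoding(encoding) unless str.encoding == encoding
  str
end

body = "caf\u00e9".dup                          # already UTF-8
ensure_encoding(body, Encoding::UTF_8)          # no-op

raw = "abc".b                                   # binary (ASCII-8BIT)
ensure_encoding(raw, Encoding::UTF_8)           # retagged as UTF-8
puts raw.encoding
```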
2023-11-08	String for string literal is not resizable	Nobuyoshi Nakada

2023-11-02	Make String.new size pools aware.	Jean Boussier
	If the required capacity would fit in an embedded string, return one. This can reduce malloc churn for code that uses string buffers.
2023-09-27	[DOC] Missing comment markers	Nobuyoshi Nakada

2023-09-26	[Bug #19902] Update the coderange regarding the changed region	Nobuyoshi Nakada

2023-09-01	Use end of char boundary in start_with?	John Hawthorn
	Previously we used the next character following the found prefix to determine if the match ended on a broken character. This had caused surprising behaviour when a valid character was followed by a UTF-8 continuation byte. This commit changes the behaviour to instead look for the end of the last character in the prefix. [Bug #19784] Co-authored-by: ywenc Co-authored-by: Nobuyoshi Nakada Notes: Merged: https://github.com/ruby/ruby/pull/8348
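The notion of a character boundary that this fix relies on can be sketched in plain Ruby for the UTF-8 case. `utf8_char_boundary?` is a hypothetical helper for illustration only; it is not the C-level `at_char_boundary` function, which has to handle other encodings as well.

```ruby
# Hypothetical helper, UTF-8 only: a byte offset is a character boundary
# when it is at either end of the string or the byte at that offset is
# not a continuation byte (bit pattern 10xxxxxx).
def utf8_char_boundary?(str, byte_pos)
  return false if byte_pos < 0 || byte_pos > str.bytesize
  return true if byte_pos == 0 || byte_pos == str.bytesize
  (str.getbyte(byte_pos) & 0xC0) != 0x80
end

s = "a\u3042b"   # bytes: 61 | E3 81 82 | 62
(0..s.bytesize).each do |pos|
  puts "#{pos}: #{utf8_char_boundary?(s, pos)}"
end
```

A prefix match that ends at offset 2 or 3 of this string would stop in the middle of the three-byte character, which is the situation the commit guards against.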
2023-08-26	[Bug #19784] Fix behaviors against prefix with broken encoding	Nobuyoshi Nakada
	- String#start_with?
	- String#delete_prefix
	- String#delete_prefix!
	Notes: Merged: https://github.com/ruby/ruby/pull/8296
2023-08-26	Introduce `at_char_boundary` function	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/8296
2023-08-23	Fix premature string collection during append	Alan Wu
	Previously, the following crashed due to use-after-free with AArch64 Alpine Linux 3.18.3 (aarch64-linux-musl):
	```ruby
	str = 'a' * (32*1024*1024)
	p({z: str})
	```
	32 MiB is the default for `GC_MALLOC_LIMIT_MAX`, and the crash could be dodged by setting `RUBY_GC_MALLOC_LIMIT_MAX` to large values. Under a debugger, one can see the `str2` of rb_str_buf_append() getting prematurely collected while str_buf_cat4() allocates capacity. Add GC guards so the buffer of `str2` lives across the GC run initiated in str_buf_cat4(). [Bug #19792]
2023-08-22	Use STR_EMBED_P instead of testing STR_NOEMBED	Peter Zhu

2023-08-18	Don't check for STR_NOEMBED in rb_fstring	Peter Zhu
	We don't need to check for STR_NOEMBED because the check above for STR_EMBED_P means that it can never be false. Notes: Merged: https://github.com/ruby/ruby/pull/8238
2023-08-11	[DOC] Don't suppress autolinks (#8208)	Burdette Lamar
	Notes: Merged-By: peterzhu2118
2023-08-03	No computing embed_capa_max in str_subseq	Kunshan Wang
	Fix str_subseq so that it does not attempt to predict the size of the object returned by str_alloc_heap. Notes: Merged: https://github.com/ruby/ruby/pull/8165
2023-07-28	Fill terminator properly	Nobuyoshi Nakada

2023-07-15	[Bug #19769] Fix range of size 1 in `String#tr`	alexandre184
	Notes: Merged: https://github.com/ruby/ruby/pull/8080 Merged-By: nobu
2023-07-09	Make the string index functions closer to symmetric	Nobuyoshi Nakada
	So that irregular parts may be more noticeable. Notes: Merged: https://github.com/ruby/ruby/pull/8047
2023-07-09	Make `rb_str_rindex` return byte index	Nobuyoshi Nakada
	Leave callers to convert the byte index to a char index, as with `rb_str_index`, so that `rb_str_rpartition` does not need to re-convert the char index to a byte index. Notes: Merged: https://github.com/ruby/ruby/pull/8047
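The byte-index versus char-index distinction can be seen from Ruby itself. The sketch below uses `String#byterindex` (available since Ruby 3.2) and converts its result to a character index by measuring the prefix; it illustrates the conversion in general, not how the C code performs it.

```ruby
str = "日本語の本"                      # five characters, three bytes each in UTF-8

byte_idx = str.byterindex("本")         # position counted in bytes
char_idx = str.byteslice(0, byte_idx).length   # convert by measuring the prefix

puts byte_idx            # 12
puts char_idx            # 4
puts str.rindex("本")    # 4, the character index Ruby callers normally see
```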
2023-07-09	[Bug #19763] Raise same message exception for regexp	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/8045
2023-06-28	Ensure the byte position is a valid boundary	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/7991
2023-06-28	[Bug #19748] Fix out-of-bound access in `String#byteindex`	Nobuyoshi Nakada

2023-06-28	[Bug #19746] `String#index` with regexp should clear `$~` unless matched	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/7988
2023-06-20	[DOC] Regexp doc (#7923)	Burdette Lamar
	Notes: Merged-By: peterzhu2118
2023-06-09	Assign into optimal size pools using String#split("")	Matt Valentine-House
	When String#split is used with an empty string as the field separator it effectively splits the original string into chars, and there is a pre-existing fast path for this using SPLIT_TYPE_CHARS. However this path creates an empty array in the smallest size pool and grows from there, despite already knowing the size of the desired array. This commit pre-allocates the correct size array in this case in order to allow the arrays to be embedded and avoid being allocated in the transient heap. Notes: Merged: https://github.com/ruby/ruby/pull/7919
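The equivalence that the fast path relies on is easy to check from Ruby. This only shows the observable behavior; the allocation details described above are internal and not visible here.

```ruby
word = "héllo"

pieces = word.split("")   # empty separator: one element per character

p pieces                  # ["h", "é", "l", "l", "o"]
puts pieces == word.chars # true
puts pieces.size == word.length
```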
2023-06-06	Unify length field for embedded and heap strings (#7908)	Peter Zhu
	* Unify length field for embedded and heap strings. The length field is of the same type and position in RString for both embedded and heap allocated strings, so we can unify it.
	* Remove RSTRING_EMBED_LEN
	Notes: Merged-By: maximecb
2023-06-05	[DOC] Update flags doc for strings	Peter Zhu
	The length of an embedded string is no longer in the flags.
2023-06-01	Simplify duplicated code	Peter Zhu
	The capacity of the string can be calculated using the str_capacity function. Notes: Merged: https://github.com/ruby/ruby/pull/7879
2023-06-01	Don't refetch ptr and len	Peter Zhu
	The call to RSTRING_GETMEM already fetched the pointer and length, so we don't need to fetch it again. Notes: Merged: https://github.com/ruby/ruby/pull/7879
2023-05-26	Remove dead code in string.c	Peter Zhu
	The STR_DEC_LEN macro is not used.
2023-04-06	[Feature #19474] Refactor NEWOBJ macros	Matt Valentine-House
	NEWOBJ_OF is now our canonical newobj macro. It takes an optional ec Notes: Merged: https://github.com/ruby/ruby/pull/7393
2023-04-04	[Feature #19579] Remove !USE_RVARGC code (#7655)	Peter Zhu
	Remove !USE_RVARGC code [Feature #19579] The Variable Width Allocation feature was turned on by default in Ruby 3.2. Since then, we haven't received bug reports or backports to the non-Variable Width Allocation code paths, so we assume that nobody is using it. We also don't plan on maintaining the non-Variable Width Allocation code, so we are going to remove it. Notes: Merged-By: maximecb
2023-03-18	RJIT: Optimize String#bytesize	Takashi Kokubun