ruby.git - The Ruby Programming Language

Age	Commit message	Author
24 hours	[Bug #20998] Check if the string is frozen in rb_str_locktmp() & rb_str_unlocktmp()	Benoit Daloze
	Notes: Merged: https://github.com/ruby/ruby/pull/13615
4 days	Get rid of FL_EXIVAR	Jean Boussier
	Now that the shape_id gives us all the same information, it's no longer needed. Notes: Merged: https://github.com/ruby/ruby/pull/13612
4 days	Use the `shape_id` rather than `FL_EXIVAR`	Jean Boussier
	We still keep setting `FL_EXIVAR` so that `rb_shape_verify_consistency` can detect discrepancies. Notes: Merged: https://github.com/ruby/ruby/pull/13612
4 days	Add SHAPE_ID_HAS_IVAR_MASK for quick ivar check	Jean Boussier
	This allows checking if an object has ivars with just a shape_id mask. Notes: Merged: https://github.com/ruby/ruby/pull/13606
2025-05-29	[Bug #21380] Prohibit modification in String#split block	Nobuyoshi Nakada
	Reported at https://hackerone.com/reports/3163876 Notes: Merged: https://github.com/ruby/ruby/pull/13462
2025-05-27	Rename `rb_shape_set_shape_id` to `rb_obj_set_shape_id`	Jean Boussier
	Notes: Merged: https://github.com/ruby/ruby/pull/13450
2025-05-26	[DOC] More tweaks for String#byteindex	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13440
2025-05-26	Add shape_id to RBasic under 32 bit	John Hawthorn
	This makes `RBobject` `4B` larger on 32 bit systems but simplifies the implementation a lot. [Feature #21353] Co-authored-by: Jean Boussier <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/13341
2025-05-25	Use RB_VM_LOCKING	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/13439
2025-05-22	[DOC] Tweaks for String#byteindex	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13365
2025-05-16	[DOC] Tweaks for String#append_as_bytes	Burdette Lamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13352 Merged-By: peterzhu2118 <[email protected]>
2025-05-16	[DOC] Tweaks for String#b	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13354
2025-05-16	[DOC] Tweaks for String#ascii_only?	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13353
2025-05-15	[DOC] Tweaks for String#=~ (#13325)	Burdette Lamar
	Notes: Merged-By: peterzhu2118 <[email protected]>
2025-05-14	[DOC] Tweaks for String#<< (#13306)	Burdette Lamar
	Notes: Merged-By: peterzhu2118 <[email protected]>
2025-05-14	[DOC] Tweaks for String#== (#13323)	Burdette Lamar
	Notes: Merged-By: peterzhu2118 <[email protected]>
2025-05-14	[DOC] Tweaks for String#[] (#13335)	Burdette Lamar
	Notes: Merged-By: peterzhu2118 <[email protected]>
2025-05-14	[DOC] Tweaks for String#[]=	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13336
2025-05-13	[DOC] Tweaks for String#<=>	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13320
2025-05-13	[DOC] Remove a garbage	Nobuyoshi Nakada

2025-05-12	[DOC] Tweak for String#+@ (#13285)	Burdette Lamar
	Notes: Merged-By: peterzhu2118 <[email protected]>
2025-05-08	[DOC] Tweaks for What's Here	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13281
2025-05-08	[DOC] Tweaks for String#-@	Burdette Lamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13254 Merged-By: peterzhu2118 <[email protected]>
2025-05-08	Move `object_id` in object fields.	Jean Boussier
	And get rid of the `obj_to_id_tbl`. It's no longer needed: the `object_id` is now stored inline in the object alongside instance variables. We still need the inverse table in case `_id2ref` is invoked, but we lazily build it by walking the heap if that happens. The `object_id` concern is also no longer a GC implementation concern, but a generic implementation. Co-Authored-By: Matt Valentine-House <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/13159
2025-05-04	[DOC] Tweaks for String#+	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13252
2025-05-04	[DOC] Tweaks for String#*	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13251
2025-05-04	[DOC] Tweaks for String#%	BurdetteLamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13244
2025-05-01	[DOC] Tweaks for String.new	Burdette Lamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13027 Merged-By: peterzhu2118 <[email protected]>
2025-04-30	Suppress gcc 15 unterminated-string-initialization warnings	Nobuyoshi Nakada

2025-04-23	Fix comparison of signed and unsigned integers	Jean Boussier
	```
	../string.c:660:38: warning: comparison of integers of different signs: 'rb_atomic_t' (aka 'unsigned int') and 'int' [-Wsign-compare]
	  660 |         RUBY_ASSERT(table->count < table->capacity / 2);
	```
	Notes: Merged: https://github.com/ruby/ruby/pull/13160
2025-04-19	Fix style [ci skip]	Nobuyoshi Nakada

2025-04-19	Implement dsize function for `fstring_table_type`	Jean Boussier
	The fstring table size used to be reported as part of the VM size, but since it was refactored to be lock-less it was no longer reported. Since it's now wrapped by a `T_DATA`, we can implement its `dsize` function and get a valuable insight into the size of the table. ``` {"address":"0x100ebff18", "type":"DATA", "shape_id":0, "slot_size":80, "struct":"VM/fstring_table", "memsize":131176, ... ``` Notes: Merged: https://github.com/ruby/ruby/pull/13138
2025-04-19	Fix style of recent fstring feature	Jean Boussier
	Notes: Merged: https://github.com/ruby/ruby/pull/13137
2025-04-18	Lock-free hash set for fstrings [Feature #21268]	John Hawthorn
	This implements a hash set which is wait-free for lookup and lock-free for insert (unless resizing) to use for fstring de-duplication.
	As highlighted in https://bugs.ruby-lang.org/issues/19288, heavy use of fstrings (frozen interned strings) can significantly reduce the parallelism of Ractors.
	I tried a few other approaches first: using an RWLock, striping a series of RWLocks (partitioning the hash N-ways to reduce lock contention), and putting a cache in front of it. All of these improved the situation, but were unsatisfying as all still required locks for writes (and granular locks are awkward, since we run the risk of needing to reach a vm barrier) and this table is somewhat write-heavy.
	My main reference for this was Cliff Click's talk on a lock-free hash table for Java: https://www.youtube.com/watch?v=HJ-719EGIts.
	It turns out this lock-free hash set is made easier to implement by a few properties:
	* We only need a hash set rather than a hash table (we only need keys, not values), and so the full entry can be written as a single VALUE
	* As a set we only need lookup/insert/delete, no update
	* Delete is only run inside GC so does not need to be atomic (It could be made concurrent)
	* I use rb_vm_barrier for the (rare) table rebuilds (It could be made concurrent). We VM lock (but don't require other threads to stop) for table rebuilds, as those are rare
	* The conservative garbage collector makes deferred replication easy, using a T_DATA object
	Another benefit of having a table specific to fstrings is that we compare by value on lookup/insert, but by identity on delete, as we only want to remove the exact string which is being freed. This is faster and provides a second way to avoid the race condition in https://bugs.ruby-lang.org/issues/21172.
	This is a pretty standard open-addressing hash table with quadratic probing, similar to our existing st_table or id_table. Deletes (which happen on GC) replace existing keys with a tombstone, which is the only type of update which can occur. Tombstones are only cleared out on resize.
	Unlike st_table, the VALUEs are stored in the hash table itself (st_table's bins) rather than as a compact index. This avoids an extra pointer dereference and is possible because we don't need to preserve insertion order.
	The table targets a load factor of 1/2 (it is enlarged once it is half full).
	Notes: Merged: https://github.com/ruby/ruby/pull/12921
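The table layout described in this commit message can be sketched in plain Ruby. This is a single-threaded illustration only: it is not lock-free, it uses no atomics, and the class name `ProbingSet` and its methods are hypothetical, not taken from string.c. It shows open addressing with quadratic (triangular) probing, tombstones on delete, lookup by value, delete by identity, and growth once the table would be more than half full.

```ruby
# Single-threaded sketch of an open-addressing set with quadratic probing.
# ProbingSet is a hypothetical name; this is not the string.c implementation.
class ProbingSet
  EMPTY     = Object.new.freeze  # slot that has never held a key
  TOMBSTONE = Object.new.freeze  # slot whose key was deleted

  attr_reader :capacity

  def initialize(capacity = 8)
    @capacity = capacity         # kept a power of two so probing reaches every slot
    @slots = Array.new(capacity, EMPTY)
    @used = 0                    # live keys plus tombstones
  end

  # Returns the stored key equal (by value) to `key`, inserting `key` if absent.
  def find_or_insert(key)
    resize if (@used + 1) * 2 > @capacity  # grow before exceeding half full
    each_probe(key) do |i|
      slot = @slots[i]
      if slot.equal?(EMPTY)
        @slots[i] = key
        @used += 1
        return key
      elsif !slot.equal?(TOMBSTONE) && slot == key
        return slot
      end
    end
    raise "no free slot found"   # unreachable while the table stays half empty
  end

  def include?(key)
    each_probe(key) do |i|
      slot = @slots[i]
      return false if slot.equal?(EMPTY)
      return true if !slot.equal?(TOMBSTONE) && slot == key
    end
    false
  end

  # Deletes by identity: an equal but distinct object is left in place.
  def delete(key)
    each_probe(key) do |i|
      slot = @slots[i]
      return false if slot.equal?(EMPTY)
      if slot.equal?(key)
        @slots[i] = TOMBSTONE
        return true
      end
    end
    false
  end

  private

  # Visits start, start+1, start+3, start+6, ... (triangular offsets).
  def each_probe(key)
    mask = @capacity - 1
    start = key.hash & mask
    @capacity.times { |n| yield((start + n * (n + 1) / 2) & mask) }
  end

  # Doubles the table and reinserts live keys; tombstones are dropped here.
  def resize
    live = @slots.reject { |s| s.equal?(EMPTY) || s.equal?(TOMBSTONE) }
    @capacity *= 2
    @slots = Array.new(@capacity, EMPTY)
    @used = 0
    live.each { |k| find_or_insert(k) }
  end
end
```

Inserting two equal strings returns the first one both times, and deleting the second (never stored) copy leaves the first in the table, which is the by-value lookup and by-identity delete split the message describes.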
2025-04-18	Extract rb_gc_free_fstring to string.c	John Hawthorn
	This allows more flexibility in how we deal with the fstring table Notes: Merged: https://github.com/ruby/ruby/pull/12921
2025-04-14	Assert the GVL is held when performing various `rb_` functions.	Samuel Williams
	[Feature #20877] Notes: Merged: https://github.com/ruby/ruby/pull/11975
2025-04-02	[DOC] Tweaks to String::try_convert	Burdette Lamar
	Notes: Merged: https://github.com/ruby/ruby/pull/13030 Merged-By: peterzhu2118 <[email protected]>
2025-03-27	Freeze $/ and make it ractor safe	Étienne Barrié
	[Feature #21109] By always freezing when setting the global rb_rs variable, we can ensure it is not modified and can be accessed from a ractor. We're also making sure it's an instance of String and does not have any instance variables. Of course, if $/ is changed at runtime, it may cause surprising behavior but doing so is deprecated already anyway. Co-authored-by: Jean Boussier <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12975
2025-03-08	string.c: Improve `fstring_hash` to reduce collisions	Jean Boussier
	`rb_str_hash` doesn't include the encoding for ASCII only strings because ASCII only strings are equal regardless of their encoding. But in the case of the `fstring_table`, two identical ASCII strings with different encodings aren't equal. Given it's common to have both `:foo` (or `def foo`) and `"foo"` in the same source code, this causes a lot of collisions in the `fstring_table`. Notes: Merged: https://github.com/ruby/ruby/pull/12881
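The collision described above can be observed from Ruby itself. The sketch below shows that two ASCII-only strings with different encodings compare equal and share `String#hash`; the helper `encoding_aware_key` is hypothetical and only illustrates the idea of folding the encoding into the hash, not how `fstring_hash` is actually written.

```ruby
# Two ASCII-only strings with identical bytes but different encodings.
def utf8_foo
  "foo".dup.force_encoding(Encoding::UTF_8)
end

def ascii_foo
  "foo".dup.force_encoding(Encoding::US_ASCII)
end

# Hypothetical helper: mix the encoding into the key so that strings a table
# treats as distinct do not land on the same hash value.
def encoding_aware_key(str)
  [str, str.encoding].hash
end

puts utf8_foo == ascii_foo            # value equality ignores the encoding here
puts utf8_foo.hash == ascii_foo.hash  # and so does String#hash
puts encoding_aware_key(utf8_foo) == encoding_aware_key(ascii_foo)
```

A table that keeps both strings as separate entries while hashing them identically pays for a probe collision on every such pair, which is the cost the commit sets out to remove.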
2025-03-05	Fix a race condition with interned strings sweeping.	Jean Boussier
	[Bug #21172] This fixes a rare CI failure. The timeline of the race condition is:
	- A `"foo" oid=1` string is interned.
	- `"foo" oid=1` is no longer referenced and will be swept in the future.
	- Another `"foo" oid=2` string is interned.
	- `register_fstring` finds `"foo" oid=1`, but since it is about to be swept, removes it from `fstring_table` and inserts `"foo" oid=2` instead.
	- `"foo" oid=1` is swept; since it has the `RSTRING_FSTR` flag, a `st_delete` is issued in `fstring_table`, which removes `"foo" oid=2`.
	I don't know how to reproduce this bug consistently in a single test case.
	Notes: Merged: https://github.com/ruby/ruby/pull/12857
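The timeline can be replayed deterministically with a toy table. In the sketch below a plain Ruby Hash stands in for `fstring_table`, and all method names are hypothetical. The identity-checking sweeper is one way to avoid the problem (the later lock-free hash set commit describes deleting by identity); it is not claimed to be the fix applied in this commit.

```ruby
# Sweeper that deletes whatever entry is equal to the dying string.
def sweep_by_value(table, str)
  table.delete(str)
end

# Sweeper that deletes the entry only if it is the exact object being freed.
def sweep_by_identity(table, str)
  table.delete(str) if table[str].equal?(str)
end

# Replays the timeline and reports whether "foo" is still interned at the end.
def replay(sweeper)
  table = {}
  old = "foo".dup
  table[old] = old            # "foo" oid=1 is interned
  fresh = "foo".dup           # another "foo" (oid=2) is about to be interned
  table.delete(old)           # the dying oid=1 entry is evicted...
  table[fresh] = fresh        # ...and oid=2 takes its place
  send(sweeper, table, old)   # oid=1 is finally swept
  table.key?("foo")
end

puts replay(:sweep_by_value)     # the live oid=2 entry is lost
puts replay(:sweep_by_identity)  # the live oid=2 entry survives
```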
2025-02-24	String#gsub! Elide MatchData allocation when we know it can't escape	Jean Boussier
	If gsub is used with a string replacement or a map that doesn't have a default proc, we know for sure no code can cause the MatchData to escape the `gsub` call. In such a case, we still have to allocate a new MatchData because we don't know what the lifetime of the backref is, but for any subsequent match we can re-use the MatchData we allocated ourselves, reducing allocations significantly. This partially fixes [Misc #20652], except when a block is used, and partially reduces the performance impact of abc0304cb28cb9dcc3476993bc487884c139fd11 / [Bug #17507]
	```
	compare-ruby: ruby 3.5.0dev (2025-02-24T09:44:57Z master 5cf146399f) +PRISM [arm64-darwin24]
	built-ruby: ruby 3.5.0dev (2025-02-24T10:58:27Z gsub-elude-match da966636e9) +PRISM [arm64-darwin24]
	warming up....

	|                 |compare-ruby|built-ruby|
	|:----------------|-----------:|---------:|
	|escape           |      3.577k|    3.697k|
	|                 |           -|     1.03x|
	|escape_bin       |      5.869k|    6.743k|
	|                 |           -|     1.15x|
	|escape_utf8      |      3.448k|    3.738k|
	|                 |           -|     1.08x|
	|escape_utf8_bin  |      6.361k|    7.267k|
	|                 |           -|     1.14x|
	```
	Co-Authored-By: Étienne Barrié <[email protected]>
2025-02-12	Elide string allocation when using `String#gsub` in MAP mode	Jean Boussier
	If the provided Hash doesn't have a default proc, we know for sure that we'll never call into user provided code, hence the string we allocate to access the Hash can't possibly escape. So we don't actually have to allocate it, we can use a fake_str, AKA a stack allocated string.
	```
	compare-ruby: ruby 3.5.0dev (2025-02-10T13:47:44Z master 3fb455adab) +PRISM [arm64-darwin23]
	built-ruby: ruby 3.5.0dev (2025-02-10T17:09:52Z opt-gsub-alloc ea5c28958f) +PRISM [arm64-darwin23]
	warming up....

	|                 |compare-ruby|built-ruby|
	|:----------------|-----------:|---------:|
	|escape           |      3.374k|    3.722k|
	|                 |           -|     1.10x|
	|escape_bin       |      5.469k|    6.587k|
	|                 |           -|     1.20x|
	|escape_utf8      |      3.465k|    3.734k|
	|                 |           -|     1.08x|
	|escape_utf8_bin  |      5.752k|    7.283k|
	|                 |           -|     1.27x|
	```
	Notes: Merged: https://github.com/ruby/ruby/pull/12730
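For readers unfamiliar with "MAP mode", this is `String#gsub` called with a Hash as the replacement: each match is looked up as a key and replaced by the corresponding value. The sketch below is an illustrative example of that calling style, with a Hash that has no default proc; the constant `ESCAPES` and the method `escape_html` are made up for the example and are not the benchmark code.

```ruby
# A replacement map without a default proc: lookups never run user code.
ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

# Each character matched by the pattern is replaced by ESCAPES[match].
def escape_html(text)
  text.gsub(/[&<>]/, ESCAPES)
end

puts escape_html("a < b && c")
puts ESCAPES.default_proc.nil?
```

Because the Hash has no default proc, the lookup key built for each match cannot be observed by Ruby code, which is the condition the commit relies on to skip the allocation.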
2025-01-22	[DOC] Fix code markup in String#match	Kouhei Yanagita
	Notes: Merged: https://github.com/ruby/ruby/pull/12608
2025-01-12	[Doc] Encourage use of encoding constants	Jean Boussier
	Lots of documentation examples still use encoding APIs with encoding names rather than encoding constants. I think it would be preferable to direct users toward constants as it can help with auto-completion, static analysis and such. Notes: Merged: https://github.com/ruby/ruby/pull/12552
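As a small illustration of the two styles, the sketch below re-tags the same bytes once by encoding name and once by encoding constant. Both calls are standard `String#force_encoding` usage; the method names are hypothetical.

```ruby
# Style the documentation is moving away from: encoding given by name.
def decode_by_name(bytes)
  bytes.dup.force_encoding("UTF-8")
end

# Preferred style: encoding given by constant, which tools can complete and check.
def decode_by_constant(bytes)
  bytes.dup.force_encoding(Encoding::UTF_8)
end

raw = "caf\u00E9".b   # binary copy of the UTF-8 bytes
puts decode_by_name(raw) == decode_by_constant(raw)
puts decode_by_constant(raw).encoding
```

The results are identical; the difference is only that the constant is a name the interpreter and editors already know about.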
2025-01-02	[DOC] Exclude 'Class' and 'Module' from RDoc's autolinking	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/12496
2024-12-13	[DOC] [Feature #20205] Document the new power of String#+@	Alan Wu
	Notes: Merged: https://github.com/ruby/ruby/pull/12341
2024-11-27	Optimize `rb_must_asciicompat`	Jean Boussier
	While profiling `strscan`, I noticed `rb_must_asciicompat` was quite slow, as more than 5% of the benchmark was spent in it: https://share.firefox.dev/49bOcTn By checking for the 3 most common ASCII-compatible encoding indexes first, we can skip a lot of expensive operations in the happy path. Notes: Merged: https://github.com/ruby/ruby/pull/12180
2024-11-26	Many of Oniguruma functions need valid encoding strings	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/12169
2024-11-26	Check negative integer underflow	Nobuyoshi Nakada
	Notes: Merged: https://github.com/ruby/ruby/pull/12169
2024-11-25	Place all non-default GC API behind USE_SHARED_GC	Matt Valentine-House
	So that it doesn't get included in the generated binaries for builds that don't support loading shared GC modules Co-Authored-By: Peter Zhu <[email protected]> Notes: Merged: https://github.com/ruby/ruby/pull/12149