[#108176] [Ruby master Bug#18679] Encoding::UndefinedConversionError: "\xE2" from ASCII-8BIT to UTF-8 — "taf2 (Todd Fisher)" <noreply@...>

Issue #18679 has been reported by taf2 (Todd Fisher).

8 messages 2022/04/05

[#108185] [Ruby master Feature#18683] Allow to create hashes with a specific capacity. — "byroot (Jean Boussier)" <noreply@...>

Issue #18683 has been reported by byroot (Jean Boussier).

13 messages 2022/04/06

[#108198] [Ruby master Feature#18685] Enumerator.product: Cartesian product of enumerators — "knu (Akinori MUSHA)" <noreply@...>

Issue #18685 has been reported by knu (Akinori MUSHA).

8 messages 2022/04/08

[#108201] [Ruby master Misc#18687] [ANN] Upgraded bugs.ruby-lang.org to Redmine 5.0 — "hsbt (Hiroshi SHIBATA)" <noreply@...>

Issue #18687 has been reported by hsbt (Hiroshi SHIBATA).

10 messages 2022/04/09

[#108216] [Ruby master Misc#18691] An option to run `make rbconfig.rb` in a different directory — "jaruga (Jun Aruga)" <noreply@...>

Issue #18691 has been reported by jaruga (Jun Aruga).

14 messages 2022/04/12

[#108225] [Ruby master Misc#18726] CI Error on c99 and c2x — "znz (Kazuhiro NISHIYAMA)" <noreply@...>

Issue #18726 has been reported by znz (Kazuhiro NISHIYAMA).

11 messages 2022/04/14

[#108235] [Ruby master Bug#18729] Method#owner and UnboundMethod#owner are incorrect after using Module#public/protected/private — "Eregon (Benoit Daloze)" <noreply@...>

Issue #18729 has been reported by Eregon (Benoit Daloze).

28 messages 2022/04/14

[#108237] [Ruby master Bug#18730] Double `return` event handling with different tracepoints — "hurricup (Alexandr Evstigneev)" <noreply@...>

Issue #18730 has been reported by hurricup (Alexandr Evstigneev).

8 messages 2022/04/14

[#108294] [Ruby master Bug#18743] Enumerator#next / peek re-use each others stacktraces — sos4nt <noreply@...>

Issue #18743 has been reported by sos4nt (Stefan Schüßler).

20 messages 2022/04/19

[#108301] [Ruby master Bug#18744] I used Jazzy to generate the doc for my iOS library, but it showed me a bug — "zhaoxinqiang (marc steven)" <noreply@...>

Issue #18744 has been reported by zhaoxinqiang (marc steven).

8 messages 2022/04/20

[ruby-core:108417] [Ruby master Feature#18757] Introduce %R for anchored regular expression patterns

From: "zeke (Zeke Gabrielse)" <noreply@...>
Date: 2022-04-27 15:00:05 UTC
List: ruby-core #108417
Issue #18757 has been updated by zeke (Zeke Gabrielse).

Description updated

Fix `validates_format` pattern

----------------------------------------
Feature #18757: Introduce %R for anchored regular expression patterns
https://bugs.ruby-lang.org/issues/18757#change-97449

* Author: zeke (Zeke Gabrielse)
* Status: Open
* Priority: Normal
----------------------------------------
When defining regular expression patterns, it's often the case that you want to anchor with `\A` and `\z` to match the full text input, rather than `^` and `$`, respectively, which may (unintentionally) match text including newlines. This is especially true in the context of a web application such as a Rails app. Unfortunately, `\A` and `\z` reduce the legibility of a regular expression.
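As a minimal sketch of the newline problem in plain Ruby (the constant names and the sample input are made up for illustration), compare a line-anchored pattern with a string-anchored one on input that contains a newline:

```ruby
# Line anchors: ^ and $ match at the boundaries of any line in the string.
LINE_ANCHORED = /^[a-z]+$/

# String anchors: \A and \z match only at the very start and very end.
STRING_ANCHORED = /\A[a-z]+\z/

# Illustrative input: a valid-looking first line followed by extra content.
input = "ok\n<script>"

line_result   = LINE_ANCHORED.match?(input)   # the first line "ok" satisfies ^...$
string_result = STRING_ANCHORED.match?(input) # the whole string must be [a-z]+

puts "line anchors:   #{line_result}"
puts "string anchors: #{string_result}"
```

The line-anchored pattern accepts the input because one of its lines matches, while the string-anchored pattern rejects it.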

For example, take this `ActionMailbox` usage:

```ruby
class ApplicationMailbox < ActionMailbox::Base
  routing %r{\Areplies\+.*?@ruby-lang\.org\z}i => :replies
  routing %r{\Asales@.*?\z}i                   => :leads
end
```

At first glance, it may look as if the second route matches `Asales`, but on closer inspection that's not the case. To improve legibility, a developer may choose to use `^` instead of `\A`, because readability suffers when a pattern is defined with `\A` and `\z`, and `\A` is the worse offender. In other cases, developers forget to use `\A` and `\z` instead of `^` and `$` when validating or matching against user input.

I propose that Ruby introduce a new percent-notation, `%R{}`, for defining interpolated regular expression patterns that are automatically anchored with `\A` and `\z`.

For example, the above will look like below:

```ruby
class ApplicationMailbox < ActionMailbox::Base
  routing %R{replies\+.*?@ruby-lang\.org}i => :replies
  routing %R{sales@.*?}i                   => :leads
end
```

This is much more readable, and it's safer — developers using `%R{}` are not going to accidentally use `^` or `$` instead of `\A` and `\z`, respectively (the former being vulnerable to matching input data containing newlines).

This is especially useful in pattern matching data where some values may be a symbol or a string, depending on where the data originated (internally vs externally):

```ruby
data = { type: :foo, id: 1 } # Could also be: { type: 'foo', id: 1 }

case data
in type: %R(foo), id:
  # ...
else
end
```
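For comparison, here is a sketch of how the same match can be written in Ruby today (3.0 or later) with explicit anchors. It relies on `Regexp#===` accepting both strings and symbols; the `classify` method and its return values are hypothetical names used only for this example:

```ruby
# Hypothetical helper for illustration: matches the :type value whether it
# arrives as a Symbol (internal data) or a String (external data).
def classify(data)
  case data
  in { type: /\Afoo\z/, id: }
    "foo:#{id}"
  else
    "unknown"
  end
end

puts classify({ type: :foo,  id: 1 })
puts classify({ type: "foo", id: 2 })
puts classify({ type: :foobar, id: 3 })
```

With `%R(foo)`, the pattern in the `in` clause would carry the same anchoring without spelling out `\A` and `\z`.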

Formally, the new anchored regex percent notation would work as follows:

```ruby
re = %R(test)
# => /\Atest\z/

re.match?('test')    # => true
re.match?('testing') # => false
re.match?('a test')  # => false
re.match?(:test)     # => true
re.match?(:testing)  # => false
re.match?(:a_test)   # => false
```
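Until such a literal exists, the behaviour above can be approximated with a small helper. This is only a sketch: `anchored` is a hypothetical method, not part of Ruby, and wrapping the source in a non-capturing group (so that alternations such as `a|b` are anchored as a whole) is an assumption that the proposal does not spell out.

```ruby
# Hypothetical helper, not part of Ruby: returns a new Regexp whose source
# is wrapped in \A(?:...)\z, preserving the options of a given Regexp.
def anchored(pattern)
  if pattern.is_a?(Regexp)
    Regexp.new("\\A(?:#{pattern.source})\\z", pattern.options)
  else
    # Plain strings are treated as regexp source, not escaped literals.
    Regexp.new("\\A(?:#{pattern})\\z")
  end
end

re = anchored(/test/)

puts re.match?("test")
puts re.match?("testing")
puts re.match?(:test)

# The group keeps both alternatives anchored at both ends.
either = anchored(/a|b/)
puts either.match?("ab")
```

Unlike the proposed literal, this is built at runtime on each call, so it would normally be assigned to a constant.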

This would also be useful for data validation purposes, where a developer could clean up patterns that previously used regular expressions with `\A...\z` and `^...$`, such as with Rails model validations, e.g. `validates_format(with: %R{[-a-z0-9]+})`.

I do understand that an uppercase `%R` would behave differently from other percent notations (i.e. lowercase is typically non-interpolated, uppercase interpolated), but since `%r` already allows interpolation, I figured it was okay to be a bit different. Regardless, I'm open to other syntax suggestions.



-- 
https://bugs.ruby-lang.org/
