> On 27 Jun 2024, at 12:31, Mike Schinkel <[email protected]> wrote:
>
>> On Jun 26, 2024, at 8:14 AM, Gina P. Banyard <[email protected]> wrote:
>>
>>
>> On Wednesday, 26 June 2024 at 06:18, Mike Schinkel <[email protected]> wrote:
>>> https://3v4l.org/RDYFs#v8.3.8
>>>
>>> Note those seven use-cases are found in around the first 25 results when searching
>>> GitHub for "strtok(". I could probably find more if I kept looking:
>>>
>>> https://github.com/search?q=strtok%28+language%3APHP+&type=code
>>>
>>> Regarding explode($delimiter, $str)[0]: unless it is to be special-cased during
>>> compilation, it is a really inefficient way to find the substring up to the first delimiter character,
>>> especially for large strings and/or when in a tight loop where the explode is contained in a called
>>> function.
>>
>> Then use a regex: https://3v4l.org/SGWL5
>
> Using preg_match() instead of strtok() to process the ~4k file of
> commas takes, on average, the same time as using explode()[0], or 10x as long as using strtok() (at
> times it got as low as 4.4x, but that was rare):
>
> https://onlinephp.io/c/e1fad
>
> Size of file: 3972
> Number of commas: 359
> Time taken for strtok: 0.003 seconds
> Time taken for regex: 0.0307 seconds
> Times strtok() faster: 10.25
>
>> Or a combination of strpos and substr.
>
>
> Using strpos() + substr() instead of strtok() to process
> the ~4k file of commas took, on average, ~3x as long as using strtok(). I implemented
> a class for this and tried to optimize it by using only string positions and not copying the string
> repeatedly. It also took about 1/2 hour to get the code working vs. about 15 seconds to get the code
> working with strtok(); which will most programmers prefer?
>
> https://onlinephp.io/c/2a09f
>
> Size of file: 3972
> Number of commas: 359
> Time for strtok: 0.0027 seconds
> Time for strpos/substr: 0.0089 seconds
> Times strtok() faster: 3.31
>
>
>> There are *plenty* of solutions to the specific problem you pose here, and thus many
>> different solutions more or less appropriate.
>
> Yes, and in all cases the existing solutions are significantly slower, except one.
>
> And that one solution that is not significantly slower is to not deprecate
> strtok(). Not to mention that not deprecating it would avoid causing lots of BC breakage.
>
> -Mike
Hi All,

I do appreciate that strtok has a kind of bizarre signature/use pattern and potential for confusion
due to how subsequent calls work, but to me it sounds like a better result, for uses that need the
repeated-call functionality, would be to introduce a builtin StringTokenizer class that
wraps the underlying strtok_r C call and uses internal state to keep track of the string being
tokenized.
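
To make the idea a bit more concrete, here is a rough userland sketch of how such a class might behave. It is only an illustration: the method name next() and the null return at the end of input are my assumptions, and it uses strspn()/strcspn() rather than the strtok_r C call a builtin would presumably wrap. It assumes PHP 8.0+ for constructor promotion.

```php
<?php
// Hypothetical sketch only: StringTokenizer is not an existing PHP class,
// and the next() method name is an assumption made for illustration.
final class StringTokenizer
{
    // Byte offset of the next unread character; state lives in the object.
    private int $offset = 0;

    public function __construct(private string $string) {}

    // Returns the next token, or null once the string is exhausted.
    public function next(string $delimiters): ?string
    {
        $total = strlen($this->string);
        if ($this->offset >= $total) {
            return null;
        }
        // Skip any leading delimiter characters, as strtok() does.
        $this->offset += strspn($this->string, $delimiters, $this->offset);
        if ($this->offset >= $total) {
            return null;
        }
        // Length of the run of non-delimiter characters.
        $length = strcspn($this->string, $delimiters, $this->offset);
        $token = substr($this->string, $this->offset, $length);
        // Step past the token and the single delimiter that ended it.
        $this->offset += $length + 1;
        return $token;
    }
}
```

Because the position is held per object, two tokenizers can be advanced independently, which is the part that is easy to get wrong with strtok()'s hidden global state.
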

As a "works the same" solution for grabbing the first segment of a string up to any of the
delimiter chars, could the strpbrk function be expanded with a $before_needle arg like strstr
has? (strstr matches on an exact substring, not on any of a list of characters.)
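
For illustration, the behaviour I have in mind can be emulated in userland today with strcspn(). The helper name strpbrk_before() below is made up for this sketch and is not a proposed API; returning false when no listed character is found is an assumption that mirrors what strstr() does with $before_needle. It assumes PHP 8.0+ for the union return type.

```php
<?php
// Hypothetical helper: emulates what strpbrk() might return if it accepted
// a $before_needle flag. The name and the false-on-miss result are assumptions.
function strpbrk_before(string $string, string $characters): string|false
{
    // Number of leading bytes that are NOT in the character list.
    $length = strcspn($string, $characters);
    if ($length === strlen($string)) {
        // None of the characters occur; mirror strstr()'s false on no match.
        return false;
    }
    return substr($string, 0, $length);
}
```

Note this is not identical to a first strtok() call: strtok() skips leading delimiters and returns the whole string when no delimiter is present, whereas this sketch returns an empty string and false respectively in those two cases.
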
Cheers
Stephen