regexp

package standard library
go1.26.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 7, 2026 License: BSD-3-Clause Imports: 9 Imported by: 507,717

Documentation

Overview

Package regexp implements regular expression search.

The syntax of the regular expressions accepted is the same general syntax used by Perl, Python, and other languages. More precisely, it is the syntax accepted by RE2 and described at https://golang.org/s/re2syntax, except for \C. For an overview of the syntax, see the regexp/syntax package.

The regexp implementation provided by this package is guaranteed to run in time linear in the size of the input. (This is a property not guaranteed by most open source implementations of regular expressions.) For more information about this property, see https://swtch.com/~rsc/regexp/regexp1.html or any book about automata theory.

All characters are UTF-8-encoded code points. Following utf8.DecodeRune, each byte of an invalid UTF-8 sequence is treated as if it encoded utf8.RuneError (U+FFFD).

There are 16 methods of Regexp that match a regular expression and identify the matched text. Their names are matched by this regular expression:

Find(All)?(String)?(Submatch)?(Index)?

If 'All' is present, the routine matches successive non-overlapping matches of the entire expression. Empty matches abutting a preceding match are ignored. The return value is a slice containing the successive return values of the corresponding non-'All' routine. These routines take an extra integer argument, n. If n >= 0, the function returns at most n matches/submatches; otherwise, it returns all of them.

If 'String' is present, the argument is a string; otherwise it is a slice of bytes; return values are adjusted as appropriate.

If 'Submatch' is present, the return value is a slice identifying the successive submatches of the expression. Submatches are matches of parenthesized subexpressions (also known as capturing groups) within the regular expression, numbered from left to right in order of opening parenthesis. Submatch 0 is the match of the entire expression, submatch 1 is the match of the first parenthesized subexpression, and so on.

If 'Index' is present, matches and submatches are identified by byte index pairs within the input string: result[2*n:2*n+2] identifies the indexes of the nth submatch. The pair for n==0 identifies the match of the entire expression. If 'Index' is not present, the match is identified by the text of the match/submatch. If an index is negative or text is nil, it means that subexpression did not match any string in the input. For 'String' versions an empty string means either no match or an empty match.

There is also a subset of the methods that can be applied to text read from an io.RuneReader: Regexp.MatchReader, Regexp.FindReaderIndex, Regexp.FindReaderSubmatchIndex.

This set may grow. Note that regular expression matches may need to examine text beyond the text returned by a match, so the methods that match text from an io.RuneReader may read arbitrarily far into the input before returning.

(There are a few other methods that do not match this pattern.)

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Compile the expression once, usually at init time.
	// Use raw strings to avoid having to quote the backslashes.
	var validID = regexp.MustCompile(`^[a-z]+\[[0-9]+\]$`)

	fmt.Println(validID.MatchString("adam[23]"))
	fmt.Println(validID.MatchString("eve[7]"))
	fmt.Println(validID.MatchString("Job[48]"))
	fmt.Println(validID.MatchString("snakey"))
}
Output:
true
true
false
false

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

func Match

func Match(pattern string, b []byte) (matched bool, err error)

Match reports whether the byte slice b contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	matched, err := regexp.Match(`foo.*`, []byte(`seafood`))
	fmt.Println(matched, err)
	matched, err = regexp.Match(`bar.*`, []byte(`seafood`))
	fmt.Println(matched, err)
	matched, err = regexp.Match(`a(b`, []byte(`seafood`))
	fmt.Println(matched, err)

}
Output:
true <nil>
false <nil>
false error parsing regexp: missing closing ): `a(b`

func MatchReader

func MatchReader(pattern string, r io.RuneReader) (matched bool, err error)

MatchReader reports whether the text returned by the io.RuneReader contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.

func MatchString

func MatchString(pattern string, s string) (matched bool, err error)

MatchString reports whether the string s contains any match of the regular expression pattern. More complicated queries need to use Compile and the full Regexp interface.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	matched, err := regexp.MatchString(`foo.*`, "seafood")
	fmt.Println(matched, err)
	matched, err = regexp.MatchString(`bar.*`, "seafood")
	fmt.Println(matched, err)
	matched, err = regexp.MatchString(`a(b`, "seafood")
	fmt.Println(matched, err)
}
Output:
true <nil>
false <nil>
false error parsing regexp: missing closing ): `a(b`

func QuoteMeta

func QuoteMeta(s string) string

QuoteMeta returns a string that escapes all regular expression metacharacters inside the argument text; the returned string is a regular expression matching the literal text.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	fmt.Println(regexp.QuoteMeta(`Escaping symbols like: .+*?()|[]{}^$`))
}
Output:
Escaping symbols like: \.\+\*\?\(\)\|\[\]\{\}\^\$

Types

type Regexp

type Regexp struct {
	// contains filtered or unexported fields
}

Regexp is the representation of a compiled regular expression. A Regexp is safe for concurrent use by multiple goroutines, except for configuration methods, such as Regexp.Longest.

func Compile

func Compile(expr string) (*Regexp, error)

Compile parses a regular expression and returns, if successful, a Regexp object that can be used to match against text.

When matching against text, the regexp returns a match that begins as early as possible in the input (leftmost), and among those it chooses the one that a backtracking search would have found first. This so-called leftmost-first matching is the same semantics that Perl, Python, and other implementations use, although this package implements it without the expense of backtracking. For POSIX leftmost-longest matching, see CompilePOSIX.

func CompilePOSIX

func CompilePOSIX(expr string) (*Regexp, error)

CompilePOSIX is like Compile but restricts the regular expression to POSIX ERE (egrep) syntax and changes the match semantics to leftmost-longest.

That is, when matching against text, the regexp returns a match that begins as early as possible in the input (leftmost), and among those it chooses a match that is as long as possible. This so-called leftmost-longest matching is the same semantics that early regular expression implementations used and that POSIX specifies.

However, there can be multiple leftmost-longest matches, with different submatch choices, and here this package diverges from POSIX. Among the possible leftmost-longest matches, this package chooses the one that a backtracking search would have found first, while POSIX specifies that the match be chosen to maximize the length of the first subexpression, then the second, and so on from left to right. The POSIX rule is computationally prohibitive and not even well-defined. See https://swtch.com/~rsc/regexp/regexp2.html#posix for details.

func MustCompile

func MustCompile(str string) *Regexp

MustCompile is like Compile but panics if the expression cannot be parsed. It simplifies safe initialization of global variables holding compiled regular expressions.

func MustCompilePOSIX

func MustCompilePOSIX(str string) *Regexp

MustCompilePOSIX is like CompilePOSIX but panics if the expression cannot be parsed. It simplifies safe initialization of global variables holding compiled regular expressions.

func (*Regexp) AppendText added in go1.24.0

func (re *Regexp) AppendText(b []byte) ([]byte, error)

AppendText implements encoding.TextAppender. The output matches that of calling the Regexp.String method.

Note that the output is lossy in some cases: This method does not indicate POSIX regular expressions (i.e. those compiled by calling CompilePOSIX), or those for which the Regexp.Longest method has been called.

func (*Regexp) Copy deprecated added in go1.6

func (re *Regexp) Copy() *Regexp

Copy returns a new Regexp object copied from re. Calling Regexp.Longest on one copy does not affect another.

Deprecated: In earlier releases, when using a Regexp in multiple goroutines, giving each goroutine its own copy helped to avoid lock contention. As of Go 1.12, using Copy is no longer necessary to avoid lock contention. Copy may still be appropriate if the reason for its use is to make two copies with different Regexp.Longest settings.

func (*Regexp) Expand

func (re *Regexp) Expand(dst []byte, template []byte, src []byte, match []int) []byte

Expand appends template to dst and returns the result; during the append, Expand replaces variables in the template with corresponding matches drawn from src. The match slice should have been returned by Regexp.FindSubmatchIndex.

In the template, a variable is denoted by a substring of the form $name or ${name}, where name is a non-empty sequence of letters, digits, and underscores. A purely numeric name like $1 refers to the submatch with the corresponding index; other names refer to capturing parentheses named with the (?P<name>...) syntax. A reference to an out of range or unmatched index or a name that is not present in the regular expression is replaced with an empty slice.

In the $name form, name is taken to be as long as possible: $1x is equivalent to ${1x}, not ${1}x, and, $10 is equivalent to ${10}, not ${1}0.

To insert a literal $ in the output, use $$ in the template.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	content := []byte(`
	# comment line
	option1: value1
	option2: value2

	# another comment line
	option3: value3
`)

	// Regex pattern captures "key: value" pair from the content.
	pattern := regexp.MustCompile(`(?m)(?P<key>\w+):\s+(?P<value>\w+)$`)

	// Template to convert "key: value" to "key=value" by
	// referencing the values captured by the regex pattern.
	template := []byte("$key=$value\n")

	result := []byte{}

	// For each match of the regex in the content.
	for _, submatches := range pattern.FindAllSubmatchIndex(content, -1) {
		// Apply the captured submatches to the template and append the output
		// to the result.
		result = pattern.Expand(result, template, content, submatches)
	}
	fmt.Println(string(result))
}
Output:
option1=value1
option2=value2
option3=value3

func (*Regexp) ExpandString

func (re *Regexp) ExpandString(dst []byte, template string, src string, match []int) []byte

ExpandString is like Regexp.Expand but the template and source are strings. It appends to and returns a byte slice in order to give the calling code control over allocation.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	content := `
	# comment line
	option1: value1
	option2: value2

	# another comment line
	option3: value3
`

	// Regex pattern captures "key: value" pair from the content.
	pattern := regexp.MustCompile(`(?m)(?P<key>\w+):\s+(?P<value>\w+)$`)

	// Template to convert "key: value" to "key=value" by
	// referencing the values captured by the regex pattern.
	template := "$key=$value\n"

	result := []byte{}

	// For each match of the regex in the content.
	for _, submatches := range pattern.FindAllStringSubmatchIndex(content, -1) {
		// Apply the captured submatches to the template and append the output
		// to the result.
		result = pattern.ExpandString(result, template, content, submatches)
	}
	fmt.Println(string(result))
}
Output:
option1=value1
option2=value2
option3=value3

func (*Regexp) Find

func (re *Regexp) Find(b []byte) []byte

Find returns a slice holding the text of the leftmost match in b of the regular expression. A return value of nil indicates no match.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`foo.?`)
	fmt.Printf("%q\n", re.Find([]byte(`seafood fool`)))

}
Output:
"food"

func (*Regexp) FindAll

func (re *Regexp) FindAll(b []byte, n int) [][]byte

FindAll is the 'All' version of Regexp.Find; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`foo.?`)
	fmt.Printf("%q\n", re.FindAll([]byte(`seafood fool`), -1))

}
Output:
["food" "fool"]

func (*Regexp) FindAllIndex

func (re *Regexp) FindAllIndex(b []byte, n int) [][]int

FindAllIndex is the 'All' version of Regexp.FindIndex; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	content := []byte("London")
	re := regexp.MustCompile(`o.`)
	fmt.Println(re.FindAllIndex(content, 1))
	fmt.Println(re.FindAllIndex(content, -1))
}
Output:
[[1 3]]
[[1 3] [4 6]]

func (*Regexp) FindAllString

func (re *Regexp) FindAllString(s string, n int) []string

FindAllString is the 'All' version of Regexp.FindString; it returns a slice of all successive matches of the expression, as defined by the 'All' description in the package comment. A return value of nil indicates no match.

Example
package main

import (
	"fmt"
	"regexp"
)

func main() {
	re := regexp.MustCompile(`a.`)
	fmt.Println(re.FindAllString("paranormal", -1))
	fmt.Println(re.FindAllString("paranormal", 2))
	fmt.Println(re.FindAllString("graal", -1))
	fmt.Println(re.FindAllString("none", -1))
}
Output:
[ar an al]
[ar an]
[aa]
[]

func (*Regexp) FindAllStringIndex