Hviidnet.com
2 Jun 2009

How to use findstr with regular expressions

By default findstr treats the search string as a regular expression. What surprised me, however, is that the following command does not work:

findstr "abc|def" test.txt

when test.txt has only abc in it.

According to online tutorials such as http://www.regular-expressions.info/reference.html, abc|def should match abc. So why doesn't it?

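The reason is pretty simple: findstr does not support the full range of regular expression syntax. It has no alternation operator (|), and it does not support ? or {n} either. To search for several alternatives, separate the search strings with spaces instead: findstr "abc def" test.txt finds lines containing abc or def.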

Some basic things work:

findstr "abc.*" test.txt

findstr "[0-9a-f].*" test.txt
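For comparison, here is a minimal sketch in Python showing how a full-featured engine handles the same patterns. Python's re module is a different engine from findstr and is used only for illustration; the sample lines and the matching_lines helper are made up for this example.

```python
import re

# Hypothetical contents of test.txt, one entry per line.
lines = ["abc", "xyz", "def123", "ABC"]

def matching_lines(pattern, lines):
    """Return the lines in which the pattern is found anywhere,
    the way a line-based search tool reports matches."""
    return [line for line in lines if re.search(pattern, line)]

# Alternation is supported by Python's engine (but not by findstr).
alternation = matching_lines(r"abc|def", lines)

# The two simpler patterns from above.
prefix = matching_lines(r"abc.*", lines)
hex_class = matching_lines(r"[0-9a-f].*", lines)

print(alternation)  # ['abc', 'def123']
print(prefix)       # ['abc']
print(hex_class)    # ['abc', 'def123']
```

Note that the search is case sensitive here, so the line ABC is not reported by any of the three patterns.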

9 Jul 2008

Regex Basics

Here are some regex (regular expression) basics:

\d = digit

\d{1,2} = one or 2 digits

\d{3,} = 3 or more digits

\d+ = 1 or more digits

\d* = 0 or more digits

\D = NON digits

\w = word character (a letter, a digit or an underscore)

\w{3,6} = 3 to 6 word characters

\W = NON word characters

\s = matches any white space character, such as tabs, spaces, and so forth.

\S = matches any non white space character (anything except spaces, tabs, and so forth).

\\ = \

\. = .

\? = ?

\+ = +

\* = *

\b = word boundary. It marks the edge of a word. Example: \b[Dd]an\b finds "Dan" at the start of a line or in "Dan." and matches only "Dan", without the period. It does not match the "Dan" in "Daniel".

^ = marks the start of a line.

$ = marks the end of a line (the same as ^, just for the end instead).
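The shorthand classes, quantifiers, escapes and word boundary listed above can be tried out with a short script. This is a minimal sketch in Python, whose re module understands all of these tokens; the sample strings are made up for illustration.

```python
import re

text = "Room 7, floor 42, code 12345"

# Quantifiers on \d
one_or_more = re.findall(r"\d+", text)          # runs of digits
three_or_more = re.findall(r"\d{3,}", text)     # only runs of 3+ digits
short_numbers = re.findall(r"\b\d{1,2}\b", text)  # whole numbers of 1-2 digits

# \w matches letters, digits and underscore
words_3_to_6 = re.findall(r"\w{3,6}", text)

# \s matches spaces and tabs alike
parts = re.split(r"\s+", "a  b\tc")

# An escaped dot matches only a literal dot; a bare dot matches any character
literal_dot = re.findall(r"\d\.\d", "3.5 and 3x5")
any_char = re.findall(r"\d.\d", "3.5 and 3x5")

# \b keeps the match to whole words
names = re.findall(r"\b[Dd]an\b", "Dan met dan. Daniel and Jordan did not.")

print(one_or_more)    # ['7', '42', '12345']
print(three_or_more)  # ['12345']
print(short_numbers)  # ['7', '42']
print(words_3_to_6)   # ['Room', 'floor', 'code', '12345']
print(parts)          # ['a', 'b', 'c']
print(literal_dot)    # ['3.5']
print(any_char)       # ['3.5', '3x5']
print(names)          # ['Dan', 'dan']
```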

Example of ^ (PowerShell):

"57\\Server2\Share" -match "\\\\\w+\\\w+" (True)
"57\\Server2\Share" -match "^\\\\\w+\\\w+" (False)

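The first is True because the pattern may match anywhere in the string. The second is False because ^ requires the match to begin at the start of the string, where the 57 is in the way.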
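The same check can be sketched in Python for readers without PowerShell at hand. Raw strings (r"...") are used so that the backslashes reach the regex engine unchanged; the variable names and the extra $ example are made up for illustration.

```python
import re

text = r"57\\Server2\Share"  # the string 57\\Server2\Share

# Unanchored: the pattern may match anywhere in the string.
unanchored = re.search(r"\\\\\w+\\\w+", text)

# Anchored with ^: the match must begin at the start of the string.
anchored = re.search(r"^\\\\\w+\\\w+", text)

# Anchored with $: the match must end at the end of the string.
tail = re.search(r"\\\w+$", text)

print(unanchored is not None)  # True
print(anchored is not None)    # False
print(tail.group())            # \Share
```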

It's possible to define character classes and groups, for example:

[a-z] = any one character between a and z. The class still counts as a single character, so when matching a whole string "a" would be true and "ab" would be false (use [a-z]+ for one or more).

[abc] = any one of a, b or c: "a" true, "b" true, "c" true, "ab" false, and so on.

| = or. For example, "(?:[a-z]+|\d{1,3})\.([a-z]+|\d{1,3})" would match "0.3" and "a.5".

() = capture group

(?:regex) = non-capturing group. The ?: groups the regex without capturing it, so the group is excluded from the captured results.

\d{1,3}? = here the ? after the quantifier means NOT GREEDY (lazy), so it would only take "1" in "123".
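Character classes, alternation, the two kinds of groups and lazy matching can be checked with a minimal Python sketch. re.fullmatch is used where the whole string has to match; the variable names are made up for illustration.

```python
import re

# A character class stands for exactly one character.
single = re.fullmatch(r"[a-z]", "a") is not None    # True
double = re.fullmatch(r"[a-z]", "ab") is not None   # False
repeated = re.fullmatch(r"[a-z]+", "ab") is not None  # True

# Alternation inside a non-capturing group (?:...) and a capturing group (...).
pattern = r"(?:[a-z]+|\d{1,3})\.([a-z]+|\d{1,3})"
m1 = re.fullmatch(pattern, "0.3")
m2 = re.fullmatch(pattern, "a.5")

# Only the second group captures, so groups() has a single entry.
print(m1.group(0), m1.groups())  # 0.3 ('3',)
print(m2.group(0), m2.groups())  # a.5 ('5',)

# Greedy takes as many digits as allowed, lazy as few as allowed.
greedy = re.match(r"\d{1,3}", "123").group()
lazy = re.match(r"\d{1,3}?", "123").group()
print(greedy, lazy)  # 123 1
```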

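/i = makes the regex match case-insensitively.
/s = enables "single-line mode". In this mode, the dot matches newlines.
/m = enables "multi-line mode". In this mode, the caret and dollar match at the start and end of each line, not only at the start and end of the subject string.
/x = enables "free-spacing mode". In this mode, whitespace between regex tokens is ignored, and an unescaped # starts a comment.

These are Perl-style modifiers. Flavors without trailing modifiers, such as .NET (used by PowerShell), write them inline as (?i), (?s), (?m) and (?x).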
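As a rough illustration of the four modes, here is a minimal Python sketch. Python has no trailing /i style modifiers; it exposes the same modes as the flags re.IGNORECASE, re.DOTALL, re.MULTILINE and re.VERBOSE, or inline as (?i) and so on. The sample text is made up.

```python
import re

text = "first line\nSecond line"

# Case-insensitive mode (/i)
case_sensitive = re.search(r"second", text)
ignore_case = re.search(r"second", text, re.IGNORECASE)
inline_flag = re.search(r"(?i)second", text)  # same mode, written inline

# Single-line mode (/s): the dot also matches newlines
dot_default = re.search(r"first.*Second", text)
dot_all = re.search(r"first.*Second", text, re.DOTALL)

# Multi-line mode (/m): ^ matches at the start of every line
first_words = re.findall(r"^\w+", text)
first_words_multiline = re.findall(r"^\w+", text, re.MULTILINE)

# Free-spacing mode (/x): whitespace is ignored and # starts a comment
decimal = re.compile(r"""
    \d{1,3}   # one to three digits
    \.        # a literal dot
    \d+       # one or more digits
""", re.VERBOSE)
found = decimal.search("value 12.75").group()

print(case_sensitive is None, ignore_case is not None)  # True True
print(dot_default is None, dot_all is not None)         # True True
print(first_words, first_words_multiline)  # ['first'] ['first', 'Second']
print(found)                               # 12.75
```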

Feel free to ask questions below.