Programming 2015. 9. 25. 03:23

JavaScript-based regular expression tester.

http://regexpal.com/ - can be saved to a file; keep the global checkbox ticked (it is enabled by default).


Python regular expression tester

http://pythex.org/

------------

grep

egrep = grep -E

awk, vi, emacs

POSIX

BRE - Basic R.E.

ERE - Extended R.E.

-------------


\ : escape character

\t : tab

\r, \n, \r\n : new lines


------------

[aeiou] : any one vowel

[0-9] : - is the range character

[A-Za-z] : any letter of the alphabet

[^aeiou] : any character that is not a vowel

------------

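[\d\s] : a digit or a whitespace character

[^\d\s] : neither a digit nor a whitespace character

[\D\S] : matches every character (anything that is a non-digit OR a non-whitespace) ==> don't put multiple uppercase (negated) shorthands in one set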
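To check the sets above, they can be run through Python's re module. This is only a sketch; the sample string and variable names are made up for illustration.

```python
import re

# The sets from the notes, compiled unchanged.
vowel = re.compile(r"[aeiou]")
non_vowel = re.compile(r"[^aeiou]")
digit_or_space = re.compile(r"[\d\s]")
everything = re.compile(r"[\D\S]")  # non-digit OR non-whitespace

sample = "a1 b\n"  # hypothetical test input

vowels_found = vowel.findall(sample)
non_vowels_found = non_vowel.findall(sample)
digit_space_found = digit_or_space.findall(sample)

# No character is both a digit and whitespace, so [\D\S] excludes nothing.
everything_found = everything.findall(sample)

print(vowels_found)
print(digit_space_found)
print(len(everything_found) == len(sample))
```

The last line prints True because every character of the sample is matched, which is why combining negated shorthands in one set is usually a mistake.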


* POSIX bracket expressions (such as [:alpha:]) are not popular



------------

# Year

(19|20)\d\d

/(19[5-9]\d|20[0-4]\d)/
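The difference between the two year patterns is easier to see with a few inputs. A minimal sketch in Python; the helper name is_year is hypothetical, and fullmatch is used so the whole string has to be a year.

```python
import re

any_year = re.compile(r"(19|20)\d\d")             # 1900 through 2099
bounded_year = re.compile(r"19[5-9]\d|20[0-4]\d")  # 1950 through 2049

def is_year(pattern, text):
    """True when the entire text matches the pattern."""
    return pattern.fullmatch(text) is not None

print(is_year(any_year, "1905"))      # True
print(is_year(bounded_year, "1905"))  # False
print(is_year(bounded_year, "2049"))  # True
print(is_year(bounded_year, "2050"))  # False
```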

------------

# Does first letter have to be capitalized?

Regex:  /^[A-Z][A-Za-z]+$/m


# Capture first, middle, and last name

Regex:  /^([A-Z][A-Za-z.'\-]+) (?:([A-Z][A-Za-z.'\-]+) )?([A-Z][A-Za-z.'\-]+)$/m
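The name pattern above can be tried in Python roughly like this. The function split_name and the sample names are assumptions for illustration; the /m flag becomes re.MULTILINE.

```python
import re

NAME = re.compile(
    r"^([A-Z][A-Za-z.'\-]+) (?:([A-Z][A-Za-z.'\-]+) )?([A-Z][A-Za-z.'\-]+)$",
    re.MULTILINE,
)

def split_name(line):
    """Return (first, middle, last); middle is None when it is absent."""
    m = NAME.search(line)
    return m.groups() if m else None

three_parts = split_name("John Quincy Adams")
two_parts = split_name("Mary O'Brien")
lowercase = split_name("john adams")

print(three_parts)  # ('John', 'Quincy', 'Adams')
print(two_parts)    # ('Mary', None, "O'Brien")
print(lowercase)    # None
```

The middle name sits inside a non-capturing group so that the trailing space is consumed without shifting the group numbers.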

------------

# U.S. postal codes

Regex:  /^\d{5}(-\d{4})?$/m


# Canada postal codes

Regex:  /^[A-Z]\d[A-Z] \d[A-Z]\d$/m
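A small sketch of both postal code patterns applied line by line in Python. The sample lines are invented; finditer with group(0) is used because findall would return only the optional group.

```python
import re

US_ZIP = re.compile(r"^\d{5}(-\d{4})?$", re.MULTILINE)
CA_POSTAL = re.compile(r"^[A-Z]\d[A-Z] \d[A-Z]\d$", re.MULTILINE)

# Hypothetical input, one candidate per line.
lines = "90210\n12345-6789\n1234\nK1A 0B1\nk1a 0b1"

us_matches = [m.group(0) for m in US_ZIP.finditer(lines)]
ca_matches = [m.group(0) for m in CA_POSTAL.finditer(lines)]

print(us_matches)  # ['90210', '12345-6789']
print(ca_matches)  # ['K1A 0B1']
```

The lowercase Canadian code is rejected because the pattern only lists A-Z; adding re.IGNORECASE would be one way to accept it.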

------------

# Email and domain specifications allow other characters

Regex:  /^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,3}$/


Regex:  /^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,6}$/

String: "someone@somewhere.museum"
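The sample string shows why the top-level domain length was widened from {2,3} to {2,6}. A sketch in Python, with variable names chosen only for this example:

```python
import re

narrow = re.compile(r"^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,3}$")
wide = re.compile(r"^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,6}$")

address = "someone@somewhere.museum"

narrow_ok = narrow.search(address) is not None
wide_ok = wide.search(address) is not None
short_tld_ok = narrow.search("someone@example.com") is not None

print(narrow_ok)     # False: "museum" has six letters
print(wide_ok)       # True
print(short_tld_ok)  # True
```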

------------

# Query string portion

Regex:  /^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+.*$/

Regex:  /^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+[/?#]?.*$/

Regex:  /^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+[\w\-.,@?^=%&:/~\\+#]*$/


# Make groups non-capturing

Regex:  /^(?:http|https):\/\/[\w\-_]+(?:\.[\w\-_]+)+[\w\-.,@?^=%&:/~\\+#]*$/
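What the non-capturing version changes can be seen by comparing groups(). A sketch in Python with a made-up URL; the \/ escapes are kept from the notes, although they are only needed inside /.../ literals.

```python
import re

capturing = re.compile(
    r"^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+[\w\-.,@?^=%&:/~\\+#]*$"
)
non_capturing = re.compile(
    r"^(?:http|https):\/\/[\w\-_]+(?:\.[\w\-_]+)+[\w\-.,@?^=%&:/~\\+#]*$"
)

url = "https://www.example.com/path?x=1#top"  # hypothetical input

captured = capturing.search(url).groups()
uncaptured = non_capturing.search(url).groups()

print(captured)    # ('https', '.com')
print(uncaptured)  # ()
```

Both patterns accept the same strings; the second one just does not store the scheme or the last domain part. A repeated group keeps only its final repetition, which is why '.com' appears and '.example' does not.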

------------

# Decimal numbers

Regex:  /^\d+\.\d+$/m

Regex:  /^\d?\.\d+$/m

Regex:  /^\d*\.?\d*$/m


# U.S. Dollar

Regex:  /^\$(\d*\.\d{2}|\d+)$/m
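Each decimal pattern accepts a slightly different set of strings. The sketch below checks a few inputs in Python; the helper matches and the inputs are assumptions for illustration.

```python
import re

strict = re.compile(r"^\d+\.\d+$")              # digits required on both sides
one_leading = re.compile(r"^\d?\.\d+$")         # at most one digit before the dot
loose = re.compile(r"^\d*\.?\d*$")              # everything optional
dollar = re.compile(r"^\$(\d*\.\d{2}|\d+)$")

def matches(pattern, text):
    return pattern.search(text) is not None

print(matches(strict, "3.14"))        # True
print(matches(strict, ".5"))          # False
print(matches(one_leading, ".5"))     # True
print(matches(one_leading, "12.5"))   # False
print(matches(loose, ""))             # True, the empty string slips through
print(matches(dollar, "$5"))          # True
print(matches(dollar, "$.99"))        # True
print(matches(dollar, "$5.0"))        # False, cents need two digits
```

Because every part of the loose pattern is optional, it also matches an empty string and a lone ".", so it is probably too permissive for validation.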

------------

# 12 hour time

Regex:  /^(0?[1-9]|1[0-2]):[0-5][0-9]$/m


# Optional am/pm


Regex:  /^(0?[1-9]|1[0-2]):[0-5][0-9](am|pm|AM|PM)?$/m

Regex:  /^(0?[1-9]|1[0-2]):[0-5][0-9]([aApP][mM])?$/m


# 24-hour time with optional seconds and time zone


Regex:  /^([0-1]?[0-9]|[2][0-3]):[0-5][0-9](:[0-5][0-9])?( [A-Z]{3})?$/m

Regex:  /^([0-1]?[0-9]|2[0-3]):[0-5][0-9](:[0-5][0-9])?( ([A-Z]{3}|GMT [-+]([0-9]|1[0-2])))?$/m
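The time patterns can be exercised with a handful of strings. A sketch in Python; the names TWELVE_HOUR, WITH_ZONE and ok are hypothetical.

```python
import re

TWELVE_HOUR = re.compile(r"^(0?[1-9]|1[0-2]):[0-5][0-9]([aApP][mM])?$", re.MULTILINE)
WITH_ZONE = re.compile(
    r"^([0-1]?[0-9]|2[0-3]):[0-5][0-9](:[0-5][0-9])?"
    r"( ([A-Z]{3}|GMT [-+]([0-9]|1[0-2])))?$",
    re.MULTILINE,
)

def ok(pattern, text):
    return pattern.search(text) is not None

print(ok(TWELVE_HOUR, "9:05pm"))        # True
print(ok(TWELVE_HOUR, "13:00"))         # False, hours stop at 12
print(ok(TWELVE_HOUR, "12:60"))         # False, minutes stop at 59
print(ok(WITH_ZONE, "23:59:59 EST"))    # True
print(ok(WITH_ZONE, "14:30 GMT +12"))   # True
print(ok(WITH_ZONE, "24:00"))           # False
```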

------------

. : Any character except newline.
\. : A period (and so on for \*, \(, \\, etc.)
^ : The start of the string.
$ : The end of the string.
\d, \w, \s : A digit, word character [A-Za-z0-9_], or whitespace.
\D, \W, \S : Anything except a digit, word character, or whitespace.
[abc] : Character a, b, or c.
[a-z] : a through z.
[^abc] : Any character except a, b, or c.
aa|bb : Either aa or bb.
? : Zero or one of the preceding element.
* : Zero or more of the preceding element.
+ : One or more of the preceding element.
{n} : Exactly n of the preceding element.
{n,} : n or more of the preceding element.
{m,n} : Between m and n of the preceding element.
??, *?, +?, {n}?, etc. : Same as above, but as few as possible.
(expr) : Capture expr for use with \1, etc.
(?:expr) : Non-capturing group.
(?=expr) : Followed by expr.
(?!expr) : Not followed by expr.
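A few entries from the table above are easier to understand when run. The sketch below uses Python's re module with invented sample strings to show greedy versus lazy quantifiers, both lookaheads, and a \1 backreference.

```python
import re

html = "<b>one</b><b>two</b>"

# Greedy .* runs to the last </b>; lazy .*? stops at the first one.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# (?=expr): digits only when followed by "px"; the "px" is not consumed.
followed = re.findall(r"\d+(?=px)", "10px 20em 30px")

# (?!expr): "foo" only when NOT followed by "bar".
not_followed = re.findall(r"foo(?!bar)\w*", "foobar foobaz")

# (expr) with \1: the same word twice in a row.
repeated = re.search(r"\b(\w+) \1\b", "it was the the best").group(1)

print(greedy)        # ['<b>one</b><b>two</b>']
print(lazy)          # ['<b>one</b>', '<b>two</b>']
print(followed)      # ['10', '30']
print(not_followed)  # ['foobaz']
print(repeated)      # the
```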

------------

Regular expression cheat sheet

Special characters

\ : escape special characters
. : matches any character
^ : matches beginning of string
$ : matches end of string
[5b-d] : matches any of the chars '5', 'b', 'c' or 'd'
[^a-c6] : matches any char except 'a', 'b', 'c' or '6'
R|S : matches either regex R or regex S
() : creates a capture group and indicates precedence

Quantifiers

* : 0 or more (append ? for non-greedy)
+ : 1 or more (append ? for non-greedy)
? : 0 or 1 (append ? for non-greedy)
{m} : exactly m occurrences
{m,n} : from m to n; m defaults to 0, n to infinity
{m,n}? : from m to n, as few as possible

Special sequences

\A : start of string
\b : matches empty string at word boundary (between \w and \W)
\B : matches empty string not at word boundary
\d : digit
\D : non-digit
\s : whitespace: [ \t\n\r\f\v]
\S : non-whitespace
\w : alphanumeric: [0-9a-zA-Z_]
\W : non-alphanumeric
\Z : end of string
\g<id> : refers to a previously defined group (in Python, used in re.sub replacement strings)

Extension notation

(?iLmsux) : matches empty string, sets the corresponding re.X flags
(?:...) : non-capturing version of regular parentheses
(?P<name>...) : named capturing group
(?P=name) : matches whatever the previously named group matched
(?#...) : a comment; ignored
(?=...) : lookahead assertion: matches without consuming
(?!...) : negative lookahead assertion
(?<=...) : lookbehind assertion: matches if preceded
(?<!...) : negative lookbehind assertion
(?(id)yes|no) : match 'yes' if group 'id' matched, else 'no'
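The Python-specific entries in this cheat sheet can be tried as follows. This is a sketch with invented inputs; it covers named groups, \g<name> in re.sub, (?P=name), lookbehind, the conditional form and an inline flag.

```python
import re

# Named groups, read back with groupdict().
date = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")
parts = date.search("released 2015-09-25").groupdict()

# \g<name> refers to a group inside a replacement string.
reordered = date.sub(r"\g<day>/\g<month>/\g<year>", "2015-09-25")

# (?P=name) must match the same text the named group matched.
paired = re.compile(r"<(?P<tag>\w+)>.*?</(?P=tag)>")
good_pair = paired.search("<b>x</b>").group(0)
bad_pair = paired.search("<i>y</b>")

# (?<=...) checks what comes before without consuming it.
prices = re.findall(r"(?<=\$)\d+", "cost $42 or 17")

# (?(1)>) requires ">" only when group 1 (the "<") took part in the match.
bracketed = re.compile(r"(<)?\w+(?(1)>)")
balanced = bracketed.fullmatch("<tag>") is not None
bare = bracketed.fullmatch("tag") is not None
unbalanced = bracketed.fullmatch("<tag") is not None

# (?i) turns on case-insensitive matching from inside the pattern.
inline_flag = re.search(r"(?i)python", "PYTHON") is not None

print(parts)
print(reordered)
print(good_pair, bad_pair, prices)
print(balanced, bare, unbalanced, inline_flag)
```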


------------

Unicode

Unicode indicator: \u

caf\u00E9 = café (é is the Unicode character U+00E9)


Unicode wildcard : \X (only for Perl and PHP)
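A short Python sketch of the \u escape. Python's built-in re module accepts \uXXXX in patterns but has no \X, so the second half only shows the kind of input \X is meant for: an é written as two code points. Normalizing with unicodedata is an assumption of this example, not something from the notes.

```python
import re
import unicodedata

precomposed = "caf\u00e9"   # é as one code point (U+00E9)
decomposed = "cafe\u0301"   # e followed by a combining acute accent

# The \u escape also works inside the pattern itself.
escape_ok = re.fullmatch(r"caf\u00E9", precomposed) is not None

# "." matches one code point, so it cannot cover the two-code-point é.
dot_on_decomposed = re.fullmatch(r"caf.", decomposed) is not None

# After NFC normalization the é is a single code point again.
normalized = unicodedata.normalize("NFC", decomposed)
dot_on_normalized = re.fullmatch(r"caf.", normalized) is not None

print(escape_ok)          # True
print(len(decomposed))    # 5
print(dot_on_decomposed)  # False
print(dot_on_normalized)  # True
```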


------------


Posted by 쁘레드