오른손의 좋은 세상만들기

Programming | 2015. 9. 25. 03:23

Regular Expression

JavaScript-based regular expression tester

http://regexpal.com/ - save the page to a file, and keep the global checkbox enabled (it is on by default).

Python regular expression tester

http://pythex.org/

------------

grep

egrep = grep -E

awk, vi, emacs

POSIX

BRE - Basic R.E.

ERE - Extended R.E.

-------------

\ : escape character

\t : tab

\r, \n, \r\n : new lines
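A minimal Python sketch of the escapes above. The names `LINE_BREAK` and `mixed` are made up for illustration; the point is only that `\r\n` should come first in the alternation so a Windows line ending counts as one break.

```python
import re

# \r\n is listed first so a Windows line ending is consumed as one separator
# instead of being split into two.
LINE_BREAK = re.compile(r"\r\n|\r|\n")

mixed = "unix\nwindows\r\nold-mac\rend"
lines = LINE_BREAK.split(mixed)

# \t matches a literal tab; \. matches a literal period, not "any character".
columns = re.split(r"\t", "name\tvalue")
literal_dots = re.findall(r"\.", "a.b.c")
```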

------------

[aeiou] : any one lowercase vowel

[0-9] : - is the range character

[A-Za-z] : any ASCII letter

[^aeiou] : any character that is not a lowercase vowel
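A quick way to see what each set picks up, sketched in Python with a made-up sample string. Note that the negated set also matches spaces, digits, and uppercase letters.

```python
import re

sample = "Regex Pal 2015"

vowels = re.findall(r"[aeiou]", sample)       # lowercase vowels only
digits = re.findall(r"[0-9]", sample)         # range 0 through 9
letters = re.findall(r"[A-Za-z]", sample)     # two ranges in one set
non_vowels = re.findall(r"[^aeiou]", sample)  # everything else, spaces included
```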

------------

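[\d\s] : a digit or a whitespace character

[^\d\s] : neither a digit nor whitespace

[\D\S] : matches everything (non-digit OR non-whitespace) ==> don't combine several negated (capital) shorthands in one set

* POSIX bracket expressions (e.g. [[:digit:]]) are not widely used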
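The `[\D\S]` pitfall is easy to check. A small Python sketch with an arbitrary sample string:

```python
import re

sample = "a1 _\t9"

# [\d\s] matches a character that is a digit OR whitespace.
digit_or_space = re.findall(r"[\d\s]", sample)

# [^\d\s] matches a character that is neither a digit nor whitespace.
neither = re.findall(r"[^\d\s]", sample)

# [\D\S] matches a non-digit OR a non-whitespace character. Every character
# satisfies at least one of the two, so the set matches everything.
everything = re.findall(r"[\D\S]", sample)
```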

------------

#Year

(19|20)\d\d

/(19[5-9]\d|20[0-4]\d)/
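The two year patterns accept different ranges: the first allows 1900 to 2099, the second only 1950 to 2049. A Python sketch (the helper `is_year` is a hypothetical name):

```python
import re

loose = re.compile(r"(19|20)\d\d")            # 1900-2099
strict = re.compile(r"19[5-9]\d|20[0-4]\d")   # 1950-2049

def is_year(pattern, text):
    """Return True when the whole string matches the pattern."""
    return pattern.fullmatch(text) is not None
```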

------------

# Does first letter have to be capitalized?

Regex: /^[A-Z][A-Za-z]+$/m

# Capture first, middle, and last name

Regex: /^([A-Z][A-Za-z.'\-]+) (?:([A-Z][A-Za-z.'\-]+) )?([A-Z][A-Za-z.'\-]+)$/m
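The capture pattern, tried in Python with a few made-up names. The middle name sits inside an optional non-capturing group, so its capture group comes back as `None` when it is absent.

```python
import re

NAME = re.compile(
    r"^([A-Z][A-Za-z.'\-]+) (?:([A-Z][A-Za-z.'\-]+) )?([A-Z][A-Za-z.'\-]+)$",
    re.MULTILINE,
)

def split_name(text):
    """Return (first, middle, last), or None when the text does not match."""
    match = NAME.match(text)
    return match.groups() if match else None

three_parts = split_name("John Quincy Adams")
two_parts = split_name("Ada Lovelace")
apostrophe = split_name("Mary O'Brien")
lowercase = split_name("john smith")
```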

------------

# U.S. postal codes

Regex: /^\d{5}(-\d{4})?$/m

# Canada postal codes

Regex: /^[A-Z]\d[A-Z] \d[A-Z]\d$/m
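With the `m` flag (`re.MULTILINE` in Python) the anchors apply per line, so one pass can scan a list of candidates. A sketch over invented sample data:

```python
import re

US_ZIP = re.compile(r"^\d{5}(-\d{4})?$", re.MULTILINE)
CA_POSTAL = re.compile(r"^[A-Z]\d[A-Z] \d[A-Z]\d$", re.MULTILINE)

candidates = "90210\n12345-6789\n1234\nK1A 0B1\nk1a 0b1\n123456"

# finditer + group(0) returns the whole match even though the pattern
# contains a capture group (findall would return only the group).
us_codes = [m.group(0) for m in US_ZIP.finditer(candidates)]
ca_codes = [m.group(0) for m in CA_POSTAL.finditer(candidates)]
```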

------------

# Email and domain specifications allow other characters

Regex: /^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,3}$/

Regex: /^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,6}$/

String: "someone@somewhere.museum"
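The only difference between the two email patterns is the allowed length of the top-level domain, which is why the `.museum` address is the test case. In Python:

```python
import re

short_tld = re.compile(r"^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,3}$")
long_tld = re.compile(r"^[\w.%+\-]+@[\w.\-]+\.[A-Za-z]{2,6}$")

address = "someone@somewhere.museum"

rejected_by_short = short_tld.match(address) is None   # "museum" has 6 letters
accepted_by_long = long_tld.match(address) is not None
```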

------------

# Query string portion

Regex: /^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+.*$/

Regex: /^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+[/?#]?.*$/

Regex: /^(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+[\w\-.,@?^=%&:/~\\+#]*$/

# Make groups non-capturing

Regex: /^(?:http|https):\/\/[\w\-_]+(?:\.[\w\-_]+)+[\w\-.,@?^=%&:/~\\+#]*$/
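To see what non-capturing groups change, compare `groups()` for the last two patterns. This is a Python sketch; the `\/` from the JavaScript literal is written as a plain `/`, and the URL is an example value. A repeated capture group keeps only its last repetition.

```python
import re

# Shared tail of both patterns: path, query string, and fragment characters.
TAIL = r"[\w\-.,@?^=%&:/~\\+#]*$"

capturing = re.compile(r"^(http|https)://[\w\-_]+(\.[\w\-_]+)+" + TAIL)
non_capturing = re.compile(r"^(?:http|https)://[\w\-_]+(?:\.[\w\-_]+)+" + TAIL)

url = "https://example.com/path?q=1#top"

captured = capturing.match(url).groups()
not_captured = non_capturing.match(url).groups()
```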

------------

# Decimal numbers

Regex: /^\d+\.\d+$/m

Regex: /^\d?\.\d+$/m

Regex: /^\d*\.?\d*$/m

# U.S. Dollar

Regex: /^\$(\d*\.\d{2}|\d+)$/m
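Each decimal pattern trades strictness for coverage. The third one makes every part optional, so it also accepts an empty string and a lone period. A Python sketch (`accepts` is a hypothetical helper):

```python
import re

both_sides = re.compile(r"^\d+\.\d+$")     # digits required on both sides
optional_lead = re.compile(r"^\d?\.\d+$")  # at most one leading digit
all_optional = re.compile(r"^\d*\.?\d*$")  # every part optional
dollars = re.compile(r"^\$(\d*\.\d{2}|\d+)$")

def accepts(pattern, text):
    """Return True when the pattern matches the text."""
    return pattern.match(text) is not None
```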

------------

# 12 hour time

Regex: /^(0?[1-9]|1[0-2]):[0-5][0-9]$/m

# Optional am/pm

Regex: /^(0?[1-9]|1[0-2]):[0-5][0-9](am|pm|AM|PM)?$/m

Regex: /^(0?[1-9]|1[0-2]):[0-5][0-9]([aApP][mM])?$/m

# 24 hour time with optional seconds and timezone

Regex: /^([0-1]?[0-9]|[2][0-3]):[0-5][0-9](:[0-5][0-9])?( [A-Z]{3})?$/m

Regex: /^([0-1]?[0-9]|2[0-3]):[0-5][0-9](:[0-5][0-9])?( ([A-Z]{3}|GMT [-+]([0-9]|1[0-2])))?$/m
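A Python check of the final 12 hour pattern and the final 24 hour pattern, using invented sample times (`is_time` is a hypothetical helper):

```python
import re

twelve_hour = re.compile(r"^(0?[1-9]|1[0-2]):[0-5][0-9]([aApP][mM])?$")
with_zone = re.compile(
    r"^([0-1]?[0-9]|2[0-3]):[0-5][0-9](:[0-5][0-9])?"
    r"( ([A-Z]{3}|GMT [-+]([0-9]|1[0-2])))?$"
)

def is_time(pattern, text):
    """Return True when the pattern matches the text."""
    return pattern.match(text) is not None
```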

------------

`.`	Any character except newline.
`\.`	A period (and so on for `\*`, `\(`, `\\`, etc.)
`^`	The start of the string.
`$`	The end of the string.
`\d`,`\w`,`\s`	A digit, word character `[A-Za-z0-9_]`, or whitespace.
`\D`,`\W`,`\S`	Anything except a digit, word character, or whitespace.
`[abc]`	Character a, b, or c.
`[a-z]`	a through z.
`[^abc]`	Any character except a, b, or c.
`aa|bb`	Either aa or bb.
`?`	Zero or one of the preceding element.
`*`	Zero or more of the preceding element.
`+`	One or more of the preceding element.
`{n}`	Exactly n of the preceding element.
`{n,}`	n or more of the preceding element.
`{m,n}`	Between m and n of the preceding element.
`??`,`*?`,`+?`, `{n}?`, etc.	Same as above, but as few as possible.
`(`expr`)`	Capture expr for use with `\1`, etc.
`(?:`expr`)`	Non-capturing group.
`(?=`expr`)`	Followed by expr.
`(?!`expr`)`	Not followed by expr.
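A few rows of the table above, tried in Python on made-up strings: greedy versus lazy quantifiers, a `\1` backreference, and both lookaheads. The `\b` in the negative lookahead stops the engine from backtracking to a shorter number.

```python
import re

html = "<b>one</b><b>two</b>"
greedy = re.findall(r"<b>.*</b>", html)   # as many characters as possible
lazy = re.findall(r"<b>.*?</b>", html)    # as few characters as possible

# \1 repeats whatever the first capture group matched.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

prices = "100 dollars, 200 euros"
followed = re.findall(r"\d+(?= dollars)", prices)
not_followed = re.findall(r"\b\d+\b(?! dollars)", prices)
```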

------------

Regular expression cheat sheet

Special characters

`\`	escape special characters
`.`	matches any character except newline (any character at all with the DOTALL flag)
`^`	matches beginning of string
`$`	matches end of string
`[5b-d]`	matches any chars '5', 'b', 'c' or 'd'
`[^a-c6]`	matches any char except 'a', 'b', 'c' or '6'
`R|S`	matches either regex `R` or regex `S`
`()`	creates a capture group and indicates precedence

Quantifiers

`*`	0 or more (append `?` for non-greedy)
`+`	1 or more (append `?` for non-greedy)
`?`	0 or 1 (append `?` for non-greedy)
`{m}`	exactly `m` occurrences
`{m,n}`	from `m` to `n`. `m` defaults to 0, `n` to infinity
`{m,n}?`	from `m` to `n`, as few as possible

Special sequences

`\A`	start of string
`\b`	matches empty string at word boundary (between `\w` and `\W`)
`\B`	matches empty string not at word boundary
`\d`	digit
`\D`	non-digit
`\s`	whitespace: `[ \t\n\r\f\v]`
`\S`	non-whitespace
`\w`	alphanumeric: `[0-9a-zA-Z_]`
`\W`	non-alphanumeric
`\Z`	end of string
`\g<id>`	refers to a previously defined group (used in `re.sub` replacement strings, not in patterns)
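`\A` and `\Z` differ from `^` and `$` only when the MULTILINE flag is on. A short Python sketch with made-up text, plus a `\b` example:

```python
import re

text = "one\ntwo"

# With re.MULTILINE, ^ matches at the start of every line...
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
# ...while \A and \Z still anchor to the whole string.
string_start = re.findall(r"\A\w+", text, re.MULTILINE)
string_end = re.findall(r"\w+\Z", text, re.MULTILINE)

# \b keeps "cat" from matching inside "concat" or "cats".
whole_words = re.findall(r"\bcat\b", "cat concat cats cat.")
```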

Extension notation

`(?iLmsux)`	matches empty string; sets the corresponding `re.I`, `re.L`, `re.M`, `re.S`, `re.U`, `re.X` flags
`(?:...)`	non-capturing version of regular parentheses
`(?P<name>...)`	creates a named capture group
`(?P=name)`	matches whatever matched previously named group
`(?#...)`	a comment; ignored
`(?=...)`	lookahead assertion: matches without consuming
`(?!...)`	negative lookahead assertion
`(?<=...)`	lookbehind assertion: matches if preceded
`(?<!...)`	negative lookbehind assertion
`(?(id)yes|no)`	match 'yes' if group 'id' matched, else 'no'
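Some of these `(?...)` constructs are easier to read in use. A Python sketch with invented sample strings; the group names (`year`, `month`, `word`) are arbitrary.

```python
import re

# Named groups, read back by name.
date = re.match(r"(?P<year>\d{4})-(?P<month>\d{2})", "2015-09")
year = date.group("year")

# (?P=name) must match the same text the named group matched.
repeated = re.search(r"\b(?P<word>\w+) (?P=word)\b", "it was was fine").group("word")

# \g<name> refers to a group inside a replacement string.
swapped = re.sub(r"(?P<year>\d{4})-(?P<month>\d{2})", r"\g<month>/\g<year>", "2015-09")

# Lookbehind: digits only when preceded by a dollar sign.
after_dollar = re.findall(r"(?<=\$)\d+", "cost $30, weight 40")

# Conditional: require ">" only if the optional "<" (group 1) matched.
tag = re.compile(r"^(<)?\w+(?(1)>)$")

# Inline flag: (?i) turns on case-insensitive matching.
case_insensitive = re.match(r"(?i)hello", "HeLLo") is not None
```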

------------

Unicode

Unicode indicator: \u

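caf\u00E9 = café (\u00E9 is the Unicode code point for é)

Unicode wildcard : \X (matches one full grapheme; supported in Perl and PHP, but not in JavaScript or Python's built-in re module)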
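A Python sketch of the `\u` escape. Python's built-in `re` has no `\X`, so the example normalizes the text to NFC instead of matching a grapheme; that is a swapped-in workaround, not an equivalent of `\X`.

```python
import re
import unicodedata

composed = "caf\u00e9"     # é as a single code point
decomposed = "cafe\u0301"  # e followed by a combining acute accent

# The re module resolves \uXXXX itself in str patterns, so a raw string works.
pattern = re.compile(r"caf\u00E9")

matches_composed = pattern.fullmatch(composed) is not None
matches_decomposed = pattern.fullmatch(decomposed) is not None

# Normalizing to NFC merges the base letter and the accent into one code point.
normalized = unicodedata.normalize("NFC", decomposed)
matches_normalized = pattern.fullmatch(normalized) is not None
```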

------------

License: Attribution, NonCommercial

Posted by 쁘레드
