Regular Expressions - 생활코딩
생활코딩 lectures
https://www.youtube.com/watch?v=V_ePeBaQzSc&list=PLB9NsRTifc1nc8J5BJhkE3AP9oieECVpJ
https://opentutorials.org/course/909/5143
http://zvon.org/comp/r/tut-Regexp.html#Pages~Contents
Regex testing site
PHP Regular Expressions - 생활코딩
https://opentutorials.org/module/6/5141
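Table of contents
1. Basic patterns of regular expressions (1~2)
2. Position and escaping (3~4)
3. Regular expression patterns (5~6) Any character
4. Regular expression patterns (7~9) Specific characters ([])
5. Regular expression patterns (10) Subpattern
6. Regular expression patterns (11~14) Quantifiers
7. Regular expression patterns (15~17) Quantifiers 2
8. Regular expression patterns (18~24) Boundaries
9. Regular expression patterns (25~26) Assertions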
1. Basic patterns of regular expressions (1~2)
Page 1
Regular expressions are case sensitive. Therefore Case 1 will find the specified text, but Case 2 will not.
Source
Hello, world!
Case 1
Regular Expression: Hello
First match: Hello, world!
All matches: Hello, world!
Case 2
Regular Expression: hello
First match: Hello, world!
All matches: Hello, world!
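The two cases on Page 1 can be reproduced with a short script. This is only a sketch: Python's built-in re module is my choice here, not something the tutorial prescribes, and the re.IGNORECASE flag is an extra that the page does not cover.

```python
import re

source = "Hello, world!"

# Case 1: the pattern has the same case as the text, so it matches.
case1 = re.search(r"Hello", source)

# Case 2: lowercase "h" is a different character from "H", so there is no match.
case2 = re.search(r"hello", source)

# Not part of the tutorial page: a flag that turns case sensitivity off.
case2_ignore = re.search(r"hello", source, re.IGNORECASE)

print(case1.group())         # Hello
print(case2)                 # None
print(case2_ignore.group())  # Hello
```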
Page 2
Each character inside the search pattern is significant including whitespace characters (space, tab, new line).
Source
Hello, world!
Case 1
Regular Expression: Hello, world
First match: Hello, world!
All matches: Hello, world!
Case 2
Regular Expression: Hello,  world // extra whitespace after the comma, so nothing matches
First match: Hello, world!
All matches: Hello, world!
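To see that whitespace inside a pattern counts, the sketch below compares a pattern with one space after the comma against one with two spaces. The two-space pattern is my own assumption about what the second case is meant to show; Python's re module is used for illustration only.

```python
import re

source = "Hello, world!"

# One space after the comma, exactly as in the source text.
one_space = re.search(r"Hello, world", source)

# Two spaces after the comma: the extra space has nothing to match against.
two_spaces = re.search(r"Hello,  world", source)

print(one_space.group())  # Hello, world
print(two_spaces)         # None
```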
2. Position and escaping (3~4)
Page 3
Some characters have special meanings. The character ^ matches the beginning of the line (Case 1), while the dollar sign $ matches the end of the line (Case 2).
Source
who is who
Case 1
Regular Expression: ^who
First match: who is who
All matches: who is who
Case 2
Regular Expression: who$
First match: who is who
All matches: who is who
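A small sketch of the two anchors, again using Python's re module as an assumed environment. The span() call reports where in the string the match was found, which makes it visible that the same word is picked up at different positions.

```python
import re

source = "who is who"

# Case 1: ^ ties the match to the beginning of the line.
at_start = re.search(r"^who", source)

# Case 2: $ ties the match to the end of the line.
at_end = re.search(r"who$", source)

# Without anchors both occurrences are found.
everywhere = re.findall(r"who", source)

print(at_start.span())  # (0, 3)
print(at_end.span())    # (7, 10)
print(everywhere)       # ['who', 'who']
```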
Page 4
# Escaping means turning a special character, which has its own role in a regular expression, back into an ordinary character
If the literal value of a special character is required, it must be escaped with a backslash \.
Case 1 does not match anything, as both characters are special,
Case 2 matches every $,
Case 3 matches $ only if it is the first character, and
Case 4 only if it is the last character.
The backslash itself has a special meaning and must also be escaped for literal use (Case 5).
Source
$12$ \-\ $25$
Case 1
Regular Expression: ^$
First match: $12$ \-\ $25$
All matches: $12$ \-\ $25$
Case 2
Regular Expression: \$
First match: $12$ \-\ $25$
All matches: $12$ \-\ $25$
Case 3
Regular Expression: ^\$
First match: $12$ \-\ $25$
All matches: $12$ \-\ $25$
Case 4
Regular Expression: \$$
First match: $12$ \-\ $25$
All matches: $12$ \-\ $25$
Case 5
Regular Expression: \\
First match: $12$ \-\ $25$
All matches: $12$ \-\ $25$
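The five escaping cases can be checked as follows. This is a sketch in Python (my choice of language); the source text is written as a raw string so that its backslashes stay literal.

```python
import re

source = r"$12$ \-\ $25$"

# Case 1: ^ and $ are both anchors, so ^$ only matches an empty line.
case1 = re.search(r"^$", source)

# Case 2: an escaped dollar sign matches every literal $.
case2 = re.findall(r"\$", source)

# Case 3 and Case 4: a literal $ at the very start / very end of the line.
case3 = re.search(r"^\$", source)
case4 = re.search(r"\$$", source)

# Case 5: a backslash must itself be escaped to be matched literally.
case5 = re.findall(r"\\", source)

print(case1)         # None
print(len(case2))    # 4
print(case3.span())  # (0, 1)
print(case4.span())  # (12, 13)
print(len(case5))    # 2
```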
3. Regular expression patterns (5~6) Any character
Page 5
The dot . matches any single character (in most engines, any character except a newline by default).
Source
Regular expressions are powerful!!!
Case 1
Regular Expression: .
First match: Regular expressions are powerful!!!
All matches: Regular expressions are powerful!!!
Case 2
Regular Expression: ......
First match: Regular expressions are powerful!!!
All matches: Regular expressions are powerful!!!
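A sketch of the dot in Python's re module (an assumed environment). The last line is an addition of mine showing that, in Python, the dot skips newlines unless the re.DOTALL flag is given.

```python
import re

source = "Regular expressions are powerful!!!"

# Case 1: a single dot matches one character; the first match is "R".
first_char = re.search(r".", source).group()

# Case 2: six dots match six characters at a time.
six_at_a_time = re.findall(r"......", source)

# Extra illustration: by default the dot does not match a newline in Python.
no_newline = re.findall(r".", "a\nb")

print(first_char)          # R
print(six_at_a_time[0])    # Regula
print(len(six_at_a_time))  # 5
print(no_newline)          # ['a', 'b']
```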
4. Regular expression patterns (7~9) Specific characters ([])
Page 7
Inside square brackets "[]" a list of characters can be provided. The expression matches if any one of these characters is found. The order of the characters inside the brackets is insignificant (Case 3).
Source
How do you do?
Case 1
Regular Expression: [oyu]
First match: How do you do?
All matches: How do you do?
Case 2
Regular Expression: [dH].
First match: How do you do?
All matches: How do you do?
Case 3
Regular Expression: [owy][yow]
First match: How do you do?
All matches: How do you do?
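The three character-class cases, sketched in Python (my choice of language). re.findall returns every non-overlapping match from left to right.

```python
import re

source = "How do you do?"

# Case 1: any single o, y or u.
case1 = re.findall(r"[oyu]", source)

# Case 2: a d or an H followed by any one character.
case2 = re.findall(r"[dH].", source)

# Case 3: two classes with the same members in a different order.
case3 = re.findall(r"[owy][yow]", source)

print(case1)  # ['o', 'o', 'y', 'o', 'u', 'o']
print(case2)  # ['Ho', 'do', 'do']
print(case3)  # ['ow', 'yo']
```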
Page 8
A range of characters can be specified with the [ - ] syntax. Case 1 and Case 2 are equivalent. Several ranges can be given in one expression (Case 5).
Source
ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 1
Regular Expression: [C-K]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 2
Regular Expression: [CDEFGHIJK]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 3
Regular Expression: [a-d]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 4
Regular Expression: [2-6]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 5
Regular Expression: [C-Ka-d2-6]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
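A sketch that checks the range cases in Python (an assumed environment). Joining the individual matches into one string makes the result easier to read.

```python
import re

source = "ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789"

# Case 1 and Case 2: a range and the spelled-out list select the same characters.
by_range = "".join(re.findall(r"[C-K]", source))
by_list = "".join(re.findall(r"[CDEFGHIJK]", source))

# Case 5: several ranges combined in one class.
combined = "".join(re.findall(r"[C-Ka-d2-6]", source))

print(by_range)             # CDEFGHIJK
print(by_range == by_list)  # True
print(combined)             # CDEFGHIJKabcd23456
```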
Page 9
If a character class starts with ^, the listed characters are excluded: the class matches any character that is not in the list.
Source
ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 1
Regular Expression: [^CDghi45]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
Case 2
Regular Expression: [^W-Z]
First match: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
All matches: ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789
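The negated classes can be sketched the same way in Python (my choice of language). Note that the spaces in the source survive, because a space is also "not in the list".

```python
import re

source = "ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 0123456789"

# Case 1: everything except C, D, g, h, i, 4 and 5.
case1 = "".join(re.findall(r"[^CDghi45]", source))

# Case 2: everything except the range W to Z.
case2 = "".join(re.findall(r"[^W-Z]", source))

print(case1)  # ABEFGHIJKLMNOPQRSTUVWXYZ abcdefjklmnopqrstuvwxyz 01236789
print(case2)  # ABCDEFGHIJKLMNOPQRSTUV abcdefghijklmnopqrstuvwxyz 0123456789
```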
5. Regular expression patterns (10) Subpattern
Page 10
Alternating text can be enclosed in parentheses and alternatives separated with |.
Source
Monday Tuesday Friday
Case 1
Regular Expression: (on|ues|rida)
First match: Monday Tuesday Friday
All matches: Monday Tuesday Friday
Case 2
Regular Expression: (Mon|Tues|Fri)day
First match: Monday Tuesday Friday
All matches: Monday Tuesday Friday
Case 3
Regular Expression: ..(id|esd|nd)ay
First match: Monday Tuesday Friday
All matches: Monday Tuesday Friday
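A sketch of the alternation cases in Python (an assumed environment). One Python-specific detail: when a pattern contains a group, re.findall returns the group contents rather than the whole match, so the helper below (a hypothetical name of mine) uses re.finditer to collect the full matched text.

```python
import re

source = "Monday Tuesday Friday"


def whole_matches(pattern: str, text: str) -> list[str]:
    """Return the full text of every match, ignoring capture groups."""
    return [m.group(0) for m in re.finditer(pattern, text)]


case1 = whole_matches(r"(on|ues|rida)", source)
case2 = whole_matches(r"(Mon|Tues|Fri)day", source)
case3 = whole_matches(r"..(id|esd|nd)ay", source)

# For comparison: findall reports only what the parentheses captured.
captured = re.findall(r"(Mon|Tues|Fri)day", source)

print(case1)     # ['on', 'ues', 'rida']
print(case2)     # ['Monday', 'Tuesday', 'Friday']
print(case3)     # ['Monday', 'Tuesday', 'Friday']
print(captured)  # ['Mon', 'Tues', 'Fri']
```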
6. Regular expression patterns (11~14) Quantifiers
Page 11
Quantifiers specify how many times the preceding character can occur. The star * (Case 1) matches zero or more times, the plus + (Case 2) one or more times, and the question mark ? (Case 3) zero or one time.
Source
aabc abc bc
Case 1
Regular Expression: a*b
First match: aabc abc bc
All matches: aabc abc bc
Case 2
Regular Expression: a+b
First match: aabc abc bc
All matches: aabc abc bc
Case 3
Regular Expression: a?b
First match: aabc abc bc
All matches: aabc abc bc
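The difference between the three quantifiers shows up clearly when all matches are listed. A sketch in Python (my choice of language):

```python
import re

source = "aabc abc bc"

# Case 1: zero or more a's before the b.
star = re.findall(r"a*b", source)

# Case 2: at least one a before the b, so the bare "b" in "bc" is skipped.
plus = re.findall(r"a+b", source)

# Case 3: at most one a before the b, so "aab" only contributes "ab".
optional = re.findall(r"a?b", source)

print(star)      # ['aab', 'ab', 'b']
print(plus)      # ['aab', 'ab']
print(optional)  # ['ab', 'ab', 'b']
```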
Page 12
Several examples of the "*" quantifier
Source
-@- *** -- "*" -- *** -@-
Case 1
Regular Expression: .*
First match: -@- *** -- "*" -- *** -@-
All matches: -@- *** -- "*" -- *** -@-
Case 2
Regular Expression: -A*-
First match: -@- *** -- "*" -- *** -@-
All matches: -@- *** -- "*" -- *** -@-
Case 3
Regular Expression: [-@]*
First match: -@- *** -- "*" -- *** -@-
All matches: -@- *** -- "*" -- *** -@-
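One subtle point of "*" is that it can match nothing at all. The sketch below (Python, my choice) runs the three cases; the source string is typed with plain ASCII hyphens and quotes, which is my assumption about the tutorial's original text.

```python
import re

source = '-@- *** -- "*" -- *** -@-'

# Case 1: .* is greedy and takes the whole line in one match.
whole = re.search(r".*", source).group()

# Case 2: -A*- needs two hyphens with only A's (possibly none) between them.
double_hyphens = re.findall(r"-A*-", source)

# Case 3: [-@]* may match zero characters, so findall also reports empty
# strings wherever no '-' or '@' is present; they are filtered out here.
runs = [m for m in re.findall(r"[-@]*", source) if m]

print(whole == source)  # True
print(double_hyphens)   # ['--', '--']
print(runs)             # ['-@-', '--', '--', '-@-']
```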
Page 13
Several examples of the "+" quantifier
Source
-@@@- * ** - - "*" -- * ** -@@@-
Case 1
Regular Expression: \*+
First match: -@@@- * ** - - "*" -- * ** -@@@-
All matches: -@@@- * ** - - "*" -- * ** -@@@-
Case 2
Regular Expression: -@+-
First match: -@@@- * ** - - "*" -- * ** -@@@-
All matches: -@@@- * ** - - "*" -- * ** -@@@-
Case 3
Regular Expression: [^ ]+
First match: -@@@- * ** - - "*" -- * ** -@@@-
All matches: -@@@- * ** - - "*" -- * ** -@@@-
Page 14
Several examples of the "?" quantifier
Source
--XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
Case 1
Regular Expression: -X?XX?X
First match: --XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
All matches: --XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
Case 2
Regular Expression: -@?@?@?-
First match: --XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
All matches: --XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
Case 3
Regular Expression: [^@]@?@
First match: --XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
All matches: --XX-@-XX-@@-XX-@@@-XX-@@@@-XX-@@-@@-
7. Regular expression patterns (15~17) Quantifiers 2
Page 15
Curly brackets enable precise specification of character repetitions.
{m} matches exactly m times (Case 1), {m,n} matches at least m and at most n times (Case 2), and {m,} matches at least m times (Case 3).
Source
One ring to bring them all and in the darkness bind them
Case 1
Regular Expression: .{5}
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 2
Regular Expression: [els]{1,3}
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 3
Regular Expression: [a-z]{3,}
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
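The curly-bracket forms can be tried out as below. This is a sketch in Python (an assumed environment), listing all matches for each case.

```python
import re

source = "One ring to bring them all and in the darkness bind them"

# Case 1: exactly five characters of any kind.
first_five = re.search(r".{5}", source).group()

# Case 2: runs of one to three characters taken from e, l and s.
els_runs = re.findall(r"[els]{1,3}", source)

# Case 3: runs of three or more lowercase letters ("One" starts with a capital).
long_words = re.findall(r"[a-z]{3,}", source)

print(first_five)  # One r
print(els_runs)    # ['e', 'e', 'll', 'e', 'ess', 'e']
print(long_words)  # ['ring', 'bring', 'them', 'all', 'and', 'the', 'darkness', 'bind', 'them']
```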
Page 16
Quantifiers “*”, “+”, and “?” are special cases of the bracket notation. “*” is equivalent to {0,} (Case 1, Case 2), “+” to {1,} (Case 3, Case 4), and “?” to {0,1} (Case 5, Case 6).
Source
AA ABA ABBA ABBBA
Case 1
Regular Expression: AB*A
First match: AA ABA ABBA ABBBA
All matches: AA ABA ABBA ABBBA
Case 2
Regular Expression: AB{0,}A
First match: AA ABA ABBA ABBBA
All matches: AA ABA ABBA ABBBA
Case 3
Regular Expression: AB+A
First match: AA ABA ABBA ABBBA
All matches: AA ABA ABBA ABBBA
Case 4
Regular Expression: AB{1,}A
First match: AA ABA ABBA ABBBA
All matches: AA ABA ABBA ABBBA
Case 5
Regular Expression: AB?A
First match: AA ABA ABBA ABBBA
All matches: AA ABA ABBA ABBBA
Case 6
Regular Expression: AB{0,1}A
First match: AA ABA ABBA ABBBA
All matches: AA ABA ABBA ABBBA
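The claimed equivalences are easy to verify by running each pair against the same text. A sketch in Python (my choice of language):

```python
import re

source = "AA ABA ABBA ABBBA"

star = re.findall(r"AB*A", source)
star_braces = re.findall(r"AB{0,}A", source)

plus = re.findall(r"AB+A", source)
plus_braces = re.findall(r"AB{1,}A", source)

optional = re.findall(r"AB?A", source)
optional_braces = re.findall(r"AB{0,1}A", source)

print(star)      # ['AA', 'ABA', 'ABBA', 'ABBBA']
print(plus)      # ['ABA', 'ABBA', 'ABBBA']
print(optional)  # ['AA', 'ABA']

# Each shorthand gives the same result as its curly-bracket form.
print(star == star_braces, plus == plus_braces, optional == optional_braces)
```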
Page 17
By default a quantified subpattern matches as many times as possible (greedy matching). This behaviour changes to matching as few times as possible (lazy matching) if the quantifier is followed by a question mark. Compare “*” (Case 1) with “*?” (Case 2), “+” (Case 3) with “+?” (Case 4), and “?” (Case 5) with “??” (Case 6).
Source
One ring to bring them all and in the darkness bind them
Case 1
Regular Expression: r.*
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 2
Regular Expression: r.*?
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 3
Regular Expression: r.+
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 4
Regular Expression: r.+?
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 5
Regular Expression: r.?
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
Case 6
Regular Expression: r.??
First match: One ring to bring them all and in the darkness bind them
All matches: One ring to bring them all and in the darkness bind them
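Greedy and lazy quantifiers side by side, as a sketch in Python (an assumed environment). Only the first match is shown for each case, since that is where the difference is easiest to see.

```python
import re

source = "One ring to bring them all and in the darkness bind them"


def first(pattern: str) -> str:
    """Return the text of the first match of pattern in source."""
    return re.search(pattern, source).group()


greedy_star = first(r"r.*")   # runs from the first r to the end of the line
lazy_star = first(r"r.*?")    # stops right after the r
greedy_plus = first(r"r.+")
lazy_plus = first(r"r.+?")    # needs one character after the r, takes no more
greedy_opt = first(r"r.?")
lazy_opt = first(r"r.??")

print(greedy_star == source[4:])  # True
print(lazy_star)                  # r
print(lazy_plus)                  # ri
print(greedy_opt, lazy_opt)       # ri r
```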
8. Regular expression patterns (18~24) Boundaries
Page 18
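\w matches any word character (alphanumeric plus "_").
In some languages these letter abbreviations are not recognized. Use a character class such as "[A-Za-z0-9_]" instead. The class "[A-z0-9_]" used in Case 5 is close but not identical: the range A-z also covers the characters [ \ ] ^ and `, which sit between Z and a in ASCII.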
Source
A1 B2 c3 d_4 e:5 ffGG77--__--
Case 1
Regular Expression: \w
First match: A1 B2 c3 d_4 e:5 ffGG77--__--
All matches: A1 B2 c3 d_4 e:5 ffGG77--__--
Case 2
Regular Expression: \w*
First match: A1 B2 c3 d_4 e:5 ffGG77--__--
All matches: A1 B2 c3 d_4 e:5 ffGG77--__--
Case 3
Regular Expression: [a-z]\w*
First match: A1 B2 c3 d_4 e:5 ffGG77--__--
All matches: A1 B2 c3 d_4 e:5 ffGG77--__--
Case 4
Regular Expression: \w{5}
First match: A1 B2 c3 d_4 e:5 ffGG77--__--
All matches: A1 B2 c3 d_4 e:5 ffGG77--__--
Case 5
Regular Expression: [A-z0-9_]
First match: A1 B2 c3 d_4 e:5 ffGG77--__--
All matches: A1 B2 c3 d_4 e:5 ffGG77--__--
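A sketch of the \w cases in Python (my choice of language). Two assumptions: the source string is typed with plain ASCII hyphens, and \w+ is used in place of \w* so that empty matches do not clutter the output. In Python, \w also accepts non-ASCII letters unless the re.ASCII flag is set, so the flag is used for the comparison with the explicit class.

```python
import re

source = "A1 B2 c3 d_4 e:5 ffGG77--__--"

# Runs of word characters (variant of Case 2 without empty matches).
words = re.findall(r"\w+", source)

# Case 3: a lowercase letter followed by any number of word characters.
lower_start = re.findall(r"[a-z]\w*", source)

# Case 4: exactly five word characters in a row.
five = re.findall(r"\w{5}", source)

# With re.ASCII, \w selects the same characters as [A-Za-z0-9_].
shorthand = re.findall(r"\w", source, re.ASCII)
explicit = re.findall(r"[A-Za-z0-9_]", source)

print(words)                  # ['A1', 'B2', 'c3', 'd_4', 'e', '5', 'ffGG77', '__']
print(lower_start)            # ['c3', 'd_4', 'e', 'ffGG77']
print(five)                   # ['ffGG7']
print(shorthand == explicit)  # True
```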
Page 19
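\W matches any non-word character (everything except alphanumerics and "_"). Compare Case 1 and Case 2. It is equivalent to "[^A-Za-z0-9_]". The class "[^A-z0-9_]" in Case 3 is close but not identical, because the range A-z also contains [ \ ] ^ and `.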
Source
AS _34:AS11.23 @#$ %12^*
Case 1
Regular Expression: \W
First match: AS _34:AS11.23 @#$ %12^*
All matches: AS _34:AS11.23 @#$ %12^*
Case 2
Regular Expression: \w
First match: AS _34:AS11.23 @#$ %12^*
All matches: AS _34:AS11.23 @#$ %12^*
Case 3
Regular Expression: [^A-z0-9_]
First match: AS _34:AS11.23 @#$ %12^*
All matches: AS _34:AS11.23 @#$ %12^* // note the caret (^): it lies inside the A-z range, so Case 3 does not select it, unlike \W
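The caret difference between \W and [^A-z0-9_] can be made visible with a short check. This is a sketch in Python (an assumed environment); re.ASCII restricts \W to the ASCII definition so that it lines up with the explicit classes.

```python
import re

source = "AS _34:AS11.23 @#$ %12^*"

non_word = re.findall(r"\W", source, re.ASCII)
loose = re.findall(r"[^A-z0-9_]", source)      # the class used in Case 3
strict = re.findall(r"[^A-Za-z0-9_]", source)  # the exact ASCII equivalent of \W

# The six characters that sit between 'Z' and 'a' in ASCII.
between = re.findall(r"[A-z]", "[\\]^_`")

print(non_word)            # [' ', ':', '.', ' ', '@', '#', '$', ' ', '%', '^', '*']
print("^" in loose)        # False
print(strict == non_word)  # True
print(between)             # ['[', '\\', ']', '^', '_', '`']
```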
Page 20
\s matches white space characters: space, new line and tab. \S matches any non-whitespace character.
Source
Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 1
Regular Expression: \s // only the whitespace is selected
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 2
Regular Expression: \S // every character other than whitespace is selected
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
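A sketch of \s and \S in Python (my choice of language). The sample text is a shortened version of the source with a tab and a newline inserted by me, to show that they count as whitespace too.

```python
import re

# Shortened sample; the tab and the newline are additions for illustration.
text = "Ere iron was found\tor tree\nwas hewn"

# Case 1: every single whitespace character.
whitespace = re.findall(r"\s", text)

# Case 2 (with +): runs of non-whitespace characters, i.e. the words.
words = re.findall(r"\S+", text)

print(whitespace)  # [' ', ' ', ' ', '\t', ' ', '\n', ' ']
print(words)       # ['Ere', 'iron', 'was', 'found', 'or', 'tree', 'was', 'hewn']
```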
Page 21
\d matches any digit and \D anything else. Compare Case 1 and Case 2. Use “[0-9]” if your programming language does not support this abbreviation (Case 3).
Source
Page 123; published: 1234 id=12#24@112
Case 1
Regular Expression: \d
First match: Page 123; published: 1234 id=12#24@112
All matches: Page 123; published: 1234 id=12#24@112
Case 2
Regular Expression: \D
First match: Page 123; published: 1234 id=12#24@112
All matches: Page 123; published: 1234 id=12#24@112
Case 3
Regular Expression: [0-9]
First match: Page 123; published: 1234 id=12#24@112
All matches: Page 123; published: 1234 id=12#24@112
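The digit shorthand and its bracket equivalent, sketched in Python (an assumed environment). For plain ASCII text such as this source, \d and [0-9] select the same characters; Python's \d would additionally accept digits from other scripts unless re.ASCII is set.

```python
import re

source = "Page 123; published: 1234 id=12#24@112"

# Case 1 (with +): runs of digits.
numbers = re.findall(r"\d+", source)

# Case 2: everything that is not a digit, joined back together.
non_digits = "".join(re.findall(r"\D", source))

# Case 3: the explicit class gives the same single-digit matches as \d.
same = re.findall(r"\d", source) == re.findall(r"[0-9]", source)

print(numbers)  # ['123', '1234', '12', '24', '112']
print(same)     # True
```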
Page 22
\b matches a word boundary. A word boundary (\b) is defined as a spot between two characters that has a \w on one side of it and a \W on the other side of it (in either order).
Source
Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 1
Regular Expression: \b.
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 2
Regular Expression: .\b
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Page 23
\B matches any position that is not a word boundary. A word boundary (\b) is defined as a spot between two characters that has a \w on one side of it and a \W on the other side of it (in either order).
Source
Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 1
Regular Expression: \B.
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 2
Regular Expression: .\B
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
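\b and \B match positions rather than characters, which is easier to see on a short string. The sketch below uses Python (my choice) and only the first four words of the source.

```python
import re

text = "Ere iron was found"

# \b. : a character that starts right at a word boundary.
after_boundary = re.findall(r"\b.", text)

# .\b : a character that ends right at a word boundary.
before_boundary = re.findall(r".\b", text)

# \B. : a character whose left edge is not a word boundary.
inside = "".join(re.findall(r"\B.", text))

# A common practical use: match "was" only as a whole word.
whole_word = re.findall(r"\bwas\b", text)

print(after_boundary)   # ['E', ' ', 'i', ' ', 'w', ' ', 'f']
print(before_boundary)  # ['e', ' ', 'n', ' ', 's', ' ', 'd']
print(inside)           # reronasound
print(whole_word)       # ['was']
```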
Page 24
\A matches the beginning of the string. It is similar to ^, but ^ will also match after each newline when multiline strings are considered. Similarly, \Z matches only at the end of the string, or before a newline at the very end of it. It is similar to $, but $ will match before each newline.
# Even with the multiline flag, \A always refers to the very beginning and \Z always refers to the very end
# With the multiline flag, ^ refers to the beginning of each line and $ refers to the end of each line
Source
Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 1
Regular Expression: \A...
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
Case 2
Regular Expression: ...\Z
First match: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
All matches: Ere iron was found or tree was hewn, When young was mountain under moon; Ere ring was made, or wrought was woe, It walked the forests long ago.
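The contrast with ^ and $ only appears on a multi-line string, so the sketch below uses a three-line sample of my own instead of the single-line source. It is written for Python's re module (my choice), where \Z is stricter than described above: it matches only at the very end of the string, not before a trailing newline.

```python
import re

# Hypothetical three-line sample text.
text = "first line\nsecond line\nthird line"

# With re.MULTILINE, ^ and $ work per line ...
line_starts = re.findall(r"^\w+", text, re.MULTILINE)
line_ends = re.findall(r"\w+$", text, re.MULTILINE)

# ... while \A and \Z still refer to the whole string.
string_start = re.findall(r"\A\w+", text, re.MULTILINE)
string_end = re.findall(r"\w+\Z", text, re.MULTILINE)

print(line_starts)   # ['first', 'second', 'third']
print(line_ends)     # ['line', 'line', 'line']
print(string_start)  # ['first']
print(string_end)    # ['line']
```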
9. Regular expression patterns (25~26) Assertions
Page 25
(?=<pattern>) will look ahead if the pattern exists, but will not include it in the hit.
# Positive lookahead: when the pattern is found ahead, the part in front of it is matched, but the pattern itself is not included in the result
Source
AAAX---aaax---111
Case 1
Regular Expression: \w+(?=X)
First match: AAAX---aaax---111
All matches: AAAX---aaax---111
Case 2
Regular Expression: \w+
First match: AAAX---aaax---111
All matches: AAAX---aaax---111
Case 3
Regular Expression: \w+(?=\w)
First match: AAAX---aaax---111
All matches: AAAX---aaax---111
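A sketch of positive lookahead in Python (an assumed environment). The source string is typed with plain ASCII hyphens, which is my assumption about the original text. The results show that the character checked by the lookahead is never part of the match.

```python
import re

source = "AAAX---aaax---111"

# Case 1: word characters that are followed by an uppercase X.
before_x = re.findall(r"\w+(?=X)", source)

# Case 2: plain runs of word characters, for comparison.
plain = re.findall(r"\w+", source)

# Case 3: word characters followed by one more word character,
# so every run loses its last character.
before_word_char = re.findall(r"\w+(?=\w)", source)

print(before_x)          # ['AAA']
print(plain)             # ['AAAX', 'aaax', '111']
print(before_word_char)  # ['AAA', 'aaa', '11']
```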
Page 26
(?!<pattern>) will look ahead if the pattern exists. If it does there will be no hit.
# Negative lookahead: the part in front is matched only when the pattern is not found ahead; the pattern is not included in the result
Source
AAAX---AAA
Case 1
Regular Expression: AAA(?!X)
First match: AAAX---AAA
All matches: AAAX---AAA
Case 2
Regular Expression: AAA
First match: AAAX---AAA
All matches: AAAX---AAA
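Negative lookahead can be sketched the same way in Python (my choice of language), again assuming plain ASCII hyphens in the source string. The start() positions show which of the two AAA runs survives.

```python
import re

source = "AAAX---AAA"

# Case 1: AAA only where it is not followed by X.
not_before_x = [m.start() for m in re.finditer(r"AAA(?!X)", source)]

# Case 2: every AAA, for comparison.
every = [m.start() for m in re.finditer(r"AAA", source)]

print(not_before_x)  # [7]
print(every)         # [0, 7]
```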