Published 2015-07-11 02:53:07 | Source: compiled from the web
A regular expression literal is a pattern written between slashes, or between arbitrary delimiters following %r, as shown below:
/pattern/im # option can be specified
%r!/usr/local! # general delimited regular expression
line1 = "Cats are smarter than dogs"
line2 = "Dogs also like meat"
if line1 =~ /Cats(.*)/
  puts "Line1 contains Cats"
end
if line2 =~ /Cats(.*)/
  puts "Line2 contains Cats"
end
This produces the following output:
Line1 contains Cats
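As a small illustration of the match operator used above: `=~` returns the index where the match starts (or `nil` when there is no match), and `String#match` gives access to the captured group. The variable names are illustrative only.

```ruby
line = "Cats are smarter than dogs"

# =~ returns the character index where the match starts, or nil.
index    = (line =~ /Cats(.*)/)
no_match = (line =~ /Birds/)

# String#match returns a MatchData object; [1] is the first capture group.
rest = line.match(/Cats(.*)/)[1]

p index     # 0
p no_match  # nil
p rest      # " are smarter than dogs"
```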
Modifier | Description |
i | Ignore case when matching text. |
o | Perform #{} interpolations only once, the first time the regexp literal is evaluated. |
x | Ignores whitespace and allows comments in regular expressions. |
m | Multiline mode: the dot (.) matches newline characters as well. |
u,e,s,n | Interpret the regexp as Unicode (UTF-8), EUC, SJIS, or ASCII. If none of these modifiers is specified, the regular expression is assumed to use the source encoding. |
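The following sketch shows the `i`, `m`, and `x` modifiers from the table in action. The sample strings and variable names are made up for illustration.

```ruby
# i: ignore case
case_insensitive = ("RUBY" =~ /ruby/i)

# m: let . match a newline as well
without_m = ("a\nb" =~ /a.b/)
with_m    = ("a\nb" =~ /a.b/m)

# x: whitespace and comments inside the pattern are ignored
date = /
  (\d{4}) -   # year
  (\d{2}) -   # month
  (\d{2})     # day
/x
year = "2015-07-11".match(date)[1]

p case_insensitive  # 0
p without_m         # nil
p with_m            # 0
p year              # "2015"
```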
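As with %Q-delimited string literals, Ruby allows you to begin a regular expression with %r followed by a delimiter of your choice. This is very useful when the pattern contains forward slash characters that you do not want to escape: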
# Following matches a single slash character, no escape required
%r|/|
# Flag characters are allowed with this syntax, too
%r[</(.*)>]i
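To make the benefit of `%r` concrete, here is a minimal sketch that matches the same path with a slash-delimited literal and with a `%r` literal. The path and the tag string are illustrative values only.

```ruby
path = "/usr/local/bin"

# With slash delimiters, every / in the pattern must be escaped.
escaped = (path =~ /\/usr\/local/)

# With %r and another delimiter, no escaping is needed.
delimited = (path =~ %r{/usr/local})

# Flags go after the closing delimiter.
tag_name = "</B>".match(%r[</(.*)>]i)[1]

p escaped    # 0
p delimited  # 0
p tag_name   # "B"
```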
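Except for the control characters (+ ? . * ^ $ ( ) [ ] { } | \), every character matches itself. You can escape a control character by preceding it with a backslash.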
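A short sketch of why escaping matters: an unescaped dot matches any character, while `\.` matches only a literal dot. `Regexp.escape` is the standard helper for escaping a whole string. The sample strings are invented for the example.

```ruby
price = "Total: $5.00 (approx.)"

# Unescaped, . matches any character, so "5x00" matches too.
loose = ("5x00" =~ /5.00/)

# Escaped, \. matches only a literal dot.
strict = ("5x00" =~ /5\.00/)

# Regexp.escape escapes every metacharacter in a string.
literal = Regexp.escape("$5.00 (approx.)")
found   = (price =~ /#{literal}/)

p loose   # 0
p strict  # nil
p found   # 7
```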
Pattern | Description |
^ | Matches beginning of line. |
$ | Matches end of line. |
. | Matches any single character except newline. Using m option allows it to match newline as well. |
[...] | Matches any single character in brackets. |
[^...] | Matches any single character not in brackets |
re* | Matches 0 or more occurrences of preceding expression. |
re+ | Matches 1 or more occurrence of preceding expression. |
re? | Matches 0 or 1 occurrence of preceding expression. |
re{n} | Matches exactly n occurrences of preceding expression. |
re{n,} | Matches n or more occurrences of preceding expression. |
re{n,m} | Matches at least n and at most m occurrences of preceding expression. |
a|b | Matches either a or b. |
(re) | Groups regular expressions and remembers matched text. |
(?imx) | Temporarily toggles on i, m, or x options within a regular expression. If in parentheses, only that area is affected. |
(?-imx) | Temporarily toggles off i, m, or x options within a regular expression. If in parentheses, only that area is affected. |
(?:re) | Groups regular expressions without remembering matched text. |
(?imx:re) | Temporarily toggles on i, m, or x options within parentheses. |
(?-imx:re) | Temporarily toggles off i, m, or x options within parentheses. |
(?#...) | Comment. |
(?=re) | Positive lookahead: asserts that re matches at this position without consuming any characters. |
(?!re) | Negative lookahead: asserts that re does not match at this position without consuming any characters. |
(?>re) | Matches independent pattern without backtracking. |
\w | Matches word characters. |
\W | Matches nonword characters. |
\s | Matches whitespace. Equivalent to [ \t\n\r\f\v]. |
\S | Matches nonwhitespace. |
\d | Matches digits. Equivalent to [0-9]. |
\D | Matches nondigits. |
\A | Matches beginning of string. |
\Z | Matches end of string. If a newline exists, it matches just before newline. |
\z | Matches end of string. |
\G | Matches point where last match finished. |
\b | Matches word boundaries when outside brackets. Matches backspace (0x08) when inside brackets. |
\B | Matches nonword boundaries. |
\n, \t, etc. | Matches newlines, carriage returns, tabs, etc. |
\1...\9 | Matches nth grouped subexpression. |
\10 | Matches nth grouped subexpression if it matched already. Otherwise refers to the octal representation of a character code. |
Example | Description |
/ruby/ | Match "ruby". |
/¥/ | Match the Yen sign. Multibyte characters are supported in Ruby 1.9 and Ruby 1.8. |
Example | Description |
/[Rr]uby/ | Match "Ruby" or "ruby" |
/rub[ye]/ | Match "ruby" or "rube" |
/[aeiou]/ | Match any one lowercase vowel |
/[0-9]/ | Match any digit; same as /[0123456789]/ |
/[a-z]/ | Match any lowercase ASCII letter |
/[A-Z]/ | Match any uppercase ASCII letter |
/[a-zA-Z0-9]/ | Match any of the above |
/[^aeiou]/ | Match anything other than a lowercase vowel |
/[^0-9]/ | Match anything other than a digit |
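The character classes above can be tried out with `String#scan` and `String#gsub`. This is a minimal sketch with made-up sample text.

```ruby
text = "Ruby or ruby, rube 42"

variants       = text.scan(/[Rr]ub[ye]/)
digits         = text.scan(/[0-9]/)
non_digits     = "a1b2".scan(/[^0-9]/)
vowels_removed = "regexp".gsub(/[aeiou]/, "")

p variants        # ["Ruby", "ruby", "rube"]
p digits          # ["4", "2"]
p non_digits      # ["a", "b"]
p vowels_removed  # "rgxp"
```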
Example | Description |
/./ | Match any character except newline |
/./m | In multiline mode . matches newline, too |
/\d/ | Match a digit: /[0-9]/ |
/\D/ | Match a nondigit: /[^0-9]/ |
/\s/ | Match a whitespace character: /[ \t\r\n\f\v]/ |
/\S/ | Match nonwhitespace: /[^ \t\r\n\f\v]/ |
/\w/ | Match a single word character: /[A-Za-z0-9_]/ |
/\W/ | Match a nonword character: /[^A-Za-z0-9_]/ |
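A quick sketch of the shorthand classes `\d`, `\w`, `\s`, and `\W`, plus the effect of the `m` option on the dot. The sample string is illustrative.

```ruby
sample = "id_7 = 42\n"

digits     = sample.scan(/\d/)
words      = sample.scan(/\w+/)
spaces     = sample.scan(/\s/)
non_words  = sample.scan(/\W/)

# Without m the dot skips the newline; with m it matches it.
dot_plain = "a\nb".scan(/./)
dot_multi = "a\nb".scan(/./m)

p digits     # ["7", "4", "2"]
p words      # ["id_7", "42"]
p spaces     # [" ", " ", "\n"]
p non_words  # [" ", "=", " ", "\n"]
p dot_plain  # ["a", "b"]
p dot_multi  # ["a", "\n", "b"]
```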
Example | Description |
/ruby?/ | Match "rub" or "ruby": the y is optional |
/ruby*/ | Match "rub" plus 0 or more ys |
/ruby+/ | Match "rub" plus 1 or more ys |
/\d{3}/ | Match exactly 3 digits |
/\d{3,}/ | Match 3 or more digits |
/\d{3,5}/ | Match 3, 4, or 5 digits |
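The repetition operators can be compared side by side. In this sketch the patterns are anchored with `\A...\z` and `\b` (an addition for the example, not part of the table) so that each one has to account for the whole word or number.

```ruby
words = %w[rub ruby rubyyy rubx]

optional = words.grep(/\Aruby?\z/)   # y is optional
any      = words.grep(/\Aruby*\z/)   # zero or more ys
some     = words.grep(/\Aruby+\z/)   # one or more ys

numbers = "1 22 333 4444 666666"
exactly_three = numbers.scan(/\b\d{3}\b/)
three_plus    = numbers.scan(/\b\d{3,}\b/)
three_to_five = numbers.scan(/\b\d{3,5}\b/)

p optional       # ["rub", "ruby"]
p any            # ["rub", "ruby", "rubyyy"]
p some           # ["ruby", "rubyyy"]
p exactly_three  # ["333"]
p three_plus     # ["333", "4444", "666666"]
p three_to_five  # ["333", "4444"]
```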
Example | Description |
/<.*>/ | Greedy repetition: matches "<ruby>perl>" |
/<.*?>/ | Nongreedy: matches "<ruby>" in "<ruby>perl>" |
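The difference between greedy and nongreedy repetition is easy to see with `String#[]`, which returns the matched substring. A minimal sketch using the same sample text as the table:

```ruby
html = "<ruby>perl>"

greedy     = html[/<.*>/]    # .* grabs as much as possible
non_greedy = html[/<.*?>/]   # .*? stops at the first >

p greedy      # "<ruby>perl>"
p non_greedy  # "<ruby>"
```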
Example | Description |
/\D\d+/ | No group: + repeats \d |
/(\D\d)+/ | Grouped: + repeats \D\d pair |
/([Rr]uby(, )?)+/ | Match "Ruby", "Ruby, ruby, ruby", etc. |
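Grouping changes what a quantifier applies to. The sketch below runs the ungrouped and grouped patterns against the same made-up input.

```ruby
input = "a1b2c3"

# + applies only to \d: one nondigit followed by one or more digits.
ungrouped = input[/\D\d+/]

# + applies to the whole \D\d pair, so the pair can repeat.
grouped = input[/(\D\d)+/]

list = "Ruby, ruby, ruby"[/([Rr]uby(, )?)+/]

p ungrouped  # "a1"
p grouped    # "a1b2c3"
p list       # "Ruby, ruby, ruby"
```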
Example | Description |
/([Rr])uby&\1ails/ | Match ruby&rails or Ruby&Rails |
/(['"])(?:(?!\1).)*\1/ | Single or double-quoted string. \1 matches whatever the 1st group matched. \2 matches whatever the 2nd group matched, etc. |
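A backreference such as `\1` must match the same text that the first group captured. The following sketch exercises both patterns from the table; the test strings are illustrative.

```ruby
rails = /([Rr])uby&\1ails/

lower = rails.match?("ruby&rails")
upper = rails.match?("Ruby&Rails")
mixed = rails.match?("Ruby&rails")   # R and r differ, so \1 fails

# The closing quote must be the same character as the opening quote.
quoted = /(['"])(?:(?!\1).)*\1/
found  = %q(say "it's fine" now)[quoted]

p lower  # true
p upper  # true
p mixed  # false
p found  # "\"it's fine\""
```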
Example | Description |
/ruby|rube/ | Match "ruby" or "rube" |
/rub(y|le)/ | Match "ruby" or "ruble" |
/ruby(!+|\?)/ | "ruby" followed by one or more ! or one ? |
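Alternation can be checked the same way. In this sketch the `\A...\z` anchors are added for the example so that only whole words are accepted.

```ruby
endings = %w[ruby rube ruble].grep(/\Arub(y|le)\z/)

# Each string is reduced to the part the pattern matched.
marks = %w[ruby! ruby!!! ruby? ruby??].map { |s| s[/ruby(!+|\?)/] }

p endings  # ["ruby", "ruble"]
p marks    # ["ruby!", "ruby!!!", "ruby?", "ruby?"]
```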
Example | Description |
/^Ruby/ | Match "Ruby" at the start of a string or internal line |
/Ruby$/ | Match "Ruby" at the end of a string or line |
/\ARuby/ | Match "Ruby" at the start of a string |
/Ruby\Z/ | Match "Ruby" at the end of a string |
/\bRuby\b/ | Match "Ruby" at a word boundary |
/\brub\B/ | \B is nonword boundary: match "rub" in "rube" and "ruby" but not alone |
/Ruby(?=!)/ | Match "Ruby", if followed by an exclamation point |
/Ruby(?!!)/ | Match "Ruby", if not followed by an exclamation point |
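The line anchors (`^`, `$`) and string anchors (`\A`, `\Z`, `\z`) behave differently on multi-line input. This sketch, with invented sample strings, shows the difference along with `\B` and the two lookaheads.

```ruby
text = "Perl\nRuby\n"

line_start   = (text =~ /^Ruby/)    # ^ matches after the newline
string_start = (text =~ /\ARuby/)   # \A matches only at index 0
line_end     = (text =~ /Ruby$/)
before_nl    = (text =~ /Ruby\Z/)   # \Z tolerates one trailing newline
very_end     = (text =~ /Ruby\z/)   # \z does not

inside_words = "rube ruby rub".scan(/\brub\B/).length

followed     = ("Ruby! Ruby?" =~ /Ruby(?=!)/)
not_followed = ("Ruby! Ruby?" =~ /Ruby(?!!)/)

p line_start    # 5
p string_start  # nil
p line_end      # 5
p before_nl     # 5
p very_end      # nil
p inside_words  # 2
p followed      # 0
p not_followed  # 6
```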
Example | Description |
/R(?#comment)/ | Matches "R". All the rest is a comment |
/R(?i)uby/ | Case-insensitive while matching "uby" |
/R(?i:uby)/ | Same as above |
/rub(?:y|le)/ | Group only without creating \1 backreference |
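A minimal sketch of the special parenthesized forms: inline options, inline comments, and non-capturing groups. The input strings are made up, and `MatchData#captures` is used to show which groups were remembered.

```ruby
# (?i) switches on case-insensitivity for the rest of the pattern only.
upper   = ("RUBY" =~ /R(?i)uby/)
lower_r = ("rUBY" =~ /R(?i)uby/)    # the leading R is still case-sensitive
scoped  = ("RuBy" =~ /R(?i:uby)/)

# (?#...) is ignored by the matcher.
comment = ("R" =~ /R(?#comment)/)

# A capturing group is remembered; a (?:...) group is not.
captured   = "ruble".match(/rub(y|le)/).captures
uncaptured = "ruble".match(/rub(?:y|le)/).captures

p upper       # 0
p lower_r     # nil
p scoped      # 0
p comment     # 0
p captured    # ["le"]
p uncaptured  # []
```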
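The most important String methods that use regular expressions are sub and gsub, along with their in-place variants sub! and gsub!.
All of these methods perform a search-and-replace operation using a regular expression pattern. sub and sub! replace the first occurrence of the pattern; gsub and gsub! replace all occurrences.
sub and gsub return a new string and leave the original unmodified, whereas sub! and gsub! modify the string on which they are called.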
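The difference between the plain and the bang variants can be sketched as follows. Note that the bang methods return `nil` when nothing was replaced, so their return value should not be assigned blindly. The variable names are illustrative.

```ruby
original = "a-b-c"

first_only = original.sub("-", "+")    # new string, first match replaced
all        = original.gsub("-", "+")   # new string, all matches replaced

buffer    = "a-b-c".dup                # dup guarantees a mutable string
changed   = buffer.gsub!("-", "+")     # modifies buffer in place
unchanged = buffer.gsub!("-", "+")     # nothing left to replace

p original    # "a-b-c"
p first_only  # "a+b-c"
p all         # "a+b+c"
p buffer      # "a+b+c"
p changed     # "a+b+c"
p unchanged   # nil
```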
phone = "2004-959-559 #This is Phone Number"
# Delete Ruby-style comments
phone = phone.sub(/#.*$/, "")
puts "Phone Num : #{phone}"
# Remove anything other than digits
phone = phone.gsub(/\D/, "")
puts "Phone Num : #{phone}"
This produces the following output:
Phone Num : 2004-959-559
Phone Num : 2004959559
text = "rails are rails, really good Ruby on Rails"
# Change "rails" to "Rails" throughout
text.gsub!("rails", "Rails")
# Capitalize the word "rails" wherever it appears as a whole word
text.gsub!(/\brails\b/, "Rails")
puts text
This produces the following output:
Rails are Rails, really good Ruby on Rails