For example, before the POSIX standard, to match alphanumeric characters, you would have had to write /[A-Za-z0-9]/. If your character set had other alphabetic characters in it, this would not match them, and if your character set collated differently from ASCII, this might not even match the ASCII alphanumeric characters. With the POSIX character classes, you can write /[[:alnum:]]/, and this matches the alphabetic and numeric characters in your character set.
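As a minimal sketch of the difference, the following shell snippet runs awk over a few made-up lines and keeps only those that consist entirely of alphanumeric characters. The sample data and the variable names (`sample`, `alnum_lines`) are illustrative assumptions, and the awk on your system is assumed to support POSIX character classes.

```shell
# Hypothetical sample input, one candidate string per line.
sample='abc123
hello world
foo_bar
42'

# Keep only lines made up entirely of characters in the
# locale's alphanumeric class. The space and the underscore
# are not alphanumeric, so those two lines are dropped.
alnum_lines=$(printf '%s\n' "$sample" | awk '/^[[:alnum:]]+$/')

printf '%s\n' "$alnum_lines"
```

Writing the pattern as /^[A-Za-z0-9]+$/ instead would give the same result for this ASCII-only sample, but it could miss accented letters in other locales.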
Two additional special sequences can appear in character lists. These apply to non-ASCII character sets, which can have single symbols (called collating elements) that are represented with more than one character, as well as several characters that are equivalent for collating, or sorting, purposes. (E.g., in French, a plain “e” and a grave-accented “è” are equivalent.)
<tr valign="top">
<td colspan="2">Collating Symbols</td>
</tr>
<tr valign="top">
<td width="6%">&nbsp;</td>
<td>A collating symbol is a multi-character collating element enclosed in [. and .]. For example, if ch is a collating element, then [[.ch.]] is a regular expression that matches this collating element, while [ch] is a regular expression that matches either c or h.</td>
</tr>
<tr valign="top">
<td colspan="2">Equivalence Classes</td>
</tr>
<tr valign="top">
<td width="6%">&nbsp;</td>
<td>An equivalence class is a locale-specific name for a list of characters that are equivalent. The name is enclosed in [= and =]. For example, the name e might be used to represent all of “e,” “é,” and “è.” In this case, <strong>[[=e=]]</strong> is a regular expression that matches any of <strong>e</strong>, <strong>é</strong>, or <strong>è</strong>.</td>
</tr>
<tr>
<td colspan="2">These features are very valuable in non-English-speaking locales. The library functions that gawk uses for regular expression matching currently recognize only POSIX character classes; they do not recognize collating symbols or equivalence classes.</td>
</tr>
<tr>
<td colspan="2">The \y, \B, \&lt;, \&gt;, \w, \W, \`, and \' operators are specific to gawk; they are extensions based on facilities in the GNU regular expression libraries.</td>
</tr>
<tr>
<td colspan="2">Various command-line options control how gawk interprets characters in regular expressions.</td>
</tr>
<tr valign="top">
<td colspan="2">No options</td>
</tr>
<tr valign="top">
<td width="6%">&nbsp;</td>
<td>In the default case, gawk provides all the facilities of POSIX regular expressions and the GNU regular expression operators described above. However, interval expressions are not supported.</td>
</tr>
<tr valign="top">
<td>--posix</td>
<td valign="bottom">Only POSIX regular expressions are supported; the GNU operators are not special. (E.g., \w matches a literal w.) Interval expressions are allowed.</td>
</tr>
<tr valign="top">
<td colspan="2">--traditional</td>
</tr>
<tr valign="top">
<td width="6%">&nbsp;</td>
<td>Traditional Unix awk regular expressions are matched. The GNU operators are not special, interval expressions are not available, and neither are the POSIX character classes ([[:alnum:]] and so on). Characters described by octal and hexadecimal escape sequences are treated literally, even if they represent regular expression metacharacters.</td>
</tr>
<tr valign="top">
<td colspan="2">--re-interval</td>
</tr>
<tr valign="top">
<td width="6%">&nbsp;</td>
<td>Allow interval expressions in regular expressions, even if --traditional has been provided.</td>
</tr>
Actions
Action statements are enclosed in braces, { and }. Action statements consist of the usual assignment, conditional, and looping statements found in most languages. The operators, control statements, and input/output statements available are patterned after those in C.
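The C-like flavor of action statements can be sketched as follows. This is only an illustration: the variable names (`total`, `i`, `result`) are made up for the example, and the action is attached to a BEGIN rule so that it runs without reading any input.

```shell
# One action, enclosed in braces, that uses an assignment,
# a C-style for loop, and an if/else conditional.
result=$(awk 'BEGIN {
    total = 0                    # assignment
    for (i = 1; i <= 5; i++)     # looping statement
        total += i
    if (total > 10)              # conditional statement
        print "total is " total
    else
        print "small"
}')

printf '%s\n' "$result"
```

The loop adds the integers 1 through 5, so the conditional takes its first branch.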
Operators
The operators in AWK, in order of decreasing precedence, are
<table class="src" border="1" cellspacing="0" cellpadding="5">