|
plgrep - grep enhanced with perl |
|
plgrep [options] pattern [file(s)] |
|
plgrep is a grep program with several enhancements. Some of these are taken from GNU grep(1), some from the lesser-known rgrep(1), and some are entirely new. plgrep supports the regular UNIX flavors of regular expressions (grep, fgrep, and egrep), but since it is written in perl(1), it can also use perl-style regular expressions as described in perlre(1), which are more powerful than those of egrep. The default behavior, however, is still that of plain grep(1). The flags for standard UNIX grep(1) (-chilnsvy) work as expected, though the -c behavior can be modified by other new options. |
|
-c |
print a count (per file) of the number of lines matching the pattern, instead of printing the lines themselves. (But also see -L, -p, -q, and -Q for possible modifications.) |
||
|
-h |
suppress printing of filenames when searching multiple files. |
||
|
-i, -y |
ignore upper/lower case. |
||
|
-l |
print only filenames containing matching lines (files with multiple matches only get their names printed once). |
||
|
-n |
print line numbers in front of matching lines. |
||
|
-s |
suppress warnings about nonexistent/inaccessible files. |
||
|
-v |
reverse the sense of matching, i.e., print or count non-matching lines. |
|
-e pattern |
|
an alternate way of specifying the regular expression;
useful if the pattern starts with a dash. Also has special
behavior with -f; see below.
|
|
-E |
treat the regular expression as in egrep(1). |
|
-f patfile |
| read pattern(s) to search for from patfile. If patfile is ’-’, pattern(s) are read from standard input. Multiple lines in patfile are concatenated with the alternation operator ’|’. Thus, if patfile contains three lines specifying regular expressions A, B, and C, plgrep will search for ’(A|B|C)’. This is useful for working around the problem of overflowing the shell’s command-line buffer when searching for a very large number of alternative patterns. Also, this option can be combined specially with -e (not a GNU feature): if the -e expression contains %s, the ’|’-concatenated regular expression obtained from reading patfile is inserted into the -e regular expression anywhere a ’%s’ appears. For example, specifying |
|
plgrep -e '/%s:' -f patfile |
|
where patfile contains the three regular
expressions A, B, and C, results in a final search pattern
of ’/(A|B|C):’.
See the next section for additional extensions to the -f option. |
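
The pattern assembly described for -f and -e can be sketched in Python as follows. This is only an illustration of the documented behavior, not plgrep's actual code; the function name build_pattern is hypothetical, and skipping blank lines in patfile is an assumption.

```python
import re

def build_pattern(patfile_lines, e_template=None):
    """Join patfile lines with '|' and optionally splice them into a -e template."""
    # Assumption: blank lines in patfile are skipped.
    alternatives = [line.rstrip("\n") for line in patfile_lines if line.strip()]
    joined = "(" + "|".join(alternatives) + ")"
    if e_template is None:
        return joined
    # Every '%s' in the -e expression receives the joined alternation.
    return e_template.replace("%s", joined)

pattern = build_pattern(["A\n", "B\n", "C\n"], "/%s:")
print(pattern)
print(bool(re.search(pattern, "path/B:rest")))
```

With the three-line patfile from the example above, the sketch yields the same final pattern as the text, '/(A|B|C):'.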
|
-F |
treat the regular expression as in fgrep(1), i.e., no metacharacters. |
||
|
-G |
treat the regular expression as in grep(1) (the default). |
||
|
-w |
match the pattern only on word boundaries. |
||
|
-x |
match the pattern only if the whole line matches, i.e., match ’^pattern$’. |
|
-a, -aa |
|
read filenames from standard input instead of the command line. Under -a, any whitespace delimits a filename (useful for e.g. ’ls | plgrep -a pat’); under -aa, each line is taken as a single filename. This is useful in a pipeline, or for huge numbers of filenames. |
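
The difference between -a and -aa can be illustrated with a small Python sketch. The function name read_filenames is hypothetical, and ignoring empty lines under -aa is an assumption.

```python
def read_filenames(text, per_line=False):
    """Split standard-input text into filenames, -a style or -aa style."""
    if per_line:
        # -aa: each line is one filename, so embedded spaces survive.
        return [line for line in text.splitlines() if line]
    # -a: any run of whitespace separates filenames.
    return text.split()

stdin_text = "a.txt b.txt\nmy file.txt\n"
print(read_filenames(stdin_text))
print(read_filenames(stdin_text, per_line=True))
```

The second call shows why -aa is the safer choice when filenames may contain spaces.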
|
-A N[,N|-N...][delim] |
|
split each line on delimiter delim and grep only in the Nth field(s). Fields are numbered starting from 1 on the left. Field indexes can be given as N,N,N and/or N-N to specify a range. delim may be a single character or a perl(1) regular expression (in single quotes). If delim is not given, the line is split on whitespace, as in awk(1). |
|
-AA N[,N|-N...][delim] |
|
same as -A, except that leading empty fields are not counted; or in other words, leading instances of the delimiter are stripped before counting fields. |
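
The field selection performed by -A and -AA might look roughly like the Python sketch below. It is not plgrep's implementation: the names parse_field_spec and select_fields are hypothetical, and the field list and delimiter are passed as separate arguments because the exact way plgrep separates them within one option string is not specified here.

```python
import re

def parse_field_spec(spec):
    """Expand a spec such as '1,3-5' into a list of 1-based field indexes."""
    indexes = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            indexes.extend(range(int(low), int(high) + 1))
        else:
            indexes.append(int(part))
    return indexes

def select_fields(line, spec, delim=None, strip_leading=False):
    """Return the requested fields; strip_leading=True mimics -AA."""
    if delim is None:
        fields = line.split()  # awk-like split on runs of whitespace
    else:
        if strip_leading:
            # -AA: drop leading delimiters so empty leading fields are not counted.
            line = re.sub(r"^(?:%s)+" % delim, "", line)
        fields = re.split(delim, line)
    return [fields[i - 1] for i in parse_field_spec(spec) if 1 <= i <= len(fields)]

print(select_fields("::a:b:c", "3", ":"))                      # -A style
print(select_fields("::a:b:c", "1", ":", strip_leading=True))  # -AA style
```

Both calls select the same field, "a": under -A it is field 3 because the two empty leading fields are counted, while under -AA it is field 1.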
|
-b |
print the matching part(s) of each line in bold. The escape sequences from tput(1) are used to embolden text. |
||
|
-B |
print the entire file, printing matches in bold as in -b; useful for seeing matches in context. |
||
|
-C |
ignore C/C++ comments when matching. |
||
|
-d |
debug: print the regular expression (on stderr) after it has been massaged into its perl form. |
||
|
-D |
debug: print each filename (on stderr) as it is processed. |
|
-f :patfile |
|
If the -f patfile argument is preceded by a ’:’ character, patfile denotes a "color pattern" file, which directs plgrep to colorize its output. A color pattern file has one regex per line preceded by a color specification: |
| colorspec whitespace regex |
|
Regexps are treated in the usual fashion, but "colorspec" indicates how each is colored in the output. The colorspec consists of 0-2 digits optionally followed by "+". A single digit specifies the foreground text color; a second digit specifies the background color; and "+" indicates that the text should be bold (or on terminals that support it, brightened). If colorspec is omitted, the matched regex will be printed but not colorized. Colors are numbered by the terminal’s color palette; typically colors 0-9 are black, red, green, yellow, blue, magenta, cyan, white, white, and "default", respectively. Note: you may need to set your terminal to something like "xterm", and manipulate your path for a modern version of tput(1) (e.g. from the ncurses package) to make colorization work properly. You can test different color outputs using the shell commands |
| [tput setaf #;] [tput setab #;] [tput bold;] echo string; tput sgr0 |
|
where the setaf and setab tput arguments are single digits representing the text and background colors, respectively. |
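
A line of a color pattern file could be parsed along the lines of the Python sketch below. Several things here are assumptions rather than documented plgrep behavior: the names parse_color_line and sgr are hypothetical, an omitted colorspec is taken to mean the line begins with whitespace, and the mapping to ANSI SGR escape codes (30+N for foreground, 40+N for background, 1 for bold) stands in for whatever tput(1) emits on a given terminal.

```python
import re

def parse_color_line(line):
    """Split 'colorspec whitespace regex' into (fg, bg, bold, regex)."""
    m = re.match(r"^(\d{0,2})(\+?)\s+(.*)$", line)
    if m is None:
        raise ValueError("not a color pattern line: %r" % line)
    digits, plus, regex = m.groups()
    fg = int(digits[0]) if len(digits) >= 1 else None  # first digit: foreground
    bg = int(digits[1]) if len(digits) == 2 else None  # second digit: background
    return fg, bg, plus == "+", regex

def sgr(fg, bg, bold):
    """Build an ANSI SGR escape sequence (an assumption about xterm-like terminals)."""
    codes = []
    if bold:
        codes.append("1")
    if fg is not None:
        codes.append(str(30 + fg))
    if bg is not None:
        codes.append(str(40 + bg))
    return "\x1b[" + ";".join(codes) + "m"

fg, bg, bold, regex = parse_color_line("14+ error")
print(repr(sgr(fg, bg, bold)), regex)
```

Here the colorspec "14+" is read as foreground 1, background 4, bold, matching the digit order described above.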
|
-H |
opposite of -h; force printing of filename even if only one file is given. |
||
|
-I |
set perl’s input record separator, which by default is "\n". This is useful for processing multi-line records. |
|
-j string |
|
only affects the printing behavior of -p; ignored if -p is not given. If an input line has multiple matches, the matching subparts are concatenated using ’join($string,...)’ and output as a single line. For example, |
| plgrep -p -j , '[A-Z][a-z]+' |
|
will print a comma-separated list of all capitalized words on each line. |
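
The effect of -j on a single input line can be approximated in Python as shown below. The function name join_matches is hypothetical, and the sketch assumes that a line with no matches produces no output.

```python
import re

def join_matches(line, pattern, sep):
    """Join all matching subparts of a line with sep, as -p -j would print them."""
    matches = [m.group(0) for m in re.finditer(pattern, line)]
    return sep.join(matches) if matches else None

print(join_matches("Alice met Bob in Paris", r"[A-Z][a-z]+", ","))
```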
|
-k N |
under -o, tty interrupts will be sent to the child process rather than to plgrep itself. If N > 0, the child process will also be sent SIGALRM after N seconds. |
||
|
-K |
under -k, print a line to stderr indicating <INT> or <ALRM> when a child process is sent SIGINT or SIGALRM. The line is formatted as if it had been grepped from the file, e.g. using -t. |
||
|
-L N |
print names of files with exactly N matches (equivalent to -L N-N below). |
|
-L [N1]-[N2] |
|
like -LN, but print names of files with N1 <= N <= N2 matches. If given N1 > N2, plgrep swaps the values and treats them as -LN2-N1. Either or both of N1 and N2 may be zero. If omitted, N1 and N2 default to 1 and infinity, respectively, except in the case of -L-0, which is equivalent to -L0. -L1- is equivalent to -l. Note that -c does not cause -l or -L to be ignored, unlike UNIX grep. A file's name and match count are printed only if the count falls in the specified range. -L overrides -l. |
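
The range rules for -L (defaults, swapping, and the -L-0 special case) can be summarized in a short Python sketch. The function name parse_L is hypothetical and the code only mirrors the rules stated above; it is not taken from plgrep.

```python
import math

def parse_L(arg):
    """Turn a -L argument such as '3', '2-5', '1-' or '-0' into (low, high)."""
    if "-" not in arg:
        n = int(arg)
        return (n, n)  # -L N is the same as -L N-N
    low_text, high_text = arg.split("-", 1)
    if low_text == "" and high_text == "0":
        return (0, 0)  # special case: -L-0 is equivalent to -L0
    low = int(low_text) if low_text else 1          # N1 defaults to 1
    high = int(high_text) if high_text else math.inf  # N2 defaults to infinity
    if low > high:
        low, high = high, low  # reversed ranges are swapped
    return (low, high)

for arg in ("3", "1-", "-0", "5-2", "-4"):
    print(arg, parse_L(arg))
```

A file would then be listed when its match count N satisfies low <= N <= high.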
|
-m N[,N...] |
|
ignored unless -p is in effect. Print (or count) only the Nth numbered match(es) on each line, if present. For example, if -p -m 2 are in effect, print only the second instance of a pattern match on any given line. If -p -c -m 2 are in effect, count only lines which have at least 2 instances of the match. If -p -l -m 2 are in effect, list only the files containing a line which has at least 2 instances of the match. Multiple values of N can be specified, causing multiple different matches to be printed or counted per line, if they are present. |
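
Selecting the Nth match(es) on a line, as -p -m does, can be sketched in Python like this. The function name nth_matches is hypothetical, and the sketch covers only the printing case, not the -c or -l counting variants.

```python
import re

def nth_matches(line, pattern, wanted):
    """Return the wanted (1-based) matches on a line, skipping any that are absent."""
    found = [m.group(0) for m in re.finditer(pattern, line)]
    return [found[n - 1] for n in wanted if 1 <= n <= len(found)]

line = "Alice met Bob in Paris"
print(nth_matches(line, r"[A-Z][a-z]+", [2]))
print(nth_matches(line, r"[A-Z][a-z]+", [1, 3]))
```

A line with fewer than N matches contributes nothing for that N, which corresponds to the "if present" wording above.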