Most-used Perl regular expressions
Character class | Matches |
---|---|
[0-9] | any digit |
\d | any digit |
\D | any character that is not a digit |
[a-m] | any lowercase letter from a to m |
[a-z] | any lowercase letter from a to z |
[A-Z] | any uppercase letter from A to Z |
[^az] | any character except a or z |
\w | any word character, like [0-9a-zA-Z_] |
\W | any character that is not a word character |
\s | a whitespace character: a space, a tab, a newline or a carriage return |
\S | the opposite of \s: any character that is not whitespace |
Ex: [0-9a-zA-Z] matches any single digit or letter, uppercase or lowercase.
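A quick way to try these classes is to feed a test string to grep. The sketch below assumes GNU grep built with -P support; the sample string and the `chars` helper are made up for illustration.

```shell
# Hypothetical sample text, made up for this illustration.
sample='Room 42, Bob_7!'

# Hypothetical helper: print every single-character match of a class on one line.
# grep -o prints each match on its own line; tr removes the newlines.
chars() { printf '%s\n' "$sample" | grep -oP "$1" | tr -d '\n'; }

digits=$(chars '\d')            # 427
upper=$(chars '[A-Z]')          # RB
alnum=$(chars '[0-9a-zA-Z]')    # Room42Bob7 (no "_", unlike \w)
punct=$(chars '[^\w\s]')        # ,!  (neither word character nor whitespace)

printf '%s\n' "$digits" "$upper" "$alnum" "$punct"
```

Note that \w also accepts the underscore, which is why [0-9a-zA-Z] drops the "_" from Bob_7.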
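Quantifier | Repeats the preceding element |
---|---|
* | 0 or more times |
+ | 1 or more times |
? | 0 or 1 time |
{n} | exactly n times |
{n,} | at least n times |
{,n} | at most n times (write {0,n} on older Perl/PCRE versions, which read {,n} as literal text) |
{n,m} | between n and m times |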
Ex: [0-9a-zA-Z]+ matches a run of one or more letters or digits.
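To see what each quantifier accepts, the sketch below tests the same five strings against several patterns. It assumes GNU grep with -P support; `matching` is a hypothetical helper name, and -x is a standard grep option (not covered in these notes) that requires the whole line to match.

```shell
# Hypothetical helper: test five candidate strings against a pattern and
# print the ones that match in full, separated by spaces.
# -x makes grep accept a line only if the pattern matches all of it.
matching() { printf '%s\n' a ab abb abbb abbbb | grep -xP "$1" | paste -s -d ' ' -; }

matching 'ab*'      # a ab abb abbb abbbb
matching 'ab+'      # ab abb abbb abbbb
matching 'ab?'      # a ab
matching 'ab{2}'    # abb
matching 'ab{2,}'   # abb abbb abbbb
matching 'ab{0,2}'  # a ab abb
matching 'ab{2,3}'  # abb abbb
```

In each pattern the quantifier applies only to the b right before it, not to the whole "ab".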
Tip: “\” neutralises (escapes) a special character. Ex: “\.txt” makes the “.” match a literal dot instead of any character.
Example of usage with grep:
grep -oP '/documents/[\w\s]+\.txt'
-o prints only the part of each line that matches the regular expression
-P uses Perl-compatible regular expressions (PCRE)
To sum up, this command asks grep to print only the text that starts with /documents/, continues with one or more word characters (letters, digits, underscore) or whitespace characters, and ends with .txt
echo "<li class='pure-tree_link'><a href='/documents/Invoice_15_11_2020.pdf' target='_blank'>Invoice</a></li><li class='pure-tree_link'><a href='/documents/Report_15_01_2020.pdf' target='_blank'>Report</a></li><li class='pure-tree_link'><a href='/documents/flag_11dfa168ac8eb2958e38425728623c98.txt' target='_blank'>flag</a></li> </ul>" | grep -oP '/documents/[\w\s]+\.txt'
/documents/flag_11dfa168ac8eb2958e38425728623c98.txt
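As a variation, here is a minimal sketch that lists every document link whatever its extension, then shows why the backslash before the dot matters. It assumes GNU grep with -P support; the HTML is a shortened, hypothetical version of the page above, and the `notes_txt` string is made up.

```shell
# Shortened, hypothetical version of the HTML above.
html="<a href='/documents/Invoice_15_11_2020.pdf'>Invoice</a><a href='/documents/flag_11dfa168ac8eb2958e38425728623c98.txt'>flag</a>"

# \.\w+ accepts any extension, so both links are printed, one per line.
links=$(printf '%s\n' "$html" | grep -oP '/documents/[\w\s]+\.\w+')
printf '%s\n' "$links"

# An unescaped "." matches any character, so it also accepts the "_" here.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
loose=$(printf '%s\n' 'notes_txt' | grep -oP '\w+.txt' || true)    # notes_txt
strict=$(printf '%s\n' 'notes_txt' | grep -oP '\w+\.txt' || true)  # empty: no literal dot
printf 'loose=[%s] strict=[%s]\n' "$loose" "$strict"
```

The first grep prints the .pdf link and the .txt link; switching the pattern back to \.txt keeps only the flag file.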