IPtracking.net

Regex Cheatsheet

In this blog post I am going to go through an easy and quick way to learn regex.

Regular expressions are extremely useful for extracting information from text. In regex everything is a character, and we write patterns that match those characters to extract the required information.

  1. abc - any ASCII character can be matched by writing the same character in the regex (apart from metacharacters such as . or *, which are covered below).

    • The text “abc123” is matched in full by the regex “abc123”.
    • Given the two texts “abc123” and “abc123xyz”, both are matched by the regex “abc123”. The “xyz” in the second text is not part of the match.
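
The literal match above can be tried out with a short sketch. It assumes Python's built-in re module, which the original text does not name; search() looks for the pattern anywhere in the text, while fullmatch() requires the whole text to match.

```python
import re

texts = ["abc123", "abc123xyz"]
pattern = re.compile(r"abc123")

# search() finds the pattern anywhere in the text.
found = [pattern.search(t).group() for t in texts]
# fullmatch() succeeds only when the pattern covers the entire text.
full = [pattern.fullmatch(t) is not None for t in texts]

print(found)  # ['abc123', 'abc123']
print(full)   # [True, False]
```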

  2. \d and \D - \d matches any digit and \D matches any character that is not a digit (letters, but also spaces and punctuation).

    • In “abc123” the regex “\d” matches the first digit, 1.
    • In “abc123” the regex “\D” matches the first non-digit, a.
    • In “abc123” the regex “\D\D\D\d\d\d” matches the full text.
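
A small sketch of the digit classes, assuming Python's re module. The extra text "-1" is a made-up example to show that \D means "not a digit" rather than "a letter".

```python
import re

text = "abc123"

first_digit = re.search(r"\d", text).group()          # first digit in the text
first_non_digit = re.search(r"\D", text).group()      # first non-digit in the text
whole = re.fullmatch(r"\D\D\D\d\d\d", text).group()   # three non-digits, then three digits

# \D is "anything except a digit", so punctuation matches too.
punct = re.search(r"\D", "-1").group()

print(first_digit, first_non_digit, whole, punct)  # 1 a abc123 -
```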

  3. dot or . - the dot is a wildcard: it matches any single character (except a newline, by default). A literal dot character is matched with \.

    • The text “abc123” is matched in full by the regex “......” (six dots).
    • The regex “.” on its own matches only the “a” in “abc123”.
    • To match “abc.” use the regex “...\.”; the regex “.\.” matches “c.”.
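
To see the difference between the wildcard dot and an escaped dot, here is a minimal sketch assuming Python's re module. The text "abcd" is an extra, made-up input used only to show why the escape matters.

```python
import re

six_any = re.fullmatch(r"......", "abc123").group()   # six wildcards cover six characters
first_char = re.search(r".", "abc123").group()        # a single dot matches one character
with_dot = re.fullmatch(r"...\.", "abc.").group()     # \. matches a literal dot
tail = re.search(r".\.", "abc.").group()              # any character followed by a dot

# Without the escape, the last dot would accept any character.
loose = re.fullmatch(r"....", "abcd") is not None
strict = re.fullmatch(r"...\.", "abcd") is not None

print(six_any, first_char, with_dot, tail)  # abc123 a abc. c.
print(loose, strict)                        # True False
```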

  4. [abc] - the dot metacharacter is so powerful that it matches everything. To match only a, b, or c, use the regex [abc]; it matches exactly one character from the set.

    • In “abc123” the regex [abc] first matches a.
    • In “abc123” the regex [123] first matches 1.
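
A quick sketch of character sets, assuming Python's re module. findall() is used here to list every single-character match, not only the first one.

```python
import re

text = "abc123"

first_letter = re.search(r"[abc]", text).group()  # first character from the set a, b, c
first_number = re.search(r"[123]", text).group()  # first character from the set 1, 2, 3
every_letter = re.findall(r"[abc]", text)         # each match is a single character

print(first_letter, first_number)  # a 1
print(every_letter)                # ['a', 'b', 'c']
```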

  5. [^abc] - matches any single character except a, b, or c. This is used to exclude specific characters.

    • Given “abc123” and “xbc123”, the regex [b] matches both texts, while [^x][b] rejects the second text because the character before b is x. Only the first text is matched.
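
The exclusion example above can be checked with a short sketch, again assuming Python's re module.

```python
import re

texts = ["abc123", "xbc123"]

# [b] alone is found in both texts.
plain = [re.search(r"[b]", t) is not None for t in texts]
# [^x][b] needs a character other than x directly before the b.
negated = [re.search(r"[^x][b]", t) is not None for t in texts]
matched = re.search(r"[^x][b]", "abc123").group()

print(plain)    # [True, True]
print(negated)  # [True, False]
print(matched)  # ab
```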

  6. [a-z] - a character range matches any single character between the two endpoints.

    • In “abc123” the regex [a-z] first matches a character between a and z; in this text it is a.
    • In “abc123” the regex [1-9] first matches a digit between 1 and 9; in this text it is 1.
    • \w matches any alphanumeric character or underscore.
    • \W matches any character that \w does not.
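
A minimal sketch of ranges and the word classes, assuming Python's re module. The text "a_1!" is a made-up input chosen to show that \w also accepts the underscore.

```python
import re

text = "abc123"

first_lower = re.search(r"[a-z]", text).group()  # first character in the range a to z
first_1_to_9 = re.search(r"[1-9]", text).group() # first character in the range 1 to 9

word_chars = re.findall(r"\w", "a_1!")      # letters, digits and underscore
non_word_chars = re.findall(r"\W", "a_1!")  # everything else

print(first_lower, first_1_to_9)  # a 1
print(word_chars)                 # ['a', '_', '1']
print(non_word_chars)             # ['!']
```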

  7. {m} - matches exactly m repetitions of the preceding character or group.

    • Given “abcccc123” and “abc123”, the regex c{4} matches c repeated 4 times, so only the first text is matched.
    • x{2,4} matches between 2 and 4 repetitions of x.
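
Here is a small sketch of counted repetition, assuming Python's re module and using c as the repeated character. The text "abccccc123" (five c's) is a made-up input to show that {2,4} stops at four.

```python
import re

texts = ["abcccc123", "abc123"]

# c{4} needs exactly four consecutive c characters.
exact = [re.search(r"c{4}", t) is not None for t in texts]

# c{2,4} is greedy: it takes as many c's as it can, up to four.
ranged = re.search(r"c{2,4}", "abccccc123").group()

print(exact)   # [True, False]
print(ranged)  # cccc
```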

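  8. “*” - matches 0 or more repetitions of any of the above regexes.

    • In “aaabc123” the regex \D* matches “aaabc”.
    • In “abc11123” the regex \d* matches the digit run “11123”. Because * also accepts zero repetitions, it matches the empty string wherever there is no digit, for example at the start of this text.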
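
The zero-or-more behaviour is easy to misread, so here is a sketch assuming Python's re module. match() is used because it only tries the start of the text, which makes the empty match visible; the pattern c\d* is an extra illustration.

```python
import re

# Greedy: \D* takes every leading non-digit.
non_digits = re.match(r"\D*", "aaabc123").group()

# The text starts with a letter, so \d* succeeds with zero digits.
empty = re.match(r"\d*", "abc11123").group()

# Placed after a literal c, \d* picks up the whole digit run.
run = re.search(r"c\d*", "abc11123").group()

print(repr(non_digits))  # 'aaabc'
print(repr(empty))       # ''
print(repr(run))         # 'c11123'
```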

  9. “+” - matches one or more repetitions of any of the above regexes.

    • In “aaabc123” the regex \D+ matches “aaabc”, and a+ matches “aaa”.
    • In “abc11123” the regex \d+ matches “11123”.
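
The one-or-more quantifier is written + in Python's re module (assumed here). Unlike *, it fails when there is not at least one occurrence, which the last line of this sketch shows with a made-up text.

```python
import re

non_digits = re.search(r"\D+", "aaabc123").group()  # whole run of non-digits
only_a = re.search(r"a+", "aaabc123").group()       # whole run of a's
digits = re.search(r"\d+", "abc11123").group()      # whole run of digits

# + needs at least one digit, so a text without digits gives no match.
no_match = re.search(r"\d+", "abc") is None

print(non_digits, only_a, digits)  # aaabc aaa 11123
print(no_match)                    # True
```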

  10. ? - this metacharacter matches either zero or one of the preceding character or group.

    • Given “abc123” and “123abc”, the regex \D? matches the a at the start of the first text. At the start of the second text it matches only the empty string, because the first character is a digit.
    • Since ? accepts zero occurrences, it is mostly used to make one part of a longer pattern optional.
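
A sketch of the optional quantifier, assuming Python's re module. The pattern abc?123 and the texts "ab123" and "abcc123" are made up for illustration; they show ? making a single character optional.

```python
import re

# At the start of the text, \D? takes one non-digit if there is one.
first = re.match(r"\D?", "abc123").group()
second = re.match(r"\D?", "123abc").group()

# c? means the c may appear once or not at all.
optional_c = [
    re.fullmatch(r"abc?123", t) is not None
    for t in ["abc123", "ab123", "abcc123"]
]

print(repr(first), repr(second))  # 'a' ''
print(optional_c)                 # [True, True, False]
```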

  11. \s - matches any whitespace character (space, tab, newline).

    • In “ abc” and “   abc”, the regex \s finds a match in both texts, whether the text has a single space or multiple spaces.

  12. \S - matches any non-whitespace character. It is just the opposite of \s.
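
The two whitespace classes can be compared with a short sketch, assuming Python's re module. The input strings are made up for illustration.

```python
import re

one_space = re.findall(r"\s", " abc")          # each match is one whitespace character
space_run = re.search(r"\s+", "   abc").group()  # \s+ takes the whole run of spaces
words = re.findall(r"\S+", "ab c\td")          # runs of non-whitespace characters

print(one_space)        # [' ']
print(repr(space_run))  # '   '
print(words)            # ['ab', 'c', 'd']
```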

  13. ^…$ - the ^ (hat) and $ (dollar) signs anchor the pattern to the start and end of the line.

    • Given “abc123” and “123abc”, the regex abc matches both texts. To match only the first text, use ^abc.
    • To match only the second text, use abc$.
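
The anchor examples above can be verified with a minimal sketch, assuming Python's re module.

```python
import re

texts = ["abc123", "123abc"]

anywhere = [re.search(r"abc", t) is not None for t in texts]   # abc anywhere in the text
at_start = [re.search(r"^abc", t) is not None for t in texts]  # abc only at the start
at_end = [re.search(r"abc$", t) is not None for t in texts]    # abc only at the end

print(anywhere)  # [True, True]
print(at_start)  # [True, False]
print(at_end)    # [False, True]
```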

  14. (…) - to extract information from the text, use a capture group.

    • In abc123 the regex (abc) extracts only abc from the text.
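
A sketch of reading a capture group, assuming Python's re module. The pattern (abc)\d+ is an extra illustration: group(0) is the whole match, while group(1) is only what the parentheses captured.

```python
import re

m = re.search(r"(abc)\d+", "abc123")

whole_match = m.group(0)  # everything the pattern matched
captured = m.group(1)     # only the part inside the parentheses

print(whole_match)  # abc123
print(captured)     # abc
```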

  15. (a(bc)) - to capture a sub-group, nest one group inside another.

    • To extract abc123 and 123 separately from abc123, use the regex (\D+(\d+)).
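
Nested groups are numbered by their opening parenthesis, left to right. The sketch below assumes Python's re module and adds + quantifiers to \D and \d, since each of those matches only one character on its own.

```python
import re

m = re.fullmatch(r"(\D+(\d+))", "abc123")

outer = m.group(1)   # outer group: non-digits followed by digits
inner = m.group(2)   # inner group: the digits only
both = m.groups()    # all groups as a tuple

print(outer)  # abc123
print(inner)  # 123
print(both)   # ('abc123', '123')
```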

  16. (.*) - to capture all of the text, use the capture-all regex.
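
One thing worth knowing about the capture-all regex: in Python's re module (assumed here) the dot does not match a newline by default, so (.*) stops at the end of the first line unless the re.DOTALL flag is passed. The multi-line text is a made-up example.

```python
import re

single_line = re.search(r"(.*)", "abc123").group(1)

# By default the dot stops at a newline.
first_line = re.search(r"(.*)", "abc\n123").group(1)

# With re.DOTALL the dot also matches newlines.
everything = re.search(r"(.*)", "abc\n123", re.DOTALL).group(1)

print(repr(single_line))  # 'abc123'
print(repr(first_line))   # 'abc'
print(repr(everything))   # 'abc\n123'
```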

  17. (abc|def) - the logical OR operator captures either abc or def from the text.

    • To match both abc123 and xyz123, use the regex (abc|xyz).
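
To round off, a sketch of alternation inside a group, assuming Python's re module. The trailing \d+ and the third text "def123" are additions for illustration; they show a text that neither alternative accepts.

```python
import re

pattern = re.compile(r"(abc|xyz)\d+")
texts = ["abc123", "xyz123", "def123"]

prefixes = []
for t in texts:
    m = pattern.fullmatch(t)
    # Store the captured alternative, or None when neither one matches.
    prefixes.append(m.group(1) if m else None)

print(prefixes)  # ['abc', 'xyz', None]
```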