Analytics supports regular expressions so you can create more flexible definitions for things like view filters, goals, segments, audiences, content groups, and channel groupings.

This article covers regular expressions in both Universal Analytics and Google Analytics 4.

In the context of Analytics, regular expressions are specific sequences of characters that broadly or narrowly match patterns in your Analytics data.

For example, if you wanted to create a view filter to exclude site data generated by your own employees, you could use a regular expression to exclude any data from the entire range of IP addresses that serve your employees. Let’s say those IP addresses range from 198.51.100.1 - 198.51.100.25. Rather than enter 25 different IP addresses, you could create a regular expression like 198\.51\.100\.\d* that matches the entire range of addresses. Because \d* accepts any run of digits, this expression also matches every other address that begins with 198.51.100.

Or if you wanted to create a view filter that included only campaign data from two different cities, you could create a regular expression like San Francisco|New York (San Francisco or New York).

Regex metacharacters

Wildcards
  .     Matches any single character (letter, number or symbol).
        Examples: 1. matches 10, 1A; 1.1 matches 111, 1A1
  ?     Matches the preceding character 0 or 1 times.
        Examples: 10? matches 1, 10
  +     Matches the preceding character 1 or more times.
        Examples: 10+ matches 10, 100
  *     Matches the preceding character 0 or more times.
        Examples: 1* matches 1, 10
  |     Creates an OR match. Do not use at the end of an expression.
        Examples: 1|10 matches 1, 10

Anchors
  ^     Matches the adjacent characters at the beginning of a string.
        Examples: ^10 matches 10, 100, 10x; ^10 does not match 110, 110x
  $     Matches the adjacent characters at the end of a string.
        Examples: 10$ matches 110, 1010; 10$ does not match 100, 10x

Groups
  ( )   Matches the enclosed characters in exact order anywhere in a string. Also used to group other expressions.
        Examples: (10) matches 10, 101, 1011; ([0-9]|[a-z]) matches any number or lower-case letter
  [ ]   Matches the enclosed characters in any order anywhere in a string.
        Examples: [10] matches 012, 120, 210
  -     Creates a range of characters within brackets to match anywhere in a string.
        Examples: [0-9] matches any number 0 through 9

Escape
  \     Indicates that the adjacent character should be interpreted literally rather than as a regex metacharacter.
        Examples: \. indicates that the adjacent dot should be interpreted as a period or decimal rather than as a wildcard; 216\.239\.32\.34 matches 216.239.32.34
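The IP-range example above can be tried outside Analytics. The following is a minimal sketch in PHP, assuming PHP's PCRE functions behave closely enough to the Analytics regex engine for these simple patterns. The tighter pattern and the helper name ipMatches() are illustrative assumptions, not part of the Analytics documentation.

```php
<?php
// Loose pattern from the text, anchored here: \d* accepts any run of digits (or none).
$loose = '/^198\.51\.100\.\d*$/';

// Tighter pattern (an assumption, not from the source): last octet 1 through 25 only.
$tight = '/^198\.51\.100\.([1-9]|1[0-9]|2[0-5])$/';

// Hypothetical helper: true when $ip matches $pattern.
function ipMatches(string $pattern, string $ip): bool
{
    return preg_match($pattern, $ip) === 1;
}

// Print how each sample address fares against both patterns.
foreach (['198.51.100.1', '198.51.100.25', '198.51.100.26', '198.51.100.'] as $ip) {
    printf(
        "%-15s loose=%s tight=%s\n",
        $ip,
        var_export(ipMatches($loose, $ip), true),
        var_export(ipMatches($tight, $ip), true)
    );
}
```

The loose pattern accepts all four sample strings, while the tighter one accepts only the first two. Which trade-off is right depends on whether addresses outside the 1 to 25 range could belong to other visitors.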
Tips

Default behavior between Universal Analytics and Google Analytics 4
By default, regular expressions in Universal Analytics properties are treated as a "partial match." The expression will be true if the pattern you provide is contained anywhere in the data. For example, if you provide the pattern "India" the regex matches "India", "Indian", "Indiana", "Indianapolis", and so on. You don't need to use metacharacters to achieve this partial match.
In a Google Analytics 4 property, the default regex is a "full match." The data must exactly match the pattern you provide. For example, the pattern "India" only matches "India." To make this regex act like a partial match, you must use metacharacters: "India.*" will return any value that begins with "India" and ends with anything (or nothing) else.

Use simple expressions
Keep your regular expressions simple. Simple regex is easier for another user to interpret and modify.

Match metacharacters
Use the backslash (\) to escape regex metacharacters when you need those characters to be interpreted literally. For example, if you use a dot as the decimal separator in an IP address, escape it with a backslash (\.) so that it isn’t interpreted as a wildcard.

Use metacharacters to limit the match
Regular expressions are greedy by nature: if you don’t tell them not to, they match what you specify plus any adjacent characters. For example, in a partial match, site matches mysite, yoursite, theirsite, parasite: any string that contains “site”. If you need to make a specific match, construct your regex accordingly. For example, if you need to match only the string “site”, then construct your regex so that “site” is both the beginning and the end of the string: ^site$.

Regular expressions, commonly known as regex (plural: regexes), are sequences of characters that describe a search pattern in the form of a text string. They are used in programming to match loosely defined patterns in text. Regexes are sometimes described as a mini programming language with a pattern notation that allows users to parse text strings. The exact sequence of characters to be found is often unpredictable beforehand, so a regex helps fetch the required strings based on a pattern definition. A regular expression is a compact way of describing a string pattern that matches a particular piece of text.

PHP, an open-source language commonly used for website creation, provides regular expression functions as an important tool. Like PHP, many other programming languages have their own implementation of regular expressions. The same is true of other applications, which have their own regex support with varying syntaxes. Many modern languages and tools apply regexes to very large files and strings. Let us look into some of the advantages and uses of regular expressions in our applications.

Advantages and uses of Regular expressions:
We cannot cover everything under this topic, but let us look into some of the major regular expression concepts. The following table shows some regular expressions and the strings that match each pattern.

Regular Expression      Matches
geeks                   The string “geeks”
^geeks                  A string that starts with “geeks”
geeks$                  A string that has “geeks” at the end
^geeks$                 A string that consists of “geeks” alone
[abc]                   a, b, or c
[a-z]                   Any lowercase letter
[^A-Z]                  Any character that is NOT an uppercase letter
(gif|png)               Either “gif” or “png”
[a-z]+                  One or more lowercase letters
^[a-zA-Z0-9]{1,}$       A string made up of one or more letters or digits
([ax])([by])            ab, ay, xb, xy
[^A-Za-z0-9]            Any character other than a letter or a digit
([A-Z]{3}|[0-9]{5})     Three uppercase letters or five digits

Note: Complex search patterns can be created by applying some basic regular expression rules. Several characters that also serve as arithmetic operators, such as +, ^ and -, are used in regular expressions to build more complex patterns.

Operators in Regular Expression: Let us look into some of the operators in PHP regular expressions.

Operator    Description
^           Denotes the start of the string.
$           Denotes the end of the string.
.           Denotes any single character except a newline (by default).
()          Denotes a group of expressions.
[]          Matches any one of the enclosed characters, for example [xyz] means x, y or z.
[^]         Matches any character that is not enclosed, for example [^abc] means NOT a, b or c.
- (dash)    Denotes a character range inside brackets, for example [a-z] means a through z.
| (pipe)    The logical OR, for example x|y means x OR y.
?           Denotes zero or one of the preceding character or group.
*           Denotes zero or more of the preceding character or group.
+           Denotes one or more of the preceding character or group.
{n}         Denotes exactly n of the preceding character or group, for example n{2}.
{n,}        Denotes at least n of the preceding character or group, for example n{2,}.
{n,m}       Denotes at least n but not more than m, for example n{2,4} means 2 to 4 of n.
\           Denotes the escape character.

The quantifier braces must not contain spaces: PCRE, the engine behind PHP's preg_ functions, has traditionally treated {2, 4} as literal text rather than as a quantifier.

Special Character Classes in Regular Expressions: Let us look into some of the special characters used in regular expressions.

Special Character    Meaning
\n                   A new line.
\r                   A carriage return.
\t                   A tab.
\v                   A vertical tab (in PCRE patterns, \v matches any vertical whitespace character).
\f                   A form feed.
\xxx                 The octal character xxx.
\xhh                 The hex character hh.

Shorthand Character Sets: Let us look into some shorthand character sets available.

Shorthand    Meaning
\s           Matches whitespace characters such as space, newline or tab.
\d           Matches any digit from 0 to 9.
\w           Matches word characters: all lower and upper case letters, digits and underscore.

Predefined functions or Regex library: Let us look into a quick cheat sheet of the pre-defined functions for regular expressions in PHP. PHP provides programmers with many useful functions for working with regular expressions. The functions listed below match case-sensitively unless stated otherwise.

Function           Definition
preg_match()       Searches a string for a pattern. It returns 1 if the pattern matches, 0 if it does not, and false on error.
preg_match_all()   Searches a string for all occurrences of a pattern and returns the number of matches found.
ereg_replace()     Searches for a POSIX pattern and replaces each match with the replacement string.
eregi_replace()    Behaves like ereg_replace(), except that the search is not case sensitive.
preg_replace()     Searches a string for a PCRE pattern and replaces each match with the replacement string.
preg_split()       Splits a string by a regular expression and returns the pieces as an array.
preg_grep()        Searches an array and returns the elements that match the regular expression pattern.
preg_quote()       Takes a string and puts a backslash in front of every character that is part of the regular expression syntax.
ereg()             Searches a string for a POSIX pattern and returns true if it is found, otherwise false.
eregi()            Behaves like ereg(), except that the search is not case sensitive.

Note: The ereg(), eregi(), ereg_replace() and eregi_replace() functions, along with split(), were deprecated in PHP 5.3.0 and removed in PHP 7.0.0. Use the preg_ functions instead; for case-insensitive matching, add the i modifier to the pattern.
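A few of the preg_ functions in the table are easier to follow with their return values in view. The following is a minimal sketch; the file names, the search text and the other sample values are all made up for illustration.

```php
<?php
// All values below are made-up sample data.
$files = ['logo.png', 'photo.gif', 'notes.txt', 'icon.PNG'];

// preg_grep(): keep the array entries that match; the original keys are preserved.
// The i modifier makes the match case-insensitive.
$images = preg_grep('/\.(gif|png)$/i', $files);

// preg_quote(): escape regex metacharacters so that user text is matched literally.
$needle  = 'google.com';
$literal = '/' . preg_quote($needle, '/') . '/';

// preg_match(): returns 1 on a match, 0 on no match, false on error.
// The captured groups are written to the third argument.
$found = preg_match('/^([a-z]+)-([0-9]{5})$/', 'geeks-12345', $groups);

// preg_split(): split on one or more whitespace characters.
$words = preg_split('/\s+/', "regular   expressions\tin PHP");

print_r($images);
var_dump(preg_match($literal, 'googleXcom')); // the escaped dot is no longer a wildcard
print_r($groups);
print_r($words);
```

Passing the delimiter as the second argument of preg_quote() matters whenever the quoted text may itself contain a slash.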
Example 1:
<?php
// Declare a regular expression: letters and spaces only
$regex = '/^[a-zA-Z ]*$/';

// Declare a string (assumed sample value)
$nameString = 'John Developer';

// Use preg_match() to test the string against the pattern
if (preg_match($regex, $nameString)) {
    echo "Name string matching with regular expression";
} else {
    echo "Name string not matching with regular expression";
}
?>
Output: Name string matching with regular expression

Example 2:
<?php
// Declare a regular expression: capture the text between <b> and </b>
// (the U modifier makes the quantifier ungreedy)
$regex = "/<b>(.*)<\/b>/U";

// Declare a string (assumed sample value)
$inputString = "Name: <b>John</b> Position: <b>Developer</b>";

// Use preg_match_all() to collect every match
preg_match_all($regex, $inputString, $output);

// $output[1] holds the text captured by the first group
echo $output[1][0] . " " . $output[1][1];
?>
Output: John Developer

Example 3:
<?php
// Declare a regular expression: one or more digits
$regex = '/[0-9]+/';

// Declare the original string and the replacement (assumed sample values)
$original = "Completed graduation in 2004";
$replaceWith = "2002";

// Use preg_replace(); ereg_replace() is not available in PHP 7 and later
$original = preg_replace($regex, $replaceWith, $original);
echo $original;
?>
Output: Completed graduation in 2002

Example 4:
<?php
// Declare a string
$ip = "134.645.478.670";

// Declare a regular expression: a literal dot
$regex = "/\./";

// Use preg_split() to split the string at every dot
$output = preg_split($regex, $ip);

echo "$output[0] ";
echo "$output[1] ";
echo "$output[2] ";
echo "$output[3]";
?>
Output: 134 645 478 670

Metacharacters: Two kinds of characters are used in regular expressions: regular characters and metacharacters. Regular characters have a ‘literal’ meaning, and metacharacters have a ‘special’ meaning in a regular expression.

Metacharacter   Description and example
.               Matches any single character other than a new line. /./ matches any string that contains at least one such character.
^               Matches the beginning of the string. /^geeks/ matches any string that starts with geeks.
$               Matches the end of the string. /com$/ matches a string ending with com, for example google.com.
*               Matches zero or more of the preceding character. /com*/ matches co, com, comm and so on, and therefore any string that contains “co”, such as commute, computer or compromise.
+               Matches one or more of the preceding character. /z+oom/ matches zoom and zzoom.
\               Escapes metacharacters in a regex. /google\.com/ treats the dot as a literal value, not as a metacharacter.
a-z             Inside brackets, matches lower case letters. /[a-z]+/ matches geeks.
A-Z             Inside brackets, matches upper case letters. /[A-Z]+/ matches GEEKS.
0-9             Inside brackets, matches any digit between 0 and 9. /[0-5]/ matches 0, 1, 2, 3, 4, 5.
[…]             Matches any one character from the character class. /[pqr]/ matches p, q or r.

Other Examples: The pattern /^[.a-z0-9A-Z-]+@[a-z0-9A-Z]+\.[a-z]{2,6}$/ can be read in three parts.

Regular expression     Meaning
^[.a-z0-9A-Z-]+        Matches, at the start of the string, one or more dots, dashes, lower case letters, digits between 0 and 9 and upper case letters.
@[a-z0-9A-Z]+          Matches the @ symbol followed by one or more lower case letters, digits between 0 and 9 and upper case letters.
\.[a-z]{2,6}$/         Escapes the dot and then matches between 2 and 6 lower case letters at the end of the string.

Note: Inside a character class, a dash between two characters forms a range, so [.-a] would mean every character from the dot through a. To match a literal dash, place it first or last in the class, as in [.a-z0-9A-Z-].
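The behaviour of * and of the three-part pattern can be checked with a short script. This is a minimal sketch: the sample strings are made up, the helper looksLikeEmail() is hypothetical, and the combined pattern (with the dash placed last in the class so that it is a literal dash) is an illustration rather than a complete email validator.

```php
<?php
// * applies only to the preceding item: /com*/ is "co" followed by zero or more "m".
$samples = ['co', 'com', 'commute', 'cat'];
foreach ($samples as $s) {
    echo $s, ' => ', preg_match('/com*/', $s), "\n";
}

// The three fragments joined into one pattern. The dash sits last in the
// class so that it is a literal dash rather than a range operator.
$email = '/^[.a-z0-9A-Z-]+@[a-z0-9A-Z]+\.[a-z]{2,6}$/';

// Hypothetical helper, for illustration only.
function looksLikeEmail(string $pattern, string $candidate): bool
{
    return preg_match($pattern, $candidate) === 1;
}

var_dump(looksLikeEmail($email, 'john.developer@example.com'));
var_dump(looksLikeEmail($email, 'john@example'));   // no dot and top-level part
var_dump(looksLikeEmail($email, 'john@example.c')); // top-level part too short
```

Only "cat" fails the first pattern, because it does not contain "co". Of the three candidate addresses, only the first satisfies the combined pattern.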
POSIX Regular Expressions: POSIX regular expressions define named character classes in addition to ordinary bracket expressions. Complex expressions are created by combining various elements or operators. The most basic regex is one that matches a single character. A POSIX class name such as [:alpha:] is only valid inside a bracket expression, so it is written [[:alpha:]]; like any unanchored pattern, it then matches any string containing at least one such character. PHP's PCRE functions accept these classes. Let us look into some of the POSIX regular expressions.

Regex          Meaning
[0-9]          Matches a digit from 0 through 9.
[a-z]          Matches any lowercase letter from a through z.
[A-Z]          Matches any uppercase letter from A through Z.
[a-zA-Z]       Matches any letter. The form [a-Z] is not a valid range, because a comes after Z in ASCII order, and PCRE rejects it.
[[:lower:]]    Matches any lower case letter.
[[:upper:]]    Matches any upper case letter.
[[:alpha:]]    Matches any alphabetic character, that is, a letter from a-z or A-Z.
[[:alnum:]]    Matches any alphanumeric character, that is, a digit (0-9) or a letter (a-z, A-Z).
[[:digit:]]    Matches any digit from 0 to 9.
[[:xdigit:]]   Matches any hexadecimal digit.
[[:punct:]]    Matches any punctuation symbol.
[[:blank:]]    Matches blank characters: space and tab.
[[:space:]]    Matches any whitespace character, including spaces, tabs and line breaks.
[[:cntrl:]]    Matches any control character.
[[:graph:]]    Matches any visible (printed) character other than spaces and control characters.
[[:print:]]    Matches any printed character and the space, but not control characters.
[[:word:]]     Matches any word character: digits, letters and underscore. This class is a Perl/PCRE extension rather than part of POSIX.

Quantifiers in Regular Expression: Quantifiers are special characters that specify how many times the preceding character, bracketed character class or group may occur. A quantifier is greedy by default, meaning it matches as much as possible; appending ? makes it lazy, so that it matches as little as possible. Let us look into some of the concepts and examples of quantifiers.

Quantifier   Meaning
a+           Matches a string containing at least one a.
a*           Matches zero or more a’s.
a?           Matches zero or one a.
a{x}         Matches the letter ‘a’ exactly x times.
a{2,3}       Matches a string containing two or three consecutive a’s.
a{2,}        Matches a string containing at least two consecutive a’s.
a{2}         Matches a string containing exactly two consecutive a’s.
a{0,y}       Matches not more than y consecutive a’s.
a$           Matches any string with ‘a’ at the end of it.
^a           Matches any string with ‘a’ at the beginning of it.
[^a-zA-Z]    Matches any single character that is not a letter from a to z or A to Z.
a.a          Matches a string containing a, then any character, and then another a.
^.{3}$       Matches any string having exactly three characters.

Note: Apart from the rows that use ^ or $, the patterns above are not anchored, so each one matches any string that contains the described sequence. Add ^ and $ to require that the whole string matches.
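The quantifier rules can be observed directly with preg_match(). The following is a minimal sketch with made-up sample strings. The patterns are anchored with ^ and $ so that the quantifier has to account for the whole string, and the last part contrasts a greedy quantifier with its lazy form.

```php
<?php
// Anchored patterns: the quantifier must cover the entire subject string.
$cases = [
    '/^a{2,3}$/' => ['a', 'aa', 'aaa', 'aaaa'],
    '/^a{2,}$/'  => ['a', 'aa', 'aaaa'],
    '/^.{3}$/'   => ['ab', 'abc', 'abcd'],
];
foreach ($cases as $pattern => $subjects) {
    foreach ($subjects as $s) {
        // Third column is 1 for a match and 0 for no match.
        printf("%-12s %-5s %d\n", $pattern, $s, preg_match($pattern, $s));
    }
}

// Greedy versus lazy: the same quantifier with and without a trailing ?.
$html = '<b>John</b> <b>Developer</b>';
preg_match('/<b>.+<\/b>/', $html, $greedy);
preg_match('/<b>.+?<\/b>/', $html, $lazy);
echo $greedy[0], "\n"; // runs to the last </b>
echo $lazy[0], "\n";   // stops at the first </b>

// POSIX class names are only valid inside a bracket expression.
var_dump(preg_match('/^[[:alpha:]]+$/', 'geeks'));
var_dump(preg_match('/^[[:alpha:]]+$/', 'geeks4'));
```

The greedy form captures the whole sample string, while the lazy form captures only the first bold element.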
Commonly known regular expression engines:
Conclusion: A regular expression is a pattern that describes a set of strings; it can be thought of as a pattern-matching algorithm expressed in a single line. Regular expressions are very useful in programming for validation checks and for recognizing specific templates. PHP provides many built-in functions that support regular expressions. Metacharacters help in creating complex patterns.