How to use regexmatch in Excel

Back to Regular Expressions (RegEx) once more. Regex functions can reduce formulas that used to be long and complicated to something very simple. Unfortunately, regex is not yet available among Excel's formulas.

This case addresses a question: how do you truncate text to a given number of words with a formula? For example, you may want to take only the first 10 words, or the first 25 words, from existing text data.

The sample data is shown below:

[Image: sample data]

With Excel formulas, answering this question would probably require a long combination of functions. Fortunately, Google Sheets already has a regex function, REGEXEXTRACT, which extracts data based on a regex pattern.

To take a word, use the pattern "[\w]*", which extracts one word from the data. To take two words, the pattern becomes "[\w]* [\w]*".

=REGEXEXTRACT(A2,"[\w]* [\w]*")

Starting to get the picture? To take five words, simply repeat the pattern five times. But what about 10 or 25 words: would you keep repeating it until it looks like "[\w]* [\w]* [\w]* [\w]* [\w]* [\w]* [\w]* [\w]*"?

Of course not. To generate the repetition, use another function, REPT, which repeats a text n times. The formula then becomes:

=REGEXEXTRACT(A2,REPT("[\w]* ",5))

The 5 in REPT means that five words are taken. This makes things much easier: to take 10 or 25 words, just change the value of n in REPT. Because each repetition of the pattern ends with a space, the extracted text ends with a trailing space.
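
The same idea can be tried outside a spreadsheet. The snippet below is a minimal Python sketch, not part of the original formula: the helper name `first_n_words` is hypothetical, and Python's `re` module merely stands in for REGEXEXTRACT. It repeats the one-word pattern n times, as REPT does, and strips the trailing space that the repeated pattern captures.

```python
import re

def first_n_words(text, n):
    """Return the first n space-separated words of text, or None if nothing matches."""
    # Counterpart of REPT("[\w]* ", n): the one-word pattern repeated n times.
    pattern = r"[\w]* " * n
    match = re.search(pattern, text)
    if match is None:
        return None
    # Every repetition ends with a space, so the match carries a trailing space.
    return match.group(0).rstrip()

sample = "the quick brown fox jumps over the lazy dog"
print(first_n_words(sample, 2))   # the quick
print(first_n_words(sample, 5))   # the quick brown fox jumps
```

One consequence of this pattern is worth knowing: since each repetition requires a space after the word, a text with exactly n words and no trailing space does not match at all, and the sketch returns None in that case.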

My issue is this: column A holds the primary key (the value always stays in A1 or another column A cell). Using a regular expression, I need to capture the value "BOLT", or cell text such as "Bolt, bolt | Both" that contains commas and |, which may be found anywhere in the row from column B to AQ, into a new Excel spreadsheet or a new column AR.

Example of the desired result:
A1 = primary key > value 001
AR = data captured from all cells in the row, B to AQ = Bolt, Bolt|bolt, bolt123, bolt-bolt

I hope I have explained this clearly. I am new to regular expressions, and the manual work is eating up all my time. If you can show an automated solution using regular expressions, I would be very thankful.


  • When using multi-line mode (enabled via the (?m) flag), only \n is recognized as a line terminator. Additionally, the (?d) flag is not supported and must not be used.

  • Case-insensitive matching (enabled via the (?i) flag) is always performed in a Unicode-aware manner. However, context-sensitive and locale-sensitive matching is not supported. Additionally, the (?u) flag is not supported and must not be used.

  • Surrogate pairs are not supported. For example, \uD800\uDC00 is not treated as U+10000 and must be specified as \x{10000}.

  • Boundaries (\b) are incorrectly handled for a non-spacing mark without a base character.

  • \Q and \E are not supported in character classes (such as [A-Z123]) and are instead treated as literals.

  • Unicode character classes (\p{prop}) are supported with the following differences:

    • All underscores in names must be removed. For example, use OldItalic instead of Old_Italic.

    • Scripts must be specified directly, without the Is, script= or sc= prefixes. Example: \p{Hiragana}

    • Blocks must be specified with the In prefix. The block= and blk= prefixes are not supported. Example: \p{Mongolian}

    • Categories must be specified directly, without the Is, general_category= or gc= prefixes. Example: \p{L}

    • Binary properties must be specified directly, without the Is prefix. Example: \p{NoncharacterCodePoint}

    regexp_extract_all(string, pattern)

    Returns the substring(s) matched by the regular expression pattern in string:

    SELECT regexp_extract_all('1a 2b 14m', '\d+'); -- [1, 2, 14]
    

    regexp_extract_all(string, pattern, group)

    Finds all occurrences of the regular expression pattern in string and returns the capturing group number group:

    SELECT regexp_extract_all('1a 2b 14m', '(\d+)([a-z]+)', 2); -- ['a', 'b', 'm']
    

    regexp_extract(string, pattern) → varchar

    Returns the first substring matched by the regular expression pattern in string:

    SELECT regexp_extract('1a 2b 14m', '\d+'); -- 1
    

    regexp_extract(string, pattern, group) → varchar

    Finds the first occurrence of the regular expression pattern in string and returns the capturing group number group:

    SELECT regexp_extract('1a 2b 14m', '(\d+)([a-z]+)', 2); -- 'a'
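
For readers who want to experiment without a query engine, the difference between the first-match and all-matches variants can be approximated in Python. This is only an analogy: Python's `re` module is not the engine behind these functions, its syntax differs in places, and the helper names `extract_first` and `extract_all` are made up for this sketch.

```python
import re

def extract_first(string, pattern, group=0):
    """First match of pattern in string (like regexp_extract); group 0 is the whole match."""
    match = re.search(pattern, string)
    return match.group(group) if match else None

def extract_all(string, pattern, group=0):
    """Every match of pattern in string (like regexp_extract_all), returned as a list."""
    return [m.group(group) for m in re.finditer(pattern, string)]

text = "1a 2b 14m"
print(extract_first(text, r"\d+"))               # 1
print(extract_first(text, r"(\d+)([a-z]+)", 2))  # a
print(extract_all(text, r"\d+"))                 # ['1', '2', '14']
print(extract_all(text, r"(\d+)([a-z]+)", 2))    # ['a', 'b', 'm']
```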
    

    regexp_like(string, pattern) → boolean

    Evaluates the regular expression pattern and determines if it is contained within string.

    This function is similar to the LIKE operator, except that the pattern only needs to be contained within string, rather than needing to match all of string. In other words, this performs a contains operation rather than a match operation. You can match the entire string by anchoring the pattern using ^ and $:

    SELECT regexp_like('1a 2b 14m', '\d+b'); -- true
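
The contains-versus-match distinction is easy to check by hand. The following Python sketch is an approximation only: `re.search` plays the role of the contains check, and the helper name `contains` is hypothetical. It shows how the same pattern behaves with and without the ^ and $ anchors.

```python
import re

def contains(string, pattern):
    """True if pattern matches anywhere in string (the 'contains' behaviour)."""
    return re.search(pattern, string) is not None

text = "1a 2b 14m"
print(contains(text, r"\d+b"))     # True: '2b' occurs inside the string
print(contains(text, r"^\d+b$"))   # False: the whole string is not digits followed by 'b'
print(contains("22b", r"^\d+b$"))  # True: the anchored pattern covers the entire string
```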
    

    regexp_replace(string, pattern) → varchar

    Removes every instance of the substring matched by the regular expression pattern from string:

    SELECT regexp_replace('1a 2b 14m', '\d+[ab] '); -- '14m'
    

    regexp_replace(string, pattern, replacement) → varchar

    Replaces every instance of the substring matched by the regular expression pattern in string with replacement. Capturing groups can be referenced in replacement using $g for a numbered group or ${name} for a named group. A dollar sign ($) may be included in the replacement by escaping it with a backslash (\$):

    SELECT regexp_replace('1a 2b 14m', '(\d+)([ab]) ', '3c$2 '); -- '3ca 3cb 14m'
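
As a rough cross-check, both forms can be reproduced with Python's `re.sub`. This is a sketch under one stated assumption: Python's replacement syntax is being used as a stand-in, so the group reference is written \g<2> rather than $2, and the variable names are illustrative.

```python
import re

text = "1a 2b 14m"

# Two-argument form: every match is removed (replaced with the empty string).
removed = re.sub(r"\d+[ab] ", "", text)
print(removed)    # 14m

# Three-argument form with a group reference; \g<2> plays the role of $2.
replaced = re.sub(r"(\d+)([ab]) ", r"3c\g<2> ", text)
print(replaced)   # 3ca 3cb 14m
```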
    

    regexp_replace(string, pattern, function) → varchar

    Replaces every instance of the substring matched by the regular expression pattern in string using function. The lambda expression function is invoked for each match with the capturing groups passed as an array. Capturing group numbers start at one; there is no group for the entire match (if you need this, surround the entire expression with parentheses).

    SELECT regexp_replace('new york', '(\w)(\w*)', x -> upper(x[1]) || lower(x[2])); --'New York'
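
A comparable callback-based replacement exists in Python, which may help in reasoning about what the lambda receives. In this sketch (the function name `capitalize_words` is hypothetical), `re.sub` passes a match object instead of an array, so the groups are read with `m.group(1)` and `m.group(2)`.

```python
import re

def capitalize_words(string):
    """Upper-case the first letter of each word and lower-case the rest."""
    # group(1) is the first letter and group(2) the remainder of the word,
    # corresponding to x[1] and x[2] in the lambda above.
    return re.sub(
        r"(\w)(\w*)",
        lambda m: m.group(1).upper() + m.group(2).lower(),
        string,
    )

print(capitalize_words("new york"))   # New York
```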
    

    regexp_split(string, pattern)

    Splits string using the regular expression pattern and returns an array. Trailing empty strings are preserved: