Using string methods and regexes in Python
Replace Occurrences of Substrings in Strings in Python
- str.replace()
- re.sub()
- re.subn()
By using the above-mentioned methods, let’s see how to replace substrings in strings.
1. Replace all occurrences of substring
Using ‘str.replace()’
Syntax: str.replace(old, new, count) (count is optional)
Example 1: Replace substring “two” with “one”
s1="one apple,two orange,two banana"
s2=s1.replace("two","one")
print (s2)
#Output:one apple,one orange,one banana
By default, str.replace() replaces all occurrences of “two” with “one”.
2. Replace only the first occurrence of a substring
Using ‘str.replace()’
Example 1: Replace substring “two” with “one” for the first occurrence only.
To replace only the first occurrence, pass count=1. Likewise, to replace only the first two occurrences, pass count=2.
s1="one apple,two orange,two banana"
s2=s1.replace("two","one",1)
print (s2)
#Output:one apple,one orange,two banana
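To see how count changes the result, here is a minimal sketch that runs the same input with no count, count=1 and count=2. The variable names are illustrative only.

```python
# Same input string as the str.replace() examples; names are illustrative.
s1 = "one apple,two orange,two banana"

replace_all = s1.replace("two", "one")       # no count: every occurrence
replace_first = s1.replace("two", "one", 1)  # count=1: first occurrence only
replace_two = s1.replace("two", "one", 2)    # count=2: first two occurrences

print(replace_all)    # one apple,one orange,one banana
print(replace_first)  # one apple,one orange,two banana
print(replace_two)    # one apple,one orange,one banana
print(s1)             # one apple,two orange,two banana (original is unchanged)
```

Strings are immutable, so str.replace() always returns a new string and leaves s1 as it was.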
3. Case-insensitive replacement.
Using ‘re.sub()’
Syntax:
re.sub(pattern, repl, string, count=0, flags=0)
Pass flags=re.IGNORECASE to make the match case-insensitive.
Example 1: Replace “An” or “an” with “one”.
import re
s1="An apple,an avocado"
pattern = re.compile('an', re.IGNORECASE)
s2=pattern.sub("one",s1)
print (s2)
#Output:one apple,one avocado
- pattern = re.compile('an', re.IGNORECASE) → defines a pattern that matches the substring “an”. The re.IGNORECASE flag makes the match case-insensitive, so the pattern matches “an”, “AN”, “An” and “aN”.
- s2=pattern.sub("one",s1) → replaces every match of the pattern in “s1” with “one”.
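The flag can also be passed straight to re.sub() through its flags parameter, without compiling the pattern first. A minimal sketch, reusing the string from the example (the names s2 and s3 are illustrative):

```python
import re

s1 = "An apple,an avocado"

# Passing flags= to re.sub() avoids a separate re.compile() step.
s2 = re.sub("an", "one", s1, flags=re.IGNORECASE)
print(s2)  # one apple,one avocado

# Without the flag, only the lowercase "an" matches.
s3 = re.sub("an", "one", s1)
print(s3)  # An apple,one avocado
```

Compiling the pattern is mostly worthwhile when the same pattern is reused many times; for a one-off replacement the direct call is shorter.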
Example 2: Case-insensitive replacement using re.subn()
Using ‘re.subn()’
Syntax: re.subn(pattern, repl, string, count=0, flags=0)
re.subn() works like re.sub(), but it returns a tuple (new_string, number_of_subs_made).
Use re.subn() when you also need the number of substitutions made.
import re
s1="An apple,an avocado"
pattern = re.compile('an', re.IGNORECASE)
s2=pattern.subn("one",s1)
print (s2)
#Output:('one apple,one avocado', 2)
- (‘one apple,one avocado’, 2) → Returns the modified string and the number of substitutions made.
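Because re.subn() returns a tuple, the two values can be unpacked in a single assignment. The sketch below shows this, along with what comes back when nothing matches; the variable names are illustrative.

```python
import re

s1 = "An apple,an avocado"

# re.subn() returns (new_string, number_of_subs_made); unpack both at once.
new_string, n_subs = re.subn("an", "one", s1, flags=re.IGNORECASE)
print(new_string)  # one apple,one avocado
print(n_subs)      # 2

# When nothing matches, the string comes back unchanged and the count is 0.
unchanged, zero = re.subn("xyz", "one", s1)
print(unchanged, zero)  # An apple,an avocado 0
```

A count of 0 is a cheap way to detect that a replacement had no effect.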
4. Avoid replacement on parts of words.
Example 1: Replace “an” with “one”, but without replacing parts of words.
If we use str.replace(), the “an” inside “orange” also gets replaced.
s1="an apple,an orange"
s2=s1.replace("an","one")
print(s2)
#Output:one apple,one oronege
To avoid replacing parts of words, re.sub() can be used.
import re
s1="an apple,an orange"
pattern = re.compile(r'\ban\b')
s2=pattern.sub("one",s1)
print (s2)
#Output:one apple,one orange
- pattern = re.compile(r'\ban\b') → \b matches the empty string at a word boundary, that is, between a word character and a non-word character (or at the start or end of the string). Since r'\ban\b' requires a word boundary before and after “an”, only the standalone word is replaced. So ‘an’ inside ‘orange’ is not replaced.
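The difference between a plain replacement and a word-boundary pattern is easier to see side by side. This is a small sketch on a made-up input string; the variable names are illustrative.

```python
import re

# Made-up input containing "an" as a word, inside words, and in upper case.
s1 = "an apple,an orange,AN ant"

# Plain replacement hits every "an", including the ones inside words.
plain = s1.replace("an", "one")
print(plain)       # one apple,one oronege,AN onet

# The r prefix matters: in a normal string "\b" would be a backspace character.
bounded = re.sub(r"\ban\b", "one", s1)
print(bounded)     # one apple,one orange,AN ant

# Word boundaries combine with re.IGNORECASE as usual.
bounded_ci = re.sub(r"\ban\b", "one", s1, flags=re.IGNORECASE)
print(bounded_ci)  # one apple,one orange,one ant
```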
5. Replace multiple words with one word.
Example 1: Replace “hr” and “hour” with “Hours”
import re
s1="hr,hour"
pattern = re.compile('(hr|hour)')
s2=pattern.sub("Hours",s1)
print (s2)
#Output:Hours,Hours
- pattern = re.compile('(hr|hour)')
- () → group
- | → either/or
- '(hr|hour)' → matches either “hr” or “hour”
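One thing to watch with alternation is that it also matches inside longer words. The sketch below uses a made-up input that adds “hours”, and shows one possible fix that combines the alternation with \b word boundaries; (?:...) is a non-capturing group, used here only to scope the alternation.

```python
import re

# Made-up input: the third item already contains "hour" as part of a word.
s1 = "hr,hour,hours"

# A bare alternation needs no group, but it also matches inside "hours".
s2 = re.sub("hr|hour", "Hours", s1)
print(s2)  # Hours,Hours,Hourss

# Word boundaries around a non-capturing group restrict it to whole words.
s3 = re.sub(r"\b(?:hr|hour)\b", "Hours", s1)
print(s3)  # Hours,Hours,hours
```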
6. Replace a specific set of characters by a single character.
Example: Replace @, #, $, % with ‘-’
import re
s1="1@2#3$4%5"
s2=re.sub("[@#$%]","-",s1)
print (s2)
#Output:1-2-3-4-5
- re.sub("[@#$%]","-",s1)
- [] → indicates a set of characters
- "[@#$%]" → the pattern matches any one of the characters listed within []
- re.sub("[@#$%]","-",s1) → each matched character is replaced with ‘-’
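When the set of characters comes from a variable, some of them (such as ], ^, - or \) have a special meaning inside []. One way to handle that, sketched here with hypothetical variable names, is to run the characters through re.escape() before building the set.

```python
import re

# Hypothetical variable holding the characters to replace.
chars_to_replace = "@#$%"

# re.escape() backslash-escapes regex special characters, so the
# characters are treated literally inside the brackets.
pattern = "[" + re.escape(chars_to_replace) + "]"

s1 = "1@2#3$4%5"
s2 = re.sub(pattern, "-", s1)
print(s2)  # 1-2-3-4-5

# The same approach with characters that are special inside a set.
tricky = "[" + re.escape("-]^") + "]"
s3 = re.sub(tricky, "_", "a-b]c^d")
print(s3)  # a_b_c_d
```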
7. Replace one or more occurrences of a character by a single character.
import re
s1="1.99,2.999,3.9999"
s2=re.sub("[9]+","0",s1)
print (s2)
#Output:1.0,2.0,3.0
- s2=re.sub("[9]+","0",s1)
- "[9]+" → matches one or more consecutive occurrences of 9.
- + → matches one or more occurrences of the preceding character or set.
- re.sub("[9]+","0",s1) → replaces each run of one or more ‘9’ characters in string s1 with a single ‘0’.
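To make the effect of + concrete, this sketch compares the pattern with and without it on the same input. The pattern "9+" is equivalent to "[9]+" because the set contains a single character; the variable names are illustrative.

```python
import re

s1 = "1.99,2.999,3.9999"

# With +, each run of consecutive 9s is one match, replaced by a single 0.
with_plus = re.sub("9+", "0", s1)
print(with_plus)     # 1.0,2.0,3.0

# Without +, every individual 9 is its own match.
without_plus = re.sub("9", "0", s1)
print(without_plus)  # 1.00,2.000,3.0000
```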
Using ‘re.subn()’
import re
s1="1.99,2.999,3.9999"
s2=re.subn("[9]+","0",s1)
print (s2)
#Output:('1.0,2.0,3.0', 3)
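The count parameter from the syntax lines also works with re.sub() and re.subn(), limiting how many matches are replaced. A minimal sketch, assuming we only want the first two runs of 9s replaced (the variable names are illustrative):

```python
import re

s1 = "1.99,2.999,3.9999"

# count=2 stops after the first two matches; the default count=0 means no limit.
result, n = re.subn("9+", "0", s1, count=2)
print(result)  # 1.0,2.0,3.9999
print(n)       # 2
```

Passing count as a keyword argument keeps the call readable and avoids confusing it with flags.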