String operations
Strings are enclosed in a matching pair of double or single quotation marks: " " or ' '.
Indexing starts at 0. In "Hello world", H is at index 0, e is at index 1, and so forth. The code sample below slices a string by index position.
# String indexing
name = "Hello world"
print(name[::2])   # every second character. Output is: Hlowrd
print(name[::-1])  # reverse the string. Output is: dlrow olleH
print(name[0:5])   # print the first 5 characters. Output is: Hello
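Slicing uses the form [start:end:step]. For example, [0:5:2] starts at index 0, stops before index 5, and takes every second character in that range. Omitted values fall back to their defaults, so [::2] takes every second character of the whole string.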
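As a minimal sketch of the start, end, and step positions, the snippet below slices one sample string several ways. The variable names are chosen only for illustration.

```python
# Slice syntax: text[start:end:step]; the end index is exclusive.
text = "Hello world"  # sample string, chosen for illustration

first_five = text[0:5]     # characters at indexes 0 through 4
every_other = text[0:5:2]  # indexes 0, 2, and 4
from_six = text[6:]        # index 6 to the end of the string
last_char = text[-1]       # negative indexes count from the end

print(first_five)   # Hello
print(every_other)  # Hlo
print(from_six)     # world
print(last_char)    # d
```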
find() method
The find() method searches for a substring and returns the index at which it first starts. For example, with name = "hello world", name.find("wo") returns 6.
name = "hello world"
print(name.find("wo"))
# Output: 6
# The index of the first letter of the substring "wo" is 6
If the substring does not occur in the string, find() does not raise an error; it returns -1.
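Because -1 is an ordinary return value, code that calls find() usually checks for it before using the result. A small sketch, with a made-up search term:

```python
name = "hello world"

position = name.find("xyz")  # "xyz" does not occur in the string
if position == -1:
    message = "substring not found"
else:
    message = f"found at index {position}"

print(position)  # -1
print(message)   # substring not found
```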
split() method
The split() method breaks a string into a list of substrings, using whitespace as the separator by default. The resulting list is indexed starting at 0.
# Split the string into a list
name = "Hello World"
split_string = name.split()
print(split_string)
# Output: ['Hello', 'World'] (a list)
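split() also accepts a separator argument. The sketch below splits a comma-separated string (hypothetical sample data) and reads items back by list index:

```python
csv_line = "red,green,blue"  # hypothetical sample data

colors = csv_line.split(",")  # split on commas instead of whitespace

print(colors)     # ['red', 'green', 'blue']
print(colors[0])  # red
print(colors[2])  # blue
```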
RegEx module
RegEx is short for regular expression, a pattern syntax used for matching and manipulating strings. To use it in Python, import the re module: import re
Special Sequence | Meaning | Example |
---|---|---|
\d | Matches any digit character (0-9) | “123” matches “\d\d\d” |
\D | Matches any non-digit character | “hello” matches “\D\D\D\D\D” |
\w | Matches any word character (a-z, A-Z, 0-9, and _) | “hello_world” matches “\w\w\w\w\w\w\w\w\w\w\w” |
\W | Matches any non-word character | “@#$%” matches “\W\W\W\W” |
\s | Matches any whitespace character (space, tab, newline, etc.) | “hello world” matches “\w\w\w\w\w\s\w\w\w\w\w” |
\S | Matches any non-whitespace character | “hello_world” matches “\S\S\S\S\S\S\S\S\S\S\S” |
\b | Matches the boundary between a word character and a non-word character | “cat” matches “\bcat\b” in “The cat sat on the mat” |
\B | Matches any position that is not a word boundary | “cat” matches “\Bcat\B” in “concatenate” but not in “The cat sat on the mat” |
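To see how the boundary sequences behave, here is a short sketch using re.search(), which returns None when nothing matches. The sample strings are illustrative only.

```python
import re

sentence = "The cat sat on the mat"

whole_word = re.search(r"\bcat\b", sentence)         # "cat" as a standalone word
inside_word = re.search(r"\Bcat\B", "concatenate")   # "cat" inside a longer word
not_inside = re.search(r"\Bcat\B", sentence)         # "cat" here has boundaries on both sides

print(whole_word.group())   # cat
print(inside_word.group())  # cat
print(not_inside)           # None
```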
The following example uses \d to match a run of ten digits. Remember to import the re module first:
import re
pattern = r"\d\d\d\d\d\d\d\d\d\d" # Matches any ten consecutive digits
text = "My Phone number is 1234567890"
match = re.search(pattern, text)  # search for the pattern in the text; re is the module name
if match:
    print("Phone number found:", match.group())
    # output: Phone number found: 1234567890
else:
    print("No match")
    # output (only when the text has no ten-digit run): No match
Another example uses a raw string (the r prefix) with the \W special sequence, which matches any character outside [a-zA-Z0-9_]:
import re
pattern = r"\W" # Matches any non-word character (equivalent to [^a-zA-Z0-9_]).
# The r is for raw string
text = "Hello, world!"
matches = re.findall(pattern, text)
print("Matches:", matches)  # output: Matches: [',', ' ', '!'] (the space is a non-word character too)
findall() function
The findall() function finds all occurrences of a specified pattern within a string:
import re
s2 = "The setup was awesome as always! 'King of Rock'"
# Use the findall() function to find all occurrences of "as" in the string
result = re.findall("as", s2)  # signature: re.findall(pattern, string, flags=0)
# flags=0 (the default) means no special options such as ignore case or multiline
# Print out the list of matches
print(result)  # output: ['as', 'as']
The general form of the call is re.findall(pattern, string, flags=0). The default, flags=0, applies no special options.
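To illustrate what the flags argument changes, the sketch below runs the same pattern with and without re.IGNORECASE. The sample sentence is made up for this comparison.

```python
import re

text = "As usual, the show was great"  # sample sentence for illustration

default_matches = re.findall("as", text)                           # case-sensitive
ignore_case_matches = re.findall("as", text, flags=re.IGNORECASE)  # also matches "As"

print(default_matches)      # ['as']
print(ignore_case_matches)  # ['As', 'as']
```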
split() function
The regular expression split() function splits a string into a list of substrings based on a specified pattern:
import re
s2 = "The setup was awesome as always! 'King of Rock'"
# Use the split function to split the string on the pattern \s
split_array = re.split(r"\s", s2)  # \s is a whitespace character; the r prefix keeps the backslash literal
# The split_array contains all the substrings, split by whitespace characters
print(split_array)
# output: ['The', 'setup', 'was', 'awesome', 'as', 'always!', "'King",
# 'of', "Rock'"]
Note: Make sure to include import re in these examples, since they all rely on the regular expression module.
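One detail worth seeing in code: \s matches a single whitespace character, so consecutive spaces produce empty strings in the result. Adding the + quantifier (one or more) treats a run of whitespace as one separator. A short sketch with a made-up input:

```python
import re

text = "one  two\tthree"  # two spaces, then a tab (sample input)

single = re.split(r"\s", text)  # split at every whitespace character
runs = re.split(r"\s+", text)   # treat each run of whitespace as one separator

print(single)  # ['one', '', 'two', 'three']
print(runs)    # ['one', 'two', 'three']
```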
sub() function
The RegEx sub() function replaces all occurrences of a pattern in a string with a specified replacement.
import re
s2 = "The setup was awesome as always! 'King of Rock'"
# Define the regular expression pattern to search for
pattern = r"King of Rock"  # r marks a raw string literal, so backslashes need no escaping
# Define the replacement string
replacement = "legend of Rock"
# Use the sub function to replace the pattern with the replacement string
new_string = re.sub(pattern, replacement, s2, flags=re.IGNORECASE)  # re.IGNORECASE means ignore case
# new_string contains the original string with the pattern replaced by the replacement string
print(new_string)
# output: The setup was awesome as always! 'legend of Rock'
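The re.IGNORECASE flag only matters when the case of the pattern differs from the text. The sketch below uses a lowercase pattern (an illustrative variation on the example above) to show the difference:

```python
import re

s2 = "'King of Rock'"
pattern = r"king of rock"  # lowercase on purpose, for illustration

without_flag = re.sub(pattern, "legend of Rock", s2)                    # no match, string unchanged
with_flag = re.sub(pattern, "legend of Rock", s2, flags=re.IGNORECASE)  # matches despite the case

print(without_flag)  # 'King of Rock'
print(with_flag)     # 'legend of Rock'
```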
replace() method
The replace() method returns a new string in which every occurrence of the old substring (the first argument) is replaced with the new substring (the second argument).
string = "This is a test sentence."
print(string)
new_string = string.replace("test", "real")
print(new_string) #output: This is a real sentence.
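Two points may be easier to see in code: replace() leaves the original string untouched, and an optional third argument limits how many occurrences are replaced. A minimal sketch with a made-up sample string:

```python
original = "a-b-c-d"  # sample string for illustration

all_replaced = original.replace("-", "+")       # replace every occurrence
first_only = original.replace("-", "+", 1)      # replace only the first occurrence

print(all_replaced)  # a+b+c+d
print(first_only)    # a+b-c-d
print(original)      # a-b-c-d (strings are immutable, so the original is unchanged)
```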
Concatenate lists
To concatenate lists in Python, you can use the + operator. For example, the following code concatenates the lists A and B:
A = [1, 'a']
B = [2, 1, 'd']
print(A + B)  # Output: [1, 'a', 2, 1, 'd']
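The + operator builds a new list and leaves both operands as they were. The sketch below stores the result in a variable named C (a name chosen here for illustration) and shows that the same operator joins strings:

```python
A = [1, 'a']
B = [2, 1, 'd']

C = A + B  # a new list; A and B are not modified

print(C)  # [1, 'a', 2, 1, 'd']
print(A)  # [1, 'a']
print(B)  # [2, 1, 'd']

greeting = "Hello" + " " + "World"  # + also concatenates strings
print(greeting)  # Hello World
```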