Splitting a string into separate words is one of the most common string operations. We can use the split() method of the str class to perform it. In this Python Split String article, we will learn how to split a string in Python by a delimiter such as a comma, a space, or a single character, by a regular expression, and by multiple delimiters.
Python Split String
Most developers have come across a situation where a complete string has to be split into separate words.
Suppose we have a string of usernames separated by commas. We need to split it into individual usernames so that we can operate on them, for example to count the number of users.
>>> usernames = "Tim, Bob, Bill, Tom, Sam"
We can use the split() method to split the users based on comma delimiter. Let’s understand the split() method first before getting to the solution.
String split() method
The split() method returns a list of words in the string separated by the delimiter passed. The syntax of the split() method is
str.split(separator, maxsplit)
- The separator acts as the delimiter, and the string is split wherever the separator occurs. If no separator is specified, the split() method splits the string on whitespace.
- The maxsplit number specifies the maximum number of splits. If maxsplit is not specified, it defaults to -1, meaning no limit.
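The two parameters described above can be tried out with a short sketch. The sample string `record` is a made-up value used only for illustration, and `sep` is the actual keyword name Python uses for the separator argument.

```python
# Hypothetical sample data, not taken from the article
record = "alice,bob,carol,dave"

# Only the separator: split at every comma
all_parts = record.split(",")

# Keyword form with a limit of 2 splits (at most 3 items in the result)
limited_parts = record.split(sep=",", maxsplit=2)

print(all_parts)      # ['alice', 'bob', 'carol', 'dave']
print(limited_parts)  # ['alice', 'bob', 'carol,dave']

# maxsplit=-1 is the default and means "no limit"
print(record.split(",", -1) == all_parts)  # True
```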
As a first step, let’s try to fix the issue stated above.
>>> usernames = "Tim, Bob, Bill, Tom, Sam"
>>> listOfUsers = usernames.split(',')
>>> type(listOfUsers)
<class 'list'>
>>> print("Number of Users available are "+str(len(listOfUsers)))
Number of Users available are 5
Now listOfUsers holds the individual users. Note that every name after the first keeps the space that followed its comma.
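Because the usernames are separated by a comma followed by a space, splitting on ',' alone leaves a leading space on most names. Here is a minimal sketch of two possible ways to clean that up; the variable names `raw_users`, `clean_users`, and `clean_users_alt` are only illustrative.

```python
usernames = "Tim, Bob, Bill, Tom, Sam"

# Splitting on ',' keeps the space that follows each comma
raw_users = usernames.split(",")
print(raw_users)  # ['Tim', ' Bob', ' Bill', ' Tom', ' Sam']

# Option 1: strip surrounding whitespace from every piece
clean_users = [name.strip() for name in raw_users]
print(clean_users)  # ['Tim', 'Bob', 'Bill', 'Tom', 'Sam']

# Option 2: split on the two-character delimiter ', '
clean_users_alt = usernames.split(", ")
print(clean_users_alt)  # ['Tim', 'Bob', 'Bill', 'Tom', 'Sam']
```

Option 1 is the more forgiving of the two when the spacing after the commas is inconsistent.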
Let’s pass the maxsplit number as 2
>>> listOfUsers = usernames.split(',',2)
>>> listOfUsers
['Tim', ' Bob', ' Bill, Tom, Sam']
We can see that only 2 splits have happened: ‘Tim’ and ‘Bob’ are separated, and the remaining users (Bill, Tom, Sam) stay together in the last element.
Let’s try maxsplit as 3 this time
>>> listOfUsers = usernames.split(',',3)
>>> listOfUsers
['Tim', ' Bob', ' Bill', ' Tom, Sam']
Now 3 splits have happened: ‘Tim’, ‘Bob’, and ‘Bill’ are separated, and the rest of the string stays in the last element.
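One place where maxsplit tends to be handy is when only the first delimiter is meaningful. The sketch below assumes a made-up "key=value" line (`config_line` is hypothetical) whose value itself contains the delimiter.

```python
# Hypothetical "key=value" line; the value also contains '='
config_line = "greeting=hello=world"

# Without a limit we get three pieces
print(config_line.split("="))  # ['greeting', 'hello', 'world']

# With maxsplit=1 only the first '=' is used, so we always get two pieces
key, value = config_line.split("=", 1)
print(key)    # greeting
print(value)  # hello=world
```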
Split by Space
Whenever the delimiter is not specified or is None, the string is split using whitespace as the delimiter.
>>> msg = "Welcome to Java Interview Point"
>>> words = msg.split()
>>> words
['Welcome', 'to', 'Java', 'Interview', 'Point']
Consecutive whitespace characters are treated as a single separator. In the below snippet, we have multiple spaces between the words.
>>> msg = "Welcome   to    Java     Interview  Point"
>>> words = msg.split()
>>> words
['Welcome', 'to', 'Java', 'Interview', 'Point']
Whitespace includes the newline (\n) and tab (\t) characters as well, so if the string contains \n or \t, they are treated as separators too.
>>> msg = "Welcome\nto\tJava Interview Point"
>>> words = msg.split()
>>> words
['Welcome', 'to', 'Java', 'Interview', 'Point']
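The whitespace behavior described above only applies when no separator is passed. As a small sketch (with a made-up string `spaced`), passing a single space explicitly behaves differently: every space counts as its own delimiter, so runs of spaces produce empty strings.

```python
# Hypothetical sample: two spaces, then three spaces between the words
spaced = "Welcome  to   Java"

# No separator: runs of whitespace collapse into one delimiter
print(spaced.split())     # ['Welcome', 'to', 'Java']

# Explicit ' ': each space is a delimiter, so empty strings appear
print(spaced.split(" "))  # ['Welcome', '', 'to', '', '', 'Java']

# Leading and trailing whitespace is also dropped only in the no-separator form
padded = "  hi  "
print(padded.split())     # ['hi']
print(padded.split(" "))  # ['', '', 'hi', '', '']
```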
Python Split String by Character
There are three different ways by which we can split a string into a list of characters.
- Using List slice assignment
- By Passing the String to the list constructor
- With For Loop
1. Using List slice assignment
Slice Assignment is a special syntax for Lists, using which we can alter the contents of the lists. Let’s split the string into characters using List slice assignment
>>> msg = "Welcome"
>>> chars = []
>>> chars[:] = msg
>>> chars
['W', 'e', 'l', 'c', 'o', 'm', 'e']
By specifying chars[:] on the left side of the = operator, we are telling Python to use Slice Assignment.
2. By Passing the String to list constructor
The list() constructor takes a single iterable argument, which can be a sequence or any iterator object.
A string is an iterable of characters, so when we pass it to the list() constructor, we get back a list of its individual characters.
>>> msg = "Welcome"
>>> chars = list(msg)
>>> chars
['W', 'e', 'l', 'c', 'o', 'm', 'e']
3. With For Loop
This is a manual approach where we take each character of the string and append it to an empty list.
>>> msg = "Welcome"
>>> chars = []
>>> for char in msg:
...     chars.append(char)
...
>>> chars
['W', 'e', 'l', 'c', 'o', 'm', 'e']
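To see the three approaches side by side, here is a small sketch that runs all of them on the same string and checks that they agree. The variable names `by_slice`, `by_constructor`, and `by_loop` are only illustrative.

```python
msg = "Welcome"

# 1. Slice assignment into an existing (empty) list
by_slice = []
by_slice[:] = msg

# 2. The list() constructor
by_constructor = list(msg)

# 3. A for loop that appends one character at a time
by_loop = []
for char in msg:
    by_loop.append(char)

print(by_constructor)                         # ['W', 'e', 'l', 'c', 'o', 'm', 'e']
print(by_slice == by_constructor == by_loop)  # True
```

In most cases list(msg) is the shortest way to express the intent.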
Python String Split by regex
We can also use a regular expression to split a string. To do so, we need to import the re module and use its split() function.
The syntax is
re.split(pattern, string, maxsplit, flags)
For example, let’s take a string separated by underscores ‘_’. We just need to pass the delimiter inside square brackets [], which form a regex character class.
>>> import re
>>> message = "Welcome_to_Javainterview_Point"
>>> words = re.split('[_]', message)
>>> words
['Welcome', 'to', 'Javainterview', 'Point']
Let’s try with the maxsplit as 2
>>> words = re.split('[_]', message, maxsplit=2)
>>> words
['Welcome', 'to', 'Javainterview_Point']
The flags parameter allows you to modify the way the Regular expression works. We can use flags in two ways, either the long name or short name.
For example, if we want the regex to ignore case, we can use the flag IGNORECASE, or its short name I.
>>> numbers = "1234aaaa567BBB890ccc987"
>>> numberList = re.split('[a-c]+', numbers, flags = re.IGNORECASE)
>>> numberList
['1234', '567', '890', '987']
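As a quick check of the long and short flag names, the sketch below runs the same split with re.IGNORECASE, with re.I, and without any flag. The variable names are only illustrative.

```python
import re

numbers = "1234aaaa567BBB890ccc987"

# Long and short names refer to the same flag
long_form = re.split(r"[a-c]+", numbers, flags=re.IGNORECASE)
short_form = re.split(r"[a-c]+", numbers, flags=re.I)
print(long_form)                # ['1234', '567', '890', '987']
print(long_form == short_form)  # True

# Without the flag, the uppercase 'BBB' is not treated as a delimiter
without_flag = re.split(r"[a-c]+", numbers)
print(without_flag)             # ['1234', '567BBB890', '987']
```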
Splitting with Multiple Delimiters
We can also pass multiple delimiters to the re.split() method. Let’s try to split the string using the semicolon, comma, and space as delimiters. The pattern is written as a raw string (r'...') so that \s reaches the regex engine unchanged.
>>> text = "one,two;three four,five six"
>>> numbers = re.split(r'[;,\s]+', text)
>>> numbers
['one', 'two', 'three', 'four', 'five', 'six']
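One thing to watch for with re.split() is that a delimiter at the very start or end of the string produces an empty string in the result. The sketch below uses a made-up input (`messy`) to show this and one possible way to filter the empty pieces out.

```python
import re

# Hypothetical input that starts and ends with a delimiter
messy = ",one, two;;three "

parts = re.split(r"[;,\s]+", messy)
print(parts)      # ['', 'one', 'two', 'three', '']

# Drop the empty strings left over at the edges
non_empty = [part for part in parts if part]
print(non_empty)  # ['one', 'two', 'three']
```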
rsplit() method – Split from right
rsplit() is similar to the split() method of the str class, except for the fact that it starts splitting the string from the right end.
>>> usernames = "Tim, Bob, Bill, Tom, Sam"
>>> users = usernames.rsplit(',')
>>> users
['Tim', ' Bob', ' Bill', ' Tom', ' Sam']
Looks the same, right? We can see the difference only when we give the maxsplit argument.
>>> users = usernames.rsplit(',', 2)
>>> users
['Tim, Bob, Bill', ' Tom', ' Sam']
>>> users = usernames.rsplit(',', 3)
>>> users
['Tim, Bob', ' Bill', ' Tom', ' Sam']
Now we can see that it starts splitting the words from the right.
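A typical use for splitting from the right is to peel off the last piece of a string. The following sketch assumes a made-up file name (`filename`) and separates its final extension from the rest.

```python
# Hypothetical file name with more than one dot
filename = "archive.tar.gz"

# rsplit with maxsplit=1 cuts at the LAST dot
stem, extension = filename.rsplit(".", 1)
print(stem)       # archive.tar
print(extension)  # gz

# split with maxsplit=1 cuts at the FIRST dot instead
print(filename.split(".", 1))  # ['archive', 'tar.gz']
```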
splitlines() method – Splitting String by line break
The splitlines() method splits the string at line break characters such as \n, \r, and \r\n.
>>> msg = "Welcome\nTo\rJavaInterview\r\nPoint"
>>> words = msg.splitlines()
>>> words
['Welcome', 'To', 'JavaInterview', 'Point']
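To see how splitlines() differs from split('\n'), here is a small sketch using a made-up string (`text_block`) that mixes line endings and ends with a newline. It also shows the optional keepends argument, which keeps the line break characters in the result.

```python
# Hypothetical text with mixed line endings and a trailing newline
text_block = "Welcome\nTo\rJavaInterview\r\nPoint\n"

# splitlines() understands \n, \r and \r\n, and ignores the trailing newline
print(text_block.splitlines())
# ['Welcome', 'To', 'JavaInterview', 'Point']

# keepends=True keeps each line's terminator
print(text_block.splitlines(keepends=True))
# ['Welcome\n', 'To\r', 'JavaInterview\r\n', 'Point\n']

# split('\n') only knows about '\n' and leaves an empty string at the end
print(text_block.split("\n"))
# ['Welcome', 'To\rJavaInterview\r', 'Point', '']
```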
I hope I have covered most of the ways to split a string in Python. Feel free to drop a comment if you find that anything is missing or needs to be added.
Happy Learning!!