The split() method is a built-in Python string method that splits a string into substrings based on a separator (delimiter) passed as an argument. It returns a list of the resulting substrings.
In simple terms, if you have a sentence and want to break it into individual words, you can call split() and it will return a list containing those words.
    sentence = "You are good"

    # Splitting sentence into a list of words
    result = sentence.split()
    print(result)
The above code splits the string stored in the sentence variable into a list of three words.
    ['You', 'are', 'good']
You may be wondering how the string got separated when no delimiter was passed as an argument. The answer lies in the default values of the method's parameters.
Syntax
.split(sep=None, maxsplit=-1)
sep – If sep is not specified (or is None), the string is split on runs of whitespace, and leading and trailing whitespace is ignored. Otherwise, the string is split at every occurrence of the specified sep.
maxsplit – If maxsplit is not specified (or is -1), there is no limit on the number of splits, and the string is split at every occurrence of the separator. Otherwise, at most maxsplit splits are made, so the result has at most maxsplit + 1 elements. For example, with maxsplit=1 the string is split into at most 2 elements.
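To make the two parameters concrete, here is a small sketch comparing the default whitespace behaviour with an explicit single-space separator, and showing maxsplit on its own. The sample strings are made up for illustration.

```python
text = "You  are good"  # note the two spaces after "You"

# Default (sep=None): a run of whitespace counts as one separator.
default_split = text.split()

# Explicit sep=" ": every single space is a split point,
# so the doubled space produces an empty string.
space_split = text.split(" ")

# maxsplit=1: at most one split is made; the rest stays intact.
limited = "a b c d".split(maxsplit=1)

print(default_split)  # ['You', 'are', 'good']
print(space_split)    # ['You', '', 'are', 'good']
print(limited)        # ['a', 'b c d']
```

This difference is why calling split() with no arguments is usually the safer choice for breaking free-form text into words.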
Splitting based on delimiter
Suppose you have a comma-separated sentence and want to split it into a list of substrings at each comma. You can simply pass a comma (",") as the separator.
    sentence = "Sachin, Yashwant, Rishu, Abhishek, are good"

    # Splitting sentence based on comma
    result = sentence.split(sep=",")
    print(result)
If you run this code, you’ll get the following output.
    ['Sachin', ' Yashwant', ' Rishu', ' Abhishek', ' are good']
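You can see that split(sep=",") split the sentence at each comma, giving five substrings within a list. Note that every substring after the first keeps its leading space, because the separator is only the comma, not the comma plus the space that follows it.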
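If the leading spaces visible in the output above are unwanted, there are a couple of common ways to drop them. The sketch below shows two options using the same sentence; which one fits depends on how regular your data is.

```python
sentence = "Sachin, Yashwant, Rishu, Abhishek, are good"

# Option 1: split on the comma, then strip whitespace from each piece.
stripped = [part.strip() for part in sentence.split(",")]

# Option 2: make the space part of the separator.
# This only works if every comma is followed by exactly one space.
exact = sentence.split(", ")

print(stripped)  # ['Sachin', 'Yashwant', 'Rishu', 'Abhishek', 'are good']
print(exact)     # ['Sachin', 'Yashwant', 'Rishu', 'Abhishek', 'are good']
```

Option 1 is the more forgiving of the two, since it copes with inconsistent spacing around the commas.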
You can split the string based on any delimiter such as commas, spaces, tabs, semicolons, etc., depending on the specific requirements of the data being processed.
Suppose your data contains names separated by an unusual multi-character sequence and you want to extract the names from it.
    sentence = "Sachin/<>/Yashwant/<>/Rishu/<>/Abhishek/<>/Yogesh"

    # Splitting
    result = sentence.split(sep="/<>/")
    print(result)
This works just like the previous example: you get a list of the names from the data, ready for whatever operation you want to perform.
    ['Sachin', 'Yashwant', 'Rishu', 'Abhishek', 'Yogesh']
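One detail worth knowing about multi-character separators: the separator is matched as a whole string, not as a set of individual characters. The short sketch below, with a made-up data string, illustrates this and also shows that an empty separator is not allowed.

```python
data = "Sachin/<>/Yashwant/Rishu"

# Only the full "/<>/" sequence is a split point; the lone "/" is left alone.
parts = data.split("/<>/")
print(parts)  # ['Sachin', 'Yashwant/Rishu']

# An empty separator is rejected with a ValueError.
empty_sep_rejected = False
try:
    data.split("")
except ValueError as err:
    empty_sep_rejected = True
    print("ValueError:", err)
```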
Using maxsplit
If you want to limit how many times the string is split, you can use the maxsplit parameter to specify that number.
    sentence = "Sachin/<>/Yashwant/<>/Rishu/<>/Abhishek/<>/Yogesh"

    # Using maxsplit
    result = sentence.split(sep="/<>/", maxsplit=3)
    print(result)
In the above code, maxsplit is set to 3, which means the string is split at most three times, producing four substrings.
    ['Sachin', 'Yashwant', 'Rishu', 'Abhishek/<>/Yogesh']
The string was split into four substrings rather than three because maxsplit counts the number of splits (cuts), not the number of resulting parts. Three cuts divide a string into four parts, so maxsplit=3 yields at most 3 + 1 = 4 elements, with the remainder of the string left unsplit in the last element.
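To see the relationship between maxsplit and the number of parts, the sketch below runs the same split with several limits and records how many elements come back each time. The variable names are chosen for illustration only.

```python
sentence = "Sachin/<>/Yashwant/<>/Rishu/<>/Abhishek/<>/Yogesh"

# Map each maxsplit value to the number of parts it produces.
part_counts = {}
for limit in range(6):
    parts = sentence.split("/<>/", maxsplit=limit)
    part_counts[limit] = len(parts)
    # The last element holds whatever was left unsplit.
    print(limit, len(parts), parts[-1])

print(part_counts)  # {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 5}
```

The count stops growing at 5 because the string contains only four separators; asking for more splits than are available is harmless.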
Example
Assume you have a file containing people's information (name, email address, and age), one person per line, and you want to read the data from the file and perform some operations on it.
    def get_details(filename):
        with open(filename, "r") as file:
            lines = file.readlines()
            result = [line.split() for line in lines]
        return result

    details = get_details("names")
The get_details function opens the file given by the filename parameter and reads all of its lines using file.readlines(). Each line is then split into a list of strings using the split() method, which splits on whitespace by default.
The function returns the resulting list of lists, where each inner list contains the individual words from one line of the file.
You can use the returned list (details) to access information about a specific person.
    print(details[3])
    print(details[1])
    print(details[9])
This prints the information for the people at indexes 3, 1, and 9 respectively.
    ['Isabella,', 'isabellanguyen@gmail.com,', '30']
    ['Sophia,', 'sophiamartinez@yahoo.com,', '28']
    ['Charlotte,', 'charlotte.gonzales@hotmail.com,', '34']
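Notice that the name and email in the output above still carry trailing commas, because the lines were split on whitespace rather than on the commas. Below is a minimal sketch of one way to clean this up. It assumes each line has the form "name, email, age", which is inferred from the output rather than confirmed. The parse_line helper and the sample lines are hypothetical stand-ins for the real file contents.

```python
def parse_line(line):
    """Split one 'name, email, age' line into clean fields (hypothetical helper)."""
    # Split on commas, then strip spaces and the trailing newline from each field.
    name, email, age = [field.strip() for field in line.split(",")]
    return [name, email, int(age)]

# Sample lines standing in for the file's contents (assumed format).
lines = [
    "Isabella, isabellanguyen@gmail.com, 30\n",
    "Sophia, sophiamartinez@yahoo.com, 28\n",
]

details = [parse_line(line) for line in lines]
print(details[0])  # ['Isabella', 'isabellanguyen@gmail.com', 30]
```

For real comma-separated files, especially ones with quoted fields, Python's built-in csv module is usually a more robust choice than splitting by hand.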
Conclusion
The split() method splits a string into a list of substrings. By default it splits on whitespace, but you can pass a delimiter as the sep argument, and you can limit the number of splits by setting the maxsplit parameter.
Resource: https://docs.python.org/3.3/library/stdtypes.html#str.split
That’s all for now
Keep Coding