Strings are one of the most commonly used data types in Python. A string is a sequence of characters, such as letters, digits, and special characters. In this article, we will cover the basics of Python strings and some common string operations.
Python Strings
Python has a built-in string class named str. We can create a string by enclosing a sequence of characters in quotes, and Python treats single quotes and double quotes the same way. Strings are immutable: any operation that appears to change a string after creation actually creates a new string.
Creating a String
We can create a string by enclosing the characters in single or double quotes, like below.
>>> 'Hello'
'Hello'
>>> "Good Morning"
'Good Morning'
IDLE displays both strings with single quotes, which hints that both are treated the same way.
If both are treated the same, why do we need two kinds of quotes? Suppose we create a string like the one below.
>>> 'This is Jim's Cat'
SyntaxError: invalid syntax
In the above code, there is a single quote inside a single-quoted string, so the interpreter raises a SyntaxError.
Having two kinds of quotes solves this. A double-quoted string can contain single quotes, and a single-quoted string can contain double quotes.
For example,
>>> "This is Jim's Cat"
"This is Jim's Cat"
>>> '"Hello" Good Morning'
'"Hello" Good Morning'
Alternatively, we can also use the backslash character (\) to escape the quotes in the String, like this.
>>> 'This is Jim\'s Cat'
"This is Jim's Cat"
>>> "\"Hello\" Good Morning"
'"Hello" Good Morning'
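As a quick check of the quoting rules above, the following sketch compares the different ways of writing the same text. The variable names are illustrative only.

```python
# Three ways of writing the same text; the variable names are illustrative only.
double_quoted = "This is Jim's Cat"
escaped_single = 'This is Jim\'s Cat'
escaped_double = "This is Jim\'s Cat"  # escaping is allowed even when not required

# All three literals produce the same string value.
print(double_quoted == escaped_single == escaped_double)
```

The choice of quote style affects only how the literal is written, not the string that is stored.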
We can also use triple single quotes (''') or triple double quotes (""") to create a string.
>>> '''Coding Pointers'''
'Coding Pointers'
>>> """Python"""
'Python'
Triple-quoted strings offer little benefit for short strings; their common use is creating multi-line strings, like this.
>>> ''' Welcome to Coding Pointers
	Let's learn Python Programming '''
" Welcome to Coding Pointers\n\tLet's learn Python Programming "
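To make the multi-line behaviour concrete, here is a minimal sketch showing that the line break typed inside a triple-quoted literal becomes part of the string. The name banner is just an example.

```python
# A triple-quoted literal keeps the line breaks typed inside it.
banner = """Welcome to Coding Pointers
Let's learn Python Programming"""

# The embedded newline lets us split the text back into its lines.
lines = banner.splitlines()
print(len(lines))
print(lines[1])
```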
A final way to create a Python string is the str() function. We can pass in any object that can be converted to a string.
>>> str('Hello')
'Hello'
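The example above passes in something that is already a string. The sketch below, with made-up variable names, shows str() converting a few other object types.

```python
# str() converts other objects to their string form; the names are illustrative.
count = str(42)            # int to str
ratio = str(3.5)           # float to str
flags = str([True, None])  # a list is converted using its printable form

print(count + " items")    # the result can be concatenated like any other string
print(ratio, flags)
```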
Escape Sequences
While creating a string, we used the backslash (\) character to escape single and double quotes. The backslash also lets us add special characters and Unicode characters through what are known as escape sequences.
Escape sequences let us write characters that are hard or impossible to type directly inside a string literal. For example, if we need a tab between words, or need the words to print on the next line, we can use the \t or \n escape sequences.
>>> print('Hello\tWorld')
Hello	World
>>> print('Hello\nWorld')
Hello
World
Let's look at another example: if we need a backslash character in the string, we can use the double backslash escape sequence.
>>> div = ' 4\\2 = 2'
>>> print(div)
 4\2 = 2
The escape sequences themselves are not stored in the string; each one describes a single special character that is stored instead. Below is the list of escape sequences that Python supports in string literals.
Escape Sequence | Meaning |
---|---|
\newline | Ignored (continuation line) |
\\ | Backslash (\) |
\' | Single quote (') |
\" | Double quote (") |
\a | Bell |
\b | Backspace |
\f | Formfeed |
\n | Linefeed |
\r | Carriage Return |
\t | Horizontal Tab |
\v | Vertical Tab |
\ooo | Character with octal value ooo (up to 3 digits) |
\xhh | Character with hex value hh (exactly 2 digits) |
\N{name} | Named Unicode character |
\uxxxx | Unicode character with 16-bit hex value |
\Uxxxxxxxx | Unicode character with 32-bit hex value |
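As an illustration of the numeric and named forms in the table, the sketch below writes the letter A in four different ways. It also checks that an escape sequence is stored as a single character.

```python
# Four escape sequences that all describe the same character, 'A'.
letter_hex = "\x41"                          # hex value 41
letter_octal = "\101"                        # octal value 101
letter_unicode = "\u0041"                    # 16-bit hex value 0041
letter_named = "\N{LATIN CAPITAL LETTER A}"  # Unicode character name

print(letter_hex, letter_octal, letter_unicode, letter_named)

# An escape sequence is stored as one character, not as backslash plus letter.
tabbed = "a\tb"
print(len(tabbed))
```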
Raw Strings suppress Escape Sequence
We have seen how escape sequences are handy for embedding special characters in a string, but sometimes the backslash character causes trouble, especially for someone new to Python.
For example, suppose we want to open a file located at "c:\notes\testing.txt".
>>> location = "c:\notes\testing.txt"
>>> location
'c:\notes\testing.txt'
Everything looks fine, right? Let's try to print it.
>>> print(location)
c:
otes	esting.txt
That is not what we wanted. Instead of printing "c:\notes\testing.txt", Python treated \n as a newline character and \t as a tab, and printed "c:(newline)otes(tab)esting.txt".
To fix this, we can escape the backslash itself, like below.
>>> location = "c:\\notes\\testing.txt"
>>> location
'c:\\notes\\testing.txt'
>>> print(location)
c:\notes\testing.txt
This approach works, since all we need to do is escape each backslash, but long paths end up needing a lot of double backslashes.
>>> location = "c:\\notes\\codes\\loc\\docs\\info\\testing.txt"
>>> location
'c:\\notes\\codes\\loc\\docs\\info\\testing.txt'
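Raw strings are useful in such cases. They do not treat the backslash as a special character, so every character we put into a raw string stays the way we wrote it. We just need to add an r prefix before the opening quote of the string literal.
>>> location = r"c:\notes\testing.txt"
>>> print(location)
c:\notes\testing.txt
The r prefix applies to the literal, not to the variable, so an existing string cannot be turned into a raw one afterwards. If we want to see the escape sequences in a string that is already stored in a variable, instead of having them interpreted, we can print its repr(), which shows the string with its quotes and escapes.
print(repr(location))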
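The following sketch puts the three spellings of the path side by side. It assumes nothing beyond the example path used above; the variable names are illustrative.

```python
# Three spellings of the path from the example above; the names are illustrative.
plain = "c:\notes\testing.txt"       # \n and \t are interpreted as escapes
escaped = "c:\\notes\\testing.txt"   # each backslash escaped by hand
raw = r"c:\notes\testing.txt"        # raw literal: backslashes kept as written

# The raw literal and the hand-escaped literal are the same string.
print(raw == escaped)

# The plain literal is shorter, because each escape collapsed to one character.
print(len(plain), len(raw))
```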
String Immutability
Python strings are immutable; that is, an operation that appears to modify a string actually creates a new string. In other words, we can never change the contents of an existing string object.
For example, we cannot change the character at a particular position, though we can build a new string and assign it to the same name. Let's try to change the character of a string at a particular position.
>>> msg = "coding"
>>> msg[0] ="C"
Traceback (most recent call last):
  File "<pyshell#4>", line 1, in <module>
    msg[0] ="C"
TypeError: 'str' object does not support item assignment
We get the "'str' object does not support item assignment" error because strings are immutable. We can, however, get the result we want using string slicing and concatenation, as below.
>>> msg = "C" + msg[1:]
>>> msg
'Coding'
It looks like we have achieved what we wanted, but internally Python has created a new string object and bound the name msg to it. Every object has an identity, which we can inspect using the id() function; in CPython, this is the object's memory address.
Let's see what has happened.
>>> msg = "coding"
>>> id(msg)
45051424
>>> msg = "C" + msg[1:]
>>> id(msg)
45516128
We can see that the id of the msg object has changed, because Python created a new object for the result of the string operation.
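The same point can be sketched without relying on specific id() values, which differ from run to run. The variable names below are illustrative.

```python
# Operations on a string return new strings and leave the original untouched.
original = "coding"
shouted = original.upper()          # a new string; 'original' is not modified
capitalized = "C" + original[1:]    # slicing plus concatenation, as shown above

print(original, shouted, capitalized)

# In-place assignment to a character is rejected with a TypeError.
try:
    original[0] = "C"
except TypeError as error:
    print(error)
```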
Basic String Operations
Let’s perform some basic operations on Strings
Concatenating Strings with + operator
We can concatenate both string literals and string variables with the + operator.
>>> "Good" + "Morning"
'GoodMorning'
>>> g = "Good"
>>> m = "Morning"
>>> g + m
'GoodMorning'
Adjacent string literals are concatenated even without the + operator.
>>> "Good" "Morning"
'GoodMorning'
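String variables, however, cannot be joined just by writing them next to each other. Written without a space, gm is read as a single, undefined name, so we get a NameError like below; written with a space, g m is a SyntaxError.
>>> g = "Good"
>>> m = "Morning"
>>> gm
Traceback (most recent call last):
  File "<pyshell#5>", line 1, in <module>
    gm
NameError: name 'gm' is not defined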
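One more case worth sketching is that the + operator only joins strings with other strings. The example below uses made-up names and shows the explicit str() conversion that is needed for a number. It also shows adjacent literals split across lines inside parentheses.

```python
# The + operator joins str with str only; the names here are illustrative.
label = "Total: "
amount = 3

try:
    message = label + amount        # str + int raises TypeError
except TypeError:
    message = label + str(amount)   # convert explicitly, then concatenate
print(message)

# Adjacent literals are joined even when split across lines inside parentheses.
long_text = ("Good "
             "Morning")
print(long_text)
```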
String Duplication with * operator
We can use the * operator to repeat a string. If we multiply a string by 2, the result contains the string twice.
For example, in the code below we need 'Very! ' to appear twice.
>>> print('Very! Very! Good Morning')
Very! Very! Good Morning
Instead of writing it twice, we can use the * operator to repeat the string, like below.
>>> print('Very! ' *2 +'Good Morning')
Very! Very! Good Morning
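A few more cases of the * operator, sketched with illustrative names: the number may come first, and multiplying by zero gives an empty string.

```python
# Repetition with *; the variable names are illustrative.
separator = "-" * 20    # a row of 20 dashes
echo = "ab" * 3         # repeated three times
reverse = 3 * "ab"      # the number may come first
empty = "ab" * 0        # multiplying by 0 gives an empty string

print(separator)
print(echo, reverse, repr(empty))
```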
String Indexing with []
Every character in a string has an index associated with it. The left-most index is 0, and the right-most index is the length of the string minus 1.
When we save the string "Hello" in a variable, Python identifies each character of the string by its position.
- 0th index is ‘H’
- 1st index is ‘e’
- 2nd index is ‘l’
- 3rd index is ‘l’
- 4th index is ‘o’
We can grab each character of a string using its index, like below.
>>> msg = "Hello"
>>> msg[0]
'H'
>>> msg[1]
'e'
>>> msg[2]
'l'
>>> msg[3]
'l'
>>> msg[4]
'o'
We can also use negative indexing to count backward from the end.
>>> msg[-5]
'H'
>>> msg[-4]
'e'
>>> msg[-3]
'l'
>>> msg[-2]
'l'
>>> msg[-1]
'o'
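The relationship between positive and negative indexes can be sketched as follows. The example also shows what happens when an index falls outside the string.

```python
msg = "Hello"

# The last character can be reached from either direction.
last = len(msg) - 1
print(msg[last] == msg[-1])

# Each positive index i names the same character as the negative index i - len(msg).
for position, char in enumerate(msg):
    print(position, position - len(msg), char)

# An index outside the valid range raises IndexError.
try:
    msg[5]
except IndexError as error:
    print(error)
```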
String Slicing
We can get a substring of a string by using a slice. A slice takes a start index, an end index, and a step size, all of which are optional. The slice includes the characters from the start index up to, but not including, the end index.
- [:] returns the complete String
- [ start index :] returns a specific part of the String from the start index to the end.
- [: end index ] returns the String from the beginning to the end index – 1.
- [ start index : end index] grabs the String from the start index to the end index – 1.
- [ start index : end index : step ] grabs the String from the start index to the end index – 1, skipping characters by step.
Recall that the index of a string runs from 0 to length minus 1. If we don't specify the start index, the slice uses 0 as the start index, and if we don't specify the end index, the slice runs to the end of the string.
>>> msg = "Welcome"
Let's start with a plain [:], which returns the entire string as is.
>>> msg[:]
'Welcome'
Now let's give the start index alone.
>>> msg[3:]
'come'
This time with the end index alone.
>>> msg[:4]
'Welc'
With both start and end index.
>>> msg[1:5]
'elco'
From the start to the end, with a step of 2 characters.
>>> msg[::2]
'Wloe'
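Building on the slices above, the sketch below tries a few further combinations on the same string: negative indexes, a negative step, and an end index beyond the string's length. The variable names are illustrative.

```python
msg = "Welcome"

tail = msg[-4:]           # negative start: the last four characters
reversed_msg = msg[::-1]  # a step of -1 walks the string backward
odd_positions = msg[1::2] # every second character, starting at index 1
clamped = msg[3:100]      # an end index past the string is cut off at the end

print(tail, reversed_msg, odd_positions, clamped)
```

Unlike indexing a single character, slicing with an out-of-range index does not raise an error.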
Happy Learning!!