Python String Slicing
String slicing in Python is like wielding a sharp knife to carve out exactly what you need from a string. Whether you are extracting key data from user input, manipulating large text datasets or simply refining how information is displayed, slicing gives you precise control over your string’s content. In this article, we will dive deep into the mechanics of string slicing and see how a few well-placed slice operations can make your code more efficient, readable and powerful. Ready to cut through the noise and master Python’s slicing magic?
Table of Contents
- String slicing
- String concatenation and repetition
- Checking membership
- Adding white spaces to strings
- Creating multi line strings
String Slicing
We have seen how to get a single character from a string by specifying an
index using square brackets. Using the same square brackets, we can also
access a portion of the string. It is called slicing the string. To extract a part
of the string, we must specify 2 integers inside square brackets.
s[i:j]
Inside the square brackets, we have two integers, i and j separated by a
colon. The expression s[i:j] is a slice of the string; it gives us a new
string object that is a copy of the portion of the string s from index i to
index j-1. Note that the first index is included while the second index is
excluded. So, the slice s[i:j] returns a new string object that contains all
the characters of string s, from index i up to but not including index j.
The original string object does not change.
>>> s = 'homogeneous'
>>> s[2:6]
Output
'moge'
The expression s[2:6] gives us a new string object that contains all the
characters of string s from index 2 to index 5. The sliced object can be
assigned to a name.
>>> s1 = s[4:7]
>>> s1
Output
'gen'
The name s1 refers to the sliced object returned by the expression s[4:7].
The original object referred to by s remains unchanged.
>>> s
Output
'homogeneous'
>>> id(s)
Output
2182966396016
Now we make the name s refer to a new sliced object.
>>> s = s[3:7]
>>> s
Output
'ogen'
>>> id(s)
Output
2182965695664
The id of s has changed, which shows that it now refers to a new object.
While writing the slicing expression, we can omit the first or the second
number or both. If we omit the first index, it is assumed to be 0, the
beginning of the string. So, the slice s[:j] indicates a part of the string s
from index 0 to index j-1. It is equivalent to writing s[0:j]. If we omit
the second index, it is assumed to be the end of the string. So, the slice
s[i:] indicates a part of the string s from index i to index n-1 where n is
the length of the string. It is equivalent to writing s[i:n].
- s[:j] Part of string s from index 0 to index j-1 (same as s[0:j])
- s[:7] Part of string s from index 0 to index 6 (same as s[0:7])
- s[i:] Part of string s from index i to index n-1 (same as s[i:n], n is length of string)
- s[3:] Part of string s from index 3 to index n-1 (same as s[3:n], n is length of string)
We can omit both the indices inside the brackets. Therefore, the slice s[:]
extracts the entire string from the beginning till the end. It gives an exact
copy of the entire string. It is the same as writing s[0:n].
- s[:] Part of string s from index 0 to index n-1 (same as s[0:n],
n is length of string).
So, when slicing from the start of the string, we can omit zero and when
slicing to the end of the string, we can omit n, as they are redundant.
>>> s = 'homogeneous'
>>> s[:4]
Output
'homo'
>>> s[5:]
Output
'eneous'
>>> s[:]
Output
'homogeneous'
Omitting both indexes gives us a string object that is an exact copy of the
string. So, if we must make a new string that is a copy of the string, we can
do it this way.
>>> scopy = s[:]
>>> scopy
Output
'homogeneous'
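One consequence of including the start index and excluding the end index is that s[:i] and s[i:] split a string at position i with no overlap and no gap. The following is a minimal sketch of that idea; the helper name split_at is made up for this illustration.

```python
def split_at(s, i):
    """Split s into the part before index i and the part from index i on."""
    return s[:i], s[i:]

s = 'homogeneous'
head, tail = split_at(s, 4)
print(head)               # homo
print(tail)               # geneous
# Joining the two halves gives back the original string.
print(head + tail == s)   # True
```

The same holds for any value of i, including values beyond the ends of the string.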
You can specify a negative index also while slicing.
- s[0:-1] Part of string s from index 0 to index -2 (same as s[0:n-1]).
The slice s[0:-1] indicates a part of the string from index 0 up to, but not including, index -1, so the last index included is -2. As we have seen earlier, writing 0 as the first integer is redundant,
so you can omit the zero and just write it as s[:-1].
The slice s[:-1] represents the whole string, excluding the last character.
If you want a part of the string that excludes the last two characters, you can
use the slice s[:-2]. In general, s[:-m] gives us a string that excludes
the last m characters.
>>> s = 'homogeneous'
>>> s[:-1]
Output
'homogeneou'
This gives a string object that contains the whole string except the last
character. If you want a string object in which the last three characters are
removed, you can write this s[:-3].
>>> s[:-3]
Output
'homogene'
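The general form s[:-m] has one edge case worth knowing: when m is 0, -0 is just 0, so s[:-0] is the same as s[:0] and gives an empty string rather than the whole string. The sketch below shows one way around this, assuming m is a non-negative integer; the helper name drop_last is hypothetical.

```python
def drop_last(s, m):
    """Return s without its last m characters (m is assumed to be >= 0)."""
    # Compute the end index explicitly instead of negating m, and never
    # let it go below 0 when m is larger than the string.
    return s[:max(len(s) - m, 0)]

s = 'homogeneous'
print(s[:-3])             # homogene
print(drop_last(s, 3))    # homogene
print(repr(s[:-0]))       # '' (same as s[:0], not the whole string)
print(drop_last(s, 0))    # homogeneous
```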
Now, let us write a slice with a negative number as the first index.
>>> s[-5:]
Output
'neous'
The slice s[-5:] starts at index -5 and goes up to the last index, so it
gives you the last 5 characters of the string. Similarly, the slice s[-3:] will
give you the last 3 characters of the string.
When both the indexes are equal, we get an empty string.
>>> s[3:3]
Output
''
We have seen that if we index a string and give an invalid index inside
square brackets, an IndexError occurs. Let us see what happens if we
provide a bad index in slicing.
>>> s[2:100]
Output
'mogeneous'
The end index is greater than the size of the string, but we did not get any
IndexError. We got a slice from index 2 to the end of the string. So, if
an index is greater than or equal to n, the length of the string, it is treated
as the end of the string. Similarly, if the first index is less than or equal to
-n, it is treated as the start of the string.
>>> s[-50:6]
Output
'homoge'
Here, the first index is assumed to be at the start of the string. You can see
that slicing is more forgiving than indexing. While indexing, if you give
such bad indexes, then you will get an error.
>>> s[100]
Output
IndexError: string index out of range
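The difference between the two behaviours can be put side by side in a short script. This is only a sketch; the try/except block is there so that the script keeps running after the error.

```python
s = 'homogeneous'

# Out-of-range slice bounds are clamped to the ends of the string.
print(s[2:100])           # mogeneous
print(s[-50:6])           # homoge
print(repr(s[50:100]))    # '' (no characters fall in that range)

# Indexing with an out-of-range position raises IndexError.
try:
    s[100]
except IndexError as exc:
    print('IndexError:', exc)   # IndexError: string index out of range
```

This is why a slice such as s[:1] is sometimes used to read the first character safely: on an empty string it gives '' instead of raising an error.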
While slicing, you can also use a third integer inside the square brackets,
which is the stride or step of the slice.
- s[i:j:k] Part of the string s from index i up to but not including index j, with a step of k.
- s[3:10:2] Part of the string containing characters at indexes 3,5,7,9.
- s[3:18:3] Part of the string containing characters at indexes 3,6,9,12,15.
- s[i:j:1] Equivalent to s[i:j]
- s[6:1:-1] Part of the string containing characters at indexes 6,5,4,3,2.
- s[20:5:-2] Part of the string containing characters at indexes 20,18,16,14,12,10,8,6.
- s[::-1] String in reverse order.
The slice s[i:j:k] extracts characters starting at index i and stopping before
index j, with each subsequent index incremented by k. When the step is omitted, it is
assumed to be 1, so s[i:j:1] is equivalent to s[i:j]. In the previous
examples that we had written, it was assumed to be 1. We can give negative
steps also. In the slice s[6:1:-1] we start at 6 and add -1 each time, so
we get indexes 6,5,4,3,2. Thus, the effect of using a negative step is that we
get the items in reverse order. The slice s[::-1] will give the whole string
in reverse order.
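If you are unsure which indexes a stepped slice will pick, you can ask Python to list them. The sketch below uses the built-in slice object and its indices() method together with range(); the helper name slice_indexes and the 26-letter sample string are assumptions made for this illustration.

```python
def slice_indexes(s, start, stop, step):
    """List the indexes that s[start:stop:step] would pick."""
    # slice.indices() resolves omitted (None), negative and out-of-range
    # bounds for a sequence of the given length.
    return list(range(*slice(start, stop, step).indices(len(s))))

s = 'abcdefghijklmnopqrstuvwxyz'   # 26 characters, chosen for illustration

print(slice_indexes(s, 3, 10, 2))    # [3, 5, 7, 9]
print(slice_indexes(s, 3, 18, 3))    # [3, 6, 9, 12, 15]
print(slice_indexes(s, 6, 1, -1))    # [6, 5, 4, 3, 2]
print(slice_indexes(s, 20, 5, -2))   # [20, 18, 16, 14, 12, 10, 8, 6]
print(s[6:1:-1])                     # gfedc
```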
>>> s = 'Today is the day.'
>>> s[3:13:2]
Output
'a ste'
Each alternate character of the string from index 3 to index 12 is displayed.
>>> s[::2]
Output
'Tdyi h a.'
Each alternate character of the whole string is displayed.
>>> s[::3]
Output
'Tait y'
The whole string is displayed with a step of three characters.
>>> s[::-1]
Output
'.yad eht si yadoT'
This gives the reverse of the whole string.
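A common use of the reversing slice is checking whether a string reads the same in both directions. A minimal sketch, with a made-up function name is_palindrome, could look like this:

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]

print(is_palindrome('level'))               # True
print(is_palindrome('Today is the day.'))   # False
```

The comparison is exact, so letter case, spaces and punctuation all count.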
String Concatenation and Repetition
We know that when the operators + and * are used on numeric types, they
add and multiply numbers. These operators can also be used on strings, but
they are interpreted differently. The operator + performs string
concatenation, and the operator * performs string repetition.
String literals or string variables can be combined by using the + operator.
>>> 'ab' + 'cd'
Output
'abcd'
>>> name = 'Dev'
>>> 'Hello' + name
Output
'HelloDev'
In the first example, we have combined two string literals. In the second one,
we have combined a string literal with a string variable. In both these cases,
a new string object is created, which is displayed at the prompt. In the
second example, no space is added between the two words. If you want a
space, you must add it explicitly.
>>> 'Hello' + ' ' + name
Output
'Hello Dev'
>>> name = name + 'raj'
>>> name
Output
'Devraj'
The asterisk symbol, when used with a string and integer, acts as a repetition
operator. We can use the repetition operator to repeat a string.
>>> name = 'Dev'
>>> name * 3
Output
'DevDevDev'
The expression name * 3 returns a string object that contains the
characters of the string name repeated three times. The integer denotes the
number of times the string is repeated. You can think of it as an abbreviation
for n times concatenation. name * 3 is the same as name + name + name. The expression 3 * name also has the same effect but is less
intuitive.
>>> 'Hello ' * 5
Output
'Hello Hello Hello Hello Hello '
>>> print('-' * 40)
Output
----------------------------------------
>>> s = 'Hee..'
>>> s = s * 3
>>> s
Output
'Hee..Hee..Hee..'
In the statement s = s * 3, we assign the string object returned by the
expression s * 3 to the variable s.
Augmented assignment syntax can be used for both concatenation and
repetition operators.
>>> s = 'butter '
>>> s += 'scotch '
>>> s
Output
'butter scotch '
>>> s *= 3
>>> s
Output
'butter scotch butter scotch butter scotch '
s += 'scotch ' is equivalent to s = s + 'scotch ', and s *= 3 is
equivalent to s = s * 3.
The augmented assignment does not make any changes to the original
object. It reassigns the variable name to a new object.
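One way to see this is to keep a second name pointing at the original string before using the augmented assignment. In the sketch below, the name alias is introduced only for this demonstration.

```python
s = 'butter '
alias = s          # a second name for the same string object
s += 'scotch '     # builds a new string and rebinds the name s

print(repr(s))       # 'butter scotch '
print(repr(alias))   # 'butter ' (the original object is untouched)
print(s is alias)    # False
```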
>>> s1 = 'Good Morning !'
>>> s2 = 'Bye Bye See you'
We have these two strings and we must make a string by concatenating the
first four characters of the first string and the first three characters of the
second string. We can do this by combining the slices of the two strings.
>>> s3 = s1[:4] + s2[:3]
>>> s3
Output
'GoodBye'
This slice s1[:4] gives a string object that contains the first four characters
of the string s1 and the slice s2[:3] gives a new object that contains the
first three characters of the string s2. When these objects are combined
using the + operator, we get a new string object assigned to the name s3.
Now, we want to make a new string from the string s1, such that the first
four characters are repeated three times and the last character is repeated
five times.
>>> s4 = s1[:4] * 3 + s1[4:-1] + s1[-1] * 5
>>> s4
Output
'GoodGoodGood Morning !!!!!'
If we assign the result to the name s1, we get the effect of changing the
string s1.
>>> s1 = s1[:4] * 3 + s1[4:-1] + s1[-1] * 5
>>> s1
Output
'GoodGoodGood Morning !!!!!'
String literals can also be combined by writing them one after the other.
>>> 'abc''def''hij'
Output
'abcdefhij'
Hence, adjacent string literals are concatenated. This feature is applicable
only for literals. You cannot join string variables or expressions by using this
feature. It is useful when you want to break long string literals.
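The sketch below illustrates the restriction. Because a line with a literal next to a variable name would stop the whole script from running, the faulty code is passed as text to the built-in compile() function so that the SyntaxError can be caught; the variable names are chosen only for this example.

```python
# Adjacent literals are joined while the source is parsed.
greeting = 'Hello, ' 'Dev'
print(greeting)   # Hello, Dev

# The same trick with a variable is not valid syntax. compile() is used
# here only so that the error can be caught and shown.
source = "name = 'Dev'\ngreeting = 'Hello, ' name"
try:
    compile(source, '<example>', 'exec')
    outcome = 'compiled'
except SyntaxError:
    outcome = 'SyntaxError'
print(outcome)    # SyntaxError

# With a variable, use the + operator instead.
name = 'Dev'
print('Hello, ' + name)   # Hello, Dev
```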
Checking Membership
The in and not in operators can be used to test for the existence of a
character or substring inside a string. The in operator returns True if a
character or substring exists in the given string; otherwise, it returns False.
The not in operator returns True if a character or substring is not present
in the string.
>>> s = 'good morning !'
>>> 'ing' in s
Output
True
>>> '?' in s
Output
False
>>> 'good morning !' in s
Output
True
>>> 'Good' in s
Output
False
>>> 'you' not in s
Output
True
>>> 'morning' not in s
Output
False
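A few details of the in operator are easy to miss: the test is case-sensitive, the empty string is found in every string, and the characters must appear next to each other in the given order. The short sketch below shows each of these; converting both sides with lower() is one possible way to ignore case.

```python
s = 'good morning !'

# Membership is case-sensitive, so convert both sides when case
# should not matter.
print('Good' in s)                   # False
print('Good'.lower() in s.lower())   # True

# The empty string counts as a substring of every string.
print('' in s)                       # True

# in looks for one contiguous substring, not a set of characters.
print('gm' in s)                     # False
```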
Adding White Spaces to Strings
You can add whitespace to your string to organize and present it in a
readable way. Whitespace in programming includes tabs, newlines and
spaces. The character combination '\n' adds a newline, and the
combination '\t' adds a tab to your string.
>>> print('Sun\tMon\tTue')
Output
Sun     Mon     Tue
>>> print('Sun\nMon\nTue\n')
Output
Sun
Mon
Tue
>>> print('Days : \n\tSun\n\tMon\n\tTue\n')
Output
Days :
        Sun
        Mon
        Tue
A single print call gives multiple lines of output due to the inclusion of
'\n' character. This way, we can generate multiple lines of output with
only a few lines of code. However, some programmers prefer writing
separate print calls as the '\n' embedded inside a string is difficult to
read.
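When a string with embedded whitespace is hard to read, it can help to look at it in both forms. In the sketch below, print() shows the rendered whitespace, while the built-in repr() shows the escape sequences as they were written; the sample string is made up for this example.

```python
s = 'Sun\tMon\nTue'

print(s)          # the tab and the newline are rendered as whitespace
print(repr(s))    # 'Sun\tMon\nTue' (escape sequences shown as written)
print(len(s))     # 11: each of \t and \n counts as a single character
```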
Creating Multi Line Strings
A string literal enclosed in single or double quotes cannot span more than
one line of a program. Such a string should be contained in a single line
only. The ending quote should appear on the same line as the starting quote.
You will get a syntax error if you try to write a multiline string inside single
or double quotes.
>>> s = 'Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.'
Output
SyntaxError: unterminated string literal (detected at line 1)
If you want a string literal that spans across multiple physical lines, you can
use the continuation character.
>>> s = 'Let us get up and get going,\
With a strong heart for whatever may come our way.\
Keep working, keep trying,\
Learn to work hard and be patient each day.'
>>> s
Output
'Let us get up and get going,With a strong heart
for whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.'
>>> print(s)
Output
Let us get up and get going,With a strong heart for
whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.
The backslash indicates that the string is continued on the next line. Now, we
could define the string literal on multiple lines, but when this string is
printed, we do not get the literal printed on different lines. To achieve this,
we can include newline characters in between the literal. We already know
that '\n' is the newline control character used to begin a new line on a
screen, so we can use it inside the string.
>>> s = 'Let us get up and get going,\n\
With a strong heart for whatever may come our way.\n\
Keep working, keep trying,\n\
Learn to work hard and be patient each day.'
The '\n' adds a newline character, and the backslash indicates that the
string is continued on the next line.
>>> print(s)
Output
Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.
A better and more common way is to use triple-quoted strings. If you put a
string literal inside triple quotes, it spans across multiple lines naturally. The
triple quotes can consist of three consecutive single quotes ('''abc''') or
three consecutive double quotes ("""abc""").
>>> s = '''Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.'''
If your literal starts with a triple quote, you can keep adding text to it on
multiple lines. The literal ends with terminating triple quotes.
>>> print(s)
Output
Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.
>>> s
Output
'Let us get up and get going,\nWith a strong heart
for whatever may come our way.\nKeep working, keep
trying,\nLearn to work hard and be patient each
day.'
When you delimit a string literal with triple quotes, the line breaks you type
inside it become newline characters in the string. When you print such a string with the
print function, you can see the original lines because each newline
character is interpreted.
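You can confirm that these newlines are ordinary characters in the string by counting them or by splitting the string on them. The sketch below uses the string methods count() and splitlines(), which are not covered in this article but are part of every Python string.

```python
s = '''Let us get up and get going,
With a strong heart for whatever may come our way.
Keep working, keep trying,
Learn to work hard and be patient each day.'''

print(s.count('\n'))      # 3 newlines separate the 4 lines
lines = s.splitlines()    # break the string at those newlines
print(len(lines))         # 4
print(lines[0])           # Let us get up and get going,
```

There is no newline after the last line here because the closing triple quote sits on the same line as the text.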
When we used backslash to join the lines, then the newline was not added
automatically. If you want to prevent some newlines in a triple-quoted string,
add a backslash at the end of those particular lines.
>>> s = '''Let us get up and get going,
With a strong heart for whatever may come our way.\
Keep working, keep trying,
Learn to work hard and be patient each day.'''
>>> print(s)
Output
Let us get up and get going,
With a strong heart for whatever may come our way.Keep working, keep trying,
Learn to work hard and be patient each day.
Python supports triple-quoted strings so that we can write multiline strings.
Using triple quotes improves the readability of long multiline strings in the
source code. Generally, these are used in docstrings, which we will discuss
later. Another advantage of triple-quoted strings is that we can use them to
write string literals that have to include both single and double quotes.
>>> print('''My height is 5'3" ''')
Output
My height is 5'3"
We have seen that in Python, adjacent string literals are concatenated. If we
place more than one string literal adjacent to each other on a line with
optional whitespace in between, then they will be automatically
concatenated.
>>> s = ('Let us get up and get going,'
'With a strong heart for whatever may come our way.'
'Keep working, keep trying,'
'Learn to work hard and be patient each day. ')
>>> print(s)
Output
Let us get up and get going,With a strong heart for
whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.
This can be another way of writing strings that span multiple lines. This
approach does not add any newline characters in the string. If you need
newlines, you need to add the newline character explicitly in the literals.
This approach can be helpful if you need to add comments to separate lines
of the string.
>>> s = ('Let us get up and get going,'
'With a strong heart for whatever may come our way.' # prepared for anything
'Keep working, keep trying,'
'Learn to work hard and be patient each day. ') # patience is the key
>>> print(s)
Output
Let us get up and get going,With a strong heart for
whatever may come our way.Keep working, keep
trying,Learn to work hard and be patient each day.
The comments are not included in the string. We do not see them when we
print the string. In triple-quoted strings, if you try to add comments like this,
those comments will be added to the string.
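The contrast can be checked directly. In this sketch, the names joined and triple and the shortened text are chosen only for the example.

```python
joined = ('Keep working, '      # this comment is ignored
          'keep trying')        # so is this one

triple = '''Keep working, # this text stays in the string
keep trying'''

print(joined)          # Keep working, keep trying
print('#' in joined)   # False
print('#' in triple)   # True
```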
In the previous section, we saw that to add a multiline comment, we
had to precede each line with a # sign. We can also use triple single quotes or
triple double quotes to insert multiline comments in our code.
# This is a multiline comment
# It explains the code
# It has no effect on the code
''' This is also a multiline comment
It explains the code
It has no effect on the code
'''
The triple quoted string is written all by itself. We are not printing it or
assigning it to any variable. It is an unused string, so we can use it as a
comment. However, this style of writing comments is not recommended, and in most places you will find comments that use the # sign. Triple-quoted
strings are used for docstrings, which we will discuss later.
Code bundle here:
https://github.com/codeaihub1998/Ultimate_Python_1-to-21