+27833657551     admin@starscientist.org

Introduction to Python (part-2)

Python for Beginners (part-2)

This is a continuation of the Python for beginners lessons from the previous page. We have seen how to install Python, how to use the Python console, and how to get help. I have also introduced you to Python identifiers and variables. In this section, we will look into the different Python data types and their methods.

Getting started with Python (part-2)

Python Data types

There are two main categories of Python data types: immutable and mutable. Immutable objects are objects whose values cannot be changed or updated once they are created. Conversely, mutable objects can be changed or updated after creation. Whether an object is mutable depends on its data type. For example, str, tuple, int, float, complex, bool, range, frozenset, bytes, and decimal.Decimal are immutable. Mutable types include list, dict, set, bytearray, and NumPy arrays. We will learn more about them in the next sections.
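
The difference can be sketched with the built-in id() function, which returns the identity of an object. The variable names below are only illustrative.

```python
# A list is mutable: it can be changed in place and keeps the same identity.
numbers = [1, 2, 3]
list_id_before = id(numbers)
numbers.append(4)                      # modifies the existing list object
same_list = id(numbers) == list_id_before

# A tuple is immutable: "changing" it really builds a new object.
point = (1, 2)
original_point = point                 # second reference to the same tuple
point = point + (3,)                   # creates a new tuple and rebinds the name

print(numbers)          # [1, 2, 3, 4]
print(same_list)        # True
print(original_point)   # (1, 2)
print(point)            # (1, 2, 3)
```

The original tuple is untouched, while the list was modified without creating a new object.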

Python Numeric Types

Numeric types fall under the categories of immutable Python data types. There are three distinct numeric types:

  Integers: represent whole numbers (positive, negative, or zero) without a fractional part
Example: …, -10, -9, -8, …, 0, 1, …, 8, 9, 10, … Note that other number systems are also supported: base 2 (binary), base 8 (octal), and base 16 (hexadecimal). Integers are of type int in Python 3.

>>>type(-10000)
<class 'int'>
>>>0b1100100 # Binary
100
>>>type(0b1100100) 
<class 'int'>
>>>0o144 # Octal 
100
>>>type(0o144)
<class 'int'>
>>>0x64 # Hexadecimal
100
>>>type(0x64)
<class 'int'>

Notice that we had to put a prefix before the numbers to distinguish them from ordinary base 10 numbers: 0b for binary, 0o for octal, and 0x for hexadecimal.
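
Going the other way is also possible. As a short sketch, the built-in functions bin(), oct(), and hex() return the string representation of an integer in another base, and int() accepts a base as its second argument:

```python
value = 100

# Built-in functions that return the string representation in another base.
as_binary = bin(value)   # '0b1100100'
as_octal = oct(value)    # '0o144'
as_hex = hex(value)      # '0x64'

# int() parses a string written in a given base back into an integer.
from_binary = int("1100100", 2)
from_hex = int("0x64", 16)   # the 0x prefix is accepted when the base is 16

print(as_binary, as_octal, as_hex)
print(from_binary, from_hex)   # 100 100
```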

  Floating point numbers: represent any real number with a decimal point
Example: 3.5, 3.5e+10, 53E-10, 53E+10. Floating point numbers are of type float in Python.

>>>3.5
3.5
>>>type(3.5)
<class 'float'>
>>>53E+10
530000000000.0
>>>type(53E+10)
<class 'float'>

  Complex numbers: represent numbers of the form a + jb, where a and b are real numbers and j is the imaginary unit with j² = -1. Python complex numbers are written as a + bj or a + bJ. They can also be created using the complex() class. Complex numbers are of type complex.

>>>1 + 5j
(1+5j)
>>>1 + 5J
(1+5j)
>>>complex(1, 5)
(1+5j)
>>>type(1 + 5J)
<class 'complex'>

The real part and the imaginary part of a complex variable c can be obtained using c.real and c.imag, respectively.

>>>c = 1 + 5j
>>>c.real
1.0
>>>c.imag
5.0

Boolean types

The Python boolean data type has two possible values: True and False. The comparison operators ==, !=, >, >=, <, and <= return a boolean value. Expressions that return a boolean value are called boolean expressions.

>>>5 == 5
True
>>>type(True)
<class 'bool'>
>>>3 != 5
True
>>>3 > 5
False
>>>3 < 5 
True
>>>5 <= 5 
True

Note that True has a value of 1 and False has a value of 0 internally. Therefore, it is perfectly fine to do the following:

>>>complex(True, False)
(1+0j)
>>>True + 10
11
>>>True + False 
1
>>>True * False 
0
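
One place where this is handy is counting how many items satisfy a condition. A minimal sketch, using made-up sample data:

```python
temperatures = [18.5, 25.1, 30.2, 22.0, 31.7]   # made-up sample data

# Each comparison yields True or False; sum() adds them up as 1s and 0s.
hot_days = sum(t > 30 for t in temperatures)

print(hot_days)                  # 2
print(isinstance(True, int))     # True: bool is a subclass of int
```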

Logical operators are typically (but not necessarily) used with boolean expressions to yield a single boolean value (True or False). There are three logical operators in Python: and, or, not.

>>>(1 == 1) or (3 > 24)
True
>>>(1 == 1) or (3 < 24)
True
>>>(1 == 1) and (3 < 24)
True
>>>(1 == 1) and (3 > 24)
False
>>>not (1 == 1)
False

The or operator returns True if any of the boolean operands is True. More generally, Python evaluates the operands from left to right, returns the value of the first operand that evaluates to True, and skips the rest. If every operand evaluates to False, or yields the value of the last operand.

>>>(1==3) or (1==1) or (5 <10) 
True
# (1 ==3) is False, (1==1) is True so Python does not care about evaluating (5<10).
>>>False or True or False
True
>>> False or False or False
False
>>>() or 10 or {}
10
# () is False, 10 is True, {} is False and thus the result is 10
>>>() or {} or []   
[]
# () is False, {} is False, [] is False. Since all expressions are False, Python returns the last operand which is [].
>>>0 or 1 or 20
1
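
A common use of this behaviour is supplying a fallback value. The function below is a hypothetical example, not part of any library:

```python
def greet(name=""):
    # An empty string is falsy, so `or` falls through to the fallback value.
    display_name = name or "stranger"
    return "Hello, " + display_name + "!"

print(greet("Ianja"))   # Hello, Ianja!
print(greet())          # Hello, stranger!
```

Keep in mind that every falsy value (0, an empty list, None, and so on) triggers the fallback, which may not be wanted when 0 is a legitimate input.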

The and operator returns False if any of the boolean operands is False. More generally, Python yields the value of the first operand that evaluates to False and does not evaluate the rest of the expressions. If every operand evaluates to True, and returns the value of the last operand.

>>>(1==3) and (1==1) and (5 <10) 
False
#(1==3) is False and thus Python stops evaluating the rest of the expressions
>>>False and True and False
False
>>>True and True and True
True
>>>() and 10 and {} 
()
>>>() and {} and [] 
()
>>>0 and 1 and 20
0
>>>30 and 1 and 20
20
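
Because the right-hand side is skipped when the left-hand side is falsy, and can guard an operation that would otherwise fail. A small sketch with a hypothetical helper function:

```python
def safe_ratio(numerator, denominator):
    # If denominator is 0, `and` returns 0 immediately and the division
    # on the right-hand side is never evaluated.
    return denominator and numerator / denominator

print(safe_ratio(10, 4))   # 2.5
print(safe_ratio(10, 0))   # 0
```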

The not operator reverses the truth value of an expression. Therefore, not True is False and not False is True.

>>>not True 
False
>>>not False
True
>>>not {}
True
>>>not 0
True
>>>not 10
False

Strings

Strings are sequences of characters enclosed in single quotes, double quotes, or triple quotes. They are usually used to store text, such as the names of people, things, or animals.

>>>myname = "Ianja"
>>>type(myname)
<class 'str'>
>>>message = "Python is a versatile programming language!"

To create multi-line strings, triple quotes can be used.

multi_line_string = """This is a multi-line string created using triple quotes.
This is the second line.
And this is the third line."""
print(multi_line_string)

Upon executing the above code, we get the following output

This is a multi-line string created using triple quotes.
This is the second line.
And this is the third line.
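
When a triple-quoted string is written inside indented code, any indentation on its continuation lines becomes part of the string. One option, sketched here, is textwrap.dedent() from the standard library, which strips the leading whitespace common to all lines. The backslash after the opening quotes suppresses the initial newline.

```python
import textwrap

# Every line of the literal is indented by four spaces; dedent() removes them.
text = textwrap.dedent("""\
    This is the first line.
    This is the second line.
    And this is the third line.""")

print(text)
```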

Multi-line strings can also be created using single quotes or double quotes. In this case, the \n escape character must be used:

multi_line_string = "This is a multi-line string created using double-quotes.\nThis is the second line.\nAnd this is the third line."
print(multi_line_string)

This produces the following:

This is a multi-line string created using double-quotes.
This is the second line.
And this is the third line.

The problem with the above expression is that the line is very long. The Python style guide (PEP 8) suggests a maximum line length of 79 characters. To comply with this guideline, we can split the string into adjacent string literals enclosed in round brackets (). Python joins them into a single string, so we get the same output as above:

multi_line_string = ("This is a multi-line string created using double-quotes.\n"
                     "This is the second line.\n"
                     "And this is the third line.")
print(multi_line_string)

Strings are immutable Python data types and thus cannot be changed once they are created. Let’s see this in the following code.

>>> strname = "This is a string"
>>> strname.upper() # makes all characters uppercase
'THIS IS A STRING'
>>> strname # the original string has not been changed
'This is a string'
>>> strname[0] = 'A' # Let's try to change the first character
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
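
Since a string cannot be modified in place, the usual approach is to build a new string instead. A minimal sketch, reusing the strname variable from the session above:

```python
strname = "This is a string"

# Build a new string from a replacement character plus a slice of the old one.
new_name = "A" + strname[1:]

print(new_name)   # Ahis is a string
print(strname)    # This is a string (the original is unchanged)
```
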

List of String Methods in Python 3

Each entry below gives the method, its description, and examples.
capitalize() Converts the first character of a string to uppercase and the remaining characters to lowercase.

>>>strg = "this is a string"
>>>strg.capitalize()
"This is a string"
casefold() Converts all characters in a string to lowercase. It is more aggressive than lower(), which makes it suitable for case-insensitive comparisons.

>>>strg = "The Universe Is Beautiful"
>>>strg.casefold()
"the universe is beautiful"
center() Syntax: string.center(length, fillchar). Returns a new string of length length in which the original string is centered and padded on both sides with fillchar. If fillchar is omitted, spaces are used.

>>>strg = "The universe is beautiful" 
>>>strg.center(50, "-")
"------------The universe is beautiful-------------"
count() Syntax: string.count(substring, start, end). Returns the number of occurrences of a character or a substring within a string. Note that start and end are optional. They indicate the starting index and the stopping index within which the search is performed.

>>>strg = "stars are shiny, stars are many"
>>>strg.count("stars")
2
>>>strg.count("stars", 0, 16)
1
encode() Syntax: string.encode(encoding, errors). Encodes a character string into a bytes object. The encoding is specified by encoding (e.g., "utf-8", "ascii", "latin1"); the default is "utf-8". The error handling scheme is specified by errors. Possible values are "strict", "replace", "ignore", "xmlcharrefreplace", "backslashreplace", and any other name registered via codecs.register_error(). The default is "strict", which raises a UnicodeError when a character cannot be encoded.

>>> "héliosphère".encode("utf-8")
b'h\xc3\xa9liosph\xc3\xa8re'
endswith() Syntax: string.endswith(suffix, start, end). Returns True if the string ends with the specified suffix and False otherwise. The starting index and the stopping index are optional.

>>> strg = "Are we alone in the Universe?"
>>> strg.endswith("?")
True
expandtabs() Syntax: string.expandtabs(tabsize). Replaces every tab character \t in a string with spaces up to the next tab stop. The number of spaces that replaces each \t depends on its position within the string and on the tab size specified by tabsize. The default tab size is 8. Note that this method is NOT equivalent to replacing every \t with tabsize spaces.
>>>strg = "abcde\tfgt\tddfg\ty"
>>>strg.expandtabs(4)
"abcde   fgt ddfg    y"
As you can see, the number of spaces allocated to each \t character is not the same: the first \t received 3 spaces, the second 1 space, and the last 4 spaces. This can be understood as follows. The tabsize is 4, so the tab stops are at columns 4, 8, 12, 16, 20, … (multiples of tabsize, counting from 0), and each \t is replaced by just enough spaces for the next character to start at the following tab stop. The first \t is at index 5, so it becomes 3 spaces (indices 5 to 7) and "fgt" starts at column 8. The second \t is then at index 11, so it becomes 1 space and "ddfg" starts at column 12. The last \t is at index 16, so it becomes 4 spaces (indices 16 to 19) and "y" lands at column 20. Try to guess the result of the following code:

>>>"abcdefg\tabc\ta&".expandtabs(5)
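
The tab-stop rule described above can be sketched as a small function. This is an illustrative re-implementation (the name expand_tabs_sketch is hypothetical) that assumes tabsize is a positive integer; it is not how Python implements the method internally.

```python
def expand_tabs_sketch(text, tabsize=8):
    """Replace each tab with spaces up to the next multiple of tabsize."""
    result = []
    column = 0                      # current column in the output line
    for ch in text:
        if ch == "\t":
            # Number of spaces needed to reach the next tab stop.
            spaces = tabsize - (column % tabsize)
            result.append(" " * spaces)
            column += spaces
        elif ch in "\n\r":
            result.append(ch)
            column = 0              # a new line restarts the column count
        else:
            result.append(ch)
            column += 1
    return "".join(result)

sample = "abcde\tfgt\tddfg\ty"
print(expand_tabs_sketch(sample, 4) == sample.expandtabs(4))   # True
```
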
find() Syntax: string.find(substring, start, end). Returns the position (index) of a character or substring within a string. The starting index and the stopping index are optional. By default, the search is performed within the entire string. If the specified character or substring is not found, -1 is returned. Moreover, if the specified character or substring occurs more than once, the position of the first occurrence within the searched range is returned.

>>>strg = "abcde"
>>>strg.find("e")
4
>>>strg = "abcde ghi"
>>>strg.find("g")
6
>>>strg = "abcdefghie"
>>>strg.find("e")
4
>>>strg.find("e", 5)
9
>>>strg.find("m")
-1
format() Replaces the format fields (the {} placeholders) within a string with the supplied arguments and returns the formatted string.

>>>strg = "{0} rounded to 2 decimal places = {1:.2f}"
>>>strg.format(0.1345, 0.1345)
"0.1345 rounded to 2 decimal places = 0.13"
>>>strg = "Formatting {0} with 6 decimal places = {1:.6f}"
>>>strg.format(0.1345, 0.1345)
"Formatting 0.1345 with 6 decimal places = 0.134500"
>>>strg = "Displaying 0.1345 using 10 characters, two decimal places:\n {0: 10.2f}"
>>>strg_formatted = strg.format(0.1345)
>>>print(strg_formatted)
Displaying 0.1345 using 10 characters, two decimal places:
       0.13
# More examples
>>>"{:d}".format(4)
"4"
>>>"{:+d}".format(4)
"+4"
# right-aligned within 12 characters
>>>"{:>12d}".format(4)
"           4"
# left-aligned within 12 characters
>>>"{:<12d}".format(4)
"4           "
# replacing default blank spaces with *
>>>"{:*<22}".format("Galaxies have many ")
"Galaxies have many ***"
>>>"{:*>25}".format(" are many in galaxies.")
"*** are many in galaxies."
>>>"{:+^34}".format("I am in the middle")
"++++++++I am in the middle++++++++"
# Using keyword arguments
>>>strg = "My name is {name}, I play {instrument}"
>>>strg.format(name = "Ianja", instrument="the piano")
"My name is Ianja, I play the piano"
# Using a dictionary, in this case format_map() is better
>>>strg = "My name is {name}, I play {instrument}"
>>>mapping = {"name": "Ianja", "instrument": "the piano"}
>>>strg.format(**mapping)
"My name is Ianja, I play the piano"
format_map() Syntax: string.format_map(mapping). Similar to format(**mapping) but format_map() uses the mapping parameter directly without copying it to a dictionary. In addition, format_map() can handle missing values (KeyError) by defining the __missing__ method of the mapping class.

>>>student = {'name': 'John', 'age':20, 'country': 'Madagascar'}
>>>strg = "{name} is {age} years old."
>>>strg.format_map(student)
"John is 20 years old."
>>>strg = "{name} works as a {job}"
>>>strg.format_map(student)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'job'
>>>class MissingKeyHandler(dict):
...    def __missing__(self, key):
...        return key + " not in the data base yet."
...
>>>strg.format_map(MissingKeyHandler(student))
"John works as a job not in the data base yet."
# Using nested dictionaries.
>>>student = {"John": {"age": 20, "country": "Madagascar"}, "Mozart": {"age":20, "country": "Germany"}}
>>>strg = "John is {age} years old."
>>>strg.format_map(student["John"])
"John is 20 years old."
index() Syntax: string.index(substring, start, end). Same as the find() method except that index() raises a ValueError if the substring is not found. Recall that find() returns -1 in this case.

>>>strg = "abcdefghie"
>>>strg.index("e")
4
>>>strg.index("t")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
isalnum() Syntax: string.isalnum(). Returns True if all the characters within the string are alphanumeric (letters or digits) and the string is not empty. Returns False if at least one character is not alphanumeric (e.g., the space character or any of #$%&*^!~{}[]).

>>>"galaxy10".isalnum()
True
>>>"galaxy 10".isalnum()
False
>>>"galaxy&".isalnum()
False
>>>"galaxy".isalnum()
True
>>>"123".isalnum()
True
isalpha() Syntax: string.isalpha(). Returns True if all characters in the string are letters of the alphabet and False otherwise.

>>>strg = "galaxy"
>>>strg.isalpha()
True
>>>strg = "galaxy1"
>>>strg.isalpha()
False
>>>strg = "galaxy "
>>>strg.isalpha()
False
isdecimal() Syntax: string.isdecimal(). Returns True if all characters within the string are decimal characters (such as the digits 0 to 9) and False otherwise.

>>>strg = "78"
>>>strg.isdecimal()
True
>>>strg = " 78"
>>>strg.isdecimal()
False
>>>strg = "galaxy78"
>>>strg.isdecimal()
False
>>>strg = "78.5"
>>>strg.isdecimal()
False
isdigit() TBD

isidentifier() TBD

islower() Returns True if all the letters in the string are lowercase, and False otherwise.

>>>strg = "test lowercase"
>>>strg.islower()
True
>>>strg = "Test lowercase"
>>>strg.islower()
False
>>>strg = "test lowercase num 2"
>>>strg.islower()
True
>>>strg = "test Lowercase num 2"
>>>strg.islower()
False
isnumeric() TBD

isprintable() TBD

isspace() Returns True if only whitespace characters are found in the string, otherwise False.

>>>strg=""
>>>strg.isspace()
False
>>>strg=" "
>>>strg.isspace()
True
>>>strg=" 2"
>>>strg.isspace()
False
>>>strg="    "
>>>strg.isspace()
True
istitle() TBD

isupper() Returns True if all the letters in the string are uppercase, and False otherwise.

>>>strg = "UNIVERSE"
>>>strg.isupper()
True
>>>strg = "Universe"
>>>strg.isupper()
False
>>>strg = "END 2 END"
>>>strg.isupper()
True
>>>strg = "END 2 end"
>>>strg.isupper()
False
join() Usage: strg1.join(strg2). Concatenates the elements of strg2, which can be any iterable of strings (a string, a tuple, a list, …), using strg1 as the separator.

>>>strg = "World"
>>>"-".join(strg)
"W-o-r-l-d"
>>>strg = ("Hello", "World")
>>>"-".join(strg)
"Hello-World"
>>>" ".join(strg)
"Hello World"
>>>strg = "Hello"
>>>strg2 = "World"
>>>strg.join(strg2)
"WHellooHellorHellolHellod"
ljust() Syntax: strg.ljust(width[,fillchar]); returns a left-justified string of specified width (length). The remaining space is filled by fillchar if the latter is specified. Note that the length of fillchar must be 1. If width is less than or equal to the length of the original string, then strg.ljust() just returns the original string. If width is greater than the length of the original string, then the length of the returned string is equal to width.

>>>strg = "Universe"
>>>strg.ljust(11, "*")
"Universe***"
>>>strg.ljust(11) # no fillchar is provided so space is used as the default fillchar.
"Universe   "
>>>len(strg.ljust(11))
11
>>>strg.ljust(4, "*") # 4 is less than len(strg) so the original string is returned.
"Universe"
>>>strg.ljust(11, "***") # len("***") is greater than 1, only one character should be given.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: The fill character must be exactly one character long.
lower() Converts all uppercase letters in the string to lowercase.

>>>strg = "Universe"
>>>strg.lower()
"universe"
lstrip() Syntax: strg.lstrip([chars]). Removes characters from the beginning of the string until a character not found in chars is encountered. Note that chars is treated as a set of characters, not as a prefix. chars is optional; if omitted, all leading whitespace is removed.

>>>strg = "*****Stars are many**"
>>>strg.lstrip("*")
"Stars are many**"
>>>strg = "   There is no empty space in the Universe"
>>>strg.lstrip()
"There is no empty space in the Universe"
>>>strg = "   There is no empty space in the Universe"
>>>strg.lstrip("There is")
"no empty space in the Universe"
>>>strg = "****There are many stars in the Universe"
>>>strg.lstrip("There are")
"****There are many stars in the Universe"
>>>strg = "****There are many stars in the Universe"
>>>strg.lstrip("*There are")
"many stars in the Universe"
maketrans() Syntax: str.maketrans(x[, y[, z]]). Makes a translation table, which can be used by the translate() method to replace and/or delete characters in the original string. Some rules: if only one argument is specified, it must be a dictionary. In this case, the keys specify the characters to be replaced in the original string; each key must be a single character (or its Unicode code point as an integer). The dictionary’s values specify the corresponding replacements and must be of type int, None, or str. If a key has the value None, the matching character is deleted from the string. If two arguments are specified, they must be strings of the same length. In this case, each character in the first string is replaced by the character at the same index in the second string. The optional third argument is a string whose characters are deleted.

>>>strg = "123to135 km/s"
>>>table = str.maketrans({"t":"-", "o":None})
>>>strg.translate(table)
'123-135 km/s'
>>>table = str.maketrans("t","-","o")
>>>strg.translate(table)
'123-135 km/s'
>>>table = str.maketrans("tkms","-mih","o")
>>>strg.translate(table)
'123-135 mi/h'
>>>table = str.maketrans({"t":" km/s - ", "o":None})
>>>strg.translate(table)
'123 km/s - 135 km/s'
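
As a further sketch, the three-argument form is a compact way to delete a whole set of characters. The example below uses string.punctuation from the standard library; the sentence is made up.

```python
import string

sentence = "Stars, gas, and dust: the ingredients of galaxies!"

# Three-argument form: nothing to replace, delete every punctuation character.
table = str.maketrans("", "", string.punctuation)
cleaned = sentence.translate(table)

print(cleaned)   # Stars gas and dust the ingredients of galaxies
```
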
partition() Syntax: strg.partition(separator_chars). Returns a tuple with three string elements. The first element is the string before the first occurrence of separator_chars. The second element is separator_chars itself, and the third element is the string after the first occurrence of separator_chars. If separator_chars is not found in strg, then the first element of the tuple is the original string, and the last two elements are empty strings.

>>>strg = "There are many stars in galaxies; stars are born in molecular clouds. "
>>>strg.partition("stars")
("There are many ", "stars", " in galaxies; stars are born in molecular clouds. ")
>>>strg.partition("planets")
("There are many stars in galaxies; stars are born in molecular clouds. ", "", "")
replace() Syntax: strg.replace(oldchar, newchar[, num_occurrence]). Replaces the first num_occurrence occurrences of oldchar in strg with newchar. If num_occurrence is omitted, all occurrences of oldchar in strg are replaced by newchar.

>>>strg = "How does the Universe work? the Universe is so fascinating!"
>>>strg.replace("the Universe", "it")
"How does it work? it is so fascinating!"
>>>strg.replace("the Universe", "it", 1)
"How does it work? the Universe is so fascinating!"
rfind() Syntax: strg.rfind(chars[, begin=0, end=len(strg)]). Returns the index of the last occurrence of chars in strg, or -1 if chars is not found. The search can be restricted to the range between begin and end if specified.

>>>strg = "Find the index of the last occurence of index within this string."
>>>strg.rfind("index")
40
>>>strg.rfind("index", 0, 40)
9
>>>strg.rfind("notfound")
-1
rindex() Syntax: strg.rindex(chars[, begin=0, end=len(strg)]). Returns the index of the last occurrence of chars in strg. If chars is not found, a ValueError is raised.

>>>strg = "Find the index of the last occurence of index within this string."
>>>strg.rindex("index")
40
>>>strg.rindex("index", 0, 40)
9
>>>strg.rindex("notfound")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: substring not found
rjust() Syntax: strg.rjust(width[,fillchar]). Returns a right-justified string. The length of the returned string is specified by width. If width is less than or equal to the length of the original string, then no change is made. If width is greater than the length of strg, the remaining space on the left side of the string is filled with fillchar (a space if fillchar is omitted).
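
A short sketch of rjust(), mirroring the ljust() examples; the numbers in the last part are made up:

```python
strg = "Universe"

print(strg.rjust(11, "*"))   # ***Universe
print(strg.rjust(4, "*"))    # Universe (width is smaller than len(strg))

# Right-justifying numbers so that they line up in a column of width 6.
column = [str(n).rjust(6) for n in (5, 42, 1234)]
for cell in column:
    print(cell)
```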

rpartition() Syntax: strg.rpartition(separator_chars). Returns a tuple with three string elements. The first element is the string before the last occurrence of separator_chars. The second element is separator_chars itself, and the third element is the string after the last occurrence of separator_chars. If separator_chars is not found in strg, then the first two elements of the tuple are empty strings, and the last element is the original string.

>>>strg = "There are many stars in galaxies; stars are born in molecular clouds. "
>>>strg.rpartition("stars")
("There are many stars in galaxies; ", "stars", " are born in molecular clouds. ")
>>>strg.rpartition("planets")
("", "", "There are many stars in galaxies; stars are born in molecular clouds. ")
rsplit() TBD

rstrip() TBD

split() Syntax: strg.split([chars]). Splits the string on whitespace characters, including the newline character “\n” and the tab character “\t”. If the optional argument chars is specified, then the splitting is done on the specified substring chars. The returned type is a list.

>>>strg = "The Universe is expanding"
>>>strg.split()
['The', 'Universe', 'is', 'expanding']
>>>strg.split('e')
['Th', ' Univ', 'rs', ' is ', 'xpanding']
>>>strg = "The Universe is expanding, \n and this is because of dark energy"
>>>strg.split()
['The', 'Universe', 'is', 'expanding,', 'and', 'this', 'is', 'because', 'of', 'dark', 'energy']
>>>strg.split('because')
['The Universe is expanding, \n and this is ', ' of dark energy']
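
Two details are easy to miss, so here is a brief sketch with made-up input: calling split() without an argument is not the same as split(" "), and split() also accepts an optional second argument, maxsplit, that limits the number of splits.

```python
line = "name:  Ianja   Rakoto"   # made-up input with repeated spaces

# No argument: runs of whitespace count as a single separator.
words = line.split()
print(words)                 # ['name:', 'Ianja', 'Rakoto']

# Explicit separator: every single space splits, producing empty strings.
pieces = line.split(" ")
print(pieces)                # ['name:', '', 'Ianja', '', '', 'Rakoto']

# maxsplit=1 splits only at the first separator.
key, value = "path=/usr/bin=old".split("=", 1)
print(key, value)            # path /usr/bin=old
```
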
splitlines() TBD

startswith() Syntax: strg.startswith(chars[, begin=0, end=len(strg)]). Returns True if the string (strg) starts with the specified prefix chars and False otherwise. The search can be restricted using the optional arguments begin and end.

>>>strg = "Where do galaxies get their gas?"
>>>strg.startswith("W")
True
>>>strg.startswith("K")
False
>>>strg.startswith("W", 1) # we begin the search from index 1
False
>>>strg.startswith("W", 0) # we begin the search from index 0
True
>>>strg.startswith("h", 1) # we begin from index 1, and strg[1] = "h" so the result is True
True
strip() Syntax: strg.strip([chars]). Removes leading and trailing characters specified by the optional argument chars. chars is treated as a set: any leading or trailing character that appears in chars is removed, in any order or combination. If chars is omitted, leading and trailing whitespace is removed.

>>>strg = "  Where do galaxies get their gas?  "
>>>strg.strip()
'Where do galaxies get their gas?'
>>>strg = "***Where do galaxies get their gas?***"
>>>strg.strip("*")
'Where do galaxies get their gas?'
>>>strg = "###This is the comment#1"
>>>strg.strip("#")
'This is the comment#1'
>>>strg = "#The script is called script.pynb"
>>>strg.strip("#bypn.")
'The script is called script'
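
Because strip() and lstrip() treat chars as a set of characters, they can remove more than intended when the goal is to drop one exact prefix. The helper below is a hypothetical sketch of prefix removal built from startswith() and slicing; from Python 3.9 onwards the built-in str.removeprefix() does the same job.

```python
def remove_prefix_sketch(text, prefix):
    # Drop `prefix` only when `text` starts with that exact substring.
    if text.startswith(prefix):
        return text[len(prefix):]
    return text

strg = "There he is"

stripped = strg.lstrip("There ")                    # character-set behaviour
unprefixed = remove_prefix_sketch(strg, "There ")   # exact-prefix behaviour

print(stripped)     # is
print(unprefixed)   # he is
```
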
swapcase() TBD

title() TBD

translate() Syntax: strg.translate(table). Returns a string whose characters have been mapped using the translation table table. This can be used to replace/delete characters in a string. See examples given in maketrans().
upper() Converts all lowercase letters in the string to uppercase.

>>>strg = "Universe"
>>>strg.upper()
"UNIVERSE"
zfill() Syntax: strg.zfill(length). Returns a new string of the length specified by length, where length must be of type int. The left side of the new string is filled with 0s to reach the specified length. If the string starts with a plus sign or a minus sign, the 0s are inserted after the sign. If length is less than or equal to the length of the original string, then no change is made.

>>>strg = "345"
>>>strg.zfill(13)
'0000000000345'
>>>strg.zfill(1)
'345'
>>>strg = "+345"
>>>strg.zfill(13)
'+000000000345'
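
A typical use, sketched here with hypothetical file names, is zero-padding numbers so that the resulting strings sort in numerical order:

```python
# Hypothetical file names for a numbered series of images.
filenames = ["frame_" + str(i).zfill(4) + ".png" for i in (1, 25, 300)]
print(filenames)   # ['frame_0001.png', 'frame_0025.png', 'frame_0300.png']

# Strings are sorted character by character, so unpadded numbers sort oddly.
unpadded = sorted(["10", "9", "100"])
padded = sorted(s.zfill(3) for s in ["10", "9", "100"])
print(unpadded)    # ['10', '100', '9']
print(padded)      # ['009', '010', '100']
```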

Python Compound Data Types

Lists

Python lists are ordered and mutable data types. They can be created by placing items inside square brackets: items = [item1, item2, item3, …]. The items inside [] can be of any data type. Therefore, a list can be placed inside a list.

>>> mylist = [1, 4.5, True, complex(1,2)]
>>> mylist
[1, 4.5, True, (1+2j)]
>>> type(mylist)
<class 'list'>
>>> list_of_list = [1,2,3,[],[2,3,5]]
>>> list_of_list
[1, 2, 3, [], [2, 3, 5]]
>>> mylist = ["ianja", 50]
>>> mylist.count("ianja")
1

Lists have many interesting properties and built-in methods.
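
As a first taste, here is a minimal sketch of a few list operations that rely on mutability. It reuses the lists from the session above, with "star" and "x" added as made-up items.

```python
mylist = [1, 4.5, True, complex(1, 2)]

mylist.append("star")        # add one item at the end
mylist[0] = 100              # item assignment works because lists are mutable
first_two = mylist[:2]       # slicing returns a new list

nested = [1, 2, 3, [], [2, 3, 5]]
nested[3].append("x")        # the inner list can be changed too

print(mylist)        # [100, 4.5, True, (1+2j), 'star']
print(first_two)     # [100, 4.5]
print(nested)        # [1, 2, 3, ['x'], [2, 3, 5]]
print(len(nested))   # 5
```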