More on Python Basics¶
The content and the code excerpts have been derived from the following sources:
- 7 Python Random Module Functions You Should Know. https://www.techbeamers.com/using-python-random/
- What's the difference between Randrange and Randint functions. https://www.codecademy.com/en/forum_questions/521bcf2b548c359b28000367
- Python Regular Expressions https://www.tutorialspoint.com/python/python_reg_expressions.htm
- Python Regular Expression Tutorial https://www.datacamp.com/community/tutorials/python-regular-expression-tutorial
- Regular Expressions https://developers.google.com/edu/python/regular-expressions
- Map Filter and Reduce http://book.pythontips.com/en/latest/map_filter.html
- Python File Handling https://www.pythonforbeginners.com/cheatsheet/python-file-handling
Random Functions (from the Random Module)¶
Random values can be generated using the following functions of the random module in Python.
1. Randrange() Function.¶
Syntax: randrange(stop) or randrange(start, stop[, step]). The stop value sets the upper boundary of the range and is itself excluded.
In [3]:
import random
print(random.randrange(999))
In [6]:
print(random.randrange(0, 1000, 20)) # generates a random multiple of 20 from 0 to 980
In [10]:
print(random.randrange(20, 25)) # generates an integer from 20 to 24 (25 is excluded)
In [50]:
print(random.randrange(0.2, 0.5)) #randrange takes only integer arguments... hence the ERROR
2. Random.Randint(Low, High) Function.¶
The randint() function is one of many functions which handle random numbers. It has two parameters low and high and generates an integer between low and high, inclusive.
In [16]:
i = 0
while i < 5:
    # Get random number in range 0 through 9.
    r = random.randint(0, 9)
    print(r)
    i += 1
There is one slight difference between randrange and randint when used with just two parameters. randint(x,y) will return a value >= x and <= y, while randrange(x,y) will return a value >=x and < y (n.b. not less than or equal to y)
randrange also accepts an optional third parameter (the step), whereas randint always takes exactly two parameters.
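The difference in bounds can be checked empirically. The following is a minimal sketch that draws many values and records which ones each call produced; the variable names and the draw count of 10,000 are chosen here purely for illustration.

```python
import random

# Collect every value each call produced over many draws.
randint_values = {random.randint(1, 3) for _ in range(10_000)}
randrange_values = {random.randrange(1, 3) for _ in range(10_000)}
stepped_values = {random.randrange(0, 10, 2) for _ in range(10_000)}

print(sorted(randint_values))    # the upper bound 3 is included
print(sorted(randrange_values))  # the upper bound 3 is excluded
print(sorted(stepped_values))    # only multiples of the step appear
```

In other words, randint(x, y) behaves like randrange(x, y + 1).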
3. Random.Choice(Seq) Function.¶
The choice() function returns a randomly selected element from the given non-empty sequence.
In [18]:
# Generates a random string from the list of strings
print(random.choice( ['Apple', 'Ball', 'Cat'] ))
# Generates a random number from a list
print(random.choice([-1, 1, 3.5, 9, 15]))
# Generates a random number from a tuple
print(random.choice((1.1, -5, 6, 4, 7)))
# Generates a random char from a string
print(random.choice('Life is Beautiful'))
4. Random.Shuffle(List) Function.¶
Purpose: The shuffle() function rearranges the items of a list in place so that they occur in a random order. For shuffling, it uses the Fisher-Yates algorithm, which has O(n) complexity. It iterates from the last element of the list down to the first and swaps each entry with an entry at a random index at or below it.
In [20]:
from random import shuffle
mylist = [11,21,31,41,51]
shuffle(mylist)
print(mylist)
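To make the Fisher-Yates description concrete, here is a minimal sketch of the algorithm. The helper name fisher_yates_shuffle is hypothetical and is not the standard library's own code; it only illustrates the swap-from-the-end idea.

```python
import random

def fisher_yates_shuffle(items):
    """Shuffle a list in place (illustrative helper, not random.shuffle itself)."""
    # Walk from the last index down to index 1.
    for i in range(len(items) - 1, 0, -1):
        # Pick a position from 0 to i inclusive, so an element may stay put.
        j = random.randint(0, i)
        items[i], items[j] = items[j], items[i]

mylist = [11, 21, 31, 41, 51]
fisher_yates_shuffle(mylist)
print(mylist)
```

Each element is visited once and each swap is constant time, which is where the O(n) complexity comes from.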
5. Random.Sample(Collection, Random List Length) Function.¶
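The sample() function randomly selects N items from a given sequence (list, tuple, string) and returns them as a list. Sets and dictionaries are not sequences, so convert them first, for example with sorted() or list(); passing a set directly raises a TypeError from Python 3.11 onwards. It works by sampling the items without replacement, which means a single element from the sequence can appear in the resultant list at most once.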
In [22]:
from random import sample
print(sample('Nepal', 3)) # Selects any 3 chars from a string
print(sample((21, 12, -31, 24, 65, 16.3), 3)) # Returns a list of any three elements from a base tuple
print(sample([11, 12, 13, 14, -11, -12, -13, -14], 3)) # Randomly selects a list of three elements from a base list
print(sample(sorted({110, 120, 130, 140}), 3)) # Randomly selects three elements from a set of numbers (converted to a list first)
print(sample(sorted({'ABC', 'BCD', 'CDE', 'EFG'}), 3)) # Randomly selects three elements from a set of strings (converted to a list first)
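The "without replacement" behaviour has two consequences worth seeing in code: the result never repeats an element, and asking for more items than the sequence holds fails. The sketch below assumes a small illustrative list named letters.

```python
from random import sample

letters = ['a', 'b', 'c', 'd']

picked = sample(letters, 3)
print(picked)
# No element is repeated, because sampling is without replacement.
print(len(set(picked)) == len(picked))

# Asking for more items than the sequence holds raises ValueError.
try:
    sample(letters, 5)
    oversample_failed = False
except ValueError:
    oversample_failed = True
print(oversample_failed)
```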
6. Random.Random() Function.¶
It returns the next random floating-point number from the range [0.0, 1.0). It is a semi-open range: the function always returns a value less than the upper bound 1.0, but it may return 0.0.
In [26]:
from random import random
print(random())
print(random())
print(random())
7. Random.Uniform(Lower, Upper) Function.¶
It is an extension of the random() function. In this, you can specify the lower and upper bounds to generate a random number other than the ones between 0 and 1.
In [31]:
import random
print(random.uniform(500, 900))
print(random.uniform(500, 900))
print(random.uniform(500, 900))
In [34]:
# Generate a floating-point random number with fixed precision
import random
lower = 1.0; upper = 2.0; fixed_precision = 2
random_float = random.uniform(lower, upper)
print(round(random_float, fixed_precision))
# The round() function rounds the value to the fixed precision, which is 2 decimal places in this case
REGEX - REGular EXpressions in Python¶
A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. The module re provides full support for Perl-like regular expressions in Python. The re module raises the exception re.error if an error occurs while compiling or using a regular expression.
Match() Function¶
This function attempts to match RE pattern to string with optional flags.
Syntax: re.match(pattern, string, flags=0)
In [8]:
import re
line = "Maths is easier than Literature"
matchObj = re.match( r'(.*) is (.*?) .*', line, re.M|re.I)
if matchObj:
    print("matchObj.group() : ", matchObj.group())
    print("matchObj.group(1) : ", matchObj.group(1))
    print("matchObj.group(2) : ", matchObj.group(2))
else:
    print("No match!!")
Search() Function¶
This function searches for the first occurrence of RE pattern within string with optional flags.
Syntax: re.search(pattern, string, flags=0)
In [10]:
import re
line = "Maths is easier than Literature"
searchObj = re.search( r'(.*) is (.*?) .*', line, re.M|re.I)
if searchObj:
    print("searchObj.group() : ", searchObj.group())
    print("searchObj.group(1) : ", searchObj.group(1))
    print("searchObj.group(2) : ", searchObj.group(2))
else:
    print("Nothing found!!")
re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string.
In [11]:
import re
line = "Cars are faster than bicycles"
matchObj = re.match( r'bicycles', line, re.M|re.I)
if matchObj:
    print("match --> matchObj.group() : ", matchObj.group())
else:
    print("No match!!")
searchObj = re.search( r'bicycles', line, re.M|re.I)
if searchObj:
    print("search --> searchObj.group() : ", searchObj.group())
else:
    print("Nothing found!!")
Sub()¶
This method replaces occurrences of the RE pattern in string with repl. All occurrences are substituted unless count is provided, in which case at most count replacements are made. It returns the modified string.
Syntax: re.sub(pattern, repl, string, count=0, flags=0)
In [16]:
import re
phone = "610-845-2975 # This is Phone Number"
num = re.sub(r'#.*$', "", phone) # Removes the trailing comment (everything from # to the end of the string)
print ("Phone Num : ", num)
num = re.sub(r'\D', "", phone) #Removing all characters except digits.
print ("Phone Num : ", num)
num = re.sub(r'610',"972", phone) #replacing the area code of the phone number
print ("Phone Num : ", num)
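The effect of limiting the number of replacements can be shown with a short sketch. The variable names all_replaced and first_replaced are illustrative only.

```python
import re

phone = "610-845-2975"

# With no count given, every hyphen is replaced.
all_replaced = re.sub(r'-', ' ', phone)
print(all_replaced)    # 610 845 2975

# With count=1, only the first hyphen is replaced.
first_replaced = re.sub(r'-', ' ', phone, count=1)
print(first_replaced)  # 610 845-2975
```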
Wild Card Characters: Special Characters¶
Special characters are characters which do not match themselves as seen but actually have a special meaning when used in a regular expression. The most widely used special characters are:
. - A period. Matches any single character except newline character.
In [19]:
re.search(r'Co.k.e', 'Cookie').group() #The group() function returns the string matched by the re.
Out[19]:
'Cookie'
\w - Lowercase w. Matches any single letter, digit or underscore.
In [20]:
re.search(r'Co\wk\we', 'Cookie').group()
Out[20]:
'Cookie'
\W - Uppercase w. Matches any character not part of \w (lowercase w).
In [22]:
re.search(r'C\Wke', 'C@ke').group()
Out[22]:
'C@ke'
\s - Lowercase s. Matches a single whitespace character like: space, newline, tab, return.
\S - Uppercase s. Matches any character not part of \s (lowercase s).
In [25]:
print(re.search(r'Eat\scake', 'Eat cake').group())
print(re.search(r'Cook\Se', 'Cookie').group())
In [32]:
#\d - Lowercase d. Matches decimal digit 0-9.
print(re.search(r'c\d\dkie', 'c00kie').group())
#^ - Caret. Matches a pattern at the start of the string.
print(re.search(r'^Eat', 'Eat cake').group())
#$ - Matches a pattern at the end of string.
print(re.search(r'cake$', 'Eat cake').group())
#[abc] - Matches a or b or c.
#[a-zA-Z0-9] - Matches any single letter (a to z or A to Z) or digit (0 to 9).
#Characters that are not within a set can be matched by complementing the set: if the first character of the set is ^, any character that is not in the set will be matched.
print(re.search(r'Number: [0-6]', 'Number: 5').group())
#Matches any character except 5
print(re.search(r'Number: [^5]', 'Number: 0').group())
#\A - Uppercase a. Matches only at the start of the string, even in multiline mode (unlike ^).
print(re.search(r'\A[A-E]ookie', 'Cookie').group())
#\b - Lowercase b. Matches only the beginning or end of the word.
print(re.search(r'\b[A-E]ookie', 'Cookie').group())
#\ - Backslash. If the character following the backslash forms a recognized escape, the special meaning is used; for example, \n is a newline and \s is whitespace. To match a backslash itself, write \\ in the pattern. Raw strings (r'...') keep Python from interpreting the backslashes before the re module sees them.
# The pattern \\s matches a literal '\' followed by 's', not the whitespace class '\s'
print(re.search(r'Back\\stail', r'Back\stail').group())
# The pattern \s is the whitespace class, so it matches the space in 'Back tail'
print(re.search(r'Back\stail', 'Back tail').group())
Repetitions¶
It becomes quite tedious if you are looking to find long patterns in a sequence. Fortunately, the re module handles repetitions using the following special characters:
In [33]:
# + - Checks for one or more repetitions of the character to its left.
print(re.search(r'Co+kie', 'Cooookie').group())
# * - Checks for zero or more repetitions of the character to its left.
# Here: zero or more a's followed by zero or more o's
print(re.search(r'Ca*o*kie', 'Caokie').group())
# ? - Checks for exactly zero or one occurrence of the character to its left.
# Here: the u is optional, so both 'Color' and 'Colour' match
print(re.search(r'Colou?r', 'Color').group())
But what if you want to check for an exact number of repetitions?
For example, checking the validity of a phone number in an application. The re module handles this gracefully as well, using the following quantifiers:
{x} - Repeat exactly x times.
{x,} - Repeat at least x times.
{x,y} - Repeat at least x times but no more than y times (written without a space after the comma).
The +, * and {} quantifiers are said to be greedy: they match as many repetitions as possible.
In [34]:
re.search(r'\d{9,10}', '0987654321').group()
Out[34]:
'0987654321'
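The counted quantifiers and the notion of greediness can be sketched as follows. The helper looks_like_phone and the NNN-NNN-NNNN format it checks are assumptions made for illustration, not a general phone number validator.

```python
import re

def looks_like_phone(text):
    # Hypothetical check: 3 digits, 3 digits, 4 digits, separated by hyphens.
    return re.search(r'^\d{3}-\d{3}-\d{4}$', text) is not None

print(looks_like_phone('610-845-2975'))  # True
print(looks_like_phone('610-845-29'))    # False

# {x,} takes as many digits as it can; {x,y} stops at the upper limit y.
at_least_two = re.search(r'\d{2,}', 'id 12345').group()
two_to_three = re.search(r'\d{2,3}', 'id 12345').group()
print(at_least_two, two_to_three)        # 12345 123

# Greedy + grabs as much as possible; adding ? makes it stop at the first fit.
greedy = re.search(r'<.+>', '<a><b>').group()
lazy = re.search(r'<.+?>', '<a><b>').group()
print(greedy, lazy)                      # <a><b> <a>
```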
Here's a cheat sheet for the regular expressions: https://www.dataquest.io/blog/regex-cheatsheet/¶
Map, Filter and Reduce¶
Map()¶
Map() applies a function to all the items in an input list. In Python 3 it returns an iterator, so wrap the result in list() to get a list. Syntax: map(function_to_apply, list_of_inputs)
In [36]:
#Without using Map()
items = [1, 2, 3, 4, 5]
squared = []
for i in items:
    squared.append(i**2)
print(squared)
In [38]:
#Using Map()
items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))
print(squared)
In [39]:
#We can also use a list of functions
def multiply(x):
    return (x*x)
def add(x):
    return (x+x)
funcs = [multiply, add]
for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)
Filter()¶
Filter() keeps the elements for which a function returns true. In Python 3 it returns an iterator, so wrap the result in list() to get a list.
In [40]:
number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)
Reduce()¶
Reduce is a really useful function for performing some computation on a list and returning the result. It applies a rolling computation to sequential pairs of values in a list.
In [43]:
from functools import reduce # in Python 3, reduce lives in the functools module
product = reduce((lambda x, y: x * y), [1, 2, 3, 4])
print(product) # prints the product of the integers in the list [1, 2, 3, 4]
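The "rolling computation" can be spelled out as an explicit loop, which may make reduce easier to follow. This is a sketch with illustrative variable names; it also chains map, filter and reduce together on the same list.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# The same rolling computation written as a loop: ((1 * 2) * 3) * 4
running = numbers[0]
for value in numbers[1:]:
    running = running * value

product = reduce(lambda x, y: x * y, numbers)
print(running, product)  # 24 24

# Chaining all three: sum of the squares of the even numbers.
evens = filter(lambda x: x % 2 == 0, numbers)
squares = map(lambda x: x**2, evens)
sum_even_squares = reduce(lambda x, y: x + y, squares)
print(sum_even_squares)  # 4 + 16 = 20
```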
File Handling in Python¶
File handling in Python requires no importing of modules. Instead, the built-in open() function returns a file object, which provides the basic functions and methods necessary to manipulate files.
Open()¶
The open() function is used to open files on our system; the filename is the name of the file to be opened. The mode indicates how the file is going to be opened: "r" for reading, "w" for writing and "a" for appending.
In [45]:
filename = "hello.txt"
file = open(filename, "r")
for line in file:
    print(line, end="") # each line already ends with a newline
file.close()
Read ()¶
The file object offers several methods for reading:
read() -- returns the whole file as one big string
readline() -- returns one line at a time
readlines() -- returns a list of lines
Write ()¶
The file object offers two methods for writing:
write() -- writes a fixed sequence of characters to a file
writelines() -- writes a list of strings to a file
Append ()¶
There is no separate append function. To append to an existing file instead of overwriting it, simply open the file in append mode ("a") and write to it.
Close()¶
When you’re done with a file, use close() to close it and free up any system resources taken up by the open file.
In [49]:
# To open a text file for reading, use:
fh = open("hello.txt", "r")
fh.close()
# To read the whole file as one string, use:
fh = open("hello.txt", "r")
print(fh.read())
fh.close()
# To read one line at a time, use:
fh = open("hello.txt", "r")
print(fh.readline())
fh.close()
# To read a list of lines, use:
fh = open("hello.txt", "r")
print(fh.readlines())
fh.close()
# To write to a file (overwriting its contents), use:
fh = open("hello.txt", "w")
fh.write("Hello World")
fh.close()
# To write a list of strings, use (writelines() does not add newlines itself):
fh = open("hello.txt", "w")
lines_of_text = ["a line of text\n", "another line of text\n", "a third line\n"]
fh.writelines(lines_of_text)
fh.close()
# To append to a file, use:
fh = open("hello.txt", "a")
fh.write("Hello World again")
fh.close()
# To close a file after reading, use:
fh = open("hello.txt", "r")
print(fh.read())
fh.close()
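Because forgetting close() is easy, a common alternative is the with statement, which closes the file automatically when the block ends. The sketch below writes to a temporary directory (an assumption made here so that no real files are touched) and then reads the lines back.

```python
import os
import tempfile

# Work in a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "hello.txt")

    # "w" creates or overwrites the file.
    with open(path, "w") as fh:
        fh.write("Hello World\n")

    # "a" adds to the end instead of overwriting.
    with open(path, "a") as fh:
        fh.write("Hello World again\n")

    # "r" reads the file back as a list of lines.
    with open(path, "r") as fh:
        lines = fh.readlines()

print(lines)
print(fh.closed)  # the file was closed when the with block ended
```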
List Comprehensions¶
List comprehensions provide a concise way to create lists. A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The expression can be anything, meaning you can put all kinds of objects in lists. The result will be a new list resulting from evaluating the expression in the context of the for and if clauses which follow it. The list comprehension always returns a result list.
Syntax: [expression for item in list if conditional]
In [53]:
x = [i for i in range(10)]
print(x)
In [55]:
squares = []
for x in range(10):
    squares.append(x**2)
print(squares)
# Or you can use list comprehensions to get the same result:
squares = [x**2 for x in range(10)]
print(squares)
In [56]:
listOfWords = ["this","is","a","list","of","words"]
items = [ word[0] for word in listOfWords ]
print(items)
In [57]:
[x.lower() for x in ["A","B","C"]]
Out[57]:
['a', 'b', 'c']
In [65]:
string = "Hello 12345 World"
numbers = [x for x in string if x.isdigit()]
print(numbers)
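A list comprehension with an if clause does the work of map() and filter() together. The sketch below, using illustrative variable names, builds the same list both ways.

```python
numbers = range(-5, 5)

# Expression (x**2), for clause, and if clause in one comprehension.
squares_of_negatives = [x**2 for x in numbers if x < 0]

# The same result using filter() to select and map() to transform.
same_with_map_filter = list(map(lambda x: x**2, filter(lambda x: x < 0, numbers)))

print(squares_of_negatives)  # [25, 16, 9, 4, 1]
print(squares_of_negatives == same_with_map_filter)
```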