More on Python Basics¶

The content and the code excerpts have been derived from the following sources:

7 Python Random Module Functions You Should Know. https://www.techbeamers.com/using-python-random/
What's the difference between Randrange and Randint functions. https://www.codecademy.com/en/forum_questions/521bcf2b548c359b28000367
Python Regular Expressions. https://www.tutorialspoint.com/python/python_reg_expressions.htm
Python Regular Expression Tutorial. https://www.datacamp.com/community/tutorials/python-regular-expression-tutorial
Regular Expressions. https://developers.google.com/edu/python/regular-expressions
Map, Filter and Reduce. http://book.pythontips.com/en/latest/map_filter.html
Python File Handling. https://www.pythonforbeginners.com/cheatsheet/python-file-handling

Random Functions (from the Random Module)¶

Random values can be generated using the following functions of Python's random module.

1. Randrange() Function.¶

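Syntax: random.randrange(stop) or random.randrange(start, stop[, step]). The function returns a randomly selected integer from range(start, stop, step). The stop value is the upper boundary of the range and is never returned itself.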

import random
print(random.randrange(999))

476

print(random.randrange(0, 1000, 20))  # returns a random multiple of 20 from 0 up to 980

40

print(random.randrange(20, 25))  # returns an integer from 20 to 24; 25 is excluded

24

print(random.randrange(0.2, 0.5))  # randrange accepts only integer arguments, so this call raises an error

With random imported, this call fails because the arguments are not integers: it raises ValueError on older Python versions and TypeError from Python 3.12 onward. (A "NameError: name 'random' is not defined" at this point only means that import random has not been run yet.)

2. Random.Randint(Low, High) Function.¶

The randint() function is one of many functions that handle random numbers. It takes two parameters, low and high, and returns an integer between low and high, with both ends included.

i = 0
while i < 5:
    # Get a random integer in the range 0 through 9 (both ends included).
    r = random.randint(0, 9)
    print(r)
    i += 1

9
0
6
1
7

There is one slight difference between randrange and randint when used with two arguments: randint(x, y) returns a value >= x and <= y, while randrange(x, y) returns a value >= x and < y (n.b. strictly less than y).
In addition, randrange accepts an optional third argument, step, whereas randint always takes exactly two arguments.
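
The boundary difference can be checked empirically. The following is only an illustrative sketch, not code from the cited sources; the seed value and the number of draws are arbitrary assumptions chosen to make the run repeatable.

```python
import random

random.seed(42)  # arbitrary seed, used only to make the run repeatable

# Draw many values with each function and collect the distinct results.
from_randint = {random.randint(1, 3) for _ in range(1000)}
from_randrange = {random.randrange(1, 3) for _ in range(1000)}

print(sorted(from_randint))    # the upper bound 3 can appear
print(sorted(from_randrange))  # the upper bound 3 never appears
```

In other words, randint(x, y) behaves like randrange(x, y + 1).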

3. Random.Choice(Seq) Function.¶

The choice() function returns one randomly selected element from the given non-empty sequence.

# Generates a random string from the list of strings
print(random.choice( ['Apple', 'Ball', 'Cat'] ))

# Generates a random number from a list
print(random.choice([-1, 1, 3.5, 9, 15]))

# Generates a random number from a tuple
print(random.choice((1.1, -5, 6, 4, 7)))

# Generates a random character from a string
print(random.choice('Life is Beautiful'))

Ball
1
7
B

4. Random.Shuffle(List) Function.¶

The shuffle() function rearranges the items of a list in place so that they occur in a random order. It modifies the list itself and returns None.
For shuffling, it uses the Fisher-Yates algorithm, which has O(n) complexity. It iterates from the last element of the list down to the first and swaps each element with an element at a randomly chosen index at or below it.

from random import shuffle

mylist = [11,21,31,41,51]
shuffle(mylist)

print(mylist)

[41, 51, 31, 11, 21]
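
To make the Fisher-Yates idea concrete, here is a minimal sketch of the algorithm. It is not the library's actual source code; the helper name fisher_yates_shuffle is hypothetical and was introduced only for this illustration.

```python
import random

def fisher_yates_shuffle(items):
    """Shuffle a list in place, walking from the last index down to 1."""
    for i in range(len(items) - 1, 0, -1):
        j = random.randint(0, i)  # random index at or below i (inclusive)
        items[i], items[j] = items[j], items[i]

mylist = [11, 21, 31, 41, 51]
fisher_yates_shuffle(mylist)
print(mylist)  # same elements, in a random order
```

Each position is visited once and each visit does one swap, which is where the O(n) complexity comes from.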

5. Random.Sample(Collection, Random List Length) Function.¶

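The sample() function randomly selects N items from a given collection and returns them as a new list. It samples without replacement, which means a single element of the collection can appear in the resulting list at most once. Lists, tuples and strings can be passed directly. Sets were accepted directly by older Python versions, but from Python 3.11 the collection must be a sequence, so convert a set (or the keys of a dictionary) with sorted() or list() first.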

from random import sample

print(sample('Nepal', 3))  # Selects any three characters from a string
print(sample((21, 12, -31, 24, 65, 16.3), 3))  # Returns a list of any three elements from a base tuple
print(sample([11, 12, 13, 14, -11, -12, -13, -14], 3))  # Randomly selects a list of three elements from a base list
print(sample(sorted({110, 120, 130, 140}), 3))  # Randomly selects three elements from a set of numbers (converted to a sorted list first)
print(sample(sorted({'ABC', 'BCD', 'CDE', 'EFG'}), 3))  # Randomly selects three elements from a set of strings (converted to a sorted list first)

['a', 'p', 'N']
[21, 24, -31]
[-12, -14, 11]
[140, 120, 130]
['EFG', 'ABC', 'CDE']
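
The "without replacement" behaviour has two practical consequences, sketched below: the result never repeats a position of the input, and you cannot ask for more items than the collection holds. The variable names are illustrative assumptions, not taken from the sources.

```python
import random

population = [11, 12, 13, 14]

# sample() works without replacement, so no position is picked twice.
picked = random.sample(population, 3)
print(picked)

# Asking for more items than the population holds raises ValueError.
try:
    random.sample(population, 5)
except ValueError as exc:
    print("ValueError:", exc)

# A set is converted to a sequence first, as recent Python versions require.
from_set = random.sample(sorted({110, 120, 130, 140}), 3)
print(from_set)
```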

6. Random.Random() Function.¶

It returns the next random floating-point number from the range [0.0, 1.0). This is a semi-open range: the function always returns a value strictly less than the upper bound 1.0, but it may return 0.0.

from random import random
print(random())
print(random())
print(random())

0.261579937049
0.608479898102
0.657566887448

7. Random.Uniform(Lower, Upper) Function.¶

It is an extension of the random() function: you specify the lower and upper bounds, and it returns a random floating-point number between those bounds instead of between 0 and 1.

import random
print(random.uniform(500, 900))
print(random.uniform(500, 900))
print(random.uniform(500, 900))

506.356121176
733.243790287
729.920066651

# Generate a floating-point random number with fixed precision
import random

lower = 1.0
upper = 2.0
fixed_precision = 2
random_float = random.uniform(lower, upper)
# round() rounds to the nearest value with the given number of decimal places (2 in this case)
print(round(random_float, fixed_precision))

1.32
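
One way to see why uniform() is described as an extension of random() is to rescale a random() value by hand. The sketch below assumes the relationship lower + (upper - lower) * random(), which is how the standard documentation describes uniform(); the seed value is an arbitrary assumption used so that both draws start from the same generator state.

```python
import random

lower, upper = 500, 900

random.seed(7)  # arbitrary seed so both draws start from the same state
scaled = lower + (upper - lower) * random.random()

random.seed(7)
direct = random.uniform(lower, upper)

print(scaled, direct)  # the two values agree
```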

REGEX - REGular EXpressions in Python¶

A regular expression is a special sequence of characters that helps you match or find other strings or sets of strings, using a specialized syntax held in a pattern. The module re provides full support for Perl-like regular expressions in Python. The re module raises the exception re.error if an error occurs while compiling or using a regular expression.

Match() Function¶

This function attempts to match the RE pattern at the beginning of the string, with optional flags. It returns a match object on success and None otherwise.
Syntax: re.match(pattern, string, flags=0)

import re

line = "Maths is easier than Literature"

matchObj = re.match( r'(.*) is (.*?) .*', line, re.M|re.I)

if matchObj:
    print("matchObj.group() : ", matchObj.group())
    print("matchObj.group(1) : ", matchObj.group(1))
    print("matchObj.group(2) : ", matchObj.group(2))
else:
    print("No match!!")

matchObj.group() :  Maths is easier than Literature
matchObj.group(1) :  Maths
matchObj.group(2) :  easier

Search() Function¶

This function searches for the first occurrence of the RE pattern anywhere within the string, with optional flags.
Syntax: re.search(pattern, string, flags=0)

import re

line = "Maths is easier than Literature"

searchObj = re.search(r'(.*) is (.*?) .*', line, re.M|re.I)

if searchObj:
    print("searchObj.group() : ", searchObj.group())
    print("searchObj.group(1) : ", searchObj.group(1))
    print("searchObj.group(2) : ", searchObj.group(2))
else:
    print("Nothing found!!")

searchObj.group() :  Maths is easier than Literature
searchObj.group(1) :  Maths
searchObj.group(2) :  easier

match() checks for a match only at the beginning of the string, while search() checks for a match anywhere in the string.

import re

line = "Cars are faster than bicycles"

matchObj = re.match(r'bicycles', line, re.M|re.I)
if matchObj:
    print("match --> matchObj.group() : ", matchObj.group())
else:
    print("No match!!")

searchObj = re.search(r'bicycles', line, re.M|re.I)
if searchObj:
    print("search --> searchObj.group() : ", searchObj.group())
else:
    print("Nothing found!!")

No match!!
search --> searchObj.group() :  bicycles
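
The same comparison can be pushed one step further. This sketch (an illustration added here, not from the cited tutorials) shows where search() finds the match, and that anchoring the pattern with \A makes search() behave like match() for this input.

```python
import re

line = "Cars are faster than bicycles"

# match() only succeeds if the pattern matches at position 0.
print(re.match(r"Cars", line) is not None)
print(re.match(r"bicycles", line) is not None)

# search() scans the whole string and reports where the match was found.
found = re.search(r"bicycles", line)
print(found.start(), found.end())

# Anchoring the pattern with \A makes search() fail here, just like match().
print(re.search(r"\Abicycles", line) is None)
```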

Sub()¶

This method replaces occurrences of the RE pattern in string with repl. All occurrences are replaced unless count is given, in which case at most count replacements are made. It returns the modified string.
Syntax: re.sub(pattern, repl, string, count=0, flags=0)

import re

phone = "610-845-2975 # This is Phone Number"

num = re.sub(r'#.*$', "", phone) # Removes the comment: everything from # to the end of the line
print ("Phone Num : ", num)

num = re.sub(r'\D', "", phone) #Removing all characters except digits.   
print ("Phone Num : ", num)

num = re.sub(r'610',"972", phone) #replacing the area code of the phone number
print ("Phone Num : ", num)

Phone Num :  610-845-2975 
Phone Num :  6108452975
Phone Num :  972-845-2975 # This is Phone Number
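
A few more uses of re.sub() are sketched below: limiting the number of replacements with count, reusing captured groups in the replacement, and passing a function as the replacement. The variable names and the output formats are illustrative assumptions.

```python
import re

phone = "610-845-2975"

# count=1 limits the substitution to the first match only.
first_only = re.sub(r"-", " ", phone, count=1)
print(first_only)

# \1, \2 and \3 in the replacement refer to the captured groups.
formatted = re.sub(r"(\d{3})-(\d{3})-(\d{4})", r"(\1) \2-\3", phone)
print(formatted)

# The replacement can also be a function that receives the match object.
masked = re.sub(r"\d{4}$", lambda m: "*" * len(m.group()), phone)
print(masked)
```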

Wild Card Characters: Special Characters¶

Special characters are characters that do not match themselves literally but have a special meaning when used in a regular expression.
The most widely used special characters are:
. - A period. Matches any single character except newline character.

re.search(r'Co.k.e', 'Cookie').group() #The group() function returns the string matched by the re.

'Cookie'

\w - Lowercase w. Matches any single letter, digit or underscore.

re.search(r'Co\wk\we', 'Cookie').group()

'Cookie'

\W - Uppercase w. Matches any character not part of \w (lowercase w).

re.search(r'C\Wke', 'C@ke').group()

'C@ke'

\s - Lowercase s. Matches a single whitespace character like: space, newline, tab, return.
\S - Uppercase s. Matches any character not part of \s (lowercase s).

print(re.search(r'Eat\scake', 'Eat cake').group())
print(re.search(r'Cook\Se', 'Cookie').group())

Eat cake
Cookie

#\d - Lowercase d. Matches decimal digit 0-9.
print(re.search(r'c\d\dkie', 'c00kie').group())

#^ - Caret. Matches a pattern at the start of the string.
print(re.search(r'^Eat', 'Eat cake').group())

#$ - Matches a pattern at the end of string.
print(re.search(r'cake$', 'Eat cake').group())

#[abc] - Matches a or b or c.
#[a-zA-Z0-9] - Matches any letter from (a to z) or (A to Z) or (0 to 9). Characters that are not within a range can be matched by complementing the set. If the first character of the set is ^, all the characters that are not in the set will be matched.
print(re.search(r'Number: [0-6]', 'Number: 5').group())

#Matches any character except 5
print(re.search(r'Number: [^5]', 'Number: 0').group())
#\A - Uppercase a. Matches only at the start of the string, even in MULTILINE mode, where ^ would also match at the start of each line.
print(re.search(r'\A[A-E]ookie', 'Cookie').group())
#\b - Lowercase b. Matches only the beginning or end of the word.
print(re.search(r'\b[A-E]ookie', 'Cookie').group())

#\ - Backslash. If the character following the backslash is a recognized escape character, then the special meaning of the term is taken. For example, \n is considered as newline. However, if the character following the \ is not a recognized escape character, then the \ is treated like any other character and passed through.

# The doubled backslash in the pattern matches a literal '\', so '\s' is not read as the whitespace class here
print(re.search(r'Back\\stail', r'Back\stail').group())

# Here '\s' is the whitespace class, because the backslash is not itself escaped
print(re.search(r'Back\stail', 'Back tail').group())

c00kie
Eat
cake
Number: 5
Number: 0
Cookie
Cookie
Back\stail
Back tail
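
The difference between ^ and \A only shows up with multi-line text. The following sketch uses a made-up two-line string to illustrate it; it is an added example, not one from the cited tutorials.

```python
import re

text = "Eat cake\nBake cookies"

# Without flags, ^ matches only at the very start of the string.
print(re.search(r"^Bake", text))

# With re.MULTILINE, ^ also matches right after each newline.
print(re.search(r"^Bake", text, re.MULTILINE).group())

# \A ignores re.MULTILINE and still matches only at the start of the string.
print(re.search(r"\ABake", text, re.MULTILINE))
print(re.search(r"\AEat", text, re.MULTILINE).group())
```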

Repetitions¶

It becomes quite tedious if you are looking to find long patterns in a sequence. Fortunately, the re module handles repetitions using the following special characters:

# + - Matches one or more repetitions of the character to its left.
print(re.search(r'Co+kie', 'Cooookie').group())

# * - Matches zero or more repetitions of the character to its left.
# Here any number of a's (including none) may be followed by any number of o's
print(re.search(r'Ca*o*kie', 'Caokie').group())

# ? - Matches zero or one occurrence of the character to its left.
# Here the u is optional, so both 'Color' and 'Colour' match
print(re.search(r'Colou?r', 'Color').group())

Cooookie
Caokie
Color

But what if you want to check for an exact number of repetitions?
For example, checking the validity of a phone number in an application. The re module handles this gracefully as well, using the following quantifiers:
{x} - Repeat exactly x times.
{x,} - Repeat at least x times.
{x,y} - Repeat at least x times but no more than y times (written without a space after the comma).
The +, * and {x,y} quantifiers are said to be greedy: they match as many repetitions as possible.

re.search(r'\d{9,10}', '0987654321').group()

'0987654321'
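
As a rough sketch of the phone-number check mentioned above, the pattern below combines {x} with the ^ and $ anchors so that the whole string has to fit. The NNN-NNN-NNNN format, the name phone_pattern and the helper looks_like_phone are assumptions made for this illustration. The last two lines show what "greedy" means in practice and how adding ? after a quantifier makes it non-greedy.

```python
import re

# Hypothetical format for illustration: 3 digits, 3 digits, 4 digits.
phone_pattern = r"^\d{3}-\d{3}-\d{4}$"

def looks_like_phone(text):
    """Return True if text has the assumed NNN-NNN-NNNN shape."""
    return re.search(phone_pattern, text) is not None

print(looks_like_phone("610-845-2975"))
print(looks_like_phone("610-845-29756"))

# Greedy: + grabs as many characters as it can.
print(re.search(r"<.+>", "<b>bold</b>").group())
# Non-greedy: +? stops at the first position where the pattern can finish.
print(re.search(r"<.+?>", "<b>bold</b>").group())
```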

Here's a cheat sheet for the regular expressions: https://www.dataquest.io/blog/regex-cheatsheet/¶

Map, Filter and Reduce¶

Map()¶

map() applies a function to every item of an input list (or any other iterable). In Python 3 it returns an iterator, so wrap the call in list() to get a list. Syntax: map(function_to_apply, list_of_inputs)

#Without using Map()
items = [1, 2, 3, 4, 5]
squared = []
for i in items:
    squared.append(i**2)
print(squared)

[1, 4, 9, 16, 25]

#Using Map()
items = [1, 2, 3, 4, 5]
squared = list(map(lambda x: x**2, items))
print(squared)

[1, 4, 9, 16, 25]

#We can also use a list of functions
def multiply(x):
    return (x*x)
def add(x):
    return (x+x)

funcs = [multiply, add]
for i in range(5):
    value = list(map(lambda x: x(i), funcs))
    print(value)

[0, 0]
[1, 2]
[4, 4]
[9, 6]
[16, 8]
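
Two further details of map() in Python 3 are sketched below: the result is computed lazily, and map() accepts more than one input sequence. The weights list is made-up data for the illustration.

```python
items = [1, 2, 3, 4, 5]
weights = [10, 20, 30, 40, 50]

# In Python 3, map() returns a lazy iterator, not a list.
lazy = map(lambda x: x ** 2, items)
first = next(lazy)   # computes only the first value
rest = list(lazy)    # consumes the remaining values
print(first, rest)

# With several iterables, the function receives one item from each.
weighted = list(map(lambda x, w: x * w, items, weights))
print(weighted)
```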

Filter()¶

filter() keeps the elements for which a function returns true. In Python 3 it returns an iterator, so wrap the call in list() to get a list.

number_list = range(-5, 5)
less_than_zero = list(filter(lambda x: x < 0, number_list))
print(less_than_zero)

[-5, -4, -3, -2, -1]
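
The same filtering can also be written as a list comprehension, and filter() has a special case when None is passed instead of a function. The sketch below is an added illustration with made-up sample values.

```python
number_list = range(-5, 5)

# filter() with a predicate, and the equivalent list comprehension.
with_filter = list(filter(lambda x: x < 0, number_list))
with_comprehension = [x for x in number_list if x < 0]
print(with_filter == with_comprehension)

# Passing None as the function keeps only the truthy items.
truthy = list(filter(None, [0, 1, "", "a", None, [], [2]]))
print(truthy)
```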

Reduce()¶

Reduce is a really useful function for performing some computation on a list and returning a single result. It applies a rolling computation to sequential pairs of values in a list. In Python 3 it has to be imported from the functools module.

from functools import reduce  # reduce is not a built-in in Python 3

product = reduce((lambda x, y: x * y), [1, 2, 3, 4])
print(product)  # prints the product of the integers in the list [1, 2, 3, 4]

24
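
To show the "rolling computation" step by step, the sketch below replaces the lambda with a small function that prints each intermediate result. The helper name traced_multiply is hypothetical. The last part chains filter(), map() and reduce() together as one possible way of combining the three tools.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

def traced_multiply(acc, value):
    """Multiply two values, printing each rolling step along the way."""
    print(f"{acc} * {value} = {acc * value}")
    return acc * value

product = reduce(traced_multiply, numbers)
print(product)

# An initial value is used first and is returned for an empty input.
empty_product = reduce(lambda x, y: x * y, [], 1)
print(empty_product)

# Chaining the three tools: sum of the squares of the even numbers.
even_square_sum = reduce(
    lambda x, y: x + y,
    map(lambda n: n ** 2, filter(lambda n: n % 2 == 0, numbers)),
    0,
)
print(even_square_sum)
```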

File Handling in Python¶

File handling in Python requires no importing of modules. Instead we can use the built-in open() function, which returns a file object. That object provides the basic methods needed to manipulate files.

Open()¶

The open() function is used to open files in our system; filename is the name of the file to be opened. The mode indicates how the file is going to be opened: "r" for reading, "w" for writing and "a" for appending.

filename = "hello.txt"
file = open(filename, "r")
for line in file:
    print(line, end="")  # each line already ends with its own newline
file.close()

This is a hello.txt.
Using this for file handling.

Read ()¶

The file object offers several read methods:
read() -- returns the whole file as one big string
readline() -- returns one line at a time
readlines() -- returns a list of lines

Write ()¶

These methods write strings to the file.
write() -- writes a single string to the file
writelines() -- writes a list of strings; no line separators are added, so include "\n" yourself where needed

Append ()¶

Appending adds to the end of a file instead of overwriting it. There is no separate append function: to append to an existing file, simply open the file in append mode ("a") and write to it.

Close()¶

When you’re done with a file, use close() to close it and free up any system resources taken up by the open file.

#To open a text file, use:
fh = open("hello.txt", "r")
fh.close()

#To read a text file, use:
fh = open("hello.txt", "r")
print(fh.read())
fh.close()

#To read one line at a time, use:
fh = open("hello.txt", "r")
print(fh.readline())
fh.close()

#To read a list of lines, use:
fh = open("hello.txt", "r")
print(fh.readlines())
fh.close()

#To write to a file, use:
fh = open("hello.txt", "w")
fh.write("Hello World")
fh.close()

#To write a list of strings to a file, use:
fh = open("hello.txt", "w")
lines_of_text = ["a line of text", "another line of text", "a third line"]
fh.writelines(lines_of_text)
fh.close()

#To append to a file, use:
fh = open("hello.txt", "a")
fh.write("Hello World again")
fh.close()

#To close a file, use:
fh = open("hello.txt", "r")
print(fh.read())
fh.close()

a line of textanother line of texta third line
a line of textanother line of texta third line
['a line of textanother line of texta third line']
a line of textanother line of texta third lineHello World again
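
The output above runs the lines together because writelines() does not add line breaks. The sketch below is one possible way to write the same steps with explicit "\n" characters and with the with statement, which closes the file automatically. It works in a temporary directory (an assumption made so that no existing hello.txt is touched).

```python
import os
import tempfile

# Work in a temporary directory so no existing file is touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "hello.txt")

    # "w" creates or truncates the file; each line needs its own "\n".
    with open(path, "w") as fh:
        fh.writelines(["a line of text\n", "another line of text\n"])

    # "a" appends to the end instead of overwriting.
    with open(path, "a") as fh:
        fh.write("a third line\n")

    # The file is closed automatically when each with block ends.
    with open(path, "r") as fh:
        lines = fh.readlines()

print(lines)
print(fh.closed)
```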

List Comprehensions¶

List comprehensions provide a concise way to create lists. A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The expression can be anything, meaning you can put all kinds of objects in lists. The result is a new list produced by evaluating the expression in the context of the for and if clauses that follow it.
The list comprehension always returns a result list.
Syntax: [expression for item in list if conditional]

x = [i for i in range(10)]
print(x)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

squares = []
for x in range(10):
    squares.append(x**2)
print(squares)

# Or you can use list comprehensions to get the same result:
squares = [x**2 for x in range(10)]
print(squares)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

listOfWords = ["this","is","a","list","of","words"]
items = [ word[0] for word in listOfWords ]
print(items)

['t', 'i', 'a', 'l', 'o', 'w']

[x.lower() for x in ["A","B","C"]]

['a', 'b', 'c']

string = "Hello 12345 World"
numbers = [x for x in string if x.isdigit()]
print(numbers)

['1', '2', '3', '4', '5']
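
The syntax line above also allows a condition and more than one for clause. The following sketch, with made-up variable names, shows both, together with the equivalent explicit loops for comparison.

```python
# Filter and transform in one step: squares of the even numbers below 10.
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(even_squares)

# Two for clauses behave like nested loops, with the leftmost loop outermost.
pairs = [(x, y) for x in range(3) for y in range(x)]
print(pairs)

# The same pairs built with explicit loops, for comparison.
pairs_loop = []
for x in range(3):
    for y in range(x):
        pairs_loop.append((x, y))
print(pairs == pairs_loop)
```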

Big Data with Jenish

Tutorial 3 - More on Python Basics