intro2cs tirgul 5 1. what we will cover today? file io dictionaries xml importing files ...
TRANSCRIPT
Intro2cs
Tirgul 5
1
What we will cover today?
File IO Dictionaries XML Importing files Function with unknown number of arguments List comprehension
2
Files - why are they needed?
❏The containers we know (list\tuple etc.) are limited in the amount of information they can hold (~536M items on a 32bit system[ref]) - files can maintain a greater capacity.❏The output format we are familiar with (print to screen) is lost with every system restart and cannot be shared between different programs. Files’ information is maintained across system shut downs\reboots, and could be accessed by different programs.
3
Files - What are they?
A file is an ordered sequence of bytes.Upon reading a file, a suitable program interprets the content of a file (playing music, displaying images and text).
To create files, write to files and read from files Python defines a custom file handling API (Application Programming Interface).
4
Creating a file object
f = open(fileName)open is python built-in function which returns a file object, with which we can work with files.fileName is a string which indicates the file
location within the file system (absolute or relative to the folder from which the program is running).
5
File ‘open’ing mode
f = open(fileName,[mode])When opening a file we should declare our intentional use of this file. Protects us against undesired consequences.
Two basic types of modes : Read (default) Write
6
File ‘open’ing mode (2)7
●The multiple modes “defend” us so we would not do what we don’t mean to (e.g unwanted changes in the file)
●We have to declare in advance what are our intentions toward the open file (do we wish to manipulate it? read only?)
open in read mode (defualt)
f = open(fileName,'r')If fileName does not exist :Traceback (most recent call last): File "<pyshell#0>", line 1, in <module> f = open('filename')FileNotFoundError: [Errno 2] No such file or directory: 'filename'
8
open in write mode
f = open(fileName,'w')If fileName does not exist:python will create such.If fileName does exist:python will override (delete) the existing version and replace it with a new (the current) file. – Beware!
9
open in append mode
f = open(fileName,'a')If fileName does not exist:python will create such.If fileName does exist:python will add any written data to the end of the file.
10
11
n = f.write(s)Write the string s to the file f.n = number of written characters.
f = open("file.txt")n = f.write('Happy New Year')
write to file
write to file (2)
f = open("file.txt", 'a')n = f.write('Happy New Year')write adds content to the end of the file but does not intsert line breaks. E.g :
• f.write('Don')• f.write('key')f content :
Happy New YearDonkey
12
write multiple lines – use Join
f = open("file.txt", 'a')L = [‘Happy’,’new’,’year’]n = f.write('\n'.join(L))f content :
Happynewyear
Join - Concatenate strings using the first string
write multiple terms to file
f.writelines(seq)Write the strings contained in seq to the file one by one.For example:
• my_strings = ['Don', ‘key']• f.writelines(my_strings)f content :
Donkeywritelines expects a list of strings,
while write expects a single string.
14
closeing a file
f.close()After completing the usage of the file, it should be closed using the close method.
• Free the file resource for usage of other processes.• Some environments will not allow read-only and write
modes opening simultaneously
15
the open with statement
Things don’t always go as smoothly as we plan, and sometimes causes programs to crash.E.g. trying a 0 division.
If the program crashes, we don’t get a chance to close (and free the resource of) the open file. To verify the appropriate handling of files, even in case of program crash we can make use of the with statement.
16
the with statement
with open(file_path, 'w') as our_open_file: #we can work here with the open file and be sure it is properly closed in every scenario
The with statement guarantees that even if the program crashed inside the with block, it will still be properly closed.
17
You can’t write to a closed file
f.close()f.writelines(['hi','bi','pi'])Traceback (most recent call last): File "<pyshell#20>", line 1, in <module> f.writelines(['hi','bi','pi'])ValueError: I/O operation on closed file.
18
reading from file
f.read(n)Read at most n characters from f (default = all file)
f.readline()Read one line.
f.readlines()Read all lines in the file - returns a list of strings ( list
of the file’s lines).
https://docs.python.org/3/tutorial/inputoutput.html
19
reading a 'w' mode file
f = open(fileName,'w')f.read()Traceback (most recent call last): File "<pyshell#1>", line 1, in <module> f.read()io.UnsupportedOperation: not readableUse ‘r+’ mode for reading and writing
20
Iterating over a file
When we read a file, Python defines a pointer (an iterator) which advances with every consecutive reading.
f.readline() # 1st time >> Hi :)f.readline()# 2nd time >> Bi :(
f content
Hi :)Bi :(MitzPaz
21
Iterating over a file - tell
The tell() method returns the current position within the file.
f.readline() # 1st time >> Hi :)f.tell() >> 5
f content
Hi :)Bi :(MitzPaz
22
Access (e.g. print) all the lines in a file
f = open(fileName)for line in f:
print(line)
When do the iteration stop? When reaching end-of-file (EOF)
We may think on a file as a sequence - thus we stop when there’s no more items in F - when we reach EOF
23
Programs with multiple files
24
Importing files
Suppose foo.py has code we want to use in bar.pyIn bar.py we can write:
Here’s more info on handling cyclic imports. https://docs.python.org/3/faq/programming.html#how-can-i-have-modules-that-mutually-import-each-other
import foofoo.func1()
from foo import func1,func2func1()
from foo import *func1()
preferred alternatives. No problems with name collisionimport foo as my_foo
my_foo.func1()
Importing runs the code
import foo
foo.func1()…more_stuff…
def func1():dostuff…
print(“hello”)
Run this. Okay.If we run this, also prints.
import foo
foo.func1()…more_stuff…
def func1():dostuff…
if __name__ == “__main__”: print(“hello”)
Run this. Okay.Run this. Okay.
foo.py
foo.py
bar.py
bar.py
__name__ will have the value “__main__” only if the current module is first to execute
We want to be able to run foo.py, but also want to use its code in bar.py
The dictionary data structure
• A dictionary is a another type of dataset• Similar to lists - Dictionaries are mutable• The main difference between dictionaries
and list is how to retrieve values• In list we had indexes (L[0])• In dicts we use keys (d[‘key’])
The dictionary data structure
Keys can be any Python data type Because keys are used for indexing, they
should be immutableValues can be any Python data type Values can be mutable or immutable
Creating a dictionary
>>> eng2sp = dict()>>> print (eng2sp){}
>>> eng2sp['one'] = ‘uno'
>>> print (eng2sp){'one': ‘uno'}>>> eng2sp['one']‘uno’
>>> eng2sp['two'] = 'dos'>>> print (eng2sp){'one': 'uno', 'two': 'dos'}
empty dict
Dicts are mutable thus we may change them
a keya value
Creating a dictionary
* Creating a dictionary with terms:
>>> eng2sp = {'one': 'uno', 'two': 'dos',... 'three': 'tres'}>>> print(eng2sp){'three': 'tres', 'one': 'uno', 'two': 'dos'}
•In general, the order of items in a dictionary is unpredictable•Dictionaries are indexed by keys, not integers
Dictionary indexing
>>> print (eng2sp['three'])tres>>> >>> print (eng2sp['five'])Traceback (most recent call last): File "<stdin>", line 1, in <module>KeyError: 'five'
* If the index is not a key in the dictionary, Python raises an exception
The in operator
•Note that the in operator works differently for dictionaries than for other sequences
•For offset indexed sequences (strings, lists, tuples), x in y checks to see whether x is an item in the sequence•For dictionaries, x in y checks to see whether x is a key in the dictionary
>>> 'onu' in eng2spFalse>>> 'one' in eng2spTrue
Keys and values
• The keys method returns a sequence of the keys in a dictionary
>>> eng2sp.keys()dict_keys(['three', 'one', ‘two'])>>> list(eng2sp.keys())['three', 'one', 'two']
• The values method returns a list of the values>>> eng2sp.values()dict_values(['tres', 'uno', 'dos'])>>> list(eng2sp.values())['tres', 'uno', 'dos']
• The items method returns a list of tuple pairs of the key-value pairs in a dictionary
>>> eng2sp.items()dict_items([('three', 'tres'), ('one', 'uno'), ('two', 'dos')])>>> list(eng2sp.items())[('three', 'tres'), ('one', 'uno'), ('two', 'dos')]
Keys are unique - values not necessarily
>>> d[1] = 'a'>>> d[2] = 'b'>>> print(d){1: 'a', 2: 'b'}>>> d[1] = 'c'>>> print(d){1: 'c', 2: 'b'}>>> d[3] = 'b'>>> print(d){1: 'c', 2: 'b', 3: 'b'}
Chang the value of d[1]
Add another key with value ‘b’
Example: histrogram (histogram.py)
def histogram(seq): d = dict() for element in seq: if element not in d: d[element] = 1 else: d[element] += 1 return d
h = histogram('brontosaurus')>>>print(h){'b': 1, 'n': 1, 't': 1, 'a': 1, 's': 2, 'r': 2, 'u': 2, 'o': 2}
Key-Value pairs
Change the print_hist function:
def print_hist(hist): for key, value in hist.items(): print (key, value, end = ',')
h = histogram('brontosaurus')print_hist(h)b 1,n 1,t 1,a 1,s 2,r 2,u 2,o 2,
Sorting the keys
Change the print_hist function:
def print_hist(hist): keys = sorted(hist.keys()) for key in keys: print (key, hist[key], end=',')
h = histogram('brontosaurus')print_hist(h)a 1,b 1,n 1,o 2,r 2,s 2,t 1,u 2,
Using lists as values: invert.py
def invert_dict(d): inv = dict() for key in d: val = d[key] if val not in inv: inv[val] = [key] else: inv[val].append(key) return invhist = histogram('parrot')inverted = invert_dict(hist)print (inverted){1: ['a', 'p', 'o', 't'], 2: ['r']}
Using tuples as keys: troupe.py
troupe = {('Cleese', 'John'): [1,2,3], ('Chapman', 'Graham'): [4,5,6], ('Idle', 'Eric'): [7,8,9], ('Jones', 'Terry'): [10,11,12], ('Gilliam', 'Terry'): [13,14,15,16,17,18], ('Palin', 'Michael'): [19,20]}
for last, first in troupe: print (first, last, troupe[last, first])
Terry Gilliam [13, 14, 15, 16, 17, 18]Eric Idle [7, 8, 9]Michael Palin [19, 20]John Cleese [1, 2, 3]Graham Chapman [4, 5, 6]Terry Jones [10, 11, 12]
XML developed by World Wide Consortium’s (W3C’s) XML Working Group (1996)XML portable, widely supported, open technology for describing dataXML quickly becoming standard for data exchange between applications
XML (Extensible Markup Language)
XML Documents
XML documents end with .xml extensionXML marks up data using tags, which are names enclosed in angle brackets <tag> elements </tag>Elements: individual units of markup (i.e., everything
included between a start tag and its corresponding end tag)Nested elements form hierarchiesRoot element contains all other document elements
View XML documentsAny text editor
Internet Explorer, Notepad, Visual Studio, etc.
XML Tree Structure
XML: Example<?xml version="1.0"?> <!-- Bread recipie description --> <recipe name="bread" prep_time="5 mins" cook_time="3 hours"> <title>Basic bread</title> <ingredient amount="8" unit="dL">Flour</ingredient> <ingredient amount="10" unit="grams">Yeast</ingredient> <ingredient amount="4" unit="dL" state="warm">Water</ingredient> <ingredient amount="1" unit="teaspoon">Salt</ingredient> <instructions> <step>Mix all ingredients together.</step> <step>Knead thoroughly.</step> <step>Cover with a cloth, and leave for one hour in warm room.</step> <step>Knead again.</step> <step>Place in a bread baking tin.</step> <step>Cover with a cloth, and leave for one hour in warm room.</step> <step>Bake in the oven at 180(degrees)C for 30 minutes.</step> </instructions> </recipe>
XML: Well-formed XML
All XML elements must have a closing tag XML tags are case sensitive All XML elements must be properly nested All XML documents must have a root tag Attribute values must always be quoted
ElementTree
Access the contents of an XML file in a "pythonic" way.
Use iteration to access nested structure Use dictionaries to access attributes Each element/node is an "Element"
Google "ElementTree python" for docs
XML: Back to example<?xml version="1.0"?> <!-- Bread recipie description --> <recipe name="bread" prep_time="5 mins" cook_time="3 hours"> <title>Basic bread</title> <ingredient amount="8" unit="dL">Flour</ingredient> <ingredient amount="10" unit="grams">Yeast</ingredient> <ingredient amount="4" unit="dL" state="warm">Water</ingredient> <ingredient amount="1" unit="teaspoon">Salt</ingredient> <instructions> <step>Mix all ingredients together.</step> <step>Knead thoroughly.</step> <step>Cover with a cloth, and leave for one hour in warm room.</step> <step>Knead again.</step> <step>Place in a bread baking tin.</step> <step>Cover with a cloth, and leave for one hour in warm room.</step> <step>Bake in the oven at 180(degrees)C for 30 minutes.</step> </instructions> </recipe>
Basic ElementTree Usageimport xml.etree.ElementTree as ET
# Parse the XML file and get the recipe elementdocument = ET.parse("recipe.xml")root = document.getroot()
# What is the root?print (root.tag)
# Get the (single) title element contained in the recipe elementelement = root.find('title')print (element.tag, element.attrib, element.text)
# All elements contained in the recipe elementfor element in root: print (element.tag, element.attrib, element.text)
# Finds all ingredients contained in the recipe elementfor element in root.findall('ingredient'): print (element.tag, element.attrib, element.text)
# Continued…
Basic ElementTree Usage
# Finds all steps contained in the root element# There are none!for element in root.findall('step'): print ("!",element.tag, element.attrib, element.text)
# Gets the instructions elementmentinstructions = root.find('instructions')# Finds all steps contained in the instructions elementfor element in instructions.findall('step'): print (element.tag, element.attrib, element.text)
# Finds all steps contained at any depth in the recipe elementfor element in root.getiterator('step'): print (element.tag, element.attrib, element.text)
Basic ElementTree Usageimport xml.etree.ElementTree as ET
# Parse the XML file and get the recipe elementdocument = ET.parse("recipe.xml")root = document.getroot()
element = root.find('title')
for element in root.findall('ingredient'): print (element.attrib['amount'], element.attrib['unit‘]) print ("Instructions:“)element = root.find('instructions')for i,step in enumerate(element.findall('step')): print (i+1, step.text)
Basic ElementTree Usageimport xml.etree.ElementTree as ET
# Parse the XML file and get the recipe elementdocument = ET.parse("recipe.xml")root = document.getroot()
ingredients = []for element in root.findall('ingredient'): ingredients.append([element.text, element.attrib['amount'],
element.attrib['unit']]) if element.attrib.get('state'): ingredients.append(element.attrib['state'])
School Solution of EX5
• Keep up with the GUI demonstration!
• Idea: Comparing stores by prices
Fucntion Extras
•You can define functions that accept as many arguments as the user wishes to provide.•Arguments are placed in a tuple.
52
def is_word_in(word, *text_list): for text in text_list: if word in text: print(word, "is in" ,text) is_word_in("run", "rune", "cat",
"prune", "apple" ,"runner")is_word_in("run")
Passing dictionaries to functionsdef print_dict(name, uid, homedir):
print(“uid = “, uid, “ name = “, name)
print(“homedir = “, homedir)
user_info = dict(name="David", uid=593, homedir="/home/users/david")
print_dict(**user_info) # note: need ** here!!output: (note: dictionary entries are unordered) uid = 593 name = Davidhomedir = /home/users/david
54
List Comprehensions – motivation
Say you would like to get a list of squares, from some list/tuple/iterable. You would have probably written:
But this seems like a lot of code for such a simple operation.
In these cases you should remember your zen!
Simple is better, but how can we simplify this?
def get_squares(seq): lst = [] for x in seq: lst.append(x**2) return lst
55
List Comprehensions - 1
Is a way to create a list from another iterable/sequence, where each element in the created list: Satisfies a certain condition. Is a function of a member in the original
sequence/iterabel. It allows for the creation of lists in an
understandable syntax. You just write your expression inside the
‘[]’!
56
List Comprehensions - 2
Returning to our previous example:
Could be simplified to:
Much easier right?
def get_squares(seq): return [x**2 for x in seq]
def get_squares(seq): lst = [] for x in seq: lst.append(x**2) return lst
57
Conditioning
You can add conditions inside an expression, allowing for the creation of a sublist in a more “pythonic” way.
What do you think these examples do? [i for i in seq if i in range(min,max)] [i**2 for i in range(num) if is_prime(i)] [chr for chr in string if chr.islower()] [str(round(pi, i)) for i in range(1, 6)] vec = [[1,2,3], [4,5,6], [7,8,9]][num for elem in vec for num in elem]