Python Programming - XII. File Processing
XII. FILE PROCESSING
Engr. Ranel O. Padon
PYTHON PROGRAMMING TOPICS
I. Introduction to Python Programming
II. Python Basics
III. Controlling the Program Flow
IV. Program Components: Functions, Classes, Packages, and Modules
V. Sequences (Lists and Tuples), and Dictionaries
VI. Object-Based Programming: Classes and Objects
VII. Customizing Classes and Operator Overloading
VIII. Object-Oriented Programming: Inheritance and Polymorphism
IX. Randomization Algorithms
X. Exception Handling and Assertions
XI. String Manipulation and Regular Expressions
XII. File Handling and Processing
XIII. GUI Programming Using Tkinter
File Processing
Data Hierarchy
File-Open Modes
Dissecting Files
The Power of Buffering
FILE HANDLING
variables offer only temporary storage of data;
they are lost when they go out of scope or
when the program terminates
FILE HANDLING
files are used for long-term retention of
large amounts of data, even after the program
that created the data terminates.
data maintained in files is called persistent data
FILE HANDLING | Data Hierarchy
Bit (“binary digit”) => the smallest computer data item
A bit is a digit that can assume one of two values: 0 or 1
FILE HANDLING | Data Hierarchy
Programming with low-level bit formats is tedious;
we use decimal digits, letters, and symbols instead.
FILE HANDLING | Data Hierarchy
Characters are made up of digits, letters, and symbols.
Characters are represented as combinations of bits (bytes).
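A small sketch of how one character maps to a pattern of bits. The UTF-8 encoding is an assumption here; the slide does not name an encoding.

```python
# Each character has a numeric code; that code is stored as a pattern of bits.
char = "A"
code = ord(char)                  # numeric code point of the character
bits = format(code, "08b")        # the same number written as 8 binary digits
encoded = char.encode("utf-8")    # the byte(s) that would be stored in a file

print(char, code, bits, encoded)
```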
FILE HANDLING | Data Hierarchy
Field (Column) is a collection of characters,
represented as words.
Record (Row) is a collection of fields,
represented as a tuple, a dictionary, or an instance of a class.
File (Table) is a collection of records,
implemented as sequential-access or random-access.
Database (Folder) is a collection of files,
handled by DBMS software.
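The three ways of representing a record named above can be sketched as follows. The field names and the Student class are hypothetical, chosen only for illustration.

```python
# Field: a single value. Record: a group of related fields.
record_as_tuple = ("Juan", "Cruz", 21)
record_as_dict = {"first_name": "Juan", "last_name": "Cruz", "age": 21}


class Student:
    """A record represented as an instance of a class."""

    def __init__(self, first_name, last_name, age):
        self.first_name = first_name
        self.last_name = last_name
        self.age = age


# File: a collection of records (a list stands in for the file contents here).
records = [Student("Juan", "Cruz", 21), Student("Maria", "Santos", 20)]
last_names = [student.last_name for student in records]
print(last_names)
```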
FILE HANDLING | open() & close()
magical_file = open("file_name.txt" [, "a"|"r"|"r+"|"w"|"w+"] [, buffer_mode])
magical_file.close()
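A minimal sketch of the open-then-close pattern. The temporary directory is an assumption made so that the example does not overwrite any real file.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "file_name.txt")

magical_file = open(path, "w")    # mode "w": create the file for writing
try:
    magical_file.write("hello\n")
finally:
    magical_file.close()          # always release the file handle

print(magical_file.closed)
```

The try/finally is one way to make sure close() runs even if the write fails.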
FILE HANDLING | Other Functions
FILE HANDLING | open()
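Open Mode  Read  Write  Appends  Overwrites  Creates  Cursor @ Start  Cursor @ EOF
r          yes   no     no       no          no       yes             no
r+         yes   yes    no       no          no       yes             no
w          no    yes    no       yes         yes      yes             no
w+         yes   yes    no       yes         yes      yes             no
a          no    yes    yes      no          yes      no              yes
a+         yes   yes    yes      no          yes      no              yes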
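The behaviour of the basic modes can be checked with a short sketch. The file name is hypothetical and FileNotFoundError assumes Python 3.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")  # hypothetical name

# "r" does not create a missing file.
try:
    open(path, "r")
    missing_raises = False
except FileNotFoundError:
    missing_raises = True

with open(path, "w") as f:    # "w" creates the file
    f.write("first\n")
with open(path, "w") as f:    # "w" again overwrites (truncates) it
    f.write("second\n")
with open(path, "a") as f:    # "a" keeps the contents and writes at EOF
    f.write("third\n")
with open(path, "r") as f:
    contents = f.read()

print(missing_raises, repr(contents))
```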
FILE HANDLING | Common Modes
Open Mode  Read  Write  Appends  Overwrites  Creates  Cursor @ Start  Cursor @ EOF
r          yes   no     no       no          no       yes             no
w          no    yes    no       yes         yes      yes             no
FILE HANDLING | open()
"r" is the default file-open mode
open("input.dat") is equivalent to open("input.dat", "r")
FILE HANDLING | r
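A minimal sketch of reading in "r" mode. The sample file and its contents are assumptions, created first so that the example is self-contained.

```python
import os
import tempfile

# Create a small sample file first (hypothetical name and contents).
path = os.path.join(tempfile.mkdtemp(), "input.dat")
with open(path, "w") as f:
    f.write("line one\nline two\n")

reader = open(path, "r")
whole_text = reader.read()      # read everything as one string
reader.seek(0)                  # move the cursor back to the start
lines = reader.readlines()      # read everything as a list of lines
reader.close()

print(repr(whole_text))
print(lines)
```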
FILE HANDLING | w
try removing line #6
try removing "\n" in lines #3 and #4
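The line numbers in the notes above refer to the slide's own listing. As a stand-in, here is a minimal sketch of writing in "w" mode, with an assumed file name and contents, in which the same experiments can be tried: dropping close() may leave text in the buffer for a while, and dropping "\n" joins the two strings on one line.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.txt")  # hypothetical name

writer = open(path, "w")
writer.write("first line\n")     # "\n" ends the line
writer.write("second line\n")
writer.close()                   # flushes buffered text to disk

with open(path) as f:
    written = f.read()
print(repr(written))
```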
FILE HANDLING | with-as Keyword
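A short sketch of the with-as form, which closes the file automatically when the block ends. The file name and text are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "notes.txt")  # hypothetical name

with open(path, "w") as f:
    f.write("saved with a context manager\n")
closed_after_block = f.closed    # the file is already closed here

with open(path) as f:
    text = f.read()
print(closed_after_block, repr(text))
```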
FILE HANDLING | Parsing
Paninda.txt
FILE HANDLING | Parsing | split
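A sketch of parsing with split(). The actual contents of Paninda.txt are not shown in the transcript, so the "name price" layout below is an assumption.

```python
import os
import tempfile

# Assumed layout: one item per line, name and price separated by a space.
path = os.path.join(tempfile.mkdtemp(), "Paninda.txt")
with open(path, "w") as f:
    f.write("suka 15.50\ntoyo 18.00\nasin 5.25\n")

prices = {}
with open(path) as f:
    for line in f:
        name, price = line.split()   # split on whitespace, drops the "\n"
        prices[name] = float(price)

print(prices)
```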
FILE HANDLING | Parsing | csv
Paranormal_Sightings.csv
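A sketch of reading a CSV file with the standard csv module, which handles commas inside quoted fields. The column names and rows are invented for illustration; the real contents of Paranormal_Sightings.csv are not in the transcript.

```python
import csv
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "Paranormal_Sightings.csv")

# Write assumed sample data; the second place contains a comma.
with open(path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["creature", "place", "count"])
    writer.writerow(["kapre", "Laguna", "3"])
    writer.writerow(["tikbalang", "Lucban, Quezon", "1"])

# Read each row back as a dictionary keyed by the header row.
with open(path, newline="") as f:
    rows = list(csv.DictReader(f))

places = [row["place"] for row in rows]
print(places)
```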
FILE HANDLING | Parsing | strip
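A sketch of strip(), which removes leading and trailing whitespace (including the newline) from each field. The sample lines are assumptions.

```python
# Lines as they might come from a file, with stray spaces and newlines.
raw_lines = ["  suka , 15.50 \n", "toyo,18.00\n"]

cleaned = []
for line in raw_lines:
    fields = [field.strip() for field in line.split(",")]
    cleaned.append(fields)

print(cleaned)
```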
FILE HANDLING | Parsing & Classes
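One way to combine parsing with classes is to turn each line of the file into an object. The Item class, its from_line helper, and the file layout are all hypothetical.

```python
import os
import tempfile


class Item:
    """One record of the file: a name field and a price field."""

    def __init__(self, name, price):
        self.name = name
        self.price = price

    @classmethod
    def from_line(cls, line):
        # "suka 15.50\n" becomes Item("suka", 15.5)
        name, price = line.split()
        return cls(name, float(price))


path = os.path.join(tempfile.mkdtemp(), "Paninda.txt")
with open(path, "w") as f:
    f.write("suka 15.50\ntoyo 18.00\n")

with open(path) as f:
    items = [Item.from_line(line) for line in f if line.strip()]

names = [item.name for item in items]
total = sum(item.price for item in items)
print(names, total)
```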
FILE HANDLING | Parsing & Classes 2
FILE HANDLING | HTML Parsing
MangJose.html
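A sketch of HTML parsing using the standard html.parser module; this may differ from the technique used on the original slides. The markup below is invented, since the contents of MangJose.html are not in the transcript, and ListItemParser is a hypothetical helper.

```python
import os
import tempfile
from html.parser import HTMLParser


class ListItemParser(HTMLParser):
    """Collects the text found inside <li> elements."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_item = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        if self.in_item and data.strip():
            self.items.append(data.strip())


path = os.path.join(tempfile.mkdtemp(), "MangJose.html")
with open(path, "w") as f:
    f.write("<html><body><ul><li>suka</li><li>toyo</li></ul></body></html>")

parser = ListItemParser()
with open(path) as f:
    parser.feed(f.read())
print(parser.items)
```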
FILE HANDLING | HTML Parsing
MangJose.html
FILE HANDLING | HTML Parsing
FILE HANDLING | HTML Parsing
FILE HANDLING | HTML Parsing
FILE HANDLING | HTML Parsing
FILE HANDLING | HTML Parsing
FILE HANDLING | HTML Parsing
FILE HANDLING | r+, w+, a+
All of the "plus" modes allow both reading and writing;
the main difference between them is where
we're positioned in the file.
"r+" puts us at the beginning.
"w+" puts us at the beginning, which is also the end,
because the file is truncated.
"a+" puts us at the end.
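The starting positions described above can be observed with tell(). This is a sketch assuming Python 3 and a hypothetical six-character file.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "plus_modes.txt")  # hypothetical name
with open(path, "w") as f:
    f.write("abcdef")

with open(path, "r+") as f:
    r_plus_pos = f.tell()     # beginning of the existing data
    r_plus_text = f.read()    # the data is still there

with open(path, "a+") as f:
    a_plus_pos = f.tell()     # end of the existing data

with open(path, "w+") as f:   # done last because it truncates the file
    w_plus_pos = f.tell()     # beginning, which is also the end
    w_plus_text = f.read()    # nothing left to read

print(r_plus_pos, a_plus_pos, w_plus_pos)
```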
FILE HANDLING | w+
FILE HANDLING | Buffering
-1 is the default file-open buffering mode
open("input.dat") is equivalent to open("input.dat", "r", -1)
Flag  Meaning
0     unbuffered
1     line buffered
n     buffered with size n
-1    system default
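A sketch of the four buffering flags, assuming Python 3, where unbuffered I/O (0) is only allowed in binary mode and line buffering (1) applies to text mode. The file name and the buffer size 4096 are arbitrary choices.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "input.dat")  # hypothetical name

with open(path, "wb", buffering=0) as f:     # 0: unbuffered (binary mode)
    f.write(b"raw\n")
with open(path, "a", buffering=1) as f:      # 1: line buffered (text mode)
    f.write("line\n")
with open(path, "r", buffering=4096) as f:   # n: buffer of n bytes
    text_n = f.read()
with open(path, "r", buffering=-1) as f:     # -1: system default
    text_default = f.read()

# In Python 3, asking for unbuffered text I/O is rejected.
try:
    open(path, "r", buffering=0)
    unbuffered_text_allowed = True
except ValueError:
    unbuffered_text_allowed = False

print(repr(text_n), unbuffered_text_allowed)
```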
FILE HANDLING | Creating A Big File!
FILE HANDLING | Unbuffered r
Then, let’s read that big file.
FILE HANDLING | Buffered r
Now, with the help of buffering.
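A sketch of the experiment: create a file, then read it one byte at a time with and without buffering. The file size is an assumption kept small for a quick run, and the measured times depend on the machine, so no particular numbers are claimed.

```python
import os
import tempfile
import time

path = os.path.join(tempfile.mkdtemp(), "big_file.dat")  # hypothetical name
with open(path, "wb") as f:
    f.write(b"x" * 200_000)


def count_bytes(buffering):
    """Read the file one byte at a time; return (byte_count, seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb", buffering=buffering) as f:
        while f.read(1):
            total += 1
    return total, time.perf_counter() - start


unbuffered_count, unbuffered_secs = count_bytes(0)    # every read hits the OS
buffered_count, buffered_secs = count_bytes(-1)       # reads served from memory
print("unbuffered:", unbuffered_secs, "buffered:", buffered_secs)
```

Both runs read the same bytes; only the number of requests made to the operating system differs.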
FILE HANDLING | Buffered By Default
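In some other languages, buffering is not always the default:
Java's basic file streams are unbuffered unless wrapped in a buffered stream,
while C's stdio streams (fopen) are buffered by default.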
FILE HANDLING | What else?
1. Random-Access Files: for fast searching/editing of records
* use the shelve module
* shelve.open()
2. Serialization: converting objects into a byte stream that can be stored in a file;
useful for transferring data (objects, sequences, etc.)
across a network connection or saving the state of a game
* use the pickle module (cPickle in Python 2)
* pickle.dump(stringList_to_be_written, serialized_file)
* records = pickle.load(serialized_file)
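Both ideas can be sketched with the standard library, assuming Python 3 (where the pickle module replaces cPickle). The keys, record contents, and file names are hypothetical.

```python
import os
import pickle
import shelve
import tempfile

folder = tempfile.mkdtemp()

# 1. shelve: a persistent, dictionary-like store with lookup by key.
shelf_path = os.path.join(folder, "records")
with shelve.open(shelf_path) as db:
    db["1001"] = {"name": "Juan", "balance": 250.0}
    db["1002"] = {"name": "Maria", "balance": 975.5}
with shelve.open(shelf_path) as db:
    maria = db["1002"]            # fetch one record without scanning the rest

# 2. pickle: serialize an object to a file, then load it back.
string_list = ["alpha", "beta", "gamma"]
pickle_path = os.path.join(folder, "strings.pickle")
with open(pickle_path, "wb") as serialized_file:
    pickle.dump(string_list, serialized_file)
with open(pickle_path, "rb") as serialized_file:
    records = pickle.load(serialized_file)

print(maria, records)
```

Pickle files should only be loaded from trusted sources, since loading can execute arbitrary code.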
PRACTICE EXERCISE| MORSE CODE
PRACTICE EXERCISE| MC CHART
PRACTICE EXERCISE| MORSE CODE
A. Read a file containing Filipino/English-language
phrases and encode it into Morse code.
B. Read a Morse code file and convert it into the
Filipino/English-language equivalent.
Use one blank between each Morse-coded letter and three blanks between each Morse-coded word.
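As a starting point only, the spacing rule can be sketched on plain strings. This sketch covers the letters A to Z of International Morse code, silently skips other characters when encoding, and leaves the file reading and writing to the exercise. The function names encode and decode are hypothetical.

```python
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
}
LETTERS = {code: letter for letter, code in MORSE.items()}


def encode(text):
    """One blank between letters, three blanks between words."""
    words = text.upper().split()
    coded_words = [" ".join(MORSE[ch] for ch in word if ch in MORSE)
                   for word in words]
    return "   ".join(coded_words)


def decode(code):
    """Inverse of encode for text made of the letters A to Z."""
    words = code.split("   ")
    return " ".join("".join(LETTERS[symbol] for symbol in word.split(" "))
                    for word in words)


print(encode("hi po"))
print(decode(encode("mahal kita")))
```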
REFERENCES
Deitel, Deitel, Liperi, and Wiedermann - Python: How to Program (2001).
Disclaimer: Most of the images/information used here have no proper source
citation, and I do not claim ownership of these either. I don’t want to reinvent the
wheel, and I just want to reuse and reintegrate materials that I think are useful or
cool, then present them in another light, form, or perspective. Moreover, the
images/information here are mainly used for illustration/educational purposes only,
in the spirit of openness of data, spreading light, and empowering people with
knowledge.