file i/o: low-level


Upload: lelia

Post on 24-Feb-2016


DESCRIPTION

FILE I/O: Low-level. The Big Picture. Reading vs. Writing.
"Reading a file": to obtain data from a file (typically on a disk) and move a copy of it into RAM.
"Writing a file": to copy data from RAM into a file (typically on a disk).

TRANSCRIPT

Page 4: FILE I/O:  Low-level

4

Why Use Files?

Volatile Memory
Computers today use "volatile RAM": its contents are erased when the power goes out. Someday, computers will use "non-volatile RAM" (e.g. flash RAM). Until then, we make use of "secondary storage" (hard drive) to save information more permanently.

Too much data
Frequently we have more information available than we can work with in main memory (RAM). So we store most of it on the hard drive (secondary storage) and retrieve only what we need.

Packaging
Putting information in a file gets it organized and keeps it together.


Low-Level, cont.

Some files have mixed formats that high-level functions such as xlsread() cannot read.

Since the high-level functions do not easily recognize the data, every step to read the file requires a separate MATLAB command:
1. Open a file, either to read from or to write to
2. Read or write data from/to the file with the specific delimiters
3. Close the file
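The three steps can be sketched as follows. This is only an illustration: the temporary file name and its contents (a text header line followed by numbers) are assumptions made up for this example, not part of the slides.

```matlab
% Hypothetical mixed-format file in the system temp folder (assumed for this sketch)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, 'Scores for Alice\n90 85 77\n');
fclose(fh);

% 1. Open the file for reading
fh = fopen(fname, 'r');
% 2. Read the data: first the text line, then the numbers
header = fgetl(fh);        % 'Scores for Alice'
scores = fscanf(fh, '%d'); % column vector of the three numbers
% 3. Close the file
fclose(fh);

delete(fname);             % remove the temporary file
```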


Opening and closing a file

Template to open a file:

fh = fopen(<filename>,<mode>);

fh is known as a file handle or file identifier. Later function calls use it to identify which file to work with. It is like a "nickname", and we will use it instead of the filename when working with the file.

<filename> represents a string that is the name of the file with its extension (the letters after the dot). It can either be hardcoded or stored in a variable.

<mode> is a string specifying the purpose of opening the file, called the "access mode". The most commonly used are:
'r' to read only from a file: bring data into memory from the disk
'w' to write to a file: put data from memory onto the disk
'a' to append to a file (add data to the end of the file)
Add + to the string to combine reading and writing (e.g. 'r+', 'w+').

Template to close a file:

fclose(fh);


Some examples…

% Example 1) open a file from which to read
fileGrades = fopen('grades.txt', 'r'); % hardcode the filename
<code block to be inserted here>
% close file
fclose(fileGrades);

% Example 2) ask user for a filename, then open it to read
nameFile = input('Name of file with grades? (e.g. grades.txt): ', 's');
fileGrades = fopen(nameFile, 'r');
<code block to be inserted here>
% close file
fclose(fileGrades);

In fclose(), use the file handle, not the file name!

In Example 2, nameFile has no quotes because it is a variable.

Notice that the file handle variable can be any acceptable variable name.
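When a file cannot be opened, fopen() returns -1 instead of a usable handle, so it is worth checking before reading. A minimal sketch, assuming a made-up file name that does not exist (the variable `opened` is also just for illustration):

```matlab
% Hypothetical name of a file that does not exist (tempname gives an unused name)
nameFile = [tempname, '.txt'];

fileGrades = fopen(nameFile, 'r');
if fileGrades == -1
    % fopen could not open the file, so there is nothing to read or close
    opened = false;
    disp('Could not open the file for reading.');
else
    opened = true;
    fclose(fileGrades);
end
```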


Closing Files

After working with a file, it is important to close the file. Other than being good form, it is critical when writing to the file.

When the OS is supposed to put information on disk it frequently waits until it determines the best time. This is known as "write caching".


Closing Files

You have seen write caching at work in the "safe remove" warning on USB drives.

The OS may wait to write data. If your program finishes and the data hasn't been written, it will not be written at all!

Close the file before finishing the program. This forces the OS to write the data to the disk.
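A small sketch of the write, close, then read-back pattern. The temporary file name and the text written are assumptions for this example only.

```matlab
% Hypothetical temporary file (assumed for this sketch)
fname = [tempname, '.txt'];

fh = fopen(fname, 'w');
fprintf(fh, 'saved\n');
status = fclose(fh);   % fclose returns 0 when the file closed successfully

% Reopen the file and confirm the data reached it
fh = fopen(fname, 'r');
txt = fgetl(fh);
fclose(fh);

delete(fname);         % remove the temporary file
```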


More Examples…

% Open a file for reading and writing
fh = fopen('my_project.abc', 'r+');

File name: The file extension (.abc here) is used by Windows, but only to tell it what program should be used if the Windows user wants to open the file. You are free to use any extension you want with your data files. The only impact will be that Windows may not know what program should use that file.

Would 'w+' work also? Maybe. See the next slide…


Opening Files: Super-secret Access-mode Codes

The "access mode" codes indicate how you will be using a file after you open it.

Since the operating system has permissions assigned to files, when you request access to a file you must tell the system in what mode you will be using the file.

The codes used for this tell the OS what it needs to know, and have an impact on how you will use the file.


Opening Files: File Position Pointer

When a file is opened, a "file position pointer" is created. The system keeps track of the point in the file to which your program has read or written.

Think of it like a cursor that moves as you read or write the file.

The file position pointer is set initially to different locations depending on the access mode.


Opening Files: Access Mode & File Existence

In addition to the file position pointer, the system also has to decide what will happen if the file does or does not exist when you try to open it.

If it already exists, should the file be deleted?

If it doesn’t yet exist, should it be created?
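The effect of the access mode on a missing or existing file can be tried out directly. The sketch below uses a made-up temporary file name and made-up text; it only exercises 'r', 'w' and 'a'.

```matlab
% Hypothetical file that does not exist yet (assumed for this sketch)
fname = [tempname, '.txt'];

fhMissing = fopen(fname, 'r');   % 'r' does not create a file: fopen returns -1

fh = fopen(fname, 'w');          % 'w' creates the file
fprintf(fh, 'first');
fclose(fh);

fh = fopen(fname, 'a');          % 'a' keeps the contents and writes at the end
fprintf(fh, ' second');
fclose(fh);
afterAppend = fileread(fname);   % 'first second'

fh = fopen(fname, 'w');          % 'w' on an existing file discards its contents
fprintf(fh, 'third');
fclose(fh);
afterRewrite = fileread(fname);  % 'third'

delete(fname);                   % remove the temporary file
```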


Opening Files: Choosing an access mode

A “log file” is a file that keeps a history of events. Many programs keep log files. They help programmers see what occurred in the past so that a problem can be fixed.

If your program is going to keep a log file, what is the best mode to use when opening this file? Why?


Opening Files: Choosing an access mode

You are writing a program that will manage a database. You will be accessing files at different times within the program, so you decide to close and reopen the file several times. For each of these times, how should you open the file?

1. User wants to view a record in the database

2. User wants to modify a record in the database

3. User wants to add a record to the database


Writing Text Files

fprintf(<file handle>, … The rest is as usual...);

Use the file handle, not the file name!

Don't forget the semicolon! fprintf() can return the number of bytes it wrote. If you capture that value (for example n = fprintf(...)) and leave off the semicolon, MATLAB displays the number in the command window.

Example:
fh = fopen('log_file.txt', 'a');
fprintf(fh, 'Event #%d: \t%s\n', event_num, event_description);
fclose(fh);
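A runnable variation of the log example, as a sketch. The values of event_num and event_description and the temporary log file name are assumptions made up here, since the slide does not define them.

```matlab
% Hypothetical log file and event data (assumed for this sketch)
logName = [tempname, '.txt'];
event_num = 1;
event_description = 'Program started';

fh = fopen(logName, 'a');   % append mode: creates the file if it is missing
nbytes = fprintf(fh, 'Event #%d: \t%s\n', event_num, event_description);
fclose(fh);

logText = fileread(logName);   % read the whole log back as text
delete(logName);               % remove the temporary file
```

Here nbytes holds the number of bytes written, which for this plain ASCII text equals the number of characters in logText.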


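MS Windows Text files

When writing '\n' to a text file, MATLAB writes only a single newline character at the end of a line, yet Windows text files conventionally end each line with two characters (carriage return, then newline). So if you open the file in a program that expects both, such as older versions of Notepad, it will not look like you expect: the lines run together.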
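One way to get Windows-style line endings is to write both characters explicitly with '\r\n'. The sketch below does that and then inspects the raw bytes; the file name and text are made up for the example.

```matlab
% Hypothetical temporary file (assumed for this sketch)
fname = [tempname, '.txt'];

fh = fopen(fname, 'w');
fprintf(fh, 'line one\r\n');   % carriage return (13) followed by newline (10)
fprintf(fh, 'line two\r\n');
fclose(fh);

% Read the raw bytes back to see what was actually stored
fh = fopen(fname, 'r');
bytes = fread(fh, inf, 'uint8')';
fclose(fh);
delete(fname);

lineEnd = bytes(9:10);         % the two bytes right after 'line one'
```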


Writing Text Files

Inserting data into the middle of a text file

Writing to text files is not like working in Word!

When you write to a text file, the new data overwrites any existing data that follows the file's position pointer. There is no "insert mode"!
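A short sketch of the overwrite behaviour. The file contents ('abcdefghij') and the position chosen are assumptions for illustration.

```matlab
% Hypothetical temporary file holding ten letters (assumed for this sketch)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, 'abcdefghij');
fclose(fh);

% Reopen for reading and writing, move 3 bytes in, and write two characters
fh = fopen(fname, 'r+');
fseek(fh, 3, 'bof');
fprintf(fh, 'XY');
fclose(fh);

result = fileread(fname);   % 'd' and 'e' were replaced, nothing was inserted
delete(fname);
```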


Reading text files

Reading an entire line as a string

including the newline character in the variable:
str = fgets(<file handle>);

without storing the newline character in the variable:
str = fgetl(<file handle>);

Reading numeric data

data = fscanf(<file handle>, <format>);
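The difference between fgets() and fgetl() shows up in the length of the returned string. A sketch with a made-up two-line file:

```matlab
% Hypothetical two-line file (assumed for this sketch)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, 'first\nsecond\n');
fclose(fh);

fh = fopen(fname, 'r');
withNewline = fgets(fh);      % 'first' plus the newline character: 6 characters
withoutNewline = fgetl(fh);   % 'second' with the newline dropped: 6 characters
fclose(fh);
delete(fname);
```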


Using fscanf()

fscanf() is like the reverse of fprintf(). You specify the format you want to match and fscanf() will read from the file as long as it can match that format.

fscanf() is not good for reading strings mixed with numbers: when the format combines numeric and character fields, it returns a numeric array and saves the characters as their ASCII equivalents.
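A minimal sketch of fscanf() reusing a format until the file runs out. The file contents (three tab-separated pairs of integers) are an assumption for this example.

```matlab
% Hypothetical file with three rows of two tab-separated integers (assumed)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, '1\t2\n3\t4\n5\t6\n');
fclose(fh);

fh = fopen(fname, 'r');
data = fscanf(fh, '%d\t%d');   % the format is reapplied until no more data matches
fclose(fh);
delete(fname);
```

Without a size argument, the six values come back as a single column vector in the order they appear in the file.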


Using fscanf()

Change the function call by adding a third argument, the size of the output:

data = fscanf(fh, '%d\t%d', [2, 3])

[Figure: resulting matrix]

MATLAB is still reading the data in line-order, and still storing the data in column-order. But we've now specified how big the columns will be: two rows each.


Using fscanf()

But we may want the data to be in the form of the file. Unfortunately, changing the third argument doesn't help:

data = fscanf(fh, '%d\t%d', [3, 2])

[Figure: original file data]

This is because fscanf() is still filling the variable in "column-order": it fills a column first and then moves on to the next column.
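The column-order filling can be seen with a small made-up file (three rows of two integers; the slide's own data file is not reproduced here). One common way to recover the file's layout, shown as a sketch, is to read with two rows and then transpose.

```matlab
% Hypothetical file with three rows of two tab-separated integers (assumed)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, '1\t2\n3\t4\n5\t6\n');
fclose(fh);

fh = fopen(fname, 'r');
a = fscanf(fh, '%d\t%d', [2, 3]);   % [1 3 5; 2 4 6]
frewind(fh);                        % go back to the start to read again
b = fscanf(fh, '%d\t%d', [3, 2]);   % [1 4; 2 5; 3 6], not the file's layout
fclose(fh);
delete(fname);

c = a';                             % transpose: [1 2; 3 4; 5 6], same as the file
```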


Using fscanf()

But suppose we don't know how many sets of data will be in the file?

Use MATLAB's inf constant. It means "as many as needed".

data = fscanf(fh, '%d\t%d', [2, inf])

Now, if the data file gets larger, your program can still handle it.

Will this work?

data = fscanf(fh, '%d\t%d', [inf, 2])
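A sketch of the [2, inf] form on a made-up file with four pairs of numbers. The program does not need to know the number of pairs in advance; it can ask for the size afterwards.

```matlab
% Hypothetical file with four rows of two tab-separated integers (assumed)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, '10\t20\n30\t40\n50\t60\n70\t80\n');
fclose(fh);

fh = fopen(fname, 'r');
data = fscanf(fh, '%d\t%d', [2, inf]);   % two rows, as many columns as needed
fclose(fh);
delete(fname);

numSets = size(data, 2);   % number of pairs found in the file
asInFile = data';          % one row per line of the file
```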


Moving around within files

When reading and writing to files, the system maintains a "file position pointer". Think of it as a cursor tracking your position in the file.

Every time you read from the file, the file position pointer moves past all of the characters you have read.

Every time you write to the file, the file position pointer remains immediately after the last character you wrote.


Moving around within files

fseek()
Move to a specific byte position within the file.

frewind()
Move to the beginning of the file.

ftell()
Return the file position pointer's byte position (number of bytes from the beginning of the file).

feof()
Returns 1 (true) if the file position pointer is at the end of the file. Note that the file position pointer must be past any non-visible characters (newlines, tabs, spaces, etc.) for this to occur.


Moving around within files

fseek(fh, 0, 'cof');

Why would we want to move 0 bytes from the current position?

Because there is a (frequently unmentioned) property of files: You cannot read from and then write to a file (or write to and then read from a file) without an intervening setting of the file position pointer. The command above sets the file position pointer without moving it.


Moving around within files

Example: Suppose testfile.txt exists already. We want to find a location within the file, and then write to the file.

Doesn't work:

fh = fopen('testfile.txt', 'r+');
. . .
x = fgets(fh);
fprintf(fh, 'fred');
fclose(fh);

Works:

fh = fopen('testfile.txt', 'r+');
. . .
x = fgets(fh);
fseek(fh, 0, 'cof');
fprintf(fh, 'fred');
fclose(fh);


Moving around within files

frewind(fh)
Essentially the same as fseek(fh, 0, 'bof').

ftell(fh)
Returns the byte position within the file.

Example:
p = ftell(fh);
...
fseek(fh, p, 'bof')

CAUTION: Byte positions depend on the format of the file. Do not assume that a byte and a character are the same thing!
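The ftell()/fseek() pair works like a bookmark. In the sketch below, the three-line file and the variable names other than fh and p are assumptions for illustration.

```matlab
% Hypothetical three-line file (assumed for this sketch)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, 'alpha\nbeta\ngamma\n');
fclose(fh);

fh = fopen(fname, 'r');
first = fgetl(fh);      % read 'alpha'; the pointer is now at the start of line 2
p = ftell(fh);          % remember this position (byte offset from the beginning)
second = fgetl(fh);     % read 'beta'
fseek(fh, p, 'bof');    % jump back to the remembered position
again = fgetl(fh);      % read 'beta' a second time
frewind(fh);            % back to the very beginning
atStart = ftell(fh);    % byte position 0
fclose(fh);
delete(fname);
```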


Moving around within files

feof() is normally used as a loop condition:

fh = fopen('datafile.txt', 'r');
data = [];
% keep reading while the handle is valid and the end of the file is not reached
while (fh > 0 && ~feof(fh))
    s = fgetl(fh);
    data = strvcat(data, s);
end
% only close the file if it was actually opened
if fh > 0
    fclose(fh);
end

What does the loop condition mean? Note that the order of these boolean expressions is important: we want to test for a valid file handle before we use it in the feof() function call.
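A self-contained variation of the feof() loop, offered as a sketch rather than the slide's own code: it collects the lines in a cell array instead of a padded character matrix, and it skips the value fgetl() returns when there is nothing left to read. The file name and contents are made up.

```matlab
% Hypothetical three-line file (assumed for this sketch)
fname = [tempname, '.txt'];
fh = fopen(fname, 'w');
fprintf(fh, 'red\ngreen\nblue\n');
fclose(fh);

fh = fopen(fname, 'r');
lines = {};
if fh ~= -1                      % only use the handle if fopen succeeded
    while ~feof(fh)
        s = fgetl(fh);
        if ischar(s)             % fgetl returns -1 (not text) at end of file
            lines{end+1} = s;    % lines may have different lengths
        end
    end
    fclose(fh);
end
delete(fname);
```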


EXTRA: Binary Files (not on any exam…)

Many programs today do not use ASCII text for their files.

ASCII is great for being able to read the data file, but it can make the file unwieldy.

As an alternative, files can be stored with "binary data". The data stored is not intended to be read as ASCII.


Binary Files

Just as with ASCII files, the format of the file must be known in order to work with it.

Once you know the format, you can read and write to the file, but first you must open it in "binary" mode. In Windows, just add a "b" to the access mode:

fh = fopen('myfile.bin', 'rb+');
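A sketch of writing and reading binary data with fwrite() and fread(). The file name, the values, and the choice of 32-bit integers as the format are assumptions for this example. It uses plain 'w' and 'r' modes on the assumption that MATLAB's fopen() treats files as binary unless a 't' is added to the mode.

```matlab
% Hypothetical binary file and data (assumed for this sketch)
fname = [tempname, '.bin'];
values = [3 1 4 1 5];

fh = fopen(fname, 'w');
count = fwrite(fh, values, 'int32');   % write each value as a 4-byte integer
fclose(fh);

% The reader must know the format: here, a sequence of int32 values
fh = fopen(fname, 'r');
readBack = fread(fh, inf, 'int32')';
fclose(fh);

info = dir(fname);
fileBytes = info.bytes;                % 5 values * 4 bytes each
delete(fname);
```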