Data Structure & File Structure
Hun Myoung Park, Ph.D.
Public Management and Policy Analysis Program
Graduate School of International Relations
International University of Japan
Outline
• Data Type
• Data Structure
• File Structure
• File System
• Directory
• Binary versus Text Files
Data Type
• Abstract data types (ADT) include
  • Stack (LIFO)
  • Queue (FIFO)
  • List, tree, and graph
• Primitive type: number, string, boolean
• Composite type (bundling of elements): array and record
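To make the LIFO and FIFO orderings concrete, here is a minimal sketch in Python. It assumes a plain list as the stack and `collections.deque` as the queue; the item values are made up for illustration.

```python
from collections import deque

# Stack: last in, first out (LIFO). A Python list stands in for the stack.
stack = []
stack.append("a")
stack.append("b")
stack.append("c")
last_in = stack.pop()        # removes the most recently added item

# Queue: first in, first out (FIFO), using collections.deque.
queue = deque()
queue.append("a")
queue.append("b")
queue.append("c")
first_in = queue.popleft()   # removes the earliest added item

print(last_in, first_in)     # c a
```

Both structures receive the same three items in the same order; only the removal end differs.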
Data Structure
• A way of storing and organizing data.
• “A collection of related variables that can be accessed individually or as a whole”
• “A set of data items that share a specific relationship”
• Implements abstract data types
• Arrays, records, and linked lists.
• Array (fixed-size sequence of elements) vs. list (variable size)
Data Structure: Array
• A sequenced collection of elements
• Elements may or may not share the same data type
• Use the array name alone, or the array name and an index (subscript), to refer to elements.
• a[0], a[1], a[2], … instead of a0, a1, a2, …
• The array name alone (a) refers to the whole array: a[0], a[1], …
• Multi-dimensional array: a[0][3]
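The subscript notation can be tried directly. This is a small sketch that assumes a Python list as a stand-in for an array (a Python list is not fixed in size, so it only approximates one); the values are illustrative.

```python
# A one-dimensional array; a Python list stands in for the array here.
a = [10, 20, 30, 40]

first = a[0]          # index (subscript) 0 refers to the first element
third = a[2]
whole = a             # the name alone refers to the whole array

# A two-dimensional array as a list of rows: grid[row][column].
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
]
element = grid[0][3]  # row 0, column 3

print(first, third, len(whole), element)   # 10 30 4 4
```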
Data Structure: Record
• A collection of related elements (fields or attributes) of an entity
• The name of a record is the name of the whole structure (e.g., student)
• Each field has its own name (e.g., name, id, …)
• student.name, student.id, student.age, …
• Array of records (e.g., student[1].name, student[1].id, …, student[2].name, student[2].id, …)
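A record and an array of records might look like the following sketch. The field names come from the slide; the field types and the sample values are assumptions, and a `dataclass` is just one convenient way to express a record in Python.

```python
from dataclasses import dataclass

@dataclass
class Student:
    # Field names follow the slide; the types are assumptions.
    name: str
    id: int
    age: int

# A single record: fields are reached as record.field.
student = Student(name="Aiko", id=1001, age=24)

# An array of records: index first, then field.
students = [
    Student(name="Aiko", id=1001, age=24),
    Student(name="Ben", id=1002, age=27),
    Student(name="Chen", id=1003, age=25),
]

print(student.name, student.id, student.age)   # Aiko 1001 24
print(students[1].name, students[2].id)        # Ben 1003
```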
Data Structure: Linked List
• “A collection of data in which each element contains the location of the next element.”
• Each element consists of data and a link
• Data contain the information to be processed
• Link contains a pointer (address) that identifies the next element in the list.
• The last element's link is a null pointer, since no element follows it
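The data-plus-link idea can be sketched as follows. The names `Node`, `build`, and `traverse` are hypothetical helpers, and Python's `None` plays the role of the null pointer.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    data: str                        # information to be processed
    link: Optional["Node"] = None    # next element; None acts as the null pointer

def build(items):
    """Build a linked list from a sequence and return its first node."""
    head = None
    for item in reversed(items):
        head = Node(data=item, link=head)
    return head

def traverse(head):
    """Follow the links from the head until the null pointer is reached."""
    values = []
    node = head
    while node is not None:
        values.append(node.data)
        node = node.link
    return values

head = build(["x", "y", "z"])
values = traverse(head)
print(values)   # ['x', 'y', 'z']
```

Unlike an array, no index is available: reaching the third element means following two links from the head.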
Array versus Linked List
File Structure
• “An external collection of related data treated as a unit”
• Used to store data permanently in a secondary (auxiliary) storage device
• Examples are an MS Word file and the display of information on the screen
• Sequential access (one record after another from beginning to end) versus random access
Sequential Files 1
• Sequential access method
• Each record is accessed one after another from beginning to end.
• Master and transaction files are used for updates
• Cost saving (efficiency) and data security
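A sequential update reads the master file and the transaction file side by side, each from beginning to end, and writes a new master. The sketch below is one possible illustration, not a definitive procedure: it assumes both files are sorted by key, that each key has at most one transaction, and that in-memory lists stand in for the files. The action names ("add", "change", "delete") and the sample records are hypothetical.

```python
def sequential_update(master, transactions):
    """Merge sorted master records with sorted transactions in one pass."""
    new_master = []
    errors = []          # transactions that could not be applied
    m = t = 0
    while m < len(master) or t < len(transactions):
        if t == len(transactions):
            # No transactions left: copy the remaining master records.
            new_master.append(master[m])
            m += 1
        elif m == len(master) or transactions[t][0] < master[m][0]:
            # Transaction key is not in the master: only "add" is valid.
            key, action, value = transactions[t]
            if action == "add":
                new_master.append((key, value))
            else:
                errors.append((key, action))
            t += 1
        elif transactions[t][0] > master[m][0]:
            # Master record has no transaction: copy it unchanged.
            new_master.append(master[m])
            m += 1
        else:
            # Keys match: change or delete the master record.
            key, action, value = transactions[t]
            if action == "change":
                new_master.append((key, value))
            elif action != "delete":
                errors.append((key, action))
                new_master.append(master[m])
            m += 1
            t += 1
    return new_master, errors

master = [(101, "Ann"), (103, "Bob"), (105, "Cho")]
transactions = [(102, "add", "Dee"), (103, "change", "Bobby"), (105, "delete", None)]
new_master, errors = sequential_update(master, transactions)
print(new_master)   # [(101, 'Ann'), (102, 'Dee'), (103, 'Bobby')]
```

Neither file is ever searched or rewound, which is why this method suits storage that can only be read in order.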
Sequential Files 2
Indexed Files 1
• Random access method
• Consists of a data file and its index
• An index contains the key of each record in the data file and the address (record number) of the corresponding record
• An index is sorted by the key values (attributes) of the data file
• Find the desired key in the index, retrieve its address, and then access the record.
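The lookup steps can be sketched as below. This is an illustration under assumptions: a list of dictionaries stands in for the data file, the list position serves as the record number, and binary search (`bisect`) is one reasonable way to find a key in a sorted index. The records are made up.

```python
import bisect

# Data file: records in arrival order; the list position is the record number.
data_file = [
    {"id": 305, "name": "Cho"},
    {"id": 101, "name": "Ann"},
    {"id": 210, "name": "Bob"},
]

# Index: (key, record number) pairs, sorted by key.
index = sorted((record["id"], number) for number, record in enumerate(data_file))
keys = [key for key, _ in index]

def lookup(key):
    """Find the key in the index, take its address, then fetch the record."""
    pos = bisect.bisect_left(keys, key)
    if pos == len(keys) or keys[pos] != key:
        return None                  # key is not in the index
    address = index[pos][1]
    return data_file[address]

print(index)          # [(101, 1), (210, 2), (305, 0)]
print(lookup(210))    # {'id': 210, 'name': 'Bob'}
```

The data file itself stays unsorted; only the much smaller index has to be kept in key order.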
Indexed Files 2
Hashed Files 1
• Random access method
• Uses a mathematical function to map a key to an address
• User gives a key → the hash function maps the key to an address → the address is passed to the OS → the record is retrieved
• No need to have an index
• Hashing methods: direct, modulo division, and digit extraction; collisions (two keys mapping to the same address) must be resolved
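The three hashing methods can be written as small functions. The sketch below is illustrative only: the function names, the file size of 7 slots, and the sample keys are assumptions, and collisions are resolved here by linear probing (open addressing), which is one common strategy rather than the one the slides necessarily intend.

```python
def direct_hash(key):
    """Direct hashing: the key itself is the address."""
    return key

def modulo_division(key, size):
    """Modulo division: the remainder of key / file size is the address."""
    return key % size

def digit_extraction(key, positions):
    """Digit extraction: selected digits (1-based positions) form the address."""
    digits = str(key)
    return int("".join(digits[p - 1] for p in positions))

SIZE = 7                    # hypothetical number of record slots
slots = [None] * SIZE

def store(key, record):
    """Place a record, probing forward to the next free slot on a collision."""
    address = modulo_division(key, SIZE)
    for step in range(SIZE):
        probe = (address + step) % SIZE
        if slots[probe] is None:
            slots[probe] = (key, record)
            return probe
    raise RuntimeError("file is full")

def retrieve(key):
    """Hash the key, then probe forward until the key or an empty slot is found."""
    address = modulo_division(key, SIZE)
    for step in range(SIZE):
        probe = (address + step) % SIZE
        if slots[probe] is None:
            return None
        if slots[probe][0] == key:
            return slots[probe][1]
    return None

first = store(15, "Ann")    # 15 % 7 == 1
second = store(22, "Bob")   # 22 % 7 == 1 as well: a collision, so slot 2 is used
print(first, second, retrieve(22))            # 1 2 Bob
print(digit_extraction(125870, (1, 3, 4)))    # 158
```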
Hashed Files 2
File Systems
• Control how data are stored and read
• Unix/Linux: ext2, ext3, ext4, and others
• Mac: HFS, HFS Plus
• MS Windows:
  • FAT (File Allocation Table)
  • FAT32
  • NTFS (New Technology File System)
Directories 1
• A special type of file containing information about other files
• A directory itself is a file
• An index telling where files are located
• Organized as a tree (hierarchical structure)
• Each directory except the root directory has a parent directory.
Directories 2
• Root directory (/)
• Working directory (current directory)
• Parent directory versus child directory
• Absolute path versus relative path
  • Absolute: /home/kucc625/www
  • Relative: kucc625/www (assuming /home as the working directory)
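The relationship between the two path forms can be checked with Python's standard `posixpath` module, which handles Unix-style paths on any platform. The sketch only manipulates path strings and does not touch the file system; the paths are the ones from the slide.

```python
import posixpath

working_directory = "/home"
relative_path = "kucc625/www"

# A relative path is resolved against the working directory.
absolute_path = posixpath.join(working_directory, relative_path)

print(absolute_path)                       # /home/kucc625/www
print(posixpath.isabs(absolute_path))      # True
print(posixpath.isabs(relative_path))      # False

# The parent directory of www, and the path back from /home.
parent = posixpath.dirname(absolute_path)
back = posixpath.relpath(absolute_path, start=working_directory)
print(parent, back)                        # /home/kucc625 kucc625/www
```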
Binary Files
• A collection of data stored in the internal format of the computer.
• Use all 256 possible bit patterns of a byte (8 bits)
• Data can be characters, integers, floating-point numbers, and/or other types of data.
• Object files, images, videos, sounds, and formatted text files (e.g., MS Word files) are binary files
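As a rough illustration of "internal format", the sketch below packs an integer and a floating-point number into raw bytes with Python's `struct` module. The layout (little-endian, 32-bit integer, 64-bit float) and the values are assumptions chosen for the example, and an in-memory `io.BytesIO` object stands in for a file opened in binary mode.

```python
import io
import struct

# "<" = little-endian without padding, "i" = 32-bit integer, "d" = 64-bit float.
packed = struct.pack("<id", 255, 3.5)

binary_file = io.BytesIO()      # in-memory stand-in for a file opened in "wb" mode
binary_file.write(packed)

# Reading back requires knowing the same layout that was used for writing.
binary_file.seek(0)
number, value = struct.unpack("<id", binary_file.read())

# Binary data may use byte values above 127, outside the 7-bit ASCII range.
uses_high_bit = any(byte > 127 for byte in packed)

print(len(packed), number, value, uses_high_bit)   # 12 255 3.5 True
```

The 12 bytes carry no markers of their own; a program that does not know the layout cannot tell where the integer ends and the float begins.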
Text Files
• A sequence of lines of plain text
• A file of characters.
• Each byte holds one of the 128 ASCII codes (the MSB is 0 and the remaining 7 bits are used)
• Even a text file ultimately stores data as 0’s and 1’s
• Readable in text editors and many other applications
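The 7-bit claim is easy to verify. This small sketch encodes a made-up string as ASCII and inspects the resulting bytes.

```python
text = "Data 123\nFile\n"
encoded = text.encode("ascii")        # one byte per character

# Every ASCII code is below 128, so the most significant bit is always 0.
all_seven_bit = all(byte < 128 for byte in encoded)

# The first four characters ("Data") as 8-bit patterns.
bits = [format(byte, "08b") for byte in encoded[:4]]

lines = text.splitlines()             # a text file is a sequence of lines

print(all_seven_bit)   # True
print(bits)            # ['01000100', '01100001', '01110100', '01100001']
print(lines)           # ['Data 123', 'File']
```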
ASCII Files 1
• Text format containing ASCII characters
• Formats differ by the delimiter (the character separating data items)
• Free format (space delimited)
• Comma delimited format or comma separated values (CSV). Text in quotes
• Tab delimited format
• Fixed format
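The four layouts can be read with a few lines of Python. The sketch uses the standard `csv` module for the comma and tab delimited cases; the sample record and the fixed-format column widths (10, 2, and 10 characters) are hypothetical.

```python
import csv
import io

# The same record in four ASCII layouts (sample values are made up).
comma_text = 'name,age,city\n"Sato, A.",30,Tokyo\n'
tab_text = "name\tage\tcity\nSato\t30\tTokyo\n"
space_text = "Sato 30 Tokyo\n"
fixed_text = f"{'Sato':<10}{30:>2}{'Tokyo':<10}\n"

# Comma delimited: quotes protect the comma inside "Sato, A.".
comma_rows = list(csv.reader(io.StringIO(comma_text)))

# Tab delimited: same reader, different delimiter.
tab_rows = list(csv.reader(io.StringIO(tab_text), delimiter="\t"))

# Free format: items are separated by one or more spaces.
space_row = space_text.split()

# Fixed format: each field occupies known columns, with no delimiter at all.
line = fixed_text.rstrip("\n")
fixed_row = [line[0:10].strip(), line[10:12].strip(), line[12:22].strip()]

print(comma_rows[1])   # ['Sato, A.', '30', 'Tokyo']
print(tab_rows[1])     # ['Sato', '30', 'Tokyo']
print(space_row)       # ['Sato', '30', 'Tokyo']
print(fixed_row)       # ['Sato', '30', 'Tokyo']
```

Free format breaks down as soon as a value contains a space, which is one reason quoted CSV and fixed format exist.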
ASCII Files 2
Files for Specific Applications
• Formatted for specific applications
• MS Word (.doc & .docx) and Excel (.xls & .xlsx) have their own formats.
• Unlikely to be shared by multiple applications.
• One application (program) and its specific data format.
• One program-one data file?
References
Forouzan, Behrouz. 2013. Foundations of computer science, 3rd ed. Cengage Learning EMEA.