DATA BACKUP/RECOVERY TRAINING
Dehner M. De Leon
Officer – Database Administration
IRRI Social Sciences Division
WHY THERE IS A NEED?
Unexpected computer glitches always happen, whether the cause is
hardware failure,
corrupted files,
virus attacks,
power failures and spikes,
accidental deletion of an essential file, or
something else that makes it impossible to open and read a file.
As per IRRI’s BCM-RMQA
Yearly risk assessment (2008-2010)
Possible loss of research data
o Risk sources – technological failure
o Physically hazardous environments
o Lack of training/awareness
o Disasters/calamities
o Relocation of IT equipment (2011)
SSD Data Backup/Retrieval Procedure
Objective:
The purpose of this procedure is to safeguard all important/working files and databases on IRRI SSD/GIS workstations and laptops, in compliance with IRRI’s Business Continuity Management (BCM). In the event of data loss, a quick recovery plan is in place.
SSD Data Backup/Retrieval Procedure
Data to be backed up
Data commonly stored on the secondary partition
WHAT IS A PARTITION?
Disk partitioning is the act of dividing a hard disk drive into multiple logical storage units, referred to as partitions, to treat one physical disk drive as if it were multiple disks.
[Diagram: 1 drive (500GB) compared with 1 drive / 2 partitions (250GB + 250GB)]
Primary Partition (System Drive C:): Program Files, Temp Files
Secondary Partition (Data Drive E:): My Docs, pst files, etc., Customized Programs
OS? Data? Etc.?
Data to be backed up (commonly stored on the secondary partition)
Research data files: all of your files, including Microsoft Office documents, spreadsheets, presentations, databases, graphics files, audio and video files, software, etc.
Outlook pst files
Data that do not need to be backed up
Files on USB flash “thumb” drives
Partition drive C: (operating system)
NAS or \\Netwin\SSD
Data stored on the cloud
Non-official data
Source data whose size is larger than the backup storage capacity (520GB > 500GB)
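The last item above (source data larger than the backup storage) can be checked before a backup is started. The following is a minimal Python sketch under stated assumptions: the function names `folder_size_bytes` and `has_room` are hypothetical and are not part of the SSD procedure.

```python
import os


def folder_size_bytes(folder):
    """Sum the sizes of all regular files found under `folder`."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def has_room(needed_bytes, free_bytes):
    """Return True when the source data fits in the free backup space."""
    return needed_bytes <= free_bytes
```

In practice the free space of the backup drive could be read with `shutil.disk_usage(path).free` from the Python standard library and passed in as `free_bytes`. With the figures from the slide, `has_room(520 * 10**9, 500 * 10**9)` is false, so such a backup should not be attempted.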
What is a Backup?
In Information Technology, a backup, or the process of backing up, refers to making copies of data so that these additional copies may be used to restore the original after a data loss event.
Synchronizing/Mirroring: in computing, the process of making sure that files in two or more locations are updated according to certain rules.
Imaging: a disk image is a single file or storage device containing the complete contents and structure of a data storage medium or device.
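The definition of a backup given above (a copy that can later restore the original) can be illustrated with a short Python sketch. It is only an illustration: the function names `back_up`, `restore` and `sha256_of` are hypothetical, and comparing checksums is an added assumption about how a copy might be verified, not a step taken from the SSD procedure.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(original, backup_dir):
    """Copy `original` into `backup_dir` and verify the copy by checksum."""
    original, backup_dir = Path(original), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy = backup_dir / original.name
    shutil.copy2(original, copy)
    if sha256_of(copy) != sha256_of(original):
        raise IOError(f"backup of {original} does not match the original")
    return copy


def restore(copy, original):
    """Put the backup copy back in place of a lost or damaged original."""
    original = Path(original)
    original.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(copy, original)
    return original
```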
Modes of Backup
One-way Backup: Source → Target
Two-way Backup: Source ↔ Target
One-Way Synchronization (a.k.a. file mirroring / file replication / file backup):
Files are expected to change in one location only. To reconcile the changes, the synchronization process copies files in one direction only. The two locations are not considered equivalent. One location is considered the Source and the other is considered the Target. Files are pushed from Source to Target (or pulled from Source to Target, but always in one direction only). Source is said to be mirrored to Target.
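One-way synchronization as described above can be sketched in a few lines of Python. This is a simplified illustration, not one of the backup tools named in this training: the function name `mirror_one_way` is hypothetical, and "changed" is assumed to mean "the Source file has a newer modification time than the Target copy".

```python
import shutil
from pathlib import Path


def mirror_one_way(source, target):
    """Copy new or changed files from source to target; never modify source."""
    source, target = Path(source), Path(target)
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = target / rel
        # Copy when the target copy is missing or older than the source file.
        if (not dst_file.exists()
                or src_file.stat().st_mtime > dst_file.stat().st_mtime):
            dst_file.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src_file, dst_file)  # copy2 also copies the timestamps
            copied.append(rel.as_posix())
    return sorted(copied)
```

Because files only move from Source to Target, a file created on the Target never appears on the Source, and a file deleted on the Source stays on the Target until someone removes it.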
Two-Way Synchronization (a.k.a. bi-directional synchronization or both-ways synchronization):
This synchronization process copies files in both directions to reconcile changes as needed. Files are expected to change in both locations. The two locations are considered equivalent.
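A two-way synchronization can be sketched in Python as follows. The sketch rests on loud assumptions: the function names `sync_two_way` and `_files` are hypothetical, the newer modification time is assumed to win, and deletions are not propagated at all (a file missing on one side is simply copied over from the other). Real synchronization tools usually remember the previous state so that they can tell a deletion from a new file.

```python
import shutil
from pathlib import Path


def _files(root):
    """Return the set of relative POSIX paths of all files under root."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def sync_two_way(left, right):
    """Reconcile two folders by copying the newer version of each file across."""
    left, right = Path(left), Path(right)
    left_files, right_files = _files(left), _files(right)
    actions = []
    for rel in sorted(left_files | right_files):
        l_path, r_path = left / rel, right / rel
        if rel not in right_files:
            src, dst, direction = l_path, r_path, "left->right"
        elif rel not in left_files:
            src, dst, direction = r_path, l_path, "right->left"
        else:
            l_time, r_time = l_path.stat().st_mtime, r_path.stat().st_mtime
            if l_time == r_time:
                continue  # same timestamp: treated as already in sync
            if l_time > r_time:
                src, dst, direction = l_path, r_path, "left->right"
            else:
                src, dst, direction = r_path, l_path, "right->left"
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also copies the timestamps
        actions.append((rel, direction))
    return actions
```

Both folders are treated as equivalent here: neither is "the Source", and a change made on either side ends up on the other.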
Modes of Backup
One-way Backup (Source → Target)
Pros: file replication; file backup; can keep previous data
Cons: consumes more space if unattended; a new file on the Target will not be reflected on the Source; a file deleted or renamed on the Source will remain on the Target
Two-way Backup (Source ↔ Target)
Pros: Source and Target are up to date
Cons: once a file is deleted at the Source, it is also deleted at the Target
What are the available Backup Systems out there?
Proprietary
Bundled with your external storage drive: Seagate, Maxtor, Western Digital, etc.
Free Windows Backup Software
Cobian, Todo, DeltaCopy, Ace, SyncToy, Windows 7 new feature
2 Main assets of SSD
1. Data
2. People
2 Main risks
1. Loss / Quality of Data
2. Loss of Life (endanger)
Let’s focus on Data
The SSD Database
A comprehensive digital database
It is an integrated data center for socio-economic data on rice production systems at the farm level, and for rice demand, supply and related data at the national and regional levels
It is composed of primary and secondary data collected through the research projects and activities of SSD
It is organized and accessible in a user-friendly fashion
Household Survey Database
It is a rich collection of actual farm and household level data on rice production, collected through
personal farmer interviews,
farm record keeping, and
periodic monitoring of farm activities
from various sites in different rice-growing countries of Asia.
SSD Work Process Flow
Conceptualize
Pre-testing of the survey questionnaire
Field Survey
Training of enumerators
Coding and quality control of the data
Confirmation
Data Entry/Cleaning
Data Analysis
Incorporate results
Publications
Presentations
Data entry
Data cleaning (key variables)
Upload to public domain
Share data among members
Data merge
Data cleaning
Variable construction
Analysis
Tools: CSPro, Excel/STATA, STATA (program)
Current status of the household survey data sets
The SSD survey data sets are all over the place; each researcher keeps his or her own project data set.
Each data set is kept in a format known only to the researcher.
There is no focal person to ask about who keeps which data set.
Lack of a standard protocol for the repository of data collected by SSD and NARES collaborators
Lack of a standard system for collecting/organizing the data sets
Where’s the Data?
STRASA
Africa Rice
CSISA
PRSSP
GSR
VDS
Bohol Project
CGISA
SSD
What steps are involved to build this database?
1. Develop a standard format that defines how all data sets, by project, must be formatted. How do we do this?
Involve the majority, if not all, of the researchers in SSD who collect, summarize, analyze and manage the data sets of all completed and ongoing projects in SSD.
A series of meetings to develop a common template for the database.
A workshop was held to apply this template to at least one data set per participant and to further refine the template in terms of its applicability to the majority of the data sets in SSD.
2. Merge and clean all the data sets so that they adhere to a common format, codes, etc., to establish the master file database.
3. With further consultation, agree on which components of the data sets will be freely open to the public and which will have limited access.
4. Do additional processing of selected variables to create the data set that will be accessible to the general public.
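Steps 2 to 4 above can be sketched in Python. This is only an illustration under stated assumptions: the function names and every project and variable name used with them (`project_a`, `hh_id`, `area_ha`, `farmer_name`) are hypothetical and are not taken from the actual SSD template, and the SSD workflow itself uses CSPro, Excel and STATA rather than Python.

```python
def to_common_format(record, column_map):
    """Rename a record's project-specific columns to the common template names."""
    return {column_map.get(key, key): value for key, value in record.items()}


def merge_data_sets(data_sets, column_maps):
    """Combine the records of all projects into one master list (step 2)."""
    master = []
    for project, records in data_sets.items():
        column_map = column_maps.get(project, {})
        for record in records:
            row = to_common_format(record, column_map)
            row["project"] = project  # remember where each record came from
            master.append(row)
    return master


def public_subset(master, public_variables):
    """Keep only the variables agreed to be open to the public (steps 3 and 4)."""
    return [{name: row.get(name) for name in public_variables} for row in master]
```

For example, a project that stores the household identifier as `HHID` would supply the mapping `{"HHID": "hh_id"}`, and a limited-access variable such as the hypothetical `farmer_name` is dropped simply by leaving it out of `public_variables`.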
Initial accomplishments
Number of data sets: 10
Inclusive period: 1993-2008
Sites: 9 countries
Total number of records: 6,622
Hundreds of variables: inputs and outputs of rice production, demographics, income, land profile, water use, variety planted, etc.
SSD Database
[Diagram: Unified Database, IRRI NAS, External HDD, HDD RAID, SSD Workstations, Internet]
SSD/GIS servers managed in coordination with ITS
SSD NAS (1TB): working files only!
Significance
Once completed, it will be the first comprehensive digital socioeconomic database on farm-level rice production in the rice-growing areas of Asia (Africa),
for use by researchers, government and academic institutions, donors and other interested members of society
A gold mine of information on what is actually happening in the farmer’s field
To attest SSD’s RMQA compliance
Visit www.irri.org under Our Sciences\Social Science & Economics
Farm household survey (http://geo.irri.org:8180/households)
World rice statistics (http://geo.irri.org:8180/wrs)
Procedures are in place (Netwin, back-up monitoring)
SSD RMQA poster
Awareness (OU head approval, notifications via email and mousepad imprint)
People who work hard to make this happen
SSD Database Team
SSD Household Database
SSD RMQA Team
What will it contain?
1. List of all completed research projects of SSD, with some basic information about each, such as:
project title
project sites
principal researcher
duration of the project
major variables collected
main objective of the research
project output (reports and papers published)
2. Selected summary tables on basic information about rice production at the farm level
3. Detailed data by individual household observation on the following:
land use/profile
household information
inputs to rice production, e.g. fertilizer, seeds, pesticides, and labor use
rice production practices: method of crop establishment, variety planted, level of mechanization, etc.
costs and returns of rice production by individual sample household