Hyung-Seon PARK Ph.D
Biological diversity information flow in KBIF, and the role to GBIF
Korea Institute of S&T Information (KISTI)
17th CODATA/DSAO 21st October, 2006, China
Global Biodiversity Information Facility
http://www.kbif.re.kr
Biodiversity information comprises taxonomic, observation, geological, and specimen data and, more broadly, the characteristics of molecular-genetic, ecological, and taxonomic systems.
It is assumed to carry substantial monetary value worldwide, particularly for the biological industry.
Biodiversity Information
Diagram: example resources shared by providers on the network
• Specimens: Flowering Plants of Africa
• Specimens: Proteaceae of the World
• Taxon Names: Proteaceae of the World
• Observations: Birds of Central America
• Observations: Butterflies of Belize
• Checklist: Birds of Belize
• Specimens: Mammals of North Europe
• Taxon Names: Mammals of the World
• Specimens: Bacteria Cultures
• Taxon Names: Bacteria
• Further Links: Bacteria
• Further Links: Mammals
Providers: Museum A, Observer Network B, Museum C, University D
GBIF Network
A distributed network of Biodiversity web services
The Biodiversity and Ecosystems information domain is vast, complex, and critically important to society.
However, most existing Biodiversity and Ecosystems information is not yet dynamically accessible and therefore not fully useful.
Recent technological and political developments provide an opportunity to develop Global Biodiversity and Ecosystem Information Networks.
Rationale For GBIF
GBIF: chronological history
• First Meeting of OECD Working Group on Biological Informatics, 1996
• Proposed by OECD Working Group on Biological Informatics in its final report, Jan. 1999
• Basic operational aspects determined at meeting in March 1999
• Endorsed by OECD science ministers in June 1999
• GBIF Interim Steering Committee: first meeting (ISC1) in Sept. 1999
• GBIF Interim Steering Committee: second meeting (ISC2) in Feb. 2000
• GBIF Web page operational May 2000 (www.gbif.org)
• CBD presentations at SBSTTA5 (January 2000) and COP5 (May 2000)
• Letter to all science ministers in June 2000
• GBIF Interim Steering Committee: third meeting (ISC3) in Sept. 2000
• Fourth and final GBIF Interim Steering Committee, Dec. 2000
• Invitation mailed to all countries, Dec. 2000
Voting Participants (26): Australia, Belgium, Canada, Costa Rica, Denmark, Estonia, Finland, France, Germany, Iceland, Japan, Republic of Korea (May 2001), Mexico, Netherlands, New Zealand, Nicaragua, Norway, Peru, Portugal, Slovenia, South Africa, Spain, Sweden, United Kingdom, USA ..
Associate Participants: Countries / Economies (21): Argentina, Austria, Bulgaria, Colombia, Czech, Ghana, India, Madagascar, Morocco, Pakistan, Papua New Guinea, Poland, Slovak Republic, Switzerland, Taiwan (Economy), Tanzania ..
Associate Participants: Organizations (35): United Nations Environment Program (UNEP), World Federation for Culture Collections (WFCC), Species 2000, BIOSIS, BioNET-INTERNATIONAL, EASIANET, European Commission, IUCN, ITIS, OBIS, SAFRINET, Taxonomic Databases Working Group (TDWG), Man and the Biosphere Program (MAB), ASEANET, All Species Foundation …
Cooperative activities with many others: CBD Clearing House Mechanism, UNEP-WCMC, CODATA (ICSU), NABIN, IABIN
Many national-level organizations such as CONABIO (Mexico), ABIF (Australia), INBio (Costa Rica), etc.
GBIF Communications (October 2006)
What is GBIF?
GBIF is an international scientific co-operative project based on a multilateral agreement (MoU) between countries and international organisations, dedicated to establishing an interoperable, distributed network of databases containing scientific biodiversity information, in order to:
- make the world’s scientific biodiversity data freely available to all,
- with initial focus on species- and specimen-level data,
- with links to molecular, genetic and ecosystems levels
www.gbif.org
GBIF as a global Biodiversity Data Integrator
Users and applications need data structured according to standards
<?xml version="1.0" encoding="UTF-8"?>
<response>
  <record>
    <darwin:DateLastModified>2003-06-08</darwin:DateLastModified>
    <darwin:InstitutionCode>DGH</darwin:InstitutionCode>
    <darwin:CollectionCode>DGH Lepidoptera</darwin:CollectionCode>
    <darwin:CatalogNumber>DGHEUR_0002976</darwin:CatalogNumber>
    <darwin:ScientificName>Dichomeris marginella (Fabricius, 1781)</darwin:ScientificName>
    <darwin:BasisOfRecord>O</darwin:BasisOfRecord>
    <darwin:Kingdom>Animalia</darwin:Kingdom>
    <darwin:Order>Lepidoptera</darwin:Order>
    <darwin:Family>Gelechiidae</darwin:Family>
    <darwin:Genus>Dichomeris</darwin:Genus>
    <darwin:Species>marginella</darwin:Species>
    <darwin:ScientificNameAuthor>(Fabricius, 1781)</darwin:ScientificNameAuthor>
    <darwin:IdentifiedBy>Donald Hobern</darwin:IdentifiedBy>
    <darwin:Collector>Donald Hobern</darwin:Collector>
    <darwin:YearCollected>2003</darwin:YearCollected>
    <darwin:MonthCollected>06</darwin:MonthCollected>
    <darwin:DayCollected>08</darwin:DayCollected>
    <darwin:ContinentOcean>Europe</darwin:ContinentOcean>
    <darwin:Country>Denmark</darwin:Country>
    <darwin:County>Gentofte Amt</darwin:County>
    <darwin:Locality>Merianvej, Hellerup</darwin:Locality>
    <darwin:Longitude>12.538</darwin:Longitude>
    <darwin:Latitude>55.737</darwin:Latitude>
    <darwin:CoordinatePrecision>100</darwin:CoordinatePrecision>
    <darwin:IndividualCount>1</darwin:IndividualCount>
    <darwin:Notes>1 in Skinner trap</darwin:Notes>
  </record>
</response>
Observation record formatted using the Darwin Core
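As a rough illustration of why applications want records structured like the one above, the sketch below reads a shortened copy of that record into a plain dictionary. The namespace URI bound to the darwin: prefix is an assumption added here (the slide omits the declaration, and an XML parser rejects an unbound prefix), and parse_records is a hypothetical helper name, not part of any GBIF software.

```python
import xml.etree.ElementTree as ET

# Assumption: the slide does not declare the "darwin:" prefix, so this URI is
# supplied only to make the sample parseable.
DARWIN_NS = "http://digir.net/schema/conceptual/darwin/2003/1.0"

# Shortened copy of the observation record shown on the slide.
SAMPLE_RESPONSE = f"""<response xmlns:darwin="{DARWIN_NS}">
  <record>
    <darwin:InstitutionCode>DGH</darwin:InstitutionCode>
    <darwin:CatalogNumber>DGHEUR_0002976</darwin:CatalogNumber>
    <darwin:ScientificName>Dichomeris marginella (Fabricius, 1781)</darwin:ScientificName>
    <darwin:Country>Denmark</darwin:Country>
    <darwin:Longitude>12.538</darwin:Longitude>
    <darwin:Latitude>55.737</darwin:Latitude>
  </record>
</response>"""


def parse_records(xml_text):
    """Return one dict per <record>, keyed by element name without namespace."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter("record"):
        fields = {}
        for element in record:
            # ElementTree reports tags as "{namespace}LocalName".
            local_name = element.tag.split("}", 1)[-1]
            fields[local_name] = (element.text or "").strip()
        records.append(fields)
    return records


records = parse_records(SAMPLE_RESPONSE)
print(records[0]["ScientificName"], records[0]["Latitude"], records[0]["Longitude"])
```

Because every provider uses the same element names, the same few lines work for any record in this format.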
GBIF as a global data Integrator
Web services enable the aggregation of structured data
Diagram labels: Heterogeneous Databases; Web Services; Standardised Structured Data; User
Messages exchanged: <request>, answered by <response> <record> …
With GBIF’s components in place, data can be drawn directly from different sources with a single query.
Compiled specimen, genetic, and ecological information
GBIF contribution to interoperability
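The single-query idea can be sketched in a few lines. Everything below is illustrative: the two providers, their native column names, and the sample rows are made up for the example, and make_provider and query_all are hypothetical helpers rather than GBIF portal code. Each provider translates its own field names into shared Darwin Core names, so one query can be fanned out to all of them.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Provider = Callable[[str], List[Record]]


def make_provider(rows: List[Dict[str, str]], field_map: Dict[str, str]) -> Provider:
    """Wrap a native table so it answers queries using standard field names."""
    def search(scientific_name: str) -> List[Record]:
        matches = []
        for row in rows:
            # Translate native column names into the shared vocabulary.
            record = {standard: row[native] for standard, native in field_map.items()}
            if record["ScientificName"] == scientific_name:
                matches.append(record)
        return matches
    return search


# Hypothetical providers with different native schemas (sample data only).
PROVIDERS: Dict[str, Provider] = {
    "Museum A": make_provider(
        [{"sci_name": "Parus major", "ctry": "Korea"},
         {"sci_name": "Pica pica", "ctry": "Korea"}],
        {"ScientificName": "sci_name", "Country": "ctry"},
    ),
    "Observer Network B": make_provider(
        [{"taxon": "Parus major", "country": "Denmark"}],
        {"ScientificName": "taxon", "Country": "country"},
    ),
}


def query_all(providers: Dict[str, Provider], scientific_name: str) -> List[Record]:
    """Send one query to every provider and merge the standardised results."""
    results = []
    for provider_name, search in providers.items():
        for record in search(scientific_name):
            results.append({**record, "Provider": provider_name})
    return results


hits = query_all(PROVIDERS, "Parus major")
for hit in hits:
    print(hit)
```

The caller never sees the native column names; that is the interoperability contribution the slide describes.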
True bioinformatics …
Diagram labels: biodiversity informatics; ecoinformatics; “bioinformatics”; genomics; proteomics
Architecture diagram labels:
• Portal: Request Manager, Query Engine
• UDDI Registry: Institutions, Services (Providers), Access Points
• Data provider: Provider Services, Resource Metadata
• Name provider: Provider Services, Resource Metadata; Synonyms, GUIDs
• Message flows: provider query, available providers, metadata and name query, metadata response, data query, data response, metadata and logs, publish availability
• Other labels: Index, Cache, Metadata, Accounting
• Protocols: SOAP, DiGIR, HTTP
Data Provider within GBIF Architecture
The Protocol: XML messaging on top of HTTP
• Used for communication between data providers and data users
• More light-weight and specialised than SOAP
Enables single point of access (portal/search) to distributed information resources
• Resource: a collection of data objects that conform to a common schema (DB records, XML documents)
• Distributed resources comply with a federation schema
Enables search & retrieval of structured data
• Search for data values in context (semantics)
• Results are presented as a structured data set
Makes location and technical characteristics of the native resource transparent to the user
The Distributed Generic Information Retrieval protocol was created by the TDWG/CODATA subgroup on biological collection data
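A minimal sketch of the request/response pattern described above follows. The message layout is a deliberate simplification invented for this example and is not the actual DiGIR schema; build_request, handle_request, and the RESOURCE rows are hypothetical. In a real deployment the request text would travel to the provider over HTTP instead of being passed in-process.

```python
import xml.etree.ElementTree as ET

# Hypothetical resource: data objects that all conform to one common schema.
RESOURCE = [
    {"Genus": "Dichomeris", "Species": "marginella", "Country": "Denmark"},
    {"Genus": "Parus", "Species": "major", "Country": "Korea"},
]


def build_request(field: str, value: str) -> str:
    """Build a simplified search message: find records where field equals value."""
    request = ET.Element("request")
    search = ET.SubElement(request, "search")
    equals = ET.SubElement(search, "equals", {"field": field})
    equals.text = value
    return ET.tostring(request, encoding="unicode")


def handle_request(request_xml: str) -> str:
    """Provider side: evaluate the filter and answer with structured records."""
    request = ET.fromstring(request_xml)
    equals = request.find("search/equals")
    field, value = equals.get("field"), equals.text
    response = ET.Element("response")
    for row in RESOURCE:
        if row.get(field) == value:
            record = ET.SubElement(response, "record")
            for name, text in row.items():
                ET.SubElement(record, name).text = text
    return ET.tostring(response, encoding="unicode")


request_xml = build_request("Genus", "Dichomeris")
response_xml = handle_request(request_xml)
print(request_xml)
print(response_xml)
```

The user searches by a named concept ("Genus") and never learns how the provider stores its data, which is the transparency property listed above.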
A simple DiGIR architecture
Diagram labels: Databases; DiGIR providers; Portals, search engines, and applications
Data exchange format: Darwin Core2
Darwin Core2 is a model that allows data on individual specimens or observations to be structured and shared as XML documents that can be transmitted across the Internet.
• Suitable for collections and observations data.
• http://digir.net/schema/conceptual/darwin/2003/1.0/darwin2.xsd
• 48 Elements:
DateLastModified * InstitutionCode * CollectionCode * CatalogNumber *
ScientificName * BasisOfRecord Kingdom Phylum
Class Order Family Genus
Species Subspecies ScientificNameAuthor IdentifiedBy
YearIdentified MonthIdentified DayIdentified TypeStatus
CollectorNumber FieldNumber Collector YearCollected
MonthCollected DayCollected JulianDay TimeOfDay
ContinentOcean Country StateProvince County
Locality Longitude Latitude CoordinatePrecision
BoundingBox MinimumElevation MaximumElevation MinimumDepth
MaximumDepth Sex PreparationType IndividualCount
PreviousCatalogNumber RelationshipType RelatedCatalogItem Notes
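The element list above can be used to check a record before it is shared. The sketch below assumes that the asterisk marks a required element, which the slide does not state explicitly, and it lists only the elements that carry an asterisk in this transcript; missing_required is a hypothetical helper name.

```python
# Assumption: elements marked "*" on the slide are treated as required.
REQUIRED_ELEMENTS = (
    "DateLastModified",
    "InstitutionCode",
    "CollectionCode",
    "CatalogNumber",
    "ScientificName",
)


def missing_required(record):
    """Return the names of required elements that are absent or empty."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name, "").strip()]


# Values taken from the observation record shown earlier in the deck.
complete = {
    "DateLastModified": "2003-06-08",
    "InstitutionCode": "DGH",
    "CollectionCode": "DGH Lepidoptera",
    "CatalogNumber": "DGHEUR_0002976",
    "ScientificName": "Dichomeris marginella (Fabricius, 1781)",
    "Country": "Denmark",
}
incomplete = {"ScientificName": "Dichomeris marginella", "Country": "Denmark"}

print(missing_required(complete))
print(missing_required(incomplete))
```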
DiGIR Provider Package
Encompasses the DiGIR Provider software, the Apache2 web server, and PHP libraries.
Requires from the user only basic knowledge of the operating system.
Two available releases (http://circa.gbif.net/Public/irc/gbif/ict/library?l=/digir_provider_package):
• Linux (RedHat 7.3, 8, 9), MS Windows (2000, XP)
Supported databases:
• MySQL, PostgreSQL, MS SQL Server, MS Access (only the MS Windows package)
Offers automatic registration with GBIF UDDI Registry (http://registry.gbif.net)
Data repository tool
A tool to enable sharing of data
• Can upload and manage datasets in document format such as a) spreadsheet, b) embedded Darwin Core, or c) ABCD
• Can parse the data into embedded MySQL database that becomes available to the public as a DiGIR resource
• Can revoke release (data is deleted from database)
Stand-alone package or module of GBIF PTK
• For Linux and Windows
• Based on Python and Zope
• Includes automatic registration in GBIF registry
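The spreadsheet-upload step can be pictured with a short sketch. This is not the repository tool's own code: the column headings, the COLUMN_MAP, and spreadsheet_to_records are assumptions made for illustration, showing only how rows of an uploaded sheet might be renamed to Darwin Core element names before being stored.

```python
import csv
import io

# Hypothetical mapping from a curator's column headings to Darwin Core names.
COLUMN_MAP = {
    "Species name": "ScientificName",
    "Museum": "InstitutionCode",
    "No.": "CatalogNumber",
    "Lat": "Latitude",
    "Lon": "Longitude",
}

# Stand-in for an uploaded spreadsheet exported as CSV.
SPREADSHEET = """Species name,Museum,No.,Lat,Lon
Dichomeris marginella,DGH,DGHEUR_0002976,55.737,12.538
Parus major,DGH,DGHEUR_0002977,,
"""


def spreadsheet_to_records(text, column_map):
    """Parse CSV text and rename known columns; empty cells are skipped."""
    reader = csv.DictReader(io.StringIO(text))
    records = []
    for row in reader:
        record = {
            target: row[source].strip()
            for source, target in column_map.items()
            if row.get(source)
        }
        records.append(record)
    return records


parsed_rows = spreadsheet_to_records(SPREADSHEET, COLUMN_MAP)
for parsed_row in parsed_rows:
    print(parsed_row)
```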
• Governing Board (GB) - consisting of delegates from all countries and organizations that join GBIF
• Executive Office - Secretariat (Copenhagen, Denmark) consisting of Executive Secretary, Deputy Directors, Program Managers, technical and legal staff
• Science Committee - Science Committee plus S&T Advisory Groups consisting of delegates from participants (+ other invited experts)
- DADI, ECAT, DIGIT, and OCB subcommittees
• NODES Committee - Participants' Nodes that co-ordinate internal (national) activities with GBIF work programs
• Budget Committee
• Review Committee - CODATA/KPMG
GBIF Governance
The main GBIF work programs
• Data Access and Database Interoperability
• Electronic Catalogue of Names of Known Organisms
• Digitisation of Natural History Collections
• Outreach and Capacity Building
• “Species Bank”
• Digital Biodiversity Literature Resources
Why was GBIF established?
Both biodiversity and biodiversity data are unevenly distributed around the world:
Diagram labels: Developing World; Biodiversity; Biodiversity Data; Developed World
GBIF was established, in large part, to redress the inequality of data distribution
Where is GBIF located?
Unlike CERN, the megascience instrumentation facility for particle physics that is located in Switzerland, GBIF is a megascience facility that is distributed all over the world, with its many parts connected by the Internet
The small, non-bureaucratic GBIF Secretariat is hosted by the Zoological Museum of the University of Copenhagen, Denmark
CERN
Map labels: U.S.A, Denmark, Korea, Germany
Korea opened GBIF Data Portal mirror services
http://www.asia.gbif.net
GBIF Data Portal Statistics - ASIA.GBIF.NET
GBIF.NET log - User Statistics (Monthly)
GBIF.NET log - Statistics by Country
12th GBIF Governing Board, Cape Town, South Africa, April 2-4, 2006
Report
○ Strategic Plan (2007-2011): 5 parts, 18 modules
Contents
- Known-organism data integration and search available
- Completion of ECAT: 95%
- Service available: 1 billion data records in GBIF Data Portal
- New data type search available: pictures and bibliography
Informatics
- Improvement in current Data Portal
- Data available including molecular, ecological, biodiversity
Participation
- 100% of NODEs operable by 2011
- Online education and training
- GBIF participation to increase by 10% every year
Governance
- Science Council
- Governing Board meeting once a year
Sustainable finance
- Campaign activity every year
- Secure finance level
○ Funding for GBIF 2007-2011
<4 million Euros/year
8th NODES (node committee) Report
NODES Status Report
- Data sharing is possible through 120,000 institutes within GBIF
Korean Biodiversity Information Facility
Increment in data sharing
KBIF (Korean Biodiversity Information Facility)
Accessible biodiversity data around the globe
Chart: number of data records (millions, axis 0 to 80), quarterly from Sep. 2003 to Mar. 2005
Shared records: 2.23%; sleeping records: 97.77%
• 300 million species available (Museum, Herbarium, University …)
KBIF (Korea Biodiversity Information Facility)
Korea national Node for GBIF
• IT infrastructure and web services system
• Training and dissemination of key technology (S/W, protocols, etc.)
• Enlarging the data-provider base; supporting biodiversity research and databases
Role in GBIF NODES and KBIF Operation
Diagram labels:
• GBIF Portal: www.gbif.net
• Participant Nodes: CBIF Canada (www.cbif.gc.ca), DanBIF Denmark (www.danbif.dk), BeBIF Belgium (www.be.gbif.net), JBIF Japan (www.jbif.go.jp), other GBIF member countries
• KBIF Korea NODE: www.kbif.re.kr
• Data Nodes: DN1 Environmental, DN2 Oceanic, DN3 Bioresource, DN4 Agricultural
• Data Node types: specialized research institute; university; science / natural history museum; governmental research institute
KBIF Science Committee: aims aligned with the GBIF work programs
- DADI (Data Access and Database Interoperability)
- DIGIT (Digitization of Natural History Collections)
- ECAT (Electronic Catalogue of Names of Known Organisms)
- OCB (Outreach and Capacity Building)
The Korean NODE acts as a gateway for the integration of biodiversity information
- Build IT infrastructure and system structure
- Train on and disseminate key technology (S/W, protocols, etc.)
- Develop a standard data exchange schema and metadata
- Manage and liaise with Data Nodes (enlarge the set of data providers to GBIF)
KBIF (Korean Biodiversity Information Facility)
KBIF Committees (SC/NC/EC/BC/RC)
KISTI’s role in GBIF
Acting as the Korea National Node for GBIF
• Network for biodiversity information flow and data repository
• Training, dissemination, and support for the required technology
• Participation in GBIF committees, NODES, and forums
• KBIF operation
Korean Biodiversity Information Facility (KBIF)
Roadmap diagram (timeline: 2006, 2007, 2008 and beyond); labels:
• “BioDiversity” + Genomics + Resources; GBIF; KBIF
• Data sharing in GBIF
• GBIF data service: stable service; 1 billion data records served
• Activity in NODE committee
• Management and enlargement of data registration to GBIF
• Training and dissemination of key technology to data providers
• KBIF stable operation; portal system: KDR, NABIPOS
• 100 million data records; GBIF mirror service
• Infrastructure setup: information network, Data Repository (KDR), KBIF schema, web services
• Nat’l Science Museum, 30 local museums and universities; 1 million records to GBIF
• GBIF data service: Asia Regional Hub
• Technical help desk; system development
• Data exchange standard and protocol
• Develop retrieval system and statistics / analysis system
• Joining government institutes; add 1 million data records
• Enlarge data nodes: agricultural, environmental, oceanic, bioresources
• KOBIC national data service
National Biodiversity Information Portal System (NABIPOS)
1. NABIPOS (National Biodiversity Information Portal System) is an integrated retrieval system for searching the distributed biodiversity data of Korea.
2. It currently provides a retrieval service and reports the status of Korean data providers, using Darwin Core and the DiGIR protocol.
3. It will provide retrieval services for a wider variety of content to promote a better understanding of biodiversity.
KBIF Data Repository (KDR)
1. KDR (KBIF Data Repository) lets general users easily store, convert, and search their own biodiversity data in line with international standards.
2. It is built on the GBIF Data Repository Toolkit and is designed to support the Korean language and the KBIF (Korea Biodiversity Information Facility) schema.
3. It is an effective tool for accumulating biodiversity data and enhancing data sharing.
The KBIF Value Chain
Stages: GENERATING - AGGREGATING - PROVIDING - INTEGRATING - DISCOVERING - ANALYSING - PRESENTING
Diagram labels: Observer; Taxonomist; Collator; Provider; Indexer; Portal (GBIF); Portal (Nat’l); Mirror; Archive; GIS service; Modelling service; Analytic service; Lit./ref. service; Helpdesk service; Usage service; Scientist; Policymaker; Public
Demonstration
Google Earth - Korean Data
Synergistic effects of combining data: 1 + 1 > 2
KBIF (Korean Biodiversity Information Facility)
The value of data is in its use
Supercomputing, extending the Horizon of Science and Technology
KISTI
Thank you