Document image analysis for metadata extraction
George R. Thoma, Ph.D.
Chief, CEB, Lister Hill Center
U.S. National Library of Medicine
National Library of Medicine
• World’s largest medical library
• U.S. govt. agency, part of NIH
• Collects all significant material in biomedicine and health care
• Database producer (MEDLINE, GenBank, ...)
• Research centers
• Extramural grants
NLM Mission
• Develop and provide biomedical information to:
– The clinical and research communities (e.g., MEDLINE)
– Public health and public safety agencies (e.g., HSDB)
– The lay public (e.g., MEDLINEplus)
• Develop and provide tools for biomedical research (e.g., WebMIRS, x-ray atlas, genomic data analysis)
• Develop and provide tools for informatics research (e.g., UMLS, vocabulary tools, knowledge representation, medical ontologies)
• Conduct in-house R&D
• Sponsor extramural research (Telemedicine, Visible Human Project, Next Generation Internet, Medical Informatics, ...)
• Provide fellowships for faculty, students
Two important missions:
1. Create citations to the biomedical journal literature for MEDLINE®
2. Preservation
R&D: why and how
• Aim: to introduce appropriate technologies
– To support NLM’s services and functions
– To create and disseminate information for biomedical communities: research, clinical and informatics
– To provide information for the lay public
• How:
– Identifying suitable domains
– Designing/developing prototype systems
– Using these as testbeds to address key questions
– Implementing/deploying operational systems
Preservation of Digital Materials
• Technical obsolescence of storage media and supporting hardware and software
• Ever-increasing volume of endangered digital materials
• Critical component: metadata for future access and migration to newer formats
• Avoid labor cost of manual metadata entry
Candidates for Digital Preservation (NLM collections)
• Profiles in Science
– Archival collections of leaders in biomedical research and public health
– TIFF, PDF, HTML, audio, video files
• PubMed Central
– Digital archive of life sciences journals
– XML, PDF, TIFF
– Contains about 170 journal titles
Goal: System for Preservation of Electronic Resources (SPER)
• Automated metadata extraction
– Technical metadata from file header
– Descriptive metadata (heuristic rules and machine learning techniques)
– Minimum human interaction
• Conform to standards (DC, NISO, METS)
• Intelligent file migration
– Lossy or lossless migration
– When to migrate
[Figure: SPER block diagram. Components: Ingest, GUIs, Metadata extraction, Migration, Storage, Search. Data flows: metadata and files, query results.]
Our Problem
• Extracting descriptive metadata (e.g., article title, authors, affiliation, page numbers, journal name, publication date, publisher, databank accession numbers, grant numbers, etc., plus abstract)
• Example: Grubb RL. Hemodynamic factors in the prognosis of symptomatic carotid occlusion. JAMA. 1998. 280 (12) 1055-60…
In other words…
Grubb RL. Hemodynamic factors in the prognosis of symptomatic carotid occlusion. JAMA. 1998. 280 (12) 1055-60…
Automated Metadata Extraction Methods
• TIFF → OCR → segment → label physical zones (using DIAU techniques)
• Use heuristic rules related to layout (geometric) and context (key words)
– Currently in production (MARS* for citation generation from journal articles)
• Use the learned rules or models
– Experiments
*Medical Article Records System: automatic extraction of article title, author names, affiliations, and abstract from scanned journals, to populate MEDLINE.
Why learned rules or models? Diverse layout styles
• Style differs in different journals
• Style varies in different issues of a journal
• Manual rule or model creation expensive
• Automated rule or model learning from previous results
• Use style related features
Significant Features (examples)
• Geometric
– Absolute location and size of zones (x1, y1, x2, y2)
– Relative location of zones (top, bottom, left of, right of)
– Page margin and gap between zones
• Contextual
– Font size (12pt, 20pt)
– Font attribute (bold, italic)
– Key words (University, city, department …)
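The geometric and contextual features listed above can be pictured with a small sketch. The `Zone` record, the function names, and the sample coordinates and text are all hypothetical illustrations, not part of MARS; the sketch assumes pixel coordinates with y growing downward.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Zone:
    """Hypothetical zone record: (x1, y1) is top-left, (x2, y2) bottom-right."""
    x1: int
    y1: int
    x2: int
    y2: int
    font_size: float = 0.0
    bold: bool = False
    text: str = ""


def relative_location(a: Zone, b: Zone) -> str:
    """Say where zone a lies relative to zone b (y grows downward)."""
    if a.y2 <= b.y1:
        return "above"
    if a.y1 >= b.y2:
        return "below"
    if a.x2 <= b.x1:
        return "left of"
    if a.x1 >= b.x2:
        return "right of"
    return "overlapping"


def vertical_gap(upper: Zone, lower: Zone) -> int:
    """Gap in pixels between the bottom of `upper` and the top of `lower`."""
    return max(0, lower.y1 - upper.y2)


def keyword_hits(zone: Zone, keywords: Iterable[str]) -> int:
    """Count how many of the given key words occur in the zone text."""
    text = zone.text.lower()
    return sum(1 for k in keywords if k.lower() in text)


# Made-up sample zones for illustration only.
title = Zone(100, 50, 900, 120, font_size=20, bold=True, text="A sample title")
affil = Zone(100, 200, 900, 260, font_size=12,
             text="Department of Medicine, Example University")
```

A rule such as "the affiliation zone lies below the title and contains key words like University or department" can then be phrased in terms of these three helpers.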
[Figure: MARS workflow diagram. Modules: CheckIn, Scanner, Database, Autozone, OCR, DCMS, Edit, Reconcile, Admin, Lexicons/rules, Journal flow, Indexing, Autolabel, Autoformat, ConfidenceEdit, PatternMatch, Upload, MEDLINE, EditDiff.]
[Figure: original bitmap image]
[Figure: original bitmap and zoned image, processed by Autozone]
Features for Autozone
For each text-line:
• Median character height and width
• Average character height
• Maximum character height
• Average height of lower case characters (without ascenders or descenders)
• Average character confidence value
• Number of alphanumeric characters
• Aspect ratio of line (height/width)
• % italics, bold, upper case, digits
• Approximate location on page
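As a rough illustration of how such per-line features might be computed from OCR output, here is a minimal sketch. The `Char` record, its confidence scale (0 to 1), and the feature names are assumptions made for this example; it covers only a subset of the features on the slide and is not the Autozone implementation.

```python
from dataclasses import dataclass
from statistics import mean, median
from typing import Dict, List


@dataclass
class Char:
    """Hypothetical OCR character record."""
    ch: str
    width: int
    height: int
    confidence: float  # assumed to lie in [0, 1]
    bold: bool = False
    italic: bool = False


def line_features(chars: List[Char], line_width: int, line_height: int) -> Dict[str, float]:
    """Compute a subset of the text-line features listed above."""
    if not chars:
        raise ValueError("a text line needs at least one character")
    n = len(chars)

    def pct(flags) -> float:
        # Percentage of characters for which the flag is true.
        return 100.0 * sum(1 for f in flags if f) / n

    return {
        "median_height": median(c.height for c in chars),
        "median_width": median(c.width for c in chars),
        "avg_height": mean(c.height for c in chars),
        "max_height": max(c.height for c in chars),
        "avg_confidence": mean(c.confidence for c in chars),
        "n_alnum": sum(1 for c in chars if c.ch.isalnum()),
        "aspect_ratio": line_height / line_width,
        "pct_bold": pct(c.bold for c in chars),
        "pct_italic": pct(c.italic for c in chars),
        "pct_upper": pct(c.ch.isupper() for c in chars),
        "pct_digits": pct(c.ch.isdigit() for c in chars),
    }


# Made-up three-character line for illustration.
sample = [Char("A", 6, 12, 0.9, bold=True), Char("b", 6, 10, 0.8), Char("1", 6, 10, 1.0)]
features = line_features(sample, line_width=18, line_height=12)
```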
[Figure: original bitmap, zoned and labeled image, processed by Autozone and Autolabel]
Autoreformat: text syntax reformatted, e.g., John A. Smith → Smith John A
Lexical analysis to overcome OCR errors
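The author-name example on this slide (John A. Smith becomes Smith John A) can be sketched as a tiny function. This is only an illustration under the assumption that the last whitespace-separated token is the surname; the function name is hypothetical, and the real module presumably handles many more cases (suffixes, multi-word surnames, and so on).

```python
def reformat_author(name: str) -> str:
    """Move the last token (assumed to be the surname) to the front
    and drop the periods after initials."""
    parts = name.replace(".", "").split()
    if len(parts) < 2:
        return name.strip()
    return " ".join([parts[-1]] + parts[:-1])
```

Calling `reformat_author("John A. Smith")` yields the form shown on the slide.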
[Figure: scan workstation in MARS system]
[Figure: scan workstation operation]
[Figure: edit workstation; colors identify fields labeled automatically]
[Figure: edit workstation, two further views; label: high confidence characters (%)]
[Figure: reconcile workstation in MARS, main screen]
Pattern matching to correct words for Reconcile operator
[Figure: reconcile workstation GUI, closeup, showing the bitmapped image, the incorrect word, and the operator click that selects the correct word from pattern matching]
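The slides do not say which pattern matching technique the Reconcile workstation uses. As a stand-in, the sketch below uses Python's standard `difflib.get_close_matches` to propose lexicon words that resemble a misrecognized OCR word; the lexicon, the function name, and the cutoff are assumptions for illustration only.

```python
import difflib
from typing import List, Sequence

# Hypothetical lexicon; a production lexicon would be far larger.
LEXICON = ["hemodynamic", "prognosis", "symptomatic", "carotid", "occlusion"]


def suggest_corrections(word: str, lexicon: Sequence[str] = LEXICON,
                        n: int = 3, cutoff: float = 0.6) -> List[str]:
    """Return up to n lexicon words similar to `word`, best match first."""
    return difflib.get_close_matches(word.lower(), lexicon, n=n, cutoff=cutoff)


# "rn" is a common OCR confusion for "m".
candidates = suggest_corrections("hemodynarnic")
```

An operator-facing tool would show `candidates` next to the bitmapped word so that one click selects the replacement.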
Automated Metadata Extraction based on Learning Methods
• Automatically learn layout rules or models from previous (similar) TIFF documents
• Use the learned rules or models to segment and label TIFF document pages of similar layout styles
Two Machine Learning Methods
• Learn labeling rules from dynamically generated features based on string matching techniques (DFGS)
– Exploits the MARS system; DFGS is now in the MARS production system
– Three types of features to infer rules
– Provides an unstructured and partial description of a document page
– Good for arbitrary layouts but sensitive to variations in absolute zone locations
– Requires that the physical segmentation (“zoning”) is done accurately
• Learn a 2-D layout model with logical labels based on a Bayesian approach
– Provides a structured, either partial or full, 2D description of a document page
– Physical segmentation and logical labeling are performed simultaneously using the models
– Not sensitive to document noise and variations in absolute zone locations
– Uses background
– Sensitive to document skew
[Figure: 2D layout model drawn as state sequences along the X and Y directions: lm P rm; tm h g1 ti g2 C g3 B bm; C1 g4 C2; ab g5 K; au g6 af]
Example: title field
[Figure: two plots for the title field. Title Font Size Distribution (x-axis: title font size, 0 to 40; y-axis: 0 to 1) and Title Font Attribute Distribution (x-axis: title font attribute, 170 to 210; y-axis: 0 to 1).]
Learning Labeling Rules: Dynamic Feature Generation System (DFGS)
[Figure: DFGS block diagram. MARS (simplified) takes scanned journals through OCR, ZoneCzar1 (zoning and labeling), Reformat (reformatting syntax), Reconcile (text verification) and Upload to MEDLINE®. DFGS modules: ZMControl, ZoneMatch1, ZoneMatch2, ZoneCzar2, ZR, and FeatureGeneration, which builds individual feature sets and candidate combined feature sets in a loop over feature combination and matching score. Data items: journal-specific information, matched features, verified text.]
Mao S, Kim J, Thoma GR. A Dynamic Feature Generation System for Automated Metadata Extraction in Preservation of Digital Materials. Proc. 1st International Workshop on Document Image Analysis for Libraries, pages 225-232, Palo Alto, CA, January 2004.
Learn Document 2D Layout Models based on a Bayesian Approach
• Represent 2D layouts by a set of attributed hidden semi-Markov models (HSMMs)
• A Bayesian method for learning 2D layout models from segmented and labeled, but unstructured, training data
• Simultaneous physical segmentation and logical labeling using learned layout models
• Character bounding boxes as basic image units [Liang et al, 1996 and Ha et al, 1995]
Attributed Hidden Semi-Markov Models (attributed HSMMs) = (A, B, C, π, ρ)
[Figure: state transition diagram with six states, transition probabilities, and direction attribute ρ = X, built up in stages: Markov models, hidden Markov models, hidden semi-Markov models (HSMMs), attributed hidden semi-Markov models]
• A: state transition probability matrix that defines a Markov model
• π: initial state probability distribution vector
• B: state observation probability matrix that defines the “hidden” part
• C: state duration probability matrix that defines the “semi” part
• ρ: direction attribute (x or y)
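To make the five-tuple (A, B, C, π, ρ) concrete, the following sketch stores it in a small container and checks that each component is a valid probability table. The class name, the list-of-lists layout, and the toy numbers are assumptions for illustration; the slides do not specify a data structure. A final state is allowed to have no outgoing transitions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AttributedHSMM:
    """Hypothetical container for an attributed HSMM (A, B, C, pi, rho)."""
    A: List[List[float]]   # A[i][j]: probability of moving from state i to state j
    B: List[List[float]]   # B[j][k]: probability that state j emits symbol k
    C: List[List[float]]   # C[j][d-1]: probability that state j lasts d units
    pi: List[float]        # pi[j]: probability of starting in state j
    rho: str               # direction attribute, "x" or "y"

    def validate(self, tol: float = 1e-9) -> bool:
        if self.rho not in ("x", "y"):
            raise ValueError("rho must be 'x' or 'y'")
        n = len(self.pi)
        if not (len(self.A) == len(self.B) == len(self.C) == n):
            raise ValueError("A, B, C and pi must cover the same states")

        def is_dist(row: List[float]) -> bool:
            return all(p >= 0 for p in row) and abs(sum(row) - 1.0) <= tol

        if not is_dist(self.pi):
            raise ValueError("pi must sum to 1")
        for j in range(n):
            if not is_dist(self.B[j]) or not is_dist(self.C[j]):
                raise ValueError(f"B or C row {j} is not a distribution")
            # A final state may have an all-zero transition row.
            if not (is_dist(self.A[j]) or sum(self.A[j]) == 0):
                raise ValueError(f"A row {j} is not a distribution")
        return True


# Toy model along x with three states: left margin, page body, right margin.
page_x = AttributedHSMM(
    A=[[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    B=[[1.0, 0.0], [0.1, 0.9], [1.0, 0.0]],
    C=[[0.5, 0.5], [0.2, 0.8], [0.5, 0.5]],
    pi=[1, 0, 0],
    rho="x",
)
```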
Map attributed HSMMs to 2D Document Layout
• States: document regions such as text regions, page margins, gaps between text regions
• State transitions: boundaries and order of document regions
• State observation: features of document regions
• State duration: sizes of document regions
[Figure: example model with direction attribute ρ = Y]
States
• Key states
– Title, author, affiliation, abstract
• Marginal states
– Header, footer, section text
– Text from neighboring page
– Noise streak
• Combinatorial states: can be partitioned along another dimension
• Margin and gap states
State Observations and Durations
• State observations (contextual features)
– Number of characters
– Majority font size
– Majority key word
– Majority attribute (bold, italics)
• State durations (geometric features)
– The size of zones, page margins, and gaps between zones (width and height)
2D Layout Model: a set of attributed hidden semi-Markov models
[Figure: a page layout and its model set. X: lm P rm. Y: tm h g1 ti g2 C g3 B bm. X: C1 g4 C2. Y: ab g5 K. Y: au g6 af g7 ad.]
Bayesian Learning Method
• Start with an initial model M_0
• Let X be the observation sample associated with M_0
• Merge the states of M_0 until we find a model M such that
\[ M = \arg\max_{M} P(M \mid X) = \arg\max_{M} \frac{P(X \mid M)\, P(M)}{P(X)} = \arg\max_{M} P(X \mid M)\, P(M) \]
P(M_0|X) < P(M_1|X) < P(M_2|X) < P(M|X)
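Because P(X) is the same for every candidate model, choosing the model with the largest posterior only requires comparing P(X|M)P(M). The sketch below illustrates this with made-up prior and likelihood values for three hypothetical models; the function names and the numbers are assumptions for illustration.

```python
import math
from typing import Dict, Tuple

# name -> (prior P(M), likelihood P(X|M)); all values are made up.
CANDIDATES: Dict[str, Tuple[float, float]] = {
    "M0": (0.1, 1e-6),
    "M1": (0.3, 5e-7),
    "M2": (0.6, 4e-7),
}


def map_model(candidates: Dict[str, Tuple[float, float]]) -> str:
    """Return the model name that maximizes log P(X|M) + log P(M)."""
    return max(candidates,
               key=lambda m: math.log(candidates[m][0]) + math.log(candidates[m][1]))


def posteriors(candidates: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Normalize P(X|M)P(M) over the candidates to obtain P(M|X)."""
    joint = {m: p * l for m, (p, l) in candidates.items()}
    evidence = sum(joint.values())  # plays the role of P(X)
    return {m: j / evidence for m, j in joint.items()}


best = map_model(CANDIDATES)
post = posteriors(CANDIDATES)
```

With these numbers the posteriors increase from M0 to M1 to M2, mirroring the chain of inequalities on the slide, and the normalizing constant never changes the ranking.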
Model Merging Constraints
• Do not allow loops, since the order of zones is important
• Do not allow a text state to be merged with a gap or margin state
• Two states to be merged should be spatially close
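These three constraints can be read as simple predicates that filter candidate merges. The sketch below is one hedged interpretation: the `State` record, the one-dimensional `position` used for closeness, and the `max_distance` threshold are all assumptions, and the loop check is an ordinary depth-first cycle detection on the transition graph.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class State:
    """Hypothetical state record for illustration."""
    name: str
    kind: str        # "text", "gap", or "margin"
    position: float  # centre coordinate along the model direction


def can_merge(a: State, b: State, max_distance: float) -> bool:
    """Text states never merge with gap or margin states,
    and merged states must be spatially close."""
    if (a.kind == "text") != (b.kind == "text"):
        return False
    return abs(a.position - b.position) <= max_distance


def has_loop(transitions: Dict[str, Set[str]]) -> bool:
    """Detect any cycle (including self-loops) in the transition graph."""
    GREY, BLACK = 1, 2
    color: Dict[str, int] = {}

    def visit(u: str) -> bool:
        color[u] = GREY
        for v in transitions.get(u, ()):
            if color.get(v) == GREY:
                return True            # back edge: a loop exists
            if v not in color and visit(v):
                return True
        color[u] = BLACK
        return False

    return any(u not in color and visit(u) for u in list(transitions))
```

A merge would be accepted only if `can_merge` holds for the pair and the transition graph after merging still has no loop.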
The Recursive Learning Algorithm
1. Start from a set of training pages; let i = 0 and the 2D model M = ∅.
2. Learn 1D models m at level i; let M = M ∪ m.
3. Use M in a recursive duration Viterbi algorithm to segment the training pages.
4. Find the segmented regions that can be further split; exit if none exists.
5. Let i = i + 1 and go back to step 2.
[Figure: learned 1D models. m1 (X): lm P rm. m2 (Y): tm h g1 ti g2 C g3 B bm.]
Priors
• Break a model into three components
• Dirichlet distribution for multinomial prior ([Stolcke and Omohundro, 1994] proposed priors for HMMs)
• Multinomial and geometric distributions
\[ M = \{M_g, M_t, \theta_M\} \]
\[ P(M) = P(M_g, M_t, \theta_M) = P(M_g)\, P(M_t \mid M_g)\, P(\theta_M \mid M_g, M_t) = P(M_g) \prod_{q \in Q} P(M_t^{(q)} \mid M_g)\, P(\theta_M^{(q)} \mid M_g, M_t^{(q)}) \]
\[ P(M_t^{(q)} \mid M_g) = p_t^{n_t^{(q)}} (1 - p_t)^{|Q| - n_t^{(q)}} \; p_e^{n_e^{(q)}} (1 - p_e)^{|Q| - n_e^{(q)}} \; p_d^{n_d^{(q)}} (1 - p_d)^{|V_d| - n_d^{(q)}} \]
\[ P(\theta_M^{(q)} \mid M_g, M_t^{(q)}) = \frac{1}{B(\alpha_t, \ldots, \alpha_t)} \prod_{i=1}^{n_t^{(q)}} \theta_{qi}^{\alpha_t - 1} \cdot \frac{1}{B(\alpha_e, \ldots, \alpha_e)} \prod_{i=1}^{n_e^{(q)}} \theta_{qi}^{\alpha_e - 1} \cdot \frac{1}{B(\alpha_d, \ldots, \alpha_d)} \prod_{i=1}^{n_d^{(q)}} \theta_{qi}^{\alpha_d - 1} \]
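Likelihood
\[ P(X \mid M) = \int P(\theta_M \mid M)\, P(X \mid M, \theta_M)\, d\theta_M \approx \prod_{q \in Q} \int P(\theta_M^{(q)} \mid M)\, P(v_1^{(q)}, \ldots, v_n^{(q)} \mid M, \theta_M^{(q)})\, d\theta_M^{(q)} = \prod_{q \in Q} \frac{B(v_1^{(q)} + \alpha_1, \ldots, v_n^{(q)} + \alpha_n)}{B(\alpha_1, \ldots, \alpha_n)} \]
Approximating the likelihood in Bayesian learning by the Viterbi path:
\[ P(X \mid M, \theta_M) = \sum_{V} P(X, V \mid M, \theta_M) \approx P(X, V^* \mid M, \theta_M) = P(v^* \mid M, \theta_M) \]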
Global Weights
• Adjust the contributions of prior and likelihood to the posterior probability
• Control when the model generalization should stop
\[ w_1 \left[ \log\!\left( p_t^{n_t^{(q)}} (1 - p_t)^{|Q| - n_t^{(q)}} \, p_e^{n_e^{(q)}} (1 - p_e)^{|Q| - n_e^{(q)}} \right) + \log\!\left( p_d^{n_d^{(q)}} (1 - p_d)^{|V_d| - n_d^{(q)}} \right) \right] + w_2 \log \frac{B(v_1^{(q)} + \alpha_1, \ldots, v_n^{(q)} + \alpha_n)}{B(\alpha_1, \ldots, \alpha_n)} \]
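A small numeric sketch may help show how a weighted log posterior of this kind can be evaluated. It assumes a structure prior of the form p^n (1-p)^(N-n), a Dirichlet-multinomial likelihood written as a ratio of multivariate Beta functions, and two hypothetical weights `w_prior` and `w_lik`; the function names and the symmetric `alpha` are illustrative assumptions, not the authors' code.

```python
import math
from typing import Sequence


def log_beta(alphas: Sequence[float]) -> float:
    """Log of the multivariate Beta function B(a_1, ..., a_n)."""
    return sum(math.lgamma(a) for a in alphas) - math.lgamma(sum(alphas))


def log_marginal_likelihood(counts: Sequence[int], alpha: float = 1.0) -> float:
    """log [ B(v_1 + alpha, ..., v_n + alpha) / B(alpha, ..., alpha) ]
    for the counts v_i observed along the Viterbi path in one state."""
    return log_beta([v + alpha for v in counts]) - log_beta([alpha] * len(counts))


def log_structure_prior(n_used: int, n_possible: int, p: float) -> float:
    """log [ p^n_used * (1 - p)^(n_possible - n_used) ]."""
    return n_used * math.log(p) + (n_possible - n_used) * math.log(1.0 - p)


def weighted_log_posterior(log_prior: float, log_lik: float,
                           w_prior: float = 1.0, w_lik: float = 1.0) -> float:
    """Weighted sum of log prior and log likelihood (up to a constant)."""
    return w_prior * log_prior + w_lik * log_lik


# Toy state: 1 of 2 possible transitions used, transition counts (3, 0).
score = weighted_log_posterior(
    log_structure_prior(n_used=1, n_possible=2, p=0.5),
    log_marginal_likelihood([3, 0], alpha=1.0),
    w_prior=2.0,
    w_lik=1.0,
)
```

Raising `w_prior` relative to `w_lik` makes simpler models more attractive, so merging continues longer; lowering it stops generalization earlier.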
Comparison of Four Labeling Methods (test set: 69 pages)
[Figure: bar chart of zoning and labeling accuracy (%), 0 to 100, for four methods: heuristic rules, HMM-based, DFGS-based, HSMM-based]
• 198 title textlines
• 181 author textlines
• 600 affiliation textlines
• 2079 abstract textlines
• Heuristic rules and DFGS:
1. Assume zoning is done
2. Use font size, font attribute, key words as features
• HMM- and HSMM-based methods:
1. Simultaneous zoning and labeling
2. Only use character count as feature (3 others later)
Future Work
• For TIFF images, extend feature set to font size, font attributes, key words
• Map the layout model to other document formats, e.g., HTML, PDF
• Use text-line (rather than zone) as basic state unit
George R. Thoma, Ph.D.
Chief, Communications Engineering Branch
Lister Hill National Center for Biomedical Communications
National Library of Medicine
8600 Rockville Pike, Bethesda, MD 20894 USA
[email protected] 496 4496
archive.nlm.nih.gov
Publications
1. Bayesian Learning of 2D Document Layout Models for Automated Preservation Metadata Extraction, Song Mao and George R. Thoma. Submitted to the 4th IASTED International Conference on Visualization, Imaging, and Image Processing.
2. Style-Independent Labeling: Design and Performance Evaluation, Song Mao, Jong Woo Kim and G. R. Thoma, SPIE Conference on Document Recognition and Retrieval, pages 14-22, San Jose, CA, January 2004.
3. A Dynamic Feature Generation System for Automated Metadata Extraction in Preservation of Digital Materials, Song Mao, Jong Woo Kim and G. R. Thoma, The First International Workshop on Document Image Analysis for Libraries, pages 225-232, Palo Alto, CA, January 2004.
4. Stochastic Attributed K-D Tree Modeling of Technical Paper Title Pages, Song Mao, Azriel Rosenfeld, Tapas Kanungo, IEEE International Conference on Image Processing, pages 533-536, Barcelona, Spain, September 2003.
5. Stochastic Language Model for Style-Directed Physical Layout Analysis of Documents, Tapas Kanungo and Song Mao, IEEE Transactions on Image Processing, vol. 12, no. 5, pages 583-596, May 2003.
References
6. Best-first model merging for hidden Markov model induction, A. Stolcke and S. M. Omohundro, Technical Report TR-94-003, ICSI, Berkeley, CA, 1994.
7. Document layout structure extraction using bounding boxes of different entities, J. Liang, J. Ha, R. M. Haralick, 3rd IEEE Workshop on Applications of Computer Vision (WACV ’96), December 1996.
8. Document page decomposition using bounding boxes of connected components of black pixels, J. Ha, R. M. Haralick, I. T. Phillips, Document Recognition II, SPIE Proceedings, vol. 2422, pp. 140-151, February 1995.
9. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, T. Hastie, R. Tibshirani, and J. H. Friedman, Springer Series in Statistics, 2001.
Examples
[Figure: example labeling results for the four methods: HSMM-based, HMM-based, DFGS-based, heuristic-rule-based]
Bayesian Learning
Goal is to find
\[ M^* = \arg\max_{M} P(M \mid X) = \arg\max_{M} \frac{P(M)\, P(X \mid M)}{P(X)} = \arg\max_{M} P(M)\, P(X \mid M) \]
Need to know the explicit form of the prior P(M) and the likelihood P(X|M). Both are obtained from the training set.
Layout Model Learning Results (training set: 19 journal title pages)
Model component    Number of initial states    Number of final states
1                  95                          5
2                  189                         17
3                  205                         32
Model Merging Algorithm
• Best-first merging with look-ahead [Stolcke and Omohundro, 1994]
• Algorithm steps
– Let M_0 be the empty model. Let i = 0. Loop:
1. Get 5 new samples X and incorporate them into M_i.
2. Find the best merge, the one that maximizes P(M_i|X).
3. Let M_{i+1} be the new model.
4. If P(M_{i+1}|X) < P(M_i|X), perform look-ahead: if the probability does not improve after merging some more states, break from the loop; otherwise let M_{i+1} be the merged model.
5. Let i = i + 1.
– If data is exhausted, break from the loop and return M_i as the induced model.
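The merging loop can be sketched in a simplified form. This version leaves out the incremental incorporation of new samples and works on a toy problem: a "model" is a tuple of groups of 1-D positions, and the score is a stand-in for the log posterior (negative within-group squared error minus a penalty per group). The function names, the score, and the data are all assumptions for illustration; only the greedy best-first structure with look-ahead follows the steps above.

```python
from itertools import combinations
from typing import Callable, Tuple

Model = Tuple[Tuple[float, ...], ...]


def merge(model: Model, i: int, j: int) -> Model:
    """Return a new model in which groups i and j are merged."""
    rest = tuple(g for k, g in enumerate(model) if k not in (i, j))
    return rest + (model[i] + model[j],)


def toy_score(model: Model, penalty: float = 5.0) -> float:
    """Stand-in for log P(M|X): fit term minus a per-state penalty."""
    sse = 0.0
    for g in model:
        m = sum(g) / len(g)
        sse += sum((x - m) ** 2 for x in g)
    return -sse - penalty * len(model)


def best_first_merge(model: Model, score: Callable[[Model], float],
                     lookahead: int = 1) -> Model:
    """Greedily apply the best-scoring merge; after a non-improving merge,
    try up to `lookahead` further merges before giving up."""
    best_model, best_score = model, score(model)
    current, misses = model, 0
    while len(current) > 1 and misses <= lookahead:
        candidates = [merge(current, i, j)
                      for i, j in combinations(range(len(current)), 2)]
        current = max(candidates, key=score)
        s = score(current)
        if s > best_score:
            best_model, best_score, misses = current, s, 0
        else:
            misses += 1
    return best_model


initial: Model = ((10.0,), (11.0,), (12.0,), (50.0,), (51.0,))
induced = best_first_merge(initial, toy_score)
```

On this toy input the loop keeps merging nearby positions until merging the two distant clusters would lower the score, at which point the best model seen so far is returned.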
The Recognition Algorithm
• A duration Viterbi algorithm [Rabiner et al, 1985]
• Recursively apply it to a document page using a set of learned attributed hidden semi-Markov models [Mao et al, 2003]
\[ q^* = \arg\max_{q} P(o, q \mid \lambda) \]
where o is the feature observation vector, q is the sequence of states corresponding to o, and λ is an attributed hidden semi-Markov model.
\[ \delta_t(j) = \max_{q_1, q_2, \ldots, q_{t-1}} P(o_1, o_2, \ldots, o_t,\ \text{the stay in state } j \text{ ends at time } t \mid \lambda) = \max_{1 \le d \le \min(t, D),\ i \ne j} \delta_{t-d}(i)\, \hat{a}_{ij}\, c_j(d) \prod_{s=t-d+1}^{t} b_j(o_s) \]
where \( \hat{a}_{ij} = a_{ij} \) if \( d < t \), \( \hat{a}_{ij} = \pi_j \) if \( d = t \), and \( \delta_0(i) = 1 \) for all i.
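For a single 1-D model, a duration Viterbi recursion can be sketched as follows. This is a minimal log-domain illustration with discrete observation symbols and explicit duration tables, under the assumption that a segment starting at the first observation uses the initial probability instead of a transition; it does not implement the recursive 2-D procedure, and the function name, argument layout, and toy numbers are hypothetical.

```python
import math
from typing import List, Optional, Tuple


def duration_viterbi(obs: List[int], pi: List[float], A: List[List[float]],
                     B: List[List[float]], C: List[List[float]]
                     ) -> Tuple[float, List[Tuple[int, int, int]]]:
    """Return (best log probability, segments), where each segment is
    (state, start, end) with `end` exclusive. C[j][d-1] is the probability
    that state j lasts exactly d observations."""
    T, N = len(obs), len(pi)
    NEG = float("-inf")

    def log(p: float) -> float:
        return math.log(p) if p > 0 else NEG

    # delta[t][j]: best score of explaining obs[:t] with a stay in j ending at t
    delta = [[NEG] * N for _ in range(T + 1)]
    back: List[List[Optional[Tuple[int, Optional[int]]]]] = [
        [None] * N for _ in range(T + 1)]
    for t in range(1, T + 1):
        for j in range(N):
            emit = 0.0
            for d in range(1, min(t, len(C[j])) + 1):
                emit += log(B[j][obs[t - d]])      # extend segment backwards
                if d == t:                          # segment starts the sequence
                    prevs = [(log(pi[j]), None)]
                else:                               # arrive from another state
                    prevs = [(delta[t - d][i] + log(A[i][j]), i)
                             for i in range(N) if i != j]
                for prev_score, i in prevs:
                    s = prev_score + log(C[j][d - 1]) + emit
                    if s > delta[t][j]:
                        delta[t][j], back[t][j] = s, (d, i)

    j = max(range(N), key=lambda s: delta[T][s])
    best = delta[T][j]
    if best == NEG:
        raise ValueError("no valid segmentation for these observations")
    segments, t = [], T
    while t > 0:                                    # backtrack segment by segment
        d, i = back[t][j]
        segments.append((j, t - d, t))
        t, j = t - d, i
    segments.reverse()
    return best, segments


# Toy two-state model: state 0 (say, title) then state 1 (say, author).
log_p, segs = duration_viterbi(
    obs=[0, 0, 1, 1],
    pi=[1.0, 0.0],
    A=[[0.0, 1.0], [0.0, 0.0]],
    B=[[0.9, 0.1], [0.2, 0.8]],
    C=[[0.2, 0.3, 0.5], [0.5, 0.5]],
)
```

In a layout setting the observations would be features measured along x or y, and each returned segment would correspond to one zone, margin, or gap.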