Math Literate Computers
Dorothea Blostein School of Computing, Queen’s University
CICM 2009
Math Literacy: The ability to read and write math notation.
In people, understanding precedes literacy. Computers are fairly literate, but with shallow understanding.
People learn to read before they learn to write. Computers are better at writing than reading.
Math literacy relates to literacy in other diagram notations: two-dimensional, domain-specific, natural languages.
Freedom to think with paper and pencil.
Computer support for typesetting, search, automated reasoning.
Goal: Smooth conversion between paper and electronic documents
Four Color Theorem, Appel and Haken, 1976
Math Notation - A Tool to Support Reasoning
• Evolved over centuries
• Additional notation is invented as needed
• Many dialects
Difficult: create an aesthetically appealing diagram
A solved problem
Difficult. An active research area.
Difficult: handle symbol recognition errors and variable layout.
Writing (Generation)
Reading (Recognition)
Conventions geared toward generation
Conventions geared toward recognition
Many Diagrams Represent the Same Information
Same use of hard conventions
Varying use of Soft conventions
Recognition: All the diagrams lead to the same information.
Generation: One path (chosen according to user preferences) from information to diagram.
Hard conventions: how to encode information. Soft conventions: how to make it readable.
Sources of Information about Math Notation
Sample Documents: Math notation defined by use in society.
By example. People use their judgment.
Written Descriptions geared toward manual typesetting: Chaundy, Barrett, Batey, The Printing of Mathematics, 1957. Wick, Rules for Typesetting Mathematics, 1965. Higham, Handbook of Writing for the Math. Sciences, 1993.
Written Descriptions geared toward computational typesetting: Knuth, “Mathematical Typography,” Bulletin of the AMS, 1979.
Program Code for recognizing and generating math notation.
Introspection.
Recognition Contests define datasets and evaluation metrics. Contests at ICDAR and GREC: arc segmentation, symbol recognition, segmenting text and graphics, raster to vector conversion, signature verification, document binarization, page segmentation.
Statistics about Math Notation: An Example
Gather statistics from training data.
Almost matches human performance in labeling bounding boxes.
Spatial relations for pairs of bounding boxes.
Top labels: most likely, based on statistics.
Ambiguity due to unknown baseline
[Wang & Faure, ICPR 1988]
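The statistics idea above can be sketched in a few lines. This is only an illustration, not the method of the cited paper: the feature (vertical offset of the child box relative to the parent box), the bin width, the function names, and the tiny training set are all assumptions made for the example. Boxes are (xmin, ymin, xmax, ymax) with y growing upward.

```python
from collections import Counter, defaultdict

def vertical_offset(parent, child):
    """Offset of the child's centre from the parent's centre,
    normalised by the parent's height (hypothetical feature)."""
    p_center = (parent[1] + parent[3]) / 2
    c_center = (child[1] + child[3]) / 2
    return (c_center - p_center) / (parent[3] - parent[1])

def bin_of(offset, width=0.25):
    """Quantise the offset so that counts can be kept per bin."""
    return int(offset // width)

def train(samples):
    """Gather label counts per bin from (parent, child, label) samples."""
    counts = defaultdict(Counter)
    for parent, child, label in samples:
        counts[bin_of(vertical_offset(parent, child))][label] += 1
    return counts

def top_labels(counts, parent, child):
    """Labels seen for this bin, most frequent first."""
    seen = counts.get(bin_of(vertical_offset(parent, child)), Counter())
    return [label for label, _ in seen.most_common()]

# Invented training data. The third sample has a parent box that extends
# below the (unknown) baseline, so an inline neighbour looks raised.
samples = [
    ((0, 0, 10, 10), (10, 6, 14, 10), "superscript"),
    ((0, 0, 10, 10), (10, 7, 14, 10), "superscript"),
    ((0, -5, 10, 5), (10, 0, 16, 6), "inline"),
    ((0, 0, 10, 10), (10, 0, 20, 10), "inline"),
    ((0, 0, 10, 10), (10, -2, 14, 2), "subscript"),
]
counts = train(samples)
print(top_labels(counts, (0, 0, 10, 10), (10, 6.5, 14, 10)))
```

The query returns more than one label for the raised box, which is the kind of ambiguity that arises when only bounding boxes, and not the baseline, are known.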
Challenges in Math Recognition
Symbol recognition ( C O 0 7 > S 5 / 1 l
Several roles for symbols
Spatial relationships
Little redundancy
Handwritten notation is particularly difficult
Compilers easily handle math notation in programming languages.
2D math notation is harder:
– Noise causes errors in segmenting and identifying symbols.
– Can’t blame the user for mistakes.
– Hard to capture 2D relationships effectively in a string.
Math-Recognition Approaches
• Procedurally-coded math syntax
• Coordinate grammar
• Projection profile cutting
• Stochastic grammars & HMMs
• Graph rewriting
• Tree rewriting
Evaluate/compare these approaches?
The choice of software architecture is difficult to make and defend.
[Survey by Blostein and Grbavec, 1997]
Math-Recognition Approaches: Procedurally-coded math syntax
No explicit definition of math syntax.
Update code in response to recognition errors.
Can get good recognition performance.
Math-Recognition Approaches: Coordinate grammar
Apply a rule to a set of symbols: create subsets with syntactic subgoals.
A clear, well-structured representation of notational conventions.
[Anderson 1969; in Fu 77]
Attributes: xmin, ymin, xmax, ymax, xcenter. m encodes meaning.
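A minimal sketch of how one such rule might look, using the coordinate attributes listed above. It is not Anderson's grammar: the `Symbol` class, the `split_fraction` function, the symbol name "hbar", and the convention that y grows upward are assumptions for illustration. The rule takes a set of symbols and, if the coordinate conditions hold, creates two subsets that become the subgoals "numerator" and "denominator".

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Symbol:
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    @property
    def xcenter(self):
        return (self.xmin + self.xmax) / 2

def split_fraction(symbols):
    """Hypothetical 'fraction' rule: partition the symbols around the
    widest horizontal bar. Returns None if the rule does not apply."""
    bars = [s for s in symbols if s.name == "hbar"]
    if not bars:
        return None
    bar = max(bars, key=lambda s: s.xmax - s.xmin)
    rest = [s for s in symbols if s is not bar]
    # Coordinate condition: every other symbol is centred over the bar.
    if not all(bar.xmin <= s.xcenter <= bar.xmax for s in rest):
        return None
    numerator = [s for s in rest if s.ymin >= bar.ymax]    # above the bar
    denominator = [s for s in rest if s.ymax <= bar.ymin]  # below the bar
    if not numerator or not denominator:
        return None
    if len(numerator) + len(denominator) != len(rest):
        return None  # some symbol straddles the bar
    return {"rule": "fraction", "numerator": numerator, "denominator": denominator}

# Invented layout for (a + b) / c.
expr = [
    Symbol("a", 0, 12, 4, 16),
    Symbol("+", 5, 12, 9, 16),
    Symbol("b", 10, 12, 14, 16),
    Symbol("hbar", 0, 9, 14, 10),
    Symbol("c", 5, 0, 9, 4),
]
result = split_fraction(expr)
print([s.name for s in result["numerator"]], [s.name for s in result["denominator"]])
```

Each subset would then be parsed in turn against its own syntactic goal, which is what gives the top-down structure of the approach.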
Math-Recognition Approaches: Projection profile cutting
horizontal cut
vertical cut
[Okamoto and Miao, 1992]
The order of cuts provides the tree-structure of the expression.
A simple and efficient technique. Can be applied prior to OCR.
Special handling of overlapping symbols:
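The cutting idea can be sketched as a recursive procedure that alternates vertical and horizontal cuts. This is a simplified illustration, not the cited implementation: it works on labelled bounding boxes rather than on pixel projection profiles (the labels are only there to make the output readable), y is assumed to grow upward, and the names `group_by_gaps` and `cut` are invented for the example. Boxes are (name, xmin, ymin, xmax, ymax).

```python
def group_by_gaps(intervals):
    """Group interval indices into runs separated by empty gaps
    in the projection onto one axis."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    groups = [[order[0]]]
    reach = intervals[order[0]][1]
    for i in order[1:]:
        lo, hi = intervals[i]
        if lo > reach:           # a gap in the profile: cut here
            groups.append([i])
        else:
            groups[-1].append(i)
        reach = max(reach, hi)
    return groups

def cut(boxes, vertical=True, other_axis_failed=False):
    """Return a tree whose nesting records the order of the cuts."""
    if len(boxes) == 1:
        return boxes[0][0]
    lo, hi = (1, 3) if vertical else (2, 4)  # x-extent or y-extent
    groups = group_by_gaps([(b[lo], b[hi]) for b in boxes])
    if len(groups) == 1:
        if other_axis_failed:
            # No cut possible in either direction: overlapping symbols
            # need special handling, so they are returned as one unit.
            return ("overlap", sorted(b[0] for b in boxes))
        return cut(boxes, not vertical, other_axis_failed=True)
    if not vertical:
        groups.reverse()         # list horizontal strips top to bottom
    kind = "vcut" if vertical else "hcut"
    return (kind, [cut([boxes[i] for i in g], not vertical) for g in groups])

# Invented layout for x = (a + b) / c.
boxes = [
    ("x", 0, 6, 4, 10), ("=", 6, 7, 10, 9),
    ("a", 12, 12, 16, 16), ("+", 17, 12, 21, 16), ("b", 22, 12, 26, 16),
    ("bar", 12, 9, 26, 10), ("c", 17, 3, 21, 7),
]
print(cut(boxes))
# ('vcut', ['x', '=', ('hcut', [('vcut', ['a', '+', 'b']), 'bar', 'c'])])
```

A square-root sign that encloses its argument cannot be separated by either kind of cut, so the sketch reports it as an "overlap" group instead of recursing forever.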
Math-Recognition Approaches: Stochastic grammars & HMMs
Hidden Markov Model [Kopec, Chou 1994]
An explicit image-generation model, to drive recognition.
Applied to yellow pages & music notation.
2D stochastic context-free grammar [Chou 1989]
Find the most likely parse of the image, without segmentation.
Math-Recognition Approaches: Graph rewriting
Rewrite rules replace one subgraph by another
PROGRES language: a mix of textual and visual notation
Write a graph schema to define the structure of valid graphs.
The PROGRES execution environment flags violations.
[Figure labels: Build, Constrain, Parse]
[Blostein, Schürr, Software Practice and Experience, 1999]
Math-Recognition Approaches: Tree rewriting
Compiler-inspired approach, using tree rewriting [Zanibbi, Blostein, Cordy: ICPR 2002 and PAMI 2002]
Separate analysis of layout, lexical, syntactic, and semantic aspects.
Get partial results even if there are syntax errors.
Find linear structures in the input, and create a tree from them.
Operation of a compiler
Recognition of math notation
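To give a flavour of a compiler-style pass expressed as tree rewriting, here is a small sketch of a lexical pass that merges adjacent digits on a baseline into a single number token. It is an illustration only and not the cited system: the tree encoding (a leaf is a string, an inner node is a (label, children) tuple), the labels "baseline", "super" and "number", and the function names are all assumptions.

```python
def rewrite(tree, rule):
    """Apply a rewrite rule bottom-up to every node of the tree."""
    if isinstance(tree, tuple):
        label, children = tree
        tree = (label, [rewrite(child, rule) for child in children])
    return rule(tree)

def merge_digits(tree):
    """Hypothetical lexical rule: on a baseline, replace each run of
    adjacent digit leaves by one ('number', [text]) node."""
    if not (isinstance(tree, tuple) and tree[0] == "baseline"):
        return tree
    merged = []
    for child in tree[1]:
        is_digit = isinstance(child, str) and child.isdigit()
        last_is_number = (
            bool(merged)
            and isinstance(merged[-1], tuple)
            and merged[-1][0] == "number"
        )
        if is_digit and last_is_number:
            merged[-1] = ("number", [merged[-1][1][0] + child])
        elif is_digit:
            merged.append(("number", [child]))
        else:
            merged.append(child)
    return ("baseline", merged)

# Invented layout tree: x with superscript "12", followed by + 3 4.
layout = ("baseline", ["x", ("super", [("baseline", ["1", "2"])]), "+", "3", "4"])
print(rewrite(layout, merge_digits))
```

Because each pass maps a tree to a tree, a pass that finds nothing to rewrite simply returns its input, which is one way to keep partial results when a later stage fails.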
Goal: seamless transition between
- real world (stylus and paper)
- electronic world
Many paper documents are produced from electronic sources. Eventually include digitally-encoded contents?
Methods used in digital watermarking are relevant.
Electronic → Paper is more advanced than Paper → Electronic
Entering math expressions
• How much user time?
• How many residual errors?
• How much frustration?
Method 1: Use recognition software. Scan a document image or write on a data tablet.
Method 2: Enter information directly. Type the information (e.g. LaTeX) or use a structure-based editor.
User proofreads and corrects
Generate math notation
Recognition software
Information
User Frustration
People eventually feel comfortable with irritating interfaces.
The Argh is a unit of frustration. Kilarghs. Megarghs… Arghometers need to be developed.
Document recognition is frustrating because:
1. Users don’t like to correct errors made by the “stupid computer”. Better to correct errors they made themselves.
2. Users don’t like to think about the marks on the paper. They would rather think about the document contents.
3. Users don’t like unpredictable systems. Better to adapt themselves (even if inconvenient) to achieve predictability.
[Talk at ICDAR 2001]
Possible research directions
Precisely define math literacy tasks.
Use soft conventions in recognition.
Use statistics: know about likely versus unlikely expressions.
Exploit the advanced state of generation, to improve recognition.
Topics:
• Notational Conventions
• What is Math Notation, anyway?
• Math Recognition Approaches
• User Interface Issues
Conclusion
A group effort is required.