(Simple)? PDF Manipulation Language
Stefano Pacifico, Jayesh Kataria, Dhivya Khrishnan, Hye Seon Yi


Post on 21-Dec-2015






  • Slide 1
  • (Simple)? PDF Manipulation Language
    • Stefano Pacifico, Jayesh Kataria, Dhivya Khrishnan, Hye Seon Yi
  • Slide 2
  • Contents
    • Overview & Motivation
    • Language Features
    • PDF Functionalities
    • Architectural Design
    • Tutorials (including example)
    • Lessons Learned
    • Summary
  • Slide 3
  • Overview & Motivation
    • SPML (Simple PDF Manipulation Language) is a language for creating and manipulating PDF files.
    • PDF is a de facto standard for electronic documents because it is an open format with a free viewer (Acrobat Reader).
    • However, manipulating PDF files is difficult and expensive.
    • A few freely available libraries exist, e.g. iText and XPAAJ.
    • Why not build a language for PDF on top of them?
    • The focus is on manipulating PDF files, since people can easily create them with freeware, e.g. PDF ReDirect and cutePDF Writer.
  • Slide 4
  • Language Features
    • Carefully chosen set of keywords
    • Multiple types (int, string, pdf, void, array)
    • Several operators
      • Unary operators (~, !)
      • Arithmetic operators (+, -, *, /)
      • Comparison operators (<, <=, >, >=, ==, !=)
      • Logical operators (&&, ||)
      • PDF operators (+, create, extractpage, totextfile, highlight, in)
  • Slide 5
  • Language Features (cont.)
    • Various types of statements
      • Conditional statements (if, else)
      • Iterative statements (while)
      • Jump statements (return, continue, break)
      • I/O statements (print, totextfile)
    • User-defined functions
    • Recursion
  • Slide 6
  • PDF Functionalities
    • File generation (create)
    • File concatenation (+ operator)
    • Page extraction (extractpage)
    • Highlighting a word (highlight)
    • All PDFs in a directory (in)
    • Text file support (totextfile)
  • Slide 7
  • Architectural Design
    • Front End (SPMLLexer, SPMLParser): takes SPML source code and outputs an AST.
    • Tree Walker (SPMLWalker, CompilerException, SPMLCodeGen, Environment Classes): receives the AST, performs static semantic checking, and generates Java output code.
    • Back End (CodeGen, SPMLLibrary): bridge class between the Java output code and the runtime libraries.
    • Runtime Library (JRE System, iText, XPAAJ): iText is an open-source PDF library in Java; XPAAJ is the XML/PDF Access API for Java from Adobe.
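The layering on this slide can be illustrated with a toy pipeline. The Python sketch below is not the SPML compiler; it handles only `pdf x;` declarations and `x = "file";` assignments. The function names, the Java-like `Pdf` type, and the emitted `SPMLLibrary.open(...)` call are all hypothetical, since the slides do not show the generated code or the library's API.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]      # (kind, text)
Node = Tuple[str, str, str]  # (kind, variable name, string value or "")

TOKEN = re.compile(r'\s*(?:(pdf)\b|([A-Za-z_]\w*)|"([^"]*)"|([=;]))')


class CompilerError(Exception):
    """Stand-in for the CompilerException box on the slide."""


def lex(source: str) -> List[Token]:
    """Front end, step 1: split source text into (kind, text) tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if not match:
            raise CompilerError(f"unexpected character at offset {pos}")
        keyword, ident, string, punct = match.groups()
        if keyword:
            tokens.append(("KEYWORD", keyword))
        elif ident:
            tokens.append(("IDENT", ident))
        elif string is not None:
            tokens.append(("STRING", string))
        else:
            tokens.append(("PUNCT", punct))
        pos = match.end()
    return tokens


def parse(tokens: List[Token]) -> List[Node]:
    """Front end, step 2: build a flat AST of declarations and assignments."""
    ast, i = [], 0
    while i < len(tokens):
        kinds = [kind for kind, _ in tokens[i:i + 4]]
        texts = [text for _, text in tokens[i:i + 4]]
        if kinds[:3] == ["KEYWORD", "IDENT", "PUNCT"] and texts[2] == ";":
            ast.append(("declare", texts[1], ""))
            i += 3
        elif (kinds == ["IDENT", "PUNCT", "STRING", "PUNCT"]
              and texts[1] == "=" and texts[3] == ";"):
            ast.append(("assign", texts[0], texts[2]))
            i += 4
        else:
            raise CompilerError(f"syntax error near token {i}")
    return ast


def walk(ast: List[Node]) -> List[str]:
    """Tree walker: check declare-before-use, then emit Java-like lines."""
    declared, output = set(), []
    for kind, name, value in ast:
        if kind == "declare":
            if name in declared:
                raise CompilerError(f"'{name}' is already declared")
            declared.add(name)
            output.append(f"Pdf {name};")
        else:
            if name not in declared:
                raise CompilerError(f"'{name}' is used before declaration")
            # Hypothetical bridge call; the real SPMLLibrary API is not shown.
            output.append(f'{name} = SPMLLibrary.open("{value}");')
    return output


def compile_spml(source: str) -> List[str]:
    """Run the three stages in order: lex, parse, walk."""
    return walk(parse(lex(source)))


for line in compile_spml('pdf p1; p1 = "a.pdf";'):
    print(line)
```

Keeping the semantic check in the walker rather than the parser mirrors the slide's split: the front end only produces the tree, and errors such as use before declaration are reported in a later pass.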
  • Slide 8
  • Tutorials - Example
    • Program to concatenate two PDF files

        start()
        {
            pdf p1;
            p1 = "a.pdf";                 /* open a.pdf */
            pdf p2;
            p2 = "b.pdf";                 /* open b.pdf */
            pdf combined;
            combined = create "c.pdf";    /* create c.pdf */
            combined = p1 + p2;
        }
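As a rough model of what `+` means on the `pdf` type, the Python sketch below treats a PDF as an ordered list of page strings. It does not reflect SPML's real implementation, which the slides say goes through SPMLLibrary to iText and XPAAJ; the `Pdf` class, its fields, and the `concatenate` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pdf:
    """Hypothetical stand-in for SPML's pdf type: a name plus ordered pages."""
    name: str
    pages: List[str] = field(default_factory=list)

    def __add__(self, other: "Pdf") -> "Pdf":
        # Concatenation keeps page order: all pages of self, then all of other.
        return Pdf(name=self.name, pages=self.pages + other.pages)


def concatenate(target_name: str, first: Pdf, second: Pdf) -> Pdf:
    """Mirror `combined = create "c.pdf"; combined = p1 + p2;` from the slide."""
    combined = first + second
    # Assumption: the result is stored under the name of the created file.
    combined.name = target_name
    return combined


p1 = Pdf("a.pdf", ["a-1", "a-2"])
p2 = Pdf("b.pdf", ["b-1"])
combined = concatenate("c.pdf", p1, p2)
print(combined.name, combined.pages)
```

The operands are left unchanged; only the new object carries the combined page list.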
  • Slide 9
  • Tutorials (cont.)
    • Variable declaration: pdf file;
    • Array declaration: pdf files[10];
    • Conditional statement: if (a == 1) { print "a is 1"; }
    • Iteration statement: while (a < 5) { print "a = " + a; }
    • Jump statement: return a; continue; break;
    • I/O statement: print "Hello World!";
    • User-defined function: int sum(int a, int b) { return a + b; }
    • Recursion: used to reverse a file (shown in the demo)
  • Slide 10
  • Tutorials (cont.)
    • Length operator: int a; a = length files;
    • In operator
      • All PDFs in a directory: files = pdf in dir;
      • Phrase search in a PDF: int iArray[10]; iArray = "the" in files[0];
      • Phrase search in a string: a = "1" in "12345";
    • Extract a page: file = extractpage files[0] 1;
    • Highlight a phrase: highlight pdfFile "COMS";
    • Save as a text file: totextfile pdfFile "file.txt";
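The slides do not spell out what the `in` operator returns, so the Python sketch below is only one plausible reading: directory search yields the PDF file names, and phrase search in a PDF yields the 1-based numbers of the pages that contain the phrase. Both return conventions, and every function name here, are assumptions; the string form of `in` is left out because its integer result is not described.

```python
import tempfile
from pathlib import Path
from typing import List


def pdfs_in(directory: str) -> List[str]:
    """Model of `pdf in dir`: names of the .pdf files in a directory, sorted."""
    return sorted(p.name for p in Path(directory).iterdir()
                  if p.is_file() and p.suffix.lower() == ".pdf")


def phrase_in_pages(phrase: str, pages: List[str]) -> List[int]:
    """Model of `"the" in file`; 1-based page numbers are an assumption."""
    return [n for n, text in enumerate(pages, start=1) if phrase in text]


def length(items: List) -> int:
    """Model of the `length` operator applied to an array."""
    return len(items)


# Build a throwaway directory so the example does not depend on real files.
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.pdf", "a.pdf", "notes.txt"):
        (Path(folder) / name).write_bytes(b"")
    files = pdfs_in(folder)

hits = phrase_in_pages("the", ["the cat", "a dog", "see the end"])
print(files, length(files))
print(hits)
```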
  • Slide 11
  • Applications
    • Forming a catalogue of PDFs
    • Reversing the pages of a file
    • Deleting a page from a PDF
    • Extracting even and odd pages and forming a new PDF
    • Swapping two pages of a file
    • Highlighting a word in a PDF
    • Forming a new PDF from the pages that contain a specific word
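Several of these applications reduce to page extraction plus concatenation. The Python sketch below models a PDF as a list of page strings and builds each operation from those two primitives; it is an illustration of the idea, not SPML code, and the 1-based page numbering and all function names are assumptions.

```python
from typing import List

Pdf = List[str]  # hypothetical model: a PDF is an ordered list of page contents


def extract_page(pdf: Pdf, number: int) -> Pdf:
    """Model of `extractpage pdf n`; 1-based numbering is an assumption."""
    return [pdf[number - 1]]


def reverse_pages(pdf: Pdf) -> Pdf:
    """Recursive reversal: reverse everything after page 1, then append page 1."""
    if len(pdf) <= 1:
        return list(pdf)
    return reverse_pages(pdf[1:]) + extract_page(pdf, 1)


def delete_page(pdf: Pdf, number: int) -> Pdf:
    """Concatenate every page except the one being deleted."""
    result: Pdf = []
    for n in range(1, len(pdf) + 1):
        if n != number:
            result = result + extract_page(pdf, n)
    return result


def pages_by_parity(pdf: Pdf, even: bool) -> Pdf:
    """Collect the even-numbered (or odd-numbered) pages into a new PDF."""
    result: Pdf = []
    for n in range(1, len(pdf) + 1):
        if (n % 2 == 0) == even:
            result = result + extract_page(pdf, n)
    return result


def swap_pages(pdf: Pdf, i: int, j: int) -> Pdf:
    """Rebuild the PDF, taking page j at position i and page i at position j."""
    result: Pdf = []
    for n in range(1, len(pdf) + 1):
        source = j if n == i else i if n == j else n
        result = result + extract_page(pdf, source)
    return result


doc = ["p1", "p2", "p3", "p4"]
print(reverse_pages(doc))
print(delete_page(doc, 2))
print(pages_by_parity(doc, even=True))
print(swap_pages(doc, 1, 4))
```

The recursive reversal follows the slides' remark that recursion is used to reverse a file; the other helpers use a simple loop, which matches the language's `while` statement.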
  • Slide 12
  • Lessons Learned
    • Choose types carefully: the absence of a boolean type was a drawback.
    • User input could have been added.
    • Deadlines are never too far away!
  • Slide 13
  • Summary
    • SPML is a simple yet powerful language for manipulating PDF files.
    • SPML works!

