![Page 1: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/1.jpg)
NLP and Graph Databases in Lumify
Charlie Greenbacker & Joe Kerner
![Page 2: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/2.jpg)
Agenda
Introductions
Lumify Overview
Natural Language Processing
Graph Databases
![Page 3: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/3.jpg)
photo: Columbia Pictures
About me: @greenbacker
Theories: popular tripe
Methods: sloppy
Conclusions: highly questionable
![Page 4: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/4.jpg)
Best reason for not finishing PhD
![Page 5: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/5.jpg)
@ExploreAltamira
![Page 6: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/6.jpg)
Lumify is an open source big data analysis and visualization platform built by Altamira engineers
![Page 7: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/7.jpg)
Key Lumify Concepts
Ontology: structure for organizing information (i.e., your data model)
Entities: any “thing” you want to represent (e.g., person, place, event)
Relationships: a link between two entities (e.g., leader-of, works-for, sibling-of)
Properties: data about an entity (e.g., first name, last name, date of birth)
Graph: collection of entities and the relationships between them
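The concepts on this slide can be sketched as a tiny in-memory data model. This is only an illustration in Python, not Lumify's actual code or API: the class names, field names, and the sample person and organization are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Any "thing" in the graph, typed by an ontology concept."""
    entity_id: str
    concept: str                      # ontology concept, e.g. "person"
    properties: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Relationship:
    """A labeled, directed link between two entities."""
    source_id: str
    label: str                        # e.g. "leader-of", "works-for"
    target_id: str


@dataclass
class Graph:
    """Entities plus the relationships between them."""
    entities: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_entity(self, entity):
        self.entities[entity.entity_id] = entity

    def relate(self, source_id, label, target_id):
        # Refuse links to entities that are not in the graph.
        if source_id not in self.entities or target_id not in self.entities:
            raise KeyError("both entities must exist before linking them")
        self.relationships.append(Relationship(source_id, label, target_id))

    def neighbors(self, entity_id):
        """Return (label, other_entity_id) pairs for links touching entity_id."""
        out = []
        for r in self.relationships:
            if r.source_id == entity_id:
                out.append((r.label, r.target_id))
            elif r.target_id == entity_id:
                out.append((r.label, r.source_id))
        return out


# Hypothetical sample data, for illustration only.
graph = Graph()
graph.add_entity(Entity("p1", "person", {"first name": "Ada", "last name": "Example"}))
graph.add_entity(Entity("o1", "organization", {"name": "Example Corp"}))
graph.relate("p1", "works-for", "o1")
```

Calling `graph.neighbors("p1")` walks the relationship list and returns the entities linked to that person, which is the basic operation behind exploring connections in a graph.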
![Page 8: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/8.jpg)
Live Demo
![Page 9: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/9.jpg)
Who can Lumify help?
![Page 10: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/10.jpg)
Intelligence Analyst
Lumify helps analysts fuse structured and unstructured data from myriad sources into actionable intelligence.
![Page 11: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/11.jpg)
Police Investigator
Law enforcement personnel can use Lumify to explore criminal networks, uncover hidden connections, and develop leads.
![Page 12: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/12.jpg)
Financial Analyst
Lumify analyzes financial data and transaction records to help detect fraud and identify possible insider threats.
photo: Ken Teegardin (https://flic.kr/p/9rn9Yh)
![Page 13: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/13.jpg)
Research Staff
Scientists, law firms, news organizations, and others can track their research in Lumify to unearth latent knowledge and discover critical new insights.
photo: UK National Archives (http://bit.ly/1n9dhR8)
![Page 14: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/14.jpg)
Why Lumify?
![Page 15: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/15.jpg)
Free and Open Source
• Distributed under the permissive Apache 2.0 license
• No restrictions on modifications
• No licensing or usage constraints
![Page 16: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/16.jpg)
Built on Scalable Open Source Tech
Hadoop CDH 4
Accumulo
Elasticsearch
tesseract, CLAVIN, CMU Sphinx, OpenNLP, OpenCV, ffmpeg
Apache Storm
Secure Graph
custom code
![Page 17: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/17.jpg)
Highly Secure
• Separate security restrictions at the entity, property, and relationship level
• Implemented in and enforced by Accumulo cell-level security
Joaquin Guzman Loera
DOB: 1957-04-04 POB: Badiraguato Nationality: Mexican
Zarka de Mexico
Founded: 2010-01-11 Location: Mexico City Employees: 121
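To make element-level restrictions concrete, here is a minimal Python sketch of filtering an entity's properties by a reader's authorizations. It is not Lumify, Secure Graph, or Accumulo code: the classes, the `view` function, and the `pii` label are hypothetical, and the check is simplified to "the reader must hold every required label", whereas Accumulo visibility expressions can also combine labels with OR and parentheses.

```python
from dataclasses import dataclass, field


@dataclass
class Property:
    name: str
    value: str
    required: frozenset = frozenset()   # labels a reader must hold (hypothetical)


@dataclass
class Entity:
    title: str
    required: frozenset = frozenset()
    properties: list = field(default_factory=list)


def visible(required, authorizations):
    # Simplification: every required label must be held (a pure AND).
    return required <= authorizations


def view(entity, authorizations):
    """Return the properties this reader may see, or None if the entity is hidden."""
    if not visible(entity.required, authorizations):
        return None
    return {p.name: p.value for p in entity.properties
            if visible(p.required, authorizations)}


# Entity values come from the slide; the "pii" label is an assumption.
person = Entity(
    "Joaquin Guzman Loera",
    properties=[
        Property("Nationality", "Mexican"),
        Property("DOB", "1957-04-04", frozenset({"pii"})),
    ],
)

public_view = view(person, frozenset())
cleared_view = view(person, frozenset({"pii"}))
```

Here `public_view` contains only the nationality, while `cleared_view` also includes the date of birth. A relationship could carry a `required` set in the same way, so that two readers looking at the same graph see different links.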
![Page 18: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/18.jpg)
Supported
• Full-time development staff
• Custom development and customization services
• Commercial support offerings
![Page 19: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/19.jpg)
AWS Compatible
• Day-to-day development done on Amazon infrastructure
• Primarily use EC2, VPC, S3, SES, CloudWatch
• Altamira is an AWS consulting partner
![Page 20: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/20.jpg)
Natural Language Processing in Lumify
![Page 21: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/21.jpg)
Text Extraction
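[Diagram: inputs are video, text docs, structured data, images, and audio. Images go through OCR (tesseract) and audio through CMU Sphinx; CMU Sphinx, OCR (tesseract), and an extractor step are also shown.]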
![Page 22: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/22.jpg)
Text Enrichment
• Apache OpenNLP
  • Named Entity Recognition
  • Extracts names of entities from unstructured text
  • Persons, Orgs, & Locations
  • Highlighted in preview text
  • User must confirm/resolve
• CLAVIN
  • Geospatial Entity Resolution
  • Resolves extracted location names to gazetteer records
  • Solves “Springfield problem”
  • Disambiguates place names
  • Turns text docs into maps!
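The “Springfield problem” mentioned above can be illustrated with a toy resolver in Python. This is a sketch under stated assumptions, not CLAVIN's algorithm or API: the `GAZETTEER` table and `resolve` function are hypothetical, and the only heuristic is to pick the candidate whose state is also named in the surrounding text. When context does not settle the question, the name is left unresolved for a human to confirm.

```python
# Toy gazetteer (hypothetical structure): one place name, several candidate records.
GAZETTEER = {
    "springfield": [
        {"name": "Springfield", "admin1": "Illinois"},
        {"name": "Springfield", "admin1": "Missouri"},
        {"name": "Springfield", "admin1": "Massachusetts"},
    ],
    "chicago": [{"name": "Chicago", "admin1": "Illinois"}],
}


def resolve(place, text):
    """Return (record, candidates); record is None when the name stays ambiguous."""
    candidates = GAZETTEER.get(place.lower(), [])
    if len(candidates) == 1:
        return candidates[0], candidates
    # Use the surrounding text as context: keep candidates whose state is mentioned.
    lowered = text.lower()
    matches = [c for c in candidates if c["admin1"].lower() in lowered]
    if len(matches) == 1:
        return matches[0], candidates
    # No decisive context: leave it for a user to confirm or resolve.
    return None, candidates


resolved, _ = resolve("Springfield", "The meeting was held in Springfield, Missouri.")
unresolved, options = resolve("Springfield", "He drove to Springfield.")
```

In the first call the mention of Missouri selects a single record; in the second, `unresolved` is `None` and `options` holds all three candidates, mirroring the slide's point that the user must confirm or resolve extracted entities.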
![Page 23: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/23.jpg)
Enriched Text Documents
Machine-powered entity extraction and resolution, combined with human QA and supplementation, supports rich semantic analysis of raw text.
Drug Lord “El Chapo” Captured in Mexico
Published date: 2014/02/22
Source: Wikipedia
Although Guzman had long hidden successfully in remote areas of the Sierra Madre mountains, the arrested members of his security team told the military he had begun venturing out to Culiacan and the beach town of Mazatlan. A week prior to his capture, Guzman and Zambada were reported to have attended a family reunion in Sinaloa. The Mexican military followed the bodyguards' tips to Guzman’s ex-wife’s house, but they had trouble ramming the steel-reinforced front door, which allowed Guzman to escape through a system of secret tunnels that connected six houses, eventually moving south to Mazatlan. He planned to stay a few days in Mazatlan to see his twin baby daughters before retreating to the mountains. On 22 February 2014, at around 6:40 a.m., Mexican authorities arrested Guzman at a hotel in a beach front area in Mazatlan, Sinaloa, following an operation by the Mexican Navy, with joint intelligence from the DEA and
![Page 24: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/24.jpg)
Benefits to Users
Increases Discoverability: quickly find relevant data without reading
Helps Deal with Information Overload: machines process text faster than humans
Uncovers Hidden Connections: enables object-based analysis & investigations
![Page 25: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/25.jpg)
Future NLP Integration
Support other NER tools: e.g., Stanford NER, SUTime, MITIE
Event/Relationship Extraction: e.g., OpenIE (formerly ReVerb)
Coreference Resolution: augmenting/extending GATE/ANNIE
Additional Text Analytics: e.g., frequency analysis, topic modeling, sentiment analysis
Multilingual Support: use non-English language models for NER, etc.
![Page 26: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/26.jpg)
Graph Databases in Lumify
view part 2 of the presentation here: github.com/altamiracorp/secure-graph-presentation
![Page 27: Natural Language Processing and Graph Databases in Lumify](https://reader035.vdocuments.us/reader035/viewer/2022081720/53ed83b38d7f7289708b5d04/html5/thumbnails/27.jpg)
Questions?
more info: lumify.io