TRANSCRIPT
Names are the linchpin that connects data points in financial compliance, anti-fraud, government intelligence, law enforcement, and identity verification. Yet names are challenging to connect because they vary so widely through misspellings, nicknames, initials, and titles. In international databases, a single name may also appear in many languages!
Rosette® Name Indexer (RNI) solves these challenges with a linguistic, knowledge-based system that compares and matches names of people, places, and organizations despite their many variations. RNI is unrivalled in its ability to match names because of its intelligent approach.
As linguistics experts with deep understanding at the intersection of language and technology, Basis Technology continually improves the Rosette product family with language additions, feature updates, and the latest innovations from the academic world. Find out how your organization can use this technology to match the names of entities.
Accurate fuzzy name matching in many languages
14 Supported Languages
KEY FEATURES
- Component of the Rosette SDK
- Simple API
- Fast and scalable
- Industrial-strength support
- Easy installation
- Flexible and customizable
- Java
- Unix, Linux, Mac, or Windows
- Matches names of people, places, and organizations
- Increases name search accuracy
- Ranks results by relevancy with a similarity score
- Built to work with Apache™ Solr and Elasticsearch
www.basistech.com [email protected]
+1 617-386-2090
Start using RNI today. Try our free product evaluation at www.basistech.com.
Figure: sample match results for the record “Franklin D. Roosevelt” (32nd U.S. President; ID: USPRES32; DOB: Jan. 30, 1882). Name variants are matched to the record with similarity scores ranging from 73% to 97%:
- 富兰克林·罗塞费尔特
- Gov. Franklin Roosevelt
- Frank Delano Roosevelt
- Franklin Rosenvelt
- President Roosevelt
- Рузвельт, Франклин
- F. D. R.
- F. D. Roosev
Record description: Franklin Delano Roosevelt, also known by his initials, FDR, was the 32nd President of the United States and a central figure in world events during the mid-20th century, leading the United States during....
Rosette® BIG TEXT ANALYTICS
- RLI (Rosette Language Identifier): Identify languages and encodings
- RBL (Rosette Base Linguistics): Search many languages with high accuracy
- REX (Rosette Entity Extractor): Tag names of people, places, and organizations
- RNI (Rosette Name Indexer): Match names between many variations
- RNT (Rosette Name Translator): Translate foreign names into English
- RCA (Rosette Categorizer): Categorize everything in sight
- RSA (Rosette Sentiment Analyzer): Detect the sentiments of your text
- RES (Rosette Entity Resolver): Make real-world connections in your data
Results: Better Search, Tagged Entities, Real Identities, Matched Names, Sorted Languages, Translated Names, Sorted Content, Actionable Insights
Our knowledge-based system applies the latest Natural Language Processing (NLP) techniques to match names intelligently, based on their linguistic and cultural structures and norms.
Unlike expensive and less accurate legacy solutions driven by thousands of spelling variants from known names, RNI analyzes the intrinsic structure of each name component and performs an intelligent comparison using advanced linguistic algorithms.
Our approach is not limited to a particular list of variants and reduces the likelihood of both “false positives” (wrong matches) and “false negatives” (missed matches).
List-driven systems cannot equal RNI at matching never-before-seen names or mis-segmented names (Mary Ellen vs. MaryEllen).
Available Languages and Scripts
- Arabic script: Arabic, Persian, Pashto, Urdu
- Cyrillic: Russian
- Hangul: Korean
- Hanzi (Simplified & Traditional): Chinese
- Kanji, Katakana, Hiragana: Japanese
- Roman script: English, Spanish, French, Italian, German, Portuguese
RNI matches names from these languages either in transliteration to English or written in their native scripts.
Name Matching Capabilities
Code Base Platform Support
Compatibility
- Same name in multiple languages: Mao Zedong ↔ Мао Цзэдун ↔ 毛泽东
- Phonetic spelling differences: Cairns ↔ Kearns ↔ Kerns
- Transliteration spelling differences: Abdul Rasheed ↔ Abd-al-Rasheed ↔ Abdulrashid
- Nicknames: William ↔ Will ↔ Bill ↔ Billy
- Initials: J. E. Smith ↔ James Earl Smith
- Titles and honorifics: Dr. ↔ Mr. ↔ Ph.D.
- Out-of-order name components: Diaz, Carlos Alfonzo ↔ Carlos Alfonzo Diaz
- Missing name components: Phillip Charles Carr ↔ Phillip Carr
- Missing spaces or hyphens: MaryEllen ↔ Mary Ellen ↔ Mary-Ellen
- Truncated name components: McDonalds ↔ McD ↔ McDonald
- Name split inconsistently across database fields: Dick • Van Dyke ↔ Dick Van • Dyke
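To make a few of these variation types concrete, here is a minimal Python sketch, not RNI's method. It handles only the purely structural cases (out-of-order components, missing spaces or hyphens, names split across fields) with simple token normalization. The function names `name_tokens`, `same_components`, and `same_ignoring_breaks` are hypothetical and chosen for illustration. Phonetic, transliteration, nickname, and cross-script matches are out of reach for this kind of normalization, which is where a linguistic matcher is needed.

```python
import re
import unicodedata


def name_tokens(name: str) -> list:
    """Split a name into lowercase, accent-free tokens."""
    # Fold case and strip combining accents so "DÍAZ" and "Diaz" compare equal.
    decomposed = unicodedata.normalize("NFKD", name.casefold())
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Treat whitespace, commas, periods, and hyphens as token separators.
    return [t for t in re.split(r"[\s,.\-]+", plain) if t]


def same_components(a: str, b: str) -> bool:
    """True when both names contain the same tokens, in any order."""
    return sorted(name_tokens(a)) == sorted(name_tokens(b))


def same_ignoring_breaks(a: str, b: str) -> bool:
    """True when the names are equal once spaces and hyphens are removed."""
    return "".join(name_tokens(a)) == "".join(name_tokens(b))


# Out-of-order components
print(same_components("Diaz, Carlos Alfonzo", "Carlos Alfonzo Diaz"))  # True
# Missing spaces or hyphens
print(same_ignoring_breaks("MaryEllen", "Mary-Ellen"))                 # True
# Name split inconsistently across two database fields
print(same_components(" ".join(["Dick", "Van Dyke"]),
                      " ".join(["Dick Van", "Dyke"])))                 # True
# Phonetic variants are NOT caught by structural normalization
print(same_components("Cairns", "Kearns"))                             # False
```

The last call shows the limit of the approach: "Cairns" and "Kearns" share no normalized token, so catching them requires knowledge of how names sound rather than how they are segmented.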
© 2015 Basis Technology Corporation. “Basis Technology Corporation”, “Rosette”, and “Highlight” are registered trademarks of Basis Technology Corporation. “Big Text Analytics” is a trademark of Basis Technology Corporation. All other trademarks, service marks, and logos used in this document are the property of their respective owners. (2015-06-29-RNI)
WEST COAST
1700 Montgomery St., San Francisco, CA 94111
FEDERAL
2553 Dulles View Dr., Suite 450, Herndon, VA 20171
HEADQUARTERS
One Alewife Center, Cambridge, MA 02140
EUROPE
Furzeground Way, Middlesex UB11 1BD, UK
ASIA
9-6 Nibancho, Chiyoda-ku, Tokyo 102-0084, Japan
The Rosette Advantage
Financial Compliance
Financial institutions use RNI to manage and update watchlists to block terrorist access to funds, simultaneously avoiding compliance violations and protecting their reputation. Applications also include fraud detection, anti-money laundering, and document triage.
Government Intelligence
Names are often the most critical data point in intelligence, law enforcement, and border control. RNI is being adopted throughout the U.S. government to address the challenge of matching names in all their variations, particularly names from languages written in non-Latin scripts such as Arabic, Russian, Chinese, Korean, or Persian.
Identity Verification in the Sharing Economy
Trust is foundational to the sharing economy. Whether booking room rentals, rides, or odd jobs, it is important to establish ways to connect the online and offline worlds to reinforce that trust and confidence.
Name matching is a key component of verifying online identities against real-world documentation (passports, driver’s licenses). Members of the sharing economy such as Airbnb rely on RNI to match names originating from all over the world, including names written in scripts other than the Roman alphabet.
Integration Options
Rosette® Name Indexer integrates easily into Apache Solr™ as a plug-in or into applications as a Java library to support its main use cases. RNI can also be adapted to match the needs of each application.
Apache Solr
Apache Solr™-based search systems can add high-quality fuzzy name matching to every search by adding name fields. RNI provides a special Solr field type for names. With this field type, Solr can index documents with multiple name fields, each with multiple values (e.g., an “alias” field may contain more than one name). Each document can also contain non-name fields such as dates or plain text.
<field name="primary">Muhammad Ali</field>
<field name="alias">Cassius Clay Jr</field>
<field name="alias">The Greatest</field>
<field name="dob">1/7/1942</field>
A single query can then be constructed that gives different weight to the various fields. For example, a single query can find movies starring “Binedict Cumberbund” with screenplays by “Giyermo Diltoro” that were released around 2014.
Java Library
Any application that needs name matching can directly integrate a Java library, which takes care of storing watchlists without incurring the overhead of a web-service call.
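The idea of a single query that weights several fields differently, as in the movie example for Solr, can be sketched in a few lines of Python. This is an illustration only and does not use Solr or RNI: `difflib.SequenceMatcher` from the standard library stands in for the name matcher, and the records, field names, weights, and the functions `name_score`, `year_score`, and `rank` are all hypothetical.

```python
from difflib import SequenceMatcher


def name_score(a: str, b: str) -> float:
    """Stand-in string similarity in [0, 1]; not a linguistic name matcher."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def year_score(query_year: int, year: int, window: int = 2) -> float:
    """1.0 for the same year, falling linearly to 0.0 once the gap exceeds the window."""
    return max(0.0, 1.0 - abs(query_year - year) / (window + 1))


def rank(movies, query, weights):
    """Return (score, title) pairs, best first, from a weighted average of field scores."""
    total = sum(weights.values())
    results = []
    for m in movies:
        fields = {
            # A multi-valued name field scores as its best-matching value.
            "star": max(name_score(query["star"], s) for s in m["stars"]),
            "writer": name_score(query["writer"], m["writer"]),
            "year": year_score(query["year"], m["year"]),
        }
        score = sum(weights[f] * fields[f] for f in weights) / total
        results.append((score, m["title"]))
    return sorted(results, reverse=True)


# Made-up sample records for illustration only.
MOVIES = [
    {"title": "Film A", "stars": ["Benedict Cumberbatch"],
     "writer": "Guillermo del Toro", "year": 2013},
    {"title": "Film B", "stars": ["Maria Silva"],
     "writer": "Chen Wei", "year": 2014},
]
QUERY = {"star": "Binedict Cumberbund", "writer": "Giyermo Diltoro", "year": 2014}
WEIGHTS = {"star": 2.0, "writer": 2.0, "year": 1.0}  # names count more than the date

results = rank(MOVIES, QUERY, WEIGHTS)
for score, title in results:
    print(f"{title}: {score:.2f}")
```

With the name fields weighted above the date, "Film A" ranks first even though its release year is one off and both names in the query are misspelled.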
Customize To Your Needs
- Set the minimum threshold of the similarity score to manage the precision and recall of the returned search results.
- Ignore a given list of words (“stopwords”) with respect to matching (e.g., titles, honorifics).
- Force two name words to always match with a given score (e.g., “Elizabeth” and “Lisbeth” always match at 90%).
- Force two names to always match with a given score (e.g., “John Doe” and “Joe Bloggs” always match at 95%).
- Link multiple names to a single individual (e.g., queries for "Marilyn Monroe" and "Norma Jeane Mortensen" include the same person).
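The customization options listed for RNI can be pictured with a toy matcher. The Python sketch below is not RNI's API or algorithm: it uses `difflib.SequenceMatcher` as a stand-in similarity measure, and every name in it (`STOPWORDS`, `TOKEN_OVERRIDES`, `NAME_OVERRIDES`, `name_score`, `search`, the person IDs) is a hypothetical chosen to mirror one option each: a score threshold, stopwords, forced word-level and name-level scores, and several names linked to one individual.

```python
from difflib import SequenceMatcher

# Hypothetical configuration mirroring the customization options.
STOPWORDS = {"dr", "mr", "mrs", "ms", "phd"}                    # ignored when matching
TOKEN_OVERRIDES = {frozenset({"elizabeth", "lisbeth"}): 0.90}   # forced word-level score
NAME_OVERRIDES = {frozenset({"john doe", "joe bloggs"}): 0.95}  # forced name-level score


def tokens(name):
    """Lowercase the name, drop periods and commas, and remove stopwords."""
    words = name.casefold().replace(".", "").replace(",", " ").split()
    return [w for w in words if w not in STOPWORDS]


def token_score(a, b):
    """Forced score if configured, otherwise a stand-in string similarity."""
    forced = TOKEN_OVERRIDES.get(frozenset({a, b}))
    return forced if forced is not None else SequenceMatcher(None, a, b).ratio()


def name_score(a, b):
    """Similarity of two full names in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    forced = NAME_OVERRIDES.get(frozenset({" ".join(ta), " ".join(tb)}))
    if forced is not None:
        return forced
    if not ta or not tb:
        return 0.0
    # Average, over the shorter name's tokens, of the best match in the other name.
    short, long_ = (ta, tb) if len(ta) <= len(tb) else (tb, ta)
    return sum(max(token_score(s, t) for t in long_) for s in short) / len(short)


def search(query, watchlist, threshold=0.8):
    """Return (score, person_id) pairs at or above the threshold, best first.

    `watchlist` maps one person ID to all of that person's known names.
    """
    hits = []
    for person_id, names in watchlist.items():
        best = max(name_score(query, n) for n in names)
        if best >= threshold:
            hits.append((best, person_id))
    return sorted(hits, reverse=True)


WATCHLIST = {
    "P1": ["Marilyn Monroe", "Norma Jeane Mortensen"],
    "P2": ["Joe Bloggs"],
}
print(search("Norma Jean Mortensen", WATCHLIST))  # finds P1 through the linked name
print(search("Mr. John Doe", WATCHLIST))          # finds P2 through the forced score
```

Raising `threshold` trades recall for precision: fewer weak matches are returned, at the cost of possibly missing a true match with an unusual spelling.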
Use Cases