Alchemist Hour: Entity Extraction

Uploaded by: alchemyapi

Posted on 30-Jul-2015

TRANSCRIPT

1. AlchemyAPI Entity Extraction

2. The housekeeping items
    Webinar slides, recording and Q&A will be emailed
    Enter questions in chat on webinar panel
    Interact with us on Twitter @alchemyapi
    Use #AlchemistHour

3. What will be covered today?
    Entities: what they are and why we care!
    Sample calls
    Parameter explanation

4. What is an entity?

5. A person
    "entities": [
      {
        "type": "Person",
        "relevance": "0.33",
        "count": "1",
        "text": "Douglas Adams",
        "disambiguated": {
          "subType": [
            "MusicalArtist", "Writer", "CompanyFounder", "Dedicatee", "FilmProducer",
            "FilmWriter", "TVWriter", "VideoGameDesigner", "TVActor"
          ],
          "name": "Douglas Adams",
          "website": "http://www.douglasadams.com/",
          "dbpedia": "http://dbpedia.org/resource/Douglas_Adams",
          "freebase": "http://rdf.freebase.com/ns/m.0282x",
          "opencyc": "http://sw.opencyc.org/concept/Mx4rwPVThpwpEbGdrcN5Y29ycA",
          "yago": "http://yago-knowledge.org/resource/Douglas_Adams",
          "musicBrainz": "http://zitgist.com/music/artist/e9ed318d-8cc5-4cf8-ab77-505e39ab6ea4"
        }
      }
    ]

6. A company
    "entities": [
      {
        "type": "Company",
        "relevance": "0.33",
        "count": "1",
        "text": "spacex",
        "disambiguated": {
          "subType": [ "RocketEngineDesigner", "RocketManufacturer" ],
          "name": "SpaceX",
          "website": "http://www.spacex.com/",
          "dbpedia": "http://dbpedia.org/resource/SpaceX",
          "freebase": "http://rdf.freebase.com/ns/m.03fkyw",
          "yago": "http://yago-knowledge.org/resource/SpaceX",
          "crunchbase": "http://www.crunchbase.com/company/spacex"
        }
      }
    ]

7. A facility
    "entities": [
      {
        "type": "Facility",
        "relevance": "0.33",
        "count": "1",
        "text": "White House"
      }
    ]

8. A sport
    "entities": [
      {
        "type": "Sport",
        "relevance": "0.33",
        "count": "1",
        "text": "skiing",
        "disambiguated": {
          "subType": [ "Location", "TouristAttraction", "CauseOfDeath" ],
          "name": "Skiing",
          "dbpedia": "http://dbpedia.org/resource/Skiing",
          "freebase": "http://rdf.freebase.com/ns/m.071k0",
          "opencyc": "http://sw.opencyc.org/concept/Mx4rvVi-m5wpEbGdrcN5Y29ycA"
        }
      }
    ]

9. Making API Calls - cURL
    curl 'http://access.alchemyapi.com/calls/text/TextGetRankedNamedEntities?text=President+Obama+visited+China+on+Friday&apikey='$API_KEY'&outputMode=json'

    "entities": [
      {
        "type": "Person",
        "relevance": "0.33",
        "count": "1",
        "text": "Obama",
        "disambiguated": {
          "subType": [ "Politician", "President" ],
          "name": "Barack Obama",
          "website": "http://www.whitehouse.gov/",
          "dbpedia": "http://dbpedia.org/resource/Barack_Obama",
          "freebase": "http://rdf.freebase.com/ns/m.02mjmr",
          "yago": "http://yago-knowledge.org/resource/Barack_Obama"
        }
      },
      {
        "type": "Country",
        "relevance": "0.33",
        "count": "1",
        "text": "China"
      }
    ]

10. Parameters
    url: URL of the text to be analyzed; only used with endpoint URLGetRankedNamedEntities
    text: text to be analyzed; only used with endpoint TextGetRankedNamedEntities
    apikey: your API key
    outputMode: xml, json, rdf, rel-tag, rel-tag-raw
    jsonp: desired JSONP callback (requires json outputMode)

11. Parameters
    disambiguate: set to 1 to disambiguate detected entities
    linkedData: whether to include Linked Data content links with disambiguated entities. Requires disambiguate set to 1
    coreference: whether to resolve he/she/etc. coreferences into detected entities
    quotations: whether to enable quotations extraction. Disabled by default
    sentiment: whether to enable entity-level sentiment analysis. Disabled by default, incurs 1 additional transaction

12. Parameters
    sourceText: where to obtain the text that will be processed by this API call, such as an XPath query. For the full list of options, see the documentation
    knowledgeGraph: expose knowledge graph information with results. 0 by default, incurs 1 additional transaction
    structuredEntities: extract structured entities, such as quantities, email addresses, Twitter handles

13. Additional info
    Maximum document size for HTML documents is 600 KB. Remaining text after HTML cleaning must be less than 50 KB.
    Entity extraction is supported for eight languages: English, Spanish, Italian, German, French, Portuguese, Swedish, and Russian
    Disambiguation and quotation extraction are only available for English-language content
    A full list of entity types can be found in our documentation.

14. Remember, you can
    Get an API Key
    Download an SDK
    Check out the Getting Started Guides
    Ask me questions!

15. Q&A

16. What's next?
    Look out for a follow-up email with a copy of these slides, a recording of the webinar, Q&A recap, and additional resources
    The series continues bi-weekly on Wednesdays @ 12pm ET / 9am PT
    June 24 - Face Recognition

17. Contact us
    1-877-253-0308
    [email protected]
    www.alchemyapi.com
    Thank you for attending!
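
As a rough illustration of the cURL call on slide 9 and the parameters on slides 10-12, the sketch below assembles the same request URL in Python and reads entities out of a JSON response. It is a minimal sketch under stated assumptions, not an official SDK: `build_request_url` and `summarize_entities` are hypothetical helper names, the sample response is a trimmed copy of the slide 9 output wrapped in braces so that it parses as a JSON object, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

# Endpoint as shown on slide 9; the helpers below are illustrative only.
BASE_URL = "http://access.alchemyapi.com/calls/text/TextGetRankedNamedEntities"


def build_request_url(text, api_key, **options):
    """Assemble the GET URL for a text entity-extraction call (hypothetical helper)."""
    params = {"text": text, "apikey": api_key, "outputMode": "json"}
    # Optional parameters from slides 11-12, e.g. disambiguate=1 or sentiment=1.
    params.update(options)
    # urlencode escapes spaces as '+', matching the query string on slide 9.
    return BASE_URL + "?" + urlencode(params)


def summarize_entities(response_body):
    """Return (type, text, relevance, disambiguated name or None) for each entity."""
    data = json.loads(response_body)
    rows = []
    for entity in data.get("entities", []):
        # "disambiguated" is absent when no match was found (see "China" below).
        name = entity.get("disambiguated", {}).get("name")
        # The slides show relevance as a string, so convert it to a number.
        rows.append((entity["type"], entity["text"], float(entity["relevance"]), name))
    return rows


# Assumed response shape: the slide 9 fragment, trimmed and wrapped in an object.
SAMPLE_RESPONSE = """
{
  "entities": [
    {"type": "Person", "relevance": "0.33", "count": "1", "text": "Obama",
     "disambiguated": {"subType": ["Politician", "President"], "name": "Barack Obama"}},
    {"type": "Country", "relevance": "0.33", "count": "1", "text": "China"}
  ]
}
"""

if __name__ == "__main__":
    print(build_request_url("President Obama visited China on Friday",
                            "YOUR_API_KEY", disambiguate=1))
    for row in summarize_entities(SAMPLE_RESPONSE):
        print(row)
```

Passing optional parameters as keyword arguments keeps the required ones (`text`, `apikey`, `outputMode`) explicit while letting a caller switch on features such as `sentiment=1`, which the slides note incurs one additional transaction.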