

SUMMA Platform

Renars Liepins, Didzis Gosko and Guntis Barzdins

[email protected], [email protected], [email protected]

SUMMA PLATFORM ARCHITECTURE - V0.5.1

Figure: SUMMA Architecture. Labels in the diagram:

News Feed (three shown)
Livestreams
Live Streams (Audio, Video)
Data Pull
News Occurrences (Text, Audio, Video)
NLP Logical Segmenter
AV Store
2 Week Data Purge
Message Queue
write
publish
NLP Job Queries
NLP Worker Wrapper (three shown)
NLP Service (ASR)
NLP Service (MT)
NLP Service (ET)
...
REST API
Cache generator
GUI Backend
SUMMA Prototype GUI, http://summa.leta.lv/demo
docker compose
SUMMA Platform
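The diagram names an NLP Worker Wrapper placed between the Message Queue and each NLP Service. The following is a minimal sketch of that pattern, not the SUMMA implementation: it uses Python's in-process `queue.Queue` as a stand-in for the real message queue, and the job fields (`itemId`, `payload`), the `run_worker` function, and the `fake_mt` service are all hypothetical names chosen for illustration.

```python
import queue
from typing import Callable


def run_worker(jobs: queue.Queue, results: queue.Queue,
               service: Callable[[str], str], task: str) -> int:
    """Drain `jobs`, call the wrapped service on each payload, publish results.

    Returns the number of jobs handled. The job layout is an assumption.
    """
    handled = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return handled
        output = service(job["payload"])
        results.put({"itemId": job["itemId"], "task": task, "output": output})
        handled += 1


def fake_mt(text: str) -> str:
    """Stand-in for a real NLP service such as MT (hypothetical)."""
    return f"[en] {text}"


jobs: queue.Queue = queue.Queue()
results: queue.Queue = queue.Queue()
jobs.put({"itemId": "item-1", "payload": "Sveika, pasaule"})

handled = run_worker(jobs, results, fake_mt, task="MT")
first_result = results.get_nowait()
print(handled, first_result)
```

Keeping the queue handling in the wrapper means the service itself only needs to expose a single text-in, text-out call, which is one plausible reason for the wrapper layer shown in the diagram.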

SUMMA NLP PIPELINE - V0.4.0

Figure: NLP Pipeline. Flowchart labels:

Logical Textual MediaOccurance (tweet, blogpost, news article)
Detect Text Language (??)
Live Feed (audio, video)
Logical A/V MediaOccurance (news video, news audio)
Segment into Logical A/V MediaOccurances
Save to AV Store
Detect Speech Language (??)
Convert to Text (ASR)
Add Punctuation and Capitalisation
Translate to English (MT)
Pseudonymize Social Media (PSM)
Extract Entities (ET)
Extract Relationships
Detect IPTC Topics
Sentiment Analysis
Detect Storyline (Clusterisation)
Summarise Storyline
End
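The flowchart distinguishes textual items from A/V items, with speech recognition applying only to the latter. The sketch below illustrates that branching with stub steps. The exact step order is an assumption, since the flattened labels do not preserve the arrows, and the `kind` field, `step` helper, and `process` function are hypothetical.

```python
from typing import Callable, Dict, List

Step = Callable[[Dict], Dict]


def step(name: str) -> Step:
    """Build a stub step that only records its name in the item's trace."""
    def run(item: Dict) -> Dict:
        return {**item, "trace": item.get("trace", []) + [name]}
    return run


# Assumed ordering; only the step names come from the flowchart labels.
TEXT_STEPS: List[Step] = [
    step("detect_text_language"),
    step("translate_to_english"),
]
AV_STEPS: List[Step] = [
    step("detect_speech_language"),
    step("convert_to_text"),
    step("add_punctuation_and_capitalisation"),
    step("translate_to_english"),
]
SHARED_STEPS: List[Step] = [
    step("extract_entities"),
    step("detect_iptc_topics"),
    step("detect_storyline"),
    step("summarise_storyline"),
]


def process(item: Dict) -> Dict:
    """Run the branch-specific steps, then the steps common to both branches."""
    branch = AV_STEPS if item["kind"] == "av" else TEXT_STEPS
    for run in branch + SHARED_STEPS:
        item = run(item)
    return item


text_trace = process({"kind": "text"})["trace"]
av_trace = process({"kind": "av"})["trace"]
print(text_trace)
print(av_trace)
```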

SUMMA REST API SCHEMA - V0.4.1

NamedEntities
  baseForm    String
  type        String
  timeAdded   Datetime
  relations   [??]

Feeds
  name        String
  feedType    ["dwFeed", "tweet", "RSS"]
  url         String
  isActive    Bool
  groups      [{id, name}]

FeedGroups
  name        String
  feeds       [Feeds]
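As an illustration of the Feeds and FeedGroups entries, here is a minimal sketch using Python dataclasses. The field names and the three `feedType` values come from the schema; the default values, the validation in `__post_init__`, the `add` helper, and the example URL are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List

FEED_TYPES = ("dwFeed", "tweet", "RSS")  # values listed in the schema


@dataclass
class Feed:
    name: str
    feedType: str
    url: str
    isActive: bool = True  # default is an assumption
    groups: List[Dict[str, str]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject feed types that the schema does not list.
        if self.feedType not in FEED_TYPES:
            raise ValueError(f"unknown feedType: {self.feedType!r}")


@dataclass
class FeedGroup:
    name: str
    feeds: List[Feed] = field(default_factory=list)

    def add(self, feed: Feed, group_id: str) -> None:
        """Link a feed and this group in both directions (hypothetical helper)."""
        self.feeds.append(feed)
        feed.groups.append({"id": group_id, "name": self.name})


rss = Feed(name="Example RSS", feedType="RSS", url="http://example.org/rss")
baltic = FeedGroup(name="Baltic news")
baltic.add(rss, group_id="g1")
print(rss.groups)
```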

Users
  name         String
  email        String
  password     String
  role         [user, admin]
  isSuspended  Bool

Feedback
  user        User
  guiPath     String
  comment     String
  rating      [not-set, thumbs-up, thumbs-down]
  screenshot  Base64 String
  timeAdded   Datetime
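The Feedback entry stores a screenshot as a Base64 string and restricts `rating` to three values. A minimal sketch of building such a record follows. The `build_feedback` function is hypothetical, and representing `user` by an identifier and `timeAdded` as an ISO 8601 string are assumptions, since the schema does not state the wire format.

```python
import base64
from datetime import datetime, timezone
from typing import Dict, Optional

RATINGS = ("not-set", "thumbs-up", "thumbs-down")  # values listed in the schema


def build_feedback(user_id: str, gui_path: str, comment: str,
                   rating: str = "not-set", screenshot: bytes = b"",
                   now: Optional[datetime] = None) -> Dict[str, str]:
    """Assemble a Feedback record with a Base64-encoded screenshot."""
    if rating not in RATINGS:
        raise ValueError(f"unknown rating: {rating!r}")
    now = now or datetime.now(timezone.utc)
    return {
        "user": user_id,  # assumption: user referenced by id
        "guiPath": gui_path,
        "comment": comment,
        "rating": rating,
        "screenshot": base64.b64encode(screenshot).decode("ascii"),
        "timeAdded": now.isoformat(),  # assumption: ISO 8601 timestamp
    }


feedback = build_feedback(
    "user-1", "/stories/42", "Summary is missing",
    rating="thumbs-down", screenshot=b"PNG",
    now=datetime(2019, 1, 30, tzinfo=timezone.utc),
)
print(feedback["screenshot"])
```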

Query
  name                   String
  user                   User
  feedGroups             [FeedGroup]
  namedEntities          [baseFormStr]
  namedEntityFilterType  [OR, AND]
  trending               {baseForm: {-hour: count}}
  stories                [Story]
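The `namedEntityFilterType` field suggests that a Query matches an item either when any of its `namedEntities` is present (OR) or only when all of them are (AND). The sketch below shows that reading. The `matches_query` function is hypothetical, and treating an empty entity list as "match everything" is an assumption.

```python
from typing import Iterable


def matches_query(item_entities: Iterable[str],
                  query_entities: Iterable[str],
                  filter_type: str = "OR") -> bool:
    """Return True if an item's entity base forms satisfy the query filter."""
    present = set(item_entities)
    wanted = set(query_entities)
    if not wanted:
        return True  # assumption: no entity filter means no restriction
    if filter_type == "AND":
        return wanted <= present  # every wanted entity must be present
    if filter_type == "OR":
        return bool(wanted & present)  # at least one wanted entity is present
    raise ValueError(f"unknown namedEntityFilterType: {filter_type!r}")


item = ["Riga", "LETA", "European Union"]
print(matches_query(item, ["Riga", "Tallinn"], "OR"))
print(matches_query(item, ["Riga", "Tallinn"], "AND"))
```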

MediaItems
  source              Feed {id, name}
  title               {original: String, english: String}
  summary             String
  teaser              {original: String, english: String}
  mainText            {original: String, english: String}
  originalMultiMedia  {sourceItemVideoURL, sourceItemAudioURL, sourceitemPhotoURL}
  transcript          {original: {text, wordTimestampsAndConfidences}, english: {text, wordTimestampsAndConfidences}}
  namedEntities       {entities: {baseForm: Entity}, mentionsIn: {title: {baseForm: []}, summary: {baseForm: []}, teaser: {baseForm: []}, mainText: {baseForm: []}, transcript: {baseForm: []}}}
  sentiment           ??
  keywords            [String]
  story               Story
  timeAdded           Datetime
  timeLastChanged     Datetime
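The `namedEntities` field of a media item nests an entity table under `entities` and per-field mention lists under `mentionsIn`. The sketch below builds a small example of that shape and counts mentions per entity. The contents of each mention list are not specified in the schema, so the character-offset pairs, the entity `type` values, and the `mention_counts` helper are assumptions.

```python
from typing import Dict

# Example value shaped like the namedEntities field; the data is made up.
named_entities = {
    "entities": {
        "Riga": {"baseForm": "Riga", "type": "location"},
        "LETA": {"baseForm": "LETA", "type": "organization"},
    },
    "mentionsIn": {
        "title": {"Riga": [[0, 4]]},
        "summary": {},
        "teaser": {},
        "mainText": {"Riga": [[10, 14], [52, 56]], "LETA": [[30, 34]]},
        "transcript": {},
    },
}


def mention_counts(ne: Dict) -> Dict[str, int]:
    """Count mentions of each entity across all text fields of a media item."""
    counts = {base_form: 0 for base_form in ne["entities"]}
    for field_mentions in ne["mentionsIn"].values():
        for base_form, mentions in field_mentions.items():
            counts[base_form] = counts.get(base_form, 0) + len(mentions)
    return counts


counts = mention_counts(named_entities)
print(counts)
```

Keying both `entities` and every `mentionsIn` map by `baseForm` lets a client look up an entity once and then find its mentions in any field without scanning.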

Story
  title        String
  mediaItems   [MediaItemId]
  timeChanged  Datetime
  summary      String

REST API Schema

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement 688139 (SUMMA).