SUMMA Platform
Renars Liepins, Didzis Gosko and Guntis Barzdins
SUMMA Architecture

SUMMA PLATFORM ARCHITECTURE - V0.5.1

[Figure: platform architecture diagram. Labelled components:]

Inputs
  News Feeds
  Live Streams (audio, video)

Ingestion and storage
  Data Pull
  Livestreams
  AV Store (2-week data purge)
  Cache generator
  News Occurrences (text, audio, video)
  NLP Logical Segmenter

Processing
  Message Queue (arrow labels: write, publish)
  NLP Job Queries
  NLP Worker Wrappers
  NLP Services: ASR, MT, ET, ...

Access
  REST API
  GUI Backend
  SUMMA Prototype GUI: http://summa.leta.lv/demo

Deployment
  docker compose
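The relationship between the Message Queue, the NLP Worker Wrappers and the NLP Services can be illustrated with a small sketch. It is not the SUMMA implementation: the standard-library `queue.Queue` stands in for the real message queue, and `run_worker_wrapper`, the job fields (`id`, `payload`) and the result fields (`task`, `output`) are hypothetical names chosen for illustration.

```python
import queue
from typing import Any, Callable, Dict

Job = Dict[str, Any]


def run_worker_wrapper(
    jobs: "queue.Queue[Job]",
    results: "queue.Queue[Job]",
    service: Callable[[str], str],
    task: str,
) -> int:
    """Drain the job queue, apply one NLP service, publish each result.

    Returns the number of jobs processed. The wrapper knows nothing about
    the service beyond "text in, text out" (an assumption of this sketch).
    """
    processed = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            break  # no more work in this sketch; a real worker would block
        output = service(job["payload"])
        results.put({"id": job["id"], "task": task, "output": output})
        jobs.task_done()
        processed += 1
    return processed


def fake_mt(text: str) -> str:
    """Stand-in for a machine translation service (hypothetical)."""
    return text.upper()
```

Keeping the wrapper generic means that adding another service (for example ASR or ET) only requires another callable with the same shape; the queue handling stays unchanged.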
NLP Pipeline

SUMMA NLP PIPELINE - V0.4.0

[Figure: NLP pipeline flowchart. Labelled inputs and steps:]

Inputs and media types
  Logical Textual MediaOccurrence (tweet, blog post, news article)
  Live Feed (audio, video)
  Logical A/V MediaOccurrence (news video, news audio)

Steps
  Save to AV Store
  Segment into Logical A/V MediaOccurrences
  Detect Speech Language (??)
  Convert to Text (ASR)
  Add Punctuation and Capitalisation
  Pseudonymize Social Media (PSM)
  Detect Text Language (??)
  Translate to English (MT)
  Detect IPTC Topics
  Extract Entities (ET)
  Extract Relationships
  Sentiment Analysis
  Detect Storyline (Clusterisation)
  Summarise Storyline
  End
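A pipeline of this kind can be sketched as a list of steps applied in turn to a media item, with audio/video items passing through extra speech steps before joining the textual path. The sketch below is an illustration under assumptions, not the SUMMA code: the step order, the item fields (`kind`, `speech`, `text`, `english`, `entities`) and every step body are hypothetical stand-ins for the real ASR, MT and ET services.

```python
from typing import Any, Callable, Dict, List

Item = Dict[str, Any]
Step = Callable[[Item], Item]


def convert_to_text(item: Item) -> Item:
    """ASR stand-in: copies a pre-supplied 'speech' string into 'text'."""
    return {**item, "text": item["speech"]}


def add_punctuation(item: Item) -> Item:
    """Capitalise the first letter and add a final full stop if missing."""
    text = item["text"]
    text = text[:1].upper() + text[1:]
    if not text.endswith((".", "!", "?")):
        text += "."
    return {**item, "text": text}


def translate_to_english(item: Item) -> Item:
    """MT stand-in: identity translation."""
    return {**item, "english": item["text"]}


def extract_entities(item: Item) -> Item:
    """ET stand-in: treats every capitalised word as an entity."""
    words = item["english"].split()
    entities = [w.strip(".,!?") for w in words if w[:1].isupper()]
    return {**item, "entities": entities}


def steps_for(item: Item) -> List[Step]:
    """A/V items get the speech steps first (order is an assumption)."""
    shared: List[Step] = [translate_to_english, extract_entities]
    if item["kind"] == "av":
        return [convert_to_text, add_punctuation] + shared
    return shared


def run_pipeline(item: Item) -> Item:
    for step in steps_for(item):
        item = step(item)
    return item
```

Because each step takes an item and returns a new one, further steps such as topic detection or sentiment analysis could be appended to the list without changing `run_pipeline`.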
REST API Schema

SUMMA REST API SCHEMA - V0.4.1

NamedEntities
  baseForm   String
  type       String
  timeAdded  Datetime
  relations  [??]

Feeds
  name      String
  feedType  ["dwFeed", "tweet", "RSS"]
  url       String
  isActive  Bool
  groups    [{id, name}]

FeedGroups
  name   String
  feeds  [Feeds]

Users
  name         String
  email        String
  password     String
  role         [user, admin]
  isSuspended  Bool

Feedback
  user              User
  guiPath           String
  comment           String
  rating            [not-set, thumbs-up, thumbs-down]
  screenshotBase64  String
  timeAdded         Datetime

Query
  name                   String
  user                   User
  feedGroups             [FeedGroup]
  namedEntities          [baseFormStr]
  namedEntityFilterType  [OR, AND]
  trending               {baseForm: {-hour: count}}
  stories                [Story]

MediaItems
  source              Feed {id, name}
  title               {original: String, english: String}
  summary             String
  teaser              {original: String, english: String}
  mainText            {original: String, english: String}
  originalMultiMedia  {sourceItemVideoURL, sourceItemAudioURL, sourceItemPhotoURL}
  transcript          {original: {text, wordTimestampsAndConfidences},
                       english: {text, wordTimestampsAndConfidences}}
  namedEntities       {entities: {baseForm: Entity},
                       mentionsIn: {title: {baseForm: []},
                                    summary: {baseForm: []},
                                    teaser: {baseForm: []},
                                    mainText: {baseForm: []},
                                    transcript: {baseForm: []}}}
  sentiment           ??
  keywords            [String]
  story               Story
  timeAdded           Datetime
  timeLastChanged     Datetime

Story
  title        String
  mediaItems   [MediaItemId]
  timeChanged  Datetime
  summary      String
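To make the Query fields concrete, the sketch below shows one way `namedEntities`, `namedEntityFilterType` and `trending` could be evaluated against media items. It is a reading of the schema rather than the documented behaviour of the SUMMA API: the assumptions are that an empty entity list matches every item, that `-hour` keys mean "whole hours before now", and that plain dictionaries mirror the JSON objects. The function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Any, Dict, List, Set

MediaItem = Dict[str, Any]
Query = Dict[str, Any]


def entity_names(item: MediaItem) -> Set[str]:
    """Base forms of all entities attached to a media item."""
    return set(item["namedEntities"]["entities"])


def matches_query(item: MediaItem, query: Query) -> bool:
    """Apply the query's entity filter in OR or AND mode."""
    wanted = set(query["namedEntities"])
    if not wanted:
        return True  # assumption: no entities means no filtering
    mode = query["namedEntityFilterType"]
    if mode == "AND":
        return wanted <= entity_names(item)
    if mode == "OR":
        return bool(wanted & entity_names(item))
    raise ValueError(f"unknown namedEntityFilterType: {mode!r}")


def trending(
    items: List[MediaItem], now: datetime, window_hours: int = 24
) -> Dict[str, Dict[int, int]]:
    """Build {baseForm: {-hour: count}} for items inside the window.

    The key -2 means "added between two and three hours before `now`"
    (an assumed interpretation of the schema's `-hour` key).
    """
    counts: Dict[str, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    for item in items:
        age = now - item["timeAdded"]
        hours_ago = int(age.total_seconds() // 3600)
        if 0 <= hours_ago < window_hours:
            for name in entity_names(item):
                counts[name][-hours_ago] += 1
    return {name: dict(per_hour) for name, per_hour in counts.items()}
```

With two items, one mentioning "Riga" and "LETA" and one mentioning only "Riga", an AND query for both entities keeps only the first item, while an OR query keeps both.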
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement 688139 (SUMMA).