Page 1: Ufa State Aviation Technical University

Ufa State Aviation Technical University

Distributed Collaborative Filtering System as a Prototype of a New Information Messaging Media

Paranoia: a web-based blog and RSS aggregation system

Grigory A. Makeev

Ufa, 2007

Page 2: Ufa State Aviation Technical University

Information messaging

A person, being an element of a social system, needs adequate information to interact with others. We therefore suppose that every person wishes to receive information messages that are:

• important from his own point of view (selectivity);
• timely (operativeness);
• as complete as possible, covering most of the existing important messages (pervasion).

However, natural limitations are evident:

• importance can be estimated only by the user himself;
• there are too many messages to handle in time;
• there are too many messages to process them all.

Page 3: Ufa State Aviation Technical University

Hypothesis: collaboration

At least until the semantics of natural languages can be processed effectively, the importance of a message will always be estimated initially by hand, by a human user.

• A single user has to process all messages manually.
• Many collaborating users can effectively process a large set of messages, exchanging the important messages they find.
• Can the importance of a message be estimated only once?
• Can a user use or trust the estimation made by arbitrary user(s)?

Page 4: Ufa State Aviation Technical University

Collaborative filtering problem

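[Figure: users U1, U2, ..., Un and a user Ui draw on a common message set M = {m1, m2, m3, m4, ..., mj}. Each user has his own importance estimate P1(mk), P2(mk), ..., Pn(mk), Pi(mk). The selections {m1, m3}, {m1, m2} and {m4, m2} of other users are known; the selection { ? } for Ui is to be found.]

Building a recommendation: for user Ui, find { ? } = M' ⊆ M, using the models and methods of recommender systems, subject to restrictions.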

Page 5: Ufa State Aviation Technical University

Recommender systems

Search engines:
• Google

Web-based recommender systems:
• GroupLens
• IOwl

Online stores:
• Amazon
• eBay

Resources with elements of social networks

General drawbacks of existing collaborative filtering systems:
• recommendations are built using data from all users, so the result has poor selectivity;
• centralization;
• vulnerability on the logical and physical layers;
• users lack control over the process;
• users lack an explanation of the results;
• the systems do not allow an objective estimation of efficiency.

Approaches to recommender systems:
• Content analysis
• Recommendation support systems
• Social data-mining
• Collaborative filtering

Page 6: Ufa State Aviation Technical University

An approach to collaborative filtering

• Users U1, U2, …, Ui;
• Every Ui controls a peer of a p2p network, identified by a pair of security keys;
• Every Ui manages a set of messages Mi;
• If a message is in Mi, Ui is said to recommend this message;
• Only user Ui may manage the messages of the set Mi;
• Other users may retrieve Mi, thereby receiving a recommendation from Ui.

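Data structures: messages

[Figure: user Ui holds a public key, a private key and a user name, with profile fields such as UserName: Ivanov I.I., Location: Ufa, Language: Russian. Ui owns channels (for example, "Climate"). Each channel holds messages, and each message carries its text (for example, "February, 13th, a strong hurricane approached central Antarctica") and a signature (Sgntr).]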

Page 7: Ufa State Aviation Technical University

An approach to collaborative filtering

• Users U1, U2, …, Ui;
• Every Ui controls a set of rates Ri: pairs (Uj, vij) with vij ∈ [0, 1], which may carry additional information, such as a channel;
• Only user Ui may manage the rates in Ri;
• Other users may retrieve Ri.

Data structures: rates

[Figure: user Ui keeps a list of rates per channel. For example, in channel "Climate" user Uj has rate 0.9, and in channel "Society" user Uk has rate 0.7.]
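The message and rate structures described on these two slides could be sketched roughly as follows. This is only an illustration in Python: the class and field names (Message, Rate, Peer, recommend, rate) are assumptions made for the sketch, not the names used in the original system, and the signature is kept as an opaque byte string rather than actually computed with the private key.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    text: str
    signature: bytes  # opaque here; in the scheme it is made with the author's private key


@dataclass(frozen=True)
class Rate:
    user: str      # identifier of the rated user Uj
    value: float   # v_ij, must lie in [0, 1]
    channel: str   # additional information attached to the rate

    def __post_init__(self):
        if not 0.0 <= self.value <= 1.0:
            raise ValueError(f"rate {self.value} is outside [0, 1]")


@dataclass
class Peer:
    user_name: str
    public_key: bytes  # identifies the peer; the private key never leaves its owner
    messages: dict = field(default_factory=dict)  # channel -> list of Message (the set M_i)
    rates: list = field(default_factory=list)     # the set R_i

    def recommend(self, channel: str, message: Message) -> None:
        """Put a message into M_i, i.e. recommend it."""
        self.messages.setdefault(channel, []).append(message)

    def rate(self, user: str, value: float, channel: str) -> None:
        """Add a rate (Uj, v_ij) for the given channel to R_i."""
        self.rates.append(Rate(user, value, channel))


ui = Peer("Ui", b"public-key-placeholder")
ui.recommend("Climate", Message("A strong hurricane approached central Antarctica", b"sig"))
ui.rate("Uj", 0.9, "Climate")
ui.rate("Uk", 0.7, "Society")
```

Only the owner calls recommend and rate; other peers would only read messages and rates, which matches the rule that only Ui manages Mi and Ri.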

Page 8: Ufa State Aviation Technical University

An approach to collaborative filtering

Extending the rates set and message aggregation

• Every user directly rates a limited number of users: those he knows of, or those he is reasonably sure of;
• Transitivity allows us to extend the set of users included in collaborative filtering for a particular user;
• The transitive rate is computed with a special function TRF(Ui, Uj, Uk), which is to be found;
• Messages retrieved from all users included in the filtering process are sorted by how many users recommended them and by what the rates of those users are;
• The message aggregation function AMF(m, R*i) is also to be found.

[Figure: a chain of users Ui, Uj, Uk with rates 0.9 and 0.8, giving TRF(Ui, Uj, Uk) = 0.8 * 0.9 = 0.72.]

Page 9: Ufa State Aviation Technical University

A proposed scheme of collaborative filtering

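Stage 1

1. The user computes an extended rates set of sufficient depth.

[Figure: the trust graph of UA with peers UB, UC, UD and UE; the edge rates shown are 0.9, 0.8, 0.8 and 0.7.]

Extended rates set of UA: (UD, 0.72, 1), (UB, 0.9, 0), (UC, 0.63, 1), (UE, 0.8, 0)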
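Stage 1 can be sketched as a breadth-first walk over the published rates. The sketch below makes several assumptions that the slide does not state: that the transitive rate is the product of the rates along a path (as in the 0.8 * 0.9 = 0.72 example), that the third element of each tuple is the depth (0 for a directly rated user), that the best product is kept when a user is reachable by several paths, and that the edges are UA→UB 0.9, UA→UE 0.8, UB→UD 0.8 and UB→UC 0.7. The function name extended_rates is hypothetical.

```python
from collections import deque


def extended_rates(rates, start, max_depth):
    """Return {user: (rate, depth)} for the users reachable from `start`.

    rates: {rater: {rated: value}}, every value in [0, 1].
    Depth 0 means "rated directly"; max_depth limits how far transitivity goes.
    """
    best = {}
    queue = deque((user, value, 0) for user, value in rates.get(start, {}).items())
    while queue:
        user, value, depth = queue.popleft()
        if user == start:
            continue  # a user does not rate himself transitively
        if user in best and best[user][0] >= value:
            continue  # an equal or better path is already known
        best[user] = (value, depth)
        if depth < max_depth:
            for nxt, v in rates.get(user, {}).items():
                # product TRF: multiply the rates along the path
                queue.append((nxt, value * v, depth + 1))
    return best


graph = {
    "UA": {"UB": 0.9, "UE": 0.8},
    "UB": {"UD": 0.8, "UC": 0.7},
}
for user, (rate, depth) in extended_rates(graph, "UA", max_depth=1).items():
    print(user, round(rate, 2), depth)
```

With these assumed edges the sketch yields the same four users, rates and depths as the tuples above. Because every rate is at most 1, a product never grows along a path, so cycles in the trust graph cannot make the walk loop forever.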

Page 10: Ufa State Aviation Technical University

A proposed scheme of collaborative filtering

Stages 2-3

2. Retrieving messages from many peers, the user computes an extended message set M*i, the unsorted result of collaborative filtering;
3. Calculating a value for every message, the user computes an extended message set MR*i, the sorted result of collaborative filtering.

[Figure: UA retrieves messages from the users of its extended rates set: UD (0.72) recommends {m1, m2}, UB (0.9) recommends {m1, m3}, UC (0.63) recommends {m4, m5}, UE (0.8) recommends {m1, m4}.]

Unsorted result M*i:

m    U    v
m1   UD   0.72
m2   UD   0.72
m1   UB   0.9
m3   UB   0.9
m4   UC   0.63
m5   UC   0.63
m1   UE   0.8
m4   UE   0.8

Sorted result MR*i:

m    U            v
m1   UD, UB, UE   2.42
m4   UC, UE       1.43
m3   UB           0.9
m2   UD           0.72
m5   UC           0.63
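Stages 2 and 3 can be illustrated with a small aggregation sketch. The presentation leaves the aggregation function AMF "to be found"; the sketch below simply sums the rates of the recommending users, which is an assumption, although it does reproduce the values 2.42 for m1 and 1.43 for m4 shown above. The function name aggregate is hypothetical.

```python
def aggregate(extended, recommended):
    """Rank messages by the summed rates of the users who recommend them.

    extended:    {user: rate} from the extended rates set
    recommended: {user: list of message ids held by that user}
    Returns a list of (message, [users], value), highest value first.
    """
    totals = {}
    for user, messages in recommended.items():
        for m in messages:
            users, value = totals.get(m, ([], 0.0))
            totals[m] = (users + [user], value + extended[user])
    rows = [(m, users, round(value, 2)) for m, (users, value) in totals.items()]
    # sort by value (descending), then by message id to keep ties stable
    return sorted(rows, key=lambda row: (-row[2], row[0]))


extended = {"UD": 0.72, "UB": 0.9, "UC": 0.63, "UE": 0.8}
recommended = {
    "UD": ["m1", "m2"],
    "UB": ["m1", "m3"],
    "UC": ["m4", "m5"],
    "UE": ["m1", "m4"],
}
for message, users, value in aggregate(extended, recommended):
    print(message, ",".join(users), value)
```

A plain sum rewards both the number of recommenders and their rates, which matches the sorting rule stated earlier; other aggregation functions would fit the same interface.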

Page 11: Ufa State Aviation Technical University

A proposed scheme of collaborative filtering

Stages 4-5

4. The user updates his own set of messages Mi;
5. The user updates his own set of rates.

[Figure: starting from the sorted result table of stage 3, Ui edits his own data. The rates part of the figure shows UB with rate 0.4 and UE with rate 0.8; the messages part shows the signed messages m1, m4 and m6 in the channels of Ui.]

Page 12: Ufa State Aviation Technical University

Advantages of the approach

Features of a system implementing the proposed approach:
• Decentralization
• Anonymity of authors
• Authors can prove their identity and their ownership of a message
• Selectivity
• Controllability
• Explainability
• Flood resistance
• Antagonistic societies can co-exist and even collaborate

Page 13: Ufa State Aviation Technical University

Results of the formal analysis and experiments

• Criteria of controllability and persistency with respect to users and messages were found and formalized;
• Several transitivity functions TRF and message aggregation functions AMF were found and checked against these criteria, and the best one was chosen;
• A system of virtual users that seek and exchange important messages was created:
    • messages were modelled as numbers;
    • every user had a favourite number;
    • users built their sets of trusted neighbours as they went, starting from a random rates set or a preset one;
    • users aimed at collecting the most favourable messages;
    • an objective efficiency of the system was calculated;
    • the dependencies of efficiency on many factors were investigated.

Page 14: Ufa State Aviation Technical University

Proposed prototype implementation

• HTTP instead of p2p network protocols
• DNS routing instead of ad hoc p2p naming and routing protocols
• A web server instead of a p2p node
• Users sharing common web servers instead of users on p2p nodes
• RSS as the message delivery protocol

A web-based RSS aggregator

It looks like a web-based RSS aggregator, but a typical one
• does not actually "aggregate", it merely "collects".

It looks like a typical web-based collaborative filtering system, but most of them
• use a "general" reputation, influenced by everyone;
• are server-based and centralized;
• are not customizable.

As a working prototype we propose an open-source (GNU GPL) web-based RSS aggregator, Paranoia, available at

http://greg.southural.ru/paranoia/

Page 15: Ufa State Aviation Technical University

Proposed prototype implementation

An open-source web-based RSS aggregator - Paranoia

[Figure: a Paranoia server publishes Blog 1 (a list of messages) both as HTML and as RSS. The RSS output is consumed by LiveJournal through syndicated feeds, by another Paranoia server, and by an RSS aggregator.]

• A Paranoia blog is accessible both in browsers and in RSS aggregators;
• A Paranoia blog is accessible in LiveJournal through the Syndicated feeds feature;
• A Paranoia blog is accessible on another Paranoia server;
• A Paranoia blog is accessible in any other RSS aggregator.
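To make the RSS side of this concrete, the following sketch serializes a blog's messages as a minimal RSS 2.0 document using only the Python standard library. It is not taken from the Paranoia source code: the function name blog_to_rss, the example URL and the choice to put each message into an item's description element are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def blog_to_rss(title, link, messages):
    """Serialize a list of message texts as a minimal RSS 2.0 feed."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Messages of {title}"
    for text in messages:
        item = ET.SubElement(channel, "item")
        # an item needs at least a title or a description
        ET.SubElement(item, "description").text = text
    return ET.tostring(rss, encoding="unicode")


# hypothetical blog address, used only for the example
feed = blog_to_rss("Blog 1", "http://example.org/blog1", ["a message", "another message"])
print(feed)
```

Any consumer that understands RSS 2.0, whether another server or a desktop aggregator, could read such a feed, which is what lets one blog be reachable through all the routes listed above.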

Page 16: Ufa State Aviation Technical University

Proposed prototype implementation

An open-source web-based RSS aggregator - Paranoia

[Figure: the "My news" list on a Paranoia server is assembled from the RSS feeds of Blog 2 on the same server, of another Paranoia server, and of LiveJournal.]

Paranoia can aggregate messages from different sources: users of the same system, users of a remote Paranoia system, users and communities of LiveJournal, and arbitrary RSS feeds.

Page 17: Ufa State Aviation Technical University

Proposed prototype implementation

An open-source web-based RSS aggregator - Paranoia

[Figure: on a Paranoia server, both "My news" and "My messages" are available as RSS and as HTML.]

Page 18: Ufa State Aviation Technical University

Proposed prototype implementation

An open-source web-based RSS aggregator - Paranoia

[Figure: rates attached to users A and B and to LJ (LiveJournal). The rates shown are: channel "politics", rate 0.5; channel "handiwork", rate 0.2; channel "handiwork", rate 0.8; channel "main", rate 0.2.]

The rate "Channel: politics, Rate: 0.5" on user A means the following:

"I want to receive messages from user A in the channel "Politics", and I value him at 0.5 in this channel."

Page 19: Ufa State Aviation Technical University

Proposed prototype implementation

An open-source web-based RSS aggregator - Paranoia

A news item ranks higher the more users it has come from and the higher the value of those users is for you.

[Figure: a network of users (S, A, B, C, D, E) on Paranoia servers and on LiveJournal, connected by rates in the channels "politics", "Handiwork", "Knitwork" and "main", with rates between 0.2 and 0.8. The resulting "My news" list is sorted by value: 0.5, 0.4, 0.4, ...]

Page 20: Ufa State Aviation Technical University

Proposed prototype implementation

An open-source web-based RSS aggregator - Paranoia

[Figure: users A to F on a Paranoia server and a news feed, without collaboration.]

A and B process the news feed on their own, so they cannot use each other's labour.

[Figure: the same users and news feed, with collaboration.]

A and B collaborate in processing news feeds, so a message that comes both from a feed and from a fellow user receives a higher rank in the news result set.

Page 21: Ufa State Aviation Technical University

Proposed prototype implementation

Non-trivial features

The environment turns out to be very flexible, and many tasks can be solved trivially within it:

1. Administrator notifications: every user automatically rates the local administrator in a channel 'system'.

[Figure: with the rate "Channel: system, Rate: 0.5", the admin's messages in channel "system" (notification1, notification2, notification3, ...) appear in the user's news for channel "system".]

2. User feedback: the local administrator automatically rates every user in a channel 'feedback'.

[Figure: the admin rates the users (A, B, C, D, ...) with "Channel: feedback, Rate: 0.1". User A's messages in channel "feedback" are feedback1 and feedback2; user B's are feedback1 and feedback3. The admin's news for channel "feedback" is then feedback1 (0.2), feedback2 (0.1), feedback3 (0.1), ...]
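The two automatic rates can be sketched as follows. The sketch assumes that news values are plain sums of direct rates within one channel (no transitivity), which is enough to reproduce the feedback values 0.2, 0.1 and 0.1 from the slide. The function names register and news and the identifier "admin" are hypothetical.

```python
from collections import defaultdict


def register(rates, user, admin="admin"):
    """Create the two automatic rates for a newly registered user."""
    rates[(user, admin, "system")] = 0.5    # the user listens to admin notifications
    rates[(admin, user, "feedback")] = 0.1  # the admin listens to the user's feedback


def news(rates, messages, reader, channel):
    """Rank the messages of one channel for `reader` by summed direct rates.

    rates:    {(rater, rated, channel): value}
    messages: {(author, channel): list of message ids}
    """
    totals = defaultdict(float)
    for (rater, rated, chan), value in rates.items():
        if rater == reader and chan == channel:
            for m in messages.get((rated, channel), []):
                totals[m] += value
    rows = [(m, round(v, 2)) for m, v in totals.items()]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


rates = {}
register(rates, "A")
register(rates, "B")
messages = {
    ("admin", "system"): ["notification1", "notification2", "notification3"],
    ("A", "feedback"): ["feedback1", "feedback2"],
    ("B", "feedback"): ["feedback1", "feedback3"],
}
print(news(rates, messages, "admin", "feedback"))
print(news(rates, messages, "A", "system"))
```

Feedback reported by two users rises above feedback reported by one, without any dedicated feedback mechanism: it is the same ranking rule applied to a special channel.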

Page 22: Ufa State Aviation Technical University

Proposed prototype implementation

Non-trivial features

3. Comments on messages are merely one's own messages, stored in a special channel:
• comments do not leave their creator's peer;
• comments are retrieved when needed, following the same rules as any other message.

[Figure: user A rates other users (B, C, D, ...) with "Channel: politics, Rate: 0.5". User B's messages in channel "politics" include message2, and user B's messages in channel "comments" include comment1. User A's news for "politics" shows message1 with comment1 and comment2 beneath it, followed by message2, ...]

4. If comments are retrieved only from trusted peers and are not stored locally:
• no one except trusted peers can spam the discussion;
• different groups, whose members rate only their group fellows, can discuss the same message without interfering with one another!
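A minimal sketch of points 3 and 4: comments stay in each author's own "comments" channel and are pulled only from the peers the reader trusts. The data layout, the function name discussion and the example peers (including who wrote which comment) are assumptions made for illustration.

```python
def discussion(message_id, trusted_peers, comment_store):
    """Collect the comments on one message from trusted peers only.

    comment_store: {peer: [(message_id, comment_text), ...]} stands in for
    each peer's own 'comments' channel; nothing is copied to the reader.
    """
    return [
        (peer, text)
        for peer in trusted_peers
        for target, text in comment_store.get(peer, [])
        if target == message_id
    ]


comment_store = {
    "B": [("message1", "comment1")],
    "C": [("message1", "comment2")],
    "spammer": [("message1", "buy now")],
}
# A trusts B and C, so the spammer's comment never shows up for A.
print(discussion("message1", ["B", "C"], comment_store))
# A reader from another group, trusting other peers, sees a separate discussion.
print(discussion("message1", ["spammer"], comment_store))
```

Because the list of trusted peers is the reader's own, two groups can comment on the same message and each sees only its own thread.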

Page 23: Ufa State Aviation Technical University

Conclusion

In our opinion, messaging systems (news messaging or any other kind) will evolve gradually:

• to be distributed among many storages;
• to have many initial sources of information,
    • with emphasis on direct witnesses;
• to implement collaborative filtering that is
    • specific to every user,
    • controllable by every user,
    • resistant to most types of malicious behaviour.

Thank you!