
Document Classification using Deep Belief Nets

Lawrence McAfee, 6/9/08

CS224n, Spring ‘08

Overview

• Corpus: Wikipedia XML Corpus

• Single-labeled data: each document falls under a single category

• Binary feature vectors: bag-of-words, where ‘1’ indicates the word occurred one or more times in the document (see the sketch at the end of this slide)

[Diagram: documents (Doc #1, Doc #2, Doc #3) fed into a classifier, which assigns category labels such as Food, Brazil, and President.]
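
As a rough illustration of the binary feature vectors above, here is a minimal Python/NumPy sketch (not from the original slides); the toy vocabulary, whitespace tokenizer, and example documents are assumptions made only for this example.

    import numpy as np

    def binary_bow(documents, vocabulary):
        """Binary bag-of-words: entry j is 1 if word j occurs at least once."""
        index = {word: j for j, word in enumerate(vocabulary)}
        X = np.zeros((len(documents), len(vocabulary)), dtype=np.uint8)
        for i, doc in enumerate(documents):
            for word in doc.lower().split():      # naive whitespace tokenizer (assumption)
                j = index.get(word)
                if j is not None:
                    X[i, j] = 1                   # presence only; counts are discarded
        return X

    # Hypothetical toy example
    vocab = ["food", "brazil", "president", "rice", "election"]
    docs = ["Rice is a staple food", "Brazil held a president election"]
    print(binary_bow(docs, vocab))

Discarding the counts is exactly what the later “Depth” slide points to as a limitation: the vector for a document that mentions a word once is identical to one that mentions it fifty times.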

Background on Deep Belief Nets

[Diagram: a DBN trained as a stack of RBMs. The training data feeds RBM 1, which learns features/basis vectors for the training data; RBM 2 learns higher-level features; RBM 3 learns very abstract features.]

RBM

• Unsupervised, clustering training algorithm

Inside an RBM

[Diagram: an RBM’s visible layer connected to its hidden layer (units indexed i and j); plot of energy versus configuration (v, h), with low energy at the input/training data (e.g. ‘Golf’ and ‘Cycling’ documents).]

• The goal in training an RBM is to minimize the energy of configurations corresponding to the input data

• The RBM is trained by repeatedly sampling the hidden and visible units for a given data input (see the CD-1 sketch below)
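
The sampling-based training described above is, in essence, contrastive divergence. Below is a minimal NumPy sketch of CD-1 for a single binary RBM; the learning rate, epoch count, and initialization are illustrative assumptions, not the settings used in these experiments.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train_rbm(V, n_hidden, lr=0.05, epochs=10, rng=np.random.default_rng(0)):
        """Train a binary RBM with 1-step contrastive divergence (CD-1).

        V: (n_examples, n_visible) binary data matrix.
        Returns weights W, visible biases a, hidden biases b.
        """
        n_visible = V.shape[1]
        W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        a = np.zeros(n_visible)          # visible biases
        b = np.zeros(n_hidden)           # hidden biases
        for _ in range(epochs):
            for v0 in V:
                # Up: sample hidden units given the data (positive phase)
                ph0 = sigmoid(v0 @ W + b)
                h0 = (rng.random(n_hidden) < ph0).astype(float)
                # Down-up: reconstruct visible units, then recompute hidden probabilities
                pv1 = sigmoid(h0 @ W.T + a)
                v1 = (rng.random(n_visible) < pv1).astype(float)
                ph1 = sigmoid(v1 @ W + b)
                # Update lowers the energy of data configurations relative
                # to the model's reconstructions
                W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
                a += lr * (v0 - v1)
                b += lr * (ph0 - ph1)
        return W, a, b

Stacking RBMs as in the earlier diagram then amounts to training the first RBM on the data and training each subsequent RBM on the hidden activations sigmoid(V @ W + b) of the layer below it.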

Depth

• Binary representation does not capture word frequency information

• Inaccurate features learned at each level of DBN

[Plot: accuracy (%) versus number of layers; series: ‘straight’, ‘linear’.]

Training Iterations

• Accuracy increases with more training iterations

• Increasing iterations may (partially) make up for learning poor features

[Plot: accuracy (%) versus training iterations per layer (0–12,000).]

[Diagram: two plots of energy versus configuration (v, h), shown for ‘Lions’ and ‘Tigers’ examples.]

Comparison to SVM, NB

• Binary features do not provide a good starting point for learning higher-level features

• Binary features are still useful, as 22% accuracy is better than random

• Training time: DBN 2 h 13 min; SVM 4 s; NB 3 s (a baseline sketch follows the chart below)

[Bar chart: accuracy (%) by classifier on 30 categories: DBN (100K iterations), SVM, NB.]
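
For reference, baselines of the kind compared above can be run with scikit-learn on the same binary feature matrix; the classifier choices (LinearSVC, BernoulliNB), the random placeholder data, and the train/test split below are assumptions for illustration, not the exact setups timed in the chart.

    # Hypothetical baseline run on a binary bag-of-words matrix X with labels y
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(300, 500)).astype(float)   # placeholder binary features
    y = rng.integers(0, 30, size=300)                        # placeholder labels, 30 categories

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0)

    for name, clf in [("SVM", LinearSVC()), ("NB", BernoulliNB())]:
        clf.fit(X_train, y_train)
        print(name, "accuracy:", clf.score(X_test, y_test))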

Lowercasing

• Supposedly richer vocabulary when lowercasing

• Overfitting: we don’t need these extra words

• Other experiments show only the top 500 words are relevant

[Plot: accuracy (%) versus number of hidden neurons in the top layer; series: lowercase, non-lowercase.]

Suggestions for Improvement

• Use appropriate continuous-valued neurons
  • Linear or Gaussian neurons
  • Slower to train
  • Not much documentation on using continuous-valued neurons with RBMs

• Implement backpropagation to fine-tune weights and biases (see the sketch below)
  • Propagate error derivatives from the top-level RBM back to the inputs
  • Unsupervised training gives good initial weights, while backpropagation slightly modifies the weights/biases
  • Backpropagation cannot be used alone, as it tends to get stuck in local optima
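
One way to realize the fine-tuning suggestion, as a sketch only, is to stack a softmax layer on top of an RBM-pretrained hidden layer and propagate the cross-entropy error derivatives back into the pretrained weights. The W and b arguments below are assumed to come from the earlier train_rbm sketch, and the learning rate and epoch count are illustrative assumptions, not part of the original project.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def fine_tune(V, labels, W, b, n_classes, lr=0.1, epochs=20,
                  rng=np.random.default_rng(0)):
        """Supervised fine-tuning sketch: a softmax layer is stacked on the
        RBM-pretrained hidden layer (W, b), and cross-entropy error derivatives
        are propagated back into W and b."""
        n_hidden = W.shape[1]
        U = 0.01 * rng.standard_normal((n_hidden, n_classes))   # softmax weights
        c = np.zeros(n_classes)
        Y = np.eye(n_classes)[labels]                            # one-hot targets
        for _ in range(epochs):
            H = sigmoid(V @ W + b)                               # hidden activations
            scores = H @ U + c
            P = np.exp(scores - scores.max(axis=1, keepdims=True))
            P /= P.sum(axis=1, keepdims=True)                    # softmax probabilities
            dscores = (P - Y) / len(V)                           # cross-entropy gradient
            dH = dscores @ U.T * H * (1 - H)                     # backprop through sigmoid
            U -= lr * H.T @ dscores
            c -= lr * dscores.sum(axis=0)
            W -= lr * V.T @ dH                                   # small update to pretrained weights
            b -= lr * dH.sum(axis=0)
        return W, b, U, c

Because the unsupervised pre-training already supplies reasonable values for W and b, the fine-tuning learning rate can stay small, which matches the point above that backpropagation only slightly modifies the weights and biases.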