Post on 24-Dec-2015

Page 1: SSB BART Group Silicon Valley (415) 975-8000 sales@ssbbartgroup.com IT Accessibility Problem - Solved™ SSB BART Group Washington DC (703) 442-5023 sales@ssbbartgroup.com

SSB BART Group

Silicon Valley: (415) 975-8000, sales@ssbbartgroup.com

Washington DC: (703) 442-5023, sales@ssbbartgroup.com

IT Accessibility Problem - Solved™

Images, Alternative Text, and Artificial Intelligence

Page 2

Silicon Valley (415) 975-8000 www.ssbbartgroup.com Washington DC (703) 637-8955

Agenda

About Us

About Me

The Project

What’s Next

http://amp.ssbbartgroup.com/public/research/Automatic_Image_Classification_090707.doc

http://amp.ssbbartgroup.com/public/research/SSB_BART_Group_Image_Alt_CSUN_2008.ppt

Page 3

Corporate Overview

History

Founded in 1997 by engineers with disabilities

750 commercial and government customers

1,500 enterprise projects successfully completed

Pioneers of commercial accessibility validation tools

Approach

Data driven and scalable

Violation profiling across 5.5M human validated accessibility issues

Scalable Solutions

One to one million developers

One to one thousand production systems

Fifty percent staffing mix of individuals with disabilities

Appropriately mixed automated, human and code level validation

Page 4

Supported Platforms

Web

HTML

XML

JavaScript

CSS

AJAX

Adobe Flash and Flex

Adobe Acrobat Documents

Streaming Audio and Video

Compiled Software

JFC and SWT Java Applications

.Net Applications

MFC Windows Native Applications

Macintosh Applications

BMC Remedy Applications

Standalone Systems

Telecommunications Hardware

IVR Systems

Agent Systems

Digital Imaging

Page 5

Industry Solutions

Public Sector

Federal Solutions

United States

European Union

Education

K-12

Universities

State and Local

Government System Integrators

Healthcare

Primary Care Providers

Insurance

Information Technology

Manufacturers

Software

Hardware

Web Based Service Providers

Mass Transit

Financial Services

Consumer Banking

Insurance

Legal

Web Based Service Providers

Page 6

Accessibility Management Platform

AMP – SSB’s web-based platform for managing all aspects of the accessibility process

Benefits

Single point for tracking compliance over time

Scalable solutions from one to one million developers across multiple domestic markets

Support for all aspects of a successful accessibility initiative

Requirements           Implementation         Certification
Baseline Audit         Development Audit      Maintenance Audit
Standards Development  Standards Maintenance  VPAT Creation
eLearning              Developer Support      Certification

InFocus™ Suite

Page 7

About Me

General Story

Founder and Managing Director of SSB BART Group

Also known as President and CEO

Professional web site developer for 13 years

Started in 1994 at the dawn of the Web

BS Computer Science, Leland Stanford Junior University (AKA Stanford)

Odds on Brad Pitt to play me in the movie

Accessibility Work

Involved in Web accessibility activities, validation and education since 1999

Architected and developed first commercial accessibility testing and fixing tool

InSight and InFocus 1.x -> 4.x

Initial release in mid-200

Next release in a few months

Architected and developed Accessibility Management Platform (AMP)

Current Version – 2008 R1

Personal work with fifty enterprise class software vendors

Page 8

Project Overview

Project Description

Create a decision tree to classify images into one of eight types

Image types are organized by alternative text requirements

Upon classification, alternative text validity can then be tested via straightforward heuristics

Project Utility

Alternative text provides a textual description of an image

Alternative text validity

Ensures access to content for people with disabilities

Allows pages to be adapted effectively - low resolution, alternative browsers

Increases search engine relevance for pages

Bottom Line – Good alternative text is good for society and good for profits

Page 9

Automated Testing Tools

A brief note on automated testing tools

The first generation of automated testing tools, where we are now, can test about 25% of requirements accurately

Another 25% with so-so accuracy

And the rest need to be checked manually

We think the next generation of tools can double this efficacy through better AI, more complex page models and better leveraging of human judgment…

…but ultimately tools can only facilitate the process of human review; they cannot replace it

Page 10

Image Types

Layout Element – The image is used solely to lay out elements on the page

Decorative Picture – The image is a picture that is used solely for the purpose of making the page more visually appealing and it provides no information

Text – The image is used to stylize text on the page but is not used as an active element on the page

Picture – The image is a picture that contains information important to the use of the page

Hidden Link – The image provides a “hidden” link on a page for search engine optimization or screen reader users

Linked Text – The image is used to stylize text and provide a link to another page

Skip Link – The image is the root of an inner-document link that provides a means of skipping past page content that is not relevant

Linked Picture – The image is a picture that provides a link to another page
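The eight types above can be written down as an enumeration, with a rough alternative text check keyed on the type. This is a minimal sketch in Python, not the presenter's code. The rule it applies (empty alternative text for the two purely visual types, non-empty text for everything else) is an assumption based on common accessibility practice, and the names `ImageType`, `EMPTY_ALT_TYPES` and `alt_text_is_plausible` are hypothetical.

```python
from enum import Enum


class ImageType(Enum):
    LAYOUT_ELEMENT = "layout element"
    DECORATIVE_PICTURE = "decorative picture"
    TEXT = "text"
    PICTURE = "picture"
    HIDDEN_LINK = "hidden link"
    LINKED_TEXT = "linked text"
    SKIP_LINK = "skip link"
    LINKED_PICTURE = "linked picture"


# Assumption: types that carry no information should have empty alt text,
# and every other type should have non-empty alt text.
EMPTY_ALT_TYPES = {ImageType.LAYOUT_ELEMENT, ImageType.DECORATIVE_PICTURE}


def alt_text_is_plausible(image_type, alt):
    """Return True if the alt attribute fits the assumed rule for the type.

    alt is None when the attribute is missing from the markup entirely.
    """
    if alt is None:
        return False  # assumed: every image needs an alt attribute, even an empty one
    if image_type in EMPTY_ALT_TYPES:
        return alt.strip() == ""
    return alt.strip() != ""
```

A real checker would presumably go further for the link types, for example by comparing the text against the link target, but that is beyond what the slides describe.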

Page 11

Variables

Width – The width of the image

Height – The height of the image

Edge Count – The number of vertical and horizontal edges in the image

Size – The rectangular size of the image, or width times height

File Size – The size of the file in bytes

Link – Whether or not the image is a link

Inner-document Link – Whether or not the image is a link within the current document

Color Depth – The number of unique colors that the image has

Page 12

Project Functionality

Challenge

No database of relevant image classifications exists

Subject Matter Experts (SMEs) use experience to determine the form of alternative text

Without a good data set the decision tree isn’t going to decide much

Solution

Build a spider to crawl sites and gather sample data

Classify the images using a basic interface

Store the image classification and additional variables in a database

Build a decision tree from the database rather than a live site

Repeat using updated tree

Result

Created an image database of 1000 images with about an hour of actual data entry
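The storage step can be pictured with a small SQLite table. This is only a sketch under assumptions: the slides do not say which database or schema was used, the column names are invented from the Variables slide, and the example row is made up.

```python
import sqlite3

# Hypothetical column names for the eight variables on the Variables slide.
VARIABLES = (
    "width", "height", "edge_count", "size",
    "file_size", "is_link", "is_inner_link", "color_depth",
)

SCHEMA = (
    "CREATE TABLE IF NOT EXISTS images ("
    "id INTEGER PRIMARY KEY, url TEXT NOT NULL, image_type TEXT NOT NULL, "
    + ", ".join(f"{name} INTEGER NOT NULL" for name in VARIABLES)
    + ")"
)


def store_image(conn, url, image_type, variables):
    """Insert one human-classified image together with its measured variables."""
    columns = ("url", "image_type") + VARIABLES
    placeholders = ", ".join("?" for _ in columns)
    values = [url, image_type] + [variables[name] for name in VARIABLES]
    conn.execute(
        f"INSERT INTO images ({', '.join(columns)}) VALUES ({placeholders})",
        values,
    )


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
store_image(
    conn,
    "http://example.com/spacer.gif",  # made-up sample image
    "layout element",
    {"width": 1, "height": 1, "edge_count": 0, "size": 1,
     "file_size": 43, "is_link": 0, "is_inner_link": 0, "color_depth": 1},
)
stored = conn.execute("SELECT image_type, size FROM images").fetchall()
```

Keeping the class label and the variables in one row is what lets the tree be built from the database later without revisiting the live site.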

Page 13

Project Functionality

Challenge

Build the decision tree

…which became build the decision tree before the end of time

…which became build the decision tree once and store it for later use

Discussion

Building the tree is fairly straightforward and involves splitting on variables and analyzing remaining sets

Implementation uses the Russell and Norvig algorithm

More on the tricky parts later

The “catch” – a lot of the queries involve eliminating groups of images

SQL doesn’t have good concepts for handling unordered sets of keys, so you enumerate out elements for queries…

Page 14

Project Functionality

Discussion (Continued)

This results in lots of nasty queries and a fair amount of time to build the tree

This more or less grows exponentially as you add variables and quanta
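The enumeration described in this discussion might look like the sketch below: every key in the set being eliminated becomes its own placeholder in the query. The table layout and the function name `count_by_type` are assumptions for illustration; the slides do not show the actual queries.

```python
import sqlite3


def count_by_type(conn, excluded_ids):
    """Count the remaining images per type after eliminating a set of ids."""
    ids = sorted(excluded_ids)
    if ids:
        # One "?" per key: the set has to be spelled out element by element.
        placeholders = ", ".join("?" for _ in ids)
        where = f"WHERE id NOT IN ({placeholders})"
    else:
        where = ""
    sql = f"SELECT image_type, COUNT(*) FROM images {where} GROUP BY image_type"
    return dict(conn.execute(sql, ids).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, image_type TEXT)")
conn.executemany(
    "INSERT INTO images (id, image_type) VALUES (?, ?)",
    [(1, "picture"), (2, "picture"), (3, "text"), (4, "skip link")],
)
remaining = count_by_type(conn, {2, 4})
```

The query text grows with the size of the eliminated set, and one such query runs for each candidate split, which is consistent with the build time the slide complains about.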

Solution

Build the tree once and persist to disk

Limit quanta for variables and require minimum information gain

Result

Creation of the tree takes about forty minutes

Reading in the tree takes about forty milliseconds

Resolving against the tree takes about forty nanoseconds
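Building once and reading back later could be done along these lines. This is a hedged sketch: the nested-dictionary tree, the JSON file format and the variable name `edge_bucket` are all assumptions, since the slides do not describe how the tree was stored.

```python
import json
import os
import tempfile

# Hypothetical tree: internal nodes split on one quantized variable,
# leaves hold an image type.
tree = {
    "variable": "is_link",
    "children": {
        "0": {"leaf": "decorative picture"},
        "1": {
            "variable": "edge_bucket",
            "children": {
                "0": {"leaf": "linked picture"},
                "1": {"leaf": "linked text"},
            },
        },
    },
}


def save_tree(node, path):
    """Write the tree to disk so it never has to be rebuilt from the database."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(node, handle)


def load_tree(path):
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def classify(node, image):
    """Walk from the root to a leaf using the image's quantized values."""
    while "leaf" not in node:
        node = node["children"][str(image[node["variable"]])]
    return node["leaf"]


path = os.path.join(tempfile.mkdtemp(), "tree.json")
save_tree(tree, path)
restored = load_tree(path)
label = classify(restored, {"is_link": 1, "edge_bucket": 1})
```

Classification is a handful of dictionary lookups, one per level of the tree, which is why resolving is so much cheaper than building.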

Page 15

Project Functionality

Challenge

Test the decision tree for accuracy

Avoid peeking at the data set

Solution

Always test on new data [Tank!]

Don’t store the test set so we avoid any temptation to peek

Name                                                            Accuracy
Hi5 – www.hi5.com                                               94.7%
Hillary Clinton for President – http://www.hillaryclinton.com/  98.6%
Department of Defense – http://www.defenselink.mil/             86.84%
Engadget – www.engadget.com                                     91.45%
Gamespot.com – www.gamespot.com                                 91.57%

Average                                                         92.63%
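The reported average is the unweighted mean of the five per-site accuracies, which can be checked directly. The dictionary below just restates the numbers from the slide.

```python
# Per-site accuracies, in percent, as listed on the slide.
accuracies = {
    "Hi5": 94.7,
    "Hillary Clinton for President": 98.6,
    "Department of Defense": 86.84,
    "Engadget": 91.45,
    "Gamespot.com": 91.57,
}

average = sum(accuracies.values()) / len(accuracies)
print(f"{average:.2f}%")  # prints 92.63%
```

Because the mean is unweighted, a site with few images counts as much as a site with many; the slides do not give per-site image counts, so a weighted figure cannot be derived from them.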

Page 16

The Tricky Parts

Information Gain

Successful classification provides 2.391 bits of information

Which means, what, exactly?

Technically – You have enough information to answer 2.391 yes/no questions

Practically – You can order nodes to split on by information gain

At each split, choose the node that provides the highest information gain

Note – The amount of information provided by an attribute will change as you move through the tree

Solution

Calculate information gain for each split

This is where the nasty set queries occur
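Information gain is the entropy of the class labels before a split minus the weighted entropy of the groups after it. The sketch below computes both in Python over in-memory rows rather than SQL; the row layout and the `image_type` key are assumptions. For scale, eight equally likely image types would carry log2(8) = 3 bits, so a figure of 2.391 bits is presumably the entropy of an uneven distribution over the eight types.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(rows, variable, target="image_type"):
    """Entropy of the target minus the weighted entropy after splitting."""
    before = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[variable], []).append(row[target])
    after = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return before - after


rows = [
    {"is_link": 0, "image_type": "picture"},
    {"is_link": 0, "image_type": "picture"},
    {"is_link": 1, "image_type": "linked picture"},
    {"is_link": 1, "image_type": "linked picture"},
]
gain = information_gain(rows, "is_link")  # the split separates the types completely
```

Because the gain is recomputed on whatever rows remain at each node, the same variable can be very informative near the root and nearly useless deeper in the tree, which is the point of the note above.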

Overfitting

Observe

Permutations of variable quanta – 460,800

Sample data size – 1000

460,800 >> 1000

Thus the risk of overfitting is significant

Solution

Require that we gain at least 0.05 bits to split – otherwise just return the modal value for the remaining set
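The stopping rule can be sketched as a small recursive builder in the spirit of textbook decision tree learning. It is not the presenter's implementation: the dictionary node layout and the in-memory rows are assumptions, and only the 0.05 bit threshold and the modal-value fallback come from the slide.

```python
import math
from collections import Counter

MIN_GAIN_BITS = 0.05  # threshold stated on the slide


def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())


def split(rows, variable):
    groups = {}
    for row in rows:
        groups.setdefault(row[variable], []).append(row)
    return groups


def gain(rows, variable):
    labels = [r["image_type"] for r in rows]
    after = sum(
        len(g) / len(rows) * entropy([r["image_type"] for r in g])
        for g in split(rows, variable).values()
    )
    return entropy(labels) - after


def build_tree(rows, variables):
    labels = [r["image_type"] for r in rows]
    modal = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not variables:
        return {"leaf": modal}
    best = max(variables, key=lambda v: gain(rows, v))
    if gain(rows, best) < MIN_GAIN_BITS:
        return {"leaf": modal}  # too little information to justify a split
    rest = [v for v in variables if v != best]
    return {
        "variable": best,
        "children": {
            value: build_tree(group, rest)
            for value, group in split(rows, best).items()
        },
    }


rows = [
    {"is_link": 0, "noise": 0, "image_type": "picture"},
    {"is_link": 0, "noise": 1, "image_type": "picture"},
    {"is_link": 1, "noise": 0, "image_type": "linked picture"},
    {"is_link": 1, "noise": 1, "image_type": "linked picture"},
]
tree = build_tree(rows, ["noise", "is_link"])
stump = build_tree(rows, ["noise"])
```

In the toy data, `noise` tells nothing about the type, so a tree allowed to split only on `noise` collapses to a single modal leaf instead of memorizing the four rows.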

Page 17

The Tricky Parts

Variable Quantification

Strategy

Make everything an integer

Define ranges for all variables

Initially picked quanta based on guessed divisions

These turned out to be wildly inaccurate

Solution

Picked variable quanta based on image type grouping and average

SQL AVG and COUNT make this easy
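One way to turn per-type averages into integer ranges is shown below. The `AVG` and `COUNT` query follows the slide; placing each boundary halfway between neighbouring averages is an assumption made for this sketch, as are the table layout and the sample widths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (image_type TEXT, width INTEGER)")
conn.executemany(
    "INSERT INTO images VALUES (?, ?)",
    [("layout element", 1), ("layout element", 3),
     ("linked text", 80), ("linked text", 120),
     ("picture", 300), ("picture", 500)],  # made-up sample data
)

# Average width and number of samples per image type, smallest average first.
rows = conn.execute(
    "SELECT image_type, AVG(width), COUNT(*) FROM images "
    "GROUP BY image_type ORDER BY AVG(width)"
).fetchall()

averages = [avg for _type, avg, _count in rows]
# Assumption: put each boundary halfway between neighbouring averages.
boundaries = [(low + high) / 2 for low, high in zip(averages, averages[1:])]


def quantize(value, boundaries):
    """Return the integer bucket index for a raw value."""
    return sum(value >= b for b in boundaries)
```

With the sample data the boundaries fall at 51 and 250, so a width of 2 lands in bucket 0, 90 in bucket 1 and 640 in bucket 2. The `COUNT(*)` column is not used in the boundaries here; it is a sanity check that an average is backed by enough samples.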

Edge Detection

Used Sobel edge detection and Java convolution application for images

Count the number of edges in the image

Lots of images have edges

Solution

Count vertical and horizontal edges

Turns out to be a great proxy for text in the image

Accuracy goes from 78.23% to 92.63% with this type of edge detection
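A plain Python sketch of the idea follows (the slides mention a Java implementation, which is not shown). It applies the two standard Sobel kernels and counts pixels where only one of the two responses is strong. Counting edge pixels rather than whole edges, and the threshold of 200, are assumptions made for illustration.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges


def kernel_response(pixels, row, col, kernel):
    """Weighted sum of the 3x3 neighbourhood around (row, col).

    The kernel is not flipped; for Sobel kernels that only changes the sign,
    and the caller takes the absolute value.
    """
    return sum(
        kernel[i][j] * pixels[row + i - 1][col + j - 1]
        for i in range(3)
        for j in range(3)
    )


def count_axis_aligned_edges(pixels, threshold=200):
    """Count interior pixels on a mostly vertical or mostly horizontal edge.

    pixels is a list of rows of grayscale values (0-255).
    Returns (vertical_count, horizontal_count).
    """
    vertical = horizontal = 0
    for row in range(1, len(pixels) - 1):
        for col in range(1, len(pixels[0]) - 1):
            gx = abs(kernel_response(pixels, row, col, SOBEL_X))
            gy = abs(kernel_response(pixels, row, col, SOBEL_Y))
            if gx >= threshold and gy < threshold:
                vertical += 1
            elif gy >= threshold and gx < threshold:
                horizontal += 1
    return vertical, horizontal


# Dark left, bright right: one vertical boundary.
left_right = [[0, 0, 255, 255, 255] for _ in range(5)]
# Dark top, bright bottom: one horizontal boundary.
top_bottom = [[0] * 5, [0] * 5, [255] * 5, [255] * 5, [255] * 5]
```

Pixels where both responses are strong, such as those on diagonal edges, are deliberately left out, so the count favours the straight strokes typical of rendered text over the irregular edges of photographs.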

Page 18

Future Features

Second Order Variables

First order variables are primary data from images

Second order variables are derived from one or more primary variables

Specifically

edge_count, color_depth have much more relevance as ratios to size

height is more relevant as a ratio to width
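The derived variables named above are simple ratios of the primary ones. A minimal sketch, with hypothetical key names (`edge_density`, `color_density`, `aspect_ratio`) that do not appear in the slides:

```python
def second_order(image):
    """Derive ratio variables from an image's first order variables."""
    size = image["width"] * image["height"]
    if size == 0:
        raise ValueError("image has zero area")
    return {
        "edge_density": image["edge_count"] / size,    # edge_count as a ratio to size
        "color_density": image["color_depth"] / size,  # color_depth as a ratio to size
        "aspect_ratio": image["height"] / image["width"],
    }


banner = {"width": 200, "height": 50, "edge_count": 400, "color_depth": 20}
derived = second_order(banner)
```

Ratios make a small button and a large banner comparable: 400 edges means something different on a 200 by 50 image than on a full-page photograph.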

Classification Tightening

Current classifications have some overlap which could be refined out

Certain classifications evolved over the course of the project and the data set should be updated to reflect the final classification

Page 19

Future Features

Safe Failure

Better to require alternative text when not necessary than to not require text when necessary…

…or is it??

Celebrity Endorsement

If K-Fed uses it, wouldn’t you?

Page 20

InFocus 5.0

Page 21

For More Information

Silicon Valley

Phone (415) 975-8000

E-mail sales@ssbbartgroup.com

Fax (415) 624-2708

300 Brannan Street

Suite 608

San Francisco, CA 94107-1876

Washington DC

Phone (703) 637-8955

E-mail sales@ssbbartgroup.com

Fax (703) 734-8381

1489 Chain Bridge Road

Suite 204

McLean, VA 22101