Personal assistant like Siri


Upload: shree-bohra

Post on 26-Jun-2015


Category:

Technology



Page 1: Personal assistant like Siri

Intelligent Embedded Personal Assistant

A two-piece embedded device responding to user queries using the internet as its knowledge base.

Dinesh Vyas, 7th Semester ECE, Sri Balaji College of Engineering and Technology, Jaipur.

Shree Kant Bohra, 7th Semester IT, Engineering College Bikaner, Bikaner.


What is it?

A voice-controlled embedded device that works as a personal assistant to the user. The internet is a vast hub of information, but sitting in front of a computer to look things up is useful only to some extent; information is more useful when it can be used in day-to-day life. What we are proposing is an embedded personal assistant that uses the internet as its knowledge base to answer our queries.

Many services available on the internet provide an exact answer to a user's query rather than a list of search results. Using these services, the embedded device can answer many user queries much as a real human being would. These services include answer engines, which provide answers rather than results: a query either returns an answer or returns nothing, that is, 1 or 0. Answer engines are well suited to an embedded device and make it more useful to its users; other services such as Google Maps, Google business etc. provide similar results. The beauty of this project is that the usefulness of these services increases manifold when they are embedded into hardware. It is like providing information exactly when it is needed, instead of making the retrieval of information a separate task.
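The "answer or no answer" behaviour described above can be sketched as a simple fallback chain. This is only an illustrative sketch: the names `first_answer`, `facts_engine` and `empty_engine` are hypothetical, and the engines are local stand-ins rather than calls to any real service API.

```python
from typing import Callable, Optional, Sequence

# An "answer engine" here is any callable that returns one answer string,
# or None when it has no answer (the "1 or 0" behaviour described above).
AnswerEngine = Callable[[str], Optional[str]]


def first_answer(query: str, engines: Sequence[AnswerEngine]) -> Optional[str]:
    """Ask each engine in turn and return the first non-empty answer."""
    for engine in engines:
        answer = engine(query)
        if answer:
            return answer
    return None


# Hypothetical stand-ins for real services. A real device would call the
# services' web APIs here instead of looking up a local dictionary.
def facts_engine(query: str) -> Optional[str]:
    known = {"capital of india": "New Delhi"}
    return known.get(query.strip().lower().rstrip("?"))


def empty_engine(query: str) -> Optional[str]:
    return None  # simulates an engine that has no answer


if __name__ == "__main__":
    engines = [empty_engine, facts_engine]
    print(first_answer("Capital of India?", engines))
    print(first_answer("How to make tea?", engines))
```

Trying the engines in order means the device either speaks one definite answer or reports that it has none, instead of reading out a list of search results.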

Being a personal assistant, the device can do many things other than providing information and answering user queries: it can give suggestions and assist in managing schedules, meetings, etc.

It can also work as a social network device: on a voice query it can update the user's status on Twitter, Facebook, etc. and read out tweets, which is more natural than going to a computer and typing what you are doing.

Why Voice Controlled?

Intuitiveness of the device. It is rarely the case that we want to open a keyboard, type our query and wait for the response as text, especially for small queries such as "What's the capital of India?", "What's the Sensex index now?" or "What's the temperature here?". We want the answer instantly, and querying by voice has always been easier than typing on a clumsy keyboard. Voice recognition has improved a lot in recent years, and more research is being done in this field. A voice-controlled mechanism therefore seems the best solution for such a device.

How it works?

The device has two parts: a small touch-screen handheld and a Bluetooth earpiece containing a microphone and an earphone. The workflow of the device is as follows:

1. The user wears the Bluetooth earpiece and keeps the handheld in her pocket, with both devices powered on. The user makes a query by voice; to tell the device that a query is being made, she starts with a keyword based on the type of query, such as 'weather', followed by the query itself. The Bluetooth receiver gets the query, and the software at the receiving end converts it into text. On the basis of the query type, the device makes a request to the internet service it is configured to use.

2. On receiving the result, the device converts it into voice signals.

3. These voice signals are sent back to the Bluetooth earphone as a response to the user's query.

4. In some cases the response can also contain images, videos and long text, which can be seen on the handheld device.
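The keyword-based routing in step 1 can be sketched roughly as follows. This is a minimal sketch that assumes speech has already been converted to text; the handler names, the `HANDLERS` table and the reply strings are hypothetical placeholders, not part of the proposed design.

```python
from typing import Callable, Dict, Tuple


# Hypothetical service handlers keyed by the spoken keyword. Each takes the
# rest of the query as text and returns the text to be spoken back.
def weather_service(query: str) -> str:
    return f"(weather lookup for: {query})"


def answer_service(query: str) -> str:
    return f"(answer lookup for: {query})"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "weather": weather_service,
    "answer": answer_service,
}


def split_keyword(text: str) -> Tuple[str, str]:
    """Split recognised text into (keyword, remaining query)."""
    keyword, _, rest = text.strip().partition(" ")
    return keyword.lower(), rest.strip()


def handle_query(recognised_text: str) -> str:
    """Route recognised text to a service and return the reply text."""
    keyword, query = split_keyword(recognised_text)
    handler = HANDLERS.get(keyword)
    if handler is None:
        return f"Sorry, I do not know the keyword '{keyword}'."
    return handler(query)


if __name__ == "__main__":
    print(handle_query("Weather in Jaipur today"))
    print(handle_query("Sing a song"))
```

In a real device the returned text would then be passed to a text-to-speech step and sent to the earpiece, as in steps 2 and 3.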


Schematic diagram showing work flow of the device

Example Use Cases

With such hardware, a powerful software stack on top of it and the internet as the knowledge base, only imagination limits what this device can do. The use cases we thought of building first are as follows:

• Using the WikiAnswers and Wolfram Alpha answer engines, the device can answer general queries related to the user's information needs; these answer engines provide great results in most cases, and they offer an easy-to-use API. The engines produce their results using their own algorithms: if we ask how to make tea, we get complete instructions for making tea; if we ask who the CEO of Google is, the answer is “Eric Schmidt”; and if we ask who “Indira Gandhi” is, we get a short bio of her. Such queries become more useful when asked by voice, as we may require these answers to understand something better.

• Another use case is getting directions to a particular place from where you are. Using its built-in GPS, the device can send its location and get directions to the destination from internet services such as Google Maps.

• The device can store our meeting schedules: the user can direct it to store a meeting or event, and the device reminds her before it is due. The device can also tell you the latest news, weather reports, stock reports, etc. on demand, using the internet as its source.

• The device can suggest certain actions, such as putting on warm clothes or a raincoat: based on your location, it can get the weather forecast for the place and advise you accordingly.

• The device can readily be used as an internet radio.

• Working as a social network device, it can make social network services easier to use: once it is configured for the social networks the user uses, she can update her status and have statuses read out using this device.

• With a head-mounted camera on the device, the user can take a photograph by voice command, see it on the handheld device and upload the picture to social networks such as Facebook, TwitPic and Flickr.
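As an illustration of the weather-based suggestion use case in the list above, the advice step could be as simple as a pair of threshold rules. This is a sketch under assumptions: the function name `clothing_advice`, its parameters and the threshold values are hypothetical, and the forecast values would come from whichever weather service the device is configured to use.

```python
from typing import List


def clothing_advice(temperature_c: float,
                    rain_probability: float,
                    cold_below_c: float = 15.0,
                    rain_above: float = 0.5) -> List[str]:
    """Return spoken suggestions for a forecast.

    temperature_c and rain_probability (0.0 to 1.0) describe the forecast
    for the user's location; both thresholds are illustrative defaults.
    """
    advice = []
    if temperature_c < cold_below_c:
        advice.append("Put on warm clothes.")
    if rain_probability > rain_above:
        advice.append("Carry a raincoat.")
    return advice


if __name__ == "__main__":
    print(clothing_advice(temperature_c=8.0, rain_probability=0.7))
    print(clothing_advice(temperature_c=30.0, rain_probability=0.1))
```

An empty list means the device has nothing to suggest and can stay silent.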

Hardware Description

The Intel Atom processor is well suited to such a device, which requires good computing power at low power consumption. The main components of the device design are listed in the following table; they are based on the performance requirements of the device and the options available in the reference applications provided. Our design is based on the media phone reference design.

Processor: Intel® Atom™ Processor Z530 and Intel® System Controller Hub (large form factor)

Memory: 1GB DDR2 533MHz SODIMM memory

Storage: 8GB Compact Flash and separate SATA

Interfaces: Low Pin Count Interface (LPC): Intel® Firmware Hub (FWH); on-board Trusted Platform Module (TPM) version 1.2; super I/O chip (SIO); SDVO for external HDMI connector; HD audio

Expansion: (4) external USB host connectors, internal USB to touch panel, external USB, (2) SDIO ports, (2) PCIe x1 connections, GPIO


Software Description

Moblin OS, a Linux Foundation project, supports the Atom processor and is a fully featured operating system. It seems the best choice of OS for this device.

The open-source speech recognition engine CMU Sphinx can be extended for use in the software solution. It provides enough libraries and training data to build software for our needs on top of it. The project website is http://cmusphinx.sourceforge.net