DEGREE PROJECT IN TECHNOLOGY, FIRST CYCLE, 15 CREDITS
STOCKHOLM, SWEDEN 2017

Voice controlled home

ANTON BACKMAN
ERIC CIORAN

KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF INDUSTRIAL ENGINEERING AND MANAGEMENT


Voice controlled home

ANTON BACKMAN
ERIC CIORAN

Bachelor’s Thesis in Mechatronics
Supervisor: Naveen Mohan
Examiner: Nihad Subasic

TRITA MMK 2017:09 MDAB 627


Abstract

To aid the resident, this project has made it possible to change the settings of home appliances using voice recognition. In the startup process of the voice recognition program, the user records a word that is associated with a home appliance; the word is frequency analysed and stored. When the startup process is done, the user can say the word associated with the appliance, which is then frequency analysed and compared to the stored word. If the words match, the setting of the appliance changes. One of the settings is automatic, where the appliance is controlled by sensors.

The purpose of this project is to create a simple voice recognition program using the Fast Fourier Transform, FFT. The user is able to control and change the settings of different home appliances using their voice. The voice recognition should be able to recognise at least one user.

The results show that when the same user records a word and then repeats it, the appliance setting changes at least 78% of the time, whereas when another user who has not recorded any word repeats the word, the appliance setting changes at least 65% of the time.


Sammanfattning (Summary)

To make things easier for the user at home, this project has made it possible to change the state of home appliances by voice control. During the startup process the user records a word that is associated with the home appliance; the word is frequency analysed and then stored. When the startup process is complete, the user can at any time say the word associated with the home appliance; the word is frequency analysed and then compared with the stored word. If the words match, the state of the appliance changes. One state is, for example, automatic mode, in which the appliance is controlled by sensors.

The goal of this project is to create a voice recognition program that uses the Fast Fourier Transform, FFT. The user should be able to control and change the state of different home appliances by voice control. At least one user should be able to use the voice recognition program.

The results show that when the same user records a word and then repeats it, the state of the appliance changes in at least 78% of the cases, whereas when another user who has not recorded a word uses the voice control, the state of the appliance changes in at least 65% of the cases.


Preface

We would like to thank our supervisor Naveen Mohan for support and feedback, all the lab assistants who answered our questions, and Staffan Qvarnström for material and parts. We would also like to thank our fellow classmates Karl Lundin and Oscar Olli for proofreading our report.

Anton Backman and Eric Cioran
Stockholm, May 2017


Nomenclature

Abbreviations

MCU   Microcontroller Unit
PWM   Pulse-Width Modulation
FFT   Fast Fourier Transform
RPM   Revolutions Per Minute
LED   Light Emitting Diode
PIR   Passive Infra Red
RQ    Research Question
KF    Kitchen fan


Contents

Abstract
Referat
Preface
Nomenclature

1 Introduction
    1.1 Background
    1.2 Purpose
    1.3 Scope
    1.4 Literature research
    1.5 Method

2 Theory
    2.1 Sensors
    2.2 Fan
    2.3 Voice recognition

3 Demonstration
    3.1 Problem formulation
    3.2 Software
    3.3 Hardware and Electronics

4 Results
    4.1 Result for first research question
    4.2 Result for second research question

5 Discussion and conclusions

6 Recommendations and future work

Bibliography

Appendices

A Arduino code


List of Tables

3.1 Hardware
4.1 Round 1 results for first RQ.
4.2 Round 2 results for first RQ.
4.3 Round 3 results for first RQ.
4.4 Round 1 results for second RQ.
4.5 Round 2 results for second RQ.
4.6 Round 3 results for second RQ.


1 | Introduction

1.1 Background

In a modern home there are many different appliances that are completely independent of each other. As a result, the resident has to change the settings of these appliances one by one, and may not be able to give each appliance enough attention to make it behave ideally. A bedridden person might not even be able to control something as simple as the lights above the bed. One solution is to implement a voice recognition system in the home, a system that also uses sensors to control home appliances automatically.

Voice recognition software places high demands on the computer that analyses the voice. The most common voice recognition software works by recognising the phonemes of a spoken word and using them to form words that contain these phonemes. A phoneme is any distinct speech sound that humans make when talking [1]. Hidden Markov models are also commonly used to "guess" which words or sentences are being said, based on the words/phonemes that have already been said. Large vocabularies, which need a lot of storage, are also needed to make these "guesses" [2].

1.2 Purpose

The purpose of this project was to make home appliances easy to control using voice recognition. This can be achieved by connecting them to a central unit. This unit should be able to turn the appliances on and off. The appliances should also be able to behave automatically depending on the sensor inputs.

The voice recognition will be used to control the appliances. Since the commonly used voice recognition software is very complex, this project focuses on making a voice recognition that is simpler and just competent enough for its purpose.

The research questions, RQ, of this study are:

• What is the success rate of the voice recognition?
• What is the success rate of the voice recognition when using two different voices?


The focus of the first RQ is to calculate the success rate of the voice recognition when the same user records and speaks the word. The success rate is how often, as a percentage, the software recognizes the right word.

The second RQ is simply: if the voice recognition is set up for one person, what will the success rate be if another person uses it?

1.3 Scope

Because this is a Bachelor's thesis with limited time, there are some limitations. The commonly used voice recognition, as discussed in section 1.1 Background, is too complex to make. The focus will be on creating a simplified voice recognition. The voice recognition doesn't need to recognize any word; only the capability of differentiating a small number of words is of interest. The words cannot sound alike and they need to contain a mix of different vowels because of the formants, see section 2.3 Formants. This voice recognition will not take into account how a word changes over time. The system won't be designed for a noisy environment.

1.4 Literature research

Research in this area has been done by searching different archives/libraries such as Google Scholar, Google, KTHB, and Digitala Vetenskapliga Arkivet. Words/phrases such as "voice recognition", "voice recognition arduino", "speech recognition", "phoneme", and "formants" have been used when searching. When doing the research it was of interest to see if voice recognition had been achieved before. It turns out that voice recognition is an active area of work, and that it works almost as well as human interpretation of spoken words [3]. The literature research found that this type of commonly used voice recognition is too complex for the purpose, as discussed in section 1.1 Background. Research was therefore needed to find out how to make a simpler system for voice recognition, and how formants, discussed in section 2.3, could be used for the purpose.

1.5 Method

The central unit that controls the voice recognition and home appliances is a Microcontroller Unit, MCU. For the voice recognition, a microphone and an amplifier with adjustable gain have been used. A display is used to guide the user in operating the system.

Voice recognition

The microphone sends the audio signal to the MCU for analysis. This analysis is done by performing a Fast Fourier Transform, FFT (see section 2.3), of the audio signal. The audio signal is continuously being transformed, which creates a lot of FFT vectors.


As the signal is being transformed, an averaging function creates only one vector for the whole audio signal, i.e. one word is associated with one averaged FFT vector. To use the voice recognition, the user repeats the pre-recorded word, and the repeated word is compared to the pre-recorded word. The comparison function takes the difference between the two vectors at each frequency. The difference at each frequency can be small or large; if there are more than, e.g., 5 small frequency differences in the comparison between the two FFT vectors, the words are assumed to be the same.

Test design

First research question

To get results for the first research question, one person recorded each word andspoke each recorded word 20 times.

Second research question

To get results for the second research question, one person recorded each word andanother person spoke each recorded word 20 times.


2 | Theory

2.1 Sensors

Motion sensor

The PIR sensor has two sensing elements, each producing a signal that depends on the amount of IR it receives. When the PIR is idle the two signals cancel each other out because there is no temperature difference, but when a human or animal passes by, the first signal changes and is compared with the other signal, which results in a temperature difference. This means something has triggered the PIR, which then sends a high output [4].

Figure 2.1: How PIR works. [4].

Gas sensor

There are two main types of gas sensor, optical and ionizing. The optical sensor has a photocell and a light source. The photocell does not receive any light when there is no smoke. When smoke comes into the sensor, some of the light is reflected into the photocell, which triggers the sensor.

The ionizing gas sensor uses a radioactive substance whose radiation ionizes the air in the sensor, which creates a small electric current. When smoke comes into the sensor, the smoke particles neutralize the ions, which lowers the electric current and triggers the sensor [5].

Light sensor

The light sensor uses a semiconducting material called cadmium sulphide. When the light level (lux) is high the electrical resistance is low, and when the light level is low the resistance can rise to thousands of ohms, which changes the output voltage [6].

2.2 Fan

The fan uses a brushless motor that is silent, reliable, power efficient and has RPM feedback. PC fans can have 2, 3 or 4 wires/pins. Fans with 2 pins have a red wire for the positive supply voltage and a black wire for ground. These spin at full speed when the supply voltage is connected.

Fans with 3 pins have one extra wire, yellow, that outputs the RPM using a tachometer. Two output pulses are generated per revolution.

Fans with 4 pins have an additional input for a PWM signal, which controls the RPM [7].
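The tachometer relation above (two pulses per revolution) can be turned into an RPM value by counting pulses over a known time window. The following is a minimal sketch, not the code used in the project; the function name and the idea of a fixed counting window are assumptions made for illustration.

```python
def rpm_from_pulses(pulse_count, window_s, pulses_per_rev=2):
    """Convert tachometer pulses counted during window_s seconds to RPM.

    pulses_per_rev=2 follows the fan description in the text; the counting
    window is a hypothetical parameter chosen for this sketch.
    """
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    revolutions = pulse_count / pulses_per_rev
    return revolutions * 60.0 / window_s


# 40 pulses in one second are 20 revolutions per second.
print(rpm_from_pulses(40, 1.0))
```

A longer window gives a steadier reading at the cost of a slower response to speed changes.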

PWM (Pulse-Width Modulation)

The PWM technique switches the output voltage between maximum and zero (ground) thousands of times per second. Because the voltage is switched on and off repeatedly, the average output voltage depends on the duty cycle. The duty cycle in turn depends on the period and the pulse width, where d is the duty cycle, T is the period and p is the pulse width. The pulse width is the time during which the signal is at maximum [8]. See Figure 2.2.

$$d = \frac{p}{T} \tag{2.1}$$

Figure 2.2: Different PWM cycles. [8].
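As a worked example of equation 2.1, the sketch below computes the duty cycle, the resulting average voltage for an ideal square wave, and the corresponding 8-bit output value. The function names are hypothetical, and the 0 to 255 output range is taken from the MCU output range mentioned later in the report.

```python
def duty_cycle(pulse_width, period):
    """d = p / T, with both arguments in the same time unit."""
    if period <= 0:
        raise ValueError("period must be positive")
    if not 0 <= pulse_width <= period:
        raise ValueError("pulse_width must lie between 0 and period")
    return pulse_width / period


def average_voltage(d, v_max):
    """Average output of an ideal PWM signal switching between 0 and v_max."""
    return d * v_max


def pwm_output_value(d, resolution=255):
    """Map a duty cycle in [0, 1] to an integer output value (assumed 8-bit)."""
    return round(d * resolution)


d = duty_cycle(0.5, 2.0)          # pulse width 0.5 ms, period 2 ms
print(d, average_voltage(d, 12.0), pwm_output_value(d))
```

With a quarter duty cycle, a 12 V supply averages to 3 V, which is why lowering the duty cycle slows the fan or dims the LEDs.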


2.3 Voice recognition

Microphone

A microphone is a type of transducer: it converts sound energy to electrical energy. In the microphone there is a magnet surrounded by a coil and a diaphragm; the magnet creates a magnetic field. When we speak, the diaphragm moves back and forth as the sound waves reach it. When the diaphragm moves, the coil moves, which creates an electrical current that is our sound signal [9]. The sound signal is then boosted using an amplifier.

FFT

FFT is the Fast Fourier Transform. This algorithm is used to compute the Fourier transform of a discrete sequence. One purpose of this algorithm is to transform a sound signal in the time domain into the same signal in the frequency domain.

$$F(t) \rightarrow F(f) \tag{2.2}$$

F(t) is the function in time whereas F(f) is the same function in frequency. In the discrete case the Fourier transform is defined as

$$F_k = \sum_{n=0}^{N-1} F_n \, e^{-i 2\pi k n / N}, \qquad k = 0, \ldots, N-1, \tag{2.3}$$

where F_k is the function at frequency component k, N is the number of samples in the time domain that get transformed, n is the current sample considered in the sum, F_n is the function in the time domain at sample n, e is the mathematical constant, and i is the imaginary unit.
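Equation 2.3 can be evaluated directly, term by term. The sketch below does exactly that in Python; it is a plain discrete Fourier transform written for readability, not the FFT library used in the project. An FFT produces the same values but needs on the order of N log N operations instead of N squared.

```python
import cmath
import math


def dft(samples):
    """Direct evaluation of F_k = sum_n F_n * exp(-i*2*pi*k*n/N)."""
    n_total = len(samples)
    return [
        sum(
            samples[n] * cmath.exp(-2j * cmath.pi * k * n / n_total)
            for n in range(n_total)
        )
        for k in range(n_total)
    ]


# A cosine with exactly one cycle over 8 samples puts its energy in bins 1 and 7.
signal = [math.cos(2 * math.pi * n / 8) for n in range(8)]
magnitudes = [abs(value) for value in dft(signal)]
print([round(m, 6) for m in magnitudes])
```

The magnitude of each F_k is what a frequency-based comparison works with; the phase is discarded when only magnitudes are kept.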

Formants

Formants are distinctive components, in the frequency spectrum, of a sound produced by speech or singing. Different vowels have peaks at different frequencies. This information is what humans use to distinguish different vowels. For all vowels, the first formant lies between 200 and 900 Hz while the second formant lies between 600 and 2500 Hz. Generally only the first and second formants are needed to distinguish the vowel [10].


3 | Demonstration

3.1 Problem formulation

To make a prototype in this thesis project, a few problems needed to be solved. The voice recognition program needs to be able to record a word, link it with a command, store it, and compare a stored word to whatever word is being spoken by a user at any moment.

Another problem to be solved is that the appliances and sensors are connected to the same device as the voice recognition. The sensors need to be able to read values at the same time as the voice recognition actively listens to what is being said.

3.2 Software

Voice recognition

To use the voice recognition, the user first needs to set it up. In the setup state the user speaks the word that is associated with a home appliance; the word is then linked to the home appliance command and stored in memory. When all the commands have been recorded, the voice recognition goes into the second state, where the user can use it to change the settings of home appliances.

The voice recognition has an algorithm for recording words. This algorithm takes in a sound signal from the microphone. The signal is transformed with the FFT. The FFT used is an open software algorithm [11]. In one second the signal is transformed many times, so an averaging part is needed in the algorithm. This means that one complete word is stored as the signal in the frequency domain, averaged over the duration of the word.
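The averaging step can be sketched as a running element-wise mean over the spectra produced while a word is spoken. This is an illustration only: the class and function names are hypothetical, and keeping running sums instead of storing every frame is an assumption motivated by the limited memory of a microcontroller, not something the report states.

```python
class SpectrumAverager:
    """Running element-wise mean of equally long magnitude spectra."""

    def __init__(self, n_bins):
        self.sums = [0.0] * n_bins
        self.count = 0

    def add(self, spectrum):
        if len(spectrum) != len(self.sums):
            raise ValueError("spectrum length does not match n_bins")
        for k, value in enumerate(spectrum):
            self.sums[k] += value
        self.count += 1

    def result(self):
        if self.count == 0:
            raise ValueError("no spectra have been added")
        return [total / self.count for total in self.sums]


def average_spectra(frames):
    """Average a list of spectra into the single vector stored for a word."""
    if not frames:
        raise ValueError("frames must not be empty")
    averager = SpectrumAverager(len(frames[0]))
    for frame in frames:
        averager.add(frame)
    return averager.result()


print(average_spectra([[0, 10, 20], [10, 20, 40]]))
```

Because the mean is taken over the whole word, the stored vector carries no information about how the spectrum changes over time, which matches the limitation stated in the scope.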


Figure 3.1: Flowchart of recording algorithm. Figure made with www.draw.io.

The flowchart of the recording algorithm is shown in Figure 3.1. This algorithm shouldn't get stuck in long loops, because while it is stuck in a loop the hardware can't perform other tasks. If the sound threshold level is not reached the algorithm ends, as seen in the flowchart. This means that the algorithm is called continuously while the voice recognition is supposed to listen for words.
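The early exit described above can be sketched as a listening step that returns immediately when the sound level is below the threshold, so that the main loop can also service the sensors on every pass. All names here (`listen_step`, `run_cycles`, the callbacks) are hypothetical and only illustrate the control flow, not the flowchart in detail.

```python
def listen_step(level, threshold, record_word):
    """Return a recorded word, or None right away if the level is too low."""
    if level < threshold:
        return None
    return record_word()


def run_cycles(levels, threshold, record_word, read_sensors):
    """Interleave listening and sensor reading, one pass per sound level."""
    events = []
    for level in levels:
        word = listen_step(level, threshold, record_word)
        if word is not None:
            events.append(("word", word))
        events.append(("sensors", read_sensors()))
    return events


events = run_cycles(
    levels=[3, 50, 4],
    threshold=10,
    record_word=lambda: "lamp",
    read_sensors=lambda: 0,
)
print(events)
```

The sensors are read on every pass regardless of whether a word was detected, which is the property the paragraph above asks for.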

The voice recognition also needs an algorithm for comparison. This comparison algorithm takes two recorded signals as inputs and outputs "True" or "False" depending on whether the two signals are assumed to be the same or not. The algorithm evaluates the likeness of the two signals by first checking whether the values at a specific frequency are the same, or close to the same. If the difference is less than 17 (the values range between 0 and 255), the signals are assumed to be the same at that frequency. This check is done for every frequency. If the number of frequencies that are assumed to be the same is more than 88% of the total number of frequencies, the signals are determined to be the same.
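The comparison rule can be written in a few lines. The sketch below uses the two thresholds given in the text (a per-frequency difference below 17, and more than 88% of the frequencies matching); the function name and the use of plain Python lists are assumptions for illustration.

```python
def spectra_match(stored, spoken, max_diff=17, min_fraction=0.88):
    """True if more than min_fraction of the bins differ by less than max_diff."""
    if len(stored) != len(spoken) or not stored:
        raise ValueError("spectra must be non-empty and equally long")
    close_bins = sum(1 for a, b in zip(stored, spoken) if abs(a - b) < max_diff)
    return close_bins > min_fraction * len(stored)


stored = [100] * 100
similar = [100] * 95 + [200] * 5      # 95 of 100 bins are close
different = [100] * 80 + [200] * 20   # only 80 of 100 bins are close
print(spectra_match(stored, similar), spectra_match(stored, different))
```

Raising `max_diff` or lowering `min_fraction` makes the match more permissive, which trades fewer "no match found" outcomes for more incorrect interpretations.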


Figure 3.2: Flowchart of the comparison algorithm. Figure made with www.draw.io.

Kitchen fan

To keep the amount of smoke from cooking low, a fan was used. The fan can be in different states: "Auto", "Full", and "Off".

In the "Auto" mode, the angular speed is controlled by the output of the smoke sensor. This ensures that the fan only works as hard as needed to exhaust the smoke. The data from the smoke sensor needs to be processed because the values range from 0 to 1023 and change very rapidly. The value from the smoke sensor is rounded to the nearest multiple of 80 and then divided by 4. The division is needed because the output from the MCU that controls the fan speed should be between 0 and 255. The PWM signal from the MCU is fed to a transistor, where it controls a larger current that drives the fan.
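The mapping from sensor reading to fan output can be sketched as follows. Rounding to the nearest multiple of 80 and dividing by 4 come from the text; rounding ties upwards and clamping the result to 255 are assumptions added here, since a reading near 1023 rounds to 1040 and would otherwise give 260.

```python
def sensor_to_pwm(reading, step=80, divisor=4, max_output=255):
    """Quantise a 0-1023 sensor reading and scale it to a PWM output value."""
    if not 0 <= reading <= 1023:
        raise ValueError("reading must be between 0 and 1023")
    # Nearest multiple of `step`; exact ties round upwards (assumption).
    quantised = step * ((reading + step // 2) // step)
    # Clamp so the value never exceeds the 0-255 output range (assumption).
    return min(max_output, quantised // divisor)


for reading in (0, 100, 120, 500, 1023):
    print(reading, sensor_to_pwm(reading))
```

With these parameters the output only moves in steps of 20, so small, rapid fluctuations in the sensor reading do not change the fan speed.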

Room lighting

The room lighting can also go into different states. The states are the same as for the kitchen fan, although the auto state is controlled differently.

The "Auto" mode uses a light sensor to make the lights shine at a desirable level. The light sensor outputs values between 0 and 1023. The output value from the light sensor is rounded to the nearest multiple of 80 and then divided by 2. The automatic mode also uses a motion sensor to know whether somebody is in the room, and makes the lights behave accordingly.

3.3 Hardware and Electronics

Table 3.1: Hardware

Hardware               Model
Microcontroller Unit   Arduino Uno [13]
Transistor             BU910 [14]
Light sensor           ADA-1384 [15]
Microphone             MAX4466 [16]
Fan                    A12025-12CB-3BN-F1
Smoke sensor           MQ-2 [17]
Display                BC2004AYPLJBS [18]
Motion sensor          555-28027 [19]
LED

The whole system is centered around the Arduino Uno, which is an MCU. All the electrical parts are connected to it. There is only one of every part except for the transistor, of which there are two, and the LEDs, of which there are five.

Figure 3.3: System diagram. Figure made with www.draw.io.

Figure 3.3 shows how the system is connected.


Microphone

For the voice recognition to work, a microphone is needed. Amplification and a DC bias are also needed. The amplification is needed because the MCU reads values on the analog input between 0 V and 5 V. The DC bias is needed because the analog input does not read negative voltages. The DC bias makes the sound signal from the microphone oscillate around 1.65 V instead of 0 V. The MAX4466 microphone provides all these features, as well as an adjustable gain.
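To illustrate what the DC bias means for the sampled values, the sketch below converts an analog reading to volts and subtracts the 1.65 V bias so that silence maps to roughly zero. It assumes a 10-bit converter with a 5 V reference, which is consistent with the 0 to 1023 sensor range mentioned in the report but is not stated explicitly for the microphone input; the names are hypothetical.

```python
ADC_STEPS = 1024   # assumed 10-bit converter
V_REF = 5.0        # assumed reference voltage in volts
BIAS_V = 1.65      # DC bias from the text


def volts_from_adc(count):
    """Convert a raw reading (0-1023) to volts under the assumptions above."""
    if not 0 <= count < ADC_STEPS:
        raise ValueError("count must be between 0 and 1023")
    return count * V_REF / ADC_STEPS


def centred_sample(count):
    """Signal value relative to the bias; about 0 V when the room is silent."""
    return volts_from_adc(count) - BIAS_V


for count in (0, 338, 676):
    print(count, round(centred_sample(count), 3))
```

Under these assumptions a silent input reads close to 338 counts, and the signal can swing about 1.65 V below and up to 3.35 V above the bias before it clips.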

Kitchen fan

The KF, kitchen fan, is implemented using a computer fan of model A12025-12CB-3BN-F1. It has an operating voltage of 12 V at 0.16 A. The fan is connected to a separate power source since the MCU can't output 12 V. Because the MCU needs to control the fan, a transistor has been added so that the angular speed of the fan can be controlled with a PWM signal from an output of the MCU.

Lighting

The lighting of the room has been implemented by connecting 5 LEDs in parallel. These are also PWM controlled by the MCU. The power source for the LEDs is two 1.5 V AA batteries connected in series, giving a total voltage of 3 V.

Display

In this project a display with 20 columns and 4 rows is used to give the user the information needed to use the voice recognition program and to show what state each home appliance is in. The display requires several cables to the MCU and a supply voltage of 5 V, which is taken from the MCU.

Small scale room

The small scale room has been built out of a foam-core material of 5 mm thickness. The room is in the shape of a cube with a side of 250 mm. It features a floor, a roof, and two adjacent walls. In the corner where there are no walls, a pillar has been added for stability and to hide the motion sensor.


Figure 3.4: Small scale room.

As seen in Figure 3.4, the fan is mounted in the far corner together with the light sensor. The red and black cables on the roof are connected to the LEDs, which are on the other side of the roof together with the gas sensor. The display is visible to the user.


4 | Results

4.1 Result for first research question

These results are only for the research question where the same person records and speaks the words. There are three different rounds.

Round 1

Table 4.1: Round 1 results for first RQ.

Round 1, first RQ

                          Person 1 records and speaks    Person 2 records and speaks
                          Lamp  Lamp(%)   KF  KF(%)      Lamp  Lamp(%)   KF  KF(%)
Interpreted correctly       17       85   11     55        19       95   15     75
Interpreted incorrectly      0        0    6     30         0        0    2     10
No match found               3       15    3     15         1        5    3     15

Round 2

Table 4.2: Round 2 results for first RQ.

Round 2, first RQ

                          Person 1 records and speaks    Person 2 records and speaks
                          Lamp  Lamp(%)   KF  KF(%)      Lamp  Lamp(%)   KF  KF(%)
Interpreted correctly       17       85   18     90        19       95   19     95
Interpreted incorrectly      0        0    2     10         0        0    0      0
No match found               3       15    0      0         1        5    1      5


Round 3

Table 4.3: Round 3 results for first RQ.

Round 3, first RQ

                          Person 1 records and speaks    Person 2 records and speaks
                          Lamp  Lamp(%)   KF  KF(%)      Lamp  Lamp(%)   KF  KF(%)
Interpreted correctly       13       65   20    100        18       90    2     10
Interpreted incorrectly      5       25    0      0         1        5   10     50
No match found               2       10    0      0         1        5    8     40

Combined result

The combined result from the three rounds for the first RQ, for each person.

Figure 4.1: The result for first RQ from three rounds. Figure made with Microsoft Excel.

Figure 4.2: The result for first RQ from three rounds. Figure made with Microsoft Excel.


Figure 4.3: The result for first RQ from three rounds. Figure made with Microsoft Excel.

Figure 4.4: The result for first RQ from three rounds. Figure made with Microsoft Excel.

Total result for first research question

The result combined over all rounds, regardless of which word was spoken or which person recorded and spoke.

Figure 4.5: Total result for first RQ. Figure made with Microsoft Excel.
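The total for the first RQ can be reproduced from the "Interpreted correctly" rows of Tables 4.1 to 4.3, with 20 attempts per word and person in each round. The sketch below is only a check of the arithmetic; the variable and function names are hypothetical.

```python
# "Interpreted correctly" counts per round, in table column order:
# person 1 lamp, person 1 KF, person 2 lamp, person 2 KF.
CORRECT_RQ1 = {
    1: [17, 11, 19, 15],
    2: [17, 18, 19, 19],
    3: [13, 20, 18, 2],
}
ATTEMPTS_PER_CELL = 20


def success_rate(correct_by_round, attempts_per_cell=ATTEMPTS_PER_CELL):
    """Return (correct, attempts, percentage) summed over all rounds."""
    correct = sum(sum(counts) for counts in correct_by_round.values())
    cells = sum(len(counts) for counts in correct_by_round.values())
    attempts = attempts_per_cell * cells
    return correct, attempts, 100.0 * correct / attempts


correct, attempts, percent = success_rate(CORRECT_RQ1)
print(correct, attempts, round(percent, 1))
```

This gives 188 correct interpretations out of 240 attempts, about 78%, in line with the figure quoted in the abstract and the discussion.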


4.2 Result for second research question

These results are only for the research question where one person records the word and another person speaks the word. There are three different rounds.

Round 1

Table 4.4: Round 1 results for second RQ.

Round 1, second RQ

                          Person 1 records and person 2 speaks    Person 2 records and person 1 speaks
                          Lamp  Lamp(%)   KF  KF(%)               Lamp  Lamp(%)   KF  KF(%)
Interpreted correctly        0        0    5     25                 15       75   17     85
Interpreted incorrectly      0        0    1      5                  3       15    1      5
No match found              20      100   14     70                  2       10    2     10

Round 2

Table 4.5: Round 2 results for second RQ.

Round 2, second RQ

                          Person 1 records and person 2 speaks    Person 2 records and person 1 speaks
                          Lamp  Lamp(%)   KF  KF(%)               Lamp  Lamp(%)   KF  KF(%)
Interpreted correctly       17       85   12     60                 15       75   10     50
Interpreted incorrectly      3       15    1      5                  0        0    3     15
No match found               0        0    7     35                  5       25    7     35

Round 3

Table 4.6: Round 3 results for second RQ.

Round 3, second RQ

                          Person 1 records and person 2 speaks    Person 2 records and person 1 speaks
                          Lamp  Lamp(%)   KF  KF(%)               Lamp  Lamp(%)   KF  KF(%)
Interpreted correctly       17       85   18     90                 18       90   11     55
Interpreted incorrectly      0        0    0      0                  0        0    6     30
No match found               3       15    2     10                  2       10    3     15


Combined result

The combined result from the three rounds for the second RQ.

Figure 4.6: The result for second RQ from three rounds. Figure made with Microsoft Excel.

Figure 4.7: The result for second RQ from three rounds. Figure made with Microsoft Excel.

Figure 4.8: The result for second RQ from three rounds. Figure made with Microsoft Excel.

Figure 4.9: The result for second RQ from three rounds. Figure made with Microsoft Excel.

Total result for second research question

The result combined over all rounds, regardless of which word was spoken or which person recorded or spoke.


Figure 4.10: Total result for second research question. Figure made with Microsoft Excel.
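For the second RQ, the "Interpreted correctly" rows of Tables 4.4 to 4.6 can be summed per recording direction as well as overall, again with 20 attempts per word in each round. The sketch below only re-computes numbers already present in the tables; the names are hypothetical.

```python
# "Interpreted correctly" counts over rounds 1-3 (lamp, KF per round).
CORRECT_RQ2 = {
    "person 1 records, person 2 speaks": [0, 5, 17, 12, 17, 18],
    "person 2 records, person 1 speaks": [15, 17, 15, 10, 18, 11],
}


def rate(correct_counts, attempts_per_cell=20):
    """Percentage of correct interpretations for a list of table cells."""
    return 100.0 * sum(correct_counts) / (attempts_per_cell * len(correct_counts))


per_direction = {name: rate(counts) for name, counts in CORRECT_RQ2.items()}
overall = rate([c for counts in CORRECT_RQ2.values() for c in counts])

for name, value in per_direction.items():
    print(name, round(value, 1))
print("overall", round(overall, 1))
```

The overall rate is 155 of 240, about 65%. The two directions differ (57.5% against roughly 71.7%), largely because no match was found in any of the 20 lamp attempts of round 1 when person 1 recorded and person 2 spoke.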


5 | Discussion and conclusions

As seen in the results, the success rate of the voice recognition is 78% when the same person sets up the voice recognition and uses it. Commercial recognition software has success rates of around 90% to 96% [3][12], which is better than what has been achieved in this thesis. Microsoft Cortana has a success rate of around 90%, so the 78% achieved in this thesis is not too far off. The commercial software is of the type that was discussed in the introduction, which is more complex. The voice recognition in this thesis uses only two words, and as more words are added the success rate can be expected to get worse.

When the voice recognition is used by someone who did not record the setup, the success rate gets lower. This is to be expected, because different people say the same word with different accents and tones. This is much less of a problem for commercial voice recognition because of its more complex software. In this regard the voice recognition of this thesis is not as good as the commercial software.

Even though the success rates are lower than the commercial ones, this voice recognition has its perks. It is not as complex as the commercial software and can be used with different languages. The commercial software could not run on the hardware used here.

The system that has been built in this thesis could have some potential for use in the real world, one reason being its simplicity. All the calculations are done locally in the MCU. The MCU does not need to send data to a server for interpretation, and it does not have enough internal storage to store what has been said. Together, these points mean that no one can listen to what has been said in the house.
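The storage argument can be made concrete with some rough arithmetic. The sketch below is only an estimate under stated assumptions that are not given in this chapter: a 16 MHz ATmega328P with 2048 bytes of SRAM, an ADC prescaler of 32 (the value selected by ADCSRA = 0xe5 in the appendix code) and about 13 ADC clock cycles per conversion. The function names are illustrative.

```cpp
#include <cstdint>

// Assumed hardware figures for the Arduino Uno (ATmega328P); approximate.
constexpr std::uint32_t kCpuHz = 16000000UL;       // system clock
constexpr std::uint32_t kAdcPrescaler = 32;        // ADPS bits of ADCSRA = 0xe5
constexpr std::uint32_t kAdcCyclesPerSample = 13;  // free-running conversion
constexpr std::uint32_t kSramBytes = 2048;         // total SRAM

// Approximate ADC sample rate in Hz.
constexpr std::uint32_t sampleRateHz() {
    return kCpuHz / kAdcPrescaler / kAdcCyclesPerSample;
}

// Bytes needed to keep raw audio of the given length.
constexpr std::uint32_t rawAudioBytes(std::uint32_t milliseconds,
                                      std::uint32_t bytesPerSample) {
    return sampleRateHz() * bytesPerSample * milliseconds / 1000;
}

// Bytes needed for the stored word spectra: FFT_N/2 one-byte bins per word.
constexpr std::uint32_t templateBytes(std::uint32_t fftSize,
                                      std::uint32_t words) {
    return fftSize / 2 * words;
}
```

Under these assumptions the sample rate is about 38 kHz, so one second of raw audio at two bytes per sample needs roughly 77 kB, far more than the 2 kB of SRAM, while the two stored word spectra together take only 128 bytes.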

For a better success rate it is not necessary to change the MCU; it is more important to change the voice recognition software. The values in the comparison function involve a trade-off: if they are loosened, different words can be assumed to be the same, and if they are tightened there are many "no match found" results, which makes the success rate lower. A different MCU might be needed, however, if a completely different voice recognition software is used that requires more computational capability.
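The trade-off in the comparison function can be illustrated with a portable version of the same idea. This is a minimal sketch rather than the code that produced the results: it follows the comparison function in Appendix A (count the frequency bins that differ by less than a tolerance, then require a minimum fraction of such bins), but leaves out the level adjustment and exposes the two values as parameters. The name spectraMatch and its parameter names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Returns true if enough bins of the two spectra are close to each other.
// binTolerance plays the role of freq_sensitivity (17 in the appendix) and
// requiredFraction the role of tot_sensitivity (0.88 in the appendix).
bool spectraMatch(const std::uint8_t* a, const std::uint8_t* b,
                  std::size_t firstBin, std::size_t lastBin,
                  int binTolerance, float requiredFraction) {
    std::size_t closeBins = 0;
    for (std::size_t i = firstBin; i < lastBin; ++i) {
        int diff = std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i]));
        if (diff < binTolerance) {
            ++closeBins;
        }
    }
    // As in the appendix code, the count is divided by lastBin and not by
    // the number of bins actually compared (lastBin - firstBin).
    return static_cast<float>(closeBins) / static_cast<float>(lastBin)
           >= requiredFraction;
}
```

With bins 2 to 63 compared, two spectra that differ by 20 in ten bins are rejected at the appendix values (17 and 0.88), but accepted if either the tolerance is raised to 25 or the required fraction is lowered to 0.80. Loosening either value therefore reduces "no match found" results at the cost of accepting more dissimilar sounds.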


6 | Recommendations and future work

Potential future work could focus on making the success rate higher. Testing a higher quality microphone or different comparison algorithms could be beneficial in achieving this. Both are recommended for future work.

As discussed in chapter 5, the voice recognition will not work as well with more words. Future work could focus on making the recognition work better with a larger vocabulary.
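One possible direction for a larger vocabulary is to pick the closest stored word instead of accepting the first one that passes a threshold. The sketch below uses a technique that is not part of this thesis: the nearest template by sum of absolute differences, with a rejection limit on the mean difference per bin. WordTemplate, sumAbsDiff, nearestTemplate and the parameter maxMeanDiff are all hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

constexpr std::size_t kBins = 64;  // FFT_N/2 in the appendix code

// One stored word spectrum.
struct WordTemplate {
    std::uint8_t bins[kBins];
};

// Sum of absolute bin differences over [firstBin, lastBin).
std::uint32_t sumAbsDiff(const std::uint8_t* a, const std::uint8_t* b,
                         std::size_t firstBin, std::size_t lastBin) {
    std::uint32_t sum = 0;
    for (std::size_t i = firstBin; i < lastBin; ++i) {
        sum += static_cast<std::uint32_t>(
            std::abs(static_cast<int>(a[i]) - static_cast<int>(b[i])));
    }
    return sum;
}

// Returns the index of the closest template, or -1 if even the closest one
// differs by more than maxMeanDiff per bin on average.
int nearestTemplate(const std::uint8_t* sound, const WordTemplate* words,
                    std::size_t count, std::size_t firstBin,
                    std::uint32_t maxMeanDiff) {
    int best = -1;
    std::uint32_t bestScore = 0;
    for (std::size_t w = 0; w < count; ++w) {
        std::uint32_t score = sumAbsDiff(sound, words[w].bins, firstBin, kBins);
        if (best < 0 || score < bestScore) {
            best = static_cast<int>(w);
            bestScore = score;
        }
    }
    if (best < 0) {
        return -1;  // no templates stored
    }
    std::uint32_t compared = static_cast<std::uint32_t>(kBins - firstBin);
    return (bestScore <= maxMeanDiff * compared) ? best : -1;
}
```

Each additional word costs 64 bytes of memory in this layout. Whether this selection rule actually improves the success rate would have to be tested in the same way as the results in chapter 4.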

As the voice recognition software doesn't keep track of how the sound of a word changes over time, it is recommended to implement that aspect to give the recognition software a better success rate.
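One way to keep some time information, offered here only as a sketch under assumptions, is to split each word into a few consecutive time segments and keep a separate averaged spectrum per segment, instead of one average over the whole word. The segment count, the fixed number of frames per segment and the names SegmentedTemplate and addFrame are hypothetical and not part of the thesis code. The running mean is the same update used in the record function in Appendix A.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBins = 64;      // FFT_N/2 in the appendix code
constexpr std::size_t kSegments = 4;   // assumed number of time segments

// Averaged spectrum for each time segment of one word.
struct SegmentedTemplate {
    std::uint8_t mean[kSegments][kBins] = {};
    std::uint16_t frames[kSegments] = {};
};

// Fold one FFT frame into the segment that frameIndex falls in.
// Frames beyond the last segment are added to the last segment.
void addFrame(SegmentedTemplate& t, const std::uint8_t* frame,
              std::size_t frameIndex, std::size_t framesPerSegment) {
    std::size_t seg = frameIndex / framesPerSegment;
    if (seg >= kSegments) {
        seg = kSegments - 1;
    }
    std::uint16_t n = ++t.frames[seg];
    for (std::size_t i = 0; i < kBins; ++i) {
        // Running mean: mean += (x - mean) / n, in integer arithmetic.
        int delta = static_cast<int>(frame[i]) - static_cast<int>(t.mean[seg][i]);
        t.mean[seg][i] = static_cast<std::uint8_t>(
            static_cast<int>(t.mean[seg][i]) + delta / static_cast<int>(n));
    }
}
```

With four segments a word takes 256 bytes plus the frame counters, so two words would still fit in the memory of a small MCU. Two such templates could then be compared segment by segment with the existing comparison function.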


Bibliography

[1] Grabianowski, E. (2006). "How Speech Recognition Works." from http://electronics.howstuffworks.com/gadgets/high-tech-gadgets/speech-recognition1.htm viewed 11/5-2017.

[2] Gales, M. and S. Young (2008). The Application of Hidden Markov Models in Speech Recognition. Engineering Department, Cambridge University.

[3] "IBM inches toward human-like accuracy for speech recognition." from https://www.engadget.com/2017/03/10/ibm-speech-recognition-accuracy-record/ viewed 15/5-2017.

[4] "PIR Motion Sensor Tutorial." from http://www.instructables.com/id/PIR-Motion-Sensor-Tutorial/ viewed 23/3-2017.

[5] Jain, V. "Insight - Learn the Working of a Gas Sensor." from https://www.engineersgarage.com/insight/how-gas-sensor-works viewed 24/3-2017.

[6] "Light Sensors." from http://www.electronics-tutorials.ws/io/io_4.html viewed 24/3-2017.

[7] Lazaridis, G. "How PC Fans Work." from http://pcbheaven.com/wikipages/How_PC_Fans_Work/ viewed 23/3-2017.

[8] Hirzel, T. "PWM." from https://www.arduino.cc/en/Tutorial/PWM viewed 24/3-2017.

[9] Woodford, C. (2008/2016). "Microphones." from http://www.explainthatstuff.com/microphones.html viewed 24/3-2017.

[10] "Formants." from https://home.cc.umanitoba.ca/~krussll/phonetics/acoustic/formants.html viewed 15/5-2017.

[11] FFT for arduino from http://wiki.openmusiclabs.com/wiki/ArduinoFFT viewed 16/5-2017.


[12] "Who’s Smartest: Alexa, Siri, and or Google Now?" from https://www.inc.com/kevin-j-ryan/internet-trends-7-most-accurate-word-recognition-platforms.html viewed 15/5-2017.

Datasheets:

[13] Arduino Uno datasheet from http://www.atmel.com/images/Atmel-8271-8-bit-AVR-Microcontroller-ATmega48A-48PA-88A-88PA-168A-168PA-328-328P_datasheet_Complete.pdf viewed 21/5-2017.

[14] Transistor BU910 datasheet from http://pdf1.alldatasheet.com/datasheet-pdf/view/262372/ISC/BU910.html viewed 21/5-2017.

[15] Lightsensor datasheet from https://cdn-shop.adafruit.com/datasheets/GA1A1S202WP_Spec.pdf viewed 21/5-2017.

[16] Microphone datasheet from https://cdn-shop.adafruit.com/datasheets/MAX4465-MAX4469.pdf viewed 21/5-2017.

[17] Gas sensor datasheet from https://raw.githubusercontent.com/SeeedDocument/Grove-Gas_Sensor-MQ2/master/res/MQ-2.pdf viewed 21/5-2017.

[18] Display datasheet from http://www.lawicel.se/datablad/YM2004A.pdf viewed 21/5-2017.

[19] Motion sensor datasheet from https://www.parallax.com/sites/default/files/downloads/555-28027-PIR-Sensor-Product-Guide-v2.3.pdf viewed 21/5-2017.


A | Arduino code

/*
Anton Backman
Eric Cioran

Voice Controlled Home
voice_controlled_home_code_v2

TRITA NUMBER: MMK 2017:09 MDAB 627

19 May 2017
*/

#define LOG_OUT 1 // use the log output function
#define FFT_N 128

#include <FFT.h> // include the FFT library
#include <LiquidCrystal.h>

LiquidCrystal lcd(12, 11, 5, 4, 3, 2);
int pinLed = 9;
int pinFan = 10;
int pinPir = 7;
int lightsensorPin = A1;
int smokesensorPin = A2;
float rawRange = 1024;
uint8_t first_freq = 2; // Discard frequencies below
uint8_t last_freq = FFT_N/2; // Discard frequencies above
uint8_t smoke_detector_calib;
uint8_t calib_mode = 1;
int movement_timer = 0;
int movement_time = 300;

void setup() {
  Serial.begin(9600);

  setPwmFrequency(pinLed, 1);

  TIMSK0 = 0; // turn off timer0 for lower jitter - delay() and millis() killed
  ADCSRA = 0xe5; // set the adc to free running mode
  ADMUX = 0x40; // use adc0
  DIDR0 = 0x01; // turn off the digital input for adc0

  lcd.begin(20, 4);
  pinMode(pinLed, OUTPUT);
  pinMode(pinFan, OUTPUT);
  pinMode(pinPir, INPUT);
}

void signal2fft() {
  /*
   * Input: -
   * Output: -
   * Comment: Uses the signal from a microphone and transforms it into the frequency
   * plane. Data gets stored in vector fft_log_out. Taken from
   * http://wiki.openmusiclabs.com/wiki/Example 2017-05-13, it may be modified from
   * the original.
   */
  cli(); // UDRE interrupt slows this way down on arduino1.0
  for (int i = 0 ; i < 2*FFT_N ; i += 2) { // save FFT_N (128) samples
    while(!(ADCSRA & 0x10)); // wait for adc to be ready
    ADCSRA = 0xf5; // restart adc
    byte m = ADCL; // fetch adc data
    byte j = ADCH;
    int k = (j << 8) | m; // form into an int
    k -= 0x0200; // form into a signed int
    k <<= 6; // form into a 16b signed int
    fft_input[i] = k; // put real data into even bins
    fft_input[i+1] = 0; // set odd bins to 0
  }
  // window data, then reorder, then run, then take output
  fft_window(); // window the data for better frequency response
  fft_reorder(); // reorder the data before doing the fft
  fft_run(); // process the data in the fft
  fft_mag_log(); // take the output of the fft
  sei(); // turn interrupts back on
}


void record(uint8_t recorded_word[FFT_N/2], uint8_t last_state[2]) {
  /*
   * Input: A vector where a sound signal in the frequency plane is stored. The
   * latest state that the home is in.
   * Output: -
   * Comment: Uses the signal2fft function to record spoken words. As long as the
   * sound level is over the threshold this function will average the data in
   * fft_log_out over time and store it in recorded_word.
   */
  // Clear all FFT_N/2 bins. sizeof on an array parameter would only give the
  // size of a pointer, so the length is written out explicitly.
  memset(recorded_word, 0, FFT_N/2);
  int i_2 = 1;
  int threshold = 100;
  int threshold_pos = FFT_N/8;
  int counter_quiettime = 0;
  int quiettime_limit = 100;
  int listen_state = 0;

  while(1) {
    if (listen_state == 0) {
      state_function(last_state);
    }
    signal2fft();
    if (fft_log_out[threshold_pos] >= threshold || listen_state == 1) {
      Serial.println(fft_log_out[threshold_pos]);
      listen_state = 1;
      for (int i = first_freq ; i < last_freq ; i++) {
        // running mean of each frequency bin over the frames heard so far
        recorded_word[i] = recorded_word[i] + (fft_log_out[i] - recorded_word[i])/(i_2);
      }
      i_2++;
      if (fft_log_out[threshold_pos] < threshold) {
        counter_quiettime++;
      }
    }
    if (counter_quiettime >= quiettime_limit) {
      return;
    }
  }
}

int comparison(uint8_t *active_sound, uint8_t *recordedword) {
  /*
   * Input: Two different vectors of equal length.
   * Output: 1 for true or 0 for false.
   * Comment: The two vectors from the input are compared by looking at every
   * individual value in the first vector and comparing it to the same value in the
   * second vector. Each of these values can only differ a little bit. The number of
   * matching values gets counted. The vectors are deemed the same if that count is
   * over a specific value.
   */
  uint8_t freq_sensitivity = 17; // largest allowed difference per frequency
  float tot_sensitivity = 0.88; // required share of matching frequencies
  uint8_t true_counter = 0;
  auto_level(active_sound, recordedword);

  for (int i = first_freq ; i < last_freq ; i++) {
    if (abs(active_sound[i]-recordedword[i]) < freq_sensitivity) {
      true_counter++;
    }
  }

  if ((float)true_counter/(last_freq) >= tot_sensitivity) {
    return 1;
  }
  else {
    true_counter = 0;
    return 0;
  }
}

void auto_level(uint8_t *active_sound, uint8_t *recordedword) {
  /*
   * Input: Two vectors.
   * Output: -
   * Comment: Adjusts the vectors so they are the same at a specific index.
   */
  for (int i = first_freq ; i < last_freq ; i++) {
    active_sound[i] = active_sound[i]+recordedword[FFT_N/8]-active_sound[FFT_N/8];
  }
}

void state_function(uint8_t *state) {
  /*
   * Input: A vector.
   * Output: -
   * Comment: A function that interprets the state vector and sends the data for
   * actuating.
   */
  ADCSRA = 135; // set the adc to one-shot mode

  analogRead(lightsensorPin);
  lights(state[0]);

  analogRead(smokesensorPin);
  kitchen_fan(state[1]);

  analogRead(A0);
  ADCSRA = 0xe5; // set the adc to free running mode
  ADMUX = 0x40; // use adc0
}

void lights(uint8_t setting) {
  /*
   * Input: A value.
   * Output: -
   * Comment: Controls the lights. Three different settings. 0 is the lights being
   * off. 1 is the lights being on full. 2 is the lights being controlled by the
   * light sensor.
   * Note that the PWM value is inverted in this code: 255 is written for off
   * and 0 for full.
   */
  if (setting == 2) {
    if (digitalRead(pinPir) == HIGH) {
      movement_timer = 1; // movement detected, restart the timer
    }
    if (movement_timer < movement_time && movement_timer != 0) {
      int rawLed = analogRead(lightsensorPin);
      int LedOutput = roundingfunc(rawLed,80)/2;
      if (LedOutput > 255) {
        LedOutput = 255;
      }
      Serial.print("LedOutput = ");
      Serial.println(LedOutput);
      analogWrite(pinLed, LedOutput);
      movement_timer += 1;
    }
    else {
      analogWrite(pinLed, 255);
    }
  }
  else if (setting == 1) {
    analogWrite(pinLed, 0);
  }
  else if (setting == 0) {
    analogWrite(pinLed, 255);
  }
}

void kitchen_fan(uint8_t setting) {
  /*
   * Input: A value.
   * Output: -
   * Comment: Controls the kitchen fan. Three different settings. 0 is the kitchen
   * fan being off. 1 is the kitchen fan being on full. 2 is the kitchen fan being
   * controlled by the smoke sensor.
   */
  if (setting == 2) {
    analogWrite(pinFan, 0);
    int rawFan = analogRead(smokesensorPin);
    if (calib_mode == 1) {
      smoke_detector_calib = rawFan;
      calib_mode = 0;
    }
    int FanOutput = (roundingfunc(rawFan,80))/4;
    Serial.print("FanOutput = ");
    Serial.println(FanOutput);

    if (FanOutput > 200) {
      analogWrite(pinFan, 255);
    }
    else if (FanOutput > 120) {
      analogWrite(pinFan, FanOutput);
    }
    else {
      analogWrite(pinFan, 0);
    }
  }
  else if (setting == 1) {
    analogWrite(pinFan, 255);
  }
  else if (setting == 0) {
    analogWrite(pinFan, 0);
  }
}

int roundingfunc(int number, int roundnumber) {
  /*
   * Input: A number to be rounded. The number it should be rounded to.
   * Output: The rounded number.
   * Comment: A function that rounds a number to the closest multiple of
   * roundnumber.
   */
  uint8_t i = 0;
  while (number > roundnumber/2) {
    number = number - roundnumber;
    i += 1;
  }
  return i*roundnumber;
}

void setPwmFrequency(int pin, int divisor) {
  /*
   * Input: The pin to change frequency on. Divisor which changes which frequency.
   * Output: -
   * Comment: Sets the PWM frequency of the pin. To make the computer fan silent.
   * Taken from http://playground.arduino.cc/Code/PwmFrequency 2017-05-13.
   */
  byte mode;
  if(pin == 5 || pin == 6 || pin == 9 || pin == 10) {
    switch(divisor) {
      case 1: mode = 0x01; break;
      case 8: mode = 0x02; break;
      case 64: mode = 0x03; break;
      case 256: mode = 0x04; break;
      case 1024: mode = 0x05; break;
      default: return;
    }
    if(pin == 5 || pin == 6) {
      TCCR0B = TCCR0B & 0b11111000 | mode;
    } else {
      TCCR1B = TCCR1B & 0b11111000 | mode;
    }
  } else if(pin == 3 || pin == 11) {
    switch(divisor) {
      case 1: mode = 0x01; break;
      case 8: mode = 0x02; break;
      case 32: mode = 0x03; break;
      case 64: mode = 0x04; break;
      case 128: mode = 0x05; break;
      case 256: mode = 0x06; break;
      case 1024: mode = 0x07; break;
      default: return;
    }
    TCCR2B = TCCR2B & 0b11111000 | mode;
  }
}

void loop() {
  /*
   * Comment: The main function. At first is the setup of the voice control. After
   * the setup is done the program goes into an infinite loop which is when the
   * program listens to commands from the user. The sensors which control the lights
   * and the kitchen fan are also active here.
   */
  analogWrite(pinLed, 255);
  analogWrite(pinFan, 0);
  uint8_t recordedword10[FFT_N/2];
  uint8_t recordedword20[FFT_N/2];
  uint8_t active_sound[FFT_N/2];
  uint8_t state[2] = {0,0};
  uint8_t last_state[2] = {0,0};

  // timer0 is turned off in setup(), so delay() does not count real milliseconds
  lcd.setCursor(0, 0);
  lcd.print("Welcome to the voice");
  lcd.setCursor(0,1);
  lcd.print("control project!");
  delay(500000);
  lcd.clear();

  lcd.print("Please follow these");
  lcd.setCursor(0,1);
  lcd.print("instructions:");
  delay(500000);
  lcd.clear();

  lcd.print("Say: Lamp");
  record(recordedword10,last_state);
  //print_vec(recordedword10);
  lcd.setCursor(0,1);
  lcd.print("Wait 2 seconds...");
  delay(400000); //About 2 sec
  lcd.clear();

  lcd.print("Say: Kitchen fan");
  record(recordedword20,last_state);
  //print_vec(recordedword20);
  lcd.setCursor(0,1);
  lcd.print("Wait 2 seconds...");
  delay(400000); //About 2 sec
  lcd.clear();

  state_function(last_state);
  last_state[1] = 0;
  state_function(last_state);
  while(1){
    lcd.clear();
    lcd.setCursor(0,0);
    if (state[0] == 0) {
      lcd.print("Lamp: Off");
    }
    else if (state[0] == 1) {
      lcd.print("Lamp: Full");
    }
    else if (state[0] == 2) {
      lcd.print("Lamp: Auto");
    }

    lcd.setCursor(0,1);
    if (state[1] == 0) {
      lcd.print("Fan: Off");
    }
    else if (state[1] == 1) {
      lcd.print("Fan: Full");
    }
    else if (state[1] == 2) {
      lcd.print("Fan: Auto");
    }

    lcd.setCursor(0,2);
    lcd.print("Active comparison");

    record(active_sound,state);
    lcd.setCursor(0,3);

    if (comparison(active_sound, recordedword10) == 1) {
      last_state[0] = state[0];
      lcd.print("Changed lamp state");
      if (state[0] == 2) {
        state[0] = 0;
      }
      else {
        state[0] += 1;
      }
    }
    else if (comparison(active_sound, recordedword20) == 1) {
      last_state[1] = state[1];
      lcd.print("Changed fan state");

      if (state[1] == 2) {
        state[1] = 0;
      }
      else {
        state[1] += 1;
      }
    }
    else {
      lcd.print("Couldn't understand!");
    }
    state_function(state);
  }
}
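A remark on auto_level in the listing above: the reference index FFT_N/8 lies inside the loop range, so once the loop reaches that index the difference between the two vectors at the reference becomes zero and the higher indices are left unchanged. If the intention is to shift every frequency by the same amount (an assumption, since the thesis does not state it), the offset can be computed once before the loop. The sketch below shows that variant in portable C++. It is not the code that produced the results in chapter 4, the name levelToReference is hypothetical, and the clamping to 0..255 is an addition that the original does not have.

```cpp
#include <cstddef>
#include <cstdint>

// Shift every bin of `active` in [firstBin, lastBin) by the same offset so
// that it equals `recorded` at refBin. Values are clamped to 0..255.
void levelToReference(std::uint8_t* active, const std::uint8_t* recorded,
                      std::size_t firstBin, std::size_t lastBin,
                      std::size_t refBin) {
    // Compute the offset once, before any bin (including refBin) is changed.
    const int offset = static_cast<int>(recorded[refBin]) -
                       static_cast<int>(active[refBin]);
    for (std::size_t i = firstBin; i < lastBin; ++i) {
        int v = static_cast<int>(active[i]) + offset;
        if (v < 0) {
            v = 0;
        }
        if (v > 255) {
            v = 255;
        }
        active[i] = static_cast<std::uint8_t>(v);
    }
}
```

With the values from the listing, the call would be levelToReference(active_sound, recordedword, first_freq, last_freq, FFT_N/8). Since this changes what the comparison sees, the values 17 and 0.88 would probably need to be tuned again.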


TRITA MMK 2017:09 MDAB 627

www.kth.se