Rock Paper Scissors Alon Biran and Mor Sheffer
Image Processing on Mobile Platforms
IDC
Abstract
Rock-paper-scissors is a hand game usually
played by two people, where players
simultaneously form one of three shapes
with an outstretched hand. Rock beats
scissors, scissors beats paper, and paper
beats rock; if both players throw the same
shape, the game is tied.
The application described here implements
this game with the player competing against
the computer. Using computer vision
algorithms, the computer guesses what the
player will throw before the motion is
finished, and it also detects the final hand
gesture and displays the winner.
Interfaces / UI
The UI consists of three screens: Splash,
Match, and Results.
Match – includes a countdown in the form of
three pink circles. When the countdown
ends, a snapshot is taken and the view
changes to the Results screen.
Results – two gestures are displayed: the
one detected as the user's gesture and the
one the computer selected. The winner is
announced, along with the score count from
previous matches.
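The winner logic on the Results screen follows directly from the rules in the abstract; a minimal sketch (the function and gesture names are ours, not taken from the project's code):

```python
# Decide the winner of a rock-paper-scissors round.
# Gesture names ("rock", "paper", "scissors") and the function name
# are illustrative, not taken from the project's code.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}

def decide_winner(player: str, computer: str) -> str:
    """Return 'player', 'computer', or 'tie'."""
    if player == computer:
        return "tie"
    return "player" if BEATS[player] == computer else "computer"
```

For example, `decide_winner("rock", "scissors")` yields `"player"`.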
Implementation
The implementation consists of three major
parts: Gesture Detection, Gesture Guessing,
and the UI Controller. It was written in both
C++ and Java: the major algorithmic parts
are in native code, while the analysis of the
results, the GUI, the circle countdown, and
the camera control were done in Java.
Gesture Detection
To determine the user's gesture, we
implemented a finger-counting technique:
count the extended fingers and, according to
the count, decide which gesture the user
made (>4 => paper, 2-3 => scissors, 0-1 =>
rock). This was written in native code, which
exposes an interface to the application.
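The count-to-gesture mapping above can be sketched as follows; note that the report leaves a count of exactly 4 unspecified, so treating it as paper here is our assumption:

```python
def gesture_from_finger_count(fingers: int) -> str:
    """Map a detected finger count to a gesture, per the report's rule:
    >4 => paper, 2-3 => scissors, 0-1 => rock.
    A count of exactly 4 is not specified in the report; we treat it
    as paper here (assumption)."""
    if fingers >= 4:
        return "paper"
    if fingers >= 2:
        return "scissors"
    return "rock"
```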
To count fingers, building on the technique
shown in [3], we implemented the following
algorithm:
First, we extract the skin from the image by
converting it to the YCC (YCrCb) color space
and applying a threshold obtained from
experiments; the threshold adapts in order
to cope with different lighting conditions.
After thresholding, erosion and dilation are
applied to fill holes and remove small
unwanted interruptions, as displayed in the
images below.
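The thresholding and morphology steps can be sketched in NumPy as below. The Cr/Cb bounds are common textbook skin-detection values, not the thresholds tuned in the project, and the 3x3 structuring element is an assumption; an actual implementation would use OpenCV's inRange, erode, and dilate:

```python
import numpy as np

def skin_mask(ycc, lo=(0, 133, 77), hi=(255, 173, 127)):
    """Binary skin mask from a YCrCb image (H x W x 3, uint8).
    The Cr/Cb bounds are textbook defaults, NOT the project's tuned values."""
    lo, hi = np.array(lo), np.array(hi)
    return np.all((ycc >= lo) & (ycc <= hi), axis=-1).astype(np.uint8)

def erode(mask, k=3):
    """k x k erosion: a pixel stays 1 only if its whole neighborhood is 1."""
    h, w = mask.shape
    p = k // 2
    padded = np.pad(mask, p, constant_values=0)
    out = np.ones_like(mask)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + h, dx:dx + w]
    return out

def dilate(mask, k=3):
    """k x k dilation: a pixel becomes 1 if any neighbor is 1."""
    h, w = mask.shape
    p = k // 2
    padded = np.pad(mask, p, constant_values=0)
    out = np.zeros_like(mask)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + h, dx:dx + w]
    return out
```

Erosion removes isolated noise pixels; a dilation afterwards restores the hand blob's size and closes small gaps, matching the figure below.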
[Figure: intermediate masks after Threshold, Erosion, and Dilation]
After that, a series of heuristic calculations
is applied. First we compute a bounding box
over the hand as well as a convex hull over
the hand polygon; then we detect convexity
defects against that convex hull. From those
defects we determine how many fingers the
user showed. The finger count is sent from
the native code to the GUI, which converts it
into a gesture.
In the image below, the gray rectangle is the
bounding box of the hand, the blue polygon
is the convex hull of the hand, the gray
polygon is the hand polygon, the blue circle
is the middle of the hand, and the red circles
mark the convexity defects that were found.
(Note that where two fingers were close to
each other, the detected skin merged after
dilation and no defect was found between
them.)
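The defect-counting step can be sketched in pure Python/NumPy as a stand-in for OpenCV's convexHull and convexityDefects. All names here are ours, and counting each contour segment that dips deep below the hull as one defect (with fingers roughly defects + 1) is the usual heuristic, not necessarily the project's exact rule:

```python
import numpy as np

def convex_hull(points):
    """Andrew's monotone-chain convex hull over a list of (x, y) tuples."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def _dist_to_segment(p, a, b):
    """Euclidean distance from point p to segment ab."""
    p, a, b = np.asarray(p, float), np.asarray(a, float), np.asarray(b, float)
    ab = b - a
    t = np.clip(np.dot(p - a, ab) / np.dot(ab, ab), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def count_defects(contour, depth_thresh):
    """Count contour segments that dip deeper than depth_thresh below the hull.
    contour is an ordered list of (x, y) vertices of the hand polygon;
    fingers are then estimated as defects + 1 (common heuristic)."""
    on_hull = set(convex_hull(contour))
    idx = [i for i, p in enumerate(contour) if p in on_hull]
    n = len(contour)
    defects = 0
    for a, b in zip(idx, idx[1:] + [idx[0] + n]):
        between = [contour[i % n] for i in range(a + 1, b)]
        if not between:
            continue
        depth = max(_dist_to_segment(p, contour[a], contour[b % n]) for p in between)
        if depth > depth_thresh:
            defects += 1
    return defects
```

A "W"-shaped toy contour with one deep valley between two fingertips yields one defect, i.e. two fingers under the defects + 1 heuristic.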
Another note worth mentioning: when no
skin is detected at all, that case is reported
as well and an 'unknown' image is shown.
Gesture Guessing
To guess what the user is about to do, we
used the Farneback dense optical flow
method from the OpenCV samples, which
outputs a field of "arrows" pointing along
the flow of differences between two images.
We used that output to count the up, down,
left, and right components of the user's hand
motion, and with some heuristics we could
infer what the user was about to do. For
example, when we saw many downward
arrows, as shown in one of the images
below, we correctly concluded that the user
was going to make the rock gesture. This
algorithm proved slow even after resizing
the input, so the problem was addressed
with computer science and programming
techniques rather than computer vision
ones, as explained in the Controller section.
This algorithm was also implemented in
native C++, and its output was rock, paper,
or scissors.
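The direction-counting heuristic can be sketched as below, taking a dense flow field of the kind OpenCV's calcOpticalFlowFarneback returns (an H x W x 2 array of per-pixel (dx, dy) displacements). Apart from "many downward arrows => rock", which the report states, the direction-to-gesture rules here are our illustrative assumption:

```python
import numpy as np

def predict_gesture(flow, min_mag=1.0):
    """Classify dominant motion direction from a dense flow field
    (H x W x 2 array of per-pixel (dx, dy) displacements).
    The mapping below is an illustrative heuristic: the report only
    states that many downward arrows imply rock."""
    dx, dy = flow[..., 0], flow[..., 1]
    moving = np.hypot(dx, dy) > min_mag        # ignore near-static pixels
    counts = {
        "down":  int(np.sum(moving & (dy > np.abs(dx)))),   # image y grows downward
        "up":    int(np.sum(moving & (-dy > np.abs(dx)))),
        "left":  int(np.sum(moving & (-dx > np.abs(dy)))),
        "right": int(np.sum(moving & (dx > np.abs(dy)))),
    }
    dominant = max(counts, key=counts.get)
    # Hypothetical mapping: fist coming down => rock; sideways sweep =>
    # scissors; anything else => paper.
    if dominant == "down":
        return "rock"
    if dominant in ("left", "right"):
        return "scissors"
    return "paper"
```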
Controller
The algorithms are controlled by the UI,
which takes images and draws the
countdown circles. Since the optical flow
algorithm is slow, we take one image at
approx. 2.2 seconds and another at approx.
1.1 seconds, and spawn a thread to calculate
the differences between them; this lets the
program continue running normally while
waiting for the optical flow result. When the
countdown finishes, another snapshot is
taken and sent to the finger-counting
algorithm, which is fast. When both finish,
the results are sent to the Results screen to
display the winner; the computer's choice is
the gesture that would beat the optical flow
prediction.
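The "spawn a thread" step plus the counter-move rule can be sketched as follows; the helper names are ours, and `predict` stands in for the native optical flow call:

```python
import threading

# The computer throws whatever beats the predicted gesture.
COUNTER = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

def start_flow_worker(frame_before, frame_after, predict):
    """Run the slow optical flow prediction off the UI thread.
    Returns the worker thread and a dict that will hold the computer's
    choice; the countdown keeps running while the worker computes.
    `predict` stands in for the native optical flow call (assumption)."""
    result = {}
    def work():
        result["computer"] = COUNTER[predict(frame_before, frame_after)]
    worker = threading.Thread(target=work)
    worker.start()
    return worker, result
```

When the countdown ends, the UI joins the worker and reads `result["computer"]`.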
Results
The algorithm was run over 30 gestures, and
the skin-detection thresholds were adjusted
accordingly.
After adjusting the thresholds, the skin was
detected as expected, but then we
encountered a new challenge: white light vs.
yellow light. When the picture was taken
under a yellow light, the whole picture was
detected as skin.
After more improvements, at the end of the
process the skin was detected as expected in
over 90% of the cases.
Regarding the optical flow prediction: since
we took only one frame before the gesture
and one frame after it (because of
performance limitations), the quality of the
prediction depends on the user's timing
when making the gesture.
Link to Movie about the project:
https://drive.google.com/file/d/0B3GkZ7Vs9
JewTHA4bXY2cDJRMEk/edit?usp=sharing
References
[1] OpenCV Android Samples. (n.d.). Retrieved from http://opencv.org/platforms/android/opencv4android-samples.html
[2] OpenCV Tutorials. (n.d.). Retrieved from http://docs.opencv.org/doc/tutorials/tutorials.html
[3] Tongo, L. d. (2010). Hand Detection and Finger Counting Example. Retrieved from https://www.youtube.com/watch?v=Fjj9gqTCTfc