arai thesis

6
1 University of Aizu, Graduation Thesis. March, 2013 s1170002 Abstract Technical illustrations are important for understanding objects in space. Technical illustrations are used to communicate information of technical nature. This paper demonstrates that illustrations that show a performer’s point of view will be easier to understand with body position as a constant factor. Specifically, it is hard to understand movement of body positions described as an oral statement and in text, leading readers to mentally try and animate the performance. This paper argues that canonical body positions and depth perceptions across the fields of display are easier to comprehend. The experiment, as reported in this thesis aims to understanding how common people understand images that are shown from different perspectives and camera positions. The study remains inconclusive because no trend with the variety of the body tasks, height and angular combinations could be established. However, results show high levels of proficiency in identifying action-height-angular images when images from top views are matched with images shown from body height. 1 Introduction Mental imagery is an experience and an important aspect of our general understanding of how different objects functions in space without direct visualization. In a complex spatial world, mental imagery can present some complex cases of comprehension involving mental rotation. Mental rotation is the ability to rotate two-dimensional and three- dimensional objects in space but as an internal representation of the mind. It is basically about how the brain moves objects in the physical space in a manner that helps with positional understanding (including structural and functional) of objects in space. Research in psychology [1,2] has provided much literature demonstrating how people develop and customize mental models and perform mental rotation towards performing procedural actions in space. This is where technical illustrations can actually help develop guidelines in a way that might help users perform mental rotations in a predefined or expected sequence. Technical illustration is the use of illustration to visually communicate information of a technical nature. Illustrations should demonstrate visual images that are accurate in terms of dimensions and proportions, and should provide enough visual cues for the reader to understand exactly how any physical task is to be completed in a given 2-d space. Therefore showing body positions in an illustration lead to exact information and help you make an image. Further, depth perception is the visual ability to perceive the world in three dimensions and the distance between and within objects. People are better at judging distances directly across the display plane. Visual information becomes necessary for any physical action when it comes to learning a motor skill by observing it. Visual information is important for experiencing how a physical task needs to be completed. Illustrations are important for novice learners at an early stage of learning when it is hard to understand external physical movements, action sequences and patterns of movement, one that someone has not yet experienced directly and repeatedly. Technical illustrations might help comprehend the exact style of movement, pressure points, actions and reactions; etc could be initially understood using a manual. Mental rotation can be separated into the following cognitive stages [3]: 1 Create a mental image of an object 2 Rotate the object mentally until a comparison can be made 3 Make the comparison 4 Decide if the objects are the same or not 5 Report the decision The experiment as reported in this article has been designed to differentiate between these cognitive stages and then recognize three-dimensional movements shown in a 2-d environment. Interestingly, research in technical communication related to the aspect of understanding task-based body movements and ability for mental rotation could provide an interesting perspective into the comprehensive knowledge of how physical objects in Perception of Objects in Technical Illustrations: A Challenge in Technical Communication Yu Arai s1170002 Supervised by Prof. Debopriyo Roy

Upload: debopriyo-roy

Post on 19-May-2015

134 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: Arai thesis

1

University of Aizu, Graduation Thesis. March, 2013 s1170002

Abstract Technical illustrations are important for

understanding objects in space. Technical illustrations are used to communicate information of technical nature. This paper demonstrates that illustrations that show a performer’s point of view will be easier to understand with body position as a constant factor. Specifically, it is hard to understand movement of body positions described as an oral statement and in text, leading readers to mentally try and animate the performance. This paper argues that canonical body positions and depth perceptions across the fields of display are easier to comprehend.

The experiment, as reported in this thesis aims to understanding how common people understand images that are shown from different perspectives and camera positions. The study remains inconclusive because no trend with the variety of the body tasks, height and angular combinations could be established. However, results show high levels of proficiency in identifying action-height-angular images when images from top views are matched with images shown from body height.

1 Introduction Mental imagery is an experience and an important

aspect of our general understanding of how different objects functions in space without direct visualization. In a complex spatial world, mental imagery can present some complex cases of comprehension involving mental rotation. Mental rotation is the ability to rotate two-dimensional and three-dimensional objects in space but as an internal representation of the mind. It is basically about how the brain moves objects in the physical space in a manner that helps with positional understanding (including structural and functional) of objects in space. Research in psychology [1,2] has provided much literature demonstrating how people develop and customize mental models and perform mental rotation towards performing procedural actions in space. This is where technical illustrations can actually help develop guidelines in a way that might

help users perform mental rotations in a predefined or expected sequence.

Technical illustration is the use of illustration to visually communicate information of a technical nature. Illustrations should demonstrate visual images that are accurate in terms of dimensions and proportions, and should provide enough visual cues for the reader to understand exactly how any physical task is to be completed in a given 2-d space. Therefore showing body positions in an illustration lead to exact information and help you make an image. Further, depth perception is the visual ability to perceive the world in three dimensions and the distance between and within objects. People are better at judging distances directly across the display plane.

Visual information becomes necessary for any physical action when it comes to learning a motor skill by observing it. Visual information is important for experiencing how a physical task needs to be completed. Illustrations are important for novice learners at an early stage of learning when it is hard to understand external physical movements, action sequences and patterns of movement, one that someone has not yet experienced directly and repeatedly. Technical illustrations might help comprehend the exact style of movement, pressure points, actions and reactions; etc could be initially understood using a manual.

Mental rotation can be separated into the following cognitive stages [3]:

1 Create a mental image of an object 2 Rotate the object mentally until a

comparison can be made 3 Make the comparison 4 Decide if the objects are the same or not 5 Report the decision

The experiment as reported in this article has been designed to differentiate between these cognitive stages and then recognize three-dimensional movements shown in a 2-d environment.

Interestingly, research in technical communication related to the aspect of understanding task-based body movements and ability for mental rotation could provide an interesting perspective into the comprehensive knowledge of how physical objects in

Perception of Objects in Technical Illustrations: A Challenge in Technical Communication Yu Arai s1170002 Supervised by Prof. Debopriyo Roy

Page 2: Arai thesis

2

University of Aizu, Graduation Thesis. March, 2013 s1170002

space could be shown for various tasks. This specific area of applied research in technical communication is still at its infancy and plenty of questions still remain unanswered. This research aims at contributing to the field by testing a set of hypotheses that could be directly used towards developing design guidelines for static technical illustrations.

The specific research questions for the experiment as reported in this study include the following:

o For a two-dimensional illustration being shown for a task completed in a three-dimensional space, would the readers prefer an object centered view or a performer-centered view?

o For a two-dimensional illustration being shown for a task completed in a three-dimensional space, would the readers prefer objects being shown with maximum viewpoints across the display plane or into the display plane?

o Should the primary object related to the task be shown below the camera position or at directly horizontal to the camera position?

o Is there any significant difference in the efficiency with which tasks are understood, based on type of task, height at which the task takes place with relation to the human body, and visual angles / perspectives, or some combination of all of the these factors?

2 Literature Review Mental imagery is a unique phenomenon in

cognitive psychology and is considered an inner experience that plays an important role for memory and thinking. Mental imagery has always been a central character in the research related to classical and modern philosophy. An important challenge and the central focus for this discussion are centered on how the human mind processes mental images.

Literature on mental rotation suggests mental rotation as the brain’s ability and way of moving objects in order to understand what they are and where they belong [3, 4, and 5]. Mental rotation refers to how the mind recognizes objects and its positions in space. Researchers call these objects stimuli. A stimulus then would be any object or image seen in the person’s environment that has been altered in some way. Mental rotation then takes place for the person to figure out what the altered object is.

Why is it that different people understand different physical procedures in a 2-D environment with differing ability? Besides issues related to design

efficiency, the answer probably lies in differing learning processes which emphasize visual, auditory, and kinesthetic systems of experience and preferences in learning styles [6]. One possible response could be thought of in terms of an individual’s ability for mental rotation towards comprehending any physical action. But the ability for mental rotation is also dependent on what people see and cannot see in the presented scene.

2.1 Display Plane Dynamics From a design perspective, people will perform

differently with objects or its angles when shown into versus across the display plane [7]. For objects into the display plane, there will be a question of how well readers can judge the distance between angles and object areas. This is because the vision will be obscured by bodies or objects on the line of sight resulting in lack of judgment related to depth cues. On the contrary, objects shown across the display plane show the maximum number of objects and visual angles across the line of sight making it easier to judge objects, and the need for judging depth cues are reduced. However, what needs to be shown depends on the task and the priority.

What does this literature mean for a technical communicator and an illustrator who wants to draw objects for explaining a physical procedure in a 3-d print or online environment? The first question that could be explored in this context is to ask how technical illustrators should demonstrate physical orientation to explain procedures. Another important question for technical communicators would be to understand the characteristics of the display plane for visualizing procedures.

2.2 Object-Body-Centered Perspective From a readers’ perspective, imagining someone

doing a physical action has more positive influence on physical task completion, when compared to no mental practice [8]. The important question about mental practice is related to whether the mental practice should be based on the spectator’s point of view or the performer’s point of view. There is extensive research done by Krull with the suggestion that graphics for physical tasks need to take into account the needs of users who will carry out actions in a physical environment. Research suggests that graphics need to show tasks from the users' viewpoint, and need to make clear how tools are to be used and the direction in which actions are to be exerted. An illustration with an object-centered point of view positions objects across a display plane [9]. This

Page 3: Arai thesis

3

University of Aizu, Graduation Thesis. March, 2013 s1170002

viewpoint, which could also be called a spectator’s view, allows objects to be placed so as to direct viewers’ attention without obscuring important parts of objects [10].

Psychological research has concurred that canonical views showing two-dimensional representations of physical actions that are held in a three dimensional world are best represented when illustrations are shown with objects in a three-quarter view from slightly below the camera position. Although canonical views (slightly rotated viewpoint to show maximum angles) are always preferred, when it comes to replicating a task, the choice between a spectator's viewpoint (seeing the action as an observer and not as a doer) and object-centered viewpoint (seeing the action as a doer and not as an observer) is rather obscure and more context-driven. However, if the focus is simply to adapt an object-centered perspective, the visual could be shown from the back without much thought going into how the task should be completed. This is important to understand because there are individual differences in the way people prioritize objects in space vis-a-vis the orientation of their bodies in space and with different interpretations of visual information and with different performance levels on the task [11, 12].

Hypotheses: The review of the literature points

towards the following set of hypotheses for the study. • Illustrations showing natural movements based on

actor’s own preference are easier to emulate, and stimulate activation of a strong mental image (Kosslyn et al., 1973, 1998, 1994).

• Canonical views showing three-quarter views are easier to understand.

• Object zones shown into the display plane are harder to comprehend when objects are shown across the display plane.

3 Methods Sample and Context: Forty-one students who are

non-native speakers of English (native Japanese speakers) participated in this study.

3.1 Procedures The experiment aims to understand how common

people understand images and relates them to images shown from different perspectives and camera positions. We asked test subjects to evaluate body images via matching tasks and asked them to rate their confidence in their choices.

3.2 Methods 41 subjects took part in the experiment and each

subject rated 40 image types, divided into two blocks of 20 each. As part of its robust design, the experiment considered two sets of images. For the experiment, we generated images of body positions for two kinds of activities: a man holding a ball and a man throwing a ball. The purpose for using two different types of objects relates to the exploration of whether object types influences how decisions about depth perceptions and display planes and viewpoints (object-centered or performer-centered) are made. This paper only discusses the results generated from the image set related to the man with the ball. The other set (man with bat) has been discussed as part of another paper [13].

Each participant was handed out two different sets as part of an in-class graded assignment, with each set having 20 test sheets. Each participant was first handed out an instruction sheet in Japanese, and they were orally explained in Japanese as to what is expected of them from the experiment.

3.3 Instruments Using a computer program that sustains accurate

three-dimensional relationships among body parts, the experimenter produced variations of viewpoints and body positions. Each position included two heights for each activity: Chest and Waist. The man-holding-the-ball is shown as holding a ball centered in front of his chest or waist with the hands gripping the ball from both sides. The man-throwing-the-ball version shows him throwing a ball with his left hand at the chest or waist height. These action gestures were captured for five positions where the body moves with the camera position remaining constant: Front - 0 degrees (the man holding/throwing the ball and facing the camera head on), 1/3 Side - 30 degrees, Side - 90 degrees, 1/3 Back - 120 degrees, Back - 180 degrees. For all these images, the camera was positioned slightly above the waist height.

Each set had five images and there were 4 sets in total. Every set was rotated in five angles as mentioned before. The first set showed a man holding a ball at chest height; the second set showed throwing at chest height. Two other sets showed a man holding and throwing a ball at waist height respectively.

Once these images were generated, the camera was then positioned to capture images from the top for the above-mentioned sets. A matching top image was generated for each image generated from the sets above, with a displacement along the y and z-axis to

Page 4: Arai thesis

4

University of Aizu, Graduation Thesis. March, 2013 s1170002

position the camera exactly on top of the head. Each of the images generated for the 4 sets were tested to see whether readers could identify the same when shown from the top. Each test sheet had an image from the above sets, with three top views out of which only one top view correctly represents the view shown from slightly over the waist height. Each test sheet had three questions and question 1 and 3 were answered using a Likert scale. 1. Identify the most appropriate picture shown

from the top that matches the picture shown from the waist height. (Three options provided).

2. Which illustration shown from the top stands the second best?

3. How confident are you about your response?

3.4 Data Analysis During data sorting, each test sheet for the given

task was reported to be accurately identified and given a score of either 1 or 0. The data was entered in SPSS statistical software for data analysis. If a person did not answer a question (test sheet), he/she was assigned a grade of 0 for the task. For the questions on confidence, a score in the range of 1-5 was reported for each test sheet. A score of 0 was assigned when the question on confidence was observed as 0 (not answered). The answers were divided into four different blocks of information.

1 Man Holding Ball - Chest Height (Front, 1/3rd Side, Side, 1/3rd Back, Back)

2 Man Holding Ball - Waist Height (Same) 3 Man Throwing Ball - Chest Height (Same) 4 Man Throwing Ball - Waist Height (Same)

4 Findings Overall findings suggest that readers across the

different body position types, rotations and heights used, did quite well with mean values indicating 85 ~ 90% accuracy.

Table1 shows the following accuracy types with

each combination having 5 images: Mean Accurate Responses for the following:

• 5 Angular Rotations Man with Ball - Holding - Chest Height

• 5 Angular Rotations for Man with Ball - Holding - Waist Height

• 5 Angular Rotations for Man with Ball - Throwing - Chest Height

• 5 Angular Rotations for Man with Ball - Throwing - Waist Height

Table 1: Descriptive Statistics for Man with Ball

Mean Std. Deviation

Frequency

0 1

Holding Chest Front .85 .358 6 35

Cochran's Q = 17.968

Holding Chest 1/3side .90 .300 4 37

Holding Chest Side .85 .358 6 35

Holding Chest 1/3back .90 .300 4 37

Holding Chest Back .90 .300 4 37

Throwing Chest Front .93 .264 3 38

Throwing Chest 1/3side .93 .264 3 38

Throwing Chest Side .93 .264 3 38

df = 19

Throwing Chest 1/3back .83 .381 7 34

Throwing Chest Back .88 .331 5 36

Holding Waist Front .83 .381 7 34

Holding Waist 1/3side .88 .331 5 36

Holding Waist Side .88 .331 5 36

Holding Waist 1/3back .88 .331 5 36

Asymp. Sig. = .525

Holding Waist Back .93 .264 3 38

Throwing Waist Front .90 .300 4 37

Throwing Waist 1/3side .88 .331 5 36

Throwing Waist Side .83 .381 7 34

Throwing Waist 1/3back .90 .300 4 37

Throwing Waist Back .93 .264 3 38

Data shows that there is not much difference

between the mean values for any position/height combination type. The overall average mean values shows 90% accuracy with which readers could identify any given illustration at the body height and match it with the top view.

The frequency chart further corroborates the mean accuracy results by showing that in all cases participants with accurate responses count up to between 34~38, while participants who answered wrongly measured up to between 3~7.

Further, statistically I wanted to explore if the similarity in mean accuracy between body positions also indicates that there is an insignificant difference between all the 20 body positions combined.

I then performed a non-parametric statistical analysis called the Cochran Test. This test enables us to identify whether there is a statistically significant difference between all the treatments carried out.

The Cochran test is performed for conditions where the values are binary. In this experiment, the accuracy

Page 5: Arai thesis

5

University of Aizu, Graduation Thesis. March, 2013 s1170002

values for each body position were recorded as 1 = correct, and 0 = incorrect. This treatment helps us understand if the “k” treatments have identical effects.

Overall, Cochran’s Q Value for the 5 Angular Rotations, Height and Ball Action shows Q = 17.968, with Asymp. Sig = .525. This sig. value goes on to show that p > .05, suggesting an insignificant difference between the results.

I then wanted to test if there is any significant difference between responses / accuracy scores in the matching task, between the illustrations rotating at 5 angles showing holding a ball at chest height.

The Confidence data shows a mean score around 3.68 ~ 3.80 in a 1 ~ 5 scale. The highest mean confidence score is reported for three of the five positions, reportedly front, 1/3rd side and 1/3rd back positions at 3.80. We see a touch lower mean score for 1/3rd side and back positions, but the difference might not be significant. Data shows that for the 5 rotations of throwing chest positions, there is relatively lower level of confidence for the throwing chest positions in the range of 3.53 ~ 3.85. We see relatively lower confidence levels for the front position and 1/3rd side positions, but the mean confidence does pick up for the side and back positions. The highest reported confidence is seen for the 1/3rd back position at 3.85.

Data shows high mean confidence levels for the holding waist positions. High mean confidence scores are recorded for front, 1/3rd side and 1/3rd back positions at over 3.90 while we see a relatively lower levels of confidence for side and back positions at 3.80 and 3.73 respectively.

Show a relatively high level of confidence for all the five positions for throwing –waist in the range of 3.78~3.95. We see a comparatively high confidence score for the front position at 3.95 when compared to other front positions for action-height combinations.

A non-parametric analysis with the Friedman test for the confidence levels for the 20 different body – height – action combinations was considered to judge if any significant difference exists between the confidence levels when considered together. Results showed a Chi-square value of 25.172 with an Asymp. Sig. value = .155 > .05. This goes on to show that there is no significant difference between the confidence levels as reported for 20 different illustrations.

Further, we tested the vicariate correlation between the mean accuracy scores and confidence levels. Data shows a few significant correlations, although in most cases no significant correlations were observed.

The data shows the significant correlation values.

This data is not conclusive and indicative of any pattern, but there are some indications that non-canonical viewpoints show a strong correlation between actual accuracy scores and confidence.

5 Discussion Data in this specific application context clearly

negates the null hypotheses that canonical views showing multiple viewpoints and maximum angles of visibility have a distinct advantage when compared to full front or back views where the viewpoints are expected to be obscured from direct vision. This specific context where readers had time to work through the matching tasks, clearly demonstrated that side, 1/3rd back, 1/3rd side views did not have any distinct and statistically significant advantage when compared to other frontal or back views. This study is different from previous studies because it allowed readers time to explore the task, whereas earlier studies conducted by myself were performed in a context with lesser stake and more random set of participants demonstrating the fact that canonical views are more beneficial when decisions are needed to made quickly and without reconsideration and more intricate thoughts going in to the process. Further, data do not support and demonstrate any significant difference and lowered mean accuracy and confidence levels when objects are shown in the display plane versus across the display plane. Finally, we did not specifically test what is a “natural movement” in the selected sample, but data on first look did not show any evidence in support of the hypotheses. However, unless “natural movement” is defined clearly and tested as two specific groups, testing this hypothesis is probably beyond the scope of the analysis. This paper has just provided some initial indications of the phenomenon.

This study is interesting because it allows us to see the accuracy with which readers are able to judge positions which are manipulated based on three different factors; reportedly body height, rotation angles, and actions. These three variations when happening at once, presents multiple confounding variables that should be considered towards interpreting the results. The fact that the matching task was conducted for a static camera angle shown from the top for all the combinations, allowed us to see how these body positions, otherwise seen at body height would appear from the top and then perform the matching task accordingly. The purpose behind showing the camera angles from the top was to put body zones/areas into the display plane at the vertical axis (seen from top), thereby intentionally attempting to negate the difference in judgments that could arise

Page 6: Arai thesis

6

University of Aizu, Graduation Thesis. March, 2013 s1170002

when body positions are shown at body height. At body height with such variations in rotation, height and action, more diverse visual cues should be naturally available when compared to top views.

The most interesting finding that could be reported from this study is the fact that there is a large amount of similarity among the mean accuracy scores for different body height – action – rotation combinations. The Cochran’s test goes on to show that there is a statistically insignificant difference between the mean scores for the 20 different combinations. Further, the Cochran’s test within the height-action combinations (5 illustrations/per combination) goes on to show that there is an insignificant difference. One probable reason for such an outcome might be the fact that readers spent some quality time judging the possible matching illustration, as this was a graded assignment for the class and everyone wanted to do well. That probably explains the high levels of accuracy with almost every single matching task. But the difference between accuracy sores, however insignificant still goes on to show that it was the character of the illustration that prompted a response and judgment towards accuracy or inaccuracy. In other words, variations in judgments could also have been caused by the position of the body as it looks from a top view.

6 Conclusion This has been an interesting study because it

allowed us to explore the effects of different body rotations, height and action. This experiment probably has been conducted under very strict conditions where the purpose of the experiment was well explained, the exercise was handled as part of a class assignment, and participants were allowed a lot of time to think through the exercise. It is possible that participants might have referred back and forth between matching tasks towards making judgments and that might have increased their accuracy levels. That makes for an interesting argument as to whether we want participants to make judgments based on first instinct or if readers are allowed to actually make thoughtful decisions towards the response they think appropriate. This should be considered as an exploratory analysis, as a lot more needs to be tested (e.g., different experimental conditions) for these matching tasks with the same set of variables.

7 References [1] Z. W. Pylyshyn, “What the mind's eye tells the

mind's brain: A critique of mental imagery,” Psychological Bulletin, vol. 80, no. 1, pp. 1-24, Jul. 1973.

[2] R. N. Shepherd and J. Metzler, “Mental rotation of three dimensional objects," American Association for the Advancement of Science, Vol. 171, No. 3972, pp. 701-713, 1971.

[3] A. M. Johnson, “Speed of Mental Rotations a Faction of Problem-Solving Strategies,” Perceptual and Motor Skills, vol. 71, no. 3, pp. 803-806, Dec. 1990.

[4] B. Jones and T. Anuza, “Effects of sex, handedness, stimulus and visual field on ”mental rotation”,” Cortex, vol. 18, no. 4, pp. 501-514, Dec. 1982.

[5] C. Hertzog and B. Rypma, “Age difference in components of mental rotation task performance,” Bulletin of the Psychonomic Society, vol. 29, no. 3, pp. 209-212, 1991.

[6] H. Gardner, Frames of Mind: The Theory of Multiple Intelligences. New York: Basic Books, 1983.

[7] R. Krull et al., “User perceptions and point of view in technical illustrations,” Proc. STC’s 50th Annual Conference Proceedings, Dallas, TX, May, 2003, pp. 205-210.

[8] J. A. Verbunt et al., “Mental practice-based rehabilitation training to improve arm function and daily activity performance in stroke patients: a randomized clinical trial,” BMC Neurology, 8:7 doi: 10.1186/1471-2377-8-7 April 2008.

[9] R. Krull, “Comparative Assessment of Document Usability With Writing Quality Measures,” STC Proceedings, 1994.

[10] R. Krull et al., “Canonical Views in Procedural Graphics,” Professional Communication Conference, 2003. IPCC 2003. Proceedings, IEEE International, 2003.

[11] A. D. Milner and M. A. Goodale, The Visual Brain in Action. Oxford University Press, 1995.

[12] J. M. Zacks et al., “Selective disturbance of mental rotation by cortical stimulation,” Neuropsychologia, vol. 41, pp. 1659-1667, 2003.

[13] M. Nozawa, “Efficacy of Technical Illustrations in a Technical Communication Environment,” University of Aizu graduation thesis 2013.