The human factors perspective


Conference report

SID is a major international conference covering numerous areas in display technology, human factors and applications. The working theme of the symposium was 'Gateway to the Images of Tomorrow'. Around 250 papers were presented, making a thorough review of the symposium impractical.

This report attempts to deal with papers within my own areas of interest: vision and human factors. These are: display human factors; image quality; virtual environment technology; multimedia and videoconferencing; and perceptual approaches to image coding.

Display human factors

The purpose of displays is to convey information. Several important emerging markets are driving the heightened interest in advanced display technology, including mobile computing and communications, multimedia and virtual reality. Consider mobile communication devices as an example. These require small, low-cost displays with ever increasing quality requirements to support increased functionality. Rapid advances in flat panel display technologies are being made to meet these ever more stringent demands. To design such displays effectively for human observers, the capabilities of users need to be considered at a number of levels. These range from basic sensory and motor characteristics up through increasingly complex processing stages at perceptual and cognitive levels.

This issue is of importance from the design of a character in a text-based system, through the design of a graphical user interface, to the design of multimedia systems. For example, consider the user of a complex graphical display. The user will have to scan the visual display for relevant information, while carrying out some other cognitive tasks. Given what we know about eye movements and human vision, how should the information be spatially positioned on the screen?

Andre and Cashion (USA) showed that there are essentially three areas of the visual field that need to be considered. If the information is presented within about 8° of visual angle (about 7 cm for a viewing distance of 50 cm), performance is just as good as foveal presentation. However, presentation within the 15–48° range is slightly worse, and if head movements are required (for the range 56–89°), response time increases rapidly.
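The geometry behind the angle-to-size conversion is simple enough to check. The sketch below (a hypothetical helper, not from the paper) converts a visual angle to physical size on the screen using size = 2·d·tan(θ/2):

```python
import math

def visual_angle_to_size(angle_deg: float, viewing_distance_cm: float) -> float:
    """Physical size on screen subtending angle_deg at viewing_distance_cm.

    Uses size = 2 * d * tan(angle / 2), the exact relation for a target
    centred on the line of sight.
    """
    half_angle = math.radians(angle_deg) / 2.0
    return 2.0 * viewing_distance_cm * math.tan(half_angle)

# The report's example: 8 degrees at a 50 cm viewing distance is about 7 cm.
print(round(visual_angle_to_size(8, 50), 1))  # -> 7.0
```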

Gan, Kotani and Miyahara (Japan) addressed the issue of how many bits are enough for colour displays. There is a general opinion that 8 bits per gun are sufficient. However, basing their analysis on the ability of users to judge 'just noticeable differences' in colour, the authors showed that displays need at least 10 bits for the red primary, 12 bits for the green primary and 9 bits for the blue primary.
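The link between discriminable colour levels and bit depth is simple arithmetic: a primary that users can resolve into N just-noticeable steps needs at least ceil(log2(N)) bits. The JND counts below are illustrative guesses chosen only to be consistent with the reported bit depths; the paper's measured values are not reproduced in this report:

```python
import math

def bits_required(n_levels: int) -> int:
    """Smallest number of bits able to index n_levels distinct values."""
    return math.ceil(math.log2(n_levels))

# Hypothetical JND counts per primary (illustrative only):
jnd_levels = {"red": 600, "green": 2500, "blue": 400}
for gun, levels in jnd_levels.items():
    print(gun, bits_required(levels))  # red 10, green 12, blue 9
```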

Fiske and Silverstein (USA) showed how a model of human vision could be used to create a colour simulation tool. The simulation tool runs on a CRT and can be used to predict the way an LCD panel would look. This provides a valuable method of assessing the colour appearance of prototype displays at an early stage in the design process.

Kaufmann and McFadden (Canada) carried out an evaluation of colours used on an electronic navigation chart. For some time, commentators have suggested that inappropriate choice of background colour could cause changes in the appearance of foreground colours, because of chromatic contrast. In this paper, the authors show that the perceived colour of foreground elements changes dramatically when viewed on different pastel backgrounds. For example, under daylight viewing conditions, pink is reported as magenta (on a light blue background) and pale green is reported as light blue (on a pale yellow background). Note that these background colours are consistent with current design guidelines, which invariably recommend pastel shades in preference to highly saturated colours. This work emphasizes the importance and benefits of considering human vision at an early stage in the design process.

Image quality

Image quality has always been of importance to users of visual displays, and with the introduction of the EC Directive on Display Screen Equipment earlier this year, employers now have a legal obligation to ensure that the image quality of displays meets certain minimum requirements. Additionally, the demands of multimedia have placed image quality close to the top of any developer's agenda. How do you measure image quality? What is it in the image that people use when making judgements about image quality?

Cappels described a new metric for evaluating the spatial non-uniformity inherent in colour display screens. Using the ΔE*uv metric, the author showed that it relates directly to user perception of screen non-uniformity. Additionally, the variation of ΔE*uv over the screen surface (ΔE*uv/m) appears to be a good indicator of the screen's spatial non-uniformity.
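ΔE*uv is the CIE 1976 colour difference: the Euclidean distance between two points in CIELUV space. A minimal sketch, assuming the screen measurements are already expressed as (L*, u*, v*) triples:

```python
import math

def delta_e_uv(luv1, luv2):
    """CIE 1976 L*u*v* colour difference: Euclidean distance in CIELUV."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(luv1, luv2)))

# Two nearby screen locations measured in (L*, u*, v*); a difference of
# roughly 1 is often taken as just noticeable.
print(round(delta_e_uv((50.0, 10.0, 20.0), (51.0, 10.5, 20.5)), 2))  # -> 1.22
```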

Displays Volume 14 Number 3 1993

Ahumada (USA) reviewed computational image quality metrics. Computational metrics promise an 'objective' method of measuring image quality, and provide an alternative to performance-based or 'subjective' tests. Assume that you have two displays and you wish to compare their image quality. Using a computational metric, you apply a model of vision to the image and predict task performance. Simple metrics have been around for some time (for example, the modulation transfer function area (MTFA) metric, which compares the MTF of a display with the contrast threshold function of human vision); but these capture only a limited range of human visual abilities. Ahumada reviewed metrics that consider optical blur, photoreceptor sampling and transduction, retinal local contrast enhancement and gain control, and masking of various kinds (i.e. luminance, contrast, spatial frequency, orientation and location). These clearly offer great promise, but the models are still restricted to a consideration of relatively early visual processing, simply because our knowledge of human vision is incomplete.

Scott Daly (USA) described a fine example of this approach: the visible difference predictor (VDP). This is a model for evaluating whether two images will be judged the same or different. The output of the model is a detection map showing the probabilities of detecting differences in the image as a function of location.

Barten (The Netherlands) described the square root integral (SQRI) method for evaluating the image quality of flat panel displays and CRTs. The main advantage of the SQRI method is that it is very simple to compute: all that is needed is contrast measurements at two different spatial frequencies. The SQRI of five display technologies (CRT, LCD, PDP, VFD and EL) was compared with subjective ratings of image quality. The metric did a poor job of predicting image quality for the VFD and EL displays: the author claimed that this was because they both had a highly polished glass surface without antiglare treatment. However, it predicted the image quality of the other displays impressively (r² = 0.96).
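As a rough sketch of the idea (not Barten's exact formulation, and using made-up curves rather than measured data), the SQRI integrates the square root of the ratio of the display MTF to the eye's modulation threshold over spatial frequency:

```python
import math

def sqri(display_mtf, threshold_mt, u_min=0.5, u_max=60.0, steps=200):
    """Numerically approximate Barten's square root integral:

        SQRI = (1/ln 2) * integral of sqrt(M(u) / Mt(u)) du/u

    where M(u) is the display MTF and Mt(u) the eye's modulation
    threshold, both functions of spatial frequency u (cycles/degree).
    Trapezoidal integration on a log-frequency grid.
    """
    logs = [math.log(u_min) + i * (math.log(u_max) - math.log(u_min)) / steps
            for i in range(steps + 1)]
    vals = []
    for lu in logs:
        u = math.exp(lu)
        ratio = max(display_mtf(u) / threshold_mt(u), 0.0)
        vals.append(math.sqrt(ratio))
    h = logs[1] - logs[0]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return integral / math.log(2)

# Illustrative (not measured) curves: a Gaussian display MTF and a
# simplified contrast-threshold function for the eye.
mtf = lambda u: math.exp(-(u / 20.0) ** 2)
mt = lambda u: 1.0 / (100.0 * u * math.exp(-0.1 * u))  # threshold = 1/sensitivity
print(round(sqri(mtf, mt), 1))
```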

In contrast, Jorna (USA) described his PhD work in which he compared a number of image-quality metrics for evaluating displays. These were the MTFA, the SQRI, the integrated contrast sensitivity (ICS) and the subjective quality factor (SQF). These metrics are all based on various physical measurements from the display and make different assumptions about what is important in images for human vision. These results were compared with subjective ratings of image quality using the paired-comparison method (see below). Subjects viewed four different images on a colour monitor: a family portrait, a scenic picture of a forest, a group of people having a picnic and a text image. None of the metrics performed particularly well: except for the text image, all of the r² values were less than about 0.6. It is worth considering this result. Perhaps the correlations are low because the metrics have been derived for their ability to evaluate text images. The visual demands in reading text are quite different from those in viewing other images. In general, the ICS and SQF metrics (r² ≈ 0.75) appeared better than the SQRI and MTFA, especially for the text image. Jorna suggests that the SQRI metric performs poorly because of the emphasis it places on lower spatial frequencies in the image.

Chen, Rotondo and Shovar (USA) described two subjective methods for measuring video image quality. One is the mean opinion score: subjects simply give the image a rating between 1 (poor) and 5 (excellent). This is quick (for example, ten displays can be evaluated with just ten rating scores) but subjects do not always find it an easy task to do. A subject may find it helpful to see a reference display (for example, they may be shown a high-quality image and be told that this represents a rating of 5). The other subjective method used in this study was the paired comparison method. This method is very popular in Japan but has been used little in Europe and North America. In this method, subjects view two displays side by side and simply say which one is better. Subjects may also be asked to say how much better (for example, on a scale of 0 to 10). This is a simple task for observers but can take up a lot of experimental time, since the number of pairs quickly exceeds the number of test items (for example, to compare ten different displays it is necessary to make 45 paired comparisons). In this experiment both subjective measures were very highly correlated (r > 0.96).
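The quadratic growth in trials is easy to see: n test items generate n(n−1)/2 pairs.

```python
from itertools import combinations

displays = [f"display_{i}" for i in range(1, 11)]  # ten test displays
pairs = list(combinations(displays, 2))

# n items need n*(n-1)/2 comparisons: for 10 displays that is 45 pairs,
# versus only 10 trials for a simple rating scale.
print(len(pairs))  # -> 45
```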

Virtual environment technology

Virtual reality is certainly one of the most hyped, if not one of the more exciting, developments in information technology over the last few years. No conference on displays or human factors or computer graphics is complete without a session on virtual reality. This was certainly the case at SID, which included over 20 papers on the theme, including a provocatively titled address by one of the founding fathers, Tom Furness: 'A vision for virtual environment technology'.

Many of the papers provided simple and valuable guidelines for the design of head-mounted displays (HMDs). For example, Previc (USA) showed that the best place to put information in an HMD is in the upper right visual field. Presumably this is because (in the real world) distant information is more likely to appear in the upper visual field. Grigsby and Tsou (USA) showed that the optimal overlap for partial overlap binocular displays was about 40°. Jones showed that mismatches between target perceived distance and optical distance have no important effects on human visual accommodation, as long as the viewing pupils exceed 3 mm. Stuart (USA) reviewed human factors issues in virtual reality, including cybersickness, safety and the integration of multi-modal input and output. Takada et al. (Japan) described virtual reality-like applications over ISDN, mainly medical, such as the 'Virtual Doctor': the ability for physicians to 'touch' patients using remote robot technology.

Furness (USA) gave the 'visionary' talk to a packed audience. Furness is Director of the University of Washington's Human Interface Technology (HIT) Lab, an organization attempting to establish itself as the world co-ordinating centre for VR research. Furness's vision would have been exciting 5 years ago, but in the fast-moving area of virtual reality, it seemed pedestrian. This seemed like the kind of talk he might give to a potential sponsor who had just heard about the area. To a technical audience anticipating future developments, the talk fell wide of the mark. A second talk from the HIT Lab, by Kollin (USA), was also disappointing. This talk was on the much-heralded retinal display. The idea behind this is that the ultimate visual display will be one that scans images directly on the human retina. The goal is to produce a stereoscopic, full-colour, high-resolution (4000 × 3000 pixels) display. However, the current system is monocular and monochromatic, with a resolution of 500 × 1000 pixels. It sits on an optical bench and cannot be worn on the head of a user. This was a great disappointment considering the hype that the HIT Lab have given this system (for example, a working model had been promised for the first half of 1993). There was also some suggestion from the audience that the retinal scanner is simply a reinvention of the laser optometer.

Multimedia and videoconferencing

Multimedia is seen by many to be the future of computing, television and telecommunications. There were a number of papers at SID on this theme.

Joy Mountford (USA) gave a presentation demonstrating the development and the potential of Apple Computer's QuickTime, a foundation for time-based media (such as video and sound samples). As would be expected, there was an emphasis on user testing. However, Mountford also said that 'interaction designers' were moving away from traditional computer skills, and beginning to draw on the design practices of professions such as film production.

Komatsu et al. (Japan) described their approach to achieving a telecommunication environment termed 'high presence communication'. The main factors seem to include a wide viewing angle, life-sized image display and stereo imagery. This paper addressed the issues of achieving stereoscopic viewing while keeping binocular parallax slight. The reason for this is that excessive binocular parallax can result in eye strain and double images. Their method uses two separate stereoscopic displays with one placed behind the other. For example, in a videoconferencing environment, the front display shows the image of the person; the rear display shows the image of the background. Some rudimentary image processing is carried out to make sure that the image of the person is superimposed on the image of the background and not 'seen through'. A head-tracking device lets the observer move their head to achieve motion parallax. Thus, both displays contain stereoscopic information, but the binocular parallax is reduced and the observer has the added benefit of depth of field.

Kuriki et al. (Japan) addressed the issue of eye contact in videotelephony. Users often complain that videotelephony seems unnatural because the offset between the camera and the display prevents callers from making eye contact with each other while speaking. These authors described a blazed half-transparent mirror that enables users to maintain eye contact yet does not reduce brightness or image quality.

Perceptual approaches to image coding

We are entering an era where there is likely to be a significant increase in the need to transmit and store video image sequences. What is the best way to compress this information? Some of the papers presented at SID proposed methods based on human vision. For example, based on the finding that the human visual system is slow in processing high spatial frequency information, Hu, Klein and Carney (USA) suggest that, immediately after a scene cut, high spatial frequency information may not be visible. This means the information can be discarded from the image without any perceived loss in image quality.
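A codec exploiting this finding first needs to know where the scene cuts are. A crude, purely illustrative detector (mean absolute frame difference against a threshold; nothing here is from the paper):

```python
def scene_cut_indices(frames, threshold=40.0):
    """Flag frame transitions whose mean absolute pixel difference exceeds
    a threshold -- a crude scene-cut detector (illustrative only)."""
    cuts = []
    for i in range(1, len(frames)):
        diff = sum(abs(a - b) for a, b in zip(frames[i], frames[i - 1]))
        if diff / len(frames[i]) > threshold:
            cuts.append(i)
    return cuts

# Frames as flat lists of pixel intensities; a cut occurs at index 2.
frames = [[10] * 8, [12] * 8, [200] * 8, [198] * 8]
cuts = scene_cut_indices(frames)
print(cuts)  # -> [2]

# A codec exploiting the finding could then drop high spatial frequency
# coefficients for the first frame or two after each detected cut.
```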

Further examples are provided by papers presented by Peterson, Ahumada and Watson (USA) and Watson (USA). These papers focused on the Discrete Cosine Transform (DCT). The JPEG image compression standard provides a mechanism by which images may be compressed and shared among users. The image is first divided into 8 × 8 blocks. Each block is then transformed into its DCT. Each block is then quantized by dividing it by a quantization matrix. The JPEG quantization matrix is not defined by the standard but is supplied by the user and stored or transmitted with the compressed image. This matrix should be designed to provide maximum visual quality for minimum bit rate. A model of vision could be profitably applied here.
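A minimal sketch of the DCT-and-quantize step described above (a direct, unoptimized DCT; the flat quantization matrix is illustrative, not a JPEG default):

```python
import math

def dct_2d(block):
    """2-D DCT-II of an 8x8 block (direct O(N^4) form, fine for a demo)."""
    N = 8
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * s
    return out

def quantize(coeffs, qmatrix):
    """Divide each DCT coefficient by the matching quantization-matrix
    entry and round -- the lossy step JPEG leaves to the user."""
    return [[round(coeffs[u][v] / qmatrix[u][v]) for v in range(8)]
            for u in range(8)]

# A flat quantization matrix for illustration; a perceptually tuned one
# would use larger divisors at high spatial frequencies.
q = [[16] * 8 for _ in range(8)]
block = [[128] * 8 for _ in range(8)]        # uniform grey block
quant = quantize(dct_2d(block), q)
print(quant[0][0], quant[1][1])  # -> 64 0  (DC carries everything; AC terms vanish)
```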

Algazi et al. (USA) described the results from using an objective metric called the Picture Quality Scale (PQS). The PQS appears to be well correlated with subjective ratings of observers (r = 0.88) and requires the measurement of five distortion factors commonly found in coded images. These cover random errors, with different weightings based on a knowledge of human vision; the end-of-block discontinuity observed in transform coders; and structured errors, such as ringing, induced by image structure. The paper compared JPEG, sub-band and wavelet coding. The results show that at low bit rates (i.e. between about 0.6 and 0.8 bits/pel) there is little difference between the three methods, but at medium to high bit rates (i.e. 1.0 to 1.2 bits/pel) sub-band coding is superior.

Further information

The conference proceedings are reported in more detail in Society for Information Display International Symposium, 18–20 May 1993, Digest of Technical Papers, Volume 23.

David Travis Human Factors Division

BT Laboratories Martlesham Heath

Ipswich UK
