the contribution of amplitude and phase spectra-defined

21
The contribution of amplitude and phase spectra-defined scene statistics to the masking of rapid scene categorization Bruce C. Hansen $ Department of Psychology & Neuroscience Program, Colgate University, Hamilton, NY, USA Lester C. Loschky # $ Department of Psychology, Kansas State University, Manhattan, KS, USA Viewers can recognize the gist of a scene (i.e., its holistic semantic representation, such as its category) in less time than a single fixation, and backward masking has traditionally been employedas a meanstodeterminethattime course.The masks used in those paradigms are often characterized by either specific amplitude spectra only, or amplitude and phase spectra-defined structural properties. However, it remains unclear whether there would be a differential contribution of amplitude onlyoramplitudeþphasedefinedimage statisticsto the effective backward masking of rapid scene categorization. The current study addresses this issue. Experiments 1–3 explored amplitude spectra defined contributions to category maskingand revealed thattheslope of theamplitude spectrum was more important for modulating scene category masking strength than amplitude orientation. Further, the masking effects followed an ‘‘ amplitude spectrum slope similarity principle’’ whereby the more similar the amplitude spectrum slopeof themaskwastothetarget’samplitudespectrum slope, the stronger the masking. Experiment 5 showed that, when holding mask amplitude spectrum slope approximately constant, both categorically specific unrecognizable amplitude only and amplitude þ phase statistical regularities disrupted rapid scene categorization. Specifically, the masking effects observed in Experiment 5 followed a target-mask categorical dissimilarity principle whereby the more dissimilar the mask category is to the target image category, the stronger the masking. Overall, the results support the notion that amplitude only or amplitude þ phase-defined image statistics differentially contribute to the effective backward masking of rapid scene gist recognition. Visual masking of rapid scene categorization The use of visual masking has a long history of illuminating the mechanisms of early spatial vision (Carter & Henning, 1971; K. K. De Valois & Switkes, 1983; Henning, Hertz, & Hinton, 1981; Legge & Foley, 1980; Losada & Mullen, 1995; Sekuler, 1965; Solomon, 2000; Stromeyer & Julesz, 1972; Wilson, McFarlane, & Phillips, 1983), and has recently been argued to also serve as an effective means to investigate the time course and spatial mechanisms involved in scene gist recognition, which has typically been operationalized in terms of rapid scene categorization (e.g., Bacon-Mace, Mace, Fabre-Thorpe, & Thorpe, 2005; Fei-Fei, Iyer, Koch, & Perona, 2007; Guyader, Chauvin, Peyrin, H´ erault, & Marendaz, 2004; Joubert, Rousselet, Fabre- Thorpe, & Fize, 2009; Kaping, Tzvetanov, & Treue, 2007; Loschky, Hansen, Sethi, & Pydimari, 2010; Loschky & Larson, 2008; Loschky et al., 2007, see also Wichmann, Braun, & Gegenfurtner, 2006, for animal detection). Rapid scene categorization masking para- digms typically employ backward masking (though some paradigms involve the direct manipulation of target scenes and can be construed as rudimentary forms of simultaneous masking). Backward masking paradigms often utilize masks that are characterized by specific second and higher order statistical regularities of luminance contrast within different real-world scene categories (i.e., regularities that are captured by the Fourier amplitude spectrum or phase spectrum, respectively). A detailed account of these relationships is provided in the Supplementary material. Briefly, the masks typically consist of spatial contrast characteristics defined only by amplitude spectral relationships (e.g., phase-scram- bled scenes with amplitude spectra that vary in the distribution of contrast across spatial frequency or orientation, hereafter referred to as amplitude-only masks), or some combination of both amplitude and phase spectral relationships (e.g., masks where the amplitude spectra properties are held constant, but Citation: Hansen, B. C., & Loschky, L. C. (2013). The contribution of amplitude and phase spectra-defined scene statistics to the masking of rapid scene categorization. Journal of Vision, 13(13):21, 1–21, http://www.journalofvision.org/content/13/13/21, doi:10.1167/13.13.21. Journal of Vision (2013) 13(13):21, 1–21 1 http://www.journalofvision.org/content/13/13/21 doi: 10.1167/13.13.21 ISSN 1534-7362 Ó 2013 ARVO Received March 30, 2013; published November 20, 2013

Upload: others

Post on 04-Apr-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

The contribution of amplitude and phase spectra-definedscene statistics to the masking of rapid scene categorization

Bruce C Hansen $Department of Psychology amp Neuroscience Program

Colgate University Hamilton NY USA

Lester C Loschky $Department of Psychology Kansas State University

Manhattan KS USA

Viewers can recognize the gist of a scene (ie its holisticsemantic representation suchas its category) in less timethana single fixation andbackwardmasking has traditionally beenemployedasameanstodeterminethattimecourseThemasksused in those paradigms are often characterized by eitherspecific amplitude spectra only or amplitude and phasespectra-defined structural properties However it remainsunclearwhether therewould be a differential contribution ofamplitudeonlyoramplitudethornphasedefinedimagestatisticstothe effective backwardmasking of rapid scene categorizationThe current study addresses this issue Experiments 1ndash3explored amplitude spectra defined contributions to categorymaskingandrevealedthattheslopeoftheamplitudespectrumwasmore important formodulating scene categorymaskingstrength than amplitude orientation Further themaskingeffects followed an lsquolsquoamplitude spectrum slope similarityprinciplersquorsquowhereby themore similar the amplitude spectrumslopeofthemaskwastothetargetrsquosamplitudespectrumslopethe stronger themasking Experiment 5 showed that whenholdingmask amplitude spectrum slope approximatelyconstant bothcategorically specific unrecognizableamplitudeonly and amplitudethornphase statistical regularities disruptedrapid scene categorization Specifically themasking effectsobserved in Experiment 5 followed a target-mask categoricaldissimilarity principlewhereby themore dissimilar themaskcategory is to the target image category the stronger themaskingOveralltheresultssupportthenotionthatamplitudeonly or amplitudethornphase-defined image statisticsdifferentially contribute to the effective backwardmasking ofrapid scene gist recognition

Visual masking of rapid scenecategorization

The use of visual masking has a long history ofilluminating the mechanisms of early spatial vision

(Carter amp Henning 1971 K K De Valois amp Switkes1983 Henning Hertz amp Hinton 1981 Legge amp Foley1980 Losada amp Mullen 1995 Sekuler 1965 Solomon2000 Stromeyer amp Julesz 1972 Wilson McFarlane ampPhillips 1983) and has recently been argued to alsoserve as an effective means to investigate the timecourse and spatial mechanisms involved in scene gistrecognition which has typically been operationalized interms of rapid scene categorization (eg Bacon-MaceMace Fabre-Thorpe amp Thorpe 2005 Fei-Fei IyerKoch amp Perona 2007 Guyader Chauvin PeyrinHerault amp Marendaz 2004 Joubert Rousselet Fabre-Thorpe amp Fize 2009 Kaping Tzvetanov amp Treue2007 Loschky Hansen Sethi amp Pydimari 2010Loschky amp Larson 2008 Loschky et al 2007 see alsoWichmann Braun amp Gegenfurtner 2006 for animaldetection) Rapid scene categorization masking para-digms typically employ backward masking (thoughsome paradigms involve the direct manipulation oftarget scenes and can be construed as rudimentaryforms of simultaneous masking)

Backward masking paradigms often utilize masksthat are characterized by specific second and higherorder statistical regularities of luminance contrastwithin different real-world scene categories (ieregularities that are captured by the Fourier amplitudespectrum or phase spectrum respectively) A detailedaccount of these relationships is provided in theSupplementary material Briefly the masks typicallyconsist of spatial contrast characteristics defined onlyby amplitude spectral relationships (eg phase-scram-bled scenes with amplitude spectra that vary in thedistribution of contrast across spatial frequency ororientation hereafter referred to as amplitude-onlymasks) or some combination of both amplitude andphase spectral relationships (eg masks where theamplitude spectra properties are held constant but

Citation Hansen B C amp Loschky L C (2013) The contribution of amplitude and phase spectra-defined scene statistics to themasking of rapid scene categorization Journal of Vision 13(13)21 1ndash21 httpwwwjournalofvisionorgcontent131321doi101167131321

Journal of Vision (2013) 13(13)21 1ndash21 1httpwwwjournalofvisionorgcontent131321

doi 10 1167 13 13 21 ISSN 1534-7362 2013 ARVOReceived March 30 2013 published November 20 2013

with the statistical relationships in the phase spectrabeing systematically varied while preserving variousimage features hereafter referred to as amplitude thornphase masks) Specifically amplitudethornphase masks aredesigned to possess the lines and edges in scenes thatoften yield recognizable image structure Since oneneeds some modicum of local contrast to see phase-defined image features we refer to such masks asamplitudethorn phase defined (but note that the criticalmanipulations are often focused on manipulationsmade to the phase spectra)

Most scene gist studies that employ backwardmasking to investigate the spatial or temporal aspectsof gist processing use either amplitude-only masks oramplitudethorn phase masks Thus comparing maskingresults from those studies raises questions aboutwhether markedly different masking functions wouldbe observed depending on whether amplitude-onlymasks or amplitude thorn phase masks were used Thisissue is compounded by the fact that few studies haveexamined the extent to which specific amplitude-only oramplitudethorn phase mask spectral characteristics con-tribute to the resulting masking functions Thus thereare several gaps in our knowledge regarding (a)whether the two types of masks will produce differentmasking functions and if so (b) how the different maskspectral characteristics (eg amplitude-only or ampli-tudethorn phase-defined) modulate the masking functionsThus the current study addresses these questionsregarding backward masking We have focused onbackward masking because in contrast to simultaneousmasking it ensures that the participants will have directaccess to all target image information (in its originalunmanipulated state) prior to being exposed to a maskAdditionally by varying the stimulus onset asynchrony(SOA) backward masking allows the experimenter toassess when amplitude or phase information becomesrelevant to the categorization process (but see Van-Rullen 2011) though here we primarily focus on thespatial aspects of the masking Below we provide amore detailed account of how research on backwardmasking has raised these crucial theoretical questionsregarding amplitude-only masks and amplitudethornphasemasks

Amplitude-only backward masking

Amplitude-only backward masking effects have beeninvestigated in a pair of studies by Loschky et al (2007)and Loschky et al (2010) which examined maskingeffects on rapid scene categorization performanceThose studies used masks created by phase-randomiz-ing scenes thereby leaving only amplitude informationintact which we therefore refer to as amplitude-onlymasks Those masks were unrecognizable and thus

devoid of any semantic meaning which eliminatedmasking effects caused by conceptual processes knownas conceptual masking (eg Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) Doing so crucially limitedmasking effects to those due to visual mechanismsresponding to image structure defined by amplitudespectral properties rather than those attributable toconceptual masking Loschky et al (2007) and Loschkyet al (2010) specifically examined masking effectsproduced by three types of amplitude-only masks

(a) phase-randomized versions of scenes from differentcategories than the target scenes (ie the masks haddifferent amplitude spectral characteristics than thetarget scenes)

(b) phase-randomized versions of the target imagesthemselves (ie the masks had identical amplitudespectral characteristics to the target scenes) or

(c) white noise masks (ie masks lacking any globalamplitude- or phase-defined structural characteris-ticsmdashsee Supplementary material for furtherdetail)

These comparisons produced two critical results (a)there were no differences in the masking betweenamplitude-only masks derived from the same versusdifferent scene category than the target and (b) bothmasking conditions based on phase-randomized scenemasks produced significantly stronger effects than thewhite noise masking condition The former resultsuggests that amplitude spectral properties producelittle if any category-specific masking effects On theother hand the latter result suggests that generalspatial frequency and orientation differences (relativeto white noise) play a critical role in masking rapidscene categorization That is greater masking wascaused by the phase-randomized amplitude-only masksthat possessed typical amplitude spectrum characteris-tics of real-world scenes (ie contrast fall-off withincreasing spatial frequency and contrast biases atdifferent orientations) rather than the white noisemasks which did not More specifically since thephase-randomized scenes as masks still had intactamplitude spectra they possessed amplitude featurescommon to most natural scenes including (a) the 1fa

fall off in contrast with increasing spatial frequency(ie the amplitude spectrum slope a typically rsquo 10)and (b) the verticalhorizontal (ie cardinal) orienta-tion bias found across all scenes (cardinal oblique)(see Hansen Haun amp Essock 2008 for a review)Thus the question is which aspect of the amplitudespectrum is responsible for such masking Specificallyis it (a) the slope a of the amplitude spectrum (b) theorientation biases or (c) both that produced thestrongest masking effects for phase-randomized scenemasks (compared to white noise) in the Loschky et al

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 2

(2007) and Loschky et al (2010) studies Experiments1ndash3 aimed at answering this question

Amplitude thorn phase backward masking

The studies by Loschky et al (2007) found thatactual unaltered scenes used as masks (ie a type ofamplitudethorn phase mask) were much more effective atinterfering with rapid scene categorization than phase-randomized scene masks (ie amplitude-only masks)This finding suggests that amplitude-only masks andamplitudethorn phase masks produce different maskingfunctions However the amplitude thorn phase masksutilized by Loschky et al (2007) contained recognizablescene structure Thus an alternative explanation forsuch differences in masking effects would be conceptualmasking Loschky et al (2010) therefore explored thisconceptual masking account in a follow-up study usingunrecognizable amplitudethorn phase masks Those maskswere created by feeding real-world scene images to thePortilla and Simoncelli (2000) texture synthesis algo-rithm which produced scene-texture masks containingunrecognizable conjoint amplitude thorn phase-definedimage structures Loschky et al (2010) found that suchamplitudethorn phase masks interfered with rapid scenecategorization far more than amplitude-only (phaserandomized) masks and both masked more than whitenoise Critically the unrecognizable amplitudethorn phasemasks produced 74 of the difference in maskingfound between white noise masks and unaltered scenesused as masks namely recognizable amplitudethorn phasemasks That is 74 of the observed masking effectsmentioned above that might have been attributed toconceptual masking could instead be explained byunrecognizable amplitude thorn phase-defined scene struc-ture

Loschky et al (2010) showed that amplitudethorn phasemasks which contained unrecognizable phase-definedimage structure yielded strong masking effects unex-plainable by conceptual masking Nevertheless we arestill left with the question of how such amplitude thornphase regularities modulate rapid scene categorizationperformance Given that those masks were derivedfrom real-world scene images the unrecognizablephase-defined image structures in the amplitude thornphase masks may have regulated scene categorizationin a category-specific manner If so to take an examplewould unrecognizable amplitudethornphase masks derivedfrom beach scenes more strongly mask beach targetsthan unrecognizable amplitude thorn phase masks derivedfrom mountain scenes (or vice versa) Experiments 4ndash5aimed at answering this question

It is important to note that utilizing backwardmasking paradigms to study rapid scene categorizationis not without its limitations One point worth

reiterating here is that masking has a long history in thestudy of early visual processes such as the detection anddiscrimination of sinusoidal gratings or Gabor pat-terns Thus backward masking effects in rapid scenecategorization must be considered in the terms of earlyspatial vision mechanisms assessed by visual maskingSpecifically one must consider whether a given set ofbackward masking effects can be predicted by theknown response characteristics of early visual mecha-nisms which could apply equally well to a wide arrayof visual tasks including rapid scene categorization Forexample if a given backward masking effect can beaccounted for by a reduction of the signal-to-noiseratio in early spatial channels then the information inthe mask may simply have been more effective atstopping informative scene information from beingpassed along to higher levels where scene categorizationis thought to occur (eg the parahippocampal placearea [PPA] or the lateral occipital complex [LOC]) Wetherefore apply this approach to interpreting the resultsof the current study In doing so we attempt todissociate those masking effects more likely to beexplained by earlier spatial channel interactions fromthose less likely to be so explained

The current study

To return to the primary goals laid out in ourintroductory paragraph the current study (a) providesan affirmative answer to the question of whetherunrecognizable amplitude-only masks versus unrecog-nizable amplitude thorn phase masks produce differentmasking functions and (b) addresses how the spectralcharacteristics of amplitude-only masks and amplitudethorn phase masks modulate their respective maskingfunctions Experiments 1ndash3 were concerned with theeffects of the spectral characteristics of amplitude-onlymasks Experiment 1 employed noise masks containingonly amplitude-defined structure (ie variable slope aand variable orientation bias magnitudes) Experiment2 followed up Experiment 1 by extending its crucialmasking conditions into the temporal domain Exper-iment 3 controlled for the perceived contrast (ratherthan physical contrast) of the noise masks used inExperiments 1 and 2 Experiments 4ndash5 were concernedwith the effects of the spectral characteristics ofamplitude-only versus amplitude thorn phase masksExperiment 4 assessed the recognizability of imagestructure in amplitude-only masks versus amplitude thornphase masks that would be used in Experiment 5 Thiswas necessary to ensure minimal to no recognizabilityin order eliminate conceptual masking effects Exper-iment 5 then assessed whether unrecognizable ampli-tude-only versus amplitudethorn phase masks interfered

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 3

with rapid scene categorization in a category-specificmanner

General method

Apparatus

For experiments conducted at Kansas State Uni-versity (Experiments 1 3 4 amp 5) all stimuli werepresented on 19 in Samsung SyncMaster 957 MB Smonitors These were driven by Optiplex 170L Pentium4 processors (28 GHz) which were equipped with 512RAM and Intel 82865G Graphics Controller Integrat-ed Chipset graphics cards which had 8-bit grayscaleresolution Stimuli were displayed using a linearizedlook-up table generated by calibrating with a Color-Vision Spyder2 Pro sensor Maximum luminanceoutput of the display monitors was 806 cdm2 theframe rate was set to 85 Hz and the resolution was setto 1024 middot 768 pixels Single pixels subtended 03698 ofvisual angle (ie 221 arc min) as viewed from adistance of 533 cm using machined chin rests

For experiments conducted at Colgate University(Experiment 2) all stimuli were presented on a 21 inViewsonic (G225fB) monitor This was driven by a dualcore Intelt Xeont processor (160 GHz x2) which wasequipped with 4 GB RAM and a 256 MB PCIe x16ATI FireGL V7200 dual DVIVGA graphics card andwith 8-bit grayscale resolution Stimuli were displayedusing a linearized look-up table generated by cali-brating with a Color-Vision Spyder3 Pro sensorMaximum luminance output of the display monitorwas set to yield 806 cdm2 the frame rate was set to 85Hz and the resolution was set to 1024 middot 768 pixelsSingle pixels subtended 003638 of visual angle (ie218 arc min) as viewed from a distance of 580 cmusing an Applied Science Laboratories (ASL) chin andforehead rest

Scene image database

A total of 600 grayscale scene images (selected fromthe Internet and free of any copyright restrictions) wereused in the current study (100 images per scenecategory) as either target stimuli or as images used toconstruct mask stimuli There were six basic levelcategories (a) beaches (b) forests (c) airports (d)streets (e) home interiors and (f) store interiors Thesecould be further categorized as three conjoint super-ordinate categories natural outdoor (beaches andforests) man-made outdoor (airports and streets) andman-made indoor (home and store interiors) Eachimage category set was assembled by cropping a 512 middot

512 pixel region from a random location within thelarger image

Each image I(x y) was normalized to possess afixed root-mean square (rms) contrast (as defined inPavel Sperling Riedl amp Vanderbeek 1987) and meanluminance in the spatial (ie image) domain using thesequence of functions that can be found in Hansen andHess (2007) Here rms was set to 0126 with the meangrayscale value of all images fixed at 127 Theparticular choice of rms allowed for a suprathresholdcontrast that did not result in pixel grayscale valuesoutside the 0ndash255 range

Experiment 1

The current experiment was designed to determinewhich aspect of the amplitude spectrum (slope ororientation) in amplitude-only masks is responsible formasking rapid scene categorization performance (a)the slope of the amplitude spectrum (b) the orientationbiases or (c) both

Method

Participants

Forty-seven Kansas State University students (34female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent (with written parental consent also given forthose under the age of 18) All observers had normal(or corrected to normal) vision and their ages rangedfrom 16 to 30 years (M frac14 1933 SD frac14 293)

Noise mask creation

All noise masks were constructed in the Fourierdomain using MATLAB (version R2008a) includingImage Processing (version 61) and Signal Processing(version 69) toolboxes Mask construction involvedcreating a base amplitude spectrum BASEAMP(f h)weighted by an orientation filter The orientation filterused here consisted of an orientation biased amplitudespectrum ORNTAMP( f h) constructed by averagingthe amplitude spectra from all images described in theGeneral method section The base and orientationbiased spectra were created in polar coordinates thus fand h represent the spatial frequency and orientationdimensions respectively The orientation filter firstinvolved constructing an amplitude spectrum from theaverage of the entire set of scene category images (nfrac14600 total images) shifted into polar coordinates withthe DC (ie zero frequency) component assigned a

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 4

zero The averaged amplitude spectrum denoted asORNTAMP( f h) possessed the typical orientation biasobserved in natural scenes namely more contrast alongthe cardinal axes vertical and horizontal than theobliques (eg Coppola Purves McCoy amp Purves1998 Hansen amp Essock 2004 Keil amp Cristobal 2000Switkes Mayer amp Sloan 1978 Torralba amp Oliva2003 van der Schaaf amp van Hateren 1996) However italso contained the typical spatial frequency contrastbias in (ie much more contrast at lower than higherspatial frequencies) Since we needed ORNTAMP( f h)to only serve as an orientation filter we next removedthe spatial frequency contrast bias by the followingoperations First we created an orientation averagedvector OA( fu) by averaging the amplitude coefficientsacross all orientations for each spatial frequency inORNTAMP( f h) stated formally as

OAeth fuTHORN frac141

vn

Xufrac141un

vfrac141vn

ORNTAMPeth fu hvTHORN eth1THORN

where the subscripts u and v represent spatial frequencyand orientation coordinates respectively with un beingthe total number of spatial frequencies (in cycles perpicture cpp) and vn being the total number oforientations Importantly due to the nature of thediscrete Fourier transform (DFT) (eg Hansen ampEssock 2004) the total number of orientations vn at agiven spatial frequency varies monotonically withincreasing spatial frequency Next to remove the spatialfrequency bias from ORNTAMP( f h) each orientationvector in ORNTAMP( f h) was divided by OA( fu) Thisproduced a flat amplitude spectrum (ie a frac14 00) butwith the orientation bias preserved which served as theorientation filter used to weight the amplitude spectra ofthe noise masks in the current experiment This isformally stated as

Ofilteth f hTHORN frac14ORNTAMPeth f hvfrac141vnTHORN

OAeth fuTHORN eth2THORN

Here f in the numerator is without a subscript as thedivision is applied to all spatial frequencies along eachorientation vector within ORNTAMP( f h) FinallyOfilt( f h) represents the orientation filter itself

Next the base amplitude spectrum BASEAMP( f h)was constructed in a single 512 middot 512 polar matrixwith all coordinates assigned the same arbitraryamplitude coefficient (except the location of the DCcomponent which was assigned a zero) The result is aflat isotropic broadband spectrum (ie possessingequal contrast across all spatial frequencies andorientations) Then the base spectrum exponent couldbe adjusted by multiplying each spatial frequencyrsquosamplitude coefficient by f a with a representing theamplitude spectrum exponent For the current exper-

iment a was set to either 00 05 10 or 15 to capturethe typical range of as observed in scene imagery (eg05 to 15) and a white noise condition (ie a frac14 00)

Then an orientation bias was applied to a basespectrum having one of the four different a values byweighting it with Ofilt(f h) according to the followingoperation

MASKo frac14 ethBASEAMP f hfrac12 middot 1Omag

THORN

thornethOfilt f hfrac12 middot Omag

THORNg eth3THORN

Here Omag was a scalar value controlling theorientation bias magnitude which took on one of fivevalues from zero to one (in steps of 025) with zeroproducing an isotropic amplitude spectrum and oneproducing a mask amplitude spectrum having the fullbias in ORNTAMP( f h) Thus subscript ao representsmask amplitude spectra possessing one of the fouramplitude spectrum slopes (a) and one of the fiveorientation bias magnitudes (o)

Finally we used a unique phase spectrum for eachmask by assigning random values fromp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar space URAND( f h)frac14 -URAND( f h) The noise patterns were rendered in thespatial domain by taking the inverse Fourier transformof MASKo( f h) and a given URAND( f h) with bothshifted to Cartesian coordinates prior to the inverseDFT The rms contrast of all noise masks was fixed at0126 in the spatial domain by scaling the powerspectrum in the Fourier domain A total of 100 uniquemasks was generated for each of the four a and fiveorientation bias magnitude conditions Examples of the20 mask types are shown in Figure 1A

Design and procedure

Experiment 1 used a six alternative forced choice(AFC) scene categorization backward masking para-digm The experiment had a 4 (Mask a)middot 5 (OrientationBias Magnitude) mixed design with mask a a between-subjects variable and orientation bias magnitude awithin-subjects variable In each mask a condition eachparticipant saw 60 different target category stimuli(described in the General method section) for each levelof mask orientation bias magnitude (10 stimuli ran-domly selected without replacement from each scenecategory) for 300 total trials The order of targetcategory and mask orientation bias magnitude wasrandomized Participants were randomly assigned tobetween-subjects conditions

A given trial sequence consisted of a 500 ms fixationpoint (038 black circle) at the screen center (with thescreen set to mean luminance) followed by a targetscene presented for 2356 ms followed by a 1176 ms

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 5

Figure 1 (A) Example noise masks used in Experiment 1 The examples are arranged in rows for amplitude spectrum slope (ie a) and

columns for orientation magnitude Specifically amplitude spectrum slope increases from the top row (afrac14 00) to the bottom row (a

frac14 15) in steps of 05 The far left column is Omag frac14 00 (ie no orientation bias) to the far right column Omag frac14 10 (maximum

orientation bias) in steps of 025 (B) Shows orientation averaged amplitude spectra for the four different noise mask slopes (C)

Shows a 2-D contour plot of Ofilt(f h) plotted in polar coordinates Note the distinct horizontal and vertical orientation bias

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 6

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

with the statistical relationships in the phase spectrabeing systematically varied while preserving variousimage features hereafter referred to as amplitude thornphase masks) Specifically amplitudethornphase masks aredesigned to possess the lines and edges in scenes thatoften yield recognizable image structure Since oneneeds some modicum of local contrast to see phase-defined image features we refer to such masks asamplitudethorn phase defined (but note that the criticalmanipulations are often focused on manipulationsmade to the phase spectra)

Most scene gist studies that employ backwardmasking to investigate the spatial or temporal aspectsof gist processing use either amplitude-only masks oramplitudethorn phase masks Thus comparing maskingresults from those studies raises questions aboutwhether markedly different masking functions wouldbe observed depending on whether amplitude-onlymasks or amplitude thorn phase masks were used Thisissue is compounded by the fact that few studies haveexamined the extent to which specific amplitude-only oramplitudethorn phase mask spectral characteristics con-tribute to the resulting masking functions Thus thereare several gaps in our knowledge regarding (a)whether the two types of masks will produce differentmasking functions and if so (b) how the different maskspectral characteristics (eg amplitude-only or ampli-tudethorn phase-defined) modulate the masking functionsThus the current study addresses these questionsregarding backward masking We have focused onbackward masking because in contrast to simultaneousmasking it ensures that the participants will have directaccess to all target image information (in its originalunmanipulated state) prior to being exposed to a maskAdditionally by varying the stimulus onset asynchrony(SOA) backward masking allows the experimenter toassess when amplitude or phase information becomesrelevant to the categorization process (but see Van-Rullen 2011) though here we primarily focus on thespatial aspects of the masking Below we provide amore detailed account of how research on backwardmasking has raised these crucial theoretical questionsregarding amplitude-only masks and amplitudethornphasemasks

Amplitude-only backward masking

Amplitude-only backward masking effects have beeninvestigated in a pair of studies by Loschky et al (2007)and Loschky et al (2010) which examined maskingeffects on rapid scene categorization performanceThose studies used masks created by phase-randomiz-ing scenes thereby leaving only amplitude informationintact which we therefore refer to as amplitude-onlymasks Those masks were unrecognizable and thus

devoid of any semantic meaning which eliminatedmasking effects caused by conceptual processes knownas conceptual masking (eg Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) Doing so crucially limitedmasking effects to those due to visual mechanismsresponding to image structure defined by amplitudespectral properties rather than those attributable toconceptual masking Loschky et al (2007) and Loschkyet al (2010) specifically examined masking effectsproduced by three types of amplitude-only masks

(a) phase-randomized versions of scenes from differentcategories than the target scenes (ie the masks haddifferent amplitude spectral characteristics than thetarget scenes)

(b) phase-randomized versions of the target imagesthemselves (ie the masks had identical amplitudespectral characteristics to the target scenes) or

(c) white noise masks (ie masks lacking any globalamplitude- or phase-defined structural characteris-ticsmdashsee Supplementary material for furtherdetail)

These comparisons produced two critical results (a)there were no differences in the masking betweenamplitude-only masks derived from the same versusdifferent scene category than the target and (b) bothmasking conditions based on phase-randomized scenemasks produced significantly stronger effects than thewhite noise masking condition The former resultsuggests that amplitude spectral properties producelittle if any category-specific masking effects On theother hand the latter result suggests that generalspatial frequency and orientation differences (relativeto white noise) play a critical role in masking rapidscene categorization That is greater masking wascaused by the phase-randomized amplitude-only masksthat possessed typical amplitude spectrum characteris-tics of real-world scenes (ie contrast fall-off withincreasing spatial frequency and contrast biases atdifferent orientations) rather than the white noisemasks which did not More specifically since thephase-randomized scenes as masks still had intactamplitude spectra they possessed amplitude featurescommon to most natural scenes including (a) the 1fa

fall off in contrast with increasing spatial frequency(ie the amplitude spectrum slope a typically rsquo 10)and (b) the verticalhorizontal (ie cardinal) orienta-tion bias found across all scenes (cardinal oblique)(see Hansen Haun amp Essock 2008 for a review)Thus the question is which aspect of the amplitudespectrum is responsible for such masking Specificallyis it (a) the slope a of the amplitude spectrum (b) theorientation biases or (c) both that produced thestrongest masking effects for phase-randomized scenemasks (compared to white noise) in the Loschky et al

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 2

(2007) and Loschky et al (2010) studies Experiments1ndash3 aimed at answering this question

Amplitude thorn phase backward masking

The studies by Loschky et al (2007) found thatactual unaltered scenes used as masks (ie a type ofamplitudethorn phase mask) were much more effective atinterfering with rapid scene categorization than phase-randomized scene masks (ie amplitude-only masks)This finding suggests that amplitude-only masks andamplitudethorn phase masks produce different maskingfunctions However the amplitude thorn phase masksutilized by Loschky et al (2007) contained recognizablescene structure Thus an alternative explanation forsuch differences in masking effects would be conceptualmasking Loschky et al (2010) therefore explored thisconceptual masking account in a follow-up study usingunrecognizable amplitudethorn phase masks Those maskswere created by feeding real-world scene images to thePortilla and Simoncelli (2000) texture synthesis algo-rithm which produced scene-texture masks containingunrecognizable conjoint amplitude thorn phase-definedimage structures Loschky et al (2010) found that suchamplitudethorn phase masks interfered with rapid scenecategorization far more than amplitude-only (phaserandomized) masks and both masked more than whitenoise Critically the unrecognizable amplitudethorn phasemasks produced 74 of the difference in maskingfound between white noise masks and unaltered scenesused as masks namely recognizable amplitudethorn phasemasks That is 74 of the observed masking effectsmentioned above that might have been attributed toconceptual masking could instead be explained byunrecognizable amplitude thorn phase-defined scene struc-ture

Loschky et al (2010) showed that amplitudethorn phasemasks which contained unrecognizable phase-definedimage structure yielded strong masking effects unex-plainable by conceptual masking Nevertheless we arestill left with the question of how such amplitude thornphase regularities modulate rapid scene categorizationperformance Given that those masks were derivedfrom real-world scene images the unrecognizablephase-defined image structures in the amplitude thornphase masks may have regulated scene categorizationin a category-specific manner If so to take an examplewould unrecognizable amplitudethornphase masks derivedfrom beach scenes more strongly mask beach targetsthan unrecognizable amplitude thorn phase masks derivedfrom mountain scenes (or vice versa) Experiments 4ndash5aimed at answering this question

It is important to note that utilizing backwardmasking paradigms to study rapid scene categorizationis not without its limitations One point worth

reiterating here is that masking has a long history in thestudy of early visual processes such as the detection anddiscrimination of sinusoidal gratings or Gabor pat-terns Thus backward masking effects in rapid scenecategorization must be considered in the terms of earlyspatial vision mechanisms assessed by visual maskingSpecifically one must consider whether a given set ofbackward masking effects can be predicted by theknown response characteristics of early visual mecha-nisms which could apply equally well to a wide arrayof visual tasks including rapid scene categorization Forexample if a given backward masking effect can beaccounted for by a reduction of the signal-to-noiseratio in early spatial channels then the information inthe mask may simply have been more effective atstopping informative scene information from beingpassed along to higher levels where scene categorizationis thought to occur (eg the parahippocampal placearea [PPA] or the lateral occipital complex [LOC]) Wetherefore apply this approach to interpreting the resultsof the current study In doing so we attempt todissociate those masking effects more likely to beexplained by earlier spatial channel interactions fromthose less likely to be so explained

The current study

To return to the primary goals laid out in ourintroductory paragraph the current study (a) providesan affirmative answer to the question of whetherunrecognizable amplitude-only masks versus unrecog-nizable amplitude thorn phase masks produce differentmasking functions and (b) addresses how the spectralcharacteristics of amplitude-only masks and amplitudethorn phase masks modulate their respective maskingfunctions Experiments 1ndash3 were concerned with theeffects of the spectral characteristics of amplitude-onlymasks Experiment 1 employed noise masks containingonly amplitude-defined structure (ie variable slope aand variable orientation bias magnitudes) Experiment2 followed up Experiment 1 by extending its crucialmasking conditions into the temporal domain Exper-iment 3 controlled for the perceived contrast (ratherthan physical contrast) of the noise masks used inExperiments 1 and 2 Experiments 4ndash5 were concernedwith the effects of the spectral characteristics ofamplitude-only versus amplitude thorn phase masksExperiment 4 assessed the recognizability of imagestructure in amplitude-only masks versus amplitude thornphase masks that would be used in Experiment 5 Thiswas necessary to ensure minimal to no recognizabilityin order eliminate conceptual masking effects Exper-iment 5 then assessed whether unrecognizable ampli-tude-only versus amplitudethorn phase masks interfered

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 3

with rapid scene categorization in a category-specificmanner

General method

Apparatus

For experiments conducted at Kansas State Uni-versity (Experiments 1 3 4 amp 5) all stimuli werepresented on 19 in Samsung SyncMaster 957 MB Smonitors These were driven by Optiplex 170L Pentium4 processors (28 GHz) which were equipped with 512RAM and Intel 82865G Graphics Controller Integrat-ed Chipset graphics cards which had 8-bit grayscaleresolution Stimuli were displayed using a linearizedlook-up table generated by calibrating with a Color-Vision Spyder2 Pro sensor Maximum luminanceoutput of the display monitors was 806 cdm2 theframe rate was set to 85 Hz and the resolution was setto 1024 middot 768 pixels Single pixels subtended 03698 ofvisual angle (ie 221 arc min) as viewed from adistance of 533 cm using machined chin rests

For experiments conducted at Colgate University(Experiment 2) all stimuli were presented on a 21 inViewsonic (G225fB) monitor This was driven by a dualcore Intelt Xeont processor (160 GHz x2) which wasequipped with 4 GB RAM and a 256 MB PCIe x16ATI FireGL V7200 dual DVIVGA graphics card andwith 8-bit grayscale resolution Stimuli were displayedusing a linearized look-up table generated by cali-brating with a Color-Vision Spyder3 Pro sensorMaximum luminance output of the display monitorwas set to yield 806 cdm2 the frame rate was set to 85Hz and the resolution was set to 1024 middot 768 pixelsSingle pixels subtended 003638 of visual angle (ie218 arc min) as viewed from a distance of 580 cmusing an Applied Science Laboratories (ASL) chin andforehead rest

Scene image database

A total of 600 grayscale scene images (selected fromthe Internet and free of any copyright restrictions) wereused in the current study (100 images per scenecategory) as either target stimuli or as images used toconstruct mask stimuli There were six basic levelcategories (a) beaches (b) forests (c) airports (d)streets (e) home interiors and (f) store interiors Thesecould be further categorized as three conjoint super-ordinate categories natural outdoor (beaches andforests) man-made outdoor (airports and streets) andman-made indoor (home and store interiors) Eachimage category set was assembled by cropping a 512 middot

512 pixel region from a random location within thelarger image

Each image I(x y) was normalized to possess afixed root-mean square (rms) contrast (as defined inPavel Sperling Riedl amp Vanderbeek 1987) and meanluminance in the spatial (ie image) domain using thesequence of functions that can be found in Hansen andHess (2007) Here rms was set to 0126 with the meangrayscale value of all images fixed at 127 Theparticular choice of rms allowed for a suprathresholdcontrast that did not result in pixel grayscale valuesoutside the 0ndash255 range

Experiment 1

The current experiment was designed to determinewhich aspect of the amplitude spectrum (slope ororientation) in amplitude-only masks is responsible formasking rapid scene categorization performance (a)the slope of the amplitude spectrum (b) the orientationbiases or (c) both

Method

Participants

Forty-seven Kansas State University students (34female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent (with written parental consent also given forthose under the age of 18) All observers had normal(or corrected to normal) vision and their ages rangedfrom 16 to 30 years (M frac14 1933 SD frac14 293)

Noise mask creation

All noise masks were constructed in the Fourierdomain using MATLAB (version R2008a) includingImage Processing (version 61) and Signal Processing(version 69) toolboxes Mask construction involvedcreating a base amplitude spectrum BASEAMP(f h)weighted by an orientation filter The orientation filterused here consisted of an orientation biased amplitudespectrum ORNTAMP( f h) constructed by averagingthe amplitude spectra from all images described in theGeneral method section The base and orientationbiased spectra were created in polar coordinates thus fand h represent the spatial frequency and orientationdimensions respectively The orientation filter firstinvolved constructing an amplitude spectrum from theaverage of the entire set of scene category images (nfrac14600 total images) shifted into polar coordinates withthe DC (ie zero frequency) component assigned a

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 4

zero The averaged amplitude spectrum denoted asORNTAMP( f h) possessed the typical orientation biasobserved in natural scenes namely more contrast alongthe cardinal axes vertical and horizontal than theobliques (eg Coppola Purves McCoy amp Purves1998 Hansen amp Essock 2004 Keil amp Cristobal 2000Switkes Mayer amp Sloan 1978 Torralba amp Oliva2003 van der Schaaf amp van Hateren 1996) However italso contained the typical spatial frequency contrastbias in (ie much more contrast at lower than higherspatial frequencies) Since we needed ORNTAMP( f h)to only serve as an orientation filter we next removedthe spatial frequency contrast bias by the followingoperations First we created an orientation averagedvector OA( fu) by averaging the amplitude coefficientsacross all orientations for each spatial frequency inORNTAMP( f h) stated formally as

OAeth fuTHORN frac141

vn

Xufrac141un

vfrac141vn

ORNTAMPeth fu hvTHORN eth1THORN

where the subscripts u and v represent spatial frequencyand orientation coordinates respectively with un beingthe total number of spatial frequencies (in cycles perpicture cpp) and vn being the total number oforientations Importantly due to the nature of thediscrete Fourier transform (DFT) (eg Hansen ampEssock 2004) the total number of orientations vn at agiven spatial frequency varies monotonically withincreasing spatial frequency Next to remove the spatialfrequency bias from ORNTAMP( f h) each orientationvector in ORNTAMP( f h) was divided by OA( fu) Thisproduced a flat amplitude spectrum (ie a frac14 00) butwith the orientation bias preserved which served as theorientation filter used to weight the amplitude spectra ofthe noise masks in the current experiment This isformally stated as

Ofilteth f hTHORN frac14ORNTAMPeth f hvfrac141vnTHORN

OAeth fuTHORN eth2THORN

Here f in the numerator is without a subscript as thedivision is applied to all spatial frequencies along eachorientation vector within ORNTAMP( f h) FinallyOfilt( f h) represents the orientation filter itself

Next the base amplitude spectrum BASEAMP( f h)was constructed in a single 512 middot 512 polar matrixwith all coordinates assigned the same arbitraryamplitude coefficient (except the location of the DCcomponent which was assigned a zero) The result is aflat isotropic broadband spectrum (ie possessingequal contrast across all spatial frequencies andorientations) Then the base spectrum exponent couldbe adjusted by multiplying each spatial frequencyrsquosamplitude coefficient by f a with a representing theamplitude spectrum exponent For the current exper-

iment a was set to either 00 05 10 or 15 to capturethe typical range of as observed in scene imagery (eg05 to 15) and a white noise condition (ie a frac14 00)

Then an orientation bias was applied to a basespectrum having one of the four different a values byweighting it with Ofilt(f h) according to the followingoperation

MASKo frac14 ethBASEAMP f hfrac12 middot 1Omag

THORN

thornethOfilt f hfrac12 middot Omag

THORNg eth3THORN

Here Omag was a scalar value controlling theorientation bias magnitude which took on one of fivevalues from zero to one (in steps of 025) with zeroproducing an isotropic amplitude spectrum and oneproducing a mask amplitude spectrum having the fullbias in ORNTAMP( f h) Thus subscript ao representsmask amplitude spectra possessing one of the fouramplitude spectrum slopes (a) and one of the fiveorientation bias magnitudes (o)

Finally we used a unique phase spectrum for eachmask by assigning random values fromp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar space URAND( f h)frac14 -URAND( f h) The noise patterns were rendered in thespatial domain by taking the inverse Fourier transformof MASKo( f h) and a given URAND( f h) with bothshifted to Cartesian coordinates prior to the inverseDFT The rms contrast of all noise masks was fixed at0126 in the spatial domain by scaling the powerspectrum in the Fourier domain A total of 100 uniquemasks was generated for each of the four a and fiveorientation bias magnitude conditions Examples of the20 mask types are shown in Figure 1A

Design and procedure

Experiment 1 used a six alternative forced choice(AFC) scene categorization backward masking para-digm The experiment had a 4 (Mask a)middot 5 (OrientationBias Magnitude) mixed design with mask a a between-subjects variable and orientation bias magnitude awithin-subjects variable In each mask a condition eachparticipant saw 60 different target category stimuli(described in the General method section) for each levelof mask orientation bias magnitude (10 stimuli ran-domly selected without replacement from each scenecategory) for 300 total trials The order of targetcategory and mask orientation bias magnitude wasrandomized Participants were randomly assigned tobetween-subjects conditions

A given trial sequence consisted of a 500 ms fixationpoint (038 black circle) at the screen center (with thescreen set to mean luminance) followed by a targetscene presented for 2356 ms followed by a 1176 ms

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 5

Figure 1 (A) Example noise masks used in Experiment 1 The examples are arranged in rows for amplitude spectrum slope (ie a) and

columns for orientation magnitude Specifically amplitude spectrum slope increases from the top row (afrac14 00) to the bottom row (a

frac14 15) in steps of 05 The far left column is Omag frac14 00 (ie no orientation bias) to the far right column Omag frac14 10 (maximum

orientation bias) in steps of 025 (B) Shows orientation averaged amplitude spectra for the four different noise mask slopes (C)

Shows a 2-D contour plot of Ofilt(f h) plotted in polar coordinates Note the distinct horizontal and vertical orientation bias

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 6

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

(2007) and Loschky et al (2010) studies Experiments1ndash3 aimed at answering this question

Amplitude thorn phase backward masking

The studies by Loschky et al (2007) found thatactual unaltered scenes used as masks (ie a type ofamplitudethorn phase mask) were much more effective atinterfering with rapid scene categorization than phase-randomized scene masks (ie amplitude-only masks)This finding suggests that amplitude-only masks andamplitudethorn phase masks produce different maskingfunctions However the amplitude thorn phase masksutilized by Loschky et al (2007) contained recognizablescene structure Thus an alternative explanation forsuch differences in masking effects would be conceptualmasking Loschky et al (2010) therefore explored thisconceptual masking account in a follow-up study usingunrecognizable amplitudethorn phase masks Those maskswere created by feeding real-world scene images to thePortilla and Simoncelli (2000) texture synthesis algo-rithm which produced scene-texture masks containingunrecognizable conjoint amplitude thorn phase-definedimage structures Loschky et al (2010) found that suchamplitudethorn phase masks interfered with rapid scenecategorization far more than amplitude-only (phaserandomized) masks and both masked more than whitenoise Critically the unrecognizable amplitudethorn phasemasks produced 74 of the difference in maskingfound between white noise masks and unaltered scenesused as masks namely recognizable amplitudethorn phasemasks That is 74 of the observed masking effectsmentioned above that might have been attributed toconceptual masking could instead be explained byunrecognizable amplitude thorn phase-defined scene struc-ture

Loschky et al (2010) showed that amplitudethorn phasemasks which contained unrecognizable phase-definedimage structure yielded strong masking effects unex-plainable by conceptual masking Nevertheless we arestill left with the question of how such amplitude thornphase regularities modulate rapid scene categorizationperformance Given that those masks were derivedfrom real-world scene images the unrecognizablephase-defined image structures in the amplitude thornphase masks may have regulated scene categorizationin a category-specific manner If so to take an examplewould unrecognizable amplitudethornphase masks derivedfrom beach scenes more strongly mask beach targetsthan unrecognizable amplitude thorn phase masks derivedfrom mountain scenes (or vice versa) Experiments 4ndash5aimed at answering this question

It is important to note that utilizing backwardmasking paradigms to study rapid scene categorizationis not without its limitations One point worth

reiterating here is that masking has a long history in thestudy of early visual processes such as the detection anddiscrimination of sinusoidal gratings or Gabor pat-terns Thus backward masking effects in rapid scenecategorization must be considered in the terms of earlyspatial vision mechanisms assessed by visual maskingSpecifically one must consider whether a given set ofbackward masking effects can be predicted by theknown response characteristics of early visual mecha-nisms which could apply equally well to a wide arrayof visual tasks including rapid scene categorization Forexample if a given backward masking effect can beaccounted for by a reduction of the signal-to-noiseratio in early spatial channels then the information inthe mask may simply have been more effective atstopping informative scene information from beingpassed along to higher levels where scene categorizationis thought to occur (eg the parahippocampal placearea [PPA] or the lateral occipital complex [LOC]) Wetherefore apply this approach to interpreting the resultsof the current study In doing so we attempt todissociate those masking effects more likely to beexplained by earlier spatial channel interactions fromthose less likely to be so explained

The current study

To return to the primary goals laid out in ourintroductory paragraph the current study (a) providesan affirmative answer to the question of whetherunrecognizable amplitude-only masks versus unrecog-nizable amplitude thorn phase masks produce differentmasking functions and (b) addresses how the spectralcharacteristics of amplitude-only masks and amplitudethorn phase masks modulate their respective maskingfunctions Experiments 1ndash3 were concerned with theeffects of the spectral characteristics of amplitude-onlymasks Experiment 1 employed noise masks containingonly amplitude-defined structure (ie variable slope aand variable orientation bias magnitudes) Experiment2 followed up Experiment 1 by extending its crucialmasking conditions into the temporal domain Exper-iment 3 controlled for the perceived contrast (ratherthan physical contrast) of the noise masks used inExperiments 1 and 2 Experiments 4ndash5 were concernedwith the effects of the spectral characteristics ofamplitude-only versus amplitude thorn phase masksExperiment 4 assessed the recognizability of imagestructure in amplitude-only masks versus amplitude thornphase masks that would be used in Experiment 5 Thiswas necessary to ensure minimal to no recognizabilityin order eliminate conceptual masking effects Exper-iment 5 then assessed whether unrecognizable ampli-tude-only versus amplitudethorn phase masks interfered

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 3

with rapid scene categorization in a category-specificmanner

General method

Apparatus

For experiments conducted at Kansas State Uni-versity (Experiments 1 3 4 amp 5) all stimuli werepresented on 19 in Samsung SyncMaster 957 MB Smonitors These were driven by Optiplex 170L Pentium4 processors (28 GHz) which were equipped with 512RAM and Intel 82865G Graphics Controller Integrat-ed Chipset graphics cards which had 8-bit grayscaleresolution Stimuli were displayed using a linearizedlook-up table generated by calibrating with a Color-Vision Spyder2 Pro sensor Maximum luminanceoutput of the display monitors was 806 cdm2 theframe rate was set to 85 Hz and the resolution was setto 1024 middot 768 pixels Single pixels subtended 03698 ofvisual angle (ie 221 arc min) as viewed from adistance of 533 cm using machined chin rests

For experiments conducted at Colgate University(Experiment 2) all stimuli were presented on a 21 inViewsonic (G225fB) monitor This was driven by a dualcore Intelt Xeont processor (160 GHz x2) which wasequipped with 4 GB RAM and a 256 MB PCIe x16ATI FireGL V7200 dual DVIVGA graphics card andwith 8-bit grayscale resolution Stimuli were displayedusing a linearized look-up table generated by cali-brating with a Color-Vision Spyder3 Pro sensorMaximum luminance output of the display monitorwas set to yield 806 cdm2 the frame rate was set to 85Hz and the resolution was set to 1024 middot 768 pixelsSingle pixels subtended 003638 of visual angle (ie218 arc min) as viewed from a distance of 580 cmusing an Applied Science Laboratories (ASL) chin andforehead rest

Scene image database

A total of 600 grayscale scene images (selected fromthe Internet and free of any copyright restrictions) wereused in the current study (100 images per scenecategory) as either target stimuli or as images used toconstruct mask stimuli There were six basic levelcategories (a) beaches (b) forests (c) airports (d)streets (e) home interiors and (f) store interiors Thesecould be further categorized as three conjoint super-ordinate categories natural outdoor (beaches andforests) man-made outdoor (airports and streets) andman-made indoor (home and store interiors) Eachimage category set was assembled by cropping a 512 middot

512 pixel region from a random location within thelarger image

Each image I(x y) was normalized to possess afixed root-mean square (rms) contrast (as defined inPavel Sperling Riedl amp Vanderbeek 1987) and meanluminance in the spatial (ie image) domain using thesequence of functions that can be found in Hansen andHess (2007) Here rms was set to 0126 with the meangrayscale value of all images fixed at 127 Theparticular choice of rms allowed for a suprathresholdcontrast that did not result in pixel grayscale valuesoutside the 0ndash255 range

Experiment 1

The current experiment was designed to determinewhich aspect of the amplitude spectrum (slope ororientation) in amplitude-only masks is responsible formasking rapid scene categorization performance (a)the slope of the amplitude spectrum (b) the orientationbiases or (c) both

Method

Participants

Forty-seven Kansas State University students (34female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent (with written parental consent also given forthose under the age of 18) All observers had normal(or corrected to normal) vision and their ages rangedfrom 16 to 30 years (M frac14 1933 SD frac14 293)

Noise mask creation

All noise masks were constructed in the Fourierdomain using MATLAB (version R2008a) includingImage Processing (version 61) and Signal Processing(version 69) toolboxes Mask construction involvedcreating a base amplitude spectrum BASEAMP(f h)weighted by an orientation filter The orientation filterused here consisted of an orientation biased amplitudespectrum ORNTAMP( f h) constructed by averagingthe amplitude spectra from all images described in theGeneral method section The base and orientationbiased spectra were created in polar coordinates thus fand h represent the spatial frequency and orientationdimensions respectively The orientation filter firstinvolved constructing an amplitude spectrum from theaverage of the entire set of scene category images (nfrac14600 total images) shifted into polar coordinates withthe DC (ie zero frequency) component assigned a

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 4

zero The averaged amplitude spectrum denoted asORNTAMP( f h) possessed the typical orientation biasobserved in natural scenes namely more contrast alongthe cardinal axes vertical and horizontal than theobliques (eg Coppola Purves McCoy amp Purves1998 Hansen amp Essock 2004 Keil amp Cristobal 2000Switkes Mayer amp Sloan 1978 Torralba amp Oliva2003 van der Schaaf amp van Hateren 1996) However italso contained the typical spatial frequency contrastbias in (ie much more contrast at lower than higherspatial frequencies) Since we needed ORNTAMP( f h)to only serve as an orientation filter we next removedthe spatial frequency contrast bias by the followingoperations First we created an orientation averagedvector OA( fu) by averaging the amplitude coefficientsacross all orientations for each spatial frequency inORNTAMP( f h) stated formally as

OAeth fuTHORN frac141

vn

Xufrac141un

vfrac141vn

ORNTAMPeth fu hvTHORN eth1THORN

where the subscripts u and v represent spatial frequencyand orientation coordinates respectively with un beingthe total number of spatial frequencies (in cycles perpicture cpp) and vn being the total number oforientations Importantly due to the nature of thediscrete Fourier transform (DFT) (eg Hansen ampEssock 2004) the total number of orientations vn at agiven spatial frequency varies monotonically withincreasing spatial frequency Next to remove the spatialfrequency bias from ORNTAMP( f h) each orientationvector in ORNTAMP( f h) was divided by OA( fu) Thisproduced a flat amplitude spectrum (ie a frac14 00) butwith the orientation bias preserved which served as theorientation filter used to weight the amplitude spectra ofthe noise masks in the current experiment This isformally stated as

Ofilteth f hTHORN frac14ORNTAMPeth f hvfrac141vnTHORN

OAeth fuTHORN eth2THORN

Here f in the numerator is without a subscript as thedivision is applied to all spatial frequencies along eachorientation vector within ORNTAMP( f h) FinallyOfilt( f h) represents the orientation filter itself

Next the base amplitude spectrum BASEAMP( f h)was constructed in a single 512 middot 512 polar matrixwith all coordinates assigned the same arbitraryamplitude coefficient (except the location of the DCcomponent which was assigned a zero) The result is aflat isotropic broadband spectrum (ie possessingequal contrast across all spatial frequencies andorientations) Then the base spectrum exponent couldbe adjusted by multiplying each spatial frequencyrsquosamplitude coefficient by f a with a representing theamplitude spectrum exponent For the current exper-

iment a was set to either 00 05 10 or 15 to capturethe typical range of as observed in scene imagery (eg05 to 15) and a white noise condition (ie a frac14 00)

Then an orientation bias was applied to a basespectrum having one of the four different a values byweighting it with Ofilt(f h) according to the followingoperation

MASKo frac14 ethBASEAMP f hfrac12 middot 1Omag

THORN

thornethOfilt f hfrac12 middot Omag

THORNg eth3THORN

Here Omag was a scalar value controlling theorientation bias magnitude which took on one of fivevalues from zero to one (in steps of 025) with zeroproducing an isotropic amplitude spectrum and oneproducing a mask amplitude spectrum having the fullbias in ORNTAMP( f h) Thus subscript ao representsmask amplitude spectra possessing one of the fouramplitude spectrum slopes (a) and one of the fiveorientation bias magnitudes (o)

Finally we used a unique phase spectrum for eachmask by assigning random values fromp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar space URAND( f h)frac14 -URAND( f h) The noise patterns were rendered in thespatial domain by taking the inverse Fourier transformof MASKo( f h) and a given URAND( f h) with bothshifted to Cartesian coordinates prior to the inverseDFT The rms contrast of all noise masks was fixed at0126 in the spatial domain by scaling the powerspectrum in the Fourier domain A total of 100 uniquemasks was generated for each of the four a and fiveorientation bias magnitude conditions Examples of the20 mask types are shown in Figure 1A

Design and procedure

Experiment 1 used a six alternative forced choice(AFC) scene categorization backward masking para-digm The experiment had a 4 (Mask a)middot 5 (OrientationBias Magnitude) mixed design with mask a a between-subjects variable and orientation bias magnitude awithin-subjects variable In each mask a condition eachparticipant saw 60 different target category stimuli(described in the General method section) for each levelof mask orientation bias magnitude (10 stimuli ran-domly selected without replacement from each scenecategory) for 300 total trials The order of targetcategory and mask orientation bias magnitude wasrandomized Participants were randomly assigned tobetween-subjects conditions

A given trial sequence consisted of a 500 ms fixationpoint (038 black circle) at the screen center (with thescreen set to mean luminance) followed by a targetscene presented for 2356 ms followed by a 1176 ms

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 5

Figure 1 (A) Example noise masks used in Experiment 1 The examples are arranged in rows for amplitude spectrum slope (ie a) and

columns for orientation magnitude Specifically amplitude spectrum slope increases from the top row (afrac14 00) to the bottom row (a

frac14 15) in steps of 05 The far left column is Omag frac14 00 (ie no orientation bias) to the far right column Omag frac14 10 (maximum

orientation bias) in steps of 025 (B) Shows orientation averaged amplitude spectra for the four different noise mask slopes (C)

Shows a 2-D contour plot of Ofilt(f h) plotted in polar coordinates Note the distinct horizontal and vertical orientation bias

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 6

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

with rapid scene categorization in a category-specificmanner

General method

Apparatus

For experiments conducted at Kansas State Uni-versity (Experiments 1 3 4 amp 5) all stimuli werepresented on 19 in Samsung SyncMaster 957 MB Smonitors These were driven by Optiplex 170L Pentium4 processors (28 GHz) which were equipped with 512RAM and Intel 82865G Graphics Controller Integrat-ed Chipset graphics cards which had 8-bit grayscaleresolution Stimuli were displayed using a linearizedlook-up table generated by calibrating with a Color-Vision Spyder2 Pro sensor Maximum luminanceoutput of the display monitors was 806 cdm2 theframe rate was set to 85 Hz and the resolution was setto 1024 middot 768 pixels Single pixels subtended 03698 ofvisual angle (ie 221 arc min) as viewed from adistance of 533 cm using machined chin rests

For experiments conducted at Colgate University(Experiment 2) all stimuli were presented on a 21 inViewsonic (G225fB) monitor This was driven by a dualcore Intelt Xeont processor (160 GHz x2) which wasequipped with 4 GB RAM and a 256 MB PCIe x16ATI FireGL V7200 dual DVIVGA graphics card andwith 8-bit grayscale resolution Stimuli were displayedusing a linearized look-up table generated by cali-brating with a Color-Vision Spyder3 Pro sensorMaximum luminance output of the display monitorwas set to yield 806 cdm2 the frame rate was set to 85Hz and the resolution was set to 1024 middot 768 pixelsSingle pixels subtended 003638 of visual angle (ie218 arc min) as viewed from a distance of 580 cmusing an Applied Science Laboratories (ASL) chin andforehead rest

Scene image database

A total of 600 grayscale scene images (selected fromthe Internet and free of any copyright restrictions) wereused in the current study (100 images per scenecategory) as either target stimuli or as images used toconstruct mask stimuli There were six basic levelcategories (a) beaches (b) forests (c) airports (d)streets (e) home interiors and (f) store interiors Thesecould be further categorized as three conjoint super-ordinate categories natural outdoor (beaches andforests) man-made outdoor (airports and streets) andman-made indoor (home and store interiors) Eachimage category set was assembled by cropping a 512 middot

512 pixel region from a random location within thelarger image

Each image I(x y) was normalized to possess afixed root-mean square (rms) contrast (as defined inPavel Sperling Riedl amp Vanderbeek 1987) and meanluminance in the spatial (ie image) domain using thesequence of functions that can be found in Hansen andHess (2007) Here rms was set to 0126 with the meangrayscale value of all images fixed at 127 Theparticular choice of rms allowed for a suprathresholdcontrast that did not result in pixel grayscale valuesoutside the 0ndash255 range

Experiment 1

The current experiment was designed to determinewhich aspect of the amplitude spectrum (slope ororientation) in amplitude-only masks is responsible formasking rapid scene categorization performance (a)the slope of the amplitude spectrum (b) the orientationbiases or (c) both

Method

Participants

Forty-seven Kansas State University students (34female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent (with written parental consent also given forthose under the age of 18) All observers had normal(or corrected to normal) vision and their ages rangedfrom 16 to 30 years (M frac14 1933 SD frac14 293)

Noise mask creation

All noise masks were constructed in the Fourierdomain using MATLAB (version R2008a) includingImage Processing (version 61) and Signal Processing(version 69) toolboxes Mask construction involvedcreating a base amplitude spectrum BASEAMP(f h)weighted by an orientation filter The orientation filterused here consisted of an orientation biased amplitudespectrum ORNTAMP( f h) constructed by averagingthe amplitude spectra from all images described in theGeneral method section The base and orientationbiased spectra were created in polar coordinates thus fand h represent the spatial frequency and orientationdimensions respectively The orientation filter firstinvolved constructing an amplitude spectrum from theaverage of the entire set of scene category images (nfrac14600 total images) shifted into polar coordinates withthe DC (ie zero frequency) component assigned a

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 4

zero The averaged amplitude spectrum denoted asORNTAMP( f h) possessed the typical orientation biasobserved in natural scenes namely more contrast alongthe cardinal axes vertical and horizontal than theobliques (eg Coppola Purves McCoy amp Purves1998 Hansen amp Essock 2004 Keil amp Cristobal 2000Switkes Mayer amp Sloan 1978 Torralba amp Oliva2003 van der Schaaf amp van Hateren 1996) However italso contained the typical spatial frequency contrastbias in (ie much more contrast at lower than higherspatial frequencies) Since we needed ORNTAMP( f h)to only serve as an orientation filter we next removedthe spatial frequency contrast bias by the followingoperations First we created an orientation averagedvector OA( fu) by averaging the amplitude coefficientsacross all orientations for each spatial frequency inORNTAMP( f h) stated formally as

OAeth fuTHORN frac141

vn

Xufrac141un

vfrac141vn

ORNTAMPeth fu hvTHORN eth1THORN

where the subscripts u and v represent spatial frequencyand orientation coordinates respectively with un beingthe total number of spatial frequencies (in cycles perpicture cpp) and vn being the total number oforientations Importantly due to the nature of thediscrete Fourier transform (DFT) (eg Hansen ampEssock 2004) the total number of orientations vn at agiven spatial frequency varies monotonically withincreasing spatial frequency Next to remove the spatialfrequency bias from ORNTAMP( f h) each orientationvector in ORNTAMP( f h) was divided by OA( fu) Thisproduced a flat amplitude spectrum (ie a frac14 00) butwith the orientation bias preserved which served as theorientation filter used to weight the amplitude spectra ofthe noise masks in the current experiment This isformally stated as

Ofilteth f hTHORN frac14ORNTAMPeth f hvfrac141vnTHORN

OAeth fuTHORN eth2THORN

Here f in the numerator is without a subscript as thedivision is applied to all spatial frequencies along eachorientation vector within ORNTAMP( f h) FinallyOfilt( f h) represents the orientation filter itself

Next the base amplitude spectrum BASEAMP( f h)was constructed in a single 512 middot 512 polar matrixwith all coordinates assigned the same arbitraryamplitude coefficient (except the location of the DCcomponent which was assigned a zero) The result is aflat isotropic broadband spectrum (ie possessingequal contrast across all spatial frequencies andorientations) Then the base spectrum exponent couldbe adjusted by multiplying each spatial frequencyrsquosamplitude coefficient by f a with a representing theamplitude spectrum exponent For the current exper-

iment a was set to either 00 05 10 or 15 to capturethe typical range of as observed in scene imagery (eg05 to 15) and a white noise condition (ie a frac14 00)

Then an orientation bias was applied to a basespectrum having one of the four different a values byweighting it with Ofilt(f h) according to the followingoperation

MASKo frac14 ethBASEAMP f hfrac12 middot 1Omag

THORN

thornethOfilt f hfrac12 middot Omag

THORNg eth3THORN

Here Omag was a scalar value controlling theorientation bias magnitude which took on one of fivevalues from zero to one (in steps of 025) with zeroproducing an isotropic amplitude spectrum and oneproducing a mask amplitude spectrum having the fullbias in ORNTAMP( f h) Thus subscript ao representsmask amplitude spectra possessing one of the fouramplitude spectrum slopes (a) and one of the fiveorientation bias magnitudes (o)

Finally we used a unique phase spectrum for eachmask by assigning random values fromp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar space URAND( f h)frac14 -URAND( f h) The noise patterns were rendered in thespatial domain by taking the inverse Fourier transformof MASKo( f h) and a given URAND( f h) with bothshifted to Cartesian coordinates prior to the inverseDFT The rms contrast of all noise masks was fixed at0126 in the spatial domain by scaling the powerspectrum in the Fourier domain A total of 100 uniquemasks was generated for each of the four a and fiveorientation bias magnitude conditions Examples of the20 mask types are shown in Figure 1A

Design and procedure

Experiment 1 used a six alternative forced choice(AFC) scene categorization backward masking para-digm The experiment had a 4 (Mask a)middot 5 (OrientationBias Magnitude) mixed design with mask a a between-subjects variable and orientation bias magnitude awithin-subjects variable In each mask a condition eachparticipant saw 60 different target category stimuli(described in the General method section) for each levelof mask orientation bias magnitude (10 stimuli ran-domly selected without replacement from each scenecategory) for 300 total trials The order of targetcategory and mask orientation bias magnitude wasrandomized Participants were randomly assigned tobetween-subjects conditions

A given trial sequence consisted of a 500 ms fixationpoint (038 black circle) at the screen center (with thescreen set to mean luminance) followed by a targetscene presented for 2356 ms followed by a 1176 ms

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 5

Figure 1 (A) Example noise masks used in Experiment 1 The examples are arranged in rows for amplitude spectrum slope (ie a) and

columns for orientation magnitude Specifically amplitude spectrum slope increases from the top row (afrac14 00) to the bottom row (a

frac14 15) in steps of 05 The far left column is Omag frac14 00 (ie no orientation bias) to the far right column Omag frac14 10 (maximum

orientation bias) in steps of 025 (B) Shows orientation averaged amplitude spectra for the four different noise mask slopes (C)

Shows a 2-D contour plot of Ofilt(f h) plotted in polar coordinates Note the distinct horizontal and vertical orientation bias

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 6

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

zero The averaged amplitude spectrum denoted asORNTAMP( f h) possessed the typical orientation biasobserved in natural scenes namely more contrast alongthe cardinal axes vertical and horizontal than theobliques (eg Coppola Purves McCoy amp Purves1998 Hansen amp Essock 2004 Keil amp Cristobal 2000Switkes Mayer amp Sloan 1978 Torralba amp Oliva2003 van der Schaaf amp van Hateren 1996) However italso contained the typical spatial frequency contrastbias in (ie much more contrast at lower than higherspatial frequencies) Since we needed ORNTAMP( f h)to only serve as an orientation filter we next removedthe spatial frequency contrast bias by the followingoperations First we created an orientation averagedvector OA( fu) by averaging the amplitude coefficientsacross all orientations for each spatial frequency inORNTAMP( f h) stated formally as

OAeth fuTHORN frac141

vn

Xufrac141un

vfrac141vn

ORNTAMPeth fu hvTHORN eth1THORN

where the subscripts u and v represent spatial frequencyand orientation coordinates respectively with un beingthe total number of spatial frequencies (in cycles perpicture cpp) and vn being the total number oforientations Importantly due to the nature of thediscrete Fourier transform (DFT) (eg Hansen ampEssock 2004) the total number of orientations vn at agiven spatial frequency varies monotonically withincreasing spatial frequency Next to remove the spatialfrequency bias from ORNTAMP( f h) each orientationvector in ORNTAMP( f h) was divided by OA( fu) Thisproduced a flat amplitude spectrum (ie a frac14 00) butwith the orientation bias preserved which served as theorientation filter used to weight the amplitude spectra ofthe noise masks in the current experiment This isformally stated as

Ofilteth f hTHORN frac14ORNTAMPeth f hvfrac141vnTHORN

OAeth fuTHORN eth2THORN

Here f in the numerator is without a subscript as thedivision is applied to all spatial frequencies along eachorientation vector within ORNTAMP( f h) FinallyOfilt( f h) represents the orientation filter itself

Next the base amplitude spectrum BASEAMP( f h)was constructed in a single 512 middot 512 polar matrixwith all coordinates assigned the same arbitraryamplitude coefficient (except the location of the DCcomponent which was assigned a zero) The result is aflat isotropic broadband spectrum (ie possessingequal contrast across all spatial frequencies andorientations) Then the base spectrum exponent couldbe adjusted by multiplying each spatial frequencyrsquosamplitude coefficient by f a with a representing theamplitude spectrum exponent For the current exper-

iment a was set to either 00 05 10 or 15 to capturethe typical range of as observed in scene imagery (eg05 to 15) and a white noise condition (ie a frac14 00)

Then an orientation bias was applied to a basespectrum having one of the four different a values byweighting it with Ofilt(f h) according to the followingoperation

MASKo frac14 ethBASEAMP f hfrac12 middot 1Omag

THORN

thornethOfilt f hfrac12 middot Omag

THORNg eth3THORN

Here Omag was a scalar value controlling theorientation bias magnitude which took on one of fivevalues from zero to one (in steps of 025) with zeroproducing an isotropic amplitude spectrum and oneproducing a mask amplitude spectrum having the fullbias in ORNTAMP( f h) Thus subscript ao representsmask amplitude spectra possessing one of the fouramplitude spectrum slopes (a) and one of the fiveorientation bias magnitudes (o)

Finally we used a unique phase spectrum for eachmask by assigning random values fromp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar space URAND( f h)frac14 -URAND( f h) The noise patterns were rendered in thespatial domain by taking the inverse Fourier transformof MASKo( f h) and a given URAND( f h) with bothshifted to Cartesian coordinates prior to the inverseDFT The rms contrast of all noise masks was fixed at0126 in the spatial domain by scaling the powerspectrum in the Fourier domain A total of 100 uniquemasks was generated for each of the four a and fiveorientation bias magnitude conditions Examples of the20 mask types are shown in Figure 1A

Design and procedure

Experiment 1 used a six alternative forced choice(AFC) scene categorization backward masking para-digm The experiment had a 4 (Mask a)middot 5 (OrientationBias Magnitude) mixed design with mask a a between-subjects variable and orientation bias magnitude awithin-subjects variable In each mask a condition eachparticipant saw 60 different target category stimuli(described in the General method section) for each levelof mask orientation bias magnitude (10 stimuli ran-domly selected without replacement from each scenecategory) for 300 total trials The order of targetcategory and mask orientation bias magnitude wasrandomized Participants were randomly assigned tobetween-subjects conditions

A given trial sequence consisted of a 500 ms fixationpoint (038 black circle) at the screen center (with thescreen set to mean luminance) followed by a targetscene presented for 2356 ms followed by a 1176 ms

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 5

Figure 1 (A) Example noise masks used in Experiment 1 The examples are arranged in rows for amplitude spectrum slope (ie a) and

columns for orientation magnitude Specifically amplitude spectrum slope increases from the top row (afrac14 00) to the bottom row (a

frac14 15) in steps of 05 The far left column is Omag frac14 00 (ie no orientation bias) to the far right column Omag frac14 10 (maximum

orientation bias) in steps of 025 (B) Shows orientation averaged amplitude spectra for the four different noise mask slopes (C)

Shows a 2-D contour plot of Ofilt(f h) plotted in polar coordinates Note the distinct horizontal and vertical orientation bias

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 6

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

Figure 1 (A) Example noise masks used in Experiment 1 The examples are arranged in rows for amplitude spectrum slope (ie a) and

columns for orientation magnitude Specifically amplitude spectrum slope increases from the top row (afrac14 00) to the bottom row (a

frac14 15) in steps of 05 The far left column is Omag frac14 00 (ie no orientation bias) to the far right column Omag frac14 10 (maximum

orientation bias) in steps of 025 (B) Shows orientation averaged amplitude spectra for the four different noise mask slopes (C)

Shows a 2-D contour plot of Ofilt(f h) plotted in polar coordinates Note the distinct horizontal and vertical orientation bias

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 6

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

interstimulus interval (ISI) (a mean luminance blankscreen stimulus onset asynchrony [SOA] frac14 3532 ms)then a noise mask for 7067 ms (ie 31 masktargetduration ratio) followed by a mean luminance screenfor 750 ms and then the response screen containing a 2middot 3 grid of the six basic level scene category labels untilthe observer responded by a mouse click on thecategory label matching the target stimulus Thepositions of the category labels were randomized oneach trial to avoid category location response biasThat is by randomizing the locations of responsechoices on every trial any location-based response bias(eg top left response cell) would be randomlydistributed across all categories rather than only asingle category Note however that this requiresparticipants to first search for the correct answer oneach trial adding to their reaction time making thismethod better suited to analyzing accuracy thanreaction times Before the experiment began partici-pants were given practice trials using target stimulifrom categories not used in the experiment but withnoise masks from the assigned a condition and variablemask orientation bias magnitude

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope aresulted in 12 in the afrac14 00 condition 11 in the afrac14 05condition 11 in the afrac14 10 condition and 13 in the afrac1415 condition Figure 2a shows mean scene categoriza-tion accuracy for each amplitude spectrum slope amask condition as a function of mask orientation bias

magnitude Figure 2b shows the same data plotted foreach level of mask orientation bias as a function of maskamplitude spectrum a which we tested with a 4 (Mask a)middot 5 (Orientation Bias Magnitude) mixed repeatedmeasures analysis of variance (ANOVA) The data inFigure 2 show a strong masking effect for amplitudespectrum a which was confirmed by a significantbetween-subjects effect of mask amplitude spectrum aF(3 43)frac14 2409 p 0001 Cohenrsquos f 2frac14 02711 On theother hand Figure 2 shows little to no effect of maskorientation bias magnitude which is also confirmed by anonsignificant within-subjects effect of mask orientationbias magnitude F(4 172)frac14 040 pfrac14 081 Likewise assuggested by Figure 2 there was no significantinteraction between mask a and mask orientation biasmagnitude F(12 172)frac14 0735 pfrac14 072

We carried out post-hoc comparisons of eachamplitude slope condition against the others using aBonferonni adjusted family-wise a level of 00083 for thesix unique comparisons among the four conditions Tocarry out the independent groups two-tailed t tests wefirst created equal sized groups by randomly selectingone subject to omit from the a frac14 0 condition and twosubjects to omit from the afrac14 15 condition (though thestatistical test results were the same whether the subjectswere omitted or not) As suggested by examination ofFigure 2a all three comparisons with the a frac14 10condition were significant (ts 311 ps 0006Cohenrsquos ds 056) as was the comparison of the afrac14 05and 00 (white noise) conditions t(20)frac14 551 p 0001Cohenrsquos dfrac14 191 Conversely the comparison of the afrac1415 and 00 conditions was not significant t(20)frac14 163 pfrac14 0118 nor was the comparison of the afrac14 15 and 05conditions t(20)frac14 242 pfrac14 0025 (based on the family-wise adjusted alpha level)

Figure 2 Averaged data for Experiment 1 For both plots the ordinate shows averaged percent correct (note that the scale begins at

40 to increase visibility) (A) Shows amplitude spectrum slope masking functions for the different mask orientation bias magnitudes

(OM) (B) Shows mask orientation bias magnitude (OM) masking functions for the four different mask amplitude spectrum slopes

Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 7

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

Lastly while unlikely (see Introduction andLoschky et al 2007) we examined whether categori-zation performance for any given scene category wasmasked more or less compared to the others as afunction of mask orientation bias magnitude Therewere no significant differences for any of the six scenecategories as a function of mask orientation biasmagnitude for any of the mask amplitude spectrumslopes ps 005 (results not shown) Thus in terms ofwhich amplitude-only spectral characteristics moststrongly mask rapid scene categorization the currentresults show that the distribution of contrast acrossspatial frequency (ie amplitude spectrum slope a) ismuch stronger than the distribution of contrast acrossorientation

Finally the masking functions shown in Figure 2bshow an interesting trend that seems related to thetypical amplitude spectrum slope found in naturalscenes (ie a rsquo 10) Specifically it is tempting toconclude that the more similar the mask slopes wereto the typical slope observed in natural scenes thestronger the masking effect However since the entiretarget image set had a mean slope of 112 (SD frac14012) and because there were no category-specificmasking effects (across orientation magnitude orslope) Occamrsquos razor suggests the simpler (thusbetter) description is that the observed maskingeffects follow an amplitude slope similarity principleSpecifically the more similar the mask amplitudeslope was to the target image slope the stronger themasking effect We will return to this notion inExperiments 4 and 5 as well as in the Generaldiscussion

Experiment 2

Given the amplitude spectrum slope was the drivingforce in the rapid scene categorization masking effectsin Experiment 1 for a fixed targetmask SOAExperiment 2 further explored the crucial conditionsfrom Experiment 1 within the temporal processingdomain Specifically we investigated whether theamplitude spectrum slope a was more or less of a factorat varying processing times Thus we replicatedExperiment 1 for masks set to either afrac14 00 or 10 witheach possessing either a strong or no orientation bias(Omag frac14 10 vs 00) across five different targetmaskSOAs We note along with VanRullen (2011) that therelationship between masking SOA and processing timeis not as simple as suggested by many studies usingbackward masking to study the time course ofperception Specifically neurophysiological studies thathave investigated the link between masking SOA andthe time course of target-related brain activity have

shown that while SOA and processing time are notidentical they are related (eg Rieger Braun Bulthoffamp Gegenfurtner 2005)

Method

Participants

Fifty-three Colgate University students (34 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent All ob-servers had normal (or corrected to normal) visionand their ages ranged from 18 to 21 years (M frac14 188SD frac14 079)

Design and procedure

The mask stimuli were identical to those used inExperiment 1 with the following exceptions Only twomask amplitude spectrum a values (afrac14 00 and afrac14 10)and two mask orientation bias magnitudes (Omagfrac14 00and Omag frac14 10) were employed in the currentexperiment Target stimuli were drawn from the setused in Experiment 1 as described in the Generalmethod section

The current experiment employed the same para-digm as Experiment 1 except that the currentexperiment utilized five different SOAs and a no-maskcondition The experiment used a 2 (Mask a) middot 2(Orientation Bias Magnitude) middot 6 (SOA amp No-Mask)mixed design with mask a and orientation biasmagnitude being between-subjects variables and SOAbeing a within-subjects variable Within a given mask aand orientation bias magnitude condition each par-ticipant saw 60 different target stimuli (described in theGeneral method section) for each SOA (with 10 stimulirandomly selected without replacement from each ofsix scene categories) For the no-mask condition anadditional 60 images were randomly sampled (10 perscene category) for a total of 360 trials Order of targetcategory and SOA (including no mask) was random-ized for each trial Participants were randomly assignedto the between-subjects conditions

The trial sequence was identical to that of Exper-iment 1 except that the targetmask ISI was set to yieldfive different SOAs (frac14 target durationthorn ISI) 2356 ms(ie 0 ms ISI) 3529 ms 4705 ms 7057 ms and11764 ms For no-mask trials the target image wassimply followed by the same mean luminance screenfor 750 ms as in the masking conditions (as inExperiment 1) Participantsrsquo task was the same asExperiment 1 and they were given practice trials usingtarget stimuli from categories not used in theexperiment

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 8

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope athornorientation bias magnitude resulted in 12 in the afrac1400thornorientation biasfrac14 00 condition 14 in the afrac14 00thornorientation biasfrac14 10 condition 13 in the afrac14 00thornorientation biasfrac14 00 condition and 14 in the afrac14 10thornorientation biasfrac14 10 condition Figure 3 shows meanscene categorization accuracy for each amplitude spec-trum slope a and orientation bias mask condition as afunction of SOA (and no mask) and we conducted a 2(Mask a) middot 2 (Mask Orientation Bias Magnitude) middot 5(SOA) mixed repeated measures ANOVA to test theeffects of these factors As expected Figure 3 shows areduction ofmasking strengthwith increasing SOA for allbetween-subjects masking conditions which was con-firmed by a significant main effect of SOA F(4 196)frac1413564 p 0001 partial g2frac14074Also as inExperiment1 noise masks with amplitude spectrum afrac1410 producedstronger effects than that of afrac1400 for the shorter SOAsas shown by a significant main effect of mask a F(1 49)frac141571 p 0001 Cohenrsquos f 2frac14026 anda significantmaskamiddotSOAinteractionF(4 196)frac143645p 0001Cohenrsquosf 2frac14101 Interestingly the effect ofmask orientation biasmagnitudewas absent in both afrac1410 conditions but therewas an apparent effect of orientationmagnitude in the afrac1400 conditions However this was not supported by theANOVA which found neither a significant main effect oforientation magnitude nor any significant interactionsinvolving it (ps 0212) Thuswe explored the significantMask a middot SOA interaction to identify the SOAs yieldingsignificant differences between the twomask a conditionsWe collapsed across orientation magnitude within eachlevel of mask a and ran independent t tests (Bonferonniadjusted family-wise a levelfrac14 00083) between the two agroups at each SOA (and the no-mask condition) Thisshowed significantdifferences for the three shortest SOAs24 ms t(50)frac14 789 p 0001 Cohenrsquos dfrac14 227 36 mst(50)frac14364 pfrac140001Cohenrsquos dfrac14106 and 48ms t(50)frac1431 pfrac140003 Cohenrsquos dfrac14 09 no SOAs 48 ms(including the no-mask condition) showed significantdifferences between the two mask a groups (ps 0164)Thus the effect of noisemask a (ie strongermasking foramplitude spectra afrac14 10) was only found within thecritically important first 50 ms of scene gist processingtime (Loschky et al 2007 Loschky et al 2010) whichfollowed the target-mask amplitude slope similarityprinciple observed in Experiment 1

Experiment 3

So far we have found that amplitude-only slope afrac1410 masks produce the strongest backward masking of

rapid scene categorization (when compared to afrac14 0[white noise] 05 and 15) specifically very early inscene category processing (ie the first 50 ms ofprocessing time) One simple explanation for thestronger masking produced by a frac14 10 masks is thatthey have higher perceived contrast than smaller orlarger a values and higher perceived contrast producesstronger masking Specifically previous studies (egCass Alais Spehar amp Bex 2009 Field amp Brady 1997Tolhurst amp Tadmor 2000) have noted a difference insubjective contrast between rms contrast balancedimages with varying a More specifically Hansen andHess (2012 figure 4a) found that afrac14 10 noise patternswere perceived to have 83 more rms contrast than afrac1400 noise and 50 more rms contrast than a frac14 15noise patterns This trend mirrors the results ofExperiment 1 in which the largest advantage for afrac1410noise masks was in comparison to afrac14 00 noise masksand the next largest advantage was in comparison to afrac14 15 noise masks

Thus Experiment 3 sought to equate the perceivedcontrast of the a frac14 00 and a frac14 10 noise masks Thecontrast matching data reported by Hansen and Hess(2012) found a 83 difference in perceived contrastbetween afrac14 00 and afrac14 10 noise Thus in Experiment3 we set the a frac14 00 [white] noise masks to an rmscontrast twice the rms contrast of the afrac14 10 masks

Method

Participants

Thirty-two Kansas State University students (16female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed written

Figure 3 Averaged data from Experiment 2 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 9

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

consent (with written parental consent also given forthe subject under the age of 18) All observers hadnormal (or corrected to normal) vision and their agesranged from 17 to 23 years (M frac14 191 SDfrac14 142)

Design and procedure

The mask stimuli were identical to those used inExperiment 2 but with the following exceptions Therewere only two mask amplitude spectrum a values (afrac1400 rms frac14 0252 and afrac14 10 rms frac14 0126) and onemask orientation bias magnitude (00 no orientationbias) Target stimuli were drawn from the set describedin the General method section

The current experiment employed the same paradigmas Experiment 2 The experiment used a 2 (Mask a) middot 5(TargetMask SOA) mixed design with mask a being thebetween-subjects variable and SOA being the within-subjects variable Within a given mask a and orientationbias magnitude condition each participant saw 60different target category stimuli for each SOA (10 stimulirandomly selected without replacement from each scenecategory) For the no-mask condition an additional 60images were randomly sampled (10 per scene category)for a total of 360 trials Order of target category and SOA(including no mask) was randomized for each trialParticipants were randomly assigned to the between-subjects condition The trial sequencewas identical to thatreported in Experiment 2 (including practice trials)

Results and discussion

Random assignment of participants to the between-subjects variable mask amplitude spectrum slope a

resulted in 13 in the afrac1400 condition and 19 in the afrac1410condition Figure 4 shows mean scene categorizationaccuracy for each amplitude spectrum slope a conditionas a function of targetmask SOA (and no mask) and weconducted a 2 (Mask a) middot 5 (TargetMask SOA) mixedrepeated measures ANOVA to test the effects of thesefactors As can be seen in Figure 4 there was a largedifference between the two mask a conditions which wasconfirmed a significant between-subjects main effect ofmaskaF(1 30)frac141447p 0001Cohenrsquos f 2frac14021Thisis despite the fact that the afrac1400masks possessed twice asmuch physical contrast as the afrac14 10 masks and wereapproximately equivalent in perceived contrast

We next carried out post-hoc independent t testsbetween the two mask a groups at each SOA For thefive comparisons the Bonferonni adjusted family-wise alevel was set to 001 In order to carry out theindependent groups two-tailed t tests we first createdequal sized groups by randomly selecting six subjects toomit from the afrac14 1 condition (though the statistical testresults were the same whether the subjects were omittedor not) As shown in Figure 4 the comparisons withSOAs of 50 ms or less were statistically significant (ts 395 ps 0001 Cohenrsquos ds 083) but not from 70 msSOA onward SOAfrac1470 ms t(24)frac14221 pfrac140037 SOAfrac14 117 ms t(24)frac14215 pfrac140042 no mask t(24)frac14 077 pfrac14 0450 (based on the family-wise adjusted a level)

Thus Experiment 3 showed that the difference inmasking between the afrac14 10 and 00 conditions inExperiments 1 and 2 could not be eliminated by simplymatching theperceivedcontrast of themasks (bydoublingthe physical contrast of the afrac1400 masks) Even morestrikingly as shown in Figure 4 the afrac14 00 maskingfunction with rms contrastfrac14 0252 did not significantlydiffer from that in Experiment 2with rms contrastfrac140126(p 005) Again masking was greatest for targetmaskSOAs of roughly 50 ms or less as in Experiment 2 andcontinued to follow the target-mask amplitude slopesimilarity principle observed in Experiment 1

Experiment 4

Experiments 1ndash3 established that amplitude-onlymask slope a was the crucial determining factormodulating masking of rapid scene categorizationInterestingly that modulation followed a target-maskamplitude slope similarity principle in which the moresimilar the mask image slope was to the target imageslope the stronger the masking Here we set the stageto examine whether and how unrecognizable amplitudethorn phase-defined mask structure masks rapid scenecategorization differently from amplitude-only definedstructure Specifically does it modulate masking effectsin a category-specific manner The current experiment

Figure 4 Averaged data from Experiment 3 On the ordinate is

averaged percent correct (note that the scale begins at 40)

and on the abscissa is SOA The gray trace plots data from the

noise mask a frac14 00 and Omag frac14 00 from Experiment 2 (rms frac140126) Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 10

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

provided key information about the nature of themasks used in Experiment 5 to answer this question

Given these aims it is important to hold theamplitude-defined mask structure approximately con-stant while varying amplitudethorn phase defined structure(mainly phase-defined structure ie present vs absent)Experiments 1ndash3 all showed amplitude spectrum slopeproducing very strong masking modulation (rangingfrom 55 accuracy to 85 accuracy) so we choseto test for category-specific masking effects with a slopersquo 10 (typical of natural scenes) Thus all masks werecreated to fall within a 012 standard deviation around a112 slope namely equal to the mean and standarddeviation of slope a of our target image set

Since we were interested in assessing the role ofunrecognizable amplitude thorn phase-defined structure inmasking rapid scene categorization we carried out thecurrent experiment to determine the recognizability ofthe images that we planned to use as masks inExperiment 5 (to select masks with the lowestrecognizability) This was the same rationale used byLoschky et al (2007) and Loschky et al (2010) formeasuring the recognizability of mask images Specif-ically if one type of masking image is more recogniz-able than another and that mask type is also shown tocause greater masking then the difference in maskingcould be attributed to conceptual masking (Intraub1984 Loftus amp Ginn 1984 Loftus Hanna amp Lester1988 Potter 1976) rather than the masksrsquo imagestatistical properties Thus by choosing masks with lowrecognition probability we can strongly minimize anyinfluence of conceptual masking

Method

Participants

Fifty-six Kansas State University students (27female) all naıve to the purpose of the studyparticipated for course credit after giving their Insti-tutional Review Board-approved informed writtenconsent All observers had normal (or corrected tonormal) vision and their ages ranged from 18 to 30years (M frac14 1960 SD frac14 219)

Mask construction

Each of the six image category sets consisted of 100images Of those 24 images were randomly selected tobe used as target images (ie for use in Experiment 5)and the remaining 76 images within each set were usedto generate the masking stimuli for that basic levelcategory (ie for use in Experiment 4 and inExperiment 5) Experiment 4 tested the recognizabilityof two types of visual mask stimuli called AMP masks(ie amplitude-only masks) and AMP thorn PHASE

masks to be used in Experiment 5 Both mask typeswere created to possess identical amplitude spectralrelationships but differ in the AMP thorn PHASE maskspossessing scene image features largely defined by thephase spectrum (eg scene-specific edges and lines)Below we describe how each mask type was con-structed based on methodology modified from Hansenand Hess (2007) as illustrated in Figures 5ndash7

The AMPthorn PHASE masks were created on acategory-by-category basis so that for example theairport category masks only included phase informa-tion from airport images (specifically from the 76images not selected as target stimuli) AMPthorn PHASEmasks were created by selectively sampling definedportions of the amplitude and phase spectra of 12randomly selected seed images from the remaining 76images within the given category set Thus each maskwas derived from 12 different seed images within acategory set Once a given mask was generated itscorresponding seed images were returned to the imagepool for the given category set and thus any imagecould be resampled to create another mask stimulusThus the seed images were never repeated within agiven mask stimulus but could be randomly resampledfor a different mask stimulus from the same categoryset

Each AMP thorn PHASE mask was constructed bystarting with two empty composite spectra (ie bothfilled with zeros) in the Fourier domain consisting oftwo 512 middot 512 matrices Each matrix was referenced inpolar coordinates and split into sector segmentsillustrated in Figure 5 Specifically each sector wasdefined by a 458 band of orientations h centered on

Figure 5 Illustration of sectors and sector segment divisions in

Fourier space (referenced in polar coordinates) The dashed gray

arrow shows the spatial frequency f dimension and the solid

black arrow shows the orientation h dimension Color

represents a full sector lsquolsquobow tiersquorsquo for each orientation with

color saturation representing spatial frequency segments Note

that spatial frequency segments are scaled for ease of visibility

and do not represent actual defined spatial frequency area in

Fourier space

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 11

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

four primary orientations namely vertical (08) 458oblique horizontal (908) and 1358 oblique The sectorswere mirrored to the polar opposite side of the Fourierdomain to form a lsquolsquobow tiersquorsquo reference system for eachprimary orientation (as illustrated with the color bowties in Figures 5 and 6) Sector segments were defined asthree 2-octave bands of spatial frequencies f withineach sector (and mirrored to the polar opposite side ofthe Fourier domain) Sector segments were definedaccording to f and extended from 025ndash1 c8 1ndash4 c8and 4ndash16 c8 This reference system yields 12 mirroredsector segments one for each of the four orientationbands and each of the three spatial frequency bands

To construct an AMPthorn PHASE mask we filled itsempty composite spectrum First we randomly selectedone of the 12 seed images in a scene category and ran aDFT of it which yielded its amplitude spectrum A( fihj) and phase spectrum U( fi hj) Then we randomlyselected a mirrored sector segment from A( fi hj) andU( fi hj) with the mirrored sector segments taken fromA( fi hj) and U( fi hj) being identical in terms oflocation The amplitude coefficients and phase angleswithin the selected mirrored segment of A( fi hj) andU( fi hj) were copied into their corresponding mirroredsector segments of the composite amplitude and phase

spectrum as illustrated in Figure 6 For example inFigure 6 the top-left corner represents the mirroredsector segment containing spatial frequencies between025ndash1 c8 centered on vertical To get this informationwe randomly selected a seed image and assigned itscorresponding amplitude coefficients from A( fi hj) and

Figure 6 Illustration of creating a given composite spectrum using scenes from the airport category in Experiments 4 and 5 The blue

bow tie in Fourier space represents vertical image structure in image space Three separate airport scene images (shown in the far left

of the blue box) are used to construct that bow tie To the right of those images (middle column) are the corresponding amplitude

spectra and to the right of those the corresponding phase spectra The lower spatial frequencies (025ndash10 c8) are taken from the top

image the intermediate frequencies (10ndash40 c8) from the middle image and the high frequencies from the bottom image The

process is repeated for the three remaining bow ties but with different images for each bow tie In total 12 randomly selected

airport images would be used to create a composite AMPthorn PHASE airport image

Figure 7 Illustration showing the contribution of the different

composite spectra for an ampthorn phase mask (left) and an amp

mask (right) in Experiments 4 and 5 Thus an ampthornphase mask

is made up of composite amplitude and phase spectra sampled

from 12 different images (see Figure 6) while the correspond-

ing amp mask uses only the composite amplitude spectrum

combined with a randomized phase spectrum Thus the two

masks shown above only differ with respect to their phase

spectra (ie their amplitude spectra are identical)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 12

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

phase angles from U( fi hj) to that location in thecomposite phase and amplitude spectrum We repeatedthis process until all of the 12 mirrored sector segmentsof the composite phase and amplitude spectrum werefilled We then squared the composite amplitudespectrum (to get the power) and normalized it to theaverage power spectrum of the entire normalized scenecategory images We assigned the DC (ie the zerofrequency component) of the same averaged powerspectrum to the squared composite amplitude spec-trum These last two steps ensured that each AMP thornPHASE mask had the identical mean luminance andrms contrast of the target images Finally we renderedthe AMPthorn PHASE masks in the spatial domain bytaking the discrete inverse fast Fourier transform ofcomposite amplitude and phase spectra with aresulting AMPthornPHASE image shown in Figure 7 (lefthalf) Refer to Figures 6 and 7 for further details

Figure 7 (right half) shows how we generated theAMP masks during the process of creating the AMPthornPHASE masks Specifically for any given AMP maskinstead of taking the DFT of the composite amplitudeand phase spectra we replaced the composite phasespectrum with a randomized phase spectrum Further-more we used the same randomized phase spectrum foreach mask stimulus across all scene image categoriesWe constructed a given random phase spectrum byassigning random values from ndashp to p to thecoordinates of a 512 middot 512 polar matrix URAND( f h)such that the phase spectra were odd-symmetricmdashiefor h angles in the [p 2p] half of polar spaceURAND( f h) frac14URAND( f h) Refer to Figure 7 forfurther details Figure 8 shows example masks fromeach basic level scene category

As described above and illustrated in Figures 6 and7 the AMPthorn PHASE masks differed across categoriesby both amplitude and phase whereas the AMP masksdiffered only with respect to amplitude We also notethat the technique we described above for creating bothmask types essentially constitutes a rudimentarywavelet methodology Specifically as in typical waveletmethods composite spectra were constructed fromrelatively narrow bands of spatial frequency andorientation However unlike typical wavelet methodsthese composite global spectra were built up from localportions of different imagesrsquo Fourier spectra Thiswavelet-like approach differs from the standard ap-proach to building amplitude-only masks or amplitudethorn phase masks which has typically relied on applyingglobal manipulations to either the amplitude spectrum(as done in Experiment 1) or systematically random-izing the phase spectrum (eg Loschky et al 2007Wichmann et al 2006) The wavelet-like approach weused to create our AMPthorn PHASE masks means thatthey contained scene-specific features such as lines andedges However the fact that we constructed the masks

from randomly mixed sectors of amplitude and phasespectra from different scenes means that the masks arelikely unrecognizable (see Hansen amp Hess 2007 forfurther detail)

Scene categorization task

The design of the current experiment consisted of asingle-interval six-alternative forced-choice paradigmwith two within-subject factors 2 (Amplitude vsAmplitude thorn Phase Masks) middot 6 (Scene Categories ofMasks) There were 28 mask images per cell andeach image was presented once for a total of 336trials (randomly ordered) On each trial participantssaw a briefly flashed mask stimulus and indicatedwhich of the six basic level scene categories theybelieved it belonged to Each trial began with afixation point (subtending 0318 of visual angle) untilthe participant pressed a mouse button to initiate thetrial This was followed by a variable duration (300ndash900 ms) blank screen (neutral gray luminance frac14mean of the entire target and mask image set) then abriefly flashed stimulus (either AMP thorn PHASE orAMP mask) for 24 ms (two refresh cycles at 85 Hz)then a blank screen (same neutral gray) for 750 msduration then a response screen containing a 2 middot 3grid of the six basic level scene category labels untilthe participant responded by a mouse click As inExperiments 1ndash3 the position of each of the categorylabels in the response label grid was randomized onevery trial

Results and discussion

One participantrsquos data was removed from theanalyses for only responding lsquolsquoForestrsquorsquo on every trial(except one in which they correctly respondedlsquolsquoStreetrsquorsquo)

Results for mean accuracy by image type and basiclevel category are shown in Figure 9a As shownaccuracy for four of the six categories was at or slightlybelow chance (1667) However two of the categoriesbeach and forest were more accurately identified thanthe others This was verified by a 2 (Mask Type AMPvs AMP thorn PHASE) middot 6 (Scene Category) factorialrepeated measures ANOVA on accuracy whichshowed a main effect of category F(5 265)frac14 21812 p 0001 Cohenrsquos f 2frac14 0401 Follow-up Bonferroni-corrected t tests (adjusted alphafrac14 00083) showed thatbeach t(54)frac14 5029 p 0001 Cohenrsquos dfrac14 0684 andforest t(54)frac14 6182 p 0001 Cohenrsquos dfrac14 0841 weresignificantly above chance and that store t(54) frac145843 p 0001 Cohenrsquos dfrac14 0795 was significantlybelow chance The other categories did not reachsignificance

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 13

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

Overall these results can largely be explained in

terms of a response bias to more frequently select

natural categories (a natural bias) (Joubert et al 2009

Loschky amp Larson 2008) To correct for this response

bias we calculated the percentage of correct category

responses out of each categoryrsquos response rate (Figure

9b) Stated more formally we calculated p(correct

responsejresponsefrac14Xcat)p(responsefrac14Xcat) where Xcat

frac14 one of the six categories Bonferroni-corrected t tests

(adjusted alpha frac14 00083) showed only three instances

Figure 8 Example target and mask stimuli used in Experiment 4 (mask images only) and Experiment 5 (target and masks) The top

three text rows show the categorization hierarchy with the top two text rows showing the superordinate category structure and the

bottom text row showing the basic-level category structure The three rows of images show example targets and masks (AMP thornPHASE and AMP) from each of the basic-level categories

Figure 9 (A) Averaged data from Experiment 4 showing recognition accuracy (ordinate) for amplitude and phase masks as a function

of scene category (abscissa) (B) Data from Experiment 4 showing percentage of correct category responses out of total category

response rate The black line marks chance performance in both panels and all error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 14

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

of above-chance correct category response percentagesAMP masks beach t(53)frac14 351 p 0001 Cohenrsquos dfrac140474 and AMPthornPHASE masks beach t(53)frac14 469 p 0001 Cohenrsquos dfrac140633 and forest t(53)frac14370 p

0001 Cohenrsquos dfrac14 0499 We will return to this result inExperiment 5

Experiment 5

Experiments 1ndash3 showed that amplitude-onlymasks produced performance largely dependent onthe amplitude slope following a target-mask ampli-tude spectrum slope similarity principle In Experi-ment 5 we investigated whether amplitude thorn phasemasks would yield masking effects that differed fromamplitude-only masks and if so how The results ofExperiment 1 (and Loschky et al 2007) suggestedthat amplitude-only masks do not yield category-specific masking effects Here we hypothesized that ifcategory-specific features are used to process scenes tocategorization then target-mask categorical similarityshould affect scene gist maskingmdashnamely whether thetarget and mask categories are the same or different atthe superordinate level or at the basic level shouldaffect the degree masking of rapid scene categoriza-tion Furthermore since the masks generated inExperiment 4 possessed amplitude slopes a that allfell within 012 standard deviation of 112 (ieessentially afrac14 1) if amplitude thorn phase masks did notmask differently than observed in Experiments 1ndash3than we would not expect to observe any category-specific masking effects

Method

Participants

Thirty Kansas State University students (14 female)all naıve to the purpose of the study participated forcourse credit after giving their Institutional ReviewBoard-approved informed written consent (with writ-ten parental consent also given for those under the ageof 18) All observers had normal (or corrected tonormal) vision and their ages ranged from 17 to 45years (M frac14 205 SDfrac14 483)

Stimuli

The stimuli were those used in Experiment 4 Therewere 144 target images (24 Images middot 6 SceneCategories) There were also 144 mask images 72 ofeach type (AMP and AMPthornPHASE masks) and 12 ineach of six scene categories

Design and procedure

Experiment 5 involved four factors 2 (AMP vsAMPthorn PHASE masks) middot 6 (Scene Categories ofTargets) middot 6 (Scene Categories of Masks) middot 2 (ValidInvalid Category Labels)frac14 144 cells in the designThere were two trials per cell for a total of 288 trials(randomly ordered) There were 24 images per scene ormask category for a total of 144 target images and 144mask images Each target was used twice in theexperiment once validly labeled and once invalidly(falsely) labeled Each mask was presented twice Eachimage was also masked once with an AMP mask andonce with an AMP thorn PHASE mask

The general procedure was identical to that inExperiments 1ndash3 with the following exceptions Firstthe current experiment used only a single masking SOA(36 ms) based on a target duration of 24 ms and an ISIof 12 ms The mask was 72 ms duration for a 31masktarget duration ratio Second the responseoptions at the end of a trial differed Following thepresentation of the target mask and blank screen asingle category label was presented until the participantmade either a lsquolsquoYesrsquorsquo or lsquolsquoNorsquorsquo response The categorylabels were valid (matched the presented image) 50 ofthe time and invalid 50 of the time There were 288trials with each of the six category labels presentedequally often

Results and discussion

Nine trials across 30 subjects were found to havelonger target ISI or mask durations than intendedand were removed from the analyses leaving a total of8631 trials

We first analyzed our data in cases in which thetarget and mask came from different categories Ofparticular interest was whether the AMP versus AMPthornPHASE masks caused differential rapid scene catego-rization masking since these masking conditionsdiffered in terms of having phase-defined imagefeatures with AMP masks possessing no phase-definedstructure (compare the bottom two rows of Figure 8)To answer this question our first analysis was a 2(Mask Type AMP vs AMPthornPHASE) middot 6 (Category)repeated-measures ANOVA on rapid scene categori-zation accuracy to determine any general differences inmasking strength The results are shown in Figure 10a

As shown in Figure 10a the AMPthorn PHASE masksinterfered more with rapid scene categorization accu-racy than the AMP masks F(1 29)frac14 15062 pfrac14 0001Cohenrsquos f 2 frac14 0122 This suggests that unrecognizableamplitudethornphase-defined image characteristics disruptrapid scene categorization more than amplitude-de-fined characteristics alone As also shown in Figure10a there was a main effect of mask category F(5 145)

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 15

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

frac14 4818 p 0001 Cohenrsquos f 2frac14 0193 with some maskcategories such as beach and forest masks causing lessmasking than others such as street masks Note thatthis runs exactly opposite to the conceptual maskinghypothesis since the most recognizable mask catego-ries beach and forest (as shown in Experiment 4)caused among the least disruption of rapid scenecategorization when used as masks In addition masktype did not significantly interact with mask categoryF(5 145) 1 p frac14 0456 The above results thereforesupport the notion that category-specific phase-definedmask structure produces differential scene gist masking(here in the form of masking strength) than amplitude-only defined mask structure However the question stillremains as to whether such signals operated in acategory-specific manner Thus the other aim of theExperiment 5 was to determine whether target-maskcategory similarity would affect rapid scene categori-zationmdashnamely do amplitude thorn phase-defined catego-ry-specific features selectively mask rapid scenecategorization

To address the above question we analyzed our datain terms of both the superordinate and basic levelcategorical similarity of the target and mask imagesusing the following similarity scale naturalman-madedifferent (eg beach target amp street mask) naturalman-made same basic different (eg beach target ampforest mask) basic level same (eg beach target ampbeach mask) We carried out a 2 (Mask Type AMP vsAMPthorn PHASE) middot 3 (Categorical Similarity NaturalMan-Made Different vs NaturalMan-Made Same

Basic Different vs Basic Level Same) repeated mea-sures ANOVA on scene categorization accuracy

As shown in Figure 10b the more categoricaldissimilarity between target and mask images thegreater the interference with rapid scene categorizationaccuracy F(2 58)frac14 2736 p 0001 Cohenrsquos f 2 frac140541 Specifically Figure 10b shows that the leastcategorically similar target-mask pairings (nat-mandifferent) showed the strongest masking (lowest accu-racy) Planned comparisons showed that whether targetand mask were the same or different at the superordi-nate level (nat-man different vs nat-man same basicdifferent) significantly affected categorization accuracyF(1 29)frac14 5367 pfrac14 0028 Cohenrsquos f 2frac14 0270 but thatthe biggest difference in masking strength was due towhether target and mask were the same or different atthe basic level (nat-man same basic different vs basicsame) F(1 29)frac14 22422 p 0001 Cohenrsquos f 2frac14 0598As before AMP thorn PHASE masks caused significantlygreater masking than AMP masks F(1 29) 4377 pfrac140045 Cohenrsquos f 2 frac14 0137 Interestingly it appears inFigure 10b that categorical dissimilarity between targetand mask images mattered more for the AMP thornPHASE masks and less so for the AMP maskshowever this crossover interaction just failed to reachsignificance when correcting for non-homogeneity ofvariance F(1664 48252 [Huynh-Feldt]) frac14 3356 pfrac140052 Cohenrsquos f 2frac14 0162 Because this was so close tobeing significant we carried out two one-way ANOVAsfor categorical similarity for AMPthorn PHASE masksand for AMP masks In fact as shown in Figure 10b

Figure 10 (A) Averaged data from Experiment 5 showing rapid scene categorization accuracy (ordinate) as a function of mask type

(amplitude vs phase masks) and mask category (abscissa) (note that the ordinate scale begins at 50 [ frac14 chance] to increase

visibility) (B) Averaged data from Experiment 5 data replotted to show rapid scene categorization accuracy (ordinate) as a function of

mask type (amp only vs ampthorn phase masks) and target-mask categorical similarity (abscissa) (note that the ordinate scale begins at

50 [frac14 chance] to increase visibility) Error bars are 61 SEM lsquolsquoNat-man differentrsquorsquofrac14 target and mask are different in terms of the

naturalman-made distinction lsquolsquonat-man samersquorsquofrac14 target and mask are the same in terms of the naturalman-made distinction but

different at the basic level lsquolsquobasic samersquorsquo frac14 target and mask are the same in terms of the basic level Error bars are 61 SEM

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 16

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

there was a significant main effect for categoricaldissimilarity for both types of masks however theeffect size was approximately twice as large for theAMPthorn PHASE masks F(2 58) frac14 19404 p 0001Cohenrsquos f 2 frac14 0640 as for the AMP masks F(2 58)frac14584 pfrac14 0005 Cohenrsquos f 2frac14 0328 It therefore appearsthat both types of mask follow a target-mask categor-ical dissimilarity principle (ie the more dissimilar thetarget-mask pairs were in terms of category thestronger the masking) with AMP thorn PHASE masksyielding the strongest effects

Importantly if the effects shown in Figure 10b weredue to the small level of mask recognizability (Exper-iment 4) they are very inconsistent with the predictionsof conceptual masking Specifically the conceptualmasking hypothesis predicts greater masking by morerecognizable mask categories (Intraub 1984 Loftus ampGinn 1984 Loftus Hanna amp Lester 1988 Michaels ampTurvey 1979 Potter 1976) whereas we found that themore recognizable categories caused less maskingThus the categorical dissimilarity effect observed herecannot be accounted for by conceptual maskingCuriously the AMP masks here in Experiment 5produced a modest categorical dissimilarity effectwhile the amplitude-only masks in Experiments 1ndash3yielded no category-specific effects One possibleexplanation may have to do with the methodologyemployed to construct the different types of amplitude-only masks (ie one consisting of global transforma-tions while the other resulted from rudimentarywavelet techniques) We will return to this issue in theGeneral discussion

General discussion

This study investigated the ability of amplitude-only-defined and amplitudethorn phase-defined unrecognizableimage structure to mask rapid scene categorizationperformance in particular (a) whether the two types ofmasks would yield different masking functions and ifso (b) how the different mask spectral characteristicswould modulate the masking functions Experiments 1ndash3 showed a greater impact on scene categorization ofamplitude spectrum slope a than orientation atprocessing times 50 ms which followed a target-maskamplitude slope similarity principle (ie strongermasking when target and mask amplitude spectrumslopes were more similar) Experiment 5 then showed agreater impact on scene categorization of unrecogniz-able amplitude thorn phase-defined image structure thanamplitude-only-defined structure Further both typesof masking effects in Experiment 5 followed a target-mask categorical dissimilarity principle (ie strongermasking effects when target and mask scene categories

differed more) but with amplitude thorn phase masksshowing effect sizes that were almost twice those of theamplitude-only masks Interestingly the amplitude-only masks in Experiments 1ndash3 produced no category-specific masking effects whereas Experiment 5rsquosamplitude-only masks produced modest category-spe-cific masking effects One possible reason for thisdifference may have been the differences in maskconstruction between Experiments 1ndash3 versus 4ndash5Below we discuss the results of Experiments 1ndash3 andExperiments 4ndash5 in greater detail in an effort toprovide a qualitative account of the possible underlyingmechanisms regulating both types of masking effects

The results from Experiments 1ndash3 showed that theamplitude slope is the dominant factor in producingmasking effects by phase-randomized scenes (with a frac1410 producing the largest masking effect) Further theinterference caused by afrac14 10 masks relative to afrac14 00masks (regardless of orientation bias) occurs within thefirst 50 ms of processing (and cannot be explained bydifferences in perceived contrast) Such an earlymodulation suggests that the relevant interactions tookplace very early in the cortical processing streampossibly within striate cortex (ie V1) Given thepossibility of an early neural substrate for the effectsobserved in Experiments 1ndash3 and the large differencesin contrast at different spatial frequencies as a wasvaried it would be tempting to explain the maskingeffects by differences in spatial frequency-specificcontrast differences (ie a simple spatial channelaccount) Specifically while the noise masks and scenetarget images were equated for rms contrast maskswith a 1 or a 1 would have spatial frequency-specific contrast differences from the target image set(having averaged a frac14 112 SD frac14 012) Thus noisemasks with afrac14 00 contained more luminance contrastat higher spatial frequencies than the target image setwhile noise masks with a frac14 15 had more luminancecontrast at lower spatial frequencies (see Figure 1b)However neither a frac14 00 or 15 produced the largestmasking effects so the effects reported in Experiment 1cannot be explained by contrast differences at either thelowest or highest spatial frequencies Further the a frac1400 noise masks in Experiment 3 had an rms contrasttwice that of the a frac14 10 noise masks yet the maskingstrength advantage for the a frac14 10 noise masks withinthe first 50 ms persisted

Since the masking differences in Experiments 1ndash3imply an early neural substrate for those effects yetcannot be explained by contrast biases at the lowest orhighest spatial frequencies the question as to why afrac1410 noise masks produce the largest masking effects(ie the target-mask slope similarity effect) remainsInterestingly the modulation of masking strength bymask a in the current study is virtually identical to thesimultaneous masking effects observed by Hansen and

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 17

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

Hess (2012) in which participants had to identify theorientation of Gabor patches or identify letters Thereafrac14 10 noise masks were found to produce strongermasking than a frac14 00 or 15 noise masks Hansen andHess (2012) argued that the different a maskingstrengths could be explained by modeled corticalinteractions employing contrast gain control processes(eg Carandini amp Heeger 1994 Heeger 1992a 1992bWilson amp Humanski 1993) which take place exclu-sively within V1 Their contrast gain control accountbuilt on the work of David Field and Nuala Brady(Brady amp Field 1995 Field 1987 Field amp Brady 1997)who proposed a modified multichannel model of V1That model predicts overall larger spatial frequencychannel responses for a frac14 10 stimuli (compared tostimuli with smaller or larger as) In their model thespatial frequency bandwidth of different early visualchannels is held constant in octaves on a log axis withthe peak sensitivity of each filter channel also heldconstant based on the results of a large sample ofstriate neurons (R L De Valois Albrecht amp Thorell1982) With such a configuration any image possessingan amplitude spectrum slope a rsquo 10 will produceequivalently large contrast energy responses across allchannels in an inhibitory contrast gain pool regardlessof the peak spatial frequency to which each filter istuned (see Brady amp Field 1995 for further detail)Thus if afrac14 10 noise largely and equally drives allspatial channels a tuned gain pool for a frac14 10 noiseshould be much more active than with smaller or largeras thus producing the greatest inhibitory effect on theoutput signal to perceptual decision processes Fur-thermore contrast gain control processes were shownto strongly operate during the first 50 ms of processingtime in a backward masking paradigm (Essock Haunamp Kim 2009) Thus the masking effects produced bythe amplitude masks in Experiments 1ndash3 may simplyreflect a variant of a contrast gain control modulatedsignal-to-noise ratio in early visual processes thatdepends much more on the distribution of contrastacross spatial frequency than across orientationTherefore the target-mask slope similarity effectobserved in Experiments 1ndash3 may result from amodified contrast gain control process implementedentirely in V1 The implication here is that the majorityof amplitude-only masking effects may have more to dowith specific early visual mechanisms than later scenecategorization processes (eg in the PPA the LOCetc Walther Caddigan Fei-Fei amp Beck 2009) If sosuch interactions would apply equally well to a widearray of visual tasks (eg Gabor orientation discrim-ination or letter recognition) rather than being limitedonly to rapid scene categorization Additionally thelack of any category-specific effects in Experiments 1ndash3further supports the idea that those masking effectsresulted from general neural operators at least for

amplitude-only-defined content derived from globalspectral manipulations

Regarding the masking effects produced by unrec-ognizable amplitudethorn phase-defined image structure inExperiment 5 we observed that unrecognizable ampli-tude thorn phase masks interfered with rapid scenecategorization in a manner much more specific totarget-mask category pairs (ie Figure 10b) specifi-cally following a target-mask categorical dissimilarityprinciple In contrast to the account given for themechanisms involved in the masking effects observed inExperiments 1ndash3 one cannot explain the target-maskcategorical dissimilarity effect by a modified contrastgain control process That is because all masks inExperiments 4 and 5 were designed to containapproximately equivalent global contrast distributionsacross spatial frequency the spatial channels involvedin processing the masks would yield very similar outputregardless of mask category We verified this by passingall Experiment 5 masks through the bank of filtersreported in Hansen and Hess (2012) and found that thegreatest between-category difference for either masktype (AMP masks or AMP thorn PHASE masks) neverexceeded a quarter of a log unit (across spatialfrequency or orientation) Thus if gain controlmodulated filter outputs did drive the masking effectsreported in Figure 10b the trend would be flatregardless of target-mask category pairing Thusneither spatial frequency- nor orientation-specificcontrast change across the masks can account for theeffects observed in Experiment 5 It is quite plausiblethat the greater accuracy when target and mask arefrom the same basic level category is due to integrationof local category-specific image features from targetand mask which would be less disruptive (ie producegreater accuracy) when those features are more similarFurther research is needed to provide a definitiveaccount

Interestingly Experiment 5 also yielded a modesttarget-mask dissimilarity effect when AMP masks wereemployed This runs counter to the results of Exper-iments 1ndash3 (and Loschky et al 2007 experiment 3)However as noted in Experiment 4 the amplitude-onlymasks in Experiments 4 and 5 were constructed verydifferently from simple phase randomization Specifi-cally they were constructed from rudimentary wavelet-like image sampling techniques (ie AMP masks werecreated by sampling different sector segments from anumber of different image spectra) that may haveresulted in spurious amplitude-only structure thatsignaled category-specific information Further re-search is needed in order to address exactly how suchan approach can lead to category specific maskingNevertheless such a difference in masking trends dueto mask construction (global transformations vswavelet composition) therefore serves as a cautionary

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 18

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

note for future studies that seek to further examine therole of amplitude-only-defined structure in the maskingof rapid scene categorization

In sum the current studyrsquos results strongly supportthe notion that rapid scene categorization maskingeffects differ depending on the masksrsquo Fourier spectralproperties Further research is needed in order todetermine if the slope similarity principle is tied directlyto the slope of the target scene (here our targets had anaverage slope of 112) or to the typical slope observedwith natural scenes (ie a rsquo 10) Interestingly Fourieramplitude spectrum slope produced the largest perfor-mance modulation with respect to categorizationaccuracy (ranging between 55 [afrac14 00] to 85 [afrac14 10] accuracy) and therefore may serve as the largestsource of variability across masking studies using anytype of mask with small a values compared to studiesusing masks with larger a values (at least for SOAs 50 ms) Therefore such modulations of masking by theamplitude spectral slope could easily cloud anymasking studies exploring scene gist masking caused byphase spectrum manipulations if the amplitude spec-trum slopes vary extensively Conversely if amplitudespectrum slope is held approximately constant (as inExperiment 5 here) then it is possible to explore suchphase spectrum effects In doing so we observed atarget-mask category dissimilarity effect One possibleimplication of such an effect might be that masksderived from wavelet techniques contain local imagestructure that can differentially mask the local featuresin target scenes in a target-mask category-dependentmanner Such feature masking may result in suchmasks interfering with mechanisms more directly linkedto categorical processing

It is worth noting that while we refer to two differentmasking effects in this study (ie the target-maskamplitude slope similarity principle for Experiments 1ndash3 and the target-mask categorical dissimilarity principlefor Experiment 5) we do not mean to imply that theyare independent from one another or operate in anysort of hierarchical manner Rather we are simplyemphasizing that specific Fourier characteristics canproduce different masking effects Specifically theamplitude spectrum slope of a given mask plays adominant role in the ability of that mask to disruptrapid scene categorization but it does so in a mannerthat is not contingent on the target-mask categoryrelationship Conversely masks derived from wavelettechniques (used in Experiment 5) mask rapid scenecategorization in a manner contingent on the target-mask category relationship Thus much caution mustbe taken when comparing masking functions acrossstudies that employ masks with unrecognizable ampli-tude-only defined content versus unrecognizable con-joint amplitude thorn phase-defined image structuresSimilarly care must be taken in comparing masking

functions produced in studies employing masks derivedfrom wavelet techniques versus those that do not Asthe current study demonstrates such important differ-ences can lead to differential masking effects in rapidscene categorization tasks The question of whether thetwo masking effects are independent of one another (oroperate in some hierarchical manner) should beaddressed in future studies in which for example theamplitude spectra slopes of wavelet-derived masks aresystematically varied

Keywords real-world scenes scene gist rapid scenecategorization image statistics amplitude spectrumslope orientation bias phase spectrum visual backwardmasking processing time perceptual time course

Acknowledgments

This research was supported in part by a ColgateResearch Council grant to BCH and by grants from theKansas State University Office of Sponsored Researchand the NASAKansas Space Grant Consortium toLCL We thank the two anonymous reviewers forhelpful comments and Kelly Byrne and Ryan V Ringerfor assistance in experiment preparation and datacollection Parts of the research in this article werepreviously presented at the 2009 and 2012 AnnualMeetings of the Vision Sciences Society with theirabstracts published in the Journal of Vision

Commercial relationships noneCorresponding author Bruce C HansenEmail bchansencolgateeduAddress Department of Psychology NeuroscienceProgram Colgate University Hamilton NY USA

Footnote

1Cohenrsquos f 2 effect size values of 010 025 and 040are typically considered to be small medium and largeeffect sizes respectively (Kirk 1995)

References

Bacon-Mace N Mace M J Fabre-Thorpe M ampThorpe S J (2005) The time course of visualprocessing Backward masking and natural scenecategorisation Vision Research 45 1459ndash1469

Brady N amp Field D J (1995) Whatrsquos constant incontrast constancy The effects of scaling on the

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 19

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

perceived contrast of bandpass patterns VisionResearch 35(6) 739ndash756

Carandini M amp Heeger D J (1994) Summation anddivision by neurons in primate visual cortexScience 264(5163) 1333ndash1336

Carter B E amp Henning G B (1971) The detectionof gratings in narrow-band visual noise Journal ofPhysiology 219(2) 355ndash365

Cass J Alais D Spehar B amp Bex P J (2009)Temporal whitening Transient noise perceptuallyequalizes the 1f temporal amplitude spectrumJournal of Vision 9(10)12 1019 httpwwwjournalofvisionorgcontent91012 doi10116791012 [PubMed] [Article]

Coppola D M Purves H R McCoy A N ampPurves D (1998) The distribution of orientedcontours in the real word Proceedings of theNational Academy of Sciences USA 95(7) 4002ndash4006

De Valois K K amp Switkes E (1983) Simultaneousmasking interactions between chromatic and lumi-nance gratings Journal of the Optical Society ofAmerica 73(1) 11ndash18

De Valois R L Albrecht D G amp Thorell L G(1982) Spatial frequency selectivity of cells inmacaque visual cortex Vision Research 22(5) 545ndash559

Essock E A Haun A M amp Kim Y-J (2009) Ananisotropy of orientation-tuned suppression thatmatches the anisotropy of typical natural scenesJournal of Vision 9(1)35 1ndash15 httpwwwjournalofvisionorgcontent9135 doi1011679135 [PubMed] [Article]

Fei-Fei L Iyer A Koch C amp Perona P (2007)What do we perceive in a glance of a real-worldscene Journal of Vision 7(1)10 1ndash29 httpwwwjournalofvisionorgcontent7110 doi1011677110 [PubMed] [Article]

Field D J (1987) Relations between the statistics ofnatural images and the response properties ofcortical cells Journal of the Optical Society ofAmerica 4(12) 2379ndash2394

Field D J amp Brady N (1997) Visual sensitivity blurand the sources of variability in the amplitudespectra of natural scenes Vision Research 37(23)3367ndash3383

Guyader N Chauvin A Peyrin C Herault J ampMarendaz C (2004) Image phase or amplitudeRapid scene categorization is an amplitude-basedprocess Comptes Rendus Biologies 327 313ndash318

Hansen B C amp Essock E A (2004) A horizontalbias in human visual processing of orientation and

its correspondence to the structural components ofnatural scenes Journal of Vision 4(12)5 1044ndash1060 httpwwwjournalofvisionorgcontent4125 doi1011674125 [PubMed] [Article]

Hansen B C Haun A M amp Essock E A (2008)The lsquolsquohorizontal effectrsquorsquo A perceptual anisotropy invisual processing of naturalistic broadband stimuliIn T A Portocello amp R B Velloti (Eds) Visualcortex New research New York NY NovaScience Publishers

Hansen B C amp Hess R F (2007) Structuralsparseness and spatial phase alignment in naturalscenes Journal of the Optical Society of America A24(7) 1873ndash1885

Hansen B C amp Hess R F (2012) On theeffectiveness of noise masks Naturalistic vs un-naturalistic image statistics Vision Research 60101ndash113

Heeger D J (1992a) Half-squaring in responses of catstriate cells Visual Neuroscience 9(5) 427ndash443

Heeger D J (1992b) Normalization of cell responsesin cat striate cortex Visual Neuroscience 9(2) 181ndash197

Henning G B Hertz B G amp Hinton J L (1981)Effects of different hypothetical detection mecha-nisms on the shape of spatial-frequency filtersinferred from masking experiments I Noise masksJournal of the Optical Society of America A 71574ndash581

Intraub H (1984) Conceptual masking The effects ofsubsequent visual events on memory for picturesJournal of Experimental Psychology LearningMemory and Cognition 10(1) 115ndash125

Joubert O R Rousselet G A Fabre-Thorpe M ampFize D (2009) Rapid visual categorization ofnatural scene contexts with equalized amplitudespectrum and increasing phase noise Journal ofVision 9(1)2 1ndash16 httpwwwjournalofvisionorgcontent912 doi101167912 [PubMed][Article]

Kaping D Tzvetanov T amp Treue S (2007)Adaptation to statistical properties of visual scenesbiases rapid categorization Visual Cognition 15(1)12ndash19

Keil M S amp Cristobal G (2000) Separating thechaff from the wheat Possible origins of theoblique effect Journal of the Optical Society ofAmerica A 17(4) 697ndash710

Kirk R E (1995) Experimental design Procedures forthe behavioral sciences (3rd ed) Pacific Grove CABrooksCole

Legge G E amp Foley J M (1980) Contrast masking

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 20

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2

in human vision Journal of the Optical Society ofAmerica 70(12) 1458ndash1471

Loftus G R amp Ginn M (1984) Perceptual andconceptual masking of pictures Journal of Exper-imental Psychology Learning Memory and Cog-nition 10(3) 435ndash441

Loftus G R Hanna A M amp Lester L (1988)Conceptual masking How one picture capturesattention from another picture Cognitive Psychol-ogy 20(2) 237ndash282

Losada M A amp Mullen K T (1995) Color andluminance spatial tuning estimated by noise mask-ing in the absence of off-frequency looking Journalof the Optical Society of America A 12(2) 250ndash260

Loschky L C Hansen B C Sethi A amp PydimariT (2010) The role of higher order image statisticsin masking scene gist recognition AttentionPerception amp Psychophysics 72(2) 427ndash444

Loschky L C amp Larson A M (2008) Localizedinformation is necessary for scene categorizationincluding the naturalman-made distinction Jour-nal of Vision 8(1)4 1ndash9 httpwwwjournalofvisionorgcontent814 doi101167814 [PubMed] [Article]

Loschky L C Sethi A Simons D J Pydimari TN Ochs D amp Corbeille J L (2007) Theimportance of information localization in scene gistrecognition Journal of Experimental PsychologyHuman Perception amp Performance 33(6) 1431ndash1450

Marr D Ullman S amp Poggio T (1979) Bandpasschannels zero-crossings and early visual informa-tion processing Journal of the Optical Society ofAmerica 69(6) 914ndash916

Michaels C F amp Turvey M T (1979) Centralsources of visual masking Indexing structuressupporting seeing at a single brief glance Psycho-logical Research-Psychologische Forschung 41(1)1ndash61

Pavel M Sperling G Riedl T amp Vanderbeek A(1987) Limits of visual communication the effectof signal-to-noise ratio on the intelligibility ofAmerican Sign Language Journal of the OpticalSociety of America A 4(12) 2355ndash2365

Portilla J amp Simoncelli E P (2000) A parametrictexture model based on joint statistics of complexwavelet coefficients International Journal of Com-puter Vision 40(1) 49ndash71

Potter M C (1976) Short-term conceptual memoryfor pictures Journal of Experimental PsychologyHuman Learning amp Memory 2(5) 509ndash522

Rieger J W Braun C Bulthoff H H ampGegenfurtner K R (2005) The dynamics of visualpattern masking in natural scene processing Amagnetoencephalography study Journal of Vision5(3)10 275ndash286 httpwwwjournalofvisionorgcontent5310 doi1011675310 [PubMed][Article]

Sekuler R W (1965) Spatial and temporal determi-nants of visual backward masking Journal ofExperimental Psychology 70(4) 401ndash406

Solomon J A (2000) Channel selection with non-white-noise masks Journal of the Optical Society ofAmerica A 17(6) 986ndash993

Stromeyer C F amp Julesz B (1972) Spatial-frequencymasking in vision Critical bands and spread ofmasking Journal of the Optical Society of AmericaA 62(10) 1221ndash1232

Switkes E Mayer M J amp Sloan J A (1978) Spatialfrequency analysis of the visual environmentAnisotropy and the carpentered environment hy-pothesis Vision Research 18(10) 1393ndash1399

Tolhurst D J amp Tadmor Y (2000) Discriminationof spectrally blended natural images Optimisationof the human visual system for encoding naturalimages Perception 29(9) 1087ndash1100

Torralba A amp Oliva A (2003) Statistics of naturalimage categories Network Computation in NeuralSystems 14(3) 391ndash412

van der Schaaf A amp van Hateren J H (1996)Modelling the power spectra of natural imagesStatistics and information Vision Research 36(17)2759ndash2770

VanRullen R (2011) Four common conceptualfallacies in mapping the time course of recognitionFrontiers in Perception Science 2 365

Walther D B Caddigan E Fei-Fei L amp Beck DM (2009) Natural scene categories revealed indistributed patterns of activity in the human brainJournal of Neuroscience 29 10573ndash10581

Wichmann F A Braun D I amp Gegenfurtner K R(2006) Phase noise and the classification of naturalimages Vision Research 46(8-9) 1520ndash1529

Wilson H R amp Humanski R (1993) Spatialfrequency adaptation and contrast gain controlVision Research 33(8) 1133ndash1149

Wilson H R McFarlane D K amp Phillips G C(1983) Spatial frequency tuning of orientationselective units estimated by oblique masking VisionResearch 23(9) 873ndash882

Journal of Vision (2013) 13(13)21 1ndash21 Hansen amp Loschky 21

  • Visual masking of rapid scene
  • The current study
  • General method
  • Experiment 1
  • e01
  • e02
  • e03
  • f01
  • f02
  • Experiment 2
  • Experiment 3
  • f03
  • Experiment 4
  • f04
  • f05
  • f06
  • f07
  • f08
  • f09
  • Experiment 5
  • f10
  • General discussion
  • n1
  • BaconMace1
  • Brady1
  • Carandini1
  • Carter1
  • Cass1
  • Coppola1
  • DeValois1
  • DeValois2
  • Essock1
  • FeiFei1
  • Field1
  • Field2
  • Guyader1
  • Hansen1
  • Hansen2
  • Hansen3
  • Hansen4
  • Heeger1
  • Heeger2
  • Henning1
  • Intraub1
  • Joubert1
  • Kaping1
  • Keil1
  • Kirk1
  • Legge1
  • Loftus2
  • Loftus3
  • Losada1
  • Loschky1
  • Loschky2
  • Loschky4
  • Marr1
  • Michaels1
  • Pavel1
  • Portilla1
  • Potter1
  • Rieger1
  • Sekuler1
  • Solomon1
  • Stromeyer1
  • Switkes1
  • Tolhurst1
  • Torralba1
  • vanderSchaaf1
  • VanRullen1
  • Walther1
  • Wichmann1
  • Wilson1
  • Wilson2