Dr. Kathrin Fahlenbrach Martin-Luther-Universität Halle-Wittenberg Germany
IGEL-Conference Pécs, August 2002
Feeling Sounds. Emotional aspects of music videos.
I. Preliminary remarks on the effects of music videos and their aesthetics.
In any discussion of audiovisual culture today, the strong influence of music videos cannot be
ignored. In television culture it is particularly obvious that promotional clips and music
videos have shaped the aesthetics of TV images and their effects on our perception. It is
becoming more and more difficult to distinguish regular TV programming from promotional forms
such as promotional clips or music videos. Even film is strongly influenced by the culture and
aesthetics of video clips.
With this growing influence of clip culture, the effects of audiovisual media have
changed. As competition between the channels for the attention of the public increases,
the strategies of promotional forms like video clips are spreading, and these are first
of all emotional strategies. Given that audiovisual producers today increasingly and
explicitly draw on current knowledge about media perception, audiovisual aesthetics,
particularly in promotional forms, cannot be isolated from their main intention: to influence
our perception.
Aesthetics and reception are therefore especially closely connected in music videos.
Producers of music videos directly use all the available technical and aesthetic possibilities
of audiovisual creation to make the most of the sensorial and emotional effects of pictures and
sound. Video clips thus represent a special form of TV aesthetics because they concentrate
the effects of audiovisual media to a high degree: the fast succession of pictures in a
continuous flow that can barely be perceived, and the affective and emotional
effects connected to the high density of visual and acoustic stimuli.
Even in fleeting and superficial perception, fragmented sounds and images are isolated
from the acoustic and visual background and attract the attention of the viewers. Within a split
second they evoke associations, feelings and moods in an observer who looks reflexively at
the screen. Even when they are viewed without conscious attention, the sensorial clarity of
video clips has an immediate effect of attraction or repulsion, even if the clips act only as
diffusely activating stimuli in the background.
Starting from this sensorial clarity, which characterizes not only music videos but audiovisual
aesthetics in general, I would like to present some approaches that could describe the
specifics of audiovisual media without referring to the traditional models of semiotics or
linguistics, which still dominate descriptions of the audiovisual construction of meaning.
Even recent studies that deal with the specific relation of images and sound in music videos
treat the visual and even the musical aesthetics as text (cf. Altrogge 2000). But if visual and
musical forms are treated only as narrative or linguistic categories, the specificity of the
sensorial aesthetics and effects of audiovisual media cannot be comprehended. Above all, this
linguistic or semiotic foundation reduces the immediate sensorial and affective processing of
music videos to the construction of more or less cognitive meaning.
Facing these problems, I am currently evaluating diverse approaches from psychological, social
and media research on the reception and production of audiovisual media that could
overcome this philological and semiotic paradigm. As almost no work has been done in this
direction, I am presenting today some first approaches that I wish to develop in my future
work.
II. Multidimensional and intermodal perception. Premises for the emotional effects of music videos.
In the following I would like to show some cognitive and emotional premises of the
perception of music videos, which are closely connected to their synthesis of audiovisual
creation and emotional effects.
When zapping quickly between channels, viewers rapidly assign many pieces of the TV programme
to simple binary codes such as pleasant – unpleasant or interesting – boring. In doing so
our sensory and cognitive-emotional system works hard: within a split second the perceived
views, parts of the action, single pieces of dialogue as well as genre characteristics, camera
angles, cuts etc. are identified and their subjective attraction is evaluated. Cognitive
knowledge of media, general knowledge and the situation of reception are just as
important for the decision to stay or to zap to another channel as the emotional
perception of the depicted feelings, tension and affective intensity.
If one zaps into a running video clip on MTV, the cognitive-emotional system decides
particularly fast. The highly dense stimuli, which are aesthetically designed to fit into the
complex flow of pictures in zapping mode, offer the young audience all the relevant cognitive
and emotional data in concentrated form: the musical and visual codes of youth culture and
subculture as well as the associated lifestyle moods (cf. Altrogge 2000).
The fast-moving visual and acoustic data are evaluated mainly along three dimensions:
• Sensorial processing of the audio-visual stimuli
• Emotional experience
• Cognitive evaluation
According to recent theories on cognition and emotion (Fischer / Shaver / Carnochan
1990, Roth 2001, Damasio 1999, LeDoux 1996, Sokolowski 1993), the different dimensions
of cognitive, emotional and sensorial processing are strongly correlated. The diverse
sensory stimuli are therefore evaluated in parallel and simultaneously. This immediate
scanning of the diverse sensory stimuli is supported specifically by the capacity of the brain
to evaluate them intermodally at the same time and to relate them to one another.
Some recent theories in neurology and developmental psychology deal with
this phenomenon of cross-modal or intermodal processing (Stern 1993, Marks 1978).
Acoustic, visual, tactile and olfactory stimuli are evaluated on the basis of so-called amodal
qualities, that is, qualities of the stimuli that can be perceived in all sensory modes.
There are several proposals for naming these amodal qualities. Following the main
categories proposed by Daniel Stern, I focus on the following qualities:
• Intensity¹: concerning the experience of the density of the stimuli, along
categories such as strong – weak, loud – quiet etc.
• Rhythm / Tempo: concerning the sensorial experience of temporal patterns, which
is evaluated in all modes along the categories fast – slow.
• Form / Pattern recognition: concerning the primary tendency of the brain to
integrate all perceived stimuli into well-known patterns; this more cognitive
processing follows criteria like moving – still, common – uncommon, harmonic –
disharmonic, complex – simple, varied – redundant, contrasting – similar,
symmetric – asymmetric (cf. Mehrabian 1976).
¹ In Stern's model the level of intensity concerns first of all the interaction of mother and child and their specific physical and affective coordination. Cf. Stern 1993, 209 ff.
Through this intermodal processing the brain is able to evaluate all perceived
sensorial stimuli immediately and in parallel, and it thereby regulates cognitive and
emotional processing.
On the level of cognitive processing, intermodal processing thus regulates cognitive
attention: by relating the diverse interpreted stimuli, by assigning gradual differences
between the amodal qualities and by differentiating polarities between the diverse
dimensions, intermodal processing produces semantic structures which regulate the
subsequent process of attention (cf. Marks 1987).
On the level of emotional processing, intermodal processing regulates the experienced
density of the stimuli by way of affective-emotional activation. This affective-emotional
regulation is especially relevant for media reception: depending on the
subjectively experienced density of the stimuli, the zapping TV viewer decides on the
emotional level between ‘staying’ and ‘zapping on’. This reaction relies primarily on the
subjective coping potential: the intensity of the stimuli is experienced as ‘too high’ or ‘too
low’, and thereby as ‘too exhausting’ or ‘too boring’. Through their media reception viewers
thus regulate their acute activation and their emotional moods (cf. Zillmann 1988). Media
can thus be used to reinforce or to reduce activation and emotion through ‘sensation seeking’
or ‘avoidance’ (cf. Zillmann 1988, Winterhoff-Spurk 1999).
III. The synthesis of perception and aesthetics. Some aesthetic qualities of music videos with regard to their emotional effects.
There is every reason to believe that intermodal processing plays a particular role in the
reception of music videos, assuming that their successful reception relies on the harmonious
perception of picture and music.
Thus the most important aesthetic quality of music videos, one that corresponds directly with
immediate emotional processing, is the aesthetic synthesis of pictures and music. In most
of the videos that we see on MTV, this synthesis appears in the form of a harmonious
synchronicity of pictures and music that facilitates and guarantees fast and simple reception.
By designing the visual level in direct correspondence to the acoustic level, music videos
seem to coincide structurally with the sensory processing of our brain.
With this network of emotional, cognitive and sensorial processing in mind, I would like to
indicate some main aspects of the cognitive processing of music videos, as they have been
identified in empirical studies on the reception of music videos (Altrogge 2000; Haack 1995).
III.1 The design of cognitive data in music videos
On the level of cognitive evaluation, the visual and musical codes are evaluated on the
basis of the individual's socio-cultural disposition, especially that of youth culture. Musical
and visual data are thus assigned to binary codes like interesting – boring, strange –
familiar etc. In the cognitive evaluation of music videos, youth-cultural and subcultural
codes play the most important role. Video clips therefore offer their young audience all of the
most important information for stylistic and youth-cultural classification in a dense and
complex combination of visual and musical codes. Here are some of the most important data
on the level of the visual and acoustic creation of youth-cultural codes:
• The visual style of the protagonists in a clip, including all of the visible signs of their
habitus, like clothes, hairstyle, make-up, nonverbal signals (gestures, facial expression)
etc. (cf. Altrogge 2000);
• The presentation of familiar or idealized situations and interactions corresponding
to a familiar or an idealized lifestyle of the audience;
• The creation of youth-cultural codes on all visual levels of TV production, such as
the decoration of the studio, the design of trailers and jingles, and the clip production
itself;
• The musical codes that refer to the main musical styles like rock, pop, hip-hop etc.,
which are recognized first of all by melody and rhythm.
By referring to the common codes and the taste of their audience, music videos allow the
viewer's immediate affirmation or rejection on the cognitive level. However,
the youth-cultural codes and tastes represented in music videos also provoke
emotional effects. This emotional codification of musical and visual codes is especially
obvious in protest cultures, for example, which define themselves in explicit
differentiation from their social environment (cf. Fahlenbrach 2002).
III.2 The design of emotional signals in music videos.
The evocation of emotions is related to the presentation of emotional signals that can already
be identified on the neurological level, the primary level of emotional experience. Some
neurological studies (Damasio 1999, LeDoux 1996) show that primary
emotions like pleasure, love, fear, sorrow, rage etc. can be evoked directly in the subcortical
brain structures by single images or sounds.
Emotional signals are often explicitly visually designed in video clips as well. On the level of
visual presentation this seems to happen mostly in two forms:
• The visual representation of (primary) emotions in conventionalised (narrative)
plots (cf. Grodal 1997):
o Love scenes (love)
o Scenes of separation / Scenes of loneliness (sorrow) etc.
• The presentation of emotional interaction, which can evoke immediate parasocial
effects. These parasocial effects are based on the tendency of viewers to feel
empathically the emotions that they see presented by the protagonists on the screen
(cf. Vorderer 1996). In the representation of these emotional interactions, all technical
and aesthetic possibilities of audiovisual media are used, e.g.:
o The visual presentation of face-to-face interaction by close-ups and by shot –
reverse shot, often intensified by slow motion;
o The presentation of single emotional reactions in the facial expression by
close-ups, often showing the eyes (cf. Mikunda 2002);
o The creation of erotic proximity by close-ups of the protagonist's body and
the camera circling around him / her.
Concerning the musical design of emotions and moods, there are some interesting
approaches in the field of music psychology. Klaus Scherer, for example, demonstrates that
music can directly evoke primary emotions by association with the acoustic qualities of the
vocal expression of primary emotions. In his studies on the relation between emotional and
vocal expression, Scherer identifies the main acoustic parameters for the attribution of primary
emotional states on a rating scale. Fast tempo and high pitch level, for example, are mostly
attributed by the participants of his study to positive emotions like pleasantness and
happiness, whereas slow tempo and low pitch level are mostly attributed to more negative
emotions like sadness.²
This corresponds with empirical studies showing that the dominant function of music in video
clips seems to be to communicate moods and emotions (cf. Haack 1995). There is every
reason to believe that music predominantly regulates the experience of moods and emotions in a
music video. The visual presentation of emotions and moods therefore has to be oriented towards
the musical presentation of moods in the songs.
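The cue lists that footnote 2 reports for happiness, sadness and potency can be read as simple feature profiles. The following Python sketch is only an illustration under stated assumptions: the cue names are copied from the footnote, but the function name rank_attributions and the idea of ranking labels by counting shared cues are hypothetical and are not Scherer's rating procedure.

```python
# Cue profiles as listed in footnote 2 (after Scherer, quoted in Veltman 2001).
PROFILES = {
    "happiness": {"fast tempo", "large pitch variation", "sharp envelope",
                  "few harmonics", "moderate amplitude variation"},
    "sadness": {"slow tempo", "low pitch level", "few harmonics",
                "round envelope", "pitch contour down"},
    "potency": {"many harmonics", "fast tempo", "high pitch level",
                "round envelope", "pitch contour up"},
}


def rank_attributions(observed_cues):
    """Rank emotion labels by the number of listed cues shared with the input.

    Counting overlaps is an illustrative assumption, not Scherer's method.
    """
    observed = set(observed_cues)
    scores = {label: len(cues & observed) for label, cues in PROFILES.items()}
    # Highest overlap first; ties keep the order of PROFILES.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(rank_attributions({"slow tempo", "low pitch level"}))
```

A song described only as slow and low-pitched shares two cues with the sadness profile and none with the other two, which mirrors the salient configuration named in the footnote.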
III.3 Sensorial activation through audiovisual design: Intermodal perception of music videos.³
In the common aesthetics of music videos these visual strategies often lead only to a high
degree of sensorial activation through the density of visual stimuli. In close relation to the
musical rhythm and the melodies, all visual strategies are used for sensorial and affective
activation (cf. e.g. Mikunda 2002). On this level of sensorial processing the density of stimuli
seems to be experienced first of all through the intermodal processing of the amodal qualities
described at the beginning of this lecture.
In the following paragraphs I would like to indicate some main aesthetic characteristics of
music videos concerning audiovisual synchronicity on the levels of intensity, rhythm and
formal patterns. Since I concentrate here explicitly on the aesthetic formation of the
visual and acoustic stimuli, it can only be assumed that the audiovisual synchronicity on
these three levels refers directly to intermodal perception and processing as
described above with reference to recent studies. I hope to verify my hypothesis
concerning the intermodal perception of such an aesthetic design in later studies.
² Happiness, for example, is associated with: fast tempo, large pitch variation, sharp envelope, few harmonics, moderate amplitude variation (salient configurations: large pitch variation plus pitch contour up, fast tempo plus few harmonics); Sadness: slow tempo, low pitch level, few harmonics, round envelope, pitch contour down (salient configuration: low pitch level plus slow tempo); Potency: many harmonics, fast tempo, high pitch level, round envelope, pitch contour up (salient configurations: large amplitude variation plus high pitch level, high pitch level plus many harmonics). Cf. Scherer, quoted in: Veltman 2001.
³ In the lecture I demonstrated the following criteria using the example of the recent music video "Work it out" by Beyoncé (2002). In this written version of the lecture I omit the example because my demonstration is based on the moving pictures in relation to the music. Without these, a written demonstration would require a detailed structural description, which would go beyond the scope of this paper.
Audiovisual synchronicity on the level of rhythm:
The connection of musical rhythm and visual rhythm in music videos is mainly constructed through
the editing of the pictures. The relation between musical and visual rhythm can be
synchronous or, as is the recent convention in music videos, syncopated: the cuts between
the pictures do not correspond exactly with the beat. This syncopated relation between the
rhythm of pictures and sound closely matches the processing of the cognitive system: the
attention of the accustomed viewer can no longer be activated by the harmonious composition of
the rhythm of pictures and sound; only deviation from this pattern can stimulate his or her
attention (cf. Flückiger 2001).
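The distinction between cuts that fall on the beat and cuts that deviate from it can be made concrete with a small Python sketch. It is a rough illustration, not a method used in this paper: it assumes a constant tempo with the first beat at time zero, and the function names, the sample cut times and the tolerance of 0.04 seconds (about one frame at 25 frames per second) are all hypothetical. Treating every off-beat cut as syncopated is also a simplification.

```python
def beat_offsets(cut_times, bpm):
    """Signed distance in seconds from each cut to its nearest beat.

    Assumes a constant tempo and a first beat at t = 0 (an idealisation).
    """
    beat_period = 60.0 / bpm
    offsets = []
    for t in cut_times:
        nearest_beat = round(t / beat_period) * beat_period
        offsets.append(t - nearest_beat)
    return offsets


def classify_cuts(cut_times, bpm, tolerance=0.04):
    """Label each cut 'on-beat' or 'off-beat'; the tolerance is an assumed value."""
    return ["on-beat" if abs(offset) <= tolerance else "off-beat"
            for offset in beat_offsets(cut_times, bpm)]


if __name__ == "__main__":
    # Hypothetical cut times in seconds for a song at 120 beats per minute.
    cuts = [0.0, 0.5, 1.25, 2.0, 2.8]
    for time, label in zip(cuts, classify_cuts(cuts, bpm=120)):
        print(f"{time:5.2f} s  {label}")
```

With these invented values, the cuts at 1.25 s and 2.8 s fall between beats, so an edit of this kind would count as deviating from the strictly synchronous pattern.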
Besides the musical rhythm, beats, duration and metric schemes are related to other visual
elements, e.g.:
o The rhythm in the movement of persons and objects, including eye
movements of the protagonists;
o The tempo of animated elements, like inserts, graphics and other effects;
o The playback speed of the footage: slow motion, fast motion etc.;
o The tempo of the camera movement;
o The tempo of change between the diverse visual levels of presentation in the
clip (e.g. stage of performance, inserts etc.).
Audiovisual synchronicity on the level of acoustic and visual patterns
Concerning the recognition of formal patterns in the musical and visual design of a music
video, it can be assumed that the conventionalised schemata and plots of aesthetic
production are, besides the rhythmical patterns described above, another main aspect of
cognitive and above all affective-emotional activation. As mentioned above, formal
patterns are evaluated according to categories such as moving – still, common –
uncommon, harmonic – disharmonic, complex – simple, varied – redundant, contrasting –
similar, symmetric – asymmetric, and they thereby strongly influence the attention of the viewer.
Musical patterns like refrain and melody are related to visual patterns, constructed e.g. by:
o The visual composition in the construction of spaces and movements by so-called
"continuity editing", a main aesthetic of Hollywood film: the
coordination of continuity through the editing of the pictures, camera movement
and camera position.
o The composition of viewing angles by central perspective as a traditional pattern
of visual composition in Western culture. It is one of the most important
forms for creating the image of stars in a picture: referring to Christian
iconography, it indicates both the distance between the viewer and the star and the
star's originality.
o Nonverbal patterns of facial expression, gestures etc.
o Relations of size between persons, between persons and objects etc.
Audiovisual synchronicity on the level of intensity
The audiovisual intensity of music videos rests on the density of the acoustic and
visual aesthetic stimuli described above, first of all the dense composition of musical and
visual rhythm through the editing of the pictures. Analysing the music video "Work it out" by
Beyoncé (2002), I counted 65 cuts in one minute, which indicates the high rhythmical density on
the visual level.
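The reported count of 65 cuts in one minute can be converted into an approximate average shot length. The Python sketch below is a small worked calculation under one simplifying assumption, namely that each cut starts exactly one shot; the function name is hypothetical.

```python
def average_shot_length(cuts, duration_seconds):
    """Approximate mean shot length in seconds, treating each cut as one shot."""
    if cuts <= 0:
        raise ValueError("cuts must be positive")
    return duration_seconds / cuts


if __name__ == "__main__":
    # Figure reported above: 65 cuts within 60 seconds.
    print(round(average_shot_length(65, 60.0), 2))  # 0.92
```

On this reading a new picture appears a little more often than once per second.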
On the visual level there are several other elements that construct high intensity, e.g.:
o The intensity of the colours on the screen;
o The intensity of the contrast of light (bright – dark), e.g. in the low-key / high-key
style or in the contrasting of light to create an artificial atmosphere;
o The density of the inner structure of the pictures, the composition of diverse
visual levels.
The high intensity of the pictures often corresponds to the intensity produced on the acoustic
level by loudness, pitch, contrasts in rhythm, melody and pitch, and the quality of the musical
representation of mood, like serious – funny, bright – dark, tense – relaxed.
As mentioned above, these are only my first criteria for an analysis of the sensorial aesthetics
of music videos. But even these sensorial, emotional and cognitive
stimuli of a running music video, which are perceived within a split second, make it obvious how
much data the brain processes when zapping quickly through the TV channels. The aesthetics of
music videos today are so closely adapted to the multidimensional and intermodal
processing of our brain that they seem to reflect its workings.
Literature:
Altrogge, Michael (2000). Tönende Bilder. Interdisziplinäre Studie zu Musik und Bildern in Videoclips und ihre Bedeutung für Jugendliche (3 Vol.). Vol I: Das Feld und die Theorie. Berlin.
Clynes, Manfred, Evans, James R. (Ed.). Rhythm in psychological, linguistic and musical
Condon, William S. (1986). Communication: Rhythm and Structure. In: Clynes / Evans (Ed.). 55 – 79.
Damasio, Antonio R. (1999). The Feeling of what happens. Body and Emotion in the Making of Consciousness. New York.
Fahlenbrach, Kathrin (2002). Protestinszenierungen. Visuelle Kommunikation und kollektive Identitäten in Protestbewegungen. Wiesbaden.
Fischer, Kurt W. / Shaver, Phillip R. / Carnochan, Peter (1990). How Emotions Develop and how they Organise Development. In: Cognition and Emotion, 4, pp. 81-127.
Flückiger, Barbara (2001). Sound Design. Die virtuelle Klangwelt des Films. Zürich.
Grodal, Torben (1997). Moving Pictures. A new Theory of Film Genres, Feelings, and Cognition. Oxford.
Haack, Stefan (1995). Videoclips im semantischen Raum. (unpublished paper). Berlin. Quoted in: Rötter, Günther (2000). Videoclips und Visualisierung von E-Musik. In: Josef Kloppenburg (Ed.) Musik multimedial. Filmmusik, Videoclip, Fernsehen. Laaber.
LeDoux, Joseph E. (1996). The emotional brain. New York.
Marks, Lawrence (1978). The unity of the senses – Interrelations among the modalities. New York.
Mehrabian, Albert (1976). Public places and private places. The psychology of work, play, and living environments. New York.
Mikunda, Christian (2002). Kino spüren: Strategien emotionaler Filmgestaltung. Wien.
Riess-Jones, Mari (1986). Attentional Rhythmicity in human Perception. In: Clynes / Evans (Ed.). 13 – 41.
Rötter, Günther (2000). Videoclips und Visualisierung von E-Musik. In: Kloppenburg, Josef (Ed.) Musik multimedial. Filmmusik, Videoclip, Fernsehen. Handbuch der Musik im 20. Jahrhundert. Vol 11. Laaber. 259 – 295.
Stern, Daniel (1993). Die Lebenserfahrung des Säuglings. Stuttgart.
Veltman, Joshua (2001). Notes on selected articles by Klaus R. Scherer (and collaborators) on Vocal Affect Expression. http://www.music-cog.ohio-state.edu/Music829D/Notes/Scherer.html. May 11. 2001.
Vorderer, Peter (Ed.) (1996). Fernsehen als Beziehungskiste: parasoziale Beziehungen und Interaktionen mit TV-Personen. Opladen.
Winterhoff-Spurk, Peter (1999). Medienpsychologie. Eine Einführung. Stuttgart/Berlin/Köln.
Zillmann, Dolf (1988). Mood management: Using Entertainment to full advantage. In: Donohew, L. / Sypher, H.E. / Higgins, E.T. (Eds.). Communication, social cognition, and affect. Hillsdale. 147-172.