8th Annual CSIS Research Conference
Client Server Browsing of Sound Resources: Classification and Browsing
E. Brazil
Interaction Design Centre
University of Limerick
Ireland
Introduction
? - How to classify sound resources and how to provide an interface for browsing them.
! - Provide a browsable sound database for users via intranet / Internet environments.
Overview of Research Areas
• Sound Classification
• Sound Representation
• Sound Browsing
Sound Classification
• Two levels of classification
• Coarse level
– Distinguish between the Speech, Music, Environmental, Silence and Other categories
• Fine level
– Use human perceptual features
Coarse-level classification of audio (1)
– Audio signals are classified into basic types, including speech, music, several types of environmental sounds, and silence
– Apply morphological and statistical analyses to short-time feature curves (energy function, average zero-crossing rate, fundamental frequency), together with a rule-based heuristic classification procedure
Coarse-level classification of audio (2)
• Short-time energy function
– Short-time energy of the audio signal reflects the amplitude variations over time
• Short-time average zero-crossing rate
– ZCR is the number of times the signal passes through zero in a given time interval
• Spectral Centroid
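The three features listed above can be sketched in a few lines. This is a minimal illustration, not the system's implementation: the function names, the frame and hop sizes, the 8 kHz sample rate and the 1 kHz test tone are all assumptions, and a naive DFT stands in for whatever spectral analysis the system actually uses.

```python
import math

def frames(signal, size, hop):
    """Split a list of samples into overlapping frames of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(frame)
    weighted, total = 0.0, 0.0
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        mag = math.hypot(re, im)
        weighted += (k * sample_rate / n) * mag
        total += mag
    return weighted / total if total > 0 else 0.0

# Hypothetical test signal: a 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]
first = frames(tone, 256, 128)[0]
energy = short_time_energy(first)
zcr = zero_crossing_rate(first)
centroid = spectral_centroid(first, SAMPLE_RATE)
```

Computing these values frame by frame gives the short-time feature curves that a coarse-level classifier could then analyse.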
Fine-level classification of audio
• Further classification will be conducted within each basic type:
– music: classify music played by different instruments, different types of music, singing, plain song
– speech: differentiate voices of man, woman, and child, speech with music background
– environmental sound: divide them into classes such as applause, bell ring, footstep, windstorm, laughter, bird’s sound, and so on
Sound Representation
• Previous work has concentrated on
– Visual star-field type display
• Novel visual representations
– Visualisations on spheres (non-Euclidean spaces)
– Hyper tree
– Excentric labeling
Star-field Display
Virtual University - Uni. Vienna
Visualisations on Spheres
H3: Laying Out Large Directed Graphs in 3D Hyperbolic Space - Munzner
Hyper Tree
www.inxight.com
Excentric Labeling
HCIL – Uni. Maryland
Sound Browsing
• Iterative & Interactive Activity:
– Opportunistic & Serendipitous
• Enable users to explore a data set
• External & internal properties of objects:
– Context & Content
• Evaluate and revise understanding of relationships
The Sonic Browser Application
Audio: Direct representation of tunes (exploiting the cocktail party effect)
• Sounds are panned out in a stereo field controlled by the visual location of the tunes nearest to the cursor.
• The volume of the tunes playing concurrently is proportional to the visual distance between the objects and the cursor
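A rough sketch of how cursor position might drive panning and volume follows. It rests on several assumptions not stated on the slide: that volume falls off linearly as distance from the cursor grows, that objects outside a circular aura around the cursor are silent, and that a constant-power pan law is used. The function name, tune names and coordinates are hypothetical.

```python
import math

def mix_parameters(cursor, objects, aura_radius):
    """Return {name: (left_gain, right_gain)} for objects inside the aura."""
    cx, cy = cursor
    result = {}
    for name, (x, y) in objects.items():
        distance = math.hypot(x - cx, y - cy)
        if distance > aura_radius:
            continue  # assumed: objects outside the aura are not played
        volume = 1.0 - distance / aura_radius  # assumed linear fall-off
        # Horizontal offset from the cursor sets the pan: -1 is left, +1 is right.
        pan = max(-1.0, min(1.0, (x - cx) / aura_radius))
        angle = (pan + 1.0) * math.pi / 4  # constant-power pan law (assumed)
        result[name] = (volume * math.cos(angle), volume * math.sin(angle))
    return result

# Hypothetical layout: two tunes near the cursor, one far away.
tunes = {"reel": (10.0, 0.0), "jig": (-5.0, 0.0), "air": (100.0, 100.0)}
gains = mix_parameters((0.0, 0.0), tunes, aura_radius=20.0)
```

Here "reel" sits to the right of the cursor and gets more right-channel gain, "jig" is closer and therefore louder overall, and "air" is outside the aura and is omitted.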
The Sonic Browser Application
Client – Server Issues
• let the server do the mixing and spatialisation
• analysis and classification on server
• lightweight client - Java
• different network topologies and protocols
– Latency issues
– Use of a floating ‘Aura’
Cue Points
• Use Cue Points as Marker Points
– Mark a specific point or section of a sound
• Play only significant portion of sound while browsing
• Reduce time to identify sound by playing characteristic or significant part
• Found in many common sound file formats*
* Technical Report UL-IDC-01-02
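As one concrete case, RIFF/WAVE files can carry cue points in a `cue ` chunk. The sketch below reads the cue IDs and sample offsets from such a chunk. The choice of WAV as the example format, the helper names and the tiny in-memory demo file are assumptions made for illustration; they are not taken from the technical report.

```python
import struct

def read_cue_points(data):
    """Return [(cue_id, sample_offset), ...] from the bytes of a RIFF/WAVE file."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        body = pos + 8
        if chunk_id == b"cue ":
            (count,) = struct.unpack_from("<I", data, body)
            points = []
            for i in range(count):
                # Each 24-byte record: id, position, data chunk id,
                # chunk start, block start, sample offset.
                cue_id, _, _, _, _, offset = struct.unpack_from(
                    "<II4sIII", data, body + 4 + 24 * i)
                points.append((cue_id, offset))
            return points
        pos = body + size + (size & 1)  # chunks are padded to an even length
    return []

def _chunk(chunk_id, payload):
    """Build one RIFF chunk (hypothetical helper for the demo file)."""
    pad = b"\x00" if len(payload) & 1 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad

# Demo: 100 silent 16-bit mono samples with cue points at samples 10 and 60.
fmt = _chunk(b"fmt ", struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16))
samples = _chunk(b"data", b"\x00\x00" * 100)
cue = _chunk(b"cue ", struct.pack("<I", 2)
             + struct.pack("<II4sIII", 1, 10, b"data", 0, 0, 10)
             + struct.pack("<II4sIII", 2, 60, b"data", 0, 0, 60))
riff_body = b"WAVE" + fmt + samples + cue
wav_bytes = b"RIFF" + struct.pack("<I", len(riff_body)) + riff_body
cue_points = read_cue_points(wav_bytes)
```

A browser could start playback at one of the returned sample offsets, so that only the marked portion of the sound is heard while browsing.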
Application Platform: HW & OS
• Normal Multimedia PC (Pentium II/III w. SB Live, etc)
• Server – MS Windows 98/2000
• Client – Any O/S with Java Runtime
Conclusion
• Facilitate different visualisation tools, e.g. for non-Euclidean space.
• Address payment and copyright issues
• Investigate other file types, e.g. MPEG-7.
References (1)
• Brazil, E. (2001). Cue Points: An Examination Of Common Sound File Formats. Limerick, University of Limerick.
• Fekete, J. D., Plaisant, C. (1999). Excentric Labeling: Dynamic Neighborhood Labeling for Data Visualization. Conference on Human factors in Computer Systems, New York, ACM.
• Fernström, M., Brazil, E. (2001). Sonic Browsing: An Auditory Tool For Multimedia Asset Management. International Conference on Auditory Display, Espoo, Finland.
• Ó Maidín, D., Fernström, M. (2000). The Best of Two Worlds: Retrieving and Browsing. COST-G6 Conference on Digital Audio Effects DAFx-00, Verona, Università degli Studi di Verona.
References (2)
• Shneiderman, B. (1996). The eyes have it: A task by data type taxonomy for information visualizations. IEEE, Visual Languages, Boulder, CO, USA.
• Zhang, T., Kuo, C.C. (1998). Content-based Classification and Retrieval of Audio. SPIE's 43rd Annual Meeting - Conference on Advanced Signal Processing Algorithms, Architectures, and Implementations VIII, San Diego.
• Zhang, T., Kuo, C.C. (1998). Hierarchical System for Content-Based Audio Classification and Retrieval. SPIE's Conference on Multimedia Storage and Archiving Systems III, Boston.