Audio Programming in Java
A presentation for the Vancouver Island Java Users’ Group

Kevin Matz
2010.09.30

Topics

• Sound basics
• Java Media Framework (avoid)
• Java Sound API
  – Playback
  – Real-time capture and processing
• MP3 playback with JLayer
• What is a .MOD player…
• …and how do you build one?

Photo credit: Louise Docker, sxc.hu

What is sound?

• Vibrations propagated through the molecules of the air are detected by your eardrums and interpreted and perceived by your brain as sound
• Sound waves, as plotted on a Cartesian plane, are graphical representations of the patterns of high and low air pressure in a travelling sound wave

Properties of sound waves

Mixing waveforms

• To play multiple sounds simultaneously, you can mix the waveforms by simply adding them together
• Scale the output back to the standard volume to avoid clipping/distortion
• Basically: take the average of samples from all channels at each point in time
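The averaging rule above can be sketched in a few lines of Java. This is a minimal illustration, not code from the presentation: the class name `MixSketch` and the choice of 16-bit samples in equal-length arrays are assumptions.

```java
// Hypothetical helper: mixes equal-length 16-bit channels by averaging.
final class MixSketch {
    /** Returns the per-index average of all channels (empty input gives an empty result). */
    static short[] mix(short[][] channels) {
        if (channels.length == 0) {
            return new short[0];
        }
        int length = channels[0].length;   // assumption: every channel has this length
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            int sum = 0;                   // add the waveforms in a wider type
            for (short[] channel : channels) {
                sum += channel[i];
            }
            // Dividing by the channel count scales the sum back into 16-bit range.
            out[i] = (short) (sum / channels.length);
        }
        return out;
    }
}
```

Averaging never clips, but it also makes each sound quieter as more channels are added; that trade-off is the "scale the output back" step on the slide.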

Recording and reproducing sound

• A microphone contains a diaphragm that vibrates when struck by sound waves, which vibrates a magnet within a coil to electromagnetically convert mechanical vibrations into an electrical signal with a varying amplitude
• “voice-shaped currents” (Alexander Graham Bell)
• In a loudspeaker, the electric signal causes an electromagnet to vibrate a diaphragm or speaker cone, which pushes the surrounding air out in the same pattern, reproducing the sound waves

Representing sound recordings digitally

• Levels of electrical impulses quantized using an ADC (analog-to-digital converter): “Pulse-Code Modulation”
• Properties of raw PCM recordings:
  – Sampling rate determines how frequently the analog signal is to be sampled (e.g., 22000 Hz)
  – Bits per sample (e.g., 8 or 16)
    • Little-endian vs. big-endian storage for more than 8 bits
    • Signed vs. unsigned
  – Channels: Mono vs. stereo
  – Encoding: Uncompressed linear amplitudes (Linear PCM) vs. logarithmic dynamic range compression (μ-Law, A-Law)
• e.g., CD audio: 44100 Hz, 16 bit, stereo, linear
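As a rough illustration of these properties, the sketch below describes the CD-audio example with `javax.sound.sampled.AudioFormat` and decodes signed 16-bit little-endian bytes into sample values. The class name `PcmSketch` is hypothetical, and the signed, little-endian byte layout is an assumption (it is the layout used by typical .WAV files; the slide does not specify it).

```java
import javax.sound.sampled.AudioFormat;

// Hypothetical helper illustrating PCM parameters and byte order.
final class PcmSketch {
    // 44100 Hz, 16 bits per sample, 2 channels, signed, little-endian (assumed layout).
    static final AudioFormat CD_QUALITY = new AudioFormat(44100f, 16, 2, true, false);

    /** Decodes signed 16-bit little-endian PCM bytes into sample values. */
    static short[] decode16le(byte[] pcm) {
        short[] samples = new short[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            int low = pcm[2 * i] & 0xFF;   // low byte comes first; treat it as unsigned
            int high = pcm[2 * i + 1];     // high byte carries the sign
            samples[i] = (short) ((high << 8) | low);
        }
        return samples;
    }
}
```

With this format, one frame (one sample for each of the two channels) occupies four bytes, which is what `CD_QUALITY.getFrameSize()` reports.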

Audio in Java: Java Media Framework (JMF)

• Pretty much dead
  – API not updated since 1999
  – Sporadic maintenance (periods of years without updates)
  – JMF website full of broken links
• Supports few modern media formats
• Add-on to the JRE; separate download and install needed
  – Windows, Linux, Solaris supported, but not Mac
• MP3 decoder/encoder removed in 2002; decoder only made available as additional plug-in in 2004

Java Sound API (javax.sound)

• Part of the Java SE runtime environment since 1.3 (May 2000)
• javax.sound.sampled
  – Playback, capture, mixing of sampled audio
  – Natively supports .WAV, .AU, and .AIFF formats
  – Service provider interface (SPI) allows extensibility for new audio devices and sound file formats
• javax.sound.midi
  – MIDI music synthesis and control of MIDI devices
  – Not covered in this presentation

javax.sound: Interfaces

• Line
  – An element of the “digital audio pipeline” that can carry audio data
  – open(), close()
  – LineListeners can be registered to monitor open/start/stop/close events
• Port extends Line
  – Representations of jacks for output to or input from audio devices
  – e.g., microphone, CD player, line in, speaker, headphone, line out
• Mixer extends Line
  – A representation of any audio device with one or more input and/or output lines
  – Can be a software implementation of a mixer, i.e., a device that combines input from multiple input lines onto a single output line

javax.sound: Interfaces

• DataLine extends Line
  – Provides start(), stop(), available(), drain(), etc. to control audio data playback/capture
• SourceDataLine extends DataLine
  – To output sound data, you write to a SourceDataLine via write()
  – Called “source” as it is intended to be an input to a mixer
• TargetDataLine extends DataLine
  – To capture incoming sound data, you read it from a TargetDataLine via read()
  – Called “target” as it is intended to be an output of a mixer
• Clip extends DataLine
  – A data line that can have data pre-loaded prior to playback
  – Supports looping

javax.sound: Classes

• AudioFormat
  – Describes a format in terms of sample rate, bits per sample, sample encoding, channels (mono/stereo), etc.
• AudioFileFormat
  – Describes format of an audio file (e.g., .WAV, .AU, .MP3) + an AudioFormat
• AudioInputStream extends InputStream
  – An InputStream with a specific audio format (suitable for reading from audio files)
  – Note: no AudioOutputStream!
• AudioSystem
  – Main entry point to audio resources
  – Query and get mixers (audio devices) available on system
  – Or, get lines directly without dealing with mixers
    • Default mixer (audio device) for various line types determined by system properties, or can be specified in file lib/sound.properties in JRE directory
  – Open audio files (returns AudioInputStream)
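To show how these classes fit together without needing a sound card, the sketch below wraps raw PCM bytes in an `AudioInputStream`, writes them out as a .WAV file image with `AudioSystem.write`, and reopens that image with `AudioSystem.getAudioInputStream`. The class name `WavRoundTripSketch` and the in-memory round trip are illustrative choices, not part of the presentation's demos.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper: AudioFormat + AudioInputStream + AudioSystem in memory.
final class WavRoundTripSketch {
    /** Wraps raw PCM bytes in an AudioInputStream and encodes them as a WAV file image. */
    static byte[] toWav(byte[] pcm, AudioFormat format) {
        long frames = pcm.length / format.getFrameSize();
        AudioInputStream in =
                new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // There is no AudioOutputStream; AudioSystem.write does the file encoding.
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Opens an audio file image and returns the AudioFormat that AudioSystem detects. */
    static AudioFormat formatOf(byte[] fileImage) {
        try (AudioInputStream in =
                     AudioSystem.getAudioInputStream(new ByteArrayInputStream(fileImage))) {
            return in.getFormat();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (UnsupportedAudioFileException e) {
            throw new IllegalArgumentException("not a supported audio file", e);
        }
    }
}
```

For example, `formatOf(toWav(new byte[1600], new AudioFormat(8000f, 16, 1, true, false)))` should report an 8000 Hz, 16-bit, mono format.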

javax.sound: Security

• Playback is generally permitted
• Recording
  – Always prohibited for applets
  – Prohibited for applications running under a security manager (e.g., WebStart apps), but can be overridden by user or admin by editing policy file
  – Permitted for applications with no security manager

Demo 1: Playing a clip

• SimpleClipDemo.java
  – Simple playback of a .WAV file
• SoundBoardDemo.java
  – Simple Swing app using threads for simultaneous playback of multiple clips
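The demo sources are not reproduced in this transcript, so the following is only a sketch of what simple clip playback can look like with the Java Sound API. The class name `ClipPlaybackSketch` and the use of a `CountDownLatch` to wait for the STOP event are assumptions, not the presenter's code.

```java
import java.io.File;
import java.util.concurrent.CountDownLatch;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;
import javax.sound.sampled.LineEvent;

// Hypothetical stand-in for a simple clip demo; requires an audio output device to play.
final class ClipPlaybackSketch {
    /** Loads an entire audio file into a Clip, plays it once, and blocks until it stops. */
    static void playAndWait(File file) throws Exception {
        try (AudioInputStream in = AudioSystem.getAudioInputStream(file);
             Clip clip = AudioSystem.getClip()) {
            CountDownLatch stopped = new CountDownLatch(1);
            clip.addLineListener(event -> {
                if (event.getType() == LineEvent.Type.STOP) {
                    stopped.countDown();   // playback reached the end (or was stopped)
                }
            });
            clip.open(in);    // pre-loads all of the audio data
            clip.start();     // returns immediately; playback runs in the background
            stopped.await();  // keep the clip open until playback has finished
        }
    }

    public static void main(String[] args) throws Exception {
        playAndWait(new File(args[0]));   // path to a .WAV file
    }
}
```

Because `start()` does not block, several clips opened this way can play at the same time, which is the idea behind a sound-board style application.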

Demo 2: Capturing and processing audio in real time

• MicrophoneEchoDemo.java
  – Demo that captures microphone input and plays it back, adding an echo effect with a one-second delay
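The core of an echo effect is a delay line: each output sample is the input plus a quieter copy of the input from some time earlier. The sketch below applies that to a buffer that is already in memory. It is not the demo's code: the class name `EchoSketch`, the `decay` parameter, and the offline (non-real-time) processing are assumptions. For a one-second delay, `delaySamples` equals the sample rate.

```java
// Hypothetical helper: single echo applied to a 16-bit mono buffer.
final class EchoSketch {
    /** out[i] = in[i] + decay * in[i - delaySamples], clamped to the 16-bit range. */
    static short[] addEcho(short[] in, int delaySamples, double decay) {
        short[] out = new short[in.length];
        for (int i = 0; i < in.length; i++) {
            double mixed = in[i];
            if (i >= delaySamples) {
                mixed += decay * in[i - delaySamples];   // delayed, attenuated copy
            }
            long rounded = Math.round(mixed);
            // Clamp instead of wrapping around, so loud passages clip rather than crackle.
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, rounded));
        }
        return out;
    }
}
```

A real-time version would read blocks from a TargetDataLine, keep the last second of input in a ring buffer, and write the processed blocks to a SourceDataLine.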

JavaZoom open-source projects

• JLayer
  – Library for playing MP3 files
  – GNU LGPL license
  – Has its own API separate from Java Sound
  – jl1.0.1.jar is 106k
• MP3SPI
  – A Java Sound SPI plug-in so that Java Sound API treats .MP3 files like any other already-supported format
• VorbisSPI
  – A Java Sound SPI for Ogg Vorbis files
• Note: See mp3licensing.com

Demo 3: Playing an MP3 song in the background with JLayer

• BackgroundMusicDemo.java
  – Demo using JLayer’s Player class in a separate thread to play back an MP3 song in the background

Tracked music and the .MOD format

• “Tracker” programs allow composition of music by entering notes in a spreadsheet-like grid
• .MOD format originated on the Amiga with Karsten Obarski’s Ultimate Soundtracker (1987) and derivatives such as Protracker (shown below)

Image credit: Wikipedia

.MOD format

• Main features:
  – 15 or 31 eight-bit samples (instruments)
  – 4 channels
  – Song consists of patterns (64 rows) arranged in an order
  – Effect commands
    – Arpeggio, portamento (slide up/down), vibrato, …
    – Change speed, jump to pattern, …
• Variations on the .MOD format, and later formats (ScreamTracker .S3M, FastTracker .XM, Impulse Tracker .IT) expanded the number of channels and samples, added effects, and added more control over instruments

Writing a .MOD player

• We need to solve two major issues:
  1. How do we play multiple sounds simultaneously?
     ➔ Easy: Just add the waveforms together
     ➔ Or even easier: Use Java’s mixer functionality
  2. If we have a single recording of an instrument at a certain pitch (Middle C), then how do we reproduce the same instrument sound at a different pitch (e.g., an A in octave 5)?
     ➔ Thankfully, this is also easy!

How do you play a sample at a different pitch?

• .MOD file assumes samples are recorded such that playing the sample at 8287 samples/sec will render the sample as a middle C
• To play a sample at a different pitch, i.e., at a different frequency… we simply play the sample at a different frequency!
  – Play the sample faster to get a higher pitch
  – Play the sample slower to get a lower pitch
• But by what factor should we scale a sample to get a particular note?

Table of frequencies for notes

Note   Octave 3   Octave 4             Octave 5
C      262 Hz     523 Hz (Middle C)    1047 Hz
C#     277        554                  1109
D      294        587                  1175
D#     311        622                  1245
E      330        659                  1319
F      349        698                  1397
F#     370        740                  1480
G      392        784                  1568
G#     415        831                  1661
A      440        880                  1760
A#     466        932                  1865
B      494        988                  1976

• Frequency ratio (interval) between two consecutive semitones is the 12th root of 2: 2^(1/12) ≈ 1.05946
• e.g. 880 Hz * 1.05946 = 932 Hz
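The semitone ratio answers the earlier question of how much to scale a sample's playback rate: moving n semitones multiplies the frequency by 2^(n/12). A minimal sketch follows; the class and method names are hypothetical.

```java
// Hypothetical helper for semitone arithmetic.
final class PitchSketch {
    /** Ratio between two consecutive semitones, about 1.05946. */
    static final double SEMITONE = Math.pow(2.0, 1.0 / 12.0);

    /** Shifts a frequency or playback rate by a number of semitones (negative shifts down). */
    static double shift(double base, int semitones) {
        return base * Math.pow(2.0, semitones / 12.0);
    }
}
```

For instance, `shift(880, 1)` gives roughly 932 Hz as in the slide's example, and `shift(8287, 12)` doubles the 8287 samples/sec reference rate to play the same sample one octave higher.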

Playing individual samples at different frequencies while maintaining a constant output frequency

• Re-sampling:
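The slide's illustration did not survive in this transcript, so here is one simple way re-sampling can be done: step through the source sample by a fractional amount for every output sample, where the step is the desired playback rate divided by the output rate. The class name `ResampleSketch` and the nearest-lower-index lookup (no interpolation) are assumptions rather than the presenter's method.

```java
// Hypothetical helper: fractional-step re-sampling of an 8-bit sample.
final class ResampleSketch {
    /**
     * Produces count output samples by advancing through the source by step
     * source samples per output sample. Output past the end of the source stays zero.
     */
    static byte[] render(byte[] sample, double step, int count) {
        byte[] out = new byte[count];
        double position = 0.0;                 // fractional read position in the source
        for (int i = 0; i < count; i++) {
            int index = (int) position;        // truncate to the nearest lower index
            if (index >= sample.length) {
                break;                         // source exhausted; remaining output is silence
            }
            out[i] = sample[index];
            position += step;
        }
        return out;
    }
}
```

With an assumed 44100 Hz output, playing a sample at the 8287 samples/sec reference rate means a step of 8287 / 44100, or about 0.188, so each source sample is repeated about five times. A step above 1 skips source samples and raises the pitch.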

Playing a .MOD song (1 of 2)

• Keep track of position in “order” list and look up pattern number
• Keep track of current row position in the current pattern
• Keep track of song speed (tempo), which is controlled by two settings:
  – “Tick speed” setting determines how many “ticks” each row is divided into
  – Beats per minute (BPM) determines how much time is spent per tick
  – Time per tick = 2.5 sec / BPM setting; e.g., 2.5 sec / 125 = 0.02 sec per tick
  – So BPM = 125 and tick speed = 6 means each row is played for 0.02 * 6 = 0.12 sec; a pattern with 64 rows will take 0.12 * 64 = 7.68 sec to play
• For each channel, keep track of:
  – Instrument/sample
  – Note and corresponding frequency
    • Current position in sample
    • Skip/stutter parameters
  – Volume
  – Effect (if any)
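The timing arithmetic above translates directly into code. In this sketch the class and method names are hypothetical, and `framesPerTick` additionally assumes a fixed output sample rate, which the slide does not mention.

```java
// Hypothetical helper for .MOD tempo arithmetic.
final class TempoSketch {
    /** Time per tick = 2.5 sec / BPM setting. */
    static double secondsPerTick(int bpm) {
        return 2.5 / bpm;
    }

    /** A row lasts for "tick speed" ticks. */
    static double secondsPerRow(int bpm, int ticksPerRow) {
        return secondsPerTick(bpm) * ticksPerRow;
    }

    /** Number of output frames to render for one tick at the given output rate. */
    static int framesPerTick(int bpm, int outputRate) {
        return (int) Math.round(outputRate * 2.5 / bpm);
    }
}
```

With BPM = 125 and tick speed = 6 this reproduces the slide's figures: 0.02 sec per tick, 0.12 sec per row, and 7.68 sec for a 64-row pattern. At an assumed 44100 Hz output, one tick is 882 frames.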

Playing a .MOD song (2 of 2)

• For each row in a pattern:
  – For the number of ticks according to the “tick speed”:
    • If this is the first tick:
      – For each channel:
        » Update each channel with new instrument, note, and/or effect
        » Execute “one-time” effect commands
        » Set up parameters for “continuous” effects
    • Else:
      – For each channel:
        » Update continuous effects
    • Generate audio data for this tick by rendering the sample for each channel and mixing the channels into a single output channel
    • Send the audio data to the playback buffer
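The loop above can be sketched as a skeleton in Java. Everything here is illustrative: the `Channel` interface and its three methods are hypothetical placeholders for the per-channel state described earlier, and the output is collected in a byte array rather than sent to a real playback buffer.

```java
// Hypothetical skeleton of the row/tick loop; not the Roarcore player's code.
final class PatternLoopSketch {
    /** Placeholder for per-channel state (instrument, note, volume, effect). */
    interface Channel {
        void startRow(int row);      // first tick: new instrument/note/effect, one-time effects
        void updateEffects();        // later ticks: advance continuous effects
        void mixInto(int[] mix);     // add this channel's rendered samples for one tick
    }

    /** Renders one pattern to signed 8-bit mono PCM by averaging the channels. */
    static byte[] renderPattern(Channel[] channels, int rows, int ticksPerRow, int framesPerTick) {
        byte[] out = new byte[rows * ticksPerRow * framesPerTick];
        int position = 0;
        for (int row = 0; row < rows; row++) {
            for (int tick = 0; tick < ticksPerRow; tick++) {
                for (Channel channel : channels) {
                    if (tick == 0) {
                        channel.startRow(row);
                    } else {
                        channel.updateEffects();
                    }
                }
                int[] mix = new int[framesPerTick];   // wide accumulator for this tick
                for (Channel channel : channels) {
                    channel.mixInto(mix);
                }
                for (int sum : mix) {
                    // Average the channels; a player with no channels renders silence.
                    out[position++] = (byte) (channels.length == 0 ? 0 : sum / channels.length);
                }
            }
        }
        return out;   // a real player would write each tick to a SourceDataLine instead
    }
}
```

Rendering one tick at a time keeps effect updates in step with the audio, since the number of frames per tick is fixed by the BPM setting.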


Demo 4: Roarcore .MOD player

• http://www.roarcore.com


Any questions?


Thanks!

• By the way…I’m looking for volunteers to take a survey about adding a new type of commenting construct to the Java language!

http://www.kevinmatz.com/survey
