Elec4140: Speech and Image Compression
Fall Semester 2011
Tue and Thu (16:30 – 17:50)
Room 2504
Bing Zeng (Room: 2434)
Email: [email protected]
Office Tel: 2358-7058
GSM: 9418-1354
Course outline
• Introduction
• Entropy coding
• Quantization
• Transform coding
• Predictive coding
• Hierarchical coding
• JPEG standards
• Interframe coding
• Motion estimation
Applicable to any kind of signal (speech, voice, image, video, etc.)
Course outline (cont.)
• H.261/H.263/H.263++/H.264
• MPEG-1/2/4 and …
• Sub-band/wavelet coding
• Vector quantization
• Special topics
• Packet video
• Visual contents over IP and wireless networks
• Audio coding
• Speech coding
Applicable to any kind of signal (speech, voice, image, video, etc.)
How to grade?
• For all students:
– No homework!
– One midterm (20%, on Oct. 18, during the lecture hours)
– One final (40%)
– Two (mini) projects (20% + 20%): UG students can team up (maximum 2); PG students need to do them independently
– Or any other formula as long as it makes sense
Course material
• Lecture notes
– Accessible at our course website
– Updated from time to time
• References:
– Majid Rabbani and Paul W. Jones, Digital Image Compression Techniques, vol. TT7, SPIE Optical Engineering Press, 1991.
– Vasudev Bhaskaran and Konstantinos Konstantinides, Image and Video Compression Standards, Algorithms and Applications, 2nd Edition, Kluwer, 1997.
– Yao Wang, Jorn Ostermann, and Ya-Qin Zhang, Video Processing and Communications, Prentice Hall, 2002.
Rate-distortion curve
[Figure: R-D curve; axes: Rate, Distortion]
Image representation
Low resolution browsing
Zooming
Panning around
JPEG
Canon PowerShot 11, resulting in about 1.04 MB
JPEG pictures over 2.5G/3G
• 1040 KB = 1040 x 8 Kb = 8,320 Kb in total
• Need 8,320/16 = 520 seconds (over 8 minutes) for transmission over 2.5G GPRS (at 16 Kb/s)
• Over 3G?
• Moreover, protection must be added before transmission, so perhaps 15-20 minutes (or more) are needed!
• What do you think? Acceptable?
• What about bit errors (on the order of 10^-2 ~ 10^-3) that occur in wireless transmission?
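The transmission-time arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, protocol overhead and error protection are ignored, and the 3G rate of about 1 Mb/s (taken here as 1000 Kb/s) is an assumption borrowed from the 3G figure quoted elsewhere in these slides.

```python
def transmission_seconds(size_kbytes: float, rate_kbits_per_s: float) -> float:
    """Ideal time to send a file: no protocol overhead, no retransmissions."""
    size_kbits = size_kbytes * 8  # 1 byte = 8 bits
    return size_kbits / rate_kbits_per_s


JPEG_KB = 1040        # picture size from the slide (about 1.04 MB)
GPRS_KBPS = 16        # 2.5G GPRS rate from the slide
ASSUMED_3G_KBPS = 1000  # assumption: ~1 Mb/s taken as 1000 Kb/s

gprs = transmission_seconds(JPEG_KB, GPRS_KBPS)
three_g = transmission_seconds(JPEG_KB, ASSUMED_3G_KBPS)

print(f"2.5G GPRS: {gprs:.0f} s ({gprs / 60:.1f} min)")
print(f"3G (assumed rate): {three_g:.2f} s")
```

Under these assumptions the same picture needs 520 seconds over GPRS but under 10 seconds over 3G, before any protection overhead is added.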
Video over IP
• 2 hours of high-quality DVD (at 1 MB/s)
– 2 x 60 (minutes) x 60 (seconds) = 7,200 seconds, so 7,200 MB = 7.2 GB in total
– Keeping 1 MB/s bandwidth for each client is still a dream!
– Downloading it over today’s popular Ethernet (at 10 Mb/s) needs about 2 hours. Acceptable?
– What can we do?
• Video mobile phone over 3G (at ~1 Mb/s)
– Acceptable in CIF/QCIF frames (288 x 352/144 x 176) at 10-30 frames per second
– Questions: what would be the price (1 Mb/s is about 60 times faster than GSM/GPRS), and what about battery life?
– Your experience!!!
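The video-over-IP numbers can be checked with a short Python sketch. The helper names are hypothetical, the link is assumed to run at its full nominal rate with no overhead, and 1 Mb/s is taken as 1000 Kb/s for the speed comparison.

```python
def movie_size_mb(hours: float, rate_mbytes_per_s: float) -> float:
    """Total size of a stream of the given duration, in megabytes."""
    return hours * 3600 * rate_mbytes_per_s


def download_hours(size_mb: float, link_mbits_per_s: float) -> float:
    """Ideal download time in hours over a link of the given bit rate."""
    seconds = size_mb * 8 / link_mbits_per_s  # 1 byte = 8 bits
    return seconds / 3600


size = movie_size_mb(hours=2, rate_mbytes_per_s=1)     # DVD example from the slide
hours = download_hours(size, link_mbits_per_s=10)       # 10 Mb/s Ethernet
speedup = 1000 / 16                                     # ~1 Mb/s 3G versus 16 Kb/s GPRS

print(f"Movie size: {size:.0f} MB ({size / 1000:.1f} GB)")
print(f"Ideal download time: {hours:.1f} hours")
print(f"3G versus GPRS: {speedup:.1f}x")
```

The ideal download time comes out somewhat under the slide's "about 2 hours"; real links rarely sustain their nominal rate, so the rounded figure is a reasonable estimate.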
Streaming versus downloading (for movies)
• Streaming
– Bandwidth needed: ~1 MB/second
– Only during peak hours (8pm-11pm) every day
– Bottleneck not at the last mile, but at the connection between each residential complex and the service center
• Downloading
– Bandwidth needed: ~10 MB/second (or ~100 Mb/second)
– 24 hours a day (in push mode)
– All contents are stored at each complex and updated from time to time
Two important resources
• Computing power
• Networking
Trend 1: Computing everywhere
[Figure: three log-scale growth charts, 1990 to 2005]
• Moore’s Law
– 100X in 10 years
– 10,000X in 20 years
– Even faster for disk & memory
• Not just computing:
– Graphics doubles every 6 months
– Storage doubles every 8 months
– Memory, LCD, …
• If cars followed the same trend:
– 1 billion times faster
– 0.01 RMB
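The growth factors quoted above follow from compounding a fixed doubling period. The sketch below is illustrative only: the function name is hypothetical, and the 18-month doubling period used for Moore's Law is the commonly quoted figure, assumed here rather than stated in this bullet list.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Multiplicative growth after `years` if capacity doubles every `doubling_months`."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


# Assumption: Moore's Law taken as one doubling every 18 months.
print(f"Moore's Law, 10 years: {growth_factor(10, 18):,.0f}x")
print(f"Moore's Law, 20 years: {growth_factor(20, 18):,.0f}x")

# Doubling periods quoted on the slide for graphics and storage.
print(f"Graphics (6 months), 10 years: {growth_factor(10, 6):,.0f}x")
print(f"Storage (8 months), 10 years: {growth_factor(10, 8):,.0f}x")
```

With an 18-month doubling period the results land close to the slide's round figures of 100X in 10 years and 10,000X in 20 years, while the shorter doubling periods for graphics and storage compound far faster.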
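Moore's Law and its evolution: "free" computing
• Bell's Law: at constant performance, price falls exponentially.
• There are now 100 billion microprocessors, far more than the human population!
[Figure: price (exponential scale) versus time, 2x every 18 months: mainframe $100,000; PC $1,000; PDA / phone $100; TV / handheld PC $10; cookie? chair? $1]
Moore's Law and its evolution: ubiquitous computing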
Trend 2: Networking everywhere
• No more doubt about value of more bandwidth
• Metcalfe’s Law:
– N cost; N^2 value
[Figure: Set Top Box, PC/TV, Host, VOD Server]
[Figure: US households with > 1Mb connections, 1990 to 2005]
Metcalfe’s Law: N^2 value for N cost
• 1 to 1 (like telephone)
• 1 to N (like TV)
• N to N (Internet!)
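A small Python sketch shows why the value of an N-to-N network outpaces its cost. It assumes, purely for illustration, that cost is one unit per node and that value is counted as the number of distinct node pairs that can communicate; the function name is hypothetical.

```python
def pairwise_links(n: int) -> int:
    """Number of distinct node pairs in a fully connected N-to-N network."""
    return n * (n - 1) // 2


for n in (10, 100, 1000):
    cost = n                      # assumption: cost grows linearly with nodes
    value = pairwise_links(n)     # grows on the order of N^2
    print(f"N = {n:>4}: cost ~ {cost:>4}, possible connections = {value:>6}")
```

Each tenfold increase in N raises the cost tenfold but the number of possible connections roughly a hundredfold, which is the N^2 behavior the slide refers to.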
Trend 3: Wireless everywhere
Total Cellular and PCS Subscribers (millions)
Total Subs      1997  1998E  1999E  2000E  2001E  2002E  2003E  CAGR
North America     58     70     84     98    111    124    137   14%
Europe            59    100    147    197    250    306    365   30%
Asia/Pacific      67    102    144    191    242    296    353   28%
Latin America     13     21     31     43     57     73     91   34%
Africa/ME          7     10     13     17     22     28     35   29%
Total            203    303    419    546    682    827    981   26%
Source: Merrill Lynch estimates
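The CAGR column can be reproduced approximately with the standard compound-growth formula. The sketch below assumes the rates are computed over the five years from 1998E to 2003E; that period is an inference from the numbers, not something the table states, and the function name is hypothetical.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.14 means 14%)."""
    return (end / start) ** (1 / years) - 1


# (1998E, 2003E) subscriber estimates in millions, copied from the table.
subscribers = {
    "North America": (70, 137),
    "Europe": (100, 365),
    "Total": (303, 981),
}

for region, (start, end) in subscribers.items():
    # Assumption: five annual steps between 1998E and 2003E.
    print(f"{region}: {cagr(start, end, 5) * 100:.1f}% per year")
```

Under this assumption the computed rates fall within about one percentage point of the printed column; small differences are expected because the table values are rounded.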
Trend 4: Appliances everywhere
• Where there is electricity, there is computing:
– Television, Telephone, PDA, Car, appliances, …
• And much more:
– Art, E-Book, Wallet PC, Doors, Toilets, Garbage cans, …
Trend 5: “Everyone” online
• Many more Internet users than PC users today
Growth of Internet Population in China (users, '000)
Year    1995  1996  1997   1998   1999    2000    2001    2002    2003
Users     15   120   639  2,094  5,766  13,635  26,871  43,088  52,016
Trend 6: “Everything” online
• 12,000,000,000 GB of information today
– www.lesk.com/mlesk/ksg97/ksg.html
• Soon everything can be recorded and indexed
• Doesn’t mean you can find it!
[Figure: storage scale (Kilo, Mega, Giga, Tera, Peta, Exa, Zetta) marking a photo, a book, a movie, all books (text), all books (multimedia), and everything recorded]