![Page 1: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/1.jpg)
An Experiment To Characterize Videos On The Web
Soam Acharya
Brian Smith
Cornell University
MMCN 1998
![Page 2: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/2.jpg)
Overview
• Designed and implemented an experiment to search and analyze videos on the web
• 22500 HTML documents
• 57000 movies
• 100 Gbytes of data
![Page 3: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/3.jpg)
Why?
• Codec Designers
• Network Engineers
• Other Multimedia Researchers
• MM file systems
• Webmasters
![Page 4: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/4.jpg)
Questions We Asked
• How many movies are out there?
  – Not all that many. We found 57,000.
• What are their basic properties?
  – 90% last 45 seconds or less. Their median size is 1.1 Mbytes.
• What compression formats are popular?
  – QuickTime is about 53%, followed by MPEG (30%) and AVI (17%).
• How well do the formats compare?
  – MPEG compresses best. QuickTime and AVI are similar.
• Are standard modem rates enough?
  – No. Rates of 28.8 - 128 Kilobits/sec (Kbps) are useless for real-time download and display of movies.
![Page 5: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/5.jpg)
Roadmap
• Data Collection Methodology
• Analysis
• Results
• Conclusion
• Future Work
• Open Questions
![Page 6: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/6.jpg)
Data Collection Methodology
• Hunting Phase
  – get links to movies
• Gathering Phase
  – download movies and gather raw statistics
• Sifting Phase
  – eliminate outliers
![Page 7: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/7.jpg)
Hunting Phase (early April 1997)
• Milked AltaVista for documents dated
  – January 1995 - March 1997
• Looked for MPEG, QuickTime, AVI
• No streaming video format
![Page 8: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/8.jpg)
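Gathering Phase (mid April 1997 - May 1997)
• LDG: movie link distributor/gatherer
• LP: link processor (LP0, LP1, LP2)
• LDG holds the list of page URLs
  – http://www.eg.com/movie.html
  – http://www.cnn.com/pepe.html
  – …..
[Diagram: numbered flow between the LDG, a link processor (LP1), and the web servers www.eg.com and www.vid.com]
1. LDG hands a URL (http://www.eg.com/movie.html) to LP1
2. LP1 retrieves movie.html from www.eg.com
3. LP1 retrieves the movie (my.mov) from www.vid.com
4. LP1 returns summary statistics to the LDG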
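The distributor/processor split above can be sketched roughly as follows. This is only an illustration of the idea, not the authors' tool: the thread pool, the function names, and `stub_fetch` (which stands in for downloading a page and its movie) are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List


def stub_fetch(page_url: str) -> Dict:
    """Hypothetical stand-in for steps 2-3: fetch the page and its movie.

    A real link processor would download the movie and measure it; this
    stub only returns a placeholder value so the sketch runs offline.
    """
    return {"url_length": len(page_url)}


def process_link(page_url: str, fetch: Callable[[str], Dict]) -> Dict:
    """Link processor (LP): handle one URL and return summary statistics."""
    stats = fetch(page_url)
    return {"page": page_url, **stats}


def distribute_and_gather(
    page_urls: Iterable[str],
    fetch: Callable[[str], Dict] = stub_fetch,
    num_processors: int = 3,
) -> List[Dict]:
    """LDG: distribute URLs to link processors and gather their results."""
    with ThreadPoolExecutor(max_workers=num_processors) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(lambda url: process_link(url, fetch), page_urls))


if __name__ == "__main__":
    urls = ["http://www.eg.com/movie.html", "http://www.cnn.com/pepe.html"]
    for row in distribute_and_gather(urls):
        print(row)
```

Passing the fetcher in as a parameter keeps the distribution logic testable without network access.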
![Page 9: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/9.jpg)
Sifting Phase
• Processed 100 Gbytes of data and 57,000 titles
  – used mpegstat and modified xanim
• 4 < frames/sec < 40 {5000 titles eliminated}
• duration > 0.5 seconds {1000 titles eliminated}
• 0.6 < aspect ratio < 1.667 {1000 titles eliminated}
• bitrate < 10 Mbps {1000 titles eliminated}
  – bitrate = (movie size)/(movie duration)
• duplicate URL detection {1500 titles eliminated}
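The sifting rules above could be expressed as a simple filter. This is a minimal sketch under stated assumptions: the `Movie` record and its field names are hypothetical, 1 Mbps is taken as 10^6 bits/sec, and the order in which the tests are applied is a guess.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Movie:
    """Hypothetical per-title record; field names are illustrative only."""
    url: str
    size_bytes: int
    duration_s: float
    num_frames: int
    width: int
    height: int


def passes_sift(m: Movie) -> bool:
    """Apply the thresholds listed on the slide to a single title."""
    if m.duration_s <= 0.5 or m.height <= 0:
        return False
    fps = m.num_frames / m.duration_s
    aspect_ratio = m.width / m.height
    # bitrate = (movie size)/(movie duration), here in bits per second
    bitrate_bps = m.size_bytes * 8 / m.duration_s
    return (
        4 < fps < 40
        and 0.6 < aspect_ratio < 1.667
        and bitrate_bps < 10_000_000  # assumes 10 Mbps = 10^7 bits/sec
    )


def sift(movies: Iterable[Movie]) -> List[Movie]:
    """Drop outliers and titles whose URL has already been kept."""
    seen_urls = set()
    kept = []
    for m in movies:
        if m.url in seen_urls or not passes_sift(m):
            continue
        seen_urls.add(m.url)
        kept.append(m)
    return kept
```

For example, a 320 x 240 title of 1,100,000 bytes with 225 frames over 15 seconds passes every test, while a 0.4 second clip is rejected by the duration rule.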
![Page 10: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/10.jpg)
Analysis
• 47500 titles remained
  – 53% QuickTime, 30% MPEG, 17% AVI
• Can be divided into two categories
  – Distributions:
    • by date
    • fps
    • size
    • duration
    • aspect ratio
    • bitrate
  – Comparing movie formats against each other
![Page 11: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/11.jpg)
Roadmap
• Data Collection Methodology
• Analysis
• Results
• Conclusion
• Future Work
• Open Questions
![Page 12: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/12.jpg)
Movie Growth
[Figure: number of movies (y-axis, 0 to 3500) by month (x-axis, Jan-94 to Apr-97)]
![Page 13: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/13.jpg)
Breakdown of Movie Growth By Type
[Figure: number of movies (y-axis, 0 to 2000) by month (x-axis, Jan-94 to Apr-97), with separate series for QuickTime, MPEG, and AVI]
![Page 14: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/14.jpg)
FPS Distribution
[Figure: number of movies (y-axis, 0 to 12000) by frame rate (x-axis, 4 to 31), with separate series for AVI, MPEG, and QuickTime]
![Page 15: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/15.jpg)
Movie Size (In bytes)
• 70% of movies are 2 Mbytes or less
• Median movie size is about 1.1 Mbytes
![Page 16: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/16.jpg)
Overall Duration Distribution
• 90% of the movies are 45 sec or less, 50% < 15 sec
[Figure: number of movies (y-axis, 0 to 12000) by length in seconds (x-axis, 5 to 115)]
![Page 17: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/17.jpg)
Aspect Ratio
• 74% of all files had an aspect ratio of 1.333
  – 320 x 240
  – 160 x 120
• 89% had aspect ratios of 1.2 - 1.5
![Page 18: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/18.jpg)
Overall Average Bitrate Distribution
• Movie Bitrate = movie size / movie duration
[Figure: histogram of # of movies (left y-axis, 0 to 6000) by bitrate in Kbits/sec (x-axis, bins from 28.8 to 7000, plus "More"), with a cumulative % curve (right y-axis, 0% to 100%); series: Frequency, Cumulative %]
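The bitrate definition, and why modem rates fall short, can be illustrated with a little arithmetic. This is a sketch only: the function names are made up, 1 Kbps is taken as 1000 bits/sec and 1 Mbyte as 10^6 bytes, and pairing the 1.1 Mbyte median size with a 15 second duration is an illustrative assumption rather than a measured title.

```python
def bitrate_kbps(size_bytes: float, duration_s: float) -> float:
    """Average bitrate = movie size / movie duration, in Kbits/sec."""
    return size_bytes * 8 / duration_s / 1000


def download_time_s(size_bytes: float, link_kbps: float) -> float:
    """Seconds needed to transfer the whole file over a link of link_kbps."""
    return size_bytes * 8 / (link_kbps * 1000)


def playable_in_real_time(size_bytes: float, duration_s: float, link_kbps: float) -> bool:
    """True if the link can deliver the movie at least as fast as it plays."""
    return bitrate_kbps(size_bytes, duration_s) <= link_kbps


if __name__ == "__main__":
    size, duration = 1_100_000, 15.0  # illustrative: 1.1 Mbytes, 15 seconds
    print(f"bitrate: {bitrate_kbps(size, duration):.1f} Kbps")
    for link in (28.8, 128.0):
        print(f"{link} Kbps link: {download_time_s(size, link):.1f} s to download, "
              f"real-time: {playable_in_real_time(size, duration, link)}")
```

Under these assumptions the title averages about 587 Kbps, so it takes roughly 306 seconds to download at 28.8 Kbps and about 69 seconds at 128 Kbps, far longer than its 15 second playing time.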
![Page 19: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/19.jpg)
So Far ...
• Distributions:
  – by date
  – fps
  – size
  – duration
  – aspect ratio
  – bitrate
• Comparing movie formats
![Page 20: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/20.jpg)
AVI/QuickTime Comparison

| Video Codecs | AVI | QuickTime |
|---|---|---|
| Radius Cinepak | 43% | 60% |
| Intel Indeo R3.2 | 25% | 2% |
| Microsoft Video 1 | 26% | 0% |
| Apple Video-RPZA | 0% | 22% |

• 25% of AVI, 33% of QuickTime: video only

| | AVI | QuickTime |
|---|---|---|
| Audio Codecs | PCM, MS-ADPCM | PCM, TWOS |

![Page 21: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/21.jpg)
How to Compare Compression?
• Bits/pixel = (video size in bits) / (width * height * # of frames)

| Format | Mean (bits/pixel) | Median (bits/pixel) |
|---|---|---|
| AVI | 2.51 | 2.14 |
| QT | 2.16 | 1.82 |
| MPEG | 0.72 | 0.51 |

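The bits/pixel metric is straightforward to compute once the video stream size and dimensions are known. The sketch below assumes the size is given in bytes and counts only the video portion of the file; the function name and example numbers are illustrative.

```python
def bits_per_pixel(video_size_bytes: int, width: int, height: int, num_frames: int) -> float:
    """Bits/pixel = (video size in bits) / (width * height * # of frames)."""
    total_pixels = width * height * num_frames
    if total_pixels <= 0:
        raise ValueError("width, height and num_frames must all be positive")
    return video_size_bytes * 8 / total_pixels


if __name__ == "__main__":
    # Illustrative clip: 320 x 240, 150 frames, 2,880,000 bytes of video data.
    print(bits_per_pixel(2_880_000, 320, 240, 150))  # 2.0
```

Normalizing by pixel count makes titles of different resolutions and lengths comparable, which is what allows the formats to be ranked against each other.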
![Page 22: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/22.jpg)
MPEG Bits/pixel Distribution
• Size of I : P : B frames ~ 5 : 2 : 1
• 90% of MPEG files were video only

| Frame Type | Mean bits/pixel | Median bits/pixel |
|---|---|---|
| I | 1.25 | 1.10 |
| P | 0.76 | 0.54 |
| B | 0.31 | 0.19 |

![Page 23: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/23.jpg)
MPEG Frame Patterns

| Frame Pattern | % Distribution | Mean bits/pixel |
|---|---|---|
| I | 27.1 | 1.17 |
| IBBPBB | 15.7 | 0.7 |
| IBBPBBPBBPBBPBB | 10.4 | 0.31 |
| IBBPBBPBBPBB | 8.1 | 0.5 |
| IBBBPBBBPBBB | 4.4 | 0.66 |
| IPBBIBB | 4.2 | 0.39 |
| IIP | 3.5 | 0.7 |

80% of MPEG: some recurring pattern
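One way to detect a recurring frame pattern is to look for the shortest unit that, when repeated, reproduces the sequence of frame types. The slides do not say how the patterns were found, so this is only a plausible sketch; the function name and the rule that a pattern must occur at least twice are assumptions.

```python
from typing import Optional


def recurring_pattern(frame_types: str) -> Optional[str]:
    """Return the shortest repeating unit of a frame-type string, or None.

    frame_types is a string such as "IBBPBBIBBPBB", one letter per frame.
    A trailing partial repetition is allowed, but the unit must fit at
    least twice into the sequence to count as recurring.
    """
    n = len(frame_types)
    for period in range(1, n // 2 + 1):
        # The sequence has this period if every frame matches the frame
        # at the same offset within the candidate unit.
        if all(frame_types[i] == frame_types[i % period] for i in range(n)):
            return frame_types[:period]
    return None


if __name__ == "__main__":
    print(recurring_pattern("IBBPBBIBBPBBIBB"))  # IBBPBB
    print(recurring_pattern("IIII"))             # I
    print(recurring_pattern("IBPBBI"))           # None
```

The check is quadratic in the number of frames, which is acceptable for short clips like those in this data set.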
![Page 24: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/24.jpg)
Recap
• Number of movies coming online - exponential, then flat
• MPEG higher fps, QuickTime/AVI lower
• Median size of movies: 1.1 Mbytes
• 90% of movies last 45 seconds or less
• 1.333 is the most common aspect ratio
• 28.8 - 128 Kbps modem rates useless for real-time downloads
• Radius Cinepak is widely used by QuickTime and AVI
• MPEG compresses better than QuickTime and AVI
• 80% of MPEGs have some sort of recurring pattern
![Page 25: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/25.jpg)
Conclusion
• Existing compression technologies not enough for transmission over standard modems
  – explains rise of streaming video technologies
  – users cope by making file sizes, duration smaller
  – but not by throttling the bitrate
  – perceptual threshold?
![Page 26: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/26.jpg)
Future Work
• How do videos age?
• Another study to confirm findings
  – Brewster Kahle, www.archive.org
• Develop tools to automate the process
![Page 27: An Experiment To Characterize Videos On The Web Soam Acharya Brian Smith Cornell University MMCN 1998](https://reader038.vdocuments.us/reader038/viewer/2022110404/56649e945503460f94b99139/html5/thumbnails/27.jpg)
Open Questions
• What are video access patterns on the Web?
• How to analyze streaming video files?