vlsi digital signal processingbbaas/281/notes/lecture01.pdf · 2020-01-08 · eec 180b, b. baas 13...
TRANSCRIPT
VLSI Digital Signal Processing
EEC 281
Lecture 1
Bevan M. Baas
Tuesday, January 7, 2020
© B. Baas2
Today
• Administrative items
• Syllabus and course overview
• My background
• Digital signal processing overview
• Read Programmable DSP Architectures, Part Iby E. A. Lee
© B. Baas3
Course Communication
• Email– Urgent announcements
• Web page– http://www.ece.ucdavis.edu/~bbaas/281/
• Office hours– After lecture Tuesday
– After lecture Thursday
– Tentatively Friday 1:30 – 2:30
© B. Baas4
Course Workload
• 4 unit graduate course
• This course requires significant effort and time– Multi-disciplinary field coverage
• DSP algorithms
• Digital processor architectures
• Arithmetic
– Utilizes robust industry-standard CAD tools (but we will make use of only the core essential features)
• Verilog
• Synthesis tool
• Matlab
© B. Baas5
Course Readings
• No required textbook
• Over 800 slides posted as handouts on the course web page– You should fully understand all material in these handouts
• A few required papers
• Several posted tutorials with example code– You should fully understand these
• Several optional textbook references
© B. Baas6
Main Course Material
• The main body of material is presented in the lectures, readings, and handouts
• Generally speaking, the hwk/projets complement the main material– They go into a much greater depth on specific topics
– They give design experience
– They give significant practical application of theory
• The Quizzes generally focus on the main body of material
© B. Baas7
Breadth and Depth
• Breadth and Depth
READINGS
H
W
K
P
R
O
J
P
R
O
J
LECTURES
HANDOUTS
© B. Baas8
Course Overview
• EEC 281 web page contents
– Reading materials and references
– Hwk/Project descriptions
– Handouts– http://www.ece.ucdavis.edu/~bbaas/281/
© B. Baas9
Course Overview
• Canvas– Grades posted here
• Let me know if you ever see a score different than you expect
– Upload electronic portions of hwk/projects here
• Syllabus– Posted on course web page
© B. Baas10
Lectures
• Ask questions at any time
• Please hold conversations outside of class
• Please silence phones
• Integrated Solid-State Circuits Conference (ISSCC) February 17–19– Quiz 3 and short guest lecture on Tue, Feb 18
• I assign a letter grade only for the final course grade
• I look at the final exams and course record of the class and assign two key dividing points: the A/A+ and (probably B/B+) boundaries, and assign course grades from there using equally-sized intervals
– No required numbers of any particular letter grades
– Absolute scores are not important; the boundaries shift accordingto the difficulty of the exams in any quarter
– Ignore any letter grades you might see on smartsite
© B. Baas11
Letter Grade Assignments
A A+
A/A+
B/B+
B B+
Example with hypothetical data:
© B. Baas12
Working With Others
• Collaboration– Asking questions and explaining principles produces better
work and dramatically increases learning
– Working with others• Do homework and prelabs with classmates nearby
• Ask each other questions, help each other—regarding principles, and approaches to solving only
– See Course Collaboration Policy on web page
• Dishonesty– Copying produces similar work, stunts learning, is not fair
to honest students, and is not allowed in this course• Students engaged in dishonest work will be referred to Student
Judicial Affairs
• I will try to keep in-class exams honest
• Steps will be taken to keep out of class work honest
EEC 180B, B. Baas13
Penalties for Violating the Policy on Student Conduct and Discipline
• Penalties
– Minimum penalty: meetings with SJA officer, zero grade on work, record with SJA
– Permanent F grade on your transcript, no credit for the class
– One to three quarter suspension from the university
– Permanent dismissal from all ten campuses of the University of California. Permanent notation on your transcript.
EEC 180B, B. Baas14
Penalties for Violating the Policy on Student Conduct and Discipline
• Several perspectives– Personal obvious reasons
– ECE and UCD (especially for those inclined to share workwith someone doing poorly in class)Cheating harms our major and university’sreputation among employers who interviewour graduates.
• In summary: The purpose of the penalties and me mentioning them is so that no one will get one!!! Don’t do anything that violates the Policy on Student Conduct!
EEC 180B, B. Baas15
Penalties for Violating the Policy on Student Conduct and Discipline
• Typical scenario:
– Someone shares code/design with another
– They get caught
– The “Copier” feels terrible guilt for causing a friend to get a zero
– The “Sharer” deeply regrets sharing resulting in a zero when he/she should have had a full score
EEC 180B, B. Baas16
Exam and Quiz Regrades
• Some number of exams and quizzes will be scanned before being returned
• Key take-away messages:
– Do not change anything on your work if you request a regrade
– One student did last year and got in big BIGtrouble!!!
EEC 180B, B. Baas17
Cheating Websiteschegg, coursehero, etc.
• The university has recently taken a very strong stand against paying for work (2-quarter suspension for first offense last year)
• Key take-away messages:– Do not post assignments
– Of course do not use any unpermitted outside material in work you submit
– Of course do not post solutions
– Two students did last year and got caught!!!
© B. Baas18
MOSS
• “Measure Of Software Similarity” tool
• Utilizes very sophisticated and fast algorithms
• Processes all Order(N^2/2) = N(N–1)/2 pairings
– Ex: examination of 300 submissions includes 44,850 pairwise comparisons
• If needed, MOSS runs will be made with this year’s work combined with work from past years
© B. Baas19
MOSS Demonstration Case
• Added, deleted, changed all comments
• Changed all variable names
• Reordered modules
• Reordered lines of code within modules
• Changed equivalent logic
• 91% similarity for submission 1
• 91% similarity for submission 2
Original versionModified version
© B. Baas 20
MOSS
• Key take-away messages:
1) MOSS is amazingly good at spotting pairs of submissions that share a common design• This meshes very well with the course collaboration policy
2) Follow the course collaboration policy and you have nothing to worry about
3) Violate the course collaboration policy and you will have something to worry about
© B. Baas31
Advancing CMOS Technologies
• Moore’s “Law” (Observation) was made in 1965 and notes that transistor density ~doubles every year (every 1.5 years now)
• "Cramming more components onto integrated circuits," Gordon Moore, Electronics, April 19, 1965.
© B. Baas32
Original data up to the year 2010 collected and plotted by M. Horowitz, F. Labonte, O. Shacham, K. Olukotun, L. Hammond, and C. Batten
New plot and data collected for 2010-2015 by K. Rupp
New data added by B. Baas
Number of Logical Cores
Transistors(thousands)
EEC 116, B. Baas33
GC2: 16 nm, 23.6 billion transistors
© B. Baas34
• Basic trends– Number of available devices: continually increasing– Energy dissipation per operation: decreasing too slowly
• There are a lot of ways to place and connect a billion transistors• The most efficient implementations (throughput, energy, area) will have:
– Processor sizes that capture computational kernels with few excess circuits
– Optimized clock frequencies and supply voltages matched to dynamic workloads
Future Fabrication Technologies
VDD1VDD3
VDD2
© B. Baas 35
Optimal Computational Tile Size
• The most efficient implementations (energy, throughput, chip area) have: Processor sizes that capture computational kernels with few excess circuits
~~~ ~~~
Energy Effic.
Clock rate
Area Effic.
Unused or low
benefit-per-cost
circuits
Inter-
processor
interconnect
Tile Size
Number of Processors on a Single Die vs. Year
36
Note: Each processor capable of independent program execution
Academic
Industry
© B. Baas
© B. Baas37
Processor Eras
• Transistor Era: the Intel 4004 was
the first commercial single-chip
microprocessor and it contained
2300 hand-drawn transistors
• Single/Multi-Processor Era: focus
on components of single processors
and multi-processors, which
generally scale well to only small
numbers of processors
• 1000-Processor Era: focus on
making systems scalable and
working with processors as
building blocks. The 32 nm
1000-processor KiloCore chip would contain approximately 2300-3700
processors if its area were the same as a 32 nm Intel Core i7 processor,
or 11,000 processors if its area were the same as an Nvidia GP100
FFTVit.
Motion
Est. MemMemMem
Intel 4004, 1971
2300 transistors
AsAP2167-Processor Chip
• 65 nm CMOS, 1.2 GHz (fastest processor designed in any university)
• 3 accelerators + 3 shared memories
• New on-chip networks
• Processors choose own supply voltage and clock freq.
• Apps: JPEG, Wi-Fi TX & RX, H.264 video encoder, ultrasound
• Tools: compiler, mapping, simulators
• Undergrad research opportunitiesAsAP2, 2007
2300 processors
(19.8mm x 19.8mm)
© B. Baas38
© B. Baas43
Digital Signal Processing
• Digital– Discrete time
– Discrete valued
• Signal– 1, 2, 3,… dimensional
• Processing– Analysis
– Synthesis
– Enhancement
© B. Baas44
DSP Workloads
• Often “real-time”– Data producer and consumer can not be paused or held up
• Examples: antenna, controller, camera, video monitor,…
– Very strict minimum performance levels
– Performance above that minimum is often of little value
DSPsystem
Dataproducer
Dataconsumer
© B. Baas45
DSP Workloads
DSPsystem
Dataconsumer
Synthesis. Ex: music keyboard
DSPsystem
Dataproducer
Analysis. Ex: anti-lock brakes
Maybe
MSamples/sec
Maybe
1 Sample/sec
© B. Baas46
DSP Workloads
• Data stream can be considered infinite duration– Length of data stream >> any buffering
– Ex: high-pass filter, automotive collision-detection radar distance measurement system
DSPsystem
1
0
0
1
0
0
1
0
0
1
1
0
0
1
1
0
1
0
0
0
1
1
1
0
……… …
© B. Baas47
DSP Workloads
• Digital signal processing
• Typically very numerically intensive– Lots of +, -, x
DSP
system1
0
0
1
0
0
1
0
0
1
1
0
0
1
1
0
1
0
0
0
1
1
1
0
……… …
© B. Baas48
DSP Compared withAnalog Processing
• Digital signal processing
– Compare with analog signal processing
• If possible in analog domain (at required precision), analog processing will likely require far fewer devices
• If possible in analog domain, either domain may produce the most energy-efficient solution
• Many algorithms are possible only with DSP (arbitrarily high precision, non-causal, …)
• DSP arithmetic is completely stable over process, temperature, and voltage variations
– Ex: 2.0000 + 3.0000 = 5.0000 will always be true as long as the circuit is functioning correctly
© B. Baas49
DSP Compared withAnalog Processing
• Digital signal processing– Compare with analog signal processing
• DSP energy-efficiencies are rapidly increasing
• Once a DSP processor has been designed in a portable format (gate netlist, HDL, software), very little effort is required to “port” (re-target) the design to a different processing technology. Analog circuits typically require a nearly-complete re-design.
• DSP capabilities are rapidly increasing
– Analog A/D speed x resolution product doubles every 5 years
– Digital processing performance doubles every 18-24 months (6x to 10x every 5 years)
© B. Baas50
Common DSP Applications
• Early applications
– Instrumentation
– Radar
– Communication
– Imaging
• Current applications
– Consumer audio, video
– Networking
– Telecommunications
– Machine learning
– Imaging
– Many many more…
© B. Baas51
Consumer Products’ Trends
• Analog based Digital based– Music records, tapes CDs, MP3s
– Video VHS, 8mm DVD, Blu-ray, H.264, H.265
– Telephony analog mobile (1G) digital (4G, LTE,…)
– Television NTSC/PAL digital (DVB, ATSC, ISDB, …)
– Many products use digital data and “speak” digital: computers, networks, digital appliances
• Impacts– Processing
– Transmission
– Storage
– etc.
© B. Baas52
Consumer Products’ Trends
• Analog based vs. Digital based– iphone apps???
© B. Baas53
Future Applications
• Very limited power budgets
• Require significant digital signal processing
© B. Baas54
Key Design Metrics(Means to Compare Multiple Designs)
1) Performancea) Throughput (high); e.g., 250 MSamples/secb) Latency (low); e.g., 2.7 µsec from first sample in -> first outc) Numerical precision
2) Chip area (cost); e.g., mm2 die area, area of standard cell netlist
3) Energy dissipation per workload, e.g., Joules per JPEG image
4) Design complexity
– Design time = lower performance
– Software more important as systems become more complex
5) Suitability for future fabrication technologies
– Many transistors
– Faulty devicesi) During manufacturing processii) device wear out due to effects such as NBTI