Support Across The Board Blackfin Speedway Presentation Core, Memory, and Peripherals

Upload: austin-morris

Post on 28-Dec-2015


Page 1: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Support Across The Board™

Blackfin Speedway Presentation

Core, Memory, and Peripherals

Page 2: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Copyright © Avnet, Inc., Analog Devices, Inc. All rights reserved.

Blackfin as a Convergent Processor

Commonly asked questions:

What makes Blackfin a “convergent” processor?

What architectural features enable convergent processing?

What type of performance can Blackfin achieve from a networking standpoint?

Page 3: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Agenda

• Blackfin “Convergent Processing”

• Blackfin Core Details

– Registers

– ALU, MAC, Shifter

– Sequencer, Pipeline, Event Controller

• Blackfin Memory

– Memory Architecture

– Cache

• Peripherals

– General Peripherals (UART, SPORT, SPI, TWI, WD, RTC)

– Ethernet, CAN

– PPI

– DMA

Page 4: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

What architectural features enable convergent processing?

• Integrated instruction set architecture

– Single instruction set for signal processing and control

• Programmable interrupt levels

– Real-time tasks get the highest priority level

• Memory protection with an MMU

– Regions of memory can be protected from access

• Networked peripherals in addition to high-speed connectivity to ADC, DAC, and video peripherals

• Unified address space and byte addressable

• Support for User and Supervisor modes

• Robust ALU including both signal processing functions and traditional MCU/MPU functions

Page 5: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

What makes Blackfin a Convergent Processor?

• Blackfin has a mature compiler that produces highly optimized code (with an option to produce “dense code” for control applications)

• Blackfin processors come with a full suite of C-based device drivers for peripherals

– Fully documented, common APIs

• Blackfin beats the competition in terms of DSP benchmarks and is on par with ARM on code density benchmarks

• Blackfin is scalable across a broad set of applications

– ADSP-BF531 on the low end

– Dual-core ADSP-BF561 on the high end

• Latest peripheral integration expands connectivity to network-based applications

• Large set of options for OS and kernel support, including uClinux

Page 6: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin ADSP-BF536/537 Architecture

Overview

Page 7: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin Architecture Basics

Core Registers

ALU, MAC, Shifter

Data Addressing Modes

Program Sequencer

Event Controller

Peripherals

Instruction Set Overview

Memory Architecture

Cache

Page 8: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 1

Register File

Page 9: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Accessing Registers

• Blackfin processors are register-intensive devices

– All computations are performed on data contained in registers

– All peripherals are set up using registers

– Memory is accessed using pointers in address registers

• There are two types of Blackfin processor registers

– Core registers

– Memory-mapped registers (MMRs)

Page 10: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin Core Registers

• Core registers are accessed directly by name

– Data Registers: R0-R7

– Accumulator Registers: A0, A1

– Pointer Registers: P0-P5, FP, SP, USP

– DAG Registers: I0-I3, M0-M3, B0-B3, L0-L3

– Cycle Counters: CYCLES, CYCLES2

– Program Sequencer: SEQSTAT

– System Configuration Register: SYSCFG

– Loop Registers: LT[1:0], LB[1:0], LC[1:0]

– Interrupt Return Registers: RETI, RETX, RETN, RETE

Example:

R0 = SYSCFG; // Load data register with contents of SYSCFG register

Page 11: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Core Registers

[Figure: core register map.

Address Registers (32 bits each, bits 31-0): P0-P5, FP, SP, USP; I0-I3, L0-L3, B0-B3, M0-M3.

Data Registers: R0-R7 (32 bits, each split into 16-bit halves Rn.H and Rn.L); accumulators A0 and A1 (40 bits each, made up of AnX, An.H, and An.L).

Sequencer and status registers: LC0/LC1 (Loop Counter), LT0/LT1 (Loop Top), LB0/LB1 (Loop Bottom), ASTAT (Arithmetic Status), RETS (Subroutine Return), RETI (Interrupt Return), RETX (Exception Return), RETN (NMI Return), RETE (Emulation Return), SYSCFG (System Config), SEQSTAT (Sequencer Status).

Shaded registers are only accessible in Supervisor mode.]

Page 12: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Memory-Mapped Registers (MMRs)

• A majority of registers are memory-mapped and must be accessed indirectly

– Core MMRs are used to configure the core registers

• They are listed in Appendix A of the Hardware Reference Manual (HRM)

• All Core MMRs must be accessed with 32-bit reads or writes

– System MMRs are used to configure all other peripherals

• They are listed in Appendix B of the HRM

• Some System MMRs must be accessed with 32-bit reads or writes and others with 16-bit reads or writes (see the HRM for details)

• MMR addresses are defined in header files

– defBF53x.h for assembly

– cdefBF53x.h for C/C++

• MMRs can only be accessed in Supervisor mode

Assembly Example:

P0.H = HI(SPI_RDBR); // load upper 16 bits of SPI Receive Register address into pointer register
P0.L = LO(SPI_RDBR); // load lower 16 bits of SPI Receive Register address into pointer register
R0 = W[P0] (z); // read 16-bit SPI Receive Register (SPI_RDBR) into data register, zero-extended

C/C++ Example:

short temp; // define variable to store contents
temp = *pSPI_RDBR; // read 16-bit SPI Receive Register contents into temp

Page 13: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 2

Arithmetic Logic Units (ALU)

Page 14: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Arithmetic Logic Unit (ALU)

[Figure: Data Arithmetic Unit block diagram. Labels: data registers R0-R7, each split into halves Rn.H and Rn.L; two 16-bit units; four 8-bit units; two 40-bit units with accumulators A0 and A1; a barrel shifter; 32-bit load buses LD0 and LD1 and a 32-bit store bus SD.]

Page 15: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Arithmetic Logic Unit (ALU)

• Two 40-bit ALUs operate on 16-bit, 32-bit, and 40-bit input data and output 16-bit, 32-bit, and 40-bit results.

• Functions

– Fixed-point addition and subtraction

– Addition and subtraction of immediate values

– Accumulation and subtraction of multiplier results

– Logical AND, OR, NOT, XOR, bitwise XOR (LFSR), Negate

– Functions: ABS, MAX, MIN, Round, division primitives

– Supports conditional instructions

• Four 8-bit video ALUs

Page 16: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

40-bit ALU Operations

• 40-bit ALU operations support the following combinations:

– Single 16-Bit Operations

– Dual 16-Bit Operations

– Quad 16-Bit Operations

– Single 32-Bit Operations

– Dual 32-Bit Operations

Page 17: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 3

Multiply-Accumulators (MAC)

Page 18: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Multiply-Accumulators (MAC)

[Figure: Data Arithmetic Unit block diagram. Labels: data registers R0-R7, each split into halves Rn.H and Rn.L; two 16-bit units; four 8-bit units; two 40-bit units with accumulators A0 and A1; a barrel shifter; 32-bit load buses LD0 and LD1 and a 32-bit store bus SD.]

Page 19: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Multiply-Accumulators (MAC)

• Two identical MACs

– Each performs fixed-point multiplication and multiply-accumulate operations on 16-bit fixed-point input data and outputs 32-bit or 40-bit results.

• Functions

– Multiplication

– Multiply-accumulate with addition

– Multiply-accumulate with subtraction

– Dual versions of the above

• Features

– Saturation of accumulator results

– Optional rounding of multiplier results

Page 20: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 4

Barrel-Shifter (Shifter)

Page 21: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Barrel-Shifter (Shifter)

[Figure: Data Arithmetic Unit block diagram. Labels: data registers R0-R7, each split into halves Rn.H and Rn.L; two 16-bit units; four 8-bit units; two 40-bit units with accumulators A0 and A1; a barrel shifter; 32-bit load buses LD0 and LD1 and a 32-bit store bus SD.]

Page 22: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Barrel-Shifter (Shifter)

• Performs bitwise shifting for 16-bit, 32-bit or 40-bit inputs and yields 16-bit, 32-bit, or 40-bit outputs.

• Shift Functions

– Arithmetic Shifts preserve the sign of the original number. The sign bit value back-fills the left-most bit positions vacated by the arithmetic right shift.

– Logical Shifts discard any bits shifted out of the register and back-fill vacated bits with zeros.

Page 23: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Barrel-Shifter (Shifter)

• Additional Functions

– Rotate: Rotates a registered number through the CC bit a specified distance and direction.

– Bit Operations – Set, Clear, Toggle, Test

– Field Extract and Deposit

Page 24: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 5

Data Addressing Modes

Page 25: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Address Registers

[Figure: address register map, each register 32 bits wide (bits 31-0): DAG registers I0-I3, L0-L3, B0-B3, M0-M3; pointer registers P0-P5, FP, SP, USP.]

One set of 32-bit general-purpose Pointer registers P0-P5, SP and FP

One set of 32-bit DSP buffer addressing registers I0-I3, B0-B3, L0-L3, M0-M3

All addresses are byte addresses into a 4 GB address space

SP points to supervisor stack in Supervisor mode and user stack in User mode

USP is accessible in supervisor mode only – Allows access to user stack location while in Supervisor mode

Page 26: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Addressing Methods

• Register Indirect Addressing

– Index Registers (32-bit and 16-bit accesses)

– Pointer Registers P0 – P5 (32-bit, 16-bit, and 8-bit accesses)

– Stack and Frame Pointer Registers (32-bit accesses)

• Types of address pointer modification

– Modify/Post-Modify

• Linear addressing

• Circular buffering / modulo addressing

– Enables automatic maintenance of pointers to stay within bounds of a circular buffer

• Bit-Reversal (Modify only)

– Pre-Modify with update (using Stack Pointer)

– Pre-Modify without update

Page 27: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Linear vs Circular Buffering

• Linear Buffer Access

– Index (I0:3) registers hold the address sent out on the address bus.

– Length (L0:3) register set to 0, thus disabling circular buffering.

• Default for C compiler

• Provisions in compiler to allow circular buffers

– Modify (M0:3) registers contain the value (positive or negative) that is added to the I registers at the end of each memory access.

• Circular Buffer Access

– Base (B0:3) registers contain the circular buffer’s start address.

– Length (L0:3) register set to length of circular buffer.

– Modify (M0:3) value must be less than or equal to the length of the circular buffer.

– Indexing wraps around (Length is subtracted from the Index) when an Index modification reaches or exceeds Base + Length

Page 28: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Circular Buffer Example

Address   Data         Access
0x00      0x00000001   1st Access
0x04      0x00000002   4th Access
0x08      0x00000003
0x0C      0x00000004
0x10      0x00000005   2nd Access
0x14      0x00000006   5th Access
0x18      0x00000007
0x1C      0x00000008
0x20      0x00000009   3rd Access
0x24      0x0000000A
0x28      0x0000000B

• Base Address and Starting Index Address (B0 = 0; I0 = 0;)

• Buffer Length is 44 (L0 = 44;)

– There are 11 data elements and each data element is 4 bytes

• Modify Value is 16 (M0 = 16;)

– 4 elements * 4 bytes/element

Page 29: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 6

Program Sequencer

Page 30: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Program Sequencer Features

• Controls all program flow

• Contains a 10-stage instruction pipeline

• Maintains in-program branching

– Subroutines

– Jumps

– Interrupts and Exceptions

• Maintains loops

– Includes zero-overhead loop registers

– No cost for wrapping from loop bottom to loop top

Page 31: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin Execution Pipeline

• 10-stage super-pipeline

• Sequencer ensures that the pipeline is fully interlocked and that all the data hazards are hidden from the programmer

• If executing an instruction that requires data to be fetched, the pipeline will stall until that data is available

– See EE-197 application note for a complete list of stalls and multi-cycle instructions: http://www.analog.com/ee-notes

Page 32: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Avoiding Pipeline Stalls

Most common numeric operations have no instruction latency

VisualDSP++ Pipeline Viewer highlights Stall and Kill conditions

Page 33: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Sequencer-Related Registers

Page 34: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 7

Event Controller

Page 35: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Events (Interrupts / Exceptions)

• The Event Controller manages 5 types of Events

– Emulation (via external pin)

– Reset (via SW or external pin)

– Non-Maskable Interrupt (NMI) - for events that require immediate processor attention (via SW, external pin, or Watchdog)

– Exception

– Interrupts

• Hardware Error

• Core Timer

• 9 General-Purpose Interrupts for servicing peripherals

– Can be custom prioritized for optimal system performance

• All events can be serviced by Interrupt Service Routines (ISR)

Page 36: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Interrupts vs. Exceptions

INTERRUPTS

• Hardware-generated

– Asynchronous to program flow

– Requested by a peripheral

• Software-generated

– Synchronous to program flow

– Generated by RAISE instruction

• All instructions preceding the interrupt in the pipeline are killed

EXCEPTIONS

• Service Exception

– Return address (RETX) is the address following the excepting instruction

– Never re-executed

– EXCPT instruction is in this category

• Error Condition Exception

– Return address (RETX) is the address of the excepting instruction

– Excepting instruction will be re-executed

The Blackfin is always in Supervisor Mode while executing Event Handler software and can be in User Mode only while executing application tasks.

Page 37: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

BF533 System and Core Interrupt Controllers

Core events (priority runs from highest at the top to lowest at the bottom):

Event Source              IVG #   Core Event Name
Emulator                  0       EMU
Reset                     1       RST
Non Maskable Interrupt    2       NMI
Exceptions                3       EVSW
Reserved                  4       -
Hardware Error            5       IVHW
Core Timer                6       IVTMR
General Purpose 7         7       IVG7
General Purpose 8         8       IVG8
General Purpose 9         9       IVG9
General Purpose 10        10      IVG10
General Purpose 11        11      IVG11
General Purpose 12        12      IVG12
General Purpose 13        13      IVG13
General Purpose 14        14      IVG14
General Purpose 15        15      IVG15

System interrupts:

System Interrupt Source            IVG # (1)
PLL Wakeup interrupt               IVG7
DMA error (generic)                IVG7
PPI error interrupt                IVG7
SPORT0 error interrupt             IVG7
SPORT1 error interrupt             IVG7
SPI error interrupt                IVG7
UART error interrupt               IVG7
RTC interrupt                      IVG8
DMA 0 interrupt (PPI)              IVG8
DMA 1 interrupt (SPORT0 RX)        IVG9
DMA 2 interrupt (SPORT0 TX)        IVG9
DMA 3 interrupt (SPORT1 RX)        IVG9
DMA 4 interrupt (SPORT1 TX)        IVG9
DMA 5 interrupt (SPI)              IVG10
DMA 6 interrupt (UART RX)          IVG10
DMA 7 interrupt (UART TX)          IVG10
Timer0 interrupt                   IVG11
Timer1 interrupt                   IVG11
Timer2 interrupt                   IVG11
PF interrupt A                     IVG12
PF interrupt B                     IVG12
DMA 8/9 interrupt (MemDMA0)        IVG13
DMA 10/11 interrupt (MemDMA1)      IVG13
Watchdog Timer Interrupt           IVG13

(1) Note: Default IVG configuration shown.

Page 38: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Event Processing Flow

Page 39: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Interrupt Service Routine (ISR)

• ISR address is stored in the Event Vector Table

– Used as the next fetch address when the event occurs

• Program Counter (PC) address is saved to a register

– RETI, RETX, RETN, RETE, based on event

• Always concludes with “Return” Instruction

– RTI, RTX, RTN, RTE (respectively)

– When executed, PC is loaded with address stored in RETI, RETX, RETN, or RETE to continue app code

• Optional nesting of higher-priority interrupts possible

– See appnote EE-192, which covers writing interrupt routines in C (http://www.analog.com/ee-notes)

Page 40: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 8

Blackfin Peripherals

Page 41: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Peripherals and Power Management

Common Peripherals (All Blackfins)

• SPI, UART, SPORT, WD, RTC

• PPI

BF534/BF536/BF537 Peripherals

• TWI, CAN

BF536/BF537 Peripheral

• Ethernet

DMA and Handshake DMA

Power Manager

Page 42: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Three Serial Communication Peripherals

• SPI (Serial Peripheral Interface)

– High-Speed SPI port (up to SCLK/4, max 33.25 MHz)

• Master/Slave compatible with control of up to 7 slave-selects

• Single-Duplex DMA (Either TX or RX)

– Typically used to interface with serial EPROMs, CPUs, converters, and displays

• UART (Universal Asynchronous Receiver/Transmitter)

– PC-style UART port (baud rate up to SCLK/16, max 8.3125 MHz)

• Supports half-duplex IrDA SIR (9.6/115.2 Kbps rate)

• Autobaud detection support through the use of the Timers

• Separate TX and RX DMA support

– Typically used for maintenance port or interfacing with slow serial peripherals

• SPORTs (Synchronous Serial Ports)

– High Speed Serial Port (up to SCLK/2, max 66.5 MHz)

• Variable word length support (3 - 32 bits)

• I2S-Compatible

• Separate TX and RX DMA support

• 128 Channels out of 1024-Channel Window for TDM support

• Primary and Secondary Data channels

– Typically used for interfacing with CODECs and TDM data streams

Page 43: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Real-Time Clock Features

• Used to implement real-time watch or “life counter”

– Time of day, alarm, stopwatch count-down, and elapsed time since last system reset

• Uses four counters - Seconds, Minutes, Hours, Days

• Equipped with two alarm features

– Daily and Day-And-Time

• Uses dedicated 32.768 kHz crystal to RTXI / RTXO

– Can be pre-scaled to 1 Hz to count in real-time seconds

• Uses dedicated power supply pins

– Independent of any reset

• Can take processor out of all low-power states

Page 44: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

PPI – What is it?

• Parallel Peripheral Interface

– Programmable bus width (from 8 – 16 bits in 1-bit steps)

– Bidirectional (half-duplex) parallel interface

– Synchronous Interface

• Interface is driven by an external clock (“PPI_CLK”)

• Up to 66MHz rate (SCLK/2)

• Asynchronous to SCLK

– Includes three frame syncs to control the interface timing

– Applications

• Driving LCD Interface

• General Purpose Interface to outside world

• High speed data converters

• Video CODECs

Page 45: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

TWO-WIRE INTERFACE (TWI)

• Fully compliant to the Philips I2C bus protocol

– See Philips I2C Bus Specification version 2.1

• 7-bit addressing

• 100 Kb/s (normal mode) and 400 Kb/s (fast mode) data rates

• General call address support

• Supports Master and Slave operation

– Separate receive and transmit FIFOs

• SCCB (Serial Camera Control Bus) support

– Only in Master mode

• Slave mode cannot be used because the TWI controller always issues an Acknowledge in slave mode

Page 46: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Controller Area Network (CAN)

• Adheres fully to CAN V2.0B standard

– Supports both standard (11-bit) and extended (29-bit) Identifiers

– Data Rates up to 1 Mbit/second

• 32 Configurable Mailboxes

– 8 dedicated transmitters and 8 dedicated receivers

– 16 configurable (transmit or receive)

• Dedicated Acceptance Mask for each Mailbox

• Data Filtering (first two bytes) can be used for Acceptance Filtering

• CAN wakeup from Hibernation (lowest static power consumption) Mode

• CAN Protocol Stacks

– Automotive: CAN drivers and protocol stacks through Vector CANtech

– Industrial: Leading third parties will provide a full Industrial suite for CANOpen, DeviceNet, etc.

Page 47: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

ADSP-BF536/537 Family Ethernet MAC Features

ADSP-BF536/537 Ethernet MAC has advanced features beyond IEEE 802.3:

For improved performance:

– Automatic Checksum Computation for IP Header and Payload on RX Frames

– Programmable RX Data Alignment Mode for 32-bit Alignment

– Independent RX & TX DMA Channels with Delivery of Frame Status to Memory

– System Wakeup on Magic Packet for 4 User-Definable Wakeup Frame Filters

For lower overall system cost:

– No PHY XTAL required: buffered XTAL output from processor feeds PHY

– Connection to either MII or RMII PHY

ADSP-BF536/537 enhances throughput and dataflow via these features:

– Enhanced DMA channels allow for processor core independence

– Direction Control to exploit SDRAM physics

– Four SDRAM rows can be ‘open’ at any given time

ADSP-BF536/537 overall networking bandwidth:

– Full 100 Mbps wire speed on 1400-bit payload with an optimized networking stack

• UDP: ~44% processor core loading

• TCP/IP: ~75% processor core loading

Page 48: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

ADSP-BF536/537 DMA Enhancements

• 4 additional DMA channels

– All 12 peripheral DMA channels can be assigned to any of the peripherals

• Provides MAC further control over the assigned DMA channels

– Can reload DMA registers if incorrect checksum is detected

• Two External Handshaking Memory DMA Controllers

– Good for asynchronous FIFOs or off-chip interface controllers between Blackfin memory and hardware buffers

Page 49: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin – Dynamic Power Management Increases Battery Life

Variable Frequency

Clock dividers (1x to 63x) enable low latency changes in system performance

Variable Voltage

On-Chip Voltage Regulator generates accurate voltage from 2.25 – 3.6 V input

Core voltage programmable from 0.8 V to 1.2 V (50 mV increments)

Maximum 40 µs latency for PLL to relock (Frequency or Voltage changes)

System Cost Reduction

[Figure: Power (mW) chart comparing "Frequency Only" scaling with "Voltage & Frequency" scaling across Video Processing and Audio Processing workloads; the gap between the two is labeled "Power Savings". Labeled points: 600 MHz, 1.2 V, 264 mW; 500 MHz, 1.2 V; 500 MHz, 1.0 V; 200 MHz, 1.2 V, 156 mW; 200 MHz, 0.8 V, 90 mW.]

Page 50: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 9

Instruction Set Overview

Page 51: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Instruction Set Description

• Full-featured flexible multifunction instructions

• Employs an algebraic-style syntax

• Optimized to allow access to many of the processor core resources within a single instruction

• Compiled C and C++ source code makes optimal use of instructions

• Format designed for ease of coding and readability

• Tuned to generate dense code (small memory size footprint)

Page 52: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin Assembly Language Features

• Multi-issue load/store modified-Harvard architecture supports

– Two 16-bit MAC or four 8-bit ALU + two load/store + two pointer updates per cycle.

• Unified 4G byte memory space

– All registers, I/O, and memory are mapped into a unified 4G byte memory space

– Providing a simplified programming model

• Microcontroller features:

– Arbitrary bit and bit-field manipulation, insertion, and extraction

– Integer operations on 8-, 16-, and 32-bit data-types

– Separate user and supervisor stack pointers

• Code density enhancements

– Intermixing of 16- and 32-bit instructions (no mode switching, no code segregation)

– Frequently used instructions are encoded in 16 bits.

Page 53: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin DSPs Code Density

Instruction Set Tuned for Compact Code

– Multi-length Instructions

• 16, 32-bit Opcodes

• Limited Multi-Issue

– Compact Call/Return

No Memory Alignment Restrictions for Code

– Transparent Alignment HW

– Blackfin Supports 16 and 32-bit Memory Systems

[Figure: Instruction Formats. 16-bit OP, 32-bit OP, and 64-bit Multi-OP Packet shown packed into 16-bit wide memory (bits 15-0) and 32-bit wide memory (bits 31-0).]

No Memory Alignment Restrictions: Maximum Code Density and Minimum System Memory Cost

Page 54: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin Code Density Features

Free intermixing of 16/32-bit instructions - no mode switching, no code segregation

Frequently used instructions encoded as 16-bits

3-bit register fields

Conditional moves

Push/Pop multiple registers

Three operand instructions

Single condition bit and evaluation

Page 55: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin Dual Operational Model

A DSP with a RISC instruction set and a MMU, an event controller and a wide range of peripherals

Data Movement: LD, ST, 8, 16, 32 bits; Unsigned, Sign-extend; Register moves, P-D-DAG; Push, Pop, Push/Pop mult; CC to dreg, etc.

Addressing Modes: Auto incr, Auto decr; Pre-decr store on SP; Indirect; Indexed w/immed offset; Post-incr w/ nonunity stride; Byte addressable

Program Control: BRCC, UJUMP, Call, RETS, Loop Setup

Arithmetic: +, -, *, /, >>>, Negate; 2 and 3 operand instructs

Logical: AND, OR, XOR, NOT; BIT tst, set, tgl, clr; CC ops; <<, >>

Video: SAA; Byteops: Residual calc, Spatial Interpolation, Spatial Filter

Cache Control: Prefetch, Flush

• Supervisor/user modes

• Memory management

• Wide range of peripherals

• Event control

Page 56: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Blackfin MicroController Features

Arbitrary bit and bit-field manipulation, insertion and extraction

Integer operations on 8/16/32-bit data-types

Memory protection and separate user and supervisor stack pointers

Scratch SRAM for context switching

Population and leading digit counting

Byte addressing DAGs

Compact Code Density

Page 57: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

Section 10

Blackfin Memory

Page 58: Support Across The Board ™ Blackfin Speedway Presentation Core, Memory, and Peripherals

ADSP-BF536/7 at a Glance

[Block diagram. Labels: Blackfin Processor; L1 Instruction; L1 Data A; L1 Data B; 64 bit; DMA; PLL VCO; 1:64X; 25 MHz XTAL; Enet PHY; 25 MHz; Enet; Data; SDRAM; Ext Bus w/ direction control; 16; 32; 131 MHz; 525 MHz]

• Rows are "open" in 4 SDRAM banks, which reduces page activation
• No need for a second XTAL
• 4 sub-banks allow 2 core accesses at the same time as a DMA access
• 2 core fetches, or 1 fetch and 1 store
• Max bandwidth 266 MB/sec
• Makes best use of SDRAM
• Large enough to run application code
• Cache available if operations from SDRAM are desired
• Programmable frequency and voltage control
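
The clock and bandwidth figures on this slide can be related with a little arithmetic. The sketch below is one possible reading, not a statement of how the figures were derived: it assumes "1:64X" is an integer PLL multiplier range of 1 to 64, that the 131 MHz clock comes from a divide-by-4 of 525 MHz, and that 266 MB/sec corresponds to a 16-bit bus clocked at 133 MHz. All names are hypothetical.

```python
XTAL_HZ = 25_000_000              # crystal frequency shown on the slide
MULTIPLIER_RANGE = range(1, 65)   # assumed meaning of the "1:64X" label


def pll_multiplier(target_hz, xtal_hz=XTAL_HZ):
    """Return the integer multiplier that yields target_hz, or None if none does."""
    for m in MULTIPLIER_RANGE:
        if xtal_hz * m == target_hz:
            return m
    return None


def peak_bandwidth_mb_per_s(bus_clock_hz, bus_width_bits):
    """Peak rate in MB/s, assuming one transfer per bus clock cycle."""
    return bus_clock_hz * (bus_width_bits // 8) / 1_000_000


core_multiplier = pll_multiplier(525_000_000)            # 25 MHz x 21
system_clock_hz = 525_000_000 / 4                        # assumed divide-by-4
sdram_peak = peak_bandwidth_mb_per_s(133_000_000, 16)    # assumed 133 MHz, 16-bit bus
```

Under these assumptions a single 25 MHz crystal can supply the Ethernet PHY and the PLL, which is consistent with the "no need for a second XTAL" callout.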


Memory Hierarchy on the Blackfin

• As processor speeds increase (300 MHz – 1 GHz), it becomes increasingly difficult to have large memories running at full speed.

• The BF5xx uses a memory hierarchy with a primary goal of achieving memory performance similar to that of the fastest memory (i.e. L1) with an overall cost close to that of the least expensive memory (i.e. L2).

• CORE (Registers)
• L1 Memory: internal, smallest capacity, single-cycle access
• L2 Memory: external, larger capacity, higher latency
• L3 Memory: external, largest capacity, highest latency
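
The goal of the hierarchy can be illustrated with the usual average-access-time estimate for two levels. This is a minimal sketch; the hit rate and cycle counts in the usage note are hypothetical and are not Blackfin measurements.

```python
def average_access_cycles(l1_hit_rate, l1_cycles, miss_penalty_cycles):
    """Average cycles per access when misses in L1 pay an extra penalty."""
    return l1_cycles + (1.0 - l1_hit_rate) * miss_penalty_cycles
```

With a single-cycle L1, a hypothetical 16-cycle miss penalty, and a 0.875 hit rate, `average_access_cycles(0.875, 1, 16)` gives 3 cycles on average; as the hit rate approaches 1.0 the average approaches the L1 access time.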


Memory Architecture: The Basics

[Block diagram of Core, L1 Instruction Memory, L1 Data Memory, Unified L2, Unified L3 External Memory, and DMA, with On-chip and Off-chip regions marked]

• Core: >600 MHz
• L1 Instruction Memory and L1 Data Memory: >600 MHz, single cycle to access, 10s of Kbytes
• Unified L2: >300 MHz, several cycles to access, 100s of Kbytes
• Unified L3 (External Memory): <133 MHz, several system cycles to access, 100s of Mbytes


Configurable Memory

• Best system performance can be achieved when executing code or fetching data out of L1 memory.
• Two methods can be used to fill L1 memory: caching and dynamic downloading. Blackfin processors support both.
– General-purpose processors have typically used the caching method, as they often have large programs residing in external memory and determinism is not as important.
– DSPs have typically used dynamic downloading, as they need direct control over which code runs in the fastest memory.
• Blackfin processors allow the programmer to choose one or both methods to optimize system performance.


What is Cache?

• In a hierarchical memory system, cache is the first level of memory reached once the address leaves the core (i.e. L1).
– If the instruction/data word (8, 16, 32, or 64 bits) that corresponds to the address is in the cache, there is a cache hit and the word is forwarded to the core from the cache.
– If the word that corresponds to the address is not in the cache, there is a cache miss. This causes a fetch of a fixed-size block (which contains the requested word) from the main memory.
• The Blackfin allows the user to specify which regions (i.e. pages) of main memory are cacheable and which are not through the use of CPLBs (more on this later).
– If a page is cacheable, the block (i.e. cache line containing 32 bytes) is stored in the cache after the requested word is forwarded to the core.
– If a page is non-cacheable, the requested word is simply forwarded to the core.
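
The hit, miss, and cacheability rules above can be sketched as a small software model. This is an illustration only, under loud assumptions: the class and its fields are hypothetical, cacheable pages are given as plain address ranges rather than real CPLB entries, and the model has unlimited capacity (no sets, ways, or eviction).

```python
LINE_BYTES = 32  # cache line size stated on the slide


class SimpleCache:
    """Toy read-only cache model with per-range cacheability."""

    def __init__(self, main_memory, cacheable_ranges):
        self.main_memory = main_memory            # bytes-like backing store
        self.cacheable_ranges = cacheable_ranges  # list of (start, end) byte ranges
        self.lines = {}                           # line base address -> line bytes
        self.hits = 0
        self.misses = 0

    def _is_cacheable(self, addr):
        return any(start <= addr < end for start, end in self.cacheable_ranges)

    def read_byte(self, addr):
        base = addr - (addr % LINE_BYTES)
        if base in self.lines:
            # Hit: the byte is forwarded from the cached line.
            self.hits += 1
            return self.lines[base][addr - base]
        # Miss: the byte comes from main memory.
        self.misses += 1
        if self._is_cacheable(addr):
            # Cacheable page: keep the whole 32-byte line for later accesses.
            self.lines[base] = bytes(self.main_memory[base:base + LINE_BYTES])
        return self.main_memory[addr]
```

Reading two neighboring bytes in a cacheable range costs one miss and then one hit; the same pattern in a non-cacheable range costs two misses, because nothing is retained.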


Cache Hits and Misses

• A cache hit occurs when the address for an instruction fetch request from the core matches a valid entry in the cache.

• A cache hit is determined by comparing the upper 18 bits, and bits 11 and 10 of the instruction fetch address to the address tags of valid lines currently stored in a cache set.

• Only valid cache lines (i.e. cache lines with their valid bits set) are included in the address tag compare operation.

• When a cache hit occurs, the target 64-bit instruction word is sent to the instruction alignment unit where it is stored in one of two 64-bit instruction buffers.

• When a cache miss occurs, the instruction memory unit generates a cache line-fill access to retrieve the missing cache line from external memory to the core.
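
The tag compare described above can be sketched by splitting a 32-bit fetch address into fields. The tag composition (upper 18 bits plus bits 11 and 10) comes from the slide; using bits 13 and 12 to select a 4 KB sub-bank and bits 9 to 5 to select one of 32 sets is an assumption made here so that the fields account for all 32 bits. The function name is hypothetical.

```python
def split_fetch_address(addr):
    """Split a 32-bit instruction fetch address into cache lookup fields."""
    addr &= 0xFFFFFFFF
    upper18 = addr >> 14                 # bits 31..14
    bits_11_10 = (addr >> 10) & 0x3      # bits 11..10
    return {
        "tag": (upper18 << 2) | bits_11_10,  # 20 bits compared against valid lines
        "sub_bank": (addr >> 12) & 0x3,      # assumed: bits 13..12 pick a 4 KB sub-bank
        "set": (addr >> 5) & 0x1F,           # assumed: bits 9..5 pick one of 32 sets
        "offset": addr & 0x1F,               # byte within the 32-byte line
    }
```

Two addresses hit the same cache line only if they agree on the tag, sub-bank, and set fields; the offset then selects the bytes within the line.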


L1 Instruction Memory 16KB Configurable Bank

[Diagram: four 4 KB sub-banks connected to the Instruction bus, DCB (DMA), and EAB (cache line fill)]

16 KB cache
• 4-way set associative with arbitrary locking of ways and lines
• LRU replacement
• No DMA access

16 KB SRAM
• Four 4 KB single-ported sub-banks
• Allows simultaneous core and DMA accesses to different banks
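
The 4-way, LRU-replacement behavior listed for the cache configuration can be sketched for a single set as follows. This is a simplified software model, not the hardware's implementation: the class name is hypothetical, and locking of ways and lines is not modeled.

```python
from collections import OrderedDict


class LruSet:
    """One cache set with LRU replacement (way/line locking not modeled)."""

    def __init__(self, ways=4):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> None, ordered oldest to newest

    def access(self, tag):
        """Return True on a hit, False on a miss (which fills a line)."""
        if tag in self.lines:
            self.lines.move_to_end(tag)         # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)      # evict the least recently used line
        self.lines[tag] = None
        return False
```

Accessing tags 1, 2, 3, 4 fills the four ways; touching tag 1 again is a hit, and a new tag 5 then evicts tag 2, the least recently used line.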


L1 Data Memory 16KB Configurable Bank

[Diagram: four 4 KB sub-banks connected to the Data 0 and Data 1 buses, DCB (DMA), and EAB (cache line fill)]

Block is multi-ported when:
• Accessing different sub-banks
OR
• Making one odd and one even access (address bit 2 different) within the same sub-bank

• When used as cache
– Each bank is 2-way set-associative
– No DMA access
– Allows simultaneous dual DAG access
• When used as SRAM
– Allows simultaneous dual DAG and DMA access
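
The multi-port condition for two simultaneous data accesses can be expressed as a small predicate. This is a sketch under assumptions: both addresses fall in the same 16 KB bank, and the sub-bank is taken from the address as consecutive 4 KB ranges (the slide does not state the mapping). Function names are hypothetical.

```python
SUB_BANK_BYTES = 4 * 1024  # sub-bank size shown on the slide


def sub_bank(addr):
    """Assumed mapping: consecutive 4 KB ranges map to sub-banks 0..3."""
    return (addr // SUB_BANK_BYTES) % 4


def can_access_together(addr0, addr1):
    """True if two accesses in the same 16 KB bank avoid a sub-bank conflict."""
    if sub_bank(addr0) != sub_bank(addr1):
        return True                                  # different sub-banks
    return (((addr0 ^ addr1) >> 2) & 1) == 1         # same sub-bank: bit 2 must differ
```

For example, accesses to offsets 0x0000 and 0x0004 can proceed together because address bit 2 differs, while 0x0000 and 0x0008 fall in the same sub-bank with the same bit 2 and would conflict.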