Embedded Software for Radar Signal Processing Applications

Michael Nolin, February 22, 2011

Table of Contents
1 LRR Hardware
2 VisionMid, DM8148 Centaurus Devices
3 System Interconnect
4 System Design Approach
    4.1 Critical Timing Requirements
    4.2 Architecture Hardware Features and Application Notes.
    4.3 Estimate of Performance Requirements.
    4.4 Measuring Performance
5 TI DSP + ARM Software Development Resources
    5.1 Starting Software Development: “Quick Start Guide Installation Guide”, “Floating-Point Starter Kit”
6 System Software Architecture
    6.1 ARM Cortex A8 core processor with Neon™
        6.1.1 ARM Neon™
    6.2 Unix System V Architecture
    6.3 Inter-Process Communication (IPC)
        6.3.1 System V IPC
        6.3.2 System V STREAMS
        6.3.3 POSIX, pthreads
        6.3.4 Sockets, Pipes
        6.3.5 uClinux and IPC
        6.3.6 OMAP-L1x MSGQ (IPC) and DSPLIB/DSPLink
            6.3.6.1 DSPLINK
        6.3.7 DSPLINK Summary
        6.3.8 Host Port Interface
        6.3.9 Radar Data and the Host Port Interface
    6.4 Signal Processing for Long Range Radar
    6.5 Texas Instruments DSP BIOS → SYS BIOS
        6.5.1 DSPBIOS 6.x
    6.6 Texas Instruments FastRTS Library
    6.7 Applying New Technology:
    6.8 Software Libraries
    6.9 Linear Math Libraries cblas_zgemm and zgemm:
    6.10 OMAP-L137
        6.10.1 JTAG
7 References:
    7.1 Texas Instruments
    7.2 Texas Instruments Vision & VisionMid
    7.3 SAAB (Microwave)
8 Appendix
9 Linux and the Texas Instruments c6x family
10 Software as Texas Instruments redefines it.
11 ARC5-B Hardware
12 OMAP Software Resources.
13 VisionMid TMS320DM814x Software Resources
14 Pseudo Code


Illustration Index
Illustration 1: DSP/BIOS Link Architecture
Illustration 2: LRR Hardware Block Diagram
Illustration 3: VisionMid Functional Block Diagram Feb 2010
Illustration 4: System Interconnect
Illustration 5: Texas Instruments ARM - DSP Layered Architecture
Illustration 6: Texas Instruments Inter-OS communications
Illustration 7: Texas Instruments DSP/BIOS Link Architecture
Illustration 8: GPP-DSP connectivity through DSP/BIOS LINK
Illustration 9: Architecture RingIO Transfer, Shared Memory
Illustration 10: CCStudio MSGQ configuration
Illustration 11: DSP/LINK Message Queue MSGQ
Illustration 12: Initial ARC5-B Digital Signal Processing System
Illustration 13: DSP hardware accelerator or algorithm co-processing engine
Illustration 14: DSP algorithm co-processing engine and external peripherals
Illustration 15: OMAP-L137, running “Example DSPLIB/DSPLink Application on OMAP-L1x”.
Illustration 16: BIOS PSP Users Guide (OMAP-L137) block driver
Illustration 17: BIOS PSP driver with streaming interface (OMAP-L137)

Scope: The initial motivation for this document was a request to explain “IPC” as it would be required to demonstrate the capture of basic radar data on the ARC5-B digital signal processing board. Inter-Processor Communication is difficult to explain outside of its operating system context and the related computer architecture building blocks. As a result, this document continues to expand into general software and architecture concepts, for which more suitable and organized explanations may be found in the numerous references.

Embedded real-time operating systems have continued to evolve over several decades, from simple micro-controllers to the multi-core systems being used today for image and signal processing systems of all types. The reference manuals, user's guides, and texts are invaluable to the understanding of embedded architecture building blocks. In many cases freely available software, operating systems, and documentation have been leading the way in 64-bit, multi-core, RF, and hand-held user applications.

The hardware designs for radar signal processing to date have been implementations of Texas Instruments floating-point DSPs. As a result, much of the hardware and software integration is also suitably described on www.ti.com and http://processors.wiki.ti.com.


Texas Instruments DaVinci/OMAP System Design Workshop with Linux, “(Optional) Introduction to DSP/BIOS Link”, page 15-13. The DSP/BIOS Link architecture is being carried forward into the latest hardware and PSP releases as the SYS/BIOS Link architecture; the soon-to-be-released DM8148 (“March-April”) continues the DaVinci model.

Texas Instruments consistently uses a hierarchical architecture description for its latest SoC devices and publications. Hierarchical definitions are useful in that they closely model working hardware designs. Often these definitions can be layered one upon another: silicon, hardware/board level, embedded software, platform and operating systems (Linux and UNIX), through user application programs and displays.

“The hardware design of a chip typically has a tree-like hierarchy--the chip has several logical blocks, each block contains many registers, and each register has multiple bitfields,....”

This hierarchical method is often used by silicon, FPGA, and ASIC designers for design specifications. Some design tools, such as the Xilinx FPGA synthesis package, directly implement a hierarchical method in the toolset.

“An ideal hardware specification tool provides: Formatting that easily represents the hierarchical structure of the hardware specification data.”

The Internet follows a hierarchical architecture. Network and protocol stacks follow the hierarchical 7-layer Open Systems Interconnection (OSI) model, whose layers have well-known names.

Illustration 1: DSP/BIOS Link Architecture


1 LRR Hardware
The LRR Hardware Block Diagram (Illustration 2) summarizes the hardware design¹. Central to the processing application is the latest available VisionMid device.

1 Reference hardware documentation “LRR Block Diagram with Vision Mid Application processor”, LRR_Block_Diagram.doc


Illustration 2: LRR Hardware Block Diagram


2 VisionMid, DM8148 Centaurus Devices
Introducing:
• VICP: Video Image Co-Processor Engine with High-Definition.
• ISS: Imaging Subsystem


Illustration 3: VisionMid Functional Block Diagram Feb 2010


The VisionMid Functional Block Diagram (Texas Instruments Vision & VisionMid) introduces Media Controller, Video Processing, and Imaging functional blocks, resulting in a redesign of the “System Interconnect”.


Illustration 4: System Interconnect


3 System Interconnect

A switch fabric architecture provides the interconnection between the various processing subsystems (ARM, DSP, VICP, ISS, ...). Not all 'initiators' are connected to all peripheral I/O targets; a table of “Target Initiator Connectivity” is provided. Further documentation for the System Interconnect architecture is not yet available (TBD, Q1 2011). With the additional processing devices, a more comprehensive system interconnect architecture has replaced the previous DaVinci and OMAP device support, where only ARM core to DSP co-processor support was needed.

4 System Design Approach

1. “Identify critical timing or CPU performance requirements.”²

2. Check vendor application notes or examples of similar applications. If none exist, identify architecture hardware features that can be leveraged. This can indicate how far an application can be optimized.

3. Roughly estimate performance requirements (MIPS, FLOPS). Precise calculations may not be necessary when the variations in performance are large.

4. For small performance gaps (10-20%) on major components of the application, implementation-specific test models should be measured using the provided Software Development Kit (SDK).

4.1 Critical Timing Requirements
• 12-bit A/D converters with 80 ns conversion time; this conversion time is determined by the TMS320F28335 in current designs. Also under consideration are 14- and 16-bit A/D converters; 24-bit converters are sometimes implemented in high-end audio equipment.
• 5 ns software real-time requirement for Sequencer control; this may be addressed with a PWM??
• 1 Mbps serial CAN bus interface for communicating 'target' information, 50 ms.
• 7040 256-point IFFTs per sweep for an 8-antenna array; 40-75 ms sweeps? Arrays of 4, 6, 8, 9 and 12 (12x64) antennas are being considered.
• 1 ms DSP algorithm goal for processing between radar sweeps.
• At 40 ms per sweep, an estimated 75%, or 30 ms, is available to process ESPRIT angle estimation. With 10 targets per sweep, this leaves 3 ms to process each target.³

• 16-bit (½ speed) memory bus differs significantly from the reference hardware implementation and measurements.
• 150 MHz F28335 (6.67 ns cycle time); 300 MHz 6747 (3.33 ns cycle time).
• 7 MHz SPI slave limitations (TI-specific design constraints).

Performance goals and critical timing requirements continue to emerge. Vendors have offered an array of options, from 100 MHz devices through multi-core 1 GHz devices with hardware acceleration. Cost goals continue to drive component selection.
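The budget figures in this section can be cross-checked with a few lines of arithmetic. The following is a minimal sketch and not part of the radar software: the function names are hypothetical, and the inputs (40 ms sweep, 75% share for ESPRIT, 10 targets, 150 MHz and 300 MHz clocks, 5 ns Sequencer requirement) are the values quoted above.

```python
def per_target_budget_ms(sweep_ms, processing_fraction, targets):
    """Milliseconds available per target within one sweep."""
    return sweep_ms * processing_fraction / targets


def cycle_time_ns(clock_mhz):
    """Duration of one CPU clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz


def cycles_in(duration_ns, clock_mhz):
    """Number of CPU clock cycles that fit in duration_ns."""
    return duration_ns / cycle_time_ns(clock_mhz)


if __name__ == "__main__":
    budget_ms = per_target_budget_ms(40.0, 0.75, 10)
    for name, mhz in (("F28335", 150.0), ("6747", 300.0)):
        print(name, round(cycle_time_ns(mhz), 2), "ns/cycle,",
              round(cycles_in(budget_ms * 1e6, mhz)), "cycles per target")
    print("5 ns on the 6747:", round(cycles_in(5.0, 300.0), 2), "cycles")
```

Under these figures the 3 ms per-target budget corresponds to roughly 900,000 cycles of the 300 MHz 6747, while 5 ns is about 1.5 cycles. That suggests the Sequencer requirement is unlikely to be met in software alone, which is consistent with the PWM question raised in the list.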

2 Embedded Systems Design, October 2010, “Why MIPS is just a number”, Gaurang Kavaiya
3 August 4, 2010 email “Timing for TI functions”.


4.2 Architecture Hardware Features and Application Notes.
The clearly titled “Example DSPLIB/DSPLink Application on OMAP-L1x” addresses both processor-to-processor communication and key benchmarks specific to the radar algorithm being developed. The example provides a valuable working program that can be built and modified by the developer to further specialize the application and measure performance for a variety of DSP functions on the OMAP-L137 EVM hardware.

ARM-to-6747 shared memory and the 335-to-6747 HPI provide the hardware transports for processor to DSP co-processor communications. The effectiveness of processor-to-processor communication is a significant system consideration. The existing product implements an SPI serial communication scheme, significantly limiting effectiveness (serial versus parallel).

4.3 Estimate of Performance Requirements.
Benchmark spreadsheets.
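A rough estimate of the kind a benchmark spreadsheet would hold can be sketched as follows. This is an illustration only and makes two assumptions that are not from the source: the common 5·N·log2(N) operation-count convention for an N-point radix-2 complex FFT, and that all 7040 256-point IFFTs quoted in section 4.1 must complete within one sweep period. The function names are hypothetical.

```python
import math


def fft_flops(n):
    """Approximate real floating-point operations for one n-point complex
    FFT, using the 5*N*log2(N) convention (an assumption, see lead-in)."""
    return 5 * n * math.log2(n)


def required_gflops(ffts_per_sweep, n, sweep_ms):
    """Sustained GFLOPS needed to finish all FFTs within one sweep."""
    total_flops = ffts_per_sweep * fft_flops(n)
    return total_flops / (sweep_ms * 1e-3) / 1e9


if __name__ == "__main__":
    for sweep_ms in (40.0, 75.0):
        print(sweep_ms, "ms sweep:",
              round(required_gflops(7040, 256, sweep_ms), 2), "GFLOPS")
```

As the design approach notes, an estimate like this only needs to be accurate enough to show whether a candidate device is in the right performance range; closer gaps call for measurement.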

4.4 Measuring Performance
Using the OMAP-L137 EVM and platform software, the DSPLIB function DSP_sp_mat_mul_cplx was run for 32x32 matrices and the 0.8 ms figure was confirmed. The software package was rebuilt along with the Linux kernel and similar results were obtained. A series of discrete algorithm functions, for single and double precision floating point tests, were measured on the OMAP-L137 EVM using CCStudio resources. Measured end to end, 3 ms was consumed through ½ of the algorithm (CSVD).
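The measured 0.8 ms can be converted into an effective throughput figure for comparison with the estimates. The sketch below is illustrative and rests on assumptions not stated in the source: an N x N complex matrix multiply is counted as N³ complex multiply-accumulates of 8 real operations each (4 multiplies and 4 additions), and the DSP clock is taken as the 300 MHz quoted for the 6747 in section 4.1. The function names are hypothetical.

```python
def complex_matmul_flops(n):
    """Real floating-point operations for an n x n complex matrix multiply,
    counting 8 real operations per complex multiply-accumulate."""
    return 8 * n ** 3


def effective_mflops(flops, elapsed_ms):
    """Throughput in MFLOPS for a measured elapsed time."""
    return flops / (elapsed_ms * 1e-3) / 1e6


def cycles_per_mac(n, elapsed_ms, clock_mhz):
    """Average clock cycles spent per complex multiply-accumulate."""
    total_cycles = elapsed_ms * clock_mhz * 1000.0
    return total_cycles / n ** 3


if __name__ == "__main__":
    flops = complex_matmul_flops(32)
    print(flops, "flops,", round(effective_mflops(flops, 0.8), 2), "MFLOPS,",
          round(cycles_per_mac(32, 0.8, 300.0), 2), "cycles per complex MAC")
```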


5 TI DSP + ARM Software Development Resources

With the development of new SoC designs from TI, updates and recent releases (October-November 2010) are continually reviewed for architecture resources that can be leveraged toward the LRR design goals. TI continues to design and expand its SoC ARM+DSP architectures. The software development tools and resources continue to evolve around the framework introduced with OMAP-L137. A familiar list of cross-host compatible tools:
• TI C6000 Code Generation Tools v6.0.9 or higher
• TI DSP/BIOS v5.41.x
• TI Codec Engine 2.25 or higher
• TI XDC Tools (eXpanDed C)
• TI Framework Components
• TI DVSDK [Platform dependent]

• C6Accel
• C6Flo
• C6Run

Note: The VisionMid product development is not consistent with the above-mentioned tools in the earliest available “engineering” releases.

5.1 Starting Software Development: “Quick Start Guide Installation Guide”, “Floating-Point Starter Kit”

Always a good place to start; these guides are typically included with the initial developer's kit, shipped along with other necessities (power supply, console cable).


Illustration 5: Texas Instruments ARM - DSP Layered Architecture


As recommended in the System Design Approach, basic developer resources often contain invaluable material that may be directly related to the development effort. In the case of radar signal processing, effective benchmarks on linear processing functions were provided, illustrating the performance concerns of DSP co-processing tasks.

5.2 Platform Support Package/Products PSP Linux
http://processors.wiki.ti.com/index.php/PSP_Linux_Development_FAQ

A replacement for the previously licensed MontaVista Pro SDK packages. The TI814x-PSP-04.01.00.01-User-Guide, as required by the DM8148 EVM, is scheduled for release (March-April). Current pre-release development can be tracked on the staging tree on the Arago server; active development started in December 2010. “TI is taking a more active role in Linux kernel development for its SOC devices”. The Arago Project is an open integration, build, and test infrastructure that provides a portal into how Texas Instruments creates customer-ready Linux SDKs.

LSP: Linux Support Package, a component of DVSDK and PSP
DVSDK: DaVinci/Digital Video Software Development Kit

“The Linux PSP includes these components (sources and pre-built binaries) for target hardware supported in the release:

• Primary bootloader (e.g. x-loader, ubl, ...), if applicable.
• Secondary bootloader (u-boot) specific to supported devices
• Linux kernel”

Documentation: Release Notes, User Guide, and Feature and Performance documentation.

6 System Software Architecture
Designs for radar sensors under consideration include multiple CPU cores. In current designs two cores are used: one for control tasks related to integration with the overall automotive system design, and a second for signal processing tasks. With continued technological innovation, new low-power parts for signal processing and SoC designs have introduced a third core, an ARM, into the system architectures. The subdivision of system operations and synchronization into a functional signal processing timeline requires the use of Inter-Process Communication, signaling, and interrupts between CPU cores. Serial, parallel, and internal/external shared memory interfaces are available to support system communication needs. Without a common device driver interface architecture and openly defined communication interfaces, each sensor effort will develop software implementations specific to each unique design. Growing and scaling successful product designs may become increasingly difficult while support of existing designs overwhelms new product and production efforts.


6.1 ARM Cortex A8 core processor with Neon™

Central to the embedded system architecture are the industry-leading ARM cores, “power-optimized for mobile devices”. ARM cores are common in embedded general-purpose processing, and they are emerging as specialized machines, now capable of floating point operations and clock rates not previously available to embedded products (600 MHz - 1 GHz). ARM cores have long been available in a variety of Texas Instruments devices: DaVinci (DM6446, DM355), OMAP (OMAP-L137)...

6.1.1 ARM Neon™

http://www.arm.com/products/processors/technologies/neon.php
Neon technology brings together the latest in ARM's parallelization and vectorization technology. SIMD (Single Instruction Multiple Data) allows for accelerated processing of repetitive operations on large data sets. http://processors.wiki.ti.com/index.php/Cortex_A8

• OpenDL library supporting FIR, IIR, FFT, Dot Product
• Vectorizing compilers
• Open Source tools and community support
• Supported by gcc in versions 2007q3 and later
• Challenge: find an OS or RTOS, commercial or free, that does not support ARM.


Illustration 6: Texas Instruments Inter-OS communications


6.2 Unix System V Architecture
Essential to the understanding of common computer architecture building blocks such as IPC and STREAMS. Also: 'man ipc'

6.3 Inter-Process Communication (IPC)

Introduced by early Unix architectures, System V IPC, has taken on a more generalized meaning and function with the steady advance of technologies. Linux Kernel development continues with architectural concepts introduced by Unix as well as these concepts are available to micro-controllers (MCU) through uClinux.org and commercially available packages. Openly available documentation for these packages provides easy reference to this construct.

New: SPRUG06B, “SYS/BIOS Inter-Processor Communication (IPC) and I/O User's Guide” (May 2010), is now available on ti.com; release date Q3-Q4 2010.

6.3.1 System V IPC

Support for inter-process communication also includes shared memory, message queues, and semaphores.
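As a minimal sketch of the message queue facility, the following C function sends one message through a private System V queue and reads it back, using the standard msgget, msgsnd, msgrcv and msgctl calls. The struct ctl_msg layout and the function name sysv_msg_roundtrip are illustrative assumptions. In a real system the sender and receiver would be separate processes sharing a key (for example from ftok) rather than one process using IPC_PRIVATE.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout; System V requires a leading long type > 0. */
struct ctl_msg {
    long mtype;
    char text[32];
};

/* Send 'text' through a private queue and copy what comes back into 'out'.
 * Returns 0 on success, -1 on failure. */
int sysv_msg_roundtrip(const char *text, char *out, size_t outlen)
{
    struct ctl_msg tx = { .mtype = 1 };   /* text[] is zero-filled */
    struct ctl_msg rx;
    int rc = -1;
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    if (qid < 0 || outlen == 0)
        return -1;

    strncpy(tx.text, text, sizeof tx.text - 1);
    if (msgsnd(qid, &tx, sizeof tx.text, 0) == 0 &&
        msgrcv(qid, &rx, sizeof rx.text, 1, 0) >= 0) {
        strncpy(out, rx.text, outlen - 1);
        out[outlen - 1] = '\0';
        rc = 0;
    }
    msgctl(qid, IPC_RMID, NULL);          /* always remove the queue */
    return rc;
}
```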

6.3.2 System V STREAMS

6.3.3 POSIX, pthreads

POSIX-compliant systems include mutexes, semaphores and condition variables as well as shared memory access routines. Perhaps one of the only relevant IEEE contributions to recent computer technology.
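A minimal sketch of the mutex and condition variable primitives mentioned above, using only standard pthread calls. The one-slot struct mailbox, the value 42 and the function names are hypothetical and exist only for illustration.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox guarded by a mutex and a condition variable. */
struct mailbox {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             full;
    int             value;
};

static void *producer(void *arg)
{
    struct mailbox *mb = arg;

    pthread_mutex_lock(&mb->lock);
    mb->value = 42;                     /* illustrative payload */
    mb->full = 1;
    pthread_cond_signal(&mb->ready);    /* wake the waiting consumer */
    pthread_mutex_unlock(&mb->lock);
    return NULL;
}

/* Start a producer thread, block until it posts, return the posted value
 * (or -1 if the thread could not be created). */
int mailbox_demo(void)
{
    struct mailbox mb;
    pthread_t tid;
    int v;

    pthread_mutex_init(&mb.lock, NULL);
    pthread_cond_init(&mb.ready, NULL);
    mb.full = 0;
    mb.value = 0;

    if (pthread_create(&tid, NULL, producer, &mb) != 0)
        return -1;

    pthread_mutex_lock(&mb.lock);
    while (!mb.full)                    /* loop guards against spurious wakeups */
        pthread_cond_wait(&mb.ready, &mb.lock);
    v = mb.value;
    pthread_mutex_unlock(&mb.lock);

    pthread_join(tid, NULL);
    pthread_cond_destroy(&mb.ready);
    pthread_mutex_destroy(&mb.lock);
    return v;
}
```

On many toolchains this needs to be built with -pthread.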

6.3.4 Sockets, Pipes

For embedded Radar applications, inter-process communication depends on the underlying physical interface: the full-duplex Serial Peripheral Interface (SPI) bus, the Host Port Interface (HPI), or shared memory. A common device driver interface is required to implement inter-process communications. IPC can be used with a variety of physical interfaces (shared memory, serial, parallel, and Ethernet); the physical interface is abstracted by the common device driver IO model. Historically, IPC has been implemented over a variety of physical interfaces, dating back to mainframe development when CPUs, memory and disks existed in different rack-mounted chassis.
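The idea of hiding the physical link behind a common driver interface can be sketched as a table of function pointers. Everything named here (transport_ops, shm_ctx, ipc_send) is hypothetical. The in-memory "shared memory" transport is only a stand-in so that the example runs on a host; it is not a TI driver model.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical common driver interface: the IPC layer calls only these
 * function pointers and never touches SPI, HPI or shared memory directly. */
struct transport_ops {
    int (*write)(void *ctx, const void *buf, size_t len);
    int (*read)(void *ctx, void *buf, size_t len);
};

/* Stand-in transport: a small buffer playing the role of shared memory. */
struct shm_ctx {
    unsigned char mem[64];
    size_t        used;
};

static int shm_write(void *ctx, const void *buf, size_t len)
{
    struct shm_ctx *c = ctx;
    if (len > sizeof c->mem)
        return -1;                       /* message does not fit */
    memcpy(c->mem, buf, len);
    c->used = len;
    return (int)len;
}

static int shm_read(void *ctx, void *buf, size_t len)
{
    struct shm_ctx *c = ctx;
    size_t n = len < c->used ? len : c->used;
    memcpy(buf, c->mem, n);
    c->used = 0;
    return (int)n;
}

const struct transport_ops shm_transport = { shm_write, shm_read };

/* IPC-level send: identical regardless of the physical link underneath. */
int ipc_send(const struct transport_ops *ops, void *ctx,
             const void *msg, size_t len)
{
    return ops->write(ctx, msg, len);
}
```

An SPI or HPI transport would supply its own write and read functions with the same signatures, leaving ipc_send and its callers unchanged.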

TI DSP BIOS offers MSGQ (“6.5 Message Queues”) for “homogeneous or heterogeneous multi-processor messaging”. QUE and MBX offer smaller implementation footprints while sacrificing advanced features. Section 2.19, “MSGQ Module”, of the TMS320C6000 DSP/BIOS 5.31 Application Programming Interface (API) Reference Guide provides a detailed description of the MSGQ construct for system integration. For serial interface communications, common practice supports implementing a checksum in the device driver layer, whereas parallel HPI interfaces would not require checksum support.
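A driver-layer checksum for a serial link can be as simple as the sketch below, which works on 16-bit words to match a 16-bit SPI transfer size. The choice of an additive two's-complement checksum and the names frame_checksum and frame_is_valid are assumptions made for illustration; the source does not specify which checksum the drivers use.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame checksum: the value that makes the 16-bit sum of all
 * words, including the checksum itself, equal to zero. */
uint16_t frame_checksum(const uint16_t *words, size_t n)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum = (uint16_t)(sum + words[i]);
    return (uint16_t)(0u - sum);
}

/* Receiver side: a frame (payload followed by checksum) is accepted
 * when its words sum to zero modulo 2^16. */
int frame_is_valid(const uint16_t *words, size_t n_with_checksum)
{
    uint16_t sum = 0;
    for (size_t i = 0; i < n_with_checksum; ++i)
        sum = (uint16_t)(sum + words[i]);
    return sum == 0;
}
```

An additive checksum is cheap but misses some error patterns (for example reordered words); a CRC would be the usual alternative where stronger detection is needed.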


6747 SPI0 including optional slave chip select (SPI0_SCS) is connected. SPI1 is unused for serial communications.

TMS320F28335 GPIO53 – GPIO56 drive SPI_DSP_ENA, SPI_SIMO_DSP, SPI_SOMI_DSP and SPI_CLK_DSP respectively; the optional slave chip select (DSP_CS, SPI_SCS) is driven by GPIO17. The SPI master/slave handshake signals (master SPIx_ENA and slave SPIx_ENA) “increase SPI bus throughput since the master does not need to delay each transfer long enough to allow for the worst case latency...”[4]

IPC messaging: short messages are matched to queue lengths for efficiency, as in many protocols. The SPI module has a 16-bit shift register and a 16-bit buffer.

IPC data: large data transfers may be passed outside the short message queues through BIOS streaming, a “continuous sequence of real-time data. Messaging is generally performed with zero copying.” The directory bios_5_41_07_24\packages\tibios\example\advanced\streaming provides valuable streaming Pipe IO example code.

6.3.5 uClinux and IPC

Inter-process communication inherited from UNIX architectures is consistent across xNIX architectures, including the MMU-less variant uClinux (uClinux.org) for ARM9 cores. TI has provided uClinux (ucLinux) compatibility for the DSPLINK kernel module (DSPLINK/dsplinkk.ko).

6.3.6 OMAP-L1x MSGQ (IPC) and DSPLIB/DSPLink

A valuable application of Inter-Process Communication is provided by:
http://processors.wiki.ti.com/index.php/Example_DSPLIB/DSPLink_Application_on_OMAP-L1x

This is a valuable example to consider as manufactured boards are arriving with OMAP/6747 hardware installed. An ARM Linux shell application loads and runs a DSPLINK application connected through the MSGQ API. Following the “getting started guide for the EVM” and using command line arguments supplied by the user, the ARM application can then report total processing time (including DSP execution and MSGQ communication). This is an effective system model of the ARC5_B hardware as built.

Performance of TI DSPLIB resources is referenced for both OMAP-L137 and 674x implementations. A performance spreadsheet is provided in the docs folder of the C674x DSPLIB installation (C:\CCStudio_v3.3\c674x\dsplib_v12\docs and http://10.106.10.119/pub/c674x_dsplib_dev_notes.xls). For the OMAP-L1x sample application, performance cycle counts with and without IPC are provided on the web page for DSPF_sp_mat_mul (1.24 ms), DSPF_sp_mat_mul_cplx (.812 ms) and DSPF_sp_mat_trans (.721 ms), which are of particular interest to Radar application processing using the Esprit algorithm.

http://processors.wiki.ti.com/index.php/Example_DSPLIB/DSPLink_Application_on_OMAP-L1x
http://processors.wiki.ti.com/index.php/C674x_DSPLIB#Performance
http://processors.wiki.ti.com/index.php/Getting_Started_Guide_for_OMAP-L137

DSPLIB c674x/dsplib_v11 and dsplib_v12 have effective release dates of 6/25/2009 and 1/5/2010 respectively[5]. Example source includes input data and benchmark results for several matrix math operations.

4 6747 Fixed/Floating-point Digital Signal Processor.
5 Difficult to understand why hand coded matrix multiply routines were being reviewed as late as August of 2010 with C++ types and double indexed arrays, long understood to block 'parallelization'.


6.3.6.1 DSPLINK

http://processors.wiki.ti.com/index.php/DSPBIOS_LINK_WebEx_Presentations

DSPLINK provides an IPC-like software support package to a DSP co-processor running TI DSP/BIOS, providing an API to TI DSPLIB functions. DSPLINK supports an interface to and from more traditional Linux-based operating systems supporting IPC, as demonstrated in the OMAP-L137 (ARM926EJ-S <=> 6747) and DaVinci example code. It includes Ring IO buffering compatible with operating systems supporting IPC. TI representatives have expressed concerns that the DSPLINK is a large module.

 102674 2009-04-16 21:37 dsplink.lib            RELEASE BUILD
 122946 2009-04-16 21:37 dsplinkk.ko
1261756 2009-04-16 21:37 ../DEBUG/dsplinkk.ko   DEBUG BUILD
 360267 2009-04-16 21:37 ../DEBUG/dsplink.lib

Building a suitable library with kernel module for an embedded system appears to depend only on effective tools usage. As the DSPLINK can be 'scaled at compile time' to add or remove functionality, it is not clear how size could be a design consideration. DSPLINK and all dependent components have been built and integrated onto the OMAP-L137 EVM hardware. Working example code (DSPLIB/DSPLINK Application on OMAP-L1x, DSPF_sp_mat_mul_cplx) was used to validate the completeness of the newly built source code; the performance measured was comparable to published benchmarks.

GPP-DSP boundary:
• Basic processor control
• Shared/synchronized memory pool across multiple processors
• Notification of user events
• Mutually exclusive access to shared data structures
• Linked list based data streaming
• Data transfer over logical channels
• Messaging (based on MSGQ module of DSP/BIOS)
• Ring buffer based data streaming
• Zero Copy Messaging

Support for different physical links can be accommodated through the LINK DRIVER; see LNK_012_DES.pdf (DSP/BIOS LINK, LNK 012 DES, Link Driver).


Illustration 7: Texas Instruments DSP/BIOS Link Architecture

Illustration 8: GPP-DSP connectivity through DSP/BIOS LINK


work/OMAP-L137/OMAPL137_arm_1_00_00_11/dsplink-1_61_03-prebuilt/packages/dsplink/doc> kpdf UserGuide.pdf

The General Purpose Processor (GPP) end of the DSPLINK supports Linux (MV_pro5), Nucleus, and PrOS (eSOL). Native build tools are required, depending on the GPP's target OS.

1.1.1.1. Ring IO LNK_129_DES.pdf

This component allows creation of a ring buffer within the shared memory. The reader and writer of the ring buffer can be on different processors.
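The ring buffer concept can be sketched as a single-writer, single-reader structure in which each side advances only its own index. This is a generic illustration, not the DSPLINK RingIO API. The names ring, ring_put, ring_get and the capacity RING_SIZE are hypothetical. A real cross-processor version would also need cache maintenance and memory barriers, which are omitted here.

```c
#include <stdint.h>

#define RING_SIZE 8u   /* hypothetical capacity, must be a power of two */

struct ring {
    uint16_t          data[RING_SIZE];
    volatile uint32_t wr;   /* advanced only by the writer */
    volatile uint32_t rd;   /* advanced only by the reader */
};

/* Writer side: returns 0 on success, -1 if the ring is full. */
int ring_put(struct ring *r, uint16_t v)
{
    if (r->wr - r->rd == RING_SIZE)
        return -1;
    r->data[r->wr & (RING_SIZE - 1u)] = v;
    r->wr = r->wr + 1u;     /* publish after the data is written */
    return 0;
}

/* Reader side: returns 0 on success, -1 if the ring is empty. */
int ring_get(struct ring *r, uint16_t *v)
{
    if (r->wr == r->rd)
        return -1;
    *v = r->data[r->rd & (RING_SIZE - 1u)];
    r->rd = r->rd + 1u;
    return 0;
}
```

Because the indices run freely and are masked only on access, the difference wr - rd gives the fill level directly, and no slot has to be kept empty to tell a full ring from an empty one.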

1.1.1.2. LDRV Link Driver

6.3.7 DSPLINK Summary

Complete IPC support is provided, including semaphores, hardware and software interrupts, and both data messaging and control messaging. Debug and informational statistics support (Procstats, MSGQstats, and Chnlstats) is also provided for integrated kernel logging.

Support for the OMAP-L137 – 6747 is available and working examples are provided. Integration into the no-OS micro-controller constructs used on the TMS320F28335 could prove challenging, though there seems to be no hard multi-threading requirement placed on the GPP.

Illustration 9: Architecture RingIO Transfer, Shared Memory

6.3.8 Host Port Interface

The Host Port Interface (HPI) is provided as a “parallel port interface through which an external host processor can directly access the processor's resources (configuration and program/data memories).”[6] The HPI is a user-configurable 16-bit interface. Dedicated address (HPIA) and data (HPID) registers are provided, along with the HPIC control register. See sprufm7d.pdf.

The TMS320F28335 provides host processing support for Radar designs. The 28335 does not support UHPI as it is defined; discrete GPIO pins (XZCS7-GPIO37, XRD, XZCS0-GPIO36, HRDY-GPIO28, HINT-GPIO63) are defined through software and the 335's “External Interface XINTF” (section 4.14, TMS320F28335 Data Manual) to provide HPI functionality to the C6747.

The HPI is supported as a Transport (6.5.4 Transports) of the BIOS Message Queues Input/Output support. The TI DSP BIOS supports its Message Queue (MSGQ) IPC-like interface over the HPI as a supported infrastructure layer. Some example code exists for a MSGQ implementation: TI DSP bios/packages/ti/bios/examples/advanced/msgq_swi2swi/msgq_swi2swi.c

Working with E2E.TI.com, Brad Griffis provided some feedback on the suitability of IPC mechanisms between C2000 and C6000 devices. The DSP-DSP interface model provided by the TMS320F28335 connected through XINT/HPI to the TMS320C6747 is supported by the BIOS-BIOS MSGQ interface, with a driver abstraction MQT (message queue transport). Alternately, DSPLINK was developed for the GPP-DSP interface model, or ARM-C6747 as exists in the OMAP family. Since legacy implementations of radar devices do not implement the TI DSP/BIOS framework on either DSP device, an original coding, development and integration effort will be necessary. Lyrtech has been contracted for both HPI software and Ethernet, further postponing the effort for a common device driver interface and OS-BIOS framework.

File->New->DSP/BIOS Configuration opens the Configuration1 panel; select “Input/Output”, then 'MSGQ'.

6 13. sprufk9b.pdf 674x/OMAP-L1x Processor Peripherals Overview.


Illustration 10: CCStudio MSGQ configuration


The MSGQ configuration in CCStudio is analogous to the description in spru423.pdf (TMS320 DSP/BIOS v5.41 User's Guide), Section 6 “Input/Output Methods”, subsection 6.5 “Message Queues”.

6.3.9 Radar Data and the Host Port Interface

The TMS320F28335 provides the ADC support for the incoming radar data; the TMS320C6747 provides signal processing as well as an external Ethernet interface for downloading antenna data. Due to data rates and the limitations of the HPI and shared memory access to the 6747 SRAM, a basic buffering and DMA solution will be required. Buffering ADC (converter) data in a ring buffer construct on the 335 will address real-time data rates, minimize data usage and allow other functions to use the HPI to the 6747 co-processor. A DMA interface function managing current and end pointers will allow for contiguous shared memory writes of radar data and support HPI interface interruptions during large accumulations of entire radar data sets.

Option 1: using TI DSP/BIOS, available to both the 335 and the 6747.

(1) ADC Radar Data with Stream IO interrupt handler (buffer put)
(2) DMA application, buffer get, DMA to 6747 shared memory buffer area.
(3) Control/IPC Message MSGQ and SWI/HWI 6747 signal: start algorithm on 1-n buffers.
(4) Result data/ Raw data SIO to Ethernet TX ring.

Illustration 11: DSP/LINK Message Queue MSGQ
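The "buffer get, DMA to shared memory" step in the option above relies on knowing how much ring data can be moved in one contiguous transfer. The sketch below assumes read and write offsets kept in the range 0..size-1, with memcpy standing in for the DMA engine. The function names contiguous_span and drain_to_shared are hypothetical and not part of DSP/BIOS.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of samples that can be moved in one transfer without crossing
 * the end of the ring storage. rd == wr means the ring is empty. */
size_t contiguous_span(size_t rd, size_t wr, size_t size)
{
    if (wr >= rd)
        return wr - rd;     /* data lies in one piece */
    return size - rd;       /* data wraps: first piece runs to the end */
}

/* Drain everything between *rd and wr into dst using at most two
 * contiguous transfers. Returns the number of samples moved. */
size_t drain_to_shared(const uint16_t *ring, size_t size,
                       size_t *rd, size_t wr, uint16_t *dst)
{
    size_t total = 0;

    while (*rd != wr) {
        size_t n = contiguous_span(*rd, wr, size);
        memcpy(dst + total, ring + *rd, n * sizeof *ring);  /* DMA stand-in */
        *rd = (*rd + n) % size;
        total += n;
    }
    return total;
}
```

Because the read offset is advanced only after each chunk completes, a transfer interrupted between chunks can resume from the saved offset without resending data.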

Data flow Diagram Here.

Option 2: DSPLINK, working example code (October 12, 2010), port to HPI using LDRV Link driver LNK_012_DES

(1) ADC Radar Data with Stream IO interrupt handler (buffer put)
(2) DSPLINK with 'Ring IO' (DMA application, buffer get)
(3) Algorithm integrated with DSPLIB API, start algorithm on 1-n buffers.
(4) Result data/ Raw data Stream IO SIO to Ethernet TX ring.

Option 3 Linux style IPC to RTOS DSP co-processor.

Data, Buffer, Buffer Management, Ring Buffers Perf Ftrace LTTng Linux Trace Tool

6.4 Signal Processing for Long Range Radar

A repository for Digital Signal Processing development notes for Radar applications.

Measurement of floating-point DSP performance presents unusual challenges. A theoretical calculation of CPU instructions without consideration of the actual number of pipeline instructions and execution cycles (single precision 4, double precision 10)[7] can lead to significant errors in estimated execution times. The problem is further complicated by the variable optimization performance achieved through parallel instruction execution, which can produce up to 20x[8] performance improvement over Linear Predictive Coding. Unfortunately, not all functions, operations, or algorithms can achieve fully parallelized execution for maximum optimized performance.
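The size of the estimation error described above can be illustrated with a deliberately simplified model. It assumes a fixed result latency per operation, one issue slot per cycle and no stalls. Real C67x scheduling has further constraints, so the function names and the model itself are illustrative assumptions only.

```c
/* Cycles for n dependent operations: each one waits for the previous
 * result, so the full latency is paid every time. */
unsigned long cycles_dependent(unsigned long n, unsigned latency)
{
    return n * latency;
}

/* Cycles for n independent operations issued back to back into a
 * pipeline: the latency is paid once, then one result per cycle. */
unsigned long cycles_pipelined(unsigned long n, unsigned latency)
{
    return n ? latency + (n - 1) : 0;
}
```

For 1000 single-precision multiplies with a latency of 4 cycles, the two functions return 4000 and 1003 cycles, a spread of roughly 4x from the scheduling assumption alone.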

Fortunately, DSP silicon vendors as well as DSP design solutions vendors have significant market incentives to measure and benchmark[9] their performance against existing technologies. With system design experience it is possible to navigate the volumes of performance data to produce accurate embedded performance estimates which can be measured on embedded hardware.

Memory architectures must also be considered when determining expected embedded system performance; embedded designs typically have much slower clock rates and narrower bus widths. Few simulation tools for embedded devices would provide a matrix for memory interfaces, hence only relative CPU cycle counts would be useful.

The compiled Esprit program executable binary with libraries is 8.023805 Mbytes, and 8.027934 Mbytes with the smoothing conditional code (ARC_5B: 4 Mbytes). The Esprit program compiled against the TI DSP libraries and BIOS is currently 1 Mbytes.

7 Reference 1. page 3-32 “Total Result Latency” MPYSP and MPYDP
8 Reference 2. E2E Forum C/C++ compiler group.
9 Reference 4 dsplib developers notes.xls DSPF_sp_mat_mul_cplx 553 Cycles (Absolute)

6.5 Texas Instruments DSP BIOS → SYS BIOS

The TI DSP BIOS has been “designed to minimize memory and CPU requirements”. It has been optimized to work effectively with the TI tools, taking advantage of parallelization, with key performance improvements implemented in assembly language. The TI DSP/BIOS is explained at length through three references:

1. “Using DSP/BIOS” lessons in the online Code Composer Studio Tutorial.
2. “TMS320C6000 DSP/BIOS 5.31 Application Programming Interface (API) Reference Guide”, SPRU403M.
3. “TMS320 DSP/BIOS User's Guide”, SPRU423F.

IBIOS 5_31_02, BIOS 5_33_01, and the latest available for CCSv3, 5_41_07_24; also available for Linux: Bios_5_33_05 (/home/mnolin/work/OMAP-L137). Quick Start instructions: “If you want to quickly try DSP/BIOS with command-line/makefile builds (and not use Code Composer Studio)...”

The TI DSP BIOS is supported across the family of TI parts, including the TMS320C6000 and TMS320C2000/TMS320F28335. A license agreement may be necessary (2008) to obtain full sources.

The recently published IPC User's Guide notes: “Previous versions of SYS/BIOS were called DSP/BIOS. The new name reflects that this operating system can also be used on processors other than DSPs.”[10]

6.5.1 DSPBIOS 6.x

TI DSP BIOS updates for version 6.x: “IPC support may be used independently of core DSP/BIOS 6.x kernel functionality.” http://focus.ti.com/docs/tollsw/folder/print/dspbios6.html

Upgrading to CCSv4 is required for DSP BIOS 6.x. DSP BIOS 5.41 works with CCSv3 or CCSv4.

6.6 DSP Libraries

6.6.1 ARM Cortex DSPLIB

6.6.2 Texas Instruments DSPLIB

6.7 Texas Instruments FastRTS Library

The TMS320C67x Fast Run-Time-Support Library provides 26 optimized floating-point math functions for the TMS320C67x (SPRU100A).

10 SYS/BIOS Inter-Processor Communication (IPC) and I/O User's Guide SPRUG06B May 2010


Note: We have already root-caused significant performance problems resulting from mistakenly linking against older RTS library routines.

6.8 Applying New Technology:

The Texas Instruments C6xxx floating-point processor family is a mature device family dating back to 1999-2000. While there have been improvements in clock rates (up to 1 GHz) and multi-core devices, these advances do not exist for the mid-range power devices (4-5 watts). The C6747 device represents a new integration of fixed-point and floating-point DSP features that previously appeared in separate devices.

Recently released documentation (April 2010) related to the C6747 Host Port Interface suggests a new DSP co-processor system architecture.

6.9 Software Libraries

Software libraries are common to many software projects and to all levels of infrastructure, from web interfaces to low-level scientific libraries and simple string processing libraries. Discussion of the suitability of libraries for any system design is best left to experienced professionals.

6.10 Linear Math Libraries cblas_zgemm and zgemm:

Esprit calls cblas_zgemm 6 times, and the getS1S2 function calls cblas_zgemm another 2 times, for a total of 8 calls to cblas_zgemm. These well-defined linear algebra functions and APIs made a logical starting point for the embedded integration effort. The core mathematical processing of the Esprit algorithm is handled by the 8 function calls to zgemm, the (complex) general matrix multiplication function.[11]

C := alpha*op( A )*op( B ) + beta*C

The repetitive nature of Linear algebra mathematics with matrices has been optimized over several decades, optimized on the essential equation above.
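For reference, the equation above can be written out as an unoptimized triple loop. This sketch assumes row-major storage and takes op() as the identity (no transpose or conjugate). The function name zgemm_ref and its argument list are illustrative and do not match the cblas_zgemm signature.

```c
#include <complex.h>
#include <stddef.h>

/* Reference form of C := alpha*A*B + beta*C for complex matrices.
 * A is m x k, B is k x n, C is m x n, all stored row-major. */
void zgemm_ref(size_t m, size_t n, size_t k,
               double complex alpha,
               const double complex *A, const double complex *B,
               double complex beta, double complex *C)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double complex acc = 0.0;
            for (size_t p = 0; p < k; ++p)
                acc += A[i * k + p] * B[p * n + j];   /* row i of A times column j of B */
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}
```

Optimized libraries compute the same result but reorder and block these loops to suit the memory hierarchy and the parallel functional units, which is where the decades of tuning mentioned above are spent.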

6.11 OMAP-L137

System overview of OMAP: http://processors.wiki.ti.com/index.php/OMAP-L1x/C674x/AM1x_SoC_Architectural_Overview

Multiple CPU masters (ARM or DSP) are combined with multiple slaves (peripherals, memories), all managed through a Switch Central Resources (SCRs) module. This is a powerful yet complex combination of asymmetric CPU cores and peripheral resources.

The OMAP-L137 EVM is intended for high-end multi-channel audio processing.

11 Reference 9 B. General ESPRIT Algorithm 1-7


The OMAP data sheets describe the PRU Subsystem (PRUSS, Programmable Real-Time Unit Subsystem); two units are included, PRU0 and PRU1. Complete with interrupt controller and associated memories, the PRUs can be programmed to perform a variety of embedded tasks with tight real-time constraints. For Radar designs this flexibility can alleviate SPI bus bottlenecks or implement a CAN bus interface. See 1.4 and 6.20 of OMAP-L137 ADVANCE INFORMATION.

6.11.1 JTAG

ARC5_B schematics indicate a common JTAG header, “JTAG CON”, connected to both the TMS320F28335 and the TMS320C6747xxx. This is not consistent with the OMAP-L137 EVM, which provides separate JTAG connector headers for each JTAG target: “ARM JTAG” and “TI JTAG”. The OMAP-L137 EVM provides external logic to multiplex the ARM and TI JTAG connectors to the OMAP-L137ZKB (pins J1, 2, 3, 4, H3 TCLK). Signals DSP_EMU0 and DSP_EMU1, connected to J5 GPIO7_15 and provided by JTAG_EMU1 and JTAG_EMU0 from J4, the “TI JTAG” connector, effectively provide a workaround for the standard JTAG scan chain. The ARM JTAG connector, which provides no JTAG_EMUx signals, provides default ARM JTAG to the TMS320C6747ZKB. See EVM schematic pages 10 and 22 of EVMOMAPL137_TechRef_revg.pdf.

According to Section 6.31, “JTAG Port Description”, of the OMAP-L137 Advance Information data sheets, JTAG scan chain TAPs are used to select the C674x or ARM926 debug interface. The ARC5_B schematics should support three JTAG TAP IDs, for the F28335, ARM926 and C6747.

7 References:

1. www.nr.com Numerical Recipes, Third Edition. William H. Press, Saul A. Teukolsky, William T. Vetterling, Brian P. Flannery. 2007.
2. “Numerical Recipes, The Art of Scientific Computing”, Third Edition, 2007.
3. netlib.org cblas_zgemm
4. “Numerical Methods for DSP Systems in C: Practical application of numerical methods in Signal Processing, Graphics, Video Programming, Scientific Applications”, Don Morgan.
5. “Digital Signal Processing and Applications with the C6713 and C6416 DSK”, Rulph Chassaing, 2005, John Wiley & Sons Inc.
6. ESPRIT Beam Forming for the Autoliv Long Range Radar, Bruce Labitt.
7. “Singular Value Decomposition – A Primer”, Sonia Leach, Department of Computer Science, Brown University, Providence RI 02912. DRAFT VERSION. (Postscript) Ghost view (1994)

7.1 Texas Instruments

8. Texas Instruments TMS320C6000 Optimization Workshop Student Guide
9. TI E2E Community, “Use of C++ <complex> types and measured performance”, TI C/C++ Compiler Forum, Clear Quest SDOWP ID#SDSCM00037600, Georgem.
10. http://processors.wiki.com/index.php/C6000 Compiler Tuning Software Pipelined Loops
11. http://processors.wiki.ti.com/index.php/C674x_DSPLIB c647x_dsplib_dev_notes.xls, TI DSPLIB “Legacy ASM Implementation from C67x” DSPF_sp_mat_mul_cplx.asm
12. www.wiki.ti.DPS64x
13. SPRU423F, “TMS320 DSP/BIOS User's Guide”, November 2004
14. TMS320C6000 DSP/BIOS 5.31 Application Programming Interface (API) Reference Guide (spru403m.pdf), July 2006
15. TMS320C674x/OMAP-L1x Processor Peripherals Overview Reference Guide, SPRUFK9B, June 2009; Users Guide sprufm7d.pdf, April 2010.
16. OMAP-L137 Low-Power Applications Processor ADVANCE INFORMATION, September 2008, revised August 2010.
17. OMAP-L137/TMS320C6747 Floating-Point Starter Kit '01 May 09' Early Adopter (EA) and (GA)
18. TMS320C67x FastRTS Library Programmer's Reference, spru100a
19. TMS320F28335, ..334, 332,235,234,232 Digital Signal Controllers Data Manual, SPRS439H, June 2007 – Revised March 2010. http://focus.ti.com/lit/ds/symlink/tms320f28335.pdf
20. TMS320x28xx, 28xxx DSP Peripheral Reference Guide, SPRU566D, June 2003 - October 2006
21. DSP/BIOS LINK LNK 058 USR User's Guide, Version 1.61.03, March 31, 2009. OMAP-L137/OMAPL137_arm_1_00_00_11/dsplink-1_61_03-prebuilt/packages/dsplink/doc
22. xDAIS-DM Users Guide. OMAPL137_arm_1_00_00_11/framework_components_2_23_01/fctools/packages/ti/xdais/dm/docs/XDM_UsersGuide.pdf

7.2 Texas Instruments Vision & VisionMid

23. TMS320DM814x DaVinci Digital Media Processors, SPRS647, November 5, 2010 (Draft Only) Product Preview.
24. VICP Signal Processing Library for DM6446, DM 6441, DM647, and DM648 User's Guide, SPRUGJ3E, November 2009

7.3 SAAB (Microwave)

SAAB documents concerning ARC4 and ARC5 devices can be found on the shared drive S:\AEACommon1\active_safety\24GHz\Saab Transfer Documents several documentations revi

25. ARC5 SW design considerations 2009-05-25 A24R-00104 DDJX, Alexei Zernov

8 Appendix

netlib.org CLAPACK-3.2.1
netlib.org LAPACK-3.2.2
netlib.org ATLAS
numpy sourceforge.net

9 Linux and the Texas Instruments c6x family

http://www.linux-c6x.org/wiki/index.php/Main_Page

The evolution of multiple devices and multiple cores into many embedded products has led to questions of interoperability and communications between devices. Multi-processor, multi-core designs with built-in architecture support from Unix, Linux and uClinux are challenging the traditional RTOS micro-controller software support packages. RTOS vendors must support a common IPC-like communications interface to enable system integrators to include micro-controllers in current multiprocessor architecture designs. For many reasons, Linux to RTOS IPC is expected in today's technology systems.

EABI: Embedded Application Binary Interface, a requirement of DSPLINK and TI PSP Platform Support Packages.

10 Software as Texas Instruments redefines it.

Texas Instruments, as a hardware company, has several definitions to describe basic C programming concepts, some dating back to the origins of Unix and xNIX-like concepts (GNU.org).

IPS: Interprocessor Signaling (Semaphores)
PSP: Platform Software Package, device drivers for the 6747 for DSP/BIOS environments.
XDC (eXpanDed C): A command shell for GNU make support (see Richard Stallman, GNU.org).
XDM (xDAIS-DM, Digital Media) Users Guide: The xDM standard defines a uniform set of APIs across various multimedia codecs to ease integration and ensure interoperability. xDM is built over TI’s well proven eXpress DSP Algorithm Interoperability Standard (also known as xDAIS) specification. A form of binary archived container for 3rd party intellectual property, without which research and development of new algorithms would not be undertaken by businesses.


11 ARC5-B Hardware

Illustration 12: Initial ARC5-B Digital Signal Processing System

12 OMAP Software Resources.

Following the OMAP-L1x Getting Started guides through to the working example code provided for “DSPLIB/DSPLink Applications” illustrated the value of TI's solution to integrating DSP co-processing devices into multiple core design solutions. After running the pre-built executables for the “Example DSPLIB/DSPLink Application on OMAP-L1x”, building from sources further validated the build instructions and the portability of DSPLINK to new designs.

work/OMAP-L137
mnolin@linux-lap:~/work/mv_pro_5.0>
-rw-r--r-- 1 users 1804720 2010-10-13 14:59 montavista/pro/devkit/lsp/ti-davinci/linux-2.6.18_pro500/arch/arm/boot/uImage

LSP: Linux Support Package. U-Boot, Kernel, User-Bootloader and Flash Drivers.

Linux Utils 2.23.01 (2009):
• CMEM: contiguous memory manager.
• SDMA: NA for OMAP-L137
• EDMA: C64x
• VICP: C64x Video Image Co-Processing, includes vicp and irq.

The Linux Utils utility package provides the ability for user-mode applications to access the CMEM, EDMA, SDMA, and VICP utility libraries.
http://processors.wiki.ti.com/index.php/Building_The_OMAP-L137_SDK#CMEM
OMAPL137_arm_1_00_00_11/linuxutils_2_23_01/packages/ti/sdo/linuxutils/cmem/src/module

QA-C: proprietary “static” analysis tools checking for MISRA compliance.

DSPLink Application Block Diagram (http://processors.wiki.ti.com/OMAP-L137_Audio_Drivers_in_the_DSP_%2B_Linux): an illustration from Texas Instruments of the OMAP-L137 integrated as a DSP hardware accelerator or algorithm co-processing engine.


Illustration 13: DSP hardware accelerator or algorithm co-processing engine


DSPLink Application Block Diagram (http://processors.wiki.ti.com/OMAP-L137_Audio_Drivers_in_the_DSP_%2B_Linux): an illustration from Texas Instruments of the OMAP-L137 integrated as a DSP hardware accelerator or algorithm co-processing engine, with the additional flexibility of external peripherals. Currently, Long Range Radar implements an external Ethernet interface connected to the C6747 DSP for radar data retrieval.

Last login: Fri Jan 14 07:37:28 2000 on console
Linux 169.254.1.2 2.6.18_pro500-da830_omapl137_evm-arm_v5t_le #1 PREEMPT Wed Oct 13 14:58:53 EDT 2010 armv5tejl GNU/Linux
Welcome to MontaVista(R) Linux(R) Professional Edition 5.0.0 (0801921).

[email protected]:/home# lsmod
Module Size Used by
[email protected]:/home# ./loadmodules.sh
CMEMK module: built on Oct 14 2010 at 15:20:13
 Reference Linux version 2.6.18
 File /home/mnolin/work/OMAP-L137/OMAPL137_arm_1_00_00_11/linuxutils_2_23_01/packages/ti/sdo/linuxutils/cmem/src/module/cmemk.c
ioremap_nocache(0xc2000000, 12582912)=0xc3000000
allocated heap buffer 0xc3000000 of size 0x8ac000
cmem initialized 3 pools between 0xc2000000 and 0xc2c00000
dsplinkk: no version for "struct_module" found: kernel tainted.
DSPLINK Module (1.61.03) created on Date: Oct 13 2010 Time: 17:08:
[email protected]:/home# modinfo dsplinkk.ko
filename: dsplinkk.ko
license: GPL v2
depends:
vermagic: 2.6.18_pro500-da830_omapl137_evm-arm_v5t_le preempt mod_unload ARMv5 gcc-4.2


Illustration 14: DSP algorithm co-processing engine and external peripherals


[email protected]:/home# modinfo cmemk.ko
filename: cmemk.ko
license: GPL
depends:
vermagic: 2.6.18_pro500-da830_omapl137_evm-arm_v5t_le preempt mod_unload ARMv5 gcc-4.2
parm: phys_start: Start Address for CMEM Pool Memory (charp)
parm: phys_end: End Address for CMEM Pool Memory (charp)
parm: pools: List of Pool Sizes and Number of Entries, comma separated, decimal sizes (array of charp)
parm: phys_start_1: Start Address for Extended CMEM Pool Memory (charp)
parm: phys_end_1: End Address for Extended CMEM Pool Memory (charp)
parm: pools_1: List of Pool Sizes and Number of Entries, comma separated, decimal sizes, for Extended CMEM Pool (array of charp)
parm: allowOverlap: Set to 1 if cmem range is allowed to overlap memory range allocated to kernel physical mem (via mem=xxx) (int)
[email protected]:/home#

[email protected]:/home# ./call_dsplib DSPF_sp_mat_mul_cplx mat_mul_cplx_input.tx
Initializing DSPLINK...
Calling DSPLIB function...
Received response from DSP after 0.000785 seconds.
DSP completed processing in 89271 cycles.
Closing DSPLINK...
call_dsplib completed
[email protected]:/home#
[email protected]:/home# cat /proc/version
Linux version 2.6.18_pro500-da830_omapl137_evm-arm_v5t_le (mnolin@linux-lap) (gcc version 4.2.0 (MontaVista 4.2.0-16.0.32.0801914 2008-08-30)0

Illustration 15: OMAP-L137, running “Example DSPLIB/DSPLink Application on OMAP-L1x”.

Kernel modules as indicated have been built from source; the MontaVista pro5 kernel was also built on October 13th. 'Terminal', minicom console port, telnet and ssh terminals are supported.
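The console output above reports both a round-trip time (0.000785 seconds) and a DSP cycle count (89271 cycles), so the share of time spent outside DSP execution can be estimated once the DSP clock is known. The sketch below assumes a 300 MHz DSP clock. That figure is an assumption, not stated in the log, and should be replaced with the actual clock of the board under test. The function names are hypothetical.

```c
/* Convert a DSP cycle count to seconds for a given core clock in Hz. */
double cycles_to_seconds(unsigned long cycles, double clock_hz)
{
    return (double)cycles / clock_hz;
}

/* Time in the measured round trip that is not DSP execution
 * (messaging, scheduling and other overhead). */
double ipc_overhead_seconds(double round_trip_s, unsigned long cycles,
                            double clock_hz)
{
    return round_trip_s - cycles_to_seconds(cycles, clock_hz);
}
```

Under the 300 MHz assumption, 89271 cycles corresponds to roughly 0.30 ms of DSP execution, leaving roughly 0.49 ms of the 0.785 ms round trip attributable to everything else.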

13 VisionMid TMS320DM814x Software Resources

Product Status: Product Preview (PP)

An “Engineering Release” of the VisionMid software kit was released on December 2, 2010. VisionMid support is currently available only in the latest CCStudio, 4.2.1.00004, and only on Windows hosts. The required BIOS and PSP package (“DM8148 BIOS PSP”) is likewise available only with limited functionality. Attempts to build the sample test applications were unsuccessful. The work also depends on a 30-day trial copy of CCStudio, which is due to expire on January 21st.


TMS320DM814x DaVinci Digital Media Processors, TMS320DM8148 (ALP), November 5, 2010, REL-B

It is clear that VisionMid support is available only as an internal engineering release; several releases may be necessary before the performance benchmarks provided with the OMAP PSP can be executed.

Program Files\Texas Instruments\pspdrivers_02_20_00_02\docs\DM8148 Release Notes

PSP release version 02.20.00.02 is a Beta release for the DM8148 EVM. The release notes list:

DSP/BIOS Version 6.31.00.06
CCStudio Version 4.2.0.09007
CG tools 4.6.3
EDMA3, XDC Tools, IPC, and the supported drivers

The data sheets provide tables of performance and memory usage for each driver.

“Please note that at this point of time the drivers does not have any abstraction for the OS APIs and they use the OS (BIOS 6.31.00.06) inside the drivers.”

pspdrivers_02_20_00_02\packages\ti\psp\mcspi\docs: 4-channel chip select (SPIEN). TI DSP/BIOS driver with a streams interface.

CSLR: chip support register configuration (macros)
PCRM: helps to turn the clock on/off for the modules
XDC (eXpanDed C): EA1 Release; Eclipse Community Forums: DSDP Real Time Software Components (RTSC)
Eclipse DSDP (Device Software Development Platform) project?

DM8148_BIOSPSP_Userguide.pdf Installation Guide.

All-new IPC interface: IPC_Users_Guide.pdf, May 20, 2010.

14 Pseudo Code

http://en.wikipedia.org/wiki/Pseudo_code

No standard for pseudocode syntax exists, as a program in pseudocode is not an executable program. Pseudocode resembles, but should not be confused with, skeleton programs including dummy code, which can be compiled without errors. Flowcharts can be thought of as a graphical alternative to pseudocode.

See doxygen and/or the Internet.

Matlab scripts,


Illustration 16: BIOS PSP Users Guide (OMAP-L137) block driver


Alphabetical Index

BIOS
    DSP BIOS, 19
    JTAG, 21
    SYS BIOS, 19
Cortex A8, 10
DSPLIB, 8
Host Port Interface, 11
HPI
    Host Port Interface, 11
    serial peripheral interface, 11
6.3 IPC, 11
Inter-Process Communication, 10
IPC, 2
man ipc, 11
JTAG, 21
Link
    DSPLINK, 13, 18
LSP, 25
Neon, 10
Platform Support Package, 9
PSP, 9
serial peripheral interface, 11
SPI
    serial peripheral interface, 11
VICP, 25
VisionMid, 27
XDC, 9


Illustration 17: BIOS PSP driver with streaming interface (OMAP-L137)