TigerSHARC CLU: Closer look at the XCORRS
M. Smith, University of Calgary, Canada, smithmr@ucalgary.ca
Post on 21-Dec-2015
Overview
Recap GPS correlation
Look at XCORRS instruction in detail
This was part of Take home quiz for 5005
Additional information on the web:
Xcorrs.asm – assembly code discussed in class
Xmain.cpp – demonstrates the use of the xcorrs.asm code
XcorrsTest.cpp – demonstrates testing of all the functions being used
Additional correlation presentations (not XCORRS) from Analog Devices developers
In 2005, we pointed out many errors in the TigerSHARC XCORRS explanation – if my figures are not the same as in the manual, then they fixed the manual errors
GPS Positioning Concepts
(1)
For now make 2 assumptions:
We know the distance to each satellite
We know where each satellite is
With this information from 2 satellites, you know you are on a “plane of intersection”.
Require 3 satellites for a 3-D position in this “ideal” scenario
Require 4 satellites to account for local receiver clock drift.
Determining Time
Use the PRN code to determine time
Use time to determine distance to the satellite
distance = speed of light * time
(1)
Signal sent by satellite
Signal received by you
You know the signal sent
Perform correlations till you get a match
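The relation distance = speed of light * time can be checked with a few lines of Python. This is only a minimal sketch: the 67 ms travel time is a hypothetical value picked for illustration, not a figure from the slides, and the function name is made up for this example.

```python
# Speed of light in vacuum, in metres per second (exact SI value).
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_delay(delay_s):
    """Convert a signal travel time in seconds into a distance in metres."""
    return SPEED_OF_LIGHT_M_PER_S * delay_s


# Hypothetical travel time found by matching the received PRN code.
travel_time_s = 0.067
distance_m = distance_from_delay(travel_time_s)
print(f"{distance_m / 1000:.0f} km")
```

In a real receiver the travel time comes from the correlation match described on this slide, and the local clock error has to be removed using a fourth satellite.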
The practice
Suppose we have the following vector, for example in-phase and out-of-phase data gathered over an antenna from a satellite. Gain issues scale it by 16:
-16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, -16-16j, 16+16j, 16+16j, etc
Question – if the original data from the satellite had this form:
-1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, -1-j, 1+j, 1+j, …
How much is the satellite data delayed? FOR THIS EXAMPLE …….. 0, 3, 6, 9, 12 etc
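The question on this slide can be answered by brute force: correlate the received vector against the known pattern at every shift and see where the result peaks. The sketch below is an illustration only. It assumes a circular correlation over 12 samples and multiplies by the complex conjugate of the reference, which is the usual textbook convention and not necessarily what XCORRS does in hardware.

```python
# One period of the known satellite pattern from the slide.
PRN_PERIOD = [-1 - 1j, 1 + 1j, 1 + 1j]

reference = PRN_PERIOD * 4                 # what the satellite sent
received = [16 * x for x in reference]     # same data, scaled by the x16 gain


def correlate_at_shift(rx, ref, shift):
    """Circular correlation of rx against ref for a single shift."""
    n = len(rx)
    return sum(rx[(i + shift) % n] * ref[i].conjugate() for i in range(len(ref)))


# Real part of the correlation for every possible shift.
scores = [correlate_at_shift(received, reference, s).real
          for s in range(len(received))]

best = max(scores)
peak_shifts = [s for s, value in enumerate(scores) if value == best]
print(peak_shifts)
```

Because the pattern repeats every 3 samples, every multiple of 3 gives the same maximum, which is why the slide lists 0, 3, 6, 9, 12 as possible delays.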
Tackle the issue with FIR
First – modify the correlation function to handle complex values
Ignore that issue at the moment
Each tap goes from 1 add + 1 multiplication + 2 memory fetches to 3 adds + 4 multiplications + 4 memory fetches
Imagine 1024 data points + 1024 PRN values
Need to do 1024 FIRs, each of 1024 taps
We know how to optimize to do 2 taps every cycle (one in X and one in Y)
Cycle count is 1024 * 512 cycles, about 1 ms at 500 MHz
XCORRS can do 8 * 16 taps each cycle in each compute block – 128 times faster
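The cycle estimate on this slide is plain arithmetic and can be reproduced as follows. The variable names are invented for this sketch, and the per-block comparison assumes the optimized FIR does one tap per cycle in each of the X and Y compute blocks, as the slide describes.

```python
DATA_POINTS = 1024      # received samples
PRN_TAPS = 1024         # taps in each FIR
CLOCK_HZ = 500e6        # 500 MHz core clock

# Every output point needs PRN_TAPS multiply-accumulates.
total_taps = DATA_POINTS * PRN_TAPS

# Optimized FIR: 2 taps per cycle (one in X, one in Y).
baseline_cycles = total_taps // 2
baseline_ms = baseline_cycles / CLOCK_HZ * 1e3

# XCORRS: 8 taps for each of 16 correlations, per compute block.
xcorrs_taps_per_block = 8 * 16

print(baseline_cycles, round(baseline_ms, 3), xcorrs_taps_per_block)
```

The result is 1024 * 512 cycles, slightly over 1 ms at 500 MHz, against 128 taps per cycle per block for XCORRS.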
Where does the CLU fit in?
XCORRS definition
THEORY: Mathematical definition
Uses registers
TR -- accumulate
D -- 8 data?
C -- 1 coefficient?
And something called CUT – essentially a window operation
fcut = 0 -- don’t use
2005 Lab. 4: Satellite data
Quad fetch brings in 8 complex values, 8 bits each
Pattern here is -1 + 0j, 1 + 0j, 1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j, ……….
PRN code – 2 bit complex number
Seems strange to have two dummy bits
But actually makes sense
PRN: -1 - 1j, 1 + j, 1 + j, -1 - 1j, 1 + j, 1 + j, ……….
+1, -1 are associated with the PSK – more in another lecture
Problem: BINARY means 1 and 0, so how do we represent 1 and -1?
-1 values are stored as 1’s, +1 values are stored as 0’s (DAMY)
PRN
PRN
0x3 value goes in as C15 and C16
0011 -- C15 = -1 - j, C16 = +1 + j
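The storage rule from the PRN slides (a -1 is stored as a 1 bit, a +1 as a 0 bit, two bits per complex coefficient) can be sketched as a pair of packing helpers. These function names are hypothetical, and which bit of each pair holds the real part is an assumption made here; the 0x3 example does not distinguish the two. The lowest bit pair is taken as the lowest-numbered coefficient, matching 0011 giving C15 = -1 - j and C16 = +1 + j.

```python
def encode_prn(coeffs):
    """Pack +/-1 +/-j coefficients into an integer, 2 bits each, lowest first."""
    word = 0
    for k, c in enumerate(coeffs):
        re_bit = 1 if c.real < 0 else 0     # -1 is stored as a 1 bit
        im_bit = 1 if c.imag < 0 else 0
        # Assumption: real bit is the upper bit of the pair.
        word |= ((re_bit << 1) | im_bit) << (2 * k)
    return word


def decode_prn(word, count):
    """Unpack `count` 2-bit coefficients, lowest bit pair first."""
    coeffs = []
    for k in range(count):
        pair = (word >> (2 * k)) & 0b11
        re = -1 if pair & 0b10 else 1
        im = -1 if pair & 0b01 else 1
        coeffs.append(complex(re, im))
    return coeffs


# The slide's example: 0x3 holds two coefficients, -1 - j then +1 + j.
print(decode_prn(0x3, 2))
```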
Loading the THR registers
Standard XCORRS instruction
Lower 46 bits of THR1:0
R7:4
TR0, TR1, TR2 ……. TR15
TR15:0 = XCORRS(R7:4, THR3:0)
Doing 8 complex taps of 16 correlations at each cycle
TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
………..
TR15 += D7 * C7 + D6 * C6 + … 8 taps
64 taps each cycle – on both X and Y compute blocks – if set up properly
128 taps each cycle – these are “complex taps”, compared to 2 real taps / cycle after Lab. 3
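The accumulator equations on this slide follow one indexing pattern: accumulator TRk pairs data value Dj with coefficient C(j + 15 - k). The Python model below is a sketch built only from those slide equations. It is not the hardware definition from the manual, it ignores bit widths and saturation, and the function name is invented for this example.

```python
def xcorrs_model(tr, data, coeffs):
    """Add 8 taps to each of 16 accumulators, following the slide equations.

    tr     -- 16 accumulators TR0..TR15 (updated in place)
    data   -- 8 data values D0..D7
    coeffs -- 23 coefficients C0..C22
    """
    assert len(tr) == 16 and len(data) == 8 and len(coeffs) == 23
    for k in range(16):
        for j in range(8):
            # TRk += Dj * C(j + 15 - k), e.g. TR0 gets D7 * C22.
            tr[k] += data[j] * coeffs[j + 15 - k]
    return tr


# Check the leading term of each row: with only D7 non-zero, TRk = C(22 - k).
tr = xcorrs_model([0] * 16, [0, 0, 0, 0, 0, 0, 0, 1], list(range(23)))
print(tr)
```

Using the coefficient index as its value makes the output directly readable as "which coefficient met D7", running from C22 for TR0 down to C7 for TR15.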
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
Because of offsets, sometimes we must only use “some of the taps”
TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
………..
TR14 += D7 * C8 + D6 * C7 … 2 taps
TR15 += D7 * C7 … 1 tap
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
TR0 += D7 * C22 + D6 * C21 … 8 taps
TR1 += D7 * C21 + D6 * C20 … 7 taps
………..
TR7 += D7 * C15 … 1 tap
TR8 += 0 … 0 taps
………..
TR15 += 0 … 0 taps
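The tap counts shown for CUT -7 and CUT -15 are consistent with a simple rule: a term Dj * C(j + 15 - k) is kept only when its coefficient index is at least the magnitude of the CUT value. That rule is an inference from the patterns on these slides, not a statement taken from the manual, and the sketch below only counts taps under that assumption.

```python
def taps_kept(k, cut):
    """Count taps kept for accumulator TRk under a negative CUT value.

    Inferred rule (an assumption): keep Dj * C(j + 15 - k) only when the
    coefficient index j + 15 - k is >= -cut.
    """
    threshold = -cut
    return sum(1 for j in range(8) if j + 15 - k >= threshold)


counts_cut7 = [taps_kept(k, -7) for k in range(16)]
counts_cut15 = [taps_kept(k, -15) for k in range(16)]
print(counts_cut7)
print(counts_cut15)
```

With CUT -7 the count drops from 8 taps to 1 tap over TR9 to TR15, and with CUT -15 it drops from 8 taps at TR0 to 1 tap at TR7 and is 0 from TR8 onward.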
TR15:0 = XCORRS(R7:4, THR3:0) (CUT +7?)
TR0 += 0 … 0 taps
TR1 += D0 * C14 … 1 tap
………..
TR7 += D6 * C14 + D5 * C13 + … 7 taps
TR8 += D7 * C14 + D6 * C13 + … 8 taps
………..
TR15 += D7 * C7 + D6 * C6 + … 8 taps
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -15)
TR0 += D7 * C22 + D6 * C21 … 8 taps
TR1 += D7 * C21 + D6 * C20 … 7 taps
………..
TR7 += D7 * C15 … 1 tap
TR8 += 0 … 0 taps
………..
TR15 += 0 … 0 taps
TR15:0 = XCORRS(R7:4, THR3:0) (CUT -7)
TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
………..
TR14 += D7 * C8 + D6 * C7 … 2 taps
TR15 += D7 * C7 … 1 tap
TR15:0 = XCORRS(R7:4, THR3:0)
TR0 += D7 * C22 + D6 * C21 + … 8 taps
TR1 += D7 * C21 + D6 * C20 + … 8 taps
………..
TR15 += D7 * C7 + D6 * C6 + … 8 taps
64 taps each cycle – on both X and Y compute blocks – if set up properly
128 taps each cycle – these are “complex taps”, compared to 2 real taps / cycle after Lab. 3
Problem at this point -- THR3:2 empty
Need to bring in more PRN values
TR15:0 = XCORRS(R7:4, THR3:0) (CUT +15)
TR0 += 0 … 0 taps
TR1 += D0 * C14 … 1 tap
………..
TR7 += D6 * C14 + D5 * C13 + … 7 taps
TR8 += D7 * C14 + D6 * C13 + … 8 taps
………..
TR15 += D7 * C7 + D6 * C6 + … 8 taps
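The CUT +15 pattern is the mirror image of the negative case: the terms that survive are those whose coefficient index is below the CUT value. As before, this rule is inferred from the tap lists on the slide and is an assumption, not the manual's wording. The sketch lists the surviving (data index, coefficient index) pairs so they can be compared with the slide line by line.

```python
def kept_terms(k, cut):
    """List (j, c) pairs kept for TRk under a positive CUT value.

    Each pair stands for the product Dj * Cc, listed from D7 downward.
    Inferred rule (an assumption): keep the term only when c < cut.
    """
    terms = [(j, j + 15 - k) for j in range(7, -1, -1)]
    return [(j, c) for j, c in terms if c < cut]


counts = [len(kept_terms(k, 15)) for k in range(16)]
print(counts)
print(kept_terms(1, 15))      # the single surviving term for TR1
print(kept_terms(7, 15)[:2])  # the first two terms for TR7
```

The first surviving term is D0 * C14 for TR1, D6 * C14 for TR7 and D7 * C14 for TR8, in line with the rows shown above.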
Final Result
Maximum correlation occurs every 3 shifts – which is what we expect
Is it the correct result?
Correlation – result expected
In step:
-1 + 0j, 1 + 0j, 1 + 0j, … 16 times
with
-1 - j, 1 + j, 1 + j, … 16 times
-1 * -1 + 1 * 1 + 1 * 1 + … = 48 = 0x30 -- Real component
Out of step:
-1 + 0j, 1 + 0j, 1 + 0j, … 16 times
with
1 + j, 1 + j, -1 - j, … 16 times
-1 * 1 + 1 * 1 + 1 * -1 + … = -16 = -0x10 = 0xFFF0
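The two expected values can be checked numerically. The sketch below sums the real part of the element-wise products over 16 repetitions of each 3-sample pattern. Because the data here is purely real, the real part is the same whether or not the PRN value is conjugated, so the sketch does not need to assume either convention. The 16-bit two's-complement view of the negative result is obtained by masking with 0xFFFF.

```python
REPEATS = 16

data = [-1 + 0j, 1 + 0j, 1 + 0j] * REPEATS
prn_in_step = [-1 - 1j, 1 + 1j, 1 + 1j] * REPEATS
prn_out_of_step = [1 + 1j, 1 + 1j, -1 - 1j] * REPEATS


def real_correlation(d, c):
    """Real component of the sum of element-wise products."""
    return int(sum((x * y).real for x, y in zip(d, c)))


in_step = real_correlation(data, prn_in_step)
out_of_step = real_correlation(data, prn_out_of_step)

# Show each result as a 16-bit two's-complement hex value.
in_step_hex = format(in_step & 0xFFFF, "04X")
out_of_step_hex = format(out_of_step & 0xFFFF, "04X")
print(in_step, in_step_hex, out_of_step, out_of_step_hex)
```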
Final Result
1) Now have correlation values for 16 shifts in TR registers – store to external memory
Repeat for all other necessary shifts – find the maximum
2) Now make parallel in SISD mode
3) Now make parallel in SIMD