ECC 2019
Winterthur, Switzerland
Internet Penetration (2018)
Source: United Nations / International Telecommunications Union, USA Census Bureau. Pew Research (USA), China
Internet Network Information Center (China), Islamic Republic News Agency / InternetWorldStats / Bond estimates
(Iran), Bond estimates based on IAMAI data (India), & APJII (Indonesia).
Source: Photo Creation Per InfoTrends Digital Imaging
Reporter’s State of the Industry 2018. Instagram releases.
Annual New Photos Taken (2017)
Ping example (laptop):
Pinging google.com with 32 bytes of data:
Reply from 172.217.168.46: bytes=32 time=4ms TTL=56
Is the cloud the only option?
Source: Data Age 2025, sponsored by Seagate with data from IDC Global DataSphere, Nov 2018
Machine learning
▪ Extract relevant information from data
▪ High computational demand
Edge computing
▪ Move computation to data source
▪ Reduce latency, bandwidth, costs
Image recognition, speech recognition
Machine learning is difficult to implement:
• Requires large data sets
• Training is complex
• Resulting models vary in size and complexity
GPUs
Cell phone
Sensor device?
[Xu et al., Nature Electronics, Apr 18]
Major memory and computing challenge!
Transmit: 10 - 100 mW
Process: 1 - 10 mW
Sense: 100 µW - 2 mW
Battery-powered operation requires sub-mW consumption
MCU
(Cortex M)
IOs
memory
Low rate data uplink
SW update, commands
Sensor Bandwidth   Computational Demand   Computing Platform
~ 1 bps            < 1 MOPS               e.g. Cortex M0
~ 1 kbps           ~ 10 MOPS              e.g. Cortex M3
~ 100 kbps         ~ 100 MOPS             e.g. Cortex M4/M4F
~ 1000 kbps        ~ 1000 MOPS            ???
Started by UC-Berkeley in 2010
Open standard governed by the RISC-V Foundation
▪ Necessary for the continuity
▪ Extensions are still being developed
Defines 32-, 64- and 128-bit ISAs
▪ No implementation, just the ISA
▪ Different RISC-V implementations
(both open and closed source) are available
Spec separated into “extensions”
I Integer instructions
E Reduced number of registers
M Multiplication and Division
A Atomic instructions
F Single-Precision Floating-Point
D Double-Precision Floating-Point
C Compressed Instructions
X Non Standard Extensions
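The extension letters above are typically concatenated after the base width to name a configuration, e.g. `rv32imc`. A minimal sketch of decoding such a string, assuming only the single-letter extensions listed on this slide; the function name `parse_isa` and the extension name `xdemo` are hypothetical, and real toolchains accept more letters and separators than this handles:

```python
# Extension letters and descriptions exactly as listed on the slide.
EXTENSIONS = {
    "i": "Integer instructions",
    "e": "Reduced number of registers",
    "m": "Multiplication and Division",
    "a": "Atomic instructions",
    "f": "Single-Precision Floating-Point",
    "d": "Double-Precision Floating-Point",
    "c": "Compressed Instructions",
}


def parse_isa(isa: str):
    """Split a string such as 'rv32imc' into (register width, extension names).

    Simplified: handles only the letters in EXTENSIONS, plus one trailing
    non-standard extension introduced by 'x'.
    """
    isa = isa.lower()
    if not isa.startswith("rv"):
        raise ValueError("ISA string must start with 'rv'")

    # Read the register width that follows 'rv'.
    pos = 2
    digits = ""
    while pos < len(isa) and isa[pos].isdigit():
        digits += isa[pos]
        pos += 1
    if digits not in ("32", "64", "128"):
        raise ValueError(f"unsupported register width: {digits!r}")

    names = []
    rest = isa[pos:]
    for i, letter in enumerate(rest):
        if letter == "x":
            # Everything from 'x' onward names a non-standard extension.
            names.append("Non-standard extension: " + rest[i:])
            break
        if letter not in EXTENSIONS:
            raise ValueError(f"unknown extension letter: {letter!r}")
        names.append(EXTENSIONS[letter])
    return int(digits), names


if __name__ == "__main__":
    print(parse_isa("rv32imc"))
    print(parse_isa("rv32imcxdemo"))  # 'xdemo' is a made-up extension name
```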
Ph.D. in Electrical Engineering and Information Technology, ETHZ (Zurich)
Supervisors: Prof. Luca Benini and Prof. Lothar Thiele
Topic: Design and Specification of Batteryless Sensing Systems
Currently R&D engineer at Miromico AG (Zurich)
Andrés Gómez
Miromico was founded in 2002 as an ETH spin-off in Zurich, Switzerland
Our customers include international corporations such as IBM, Global Foundries, Infineon,
Roche, Coop Group, as well as national corporations such as Post, Belimo, Tamedia, etc.
“Our goal is to enable the technical realization of our customer's visions.”
PULP Platform from ETHZ and UniBo
▪ Open source implementations (single and multi-core)
▪ Includes peripherals, interconnect and HW accelerators
Joint supervision of master projects
Developing machine learning applications (Simon Benninger)
Work currently submitted for publication
Prof. Luca Benini
Michele Magno
Himax camera HM01B0
▪ Resolution: 324x244
▪ Monochrome
Omnidirectional mic. MP34DT
▪ Low power, digital MEMS mic.
Data Acquisition → Processing: data transfer (range of kB)
Processing → Transmission: data transfer (range of bytes)
GreenWaves Technologies GAP8
▪ 9 RISC-V cores (based on PULP)
▪ Fabric controller + 8 cluster cores
▪ No floating-point units
▪ L1 memory: 16 + 64 kB
▪ L2 memory: 512 kB
LoRa & LoRaWAN
▪ Range of up to 15 km
▪ 0.3 – 50 kbps
▪ Limited packets per day
→ LoRa & LoRaWAN node
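To see why packets per day are limited, it helps to estimate how long one packet occupies the channel. The sketch below uses the time-on-air formula published in the Semtech SX127x datasheets. The 1% duty cycle, the 10-byte payload and the function names are assumptions for illustration, not values from the talk. Note that `payload_bytes` is the full PHY payload, so LoRaWAN header overhead has to be included by the caller.

```python
import math


def lora_time_on_air(payload_bytes, sf, bw_hz=125_000, coding_rate=1,
                     preamble_symbols=8, explicit_header=True, crc=True):
    """Return the LoRa packet time on air in seconds.

    coding_rate: 1..4, meaning 4/5 .. 4/8.
    """
    t_sym = (2 ** sf) / bw_hz
    # Low data rate optimisation is required when a symbol exceeds 16 ms.
    de = 1 if t_sym > 0.016 else 0
    h = 0 if explicit_header else 1
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * h
    denominator = 4 * (sf - 2 * de)
    payload_symbols = 8 + max(
        math.ceil(numerator / denominator) * (coding_rate + 4), 0)
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym


def max_packets_per_day(time_on_air_s, duty_cycle=0.01):
    """Packets per day allowed by a duty-cycle limit (1% is an assumption)."""
    return int(86_400 * duty_cycle / time_on_air_s)


if __name__ == "__main__":
    for sf in (7, 12):
        toa = lora_time_on_air(payload_bytes=10, sf=sf)
        print(f"SF{sf}: {toa * 1e3:.1f} ms on air, "
              f"{max_packets_per_day(toa)} packets/day")
```

Higher spreading factors extend range but lengthen each packet, which is why sending a few bytes of processed results instead of raw sensor data matters on this link.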
No floating-point units
• Energy-efficient
• Instead: fixed-point computations
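As an illustration of what fixed-point computation means in practice, here is a minimal sketch of Q15 arithmetic (16-bit values with 15 fractional bits). The Q15 format and the helper names are assumptions chosen for the example; the talk does not state which format was used on GAP8.

```python
Q = 15            # number of fractional bits (Q15 is an assumed format)
SCALE = 1 << Q    # 32768


def to_q15(x: float) -> int:
    """Quantise a float in [-1, 1) to a saturated 16-bit Q15 integer."""
    v = int(round(x * SCALE))
    return max(-SCALE, min(SCALE - 1, v))


def from_q15(v: int) -> float:
    """Convert a Q15 integer back to a float."""
    return v / SCALE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values; the product has 30 fractional bits,
    so shift right by 15 to return to Q15."""
    return (a * b) >> Q


def q15_dot(xs, ws) -> int:
    """Dot product with a wide accumulator and a single final shift,
    which loses less precision than shifting after every product."""
    acc = 0
    for x, w in zip(xs, ws):
        acc += x * w
    return acc >> Q


if __name__ == "__main__":
    a, b = to_q15(0.5), to_q15(0.25)
    print(from_q15(q15_mul(a, b)))
    print(from_q15(q15_dot([to_q15(0.5), to_q15(-0.25)],
                           [to_q15(0.5), to_q15(0.5)])))
```

Only integer multiplies, adds and shifts are needed, which is what lets a core without a floating-point unit run neural-network layers.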
No automatic data transfer between L1 and L2 memory
• Energy-efficient
• Instead: explicit memory management
→ GreenWaves Technologies AutoTiler
Convolutional Layer Tiling [Palossi et al.]
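The idea behind tiling can be sketched as follows: split the layer output into horizontal strips so that each strip's input rows, output rows and the weights fit in L1 together. This is not the AutoTiler's actual algorithm. It is a simplified illustration that assumes 16-bit data, no padding, stride 1, the whole 64 kB of cluster L1 being available, and a hypothetical layer with 8 output channels applied to the 324x244 monochrome camera frame.

```python
def rows_per_tile(width, height, in_ch, out_ch, kernel,
                  l1_bytes=64 * 1024, bytes_per_elem=2):
    """Largest number of output rows per tile that fits in L1.

    Returns 0 if not even a single output row fits.
    """
    out_w = width - kernel + 1
    out_h = height - kernel + 1
    weight_bytes = kernel * kernel * in_ch * out_ch * bytes_per_elem
    best = 0
    for rows in range(1, out_h + 1):
        in_rows = rows + kernel - 1  # halo rows needed by the kernel
        in_bytes = in_rows * width * in_ch * bytes_per_elem
        out_bytes = rows * out_w * out_ch * bytes_per_elem
        if weight_bytes + in_bytes + out_bytes > l1_bytes:
            break
        best = rows
    return best


def tiles(height, kernel, rows):
    """Yield (out_start, out_end, in_start, in_end) row ranges per tile.

    End indices are exclusive. Each tile would be copied from L2 to L1,
    processed, and its output copied back.
    """
    out_h = height - kernel + 1
    for start in range(0, out_h, rows):
        end = min(start + rows, out_h)
        yield start, end, start, end + kernel - 1


if __name__ == "__main__":
    r = rows_per_tile(width=324, height=244, in_ch=1, out_ch=8, kernel=5)
    plan = list(tiles(height=244, kernel=5, rows=r))
    print(f"{r} output rows per tile, {len(plan)} tiles")
```

A real tiler also has to reserve room for double buffering and stack, so the usable budget is smaller than the full L1 size assumed here.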
GAP8
▪ Active power
▪ Sleep power
▪ Operating point
→ Voltage and frequency
→ Number of cores
System
Transmit, Process, Sense
7 x
Setup: 5x5 conv, Vgap @ 1.0 V, FC @ 50 MHz, CL @ 150 MHz
Setup: VDD @ 2.8 V, Vgap @ 1.0 V, FC @ 50 MHz, CL @ 150 MHz
Task          Time [ms]
Acquisition   160
Processing    9.3
LoRaWAN       1480
Other         2551
Total         4203
Setup: VDD @ 2.8 V, Vgap @ 1.0 V, FC @ 50 MHz, CL @ 150 MHz
Task          Time [ms]
Acquisition   158.3
Processing    9.8
LoRaWAN       22.3
Other         31.7
Total         222.1
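A short sketch that turns the second timing breakdown into per-task shares of the active time. The numbers are the ones from the table; the variable and function names are illustrative.

```python
# Task durations in milliseconds, copied from the second breakdown above.
tasks_ms = {
    "Acquisition": 158.3,
    "Processing": 9.8,
    "LoRaWAN": 22.3,
    "Other": 31.7,
}


def time_shares(tasks):
    """Return (total time, {task: percentage of total})."""
    total = sum(tasks.values())
    shares = {name: 100.0 * t / total for name, t in tasks.items()}
    return total, shares


if __name__ == "__main__":
    total, shares = time_shares(tasks_ms)
    print(f"Total: {total:.1f} ms")
    for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name:12s} {pct:5.1f} %")
```

In this breakdown, acquisition is the largest contributor to active time, while processing is the smallest.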
Setup
▪ System VDD: 2.8 V
▪ GAP8 DC/DC: 0.8 V
▪ FC/CL off
System sleep power
▪ FMLR: 2.3 μW
▪ Memory: 258.6 μW
▪ GAP8: 153.3 μW
▪ Other: 59.4 μW
▪ Total: 473.7 μW
Sleep Power Breakdown: FMLR 0.5%, Memory 54.6%, GAP8 32.4%, Other 12.5%
Based on prototype measurements
Average power consumption:
$$P_{avg} = \frac{E_{active} + P_{sleep}\,(t_{interval} - t_{active})}{t_{interval}}$$
Assumptions
• Constant battery voltage of 3.7 V
• No self-discharge
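The average-power model and the two assumptions above translate directly into a battery-lifetime estimate. In the sketch below, the sleep power (473.7 µW) and active time (222.1 ms) come from the measurements in the slides. The active energy of 10 mJ, the 10-minute interval and the 1000 mAh battery capacity are hypothetical placeholders, since the talk transcript does not give them.

```python
def average_power_w(e_active_j, p_sleep_w, t_active_s, t_interval_s):
    """Average power over one wake-up interval, in watts.

    P_avg = (E_active + P_sleep * (t_interval - t_active)) / t_interval
    """
    if t_interval_s < t_active_s:
        raise ValueError("interval must be at least as long as the active time")
    return (e_active_j + p_sleep_w * (t_interval_s - t_active_s)) / t_interval_s


def battery_lifetime_days(capacity_mah, p_avg_w, v_batt=3.7):
    """Lifetime in days, assuming constant voltage and no self-discharge."""
    energy_j = capacity_mah / 1000.0 * 3600.0 * v_batt
    return energy_j / p_avg_w / 86_400.0


if __name__ == "__main__":
    p_avg = average_power_w(
        e_active_j=10e-3,      # hypothetical energy per active phase
        p_sleep_w=473.7e-6,    # measured system sleep power
        t_active_s=0.2221,     # measured active time
        t_interval_s=600.0,    # hypothetical: one measurement every 10 min
    )
    days = battery_lifetime_days(capacity_mah=1000, p_avg_w=p_avg)
    print(f"P_avg = {p_avg * 1e6:.1f} uW, lifetime = {days:.0f} days")
```

With long intervals the sleep term dominates the numerator, so the average power approaches the sleep power regardless of how efficient the active phase is.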
Amount of data generated by digital devices is increasing
Local data processing can reduce latency, communication costs
▪ Challenging to keep energy and active time within constraints
We can use RISC-V hardware to design ultra-low power systems
▪ Take advantage of parallel processing and hardware accelerators
Edge computing will make next generation LoRa-based devices smarter
coming soon