REPORT

Artificial Intelligence Where You Need It
Delivering AI to the Cloud, Edge and Endpoint Devices
MIKE JENNETT

TOPICS: ARTIFICIAL INTELLIGENCE CLOUD EDGE INFRASTRUCTURE

SPONSORED BY

CREDIT: SHUTTER2U

Artificial Intelligence Where You Need It: Delivering AI to the Cloud, Edge and Endpoint Devices

TABLE OF CONTENTS

1. Summary
2. AI Compute-capable System on Chips (SoCs)
3. Bringing AI to the Network Edge and Device Endpoint with Arm
4. Example: Natural Language and AI/ML in Cardiac Arrest
5. Example: Keeping the Trains Running on Time
6. Conclusion
7. About Mike Jennett
8. About GigaOm
9. Copyright


1. Summary

Artificial intelligence (AI) is permeating every aspect of our personal and professional lives, but many companies have yet to grasp the power and revolutionary capabilities it brings. While public perceptions of AI can be skewed by science fiction’s (almost entirely theoretical) depiction of hyper-intelligent humanoid robots, the enterprise view can also be out of step with reality. Many still view AI as completely focused on high-performance or cloud-based computing models. In actuality, advances in hardware, software, and algorithm optimization mean that AI opportunities are now incredibly broad, and often far more down to earth.

Take Audio Analytic, which has announced an experimental baby monitor running AI-powered sound recognition on an Arm Cortex-M0+-based processor. This processor, used in devices such as bank smartcards, has an ultra-low energy footprint. The monitor performs real-time on-device analysis of a baby crying, combined with a simple LED warning light that flips from green to red when the device recognizes the baby’s cry. This alert system illustrates how to apply AI where it is most useful, taking into account constraints of networking, power consumption, and processing.

Closer to home, we see utility companies using smart meters for electricity and gas. Such meters transmit data on usage either to an individual walking by the building or directly back to company headquarters. They have built-in processing and storage capabilities, allowing them to gather and process relevant information and await a passing collector to which they can send the data packets, rather than requiring someone to approach each device, read it, and manually enter information into their logs.
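The smart-meter pattern described here (record and store locally, defer transmission until a collector is in range) can be sketched in a few lines. This is an illustrative simplification; the class, capacity, and readings are our own assumptions, not drawn from any real metering product.

```python
# Sketch of an endpoint's store-and-forward pattern: readings are
# aggregated locally and transmitted in one batch, rather than streamed.
from collections import deque


class SmartMeter:
    """Gathers readings locally and transmits them in a batch on demand."""

    def __init__(self, capacity=96):
        # Bounded local storage, e.g. one day of 15-minute readings.
        self.buffer = deque(maxlen=capacity)

    def record(self, kwh):
        # Local processing and storage only; no network connection needed.
        self.buffer.append(kwh)

    def flush_to(self, collector):
        # Called when a passing collector comes in range.
        sent = list(self.buffer)
        collector.extend(sent)
        self.buffer.clear()
        return len(sent)


meter = SmartMeter()
for reading in (1.2, 0.9, 1.5):
    meter.record(reading)

collector = []
sent_count = meter.flush_to(collector)  # whole batch sent in one pass
```

The design choice mirrors the text: the device is useful offline, and the network is touched only when a receiver is actually available.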

AI-based devices such as these are more about serious function than science fiction, demonstrating the pervasiveness of AI today. They provide functional, and sometimes critical, capabilities for their users, as well as producing huge efficiencies and savings for the companies that employ them. The fact that different use cases require unique architectural choices means that decision-makers need to understand the capabilities, patterns, and tradeoffs of AI-capable processors running in the cloud, at the edge, and on endpoints.

So, how to make the right choices? In this paper, we review processor and platform options, then consider several use cases, both to illustrate the potential of AI today and to show how to devise, define, and deploy the right AI for the job.


2. AI Compute-capable System on Chips (SoCs)

To better understand the revolution taking place in the AI space, it helps to look at the beginnings of AI and how it has transformed in recent years. While AI-based applications have been in development for decades, the original designs had limited capabilities or were confined to research environments, rather than seeing the mainstream use we observe today. The early 1980s saw the corporate world adopting “expert systems” (computer programs that emulate the human capability of answering domain-specific questions). However, these systems were expensive and did not translate into everyday, usable technology. Then in the 1990s, as computing power increased and costs fell, algorithms based on neural networks became more prevalent – but use cases were still limited to huge, workhorse machines running in data centers.

In more recent years, the advent of mobile computing and the introduction of the first smartphones have led to a redistribution of computing power. While these devices had relatively limited processing capabilities, they began to demonstrate that certain tasks and routines could move from the data center to the device. As technology has evolved, we have seen the emergence of smartphone hardware that can handle large amounts of compute in a small form factor.

The impact on AI has been profound. Initially, smartphones worked as AI delivery mechanisms connected to back-end data centers or cloud-based environments, but further hardware development has led to AI compute moving down to run directly on appropriately configured devices. For example, smartphones are now routinely used for biometric security, including fingerprint recognition and face-based unlocking. As the use of device-based AI has spread, end users have started to understand how to use real-time computing options such as Natural Language Processing (NLP) and image recognition.

Endpoint device-based AI, or endpoint AI, goes way beyond user-facing applications. From an engineering perspective, an important benefit of endpoint AI is the ability to take action on data immediately, rather than sending it to be processed centrally. In IoT scenarios, and as endpoint devices become more prevalent, the need to process on endpoints is increasingly critical. “The endpoint is the next stage of evolution of AI technology because of physical constraints, cost constraints, and the practical limitations of running all AI applications in the cloud. It simply doesn’t make sense to send all the bits for things like video and audio streaming to the cloud and back down for every situation, every endpoint,” Steve Roddy, vice president of special projects for Arm, told GigaOm’s Byron Reese recently in his Research Byte AI at the Edge.

Developers of AI-enabled applications are looking to do as much on the device as possible, eliminating the need to send data back to a centralized source. There are several reasons for this, including:

• Latency: sending data back to the cloud for processing is not viable in many scenarios. Reduction in round-trip time not only makes for more efficient response times, but in operations such as industrial manufacturing, it can be critical to safety and operations to process information instantaneously so decisions can be made in near real time. The same principle applies to connected vehicles relying on platooning data relayed from car to car or highway to car. In these scenarios, the faster harvested information can be processed using Machine Learning (ML) and the learnings sent to the people and machines that need them, the better.

• Scale and cost: with the ever-expanding IoT world increasingly reliant on AI, companies are looking at how to scale up their AI strategy. In most cases, companies rely on a multi-layered approach that combines AI computing in the cloud, at the edge, and on endpoints, balancing computing firepower with cost. Hyperscalers such as Amazon Web Services are investing strongly in their cloud and edge compute infrastructure (based on Arm Neoverse solutions, for example), giving companies and developers a more cost-effective platform for high-throughput workloads.

• Security and privacy: companies often do not want to store sensitive commercial data in the public cloud, leading to a greater focus on on-premises arrangements based around edge server solutions behind company firewalls, or on running AI-based pre-processing of sensitive data locally. As more AI is layered into edge infrastructure and as device capabilities rise, more flexible, geo-fenced AI models are likely to become commonplace for business and enterprise applications.
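The security and privacy point can be illustrated with a minimal sketch of on-device pre-processing, under the assumption that only derived, non-sensitive aggregates need to leave the device. The function, field names, and readings below are hypothetical, and a real deployment would use a vetted pseudonymization scheme rather than a truncated hash.

```python
# Sketch: endpoint pre-processing so raw data never leaves the device.
import hashlib
import statistics


def preprocess_locally(raw_samples, device_id):
    """Reduce raw readings to a non-sensitive summary before upload."""
    return {
        # Only aggregates are uploaded; the raw samples stay local.
        "mean": statistics.mean(raw_samples),
        "peak": max(raw_samples),
        # Pseudonymous identifier derived on-device from the real ID.
        "device": hashlib.sha256(device_id.encode()).hexdigest()[:12],
    }


payload = preprocess_locally([72, 75, 71, 74], "sensor-0042")
# The payload carries summary statistics and a pseudonym only.
```

The pattern keeps sensitive raw streams behind the firewall while still feeding useful signals to cloud or edge analytics.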

These issues are becoming more important as endpoint devices need to make critical, split-second decisions in use cases from autonomous vehicles to health care. By employing local processing, developers can create devices that are self-sufficient and can handle many of the tasks that used to be handled in the data center or the cloud. The need to process AI data at, or close to, the point at which it is created is key to success.


3. Bringing AI to the Network Edge and Device Endpoint with Arm

Given the multi-layered approach now possible for AI-based solutions, the key question for a given use case is: what capabilities can run on the device? Answering this requires choosing the right processor for the job, based on use case requirements and taking into account device constraints around cost, efficiency, security, and performance. In technical terms, the most intensive ML algorithms may require a combination of processing types: Central Processing Unit (CPU), Graphics Processing Unit (GPU), Neural Processing Unit (NPU), and potentially a Microcontroller Unit (MCU) – all on a single SoC. More cost-sensitive devices may require ML compute to run on a single MCU. Even then, a range of performance options exists, from 64-bit processor-style functionality down to the 32-bit MCUs used in the most energy-constrained devices.

Arm provides a broad range of chip technologies that are both highly cost-effective and powerful enough to deliver against needs at the edge and endpoints. With its rich history in low-power, cost-effective chip design, the company has become a recognized source for developing AI solutions across the devices used in a multi-layered architecture. To tackle these complex challenges, Arm’s portfolio of designs addresses a broad range of power and efficiency needs. These designs work in conjunction with the overall Arm architecture, including software design capabilities, to form a complete ecosystem for AI/ML.

For instance, Arm Neoverse CPUs are built specifically for cloud-to-network-edge processing needs, with highly efficient throughput and range-topping performance. Meanwhile, the Arm Cortex line prioritizes efficiency and cost-effective performance. Arm has built on its success with each version in the series, focusing on advancements in memory, increased dual-issue capabilities, and improved branch prediction.

Arm Cortex-A CPUs range from the latest high-performance Cortex-A77, built for heterogeneous systems with Arm’s proprietary DynamIQ microarchitecture, down to the Cortex-A5, which is available through the low-cost Arm DesignStart program for developers. With the introduction of the Cortex-A53, devices now enjoy a level of power efficiency that was once experienced only by high-end mobile devices, making AI at the endpoint affordable and easy to implement.

This range of CPUs is complemented by Arm Cortex-M MCUs, which make up the bulk of Arm shipments, are the company’s highest-efficiency embedded and IoT processors, and enable inference engines to run on the device. These span from cores with rich feature sets, including DSP and floating-point processing, to ultra-constrained MCUs capable of running on harvested energy and lasting for years on small cell batteries.
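One reason inference engines fit on such constrained cores is weight quantization. The following is a first-principles sketch, not taken from this report: production toolchains such as TensorFlow Lite automate the same idea, mapping float weights onto the 8-bit integers an MCU handles natively.

```python
# Sketch: affine int8 quantization, the technique that shrinks neural-
# network weights from 32-bit floats to 8-bit integers for MCU inference.

def quantize_int8(weights):
    """Affine quantization: w ~ (q - zero_point) * scale, q in [-128, 127]."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # guard against all-equal weights
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    return [(v - zero_point) * scale for v in q]


weights = [-0.51, 0.0, 0.27, 0.94]
q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
# Each restored value is within one quantization step of the original,
# while per-weight storage drops from 32 bits to 8.
```

The accuracy cost is bounded by the quantization step, which is why quantized models remain usable on cores with no floating-point unit at all.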

Turning to NPUs and GPUs, Arm offers a range of products to meet various development needs. Arm Ethos NPUs cover the range of AI acceleration needs, from premium mobile and AR/VR down to embedded applications in products including set-top boxes and smart TVs. At the same time, Arm Mali GPUs are optimized to run with Arm CPUs and NPUs in a heterogeneous environment, covering graphics, camera, and display needs.

All these designs work with multiple AI frameworks, including TensorFlow, TensorFlow Lite, ONNX, PyTorch, Caffe, Caffe2, MXNet, and Android NNAPI. Often, developers want to work with their favorite framework and not have to care about which architecture they are targeting. This becomes possible when there is a hardware platform broad enough to warrant building a full translation layer, and that translation layer is what Arm has built with the Arm AI Platform.

The Arm AI Platform is made up of open-source and proprietary software technology, forming a platform that connects the main AI/ML applications, algorithms, and frameworks with the broad spread of hardware technology from Arm and its ecosystem partners.

Figure 1: Arm AI Platform Framework


4. Example: Natural Language and AI/ML in Cardiac Arrest

AI has multiple applications that leverage the concepts behind Natural Language Processing (NLP) – enabling AI/ML to interpret human voices. NLP covers not just word comprehension, but also the nuances with which words are spoken, enabling devices to understand feelings such as stress. By using NLP, AI devices can better answer questions posed by humans, just as one would during a regular conversation, where the tone is just as important as what is said. Emergency healthcare is one area where this is critical.

Corti is an NLP-based device used by emergency call responders. It is designed to recognize cardiac arrest faster than humans can, using AI software that listens in on emergency calls. Corti uses inference to pick up on verbal and non-verbal communication patterns in the caller’s voice. Based on the various responses, the AI software can also prompt the emergency responder to ask specific questions and check for consciousness and breathing. In testing, it has shown a 95% success rate in cardiac arrest recognition, as opposed to a 73% rate for humans.

The hardware that enables this ground-breaking product is based on a partnership between Nvidia and Arm: it is powered by the Nvidia TX2 with a quad-core Arm Cortex-A57 CPU. The device itself is roughly the size of a Google Home speaker and runs the TX2 module atop the Nvidia J140 carrier board. It sits on the dispatcher’s desk and connects to the telephone’s audio stream, allowing it to listen in on emergency calls and detect cardiac arrest in real time. The partnership between Nvidia and Arm allows for a compact, powerful, and cost-effective tool for emergency responders, one that can quickly identify crucial elements of real-world situations and allow those responders to act upon the data with a reliable AI assistant.


5. Example: Keeping the Trains Running on Time

When it comes to timetables, trains are always the first things that come to mind. Keeping track of train arrival times can be a daunting task, especially with the number of different types of trains that share tracks throughout the rail system. To tackle this problem, a team of data scientists at Silicon Valley Data Science decided to use technology to track arrival times for the Caltrain system, the main commuter railway in the San Francisco Bay Area. After having used publicly available data from Caltrain for some time, they embarked on a project to use Raspberry Pi cameras to track Caltrain trains and determine where they were on the tracks. It was a simple concept but marred with issues, because other trains (as well as cars and trucks) were also picked up by the cameras. To solve this, the team decided to use AI/ML to train their system to exclude any vehicle that was not a Caltrain train.

The team accomplished this through the use of Google’s TensorFlow ML platform, compiled for the Raspberry Pi’s 32-bit Arm chip. Using Arm Cortex-A53- and Cortex-M-based sensors, the team, despite limited knowledge of TensorFlow, was able to quickly set up TensorFlow on the Arm chip to detect Caltrain vehicles. The model was built without the need for an expensive GPU, and the entire device cost less than $130.

All image classification was performed on the Raspberry Pi, both to keep costs down and to ensure there was no need for data to be sent back to a central server. By using Arm technology, the data scientists were able to identify Caltrain vehicles with 97% accuracy on the training data and 90% accuracy on the test data.


6. Conclusion

Today, AI is used to process data in the cloud, in the network, and across billions of endpoint devices. As 5G continues its rollout, smartphones will provide a ready base of devices equipped to run AI locally and offload some of the heaviest computing tasks to the network and cloud. While the smartphone is leading the charge, other devices are fully expected to join it to form an overarching matrix of AI.

As AI becomes a generalized compute workload, one thing will help accelerate it into the future even faster: the majority of AI compute operations today run on the Arm architecture. Beyond the smartphone, a market running almost entirely on Arm-based chips, Arm’s ecosystem partners (more than 1,000 of them) are today addressing markets ranging from cloud computing to deeply embedded applications and solutions.

This architectural platform provides an underlying thread for developers, so they can design once and deploy their code across multiple layers, helping to reduce the potential complexity tax of having to deal with multiple architectures. As a result, we are seeing an erosion of the traditional idea of a small amount of compute locally, with larger computers in the background in a data center. Instead, we have the principle of moving AI compute to where it makes the most sense – be it the cloud, edge servers, endpoint devices, or some combination thereof.

This evolution does not mean that data centers, or the cloud for that matter, are a thing of the past; rather, they are part of an overall infrastructure for AI that allows data to be processed where and when it is needed. This allows organizations to determine the best fit for their AI needs. Through a varied combination of CPUs, GPUs, NPUs, and MCUs that support a myriad of architectures, plus developer-friendly software, the Arm family of products is helping to enable the AI and ML revolution by providing developers what they need in a fashion they can easily consume.


7. About Mike Jennett

Mike Jennett utilizes over 20 years of business and technology experience to guide senior executives in achieving sustainable business outcomes. Mike was Vice President of the Enterprise Mobility, IoT, and Digital Strategy practices at IDC, where his research explored strategies to empower technology leaders seeking to invent, integrate, and support digital technology initiatives, as well as focusing on enterprise mobility and emerging IoT technology strategies.

Previously, Mike led Hewlett Packard’s enterprise mobility transformation. He was responsible for all aspects of the development, deployment, and integration of HP IT’s mobile applications and infrastructure on a worldwide basis, evangelizing HP developments and products and shaping the company’s overall go-to-market strategy as part of HP’s cross-business leadership team. Prior to his focus on HP’s mobility transformation, Mike led strategy for HP IT and managed integration for many of HP’s mergers and acquisitions. His 13-year tenure at HP included leadership roles in mobility, web and enterprise applications, and emerging technologies. Before joining HP, Mike served as COO of Spotlight Studios, an award-winning Internet development firm serving venture-funded startups and Fortune 500 companies.


8. About GigaOm

GigaOm provides technical, operational, and business advice for IT’s strategic digital enterprise and business initiatives. Enterprise business leaders, CIOs, and technology organizations partner with GigaOm for practical, actionable, strategic, and visionary advice for modernizing and transforming their business. GigaOm’s advice empowers enterprises to successfully compete in an increasingly complicated business atmosphere that requires a solid understanding of constantly changing customer demands.

GigaOm works directly with enterprises both inside and outside of the IT organization to apply proven research and methodologies designed to avoid pitfalls and roadblocks while balancing risk and innovation. Research methodologies include, but are not limited to, adoption and benchmarking surveys, use cases, interviews, ROI/TCO, market landscapes, strategic trends, and technical benchmarks. Our analysts possess 20+ years of experience advising a spectrum of clients, from early adopters to mainstream enterprises.

GigaOm’s perspective is that of the unbiased enterprise practitioner. Through this perspective, GigaOm connects with engaged and loyal subscribers on a deep and meaningful level.


9. Copyright

© Knowingly, Inc. 2020. “Artificial Intelligence Where You Need It” is a trademark of Knowingly, Inc. For permission to reproduce this report, please contact [email protected].
