
SCIENCE ADVANCES | RESEARCH ARTICLE

MATERIALS SCIENCE

International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science, Tsukuba, Ibaraki 305-0044, Japan.
*Corresponding author. Email: [email protected]

Tsuchiya et al., Sci. Adv. 2018;4 : eaau2057 7 September 2018

Copyright © 2018 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution NonCommercial License 4.0 (CC BY-NC).


Ionic decision-maker created as novel, solid-state devices

Takashi Tsuchiya*, Tohru Tsuruoka, Song-Ju Kim, Kazuya Terabe, Masakazu Aono

Decision-making is being performed frequently in areas of computation to obtain better performance in a wide variety of current intelligent activities. In practical terms, this decision-making must adapt to dynamic changes in environmental conditions. However, because of limited computational resources, adaptive decision-making is generally difficult to achieve using conventional computers. The ionic decision-maker reported here, which uses electrochemical phenomena, has excellent dynamic adaptabilities, as demonstrated by its ability to solve multiarmed bandit problems (MBPs) in which a gambler given a choice of slot machines must select the appropriate machines to play so as to maximize the total reward in a series of trials. Furthermore, our ionic decision-maker successfully solves dynamic competitive MBPs, which cause serious loss due to the collision of selfish users in communication networks. The technique used in our devices offers a shift toward decision-making using the motion of ions, an approach that could find myriad applications in computer science and technology, including artificial intelligence.

Downloaded from http://advances.sciencemag.org/ on October 18, 2020

INTRODUCTION

Decision-making based on complex data processing is indispensable to intelligent life, making it able to adapt to dynamic environmental change and survive. The more sophisticated decision-making abilities, including mutual concession for overall optimization, make human beings different from other creatures; they are what makes us human. Computer science and technology is used to emulate decision-making abilities: automatically sense environmental changes and make decisions about how to behave in various situations (for example, information and communications, manufacturing, financial trading, and entertainment) (1–6). Decision-making in applications is usually emulated using conventional computers composed of a central processing unit (CPU), memory, and algorithms (software programs). While CPU-based computing has proved useful for emulating human intelligence, it has one critical drawback: The amount of computational resources needed for handling the rapidly increasing amount of information is exploding exponentially. The conventional approach to handling information thus reaches its limit (7–18). The ionic decision-maker we have developed overcomes the limitations of conventional computers by creating a paradigm shift toward “decision-making using the motion of ions.”

Conventional computing is designed for “Turing machines” in which versatility (flexibility) is achieved through the digitization of information. While this digitization allowed for substantial development in digital computers, it simultaneously lost an intrinsic efficiency of nature that is achieved by the coupling of information processing (that is, decision-making) with underlying physical laws, an efficiency found in various natural phenomena (for example, feeding in amoeba and phototropism in sunflowers) (7–14). The question arises of how this natural efficiency can be regained and applied to our technology.

Here, we report an ionic decision-maker that could significantly reduce the computational cost of decision-making in computer science and technology. Its decision-making ability is achieved by electrochemical phenomena including ionic transport and redox reactions (19–22). A principle with tug-of-war (TOW) dynamics has been exploited to solve dynamic multiarmed bandit problems (DMBPs) (14–18, 23–26), which are long-standing mathematical problems that have been applied to deep learning and related technologies (1–6, 27–31). The excellent solvability and adaptability of our devices are advantageous for dealing with the temporal dynamics of practical problems.

Our ionic decision-maker demonstrates the ability to solve competitive MBPs, which suffer serious loss of rewards (for example, throughput in communication networks) due to the Nash equilibrium (NE), which is the natural consequence for a group of independent selfish users (7, 30, 31). Excellent throughput has been demonstrated by overall optimization of the selections made by users in a communication network. Advantageous performance, particularly adaptability, is achieved by the unique and inherent voltage-charge relationship of electrochemical cells, which works as a forgetting parameter that usually requires huge computational resources in conventional computational approaches. The ionic decision-maker creates a new research field of “materials decision-making” in which the intrinsic properties of materials are used to make decisions not only for large-scale computations of human behavior but also for developing autonomous intelligent chips for mobile applications (for example, communication networking and medical diagnosis).

RESULTS AND DISCUSSION

Theoretical background and experimental setup

Both human beings and computers solve various problems by making decisions regarding subsequent actions. The problems, in many cases, can be interpreted as an MBP with stochastic events (1–6, 27–29). The MBP is a problem in which a gambler at a row of slot machines has to empirically decide which machines to play to maximize the total reward (coins) in a series of trials (30, 31), as illustrated in Fig. 1A (1). Here, we discuss the MBP in a scenario where a user of busy communication channels has to select appropriate channels from among the available channels to transmit his or her data at maximum efficiency. The situation can be discussed on the basis of the channel model proposed by Lai et al. (1, 2), which is shown in Fig. 1A (2). In this model, the user can select only a single channel at any given time. Suppose that at a certain moment (t), the user selects either channel A or B, which are available (open) with probabilities PA and PB and unavailable (occupied) with probabilities 1 − PA and 1 − PB, respectively. The user does not know the values of PA and PB a priori. With a probability of PA (that is, channel A is available), one packet is transmitted (accepted); this situation is hereafter referred to as “transmitted.” The packet will not be transmitted (rejected) with a probability of 1 − PA. This situation is referred to as “blocked.” A series of selections results in a table with “selected (transmitted),” “selected (blocked),” or “not selected,” such as shown in Fig. 1B.
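The channel model above can be sketched in a few lines of Python. This is only an illustration: the function names `try_channel` and `selection_row`, the probabilities, and the sequence of choices are assumptions made here and do not come from the article.

```python
import random


def try_channel(p_open, rng):
    """Return 'transmitted' with probability p_open, otherwise 'blocked'."""
    return "transmitted" if rng.random() < p_open else "blocked"


def selection_row(selected, probs, rng):
    """Build one row of a Fig. 1B-style table for a single selection.

    Only the selected channel is tried; every other channel is 'not selected'.
    """
    row = {channel: "not selected" for channel in probs}
    row[selected] = "selected ({})".format(try_channel(probs[selected], rng))
    return row


# Illustrative values only; in the model the user does not know them a priori.
probs = {"A": 0.6, "B": 0.4}
rng = random.Random(0)
for choice in ["A", "B", "A"]:
    print(selection_row(choice, probs, rng))
```

Each printed row has exactly one "selected (...)" entry, which reflects the constraint that the user can occupy only a single channel at a time.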

It should be emphasized that the original MBP and the channel model are equivalent, as is indicated by the comparison of the two in Fig. 1A (1 and 2). The slot machine in the MBP corresponds to a channel in the communication network. Obtaining a coin corresponds to transmission of a packet. Therefore, we can discuss the MBP based on the channel model without loss of generality.

The user of the communication network described above must decide which of channels A and B should be selected so as to maximize the efficiency of transmitting data packets. The ionic decision-maker developed in the present study can be used efficiently for this decision-making. For this purpose, we use the TOW principle, which resembles the TOW game in which two persons, A and B, pull against each other at opposite ends of a rigid rope (7, 14, 25). Physically, the TOW principle consists of two key points, that is, the conservation of some physical quantity (which corresponds to the length of the abovementioned rigid rope) and a certain stochastic fluctuation (corresponding to the fluctuation of the horizontal position of the rigid rope, which will be discussed in detail later) (2, 7, 14).


Figure 1C illustrates a strategy based on the TOW principle, in which we assume that the rigid rope makes decisions as indicated by the yellow bar (14–18, 23–26). Here, we use an electrical charge applied to electrochemical cells to represent the rigid rope. Our principle is thus charge-conserving TOW dynamics, in which the sum of the total charge passing through electrode i (corresponding to channel i), Qi, is conserved at zero (that is, QA + QB = 0, due to +Q − Q = 0) during current application. The electrical potential of channel i, Ei, is a variable used to evaluate which channel is profitable to select. It is increased or decreased, in a timely manner by a current application to the cell, on the basis of the results of stochastic events (for example, transmitted or blocked). For example, the user selects channel A when EA is larger than EB and vice versa. Under the condition PA > PB, EA becomes stably larger than EB after several selections. Here, the variation in Ei appears as a variation in cell voltage V, which is caused by the application of a constant current (I) for a limited time (t).
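As a rough illustration of the charge-conserving TOW rule described above, the following Python sketch tracks only the charge on electrode A and sets QB = −QA. Several simplifications are assumptions made here, not statements from the article: V is taken to have the same sign as QA (an idealized, capacitor-like cell), a tie at exactly zero is resolved in favor of channel A, and the sign of the charge pulse is mirrored when channel B is the selected channel. The name `tow_step` is hypothetical.

```python
import random


def tow_step(q_a, p_a, p_b, rng, dq=1.0):
    """Perform one selection and return (new q_a, selected channel, transmitted).

    q_a is the net charge on electrode A; electrode B holds -q_a, so the
    total charge is conserved at zero by construction.
    """
    selected = "A" if q_a >= 0 else "B"        # assumed tie-break: A at zero
    p_open = p_a if selected == "A" else p_b
    transmitted = rng.random() < p_open
    # A success on A, or a failure on B, pulls the "rope" toward A.
    toward_a = transmitted if selected == "A" else not transmitted
    q_a += dq if toward_a else -dq
    return q_a, selected, transmitted


rng = random.Random(1)
q_a = 0.0
for _ in range(50):
    q_a, selected, transmitted = tow_step(q_a, 0.6, 0.4, rng)
q_b = -q_a
print("Q_A =", q_a, "Q_B =", q_b, "sum =", q_a + q_b)
```

Because the real cell has a nonlinear voltage-charge relationship, this linear sketch should be read as the idealized limit rather than as a model of the device.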

Figure 1D illustrates our experimental setup. The setup consists of a two-electrode electrochemical cell, a potentio/galvanostat, and a random number generator to emulate the transmission of packets as stochastic phenomena. The stochasticity in Fig. 1 (C and D), that is, +Q or −Q, is externally inserted by the random number generator, whereas stochasticity is an internal property for the examples in Fig. 1A (1 and 2).


Fig. 1. Theoretical background and experimental setup of ionic decision-maker. (A) 1: Original MBP, in which a gambler attempts to select a slot machine. 2: MBP in the channel model, in which a communication network user attempts to select an available channel. (B) Example of the results for users 1 and 2. (C) Illustration of the charge-conserving TOW principle used. (D) Illustration of the experimental setup using a two-terminal electrochemical cell, potentio/galvanostat, and a random number generator. The illustration is simplified. Details of the setup are described in Materials and Methods.




The cell includes a Nafion H+-conducting polymer electrolyte (in which the H+ is highly mobile, while the electrons are immobile) and platinum thin-film electrodes (32–34). The electrical potential of electrode A (EA) with respect to electrode B (EB) is denoted as V (see Fig. 1D). In the initial state, H+ is homogeneously distributed in the Nafion, which makes electrodes A and B completely symmetric and leads to zero V. However, V starts to deviate from zero because the electrochemical reactions taking place during current application to the cell break the symmetry. Note that the electrochemical reactions mainly consume the Q applied to the cell during the current application. The resultant modulation of proton and gas concentration in the Nafion is thus the origin of the V variation.

Adaptive operation of ionic decision-maker for DMBPs

DMBPs can be efficiently solved with our ionic decision-maker. The operation principle of the device illustrated in Fig. 1D is described for the case in which PA and PB discussed above are 0.6 and 0.4, respectively. The dynamic change of the measured voltage V during operation is shown in Fig. 2A. First, the sign of V (positive or negative) is determined in the open-circuit condition (indicated by small black arrows). Although V should ideally be zero in the initial state, in reality, it deviates slightly from zero because of differences in adsorption/structure at the electrode/electrolyte interfaces (even when the same metal is used for both electrodes). Therefore, we observe a positive or negative V even in the first V measurement. When V is positive (negative), a random number between 0 and 1 is generated to emulate the selection of channel A (B) with PA (PB). With a PA of 0.6, a random number smaller (larger) than 0.6 corresponds to transmission (blocking) of a packet. In accordance with the result [transmitted or blocked; as illustrated in Fig. 1, A (2) and B], a constant current of 500 nA, which corresponds to +Q in the TOW principle (shown in Fig. 1C), or −500 nA, which corresponds to −Q, is applied to electrode A for 500 ms [that is, Q (= I·t) = 500 nA·500 ms]. The circuit is then opened, and V is measured for 500 ms (indicated by large green arrows). The sequence of these steps (from one black arrow to the next black arrow) is defined as one selection of the channel in the communication network. As shown in Fig. 2A, the repeated selections resulted in a digital-like V variation, which is used to calculate the correct selection rate, as explained later. This behavior corresponds to a concentration polarization (generating voltage in the Nafion) by modulation of the concentration distribution of protons and gas in the Nafion due to the electrochemical reactions shown below

$$2\mathrm{H}^{+} + 2e' \rightarrow \mathrm{H}_2 \tag{1}$$

$$\mathrm{H_2O} \rightarrow 2\mathrm{H}^{+} + 2e' + \tfrac{1}{2}\mathrm{O}_2 \tag{2}$$

Because the device is an electrochemical cell that consumes Q through the electrochemical reactions, its V behavior deviates significantly from that of an ideal capacitor, which is indicated by the dashed lines in Fig. 2A. This deviation from an ideal capacitor, in which, unlike the present case, Q is completely preserved after current application, plays an important role in achieving excellent adaptability, as will be discussed in a later section.

Fig. 2. Adaptive operation of ionic decision-maker for DMBPs. (A) Variation in cell voltage during experiment. (B) Illustration of variation in CSR versus number of selections. (C) Top: Variation in CSR of ionic decision-maker against Pi inversions occurring every 200 selections. CSR starts close to 0.5, corresponding to completely random selection. Although CSR reached 1 within 100 selections with initial conditions (0.9, 0.1), (0.8, 0.2), and (0.7, 0.3), it did not exceed 0.95 with (0.6, 0.4) even within 200 selections because of relatively close probabilities. Bottom: Variation in number of packets.
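For reference, the charge injected by each current pulse in this procedure follows directly from the stated current and duration. The short calculation below uses only the values given in the text; the conversion to nanocoulombs is added here for convenience.

$$Q = I \cdot t = (500 \times 10^{-9}\ \mathrm{A}) \times (500 \times 10^{-3}\ \mathrm{s}) = 2.5 \times 10^{-7}\ \mathrm{C} = 250\ \mathrm{nC}$$

A transmitted packet therefore corresponds to +250 nC and a blocked packet to −250 nC applied to electrode A when channel A is the selected channel.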

We investigated the variation in correct selection rate (CSR) of our ionic decision-maker device. Correct selection means that the channel with the highest probability (Pi) was selected regardless of whether a packet was transmitted or not, that is, the voltage takes a positive (negative) value at the selection under the PA > (<) PB condition. The CSR is automatically calculated on the basis of V variation with time that is similar to Fig. 2A. We performed 800 consecutive selections, with each selection being repeated for 100 cycles. We calculated CSR by dividing the number of cycles in which a correct selection was made, N, by the total number of cycles (100 in Fig. 2B), C (that is, CSR = N/C). Details of the experimental conditions are described in Materials and Methods. Figure 2B schematically illustrates the variation in CSR. CSR gradually increased with the number of selections because V tended to take a positive value after repetition of selections under the PA > PB condition due to accumulation of protons near electrode A (14–18, 23–26).
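The CSR definition above (CSR = N/C at a given selection index) can be written as a small helper. The following is a minimal sketch: the function name `correct_selection_rate` and the sample voltages are hypothetical, and a voltage of exactly zero is treated here as a selection of channel B, which is an assumption.

```python
def correct_selection_rate(voltages_by_cycle, p_a, p_b):
    """Return CSR = N / C for one selection index.

    voltages_by_cycle holds the open-circuit voltage V measured at the same
    selection index in each of the C cycles. A positive V means channel A
    was selected; the selection is correct when A is the better channel.
    """
    a_is_better = p_a > p_b
    n_correct = sum((v > 0) == a_is_better for v in voltages_by_cycle)
    return n_correct / len(voltages_by_cycle)


# Hypothetical voltages (in volts) from four cycles at one selection index.
sample = [0.10, -0.20, 0.30, 0.05]
print(correct_selection_rate(sample, 0.6, 0.4))
print(correct_selection_rate(sample, 0.4, 0.6))
```

Note that the outcome of the packet (transmitted or blocked) does not enter the calculation, in line with the definition of a correct selection.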

To evaluate adaptability against environmental change, which is a particularly important property for practical applications, we inverted Pi [for example, from (PA, PB) = (0.6, 0.4) to (0.4, 0.6)] at every 200th selection. CSR dropped immediately after each Pi inversion because the correct V had changed to negative; the choice of channel A (that is, positive V) was no longer the correct decision.

The variation in CSR of our two-electrode electrochemical cell against Pi inversions given at every 200th selection is shown in Fig. 2C (top), starting from various combinations of PA and PB. The CSR steeply increased toward 1.0 after Pi inversions, indicating that our ionic decision-maker is highly adaptable. This behavior agrees well with the illustration in Fig. 2B and the theoretical calculations shown in fig. S2, indicating that the TOW principle works properly in our device. The quick and adaptive behavior demonstrates that our ionic decision-maker can efficiently solve DMBPs.

Figure 2C (bottom) shows the variation in the number of packets, which is the number of packets transmitted from the beginning to a specific selection. Given that V is positive (negative) at a selection, one packet is added with probability PA (PB), whereas no packet is added with probability 1 − PA (1 − PB). As the larger probability (that is, PA for the first 200 selections) decreases from 0.9 to 0.6, the slope gradually decreases. This is quite reasonable because the slope is close to the expected value of the transmitted packets for the correct channel at a selection. The slope was about 0.9 when PA was 0.9 and PA > PB (PA·1 = 0.9).
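The slope argument can be made explicit with a short worked expectation. The calculation assumes, as an idealization, that the correct channel is chosen at every selection, which the device only approaches; the figure of 180 packets is therefore an upper bound derived here and not a measured value.

$$\mathbb{E}[\text{packets per selection}] = P_A \cdot 1 + (1 - P_A) \cdot 0 = P_A$$

$$\mathbb{E}[\text{packets after 200 selections}] \le 200 \times 0.9 = 180 \quad \text{for } (P_A, P_B) = (0.9, 0.1)$$

For (PA, PB) = (0.6, 0.4), the same bound gives 200 × 0.6 = 120 packets, consistent with the shallower slope described above.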


One may ask “Is a simple capacitor enough to solve DMBPs because it also satisfies the condition of +Q − Q = 0 during current application?” The answer is “no.” The difference is caused by an inherent nonlinearity observed in the V-Q relationship of the cell. It functions as a forgetting parameter, α (<1), in the TOW principle and should thus be termed “built-in α” (see figs. S2 and S3 for details).
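The role of a forgetting parameter can be illustrated with a toy comparison between an ideal capacitor (α = 1) and a leaky store (α < 1). The sketch below assumes that forgetting enters as a multiplicative decay of the stored charge before each pulse; this is a simplification chosen here for illustration and is not the measured V-Q curve of the cell. The function name `selections_until_switch` and the deterministic probabilities (1.0 and 0.0) are likewise assumptions, used so that the result does not depend on the random seed.

```python
import random


def selections_until_switch(alpha, p_before, p_after, n_before=200,
                            dq=1.0, seed=0, max_steps=10000):
    """Count the selections needed to move from channel A to channel B
    after the probabilities change from p_before to p_after.

    The stored charge is updated as q_a <- alpha * q_a +/- dq, so alpha = 1
    keeps all past charge and alpha < 1 gradually forgets it.
    """
    rng = random.Random(seed)

    def step(q_a, p_a, p_b):
        selected_a = q_a >= 0
        transmitted = rng.random() < (p_a if selected_a else p_b)
        toward_a = transmitted if selected_a else not transmitted
        return alpha * q_a + (dq if toward_a else -dq)

    q_a = 0.0
    for _ in range(n_before):            # learn the initial environment
        q_a = step(q_a, *p_before)
    for k in range(1, max_steps + 1):    # environment has been inverted
        q_a = step(q_a, *p_after)
        if q_a < 0:
            return k
    return max_steps


ideal = selections_until_switch(1.0, (1.0, 0.0), (0.0, 1.0))
leaky = selections_until_switch(0.9, (1.0, 0.0), (0.0, 1.0))
print("alpha = 1.0:", ideal, "selections; alpha = 0.9:", leaky, "selections")
```

In this toy setting the ideal capacitor must first cancel all the charge accumulated before the inversion, whereas the leaky store is bounded by dq/(1 − α) and switches after only a few selections.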

Ionic decision-maker for competitive DMBPs

An ionic decision-maker can be used to solve more complex and practical problems that are difficult to solve from a mathematical viewpoint, that is, competitive DMBPs (18). Let us consider two network users who attempt to select an available channel in the network, as illustrated in Fig. 3A. As long as they select different channels, the Pi is the same as those for the single user cases illustrated in Figs. 1 and 2. However, if they select the same channel, a collision occurs, and Pi is evenly split between them (that is, Pi/2), leading to a significant decrease in the total number of transmitted packets for both users (16–18). This situation is summarized in the payoff matrix shown in Table 1.

Packet loss is a serious problem for communication network systems with a limited number of channels and many users (1, 2). Avoiding packet loss is difficult because it has a mathematical origin: NE, which is the natural consequence for a group of independent users who attempt to use the best channel in a self-centered manner. This problem is much more complicated in reality because Pi dynamically fluctuates. Therefore, it is a major challenge to solve competitive DMBPs and achieve overall optimization, which is referred to here as “social maximum (SM).” Given two users and three channels, SM is defined as the situation in which the users select channels with the highest and second highest Pi, resulting in maximization of the total number of packets for all users. Note that SM differs from the situation in which the number of packets for each user is maximized, as illustrated by the two SM situations in Table 1. In the two situations, the number of packets for user 1 or 2 is 0.4, which is not the maximum value.
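The difference between NE and SM can be checked by enumerating the expected payoffs for every pair of choices. The following is a minimal sketch restricted to pure strategies; it borrows the probabilities (0.9, 0.4, 0.2) used for Fig. 4 as illustrative values, and the function names are hypothetical.

```python
from itertools import product


def expected_payoffs(choice_1, choice_2, probs):
    """Expected packets per selection for users 1 and 2.

    On a collision the channel's P_i is split evenly between the users.
    """
    if choice_1 == choice_2:
        half = probs[choice_1] / 2
        return half, half
    return probs[choice_1], probs[choice_2]


def nash_equilibria(probs):
    """Pure-strategy pairs from which neither user gains by deviating alone."""
    channels = list(probs)
    found = []
    for c1, c2 in product(channels, repeat=2):
        u1, u2 = expected_payoffs(c1, c2, probs)
        best_1 = all(expected_payoffs(d, c2, probs)[0] <= u1 for d in channels)
        best_2 = all(expected_payoffs(c1, d, probs)[1] <= u2 for d in channels)
        if best_1 and best_2:
            found.append((c1, c2))
    return found


def social_maxima(probs):
    """Choice pairs that maximize the total expected packets of both users."""
    channels = list(probs)
    totals = {pair: sum(expected_payoffs(pair[0], pair[1], probs))
              for pair in product(channels, repeat=2)}
    best = max(totals.values())
    return [pair for pair, total in totals.items() if total == best]


probs = {"A": 0.9, "B": 0.4, "C": 0.2}
print("NE:", nash_equilibria(probs))
print("SM:", social_maxima(probs))
```

With these values, both users choosing channel A is the only pure-strategy NE (0.45 expected packets each), while the SM assigns channels A and B to different users, so one of the users receives only 0.4.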

We solve competitive DMBPs by using an extended form of ionic decision-maker. As illustrated in Fig. 3B, an ionic decision-maker is implemented using two electrochemical cells, each with three electrodes (A, B, and C). We define each cell, including a potentio/galvanostat and a random number generator, as devices 1 and 2, corresponding to users 1 and 2, respectively. The devices are connected in series so that a positive current to electrode A in device 1 is equivalent to a negative current to electrode A in device 2. The strong interaction between the devices due to sharing of the applied current means that the selection of a specific channel (for example, channel A) by device 1 makes selection of the same channel difficult for device 2. The theoretical background for the solvability is described elsewhere (18).

Figure 4A shows the variation in selection rates for each channel (that is, A, B, and C rates, as defined in the figure) for both devices. The initial Pi (PA, PB, and PC) were set to (0.9, 0.4, and 0.2) (please see figs. S4 and S5 for the experimental details). At the beginning of the operation, the A, B, and C rates for both devices were close to 0.33 (that is, 1/3), which corresponds to a random distribution. While the A and B rates increased to 0.5, the C rate decreased to zero due to the lowest PC (0.2). The asymptotic approaches of the A and B rates to 0.5 mean that both devices selected channel A (with the highest Pi) in about half the cycles (10) while, in the remaining cycles (10), they selected channel B (with the second highest Pi). This was caused by the interaction between the devices, as mentioned above. The A, B, and C rates reach different values due to the different Pi assignments (0.4, 0.2, and 0.9) after the 100th selection. The A and B rates reached only 0.5, which is half the saturation value observed in Fig. 2. This indicates that the devices made mutual concessions in choosing channels A and B. This selection continued in the subsequent selections following the Pi assignment changes. This behavior is completely different from that shown in Fig. 2, which shows that the correct (highest Pi) channels were always selected so that CSR (for example, the A rate when PA is the highest) reached 1.0 (see section S5). The result shown in Fig. 4A is evidence that the devices achieved SM.

The red curve in Fig. 4B shows the variation in the total number of packets for users 1 and 2. The combined devices (connected in series as shown in Fig. 3B) achieved performance very close to the theoretical limit of SM (pink dashed curve); this achievement is possible only in the virtual case where users 1 and 2 perfectly select the highest and second highest Pi channels for all selections. The slight deviation from theory is due to imperfect adaptability of the ionic decision-maker, which is, at least in part, inevitable because all MBP solvers, including algorithms and devices, need to explore incorrect channels to grasp their probability of being correct before making a decision.

For comparison, we also examined the performance of independent devices (two devices operated independently) under the same conditions. The total number of packets is indicated by the blue curve in Fig. 4B. The total number of packets for the independent device case was significantly lower than that for the combined device case because the devices independently sought the highest Pi, leading to the NE state. The performance was even below the theoretical limit of NE (gray dashed curve); that limit (itself low because of NE) can be reached only when devices 1 and 2 perfectly select the highest Pi channels for all selections. The comparison shows that the ionic decision-maker with the combined devices is advantageous for solving competitive DMBPs.

While each device emulates selfish seeking during operation, their interaction enables mutual concessions to achieve SM. The processing in the ionic decision-maker differs completely from that in conventional computers, in which elements and their interactions are fully calculated using huge computational resources. Ionic decision-makers thus offer new insight into a novel class of non–von Neumann architecture computing (7).

At present, the device cannot operate in a closed (for example, encapsulated) environment because it operates using electrode reactions (1) and (2), which include exchange of mass with the surroundings. The limitation may be overcome by using other electrode reactions without exchange of mass with the surroundings. Moreover, further downscaling is possible by replacing the Nafion with thin-film electrolytes.

CONCLUSION

The ionic decision-maker introduced in this article was developed on the basis of the solid-state ionic principle. Its excellent solvability and adaptability for DMBPs, including competitive ones, are achieved by ionic transport and the resultant electrochemical phenomena. This was the first physical implementation of an ionic decision-maker in a device with decision-making ability. An ionic decision-maker was particularly effective for solving competitive DMBPs, which are normally difficult because of the NE. A conserving physical object and built-in α, the two main characteristics of the TOW principle, were successfully emulated by the intrinsic property of ionic charge: strictly conserved during application (that is, instantaneous conservation) but somewhat volatile (timewise volatility) in electrochemical systems.

Fig. 3. Theoretical background and experimental setup for competitive DMBPs. (A) Examples of two situations in which two network users attempt to use different channels or the same channel in the network (that is, competitive DMBP). (B) Illustration of ionic decision-maker composed of two electrochemical cells (devices 1 and 2), each with three electrodes, which are connected to potentio/galvanostat via a switch matrix.

Optimization in a society through competition is similar to decision-making in an individual through dilemmas. Ionic decision-makers can thus be applied to human behaviors, from microscopic to macroscopic. Furthermore, the combination of ionic decision-makers with artificial synaptic plasticity, which was recently found in ionic devices, should be useful for emulating complex personality formation in humans beyond short-term plasticity and long-term potentiation in a single synapse (35, 36). The ionic motion in all-solid-state devices is useful not only for large-scale emulation of collective behavior (Web advertising, financial trading, etc.) but also for developing autonomous intelligent chips for mobile applications (communication networking, medical diagnosis, etc.).


MATERIALS AND METHODS

Fabrication of electrochemical cells for an ionic decision-maker

The procedure used for fabricating electrochemical cells for an ionic decision-maker is illustrated in fig. S1. A Nafion dispersion solution (5%; DuPont DE521) was mixed with 20 weight % platinum-loaded carbon (Vulcan XC-72). The solution was cast onto the surface of a Teflon sheet and dried in air at room temperature for about 6 hours. The resultant thin film (Nafion film with Pt-C catalyst) was transferred and hot-pressed at 413 K onto the surface of a Nafion H+-conducting membrane (DuPont N-117) with a thickness of 183 µm. A Pt thin film with a thickness of 80 nm was deposited onto the surface of the Nafion with a shadow mask by using electron beam deposition. The gap between the Pt electrodes was 100 µm.

Operation of an ionic decision-maker for DMBPs with one device and two channels

The ionic decision-maker device was placed in a manual probe system and connected using two tungsten probes, as shown in Fig. 1D. The electrochemical cell was connected to a potentio/galvanostat (CompactStat, Ivium Technologies), which was controlled using custom software designed to generate a random number and initiate a current application at positive or negative 500 nA for 500 ms. A random number was generated to emulate stochastic events using various values of Pi. For example, a random number from 0 to 1 was generated, and if it was smaller than Pi, it was interpreted as transmitted, and a positive 500 nA was applied for 500 ms. If the generated number was larger than Pi, it was interpreted as blocked, and a negative 500 nA was applied for 500 ms.

The operation was composed of 800 selections, with each selection being repeated for 100 cycles. Each cycle started with a short circuit of the two electrodes for 200 s to refresh the electrochemical cell (refreshment stage). The circuit was opened, and the voltage was measured to identify the sign of the voltage (positive or negative). When the voltage was positive, channel A was selected. The selection was emulated by random number generation with PA and the subsequent current application, as described above, and vice versa. After current application, the circuit was opened for 500 ms to measure the voltage. Each unit of operation was a selection. After every 200 repetitions of the selections, the PA and PB values were inverted to emulate environmental changes. After 800 consecutive selections with three inversions (that is, one cycle), the operation moved to the refreshment stage of the next cycle. One hundred cycles were performed for each PA/PB combination.
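The inversion schedule within one cycle can be generated programmatically. The sketch below is an illustration only: the function name `probability_schedule` is hypothetical, and it assumes that the swap takes effect starting with the selection that follows each multiple of 200, which is one reading of "after every 200 repetitions."

```python
def probability_schedule(p_a, p_b, n_selections=800, period=200):
    """Yield (selection number, P_A, P_B) for one cycle.

    P_A and P_B are swapped after every `period` selections, so 800
    selections with a period of 200 contain three inversions.
    """
    for i in range(n_selections):
        if i > 0 and i % period == 0:
            p_a, p_b = p_b, p_a
        yield i + 1, p_a, p_b


schedule = list(probability_schedule(0.6, 0.4))
inversions = sum(1 for prev, cur in zip(schedule, schedule[1:])
                 if prev[1:] != cur[1:])
print("selections:", len(schedule), "inversions:", inversions)
print(schedule[199], schedule[200])
```

One full experiment for a given PA/PB combination would repeat this schedule for each of the 100 cycles, with a refreshment stage before each one.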

SUPPLEMENTARY MATERIALS

Supplementary material for this article is available at http://advances.sciencemag.org/cgi/content/full/4/9/eaau2057/DC1
Section S1. Fabrication of electrochemical cells for ionic decision-maker
Section S2. Comparison with CPU-based computation using mathematical algorithms and dielectric capacitor
Section S3. Operation mechanism of ionic decision-maker
Section S4. Theoretical expectation of decision-making behavior with two devices
Section S5. Two operation modes of ionic decision-maker for competitive MBPs with two devices and three channels
Fig. S1. Fabrication of electrochemical cells for ionic decision-maker.
Fig. S2. Comparison with CPU-based computation using mathematical algorithms and dielectric capacitor.
Fig. S3. Operation mechanism and built-in “α” of ionic decision-maker.
Fig. S4. Expected behavior of two devices and corresponding variation in selection rates for each channel for both devices.
Fig. S5. Two operation modes of ionic decision-maker for competitive MBPs with two devices and three channels.


Fig. 4. Adaptive operation of ionic decision-maker for competitive DMBPs. (A) Variation in selection rates for channels for two devices measured over 20 cycles for averaging. Initial probabilities of channels (PA, PB, and PC) were set to (0.9, 0.4, and 0.2), and the assignment was changed at every 100th selection. (B) Variation in total number of packets for devices 1 and 2 in two operation modes (combined devices and independent devices). SM and NE show theoretical limits of SM (pink dashed curve) and NE (gray dashed curve).


REFERENCES AND NOTES1. L. Lai, H. Jiang, H. V. Poor, Medium access in cognitive radio networks: A competitive

multi-armed bandit framework, in Proceedings of the 42nd Asilomar Conference on Signals,System and Computers (IEEE, 2008), pp. 98–102.

2. L. Lai, H. E. Gamal, H. Jiang, H. V. Poor, Cognitive medium access: Exploration,exploitation, and competition. IEEE Trans. Mobile Comput. 10, 239–253 (2011).

3. L. Kocsis, C. Szepesvári, Bandit based Monte-Carlo planning, in Proceedings of the 17thEuropean Conference on Machine Learning (Springer, 2006), pp. 282–293.

4. S. Gelly, Y. Wang, R. Munos, O. Teytaud, Modification of UCT with patterns in Monte-CarloGo, Technical Report, RR-6062, INRIA (2006).

5. D. Agarwal, B.-C. Chen, P. Elango, Explore/exploit schemes for web content optimization,in Proceedings of the Ninth IEEE International Conference on Data Mining (IEEE, 2009),pp. 1–10.

6. M. Gagliolo, J. Schmidhuber, Algorithm portfolio selection as a bandit problem withunbounded losses. Ann. Math. Artif. Intell. 61, 49–86 (2011).

7. S.-J. Kim, M. Naruse, M. Aono, Harnessing the computational power of fluids foroptimization of collective decision making. Philosophies 1, 245–260 (2016).

8. G. Rozenberg, T. Back, J. Kok, Handbook of Natural Computing (Springer Verlag, 2012).9. L. M. Adleman, Molecular computation of solutions to combinatorial problems. Science

266, 1021–1024 (1994).10. R. J. Lipton, DNA solution of hard computational problems. Science 268, 542–545

(1995).11. R. Ursin, F. Tiefenbacher, T. Schmitt-Manderbach, H. Weier, T. Scheidl, M. Lindenthal,

B. Blauensteiner, T. Jennewein, J. Perdigues, P. Trojek, B. Ömer, M. Fürst, M. Meyenburg,J. Rarity, Z. Sodnik, C. Barbieri, H. Weinfurter A. Zeilinger, Entanglement-based quantumcommunication over 144 km. Nat. Phys. 3, 481–486 (2007).

12. T. Nakagaki, H. Yamada, A. Tóth, Maze-solving by an amoeboid organism. Nature 407, 470 (2000).

13. A. Tero, S. Takagi, T. Saigusa, K. Ito, D. P. Bebber, M. D. Fricker, K. Yumiki, R. Kobayashi, T. Nakagaki, Rules for biologically-inspired adaptive network design. Science 327, 439–442 (2010).

14. S.-J. Kim, M. Aono, Amoeba-inspired algorithm for cognitive medium access. NOLTA 5, 198–209 (2014).

15. S.-J. Kim, M. Naruse, M. Aono, M. Ohtsu, M. Hara, Decision maker based on nanoscale photo-excitation transfer. Sci. Rep. 3, 2370 (2013).

16. M. Naruse, W. Nomura, M. Aono, M. Ohtsu, Y. Sonnefraud, A. Drezet, S. Huant, S.-J. Kim, Decision making based on optical excitation transfer via near-field interactions between quantum dots. J. Appl. Phys. 116, 154303 (2014).

17. M. Naruse, M. Berthel, A. Drezet, S. Huant, M. Aono, H. Hori, S.-J. Kim, Single-photon decision maker. Sci. Rep. 5, 13253 (2015).

18. S.-J. Kim, T. Tsuruoka, T. Hasegawa, M. Aono, K. Terabe, M. Aono, Decision maker based on atomic switches. AIMS Mater. Sci. 3, 245–259 (2016).

19. T. Tsuchiya, K. Terabe, M. Aono, All-solid-state electric-double-layer transistor based on oxide ion migration in Gd-doped CeO2 on SrTiO3 single crystal. Appl. Phys. Lett. 103, 073110 (2013).

20. T. Tsuchiya, K. Terabe, M. Aono, In situ and non-volatile bandgap tuning of multilayer graphene oxide in an all-solid-state electric double-layer transistor. Adv. Mater. 26, 1087–1091 (2014).

21. T. Tsuchiya, T. Tsuruoka, K. Terabe, M. Aono, In situ and nonvolatile photoluminescence tuning and nanodomain writing demonstrated by all-solid-state devices based on graphene oxide. ACS Nano 9, 2102–2110 (2015).

22. T. Tsuchiya, K. Terabe, M. Ochi, T. Higuchi, M. Osada, Y. Yamashita, S. Ueda, M. Aono, In situ tuning of magnetization and magnetoresistance in Fe3O4 thin film achieved with all-solid-state redox device. ACS Nano 10, 1655–1661 (2016).

23. S.-J. Kim, M. Aono, M. Hara, Tug-of-war model for multi-armed bandit problem, in Unconventional Computation, C. S. Calude, M. Hagiya, K. Morita, G. Rozenberg, J. Timmis, Eds. (Springer, 2010), pp. 69–80.

24. S.-J. Kim, M. Aono, M. Hara, Tug-of-war model for two-bandit problem: Nonlocally correlated parallel exploration via resource conservation. Biosystems 101, 29–36 (2010).

25. S.-J. Kim, M. Aono, E. Nameda, Efficient decision-making by volume-conserving physical object. New J. Phys. 17, 083023 (2015).

26. C. Lutz, T. Hasegawa, T. Chikyow, Ag2S atomic switch-based ‘tug of war’ for decision making. Nanoscale 8, 14031–14036 (2016).

27. D. Fudenberg, D. K. Levine, The Theory of Learning in Games (MIT Press, 1998).

28. J. R. Marden, H. P. Young, G. Arslan, J. S. Shamma, Payoff-based dynamics for multiplayer weakly acyclic games. SIAM J. Control Optim. 48, 373–396 (2009).

29. A. Garivier, O. Cappé, The KL-UCB algorithm for bounded stochastic bandits and beyond, in Proceedings of the 24th Annual Conference on Learning Theory (COLT) (Cornell University Library, 2011), pp. 359–376.

30. H. Robbins, Some aspects of the sequential design of experiments. Bull. Am. Math. Soc. 58, 527–535 (1952).

31. R. Weber, On the Gittins index for multiarmed bandits. Ann. Appl. Probab. 2, 1024–1033 (1992).

32. T. Toda, H. Igarashi, H. Uchida, M. Watanabe, Enhancement of the electroreduction of oxygen on Pt alloys with Fe, Ni, and Co. J. Electrochem. Soc. 146, 3750–3756 (1999).

33. H. A. Gasteiger, S. S. Kocha, B. Sompalli, F. T. Wagner, Activity benchmarks and requirements for Pt, Pt-alloy, and non-Pt oxygen reduction catalysts for PEMFCs. Appl. Catal. B 56, 9–35 (2005).

34. M. Watanabe, M. Tomikawa, S. Motoo, Experimental analysis of the reaction layer structure in a gas diffusion electrode. J. Electroanal. Chem. 195, 81–93 (1985).

35. T. Ohno, T. Hasegawa, T. Tsuruoka, K. Terabe, J. K. Gimzewski, M. Aono, Short-term plasticity and long-term potentiation mimicked in single inorganic synapses. Nat. Mater. 10, 591–595 (2011).

36. R. Yang, K. Terabe, G. Liu, T. Tsuruoka, T. Hasegawa, J. K. Gimzewski, M. Aono, On-demand nanodevice with electrical and neuromorphic multifunction realized by local ion migration. ACS Nano 6, 9515–9521 (2012).

Acknowledgments

Funding: The authors acknowledge that they received no funding in support of this research. Author contributions: T. Tsuchiya, T. Tsuruoka, S.-J.K., K.T., and M.A. conceived the idea for the study. T. Tsuchiya, T. Tsuruoka, and S.-J.K. designed the experiments. T. Tsuchiya and T. Tsuruoka wrote the paper. T. Tsuchiya carried out the experiments. T. Tsuchiya and S.-J.K. analyzed the data and carried out the theoretical calculation. All authors discussed the results and commented on the manuscript. K.T. and M.A. directed the projects. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

Submitted 17 May 2018
Accepted 30 July 2018
Published 7 September 2018
10.1126/sciadv.aau2057

Citation: T. Tsuchiya, T. Tsuruoka, S.-J. Kim, K. Terabe, M. Aono, Ionic decision-maker created as novel, solid-state devices. Sci. Adv. 4, eaau2057 (2018).

