
Re-examination of log-periodicity observed in the seismic precursors of the 1989 Loma Prieta earthquake

Y. Huang1 and H. Saleur

Department of Physics, University of Southern California, Los Angeles

D. Sornette2,3

Institute of Geophysics and Planetary Physics, University of California, Los Angeles

Abstract. Based on several lines of empirical evidence, a series of papers has advocated the concept that seismicity prior to a large earthquake can be understood in terms of the statistical physics of a critical phase transition. In this model, the cumulative seismic Benioff strain release ε increases as a power law of the time-to-failure before the final event. This power law reflects a kind of scale invariance with respect to the distance to the critical point: ε is the same up to a simple rescaling λ^z after the time-to-failure has been scaled by a factor λ. A few years ago, on the basis of a fit of the cumulative Benioff strain released prior to the 1989 Loma Prieta earthquake, ? proposed that this scale invariance could be partially broken into a discrete scale invariance, defined such that the scale invariance occurs only with respect to specific integer powers of a fundamental scale ratio. The observable consequence of discrete scale invariance takes the form of log-periodic oscillations decorating the accelerating power law. They found that the quality of the fit and the predicted time of the event are significantly improved by the introduction of log-periodicity. Here, we present a battery of synthetic tests performed to quantify the statistical significance of this claim. We pay special attention to defining synthetic tests that are as close as possible to the real time series except for the property to be tested, namely log-periodicity. Without this precaution, we would conclude that the existence of log-periodicity in the Loma Prieta cumulative Benioff strain is highly statistically significant. In contrast, we find that log-periodic oscillations with frequency and regularity similar to those of the Loma Prieta case are very likely to be generated by the interplay of the low pass filtering step, due to the construction of cumulative functions, together with the approximate power law acceleration. Thus, the single Loma Prieta case alone cannot support the initial claim, and additional cases and further study are needed to increase the signal-to-noise ratio, if any. The present study will be a useful methodological benchmark for future testing of additional events when the methodology and data to construct a reliable Benioff strain function become available.

1 Department of Earth Sciences, University of Southern California, Los Angeles, California 90089-0740.

2 Department of Earth and Space Sciences, University of California, Los Angeles, California 90095-1567.

3 Laboratoire de Physique de la Matière Condensée, CNRS UMR 6622 and Université de Nice-Sophia Antipolis, Nice, France.


1. Introduction

The idea that earthquakes are somewhat analogous to critical phenomena of statistical mechanics has been gaining ground in the last few decades [????????]. One of the consequences of this new point of view is that events occurring even several decades before a large main shock can be considered as seismic precursors, and that a study of this precursory seismicity might give a fairly good indication of when the impending major earthquake will take place, and how big it is going to be. Although such considerations are still in their infancy, and the usual caveats about earthquake prediction must be kept in mind, a lot of work has already been devoted to “post-diction”, sometimes with impressive success. One of the pioneering cases in that direction was the 1989 Loma Prieta earthquake, where ? proposed to see the empirical power law used by ? in the perspective of criticality in the sense of statistical physics [?]. They found that the cumulative Benioff strain starting about 50 years ago could be well fitted with a power law, giving rise to a post-diction for the main event of (1990.3 ± 4.1), a reasonably satisfactory result.

Things got even more exciting after the paper of ?, where it was pointed out that the strong oscillations around the power law could be fitted as well, using a complex exponent correction to scaling:

$$\epsilon(t) = A + B\,(t_f - t)^z \left\{ 1 + C \cos\left[ \omega \log(t_f - t) + \phi \right] \right\} . \tag{1}$$

In this formula, the parameter $t_f$ is the time of the main shock (a pure power law would correspond to $C = 0$), and the best fit gave rise to an estimate $t_f = 1989.9 \pm 0.8$, considerably closer to the real date than the one in [?], and with much less uncertainty.
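As a minimal sketch of the functional form in (1), the following Python function evaluates the log-periodic power law for given parameters. The function names are illustrative and no fitted values from the paper are assumed. The helper `scale_ratio` relies on the fact that the cosine term is unchanged when $t_f - t$ is multiplied by $\lambda = e^{2\pi/\omega}$, which is the preferred scale ratio associated with the log-periodic correction.

```python
import math

def log_periodic_strain(t, A, B, t_f, z, C, omega, phi):
    """Equation (1): a power law in (t_f - t) decorated by log-periodic oscillations."""
    if t >= t_f:
        raise ValueError("t must be earlier than the critical time t_f")
    tau = t_f - t  # time-to-failure
    return A + B * tau**z * (1.0 + C * math.cos(omega * math.log(tau) + phi))

def scale_ratio(omega):
    """Scale ratio lambda = exp(2*pi/omega) under which the cosine term is invariant."""
    return math.exp(2.0 * math.pi / omega)
```

Setting `C = 0` recovers the pure power law, which is the null hypothesis examined in the synthetic tests.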

The existence of complex correction to scaling exponents could be linked to an underlying discrete scale invariance, a very appealing property from a theoretical point of view: the initial observation in [?] therefore spurred a lot of development [?????????].

Although the quality of the fit in ? is, to the naked eye, impressively good, the suspicion arose recently that the oscillations in the Loma Prieta Benioff strain could be merely the result of noise. The synthetic tests performed in [?] being somewhat incomplete, we have decided to reanalyze this question much more carefully in the present work.

We have performed two types of synthetic tests to do this re-analysis.

In the first type, we consider random power laws (explained in Section 3), with parameters that match those of the real data, and study whether noise can give rise to log-periodic structures after integration. The advantage of this approach is that the two key ingredients of the analysis of the real data (power law and integration) are captured in a simple way. Its drawback is that the synthetic data (a sequence of random numbers drawn from a power law probability distribution, from which a cumulative quantity is constructed) are not quite of the same nature as the real data (a sequence of times and magnitudes, from which the cumulative Benioff strain is constructed). In particular, the sampling for the synthetic data is essentially periodic in log scale, and this effect, combined with integration, is indeed expected to give rise to spurious log-periodic oscillations. Nevertheless, we find the consideration of these synthetic tests quite useful, as it complements the considerations in [?].

The second type of synthetic tests is devised especially to avoid this issue of sampling: we generate data for both time and magnitude, in such a way that the probability distributions of the synthetic $(t^s, m^s)$ and the real $(t^r, m^r)$ quantities are the same. There is a problem in doing so: although the synthetic data and the real data have the same distribution, there is no guarantee that the cumulative Benioff strain constructed from the synthetic sequences is really a power law, because a power law dependence involves higher-order statistics (i.e., correlation and dependence) not captured by the one-point distribution functions. To preserve the feature of the real data that events are more frequent and have higher magnitudes when closer to the main shock (or the last data point), we added a reordering procedure which shuffles the synthetic sequences in such a way that the event with the jth largest magnitude is at the same position as the event with the jth largest magnitude in the real sequence.

We find that, for both types of synthetic tests, it is surprisingly likely to get spurious (that is, entirely due to noise) log-periodic oscillations which are as good as those observed in the real Loma Prieta data. This conclusion is made quantitative in a variety of ways, in particular by studying the highest peak of the spectrum of oscillations around the power law for the real data, and building the probability distribution of such peaks for synthetic data. We thus conclude that, at the present time, it is not possible to distinguish the log-periodic oscillations observed in [?] from noise.

The present study is related to [?]. The common theme is the investigation of the conditions under which log-periodicity can be created spontaneously by noise. In [?], the goal is to study in detail the underlying mechanism, relying solely on the manipulation of data: the generally found non-uniform sampling together with a low pass filtering step, as occurs in constructing cumulative functions, in maximum likelihood estimations and in de-trending, is enough to create apparent log-periodicity. A detailed exploration of this mechanism has been offered in [?], together with extensive numerical simulations to demonstrate all its main properties. It was shown that this “synthetic” scenario for log-periodicity relies on two steps: 1) the fact that approximately logarithmic sampling in time corresponds to uniform sampling in the logarithm of time; 2) integration reddens the noise and, in a finite sample, creates a maximum in the spectrum leading to a most probable frequency in the logarithm of time. In [?], this insight was then used to analyze the 27 best aftershock sequences studied by [?] and to search for traces of genuine log-periodic corrections to Omori’s law, which states that the earthquake rate decays approximately as the inverse of the time since the last main shock. The observed log-periodicity was shown to result almost entirely from the “synthetic scenario” due to the data analysis. From a statistical point of view, resolving the issue of the possible existence of log-periodicity in aftershocks will be very difficult, as Omori’s law describes a point process with a uniform sampling in the logarithm of the time. By construction, strong log-periodic fluctuations are thus created by this logarithmic sampling. In contrast, in the present paper, we apply the insight obtained in [?] to study accelerated power laws culminating in a finite-time singularity at time $t_f$.

To be complete, we should also point out the following. The paper of ? contains a forward prediction of an earthquake in the Kommandorski Island region at time $t_f = 1996.3 \pm 1.1$ year. Forward predictions provide a much larger statistical significance since the model parameters are estimated independently and outside their domain of application (see the discussion below in the section on the analysis procedure). Forward prediction also has the merit of increasing the number of cases. The prediction of a critical time is not enough: one must also specify the magnitude of the predicted earthquake. In [?], the magnitude was not specified, but it can probably be taken to follow the specification of ? of a magnitude in the range 7.5-8.5 occurring in a zone originally outlined by ?. The largest earthquake in the Harvard catalog during the time period 1994-1998 has a moment magnitude $M_W = 6.6$ (1996/07/16, 56.16N, 164.98E). The same event has the magnitude $M_S = 6.4$ in the PDE catalog. If the prediction is considered correct only if both its time and its magnitude range are as predicted, this prediction is a failure. However, it is hard to draw any firm conclusion based on this single case with respect to the usefulness of the critical earthquake concept and of log-periodicity, as the methodology has evolved significantly since the initial paper of ? (see for instance [??]).

The plan of this paper is as follows. In Section 2 we present some general observations on synthetic tests and their interpretations. Details on our two types of tests, together with their results, are presented in Sections 3 and 4. Our conclusions are collected in Section 5.

2. Analysis Procedure

Usually, the major problem in establishing the statistical significance of a forecasting procedure is its retrospective character, involving a limited number of data and a significant number of explicit (the parameters of the fit) and implicit (the total time and space windows used, etc.) degrees of freedom. In such situations, the calculation of statistical significance becomes very difficult and uncertain as soon as the adjustable parameters are determined from the data. The conclusions can then be artifacts of the processing technique or of a selection bias. The paper of ? certainly suffers from this problem. Since there are no general methods or techniques that would allow us to overcome these difficulties, we stick to a more modest approach, which turns out to be sufficient to draw a clear and meaningful conclusion.

We develop two types of synthetic tests. For both types, the key part is the comparison of the log-periodic oscillations in the real seismicity to those in the synthetic sequences. Similar oscillations should have similar frequencies and similar regularities, and this can best be quantified by considering the spectrum of these oscillations. We focus on the highest peak (characterized by angular frequency $\omega$ and peak height $h$) in the spectrum, which quantifies the most significant frequency component in the oscillations as a function of $\log(t_f - t)$ (since we are looking for log-periodicity). The Lomb method [?] is used instead of the usual Fourier transform, since the data points are not equidistantly spaced.
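The peak search described above can be sketched as follows. This is a plain-Python illustration that assumes the normalized Lomb periodogram in its standard textbook form (power divided by twice the sample variance); the function names, the frequency grid and the choice of scanning a fixed list of trial frequencies are assumptions made for the example, not details taken from the paper.

```python
import math

def lomb_power(x, y, omega):
    """Normalized Lomb periodogram of samples (x, y) at angular frequency omega > 0."""
    n = len(x)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / (n - 1)
    # The offset tau makes the result independent of the origin of x.
    s2 = sum(math.sin(2.0 * omega * xi) for xi in x)
    c2 = sum(math.cos(2.0 * omega * xi) for xi in x)
    tau = math.atan2(s2, c2) / (2.0 * omega)
    cs = [math.cos(omega * (xi - tau)) for xi in x]
    sn = [math.sin(omega * (xi - tau)) for xi in x]
    yc = sum((v - mean) * c for v, c in zip(y, cs))
    ys = sum((v - mean) * s for v, s in zip(y, sn))
    cc = sum(c * c for c in cs)
    ss = sum(s * s for s in sn)
    return (yc * yc / cc + ys * ys / ss) / (2.0 * var)

def highest_peak(times, oscillations, t_f, omegas):
    """Return (omega, h) of the largest periodogram value, using log(t_f - t) as abscissa."""
    x = [math.log(t_f - t) for t in times]
    powers = [lomb_power(x, oscillations, w) for w in omegas]
    k = max(range(len(omegas)), key=powers.__getitem__)
    return omegas[k], powers[k]
```

The returned pair plays the role of $(\omega, h)$ in the text; the same routine would be applied unchanged to the real and to each synthetic sequence.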

The ultimate purpose of synthetic tests is to get the significance level (the probability of getting the same thing by accident) of the real observation. One natural way of evaluating the significance level would be, it seems, to count the number of synthetic peaks (defined as spectral components with spectral power higher than neighboring frequencies, in other words, local maxima in the spectrum) within the intervals $\omega^s \in [\omega^r - \Delta, \omega^r + \Delta]$ and $h^s \ge h^r$ (superscripts $s$ and $r$ denote synthetic and real data, respectively), and to normalize this by the total number of synthetic peaks. This might however lead to incorrect conclusions. For instance, suppose that, from 10000 synthetic peaks, only one peak with peak height higher than 10 (the height of the real peak) and angular frequency ω in the range [π, 2π] was found: could we conclude that the significance level is 0.01% (or a confidence level of 99.99%)? Probably not. To see why, suppose the distribution of ω's were uniform in [0, 40], and the distribution in peak height uniform in [6.862, 10.004]: then the probability of observing one peak of height above 10 and ω ∈ [π, 2π] would roughly be 1/10000. But this small probability does not mean that the peak of height above 10 and ω ∈ [π, 2π] is highly significant! Due to the uniform distribution in both ω and peak height, any other pair (ω, h) would have, in fact, been observed with the same probability. To draw positive conclusions from such a simple counting analysis, one would need to have in advance a theory indicating where the peak should be found, and approximately at what height. It is not clear that there is such a thing at present, and therefore we prefer to use a more conservative approach.
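The arithmetic behind this hypothetical example can be written out explicitly. The only assumption added here is that ω and the peak height are independent, which is implicit in treating both as uniformly distributed:

```latex
\begin{align*}
P(\omega \in [\pi, 2\pi]) &= \frac{2\pi - \pi}{40 - 0} = \frac{\pi}{40} \approx 0.0785, \\
P(h > 10) &= \frac{10.004 - 10}{10.004 - 6.862} = \frac{0.004}{3.142} \approx 1.27 \times 10^{-3}, \\
P(\omega \in [\pi, 2\pi],\ h > 10) &\approx 0.0785 \times 1.27 \times 10^{-3} \approx 1.0 \times 10^{-4}.
\end{align*}
```

Any other window of the same size in the (ω, h) plane would carry the same probability, which is why the small number by itself says nothing about significance.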

In the language of statistics, this question is related to the difference between first-order and second-order statistics. In first-order statistics, we ask: “what is the probability to observe the peak we see just by chance?” In second-order statistics, we ask: “what is the probability to observe some (first-order significant) peak somewhere (whatever its position)?” In our context, the determination of the confidence level within first-order statistics requires that we have an a priori understanding of where the peak should be found.

In the absence of any theoretical predictions, it seems natural, in order to quantify the significance level of the log-periodic oscillations, to rely somehow on the probability distribution of ω and peak height in the synthetic samples, i.e., to rely on second-order statistics. We have not managed, however, to come up with a totally satisfactory, objective way to use this probability distribution. A useful quantity we came up with (though it should not be trusted blindly) is the ratio R of the probability of observing a given peak to the probability of observing the most probable peak: if the ratio is close to 1, this surely indicates that the peak is not very significant.

To obtain this ratio, one needs the probability density function $p(\omega, h)$ of the synthetic peaks: the latter can be constructed using the Kernel Density method [??]. We then set

$$R = \frac{p(\omega^r, h^r)\, d\omega\, dh}{p(\omega^{mp}, h^{mp})\, d\omega\, dh} ,$$

where $(\omega^r, h^r)$ characterize the real peak, and $(\omega^{mp}, h^{mp})$ is the most probable synthetic peak. The ratio quantifies how frequently in synthetic sequences we can observe log-periodicity similar to that observed in the real sequence. There are some technical advantages in using this ratio. For instance, there is in fact no arbitrariness in choosing the intervals $d\omega$ and $dh$, since they cancel out between the numerator and the denominator: it is then enough to just use the function value of $p(\omega, h)$ at $(\omega^r, h^r)$ and $(\omega^{mp}, h^{mp})$. Also, the special choices (e.g., the degree of smoothing and the type of kernels) made in constructing the probability density function $p(\omega, h)$ hopefully also cancel out in the ratio, and have little influence on the final result.
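A minimal sketch of how such a ratio could be computed is given below. The paper does not specify the kernel or the degree of smoothing, so the product Gaussian kernel, the Scott-type bandwidth and the approximation of the most probable peak by the densest synthetic sample point are all assumptions of this example; the function names are hypothetical.

```python
import math

def _std(values):
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

def make_kde(peaks):
    """Return an estimate of p(omega, h) from a list of synthetic (omega, h) peaks."""
    n = len(peaks)
    factor = n ** (-1.0 / 6.0)  # assumed bandwidth rule for two dimensions
    bw = _std([p[0] for p in peaks]) * factor
    bh = _std([p[1] for p in peaks]) * factor
    norm = 1.0 / (n * 2.0 * math.pi * bw * bh)

    def density(omega, h):
        total = 0.0
        for w_i, h_i in peaks:
            total += math.exp(-0.5 * (((omega - w_i) / bw) ** 2 + ((h - h_i) / bh) ** 2))
        return norm * total

    return density

def significance_ratio(peaks, real_peak):
    """R = p(real peak) / p(most probable peak), with the mode searched over the sample."""
    density = make_kde(peaks)
    p_mp = max(density(w, h) for w, h in peaks)
    return density(real_peak[0], real_peak[1]) / p_mp
```

Because the mode is searched only over the synthetic points, the value returned can slightly exceed 1 when the real peak happens to sit at a denser location than any sample; a grid search over the (ω, h) plane would remove this small bias.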

The generation of synthetic data and the extraction of oscillations depend on the type of synthetic tests and will be explained separately in the following sections.

3. Synthetic Test I

3.1. Generation of Synthetic Data

We start from a simple method which takes into account the most crucial features of the real data. We then refine this method in several ways in Section 4.

The crucial features of the real data are the power law and the integration. The latter (considering the cumulative Benioff strain) is necessary for numerical reasons, as there are not enough data points to study directly the rate at which energy is released (moreover, considering the rate leads to other difficulties, like the influence of the binning intervals, etc.). We therefore take a pure power law as the null hypothesis, and generate data with a probability density fitting the power law part of $d\epsilon/dt$: we mimic real data as closely as possible, taking in particular the same number of points. We then construct a synthetic cumulative Benioff strain by numerical integration, and investigate whether noise in the sampling of the power law can give rise to spurious log-periodic oscillations.

Again, the power law is

$$\frac{d\epsilon(t)}{dt} \propto (t_f - t)^{m-1} \tag{2}$$

following (1) in [?]. We assume the range for $t$ is $[t_0, t_1]$ with $t_1 < t_f$ to avoid the singularity at $t_f$. After normalization, we have

$$\frac{d\epsilon(t)}{dt} = \frac{m}{(t_f - t_0)^m - (t_f - t_1)^m}\, (t_f - t)^{m-1} . \tag{3}$$

For the real seismic precursors, $t$ is the time of occurrence of an earthquake; for synthetic events, $t$ is a random variable with probability density function $p(t) = d\epsilon(t)/dt$. To make sure the synthetic events and the real seismic precursors have the same power law distribution, we chose the same parameters $t_0$, $t_1$, $t_f$, $m$, $N$ (from Table I and Figure 1 of [?]), where $N$ is the number of events. (Since the original data were not available at the time of this study, we retrieved data from the CNSS catalog using their space-time-magnitude window. However, for some unknown reason, our data set was slightly different from theirs: we got only 27 events instead of 31.) The random variable $t$ with given $p(t)$ can be obtained from a random variable $x$ uniformly distributed on $[x_0, x_1]$ by solving [?]

$$p(x)\, dx = p(t)\, dt .$$

The transformation is

$$t = t_f - \left[ (t_f - t_0)^m - \frac{x - x_0}{x_1 - x_0} \left( (t_f - t_0)^m - (t_f - t_1)^m \right) \right]^{1/m} . \tag{4}$$

We then construct the cumulative distribution function of $t$ and use this function to mimic the cumulative Benioff strain of the real sequence. Both have the same power law parameters, and both are indeed constructed by integration.
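The sampling step of Synthetic Test I can be sketched in a few lines of Python. The code applies the transformation (4) to uniform random numbers and then builds the normalized cumulative curve. The function names are hypothetical, and the parameter values in any call are placeholders, since the values from Table I of the cited paper are not reproduced here.

```python
import random

def sample_event_times(n, t0, t1, t_f, m, rng):
    """Draw n event times on [t0, t1] with density proportional to (t_f - t)**(m - 1)."""
    a = (t_f - t0) ** m
    b = (t_f - t1) ** m
    times = []
    for _ in range(n):
        u = rng.random()  # plays the role of (x - x0) / (x1 - x0) in equation (4)
        times.append(t_f - (a - u * (a - b)) ** (1.0 / m))
    return sorted(times)

def cumulative_curve(times):
    """Pairs (t_k, k/N): the normalized cumulative number of events up to each time."""
    n = len(times)
    return [(t, (k + 1) / n) for k, t in enumerate(times)]
```

For instance, `sample_event_times(27, 1927.0, 1988.0, 1990.0, 0.35, random.Random(0))` would produce one synthetic sequence with illustrative (not fitted) parameters.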

3.2. Extraction of Oscillations

The original analysis procedure used in [?] was to fit the cumulative Benioff strain to a power law with log-periodic oscillations ((1), the same as (8) in [?]). The quality of the observed log-periodicity was not quantified in [?] other than by showing that the quality of the fit, measured by the residue, as well as the predicted critical time $t_f$ were both substantially improved compared to those from the fit with the simple power law; for our study, however, it is crucial to do so.

To get the oscillations, we first fit a power law with log-periodic oscillations (1) to both the real and synthetic data. The pure power law part (obtained by setting $C = 0$ in (1)) is then subtracted, and we obtain the remaining oscillations. These oscillations are in turn analyzed using the procedure outlined in Section 2.
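The subtraction step can be illustrated as follows. The sketch assumes that the parameters $A$, $B$, $t_f$ and $z$ have already been obtained from a fit of (1); the nonlinear fit itself is not shown, and the function names are hypothetical.

```python
def power_law_part(t, A, B, t_f, z):
    """Equation (1) with C = 0: the pure power law A + B (t_f - t)**z."""
    return A + B * (t_f - t) ** z

def extract_oscillations(times, strain, A, B, t_f, z):
    """Subtract the fitted pure power law from the cumulative strain values."""
    return [e - power_law_part(t, A, B, t_f, z) for t, e in zip(times, strain)]
```

The resulting list, paired with $\log(t_f - t)$, is what would be passed to the spectral analysis of Section 2.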

3.3. Results

3.3.1. The real sequence. We first characterize the log-periodicity observed in the real sequence. The fit of (1) to the real data is remarkable (Figure 1). The oscillations around the power law part show approximately 2.5 cycles of regular oscillations (Figure 2), the spectrum of which has a peak near $\omega^r \sim 6.1$ with height $h^r \sim 7.5$ (Figure 3), which is significantly different from Gaussian noise (the chance of observing such a peak from Gaussian noise of the same number of data points is less than 2% according to (13.8.7) of [?]). However, since we are dealing with oscillations around a cumulative quantity, integrated Gaussian noise would be a more appropriate null hypothesis. We will study the chance of observing such a peak from integrated Gaussian noise in Section 3.3.2.

Figure 1. The fit of a power law with log-periodic oscillations to the normalized cumulative Benioff strain of the seismic precursors of the 1989 Loma Prieta earthquake. (Axes: $t$; cumulative Benioff strain.)

Figure 2. The oscillations around the power law of the normalized cumulative Benioff strain of the seismic precursors of the 1989 Loma Prieta earthquake. (Axes: $t_f - t$ on a logarithmic scale; oscillations.)
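The order of magnitude of the Gaussian-noise estimate quoted above can be checked with a short calculation. The sketch assumes that the cited formula is the usual false-alarm probability for the normalized Lomb periodogram, $P(>z) = 1 - (1 - e^{-z})^M$, and that the number of independent frequencies $M$ is taken equal to the number of data points; both are assumptions of this example, since the reference is not resolved in the text.

```python
import math

def false_alarm_probability(z, n_independent):
    """Chance that pure Gaussian noise yields a normalized Lomb peak of height z or more."""
    return 1.0 - (1.0 - math.exp(-z)) ** n_independent
```

With a peak height of 7.5 and $M$ of about 30, this expression gives a value between 1% and 2%, consistent with the figure quoted in the text.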

Figure 3. The Lomb periodogram of the oscillations shown in Figure 2. (Axes: ω; Lomb power.)

Figure 4. Plot of a quantity similar to that in Figure 1 from a synthetic sequence. (Axes: $t$; cumulative number of events.)

3.3.2. Synthetic sequences. We generated 300 synthetic sequences using the parameters of the seismic precursors for the 1989 Loma Prieta earthquake. They were analyzed in the same way as the real precursor sequence. See Figures 4, 5, and 6 for typical results. We note that it is possible to observe synthetic sequences which are remarkably similar to the real sequence: similar amplitude of oscillations, similar frequency, and similar regularity. In the following, we quantify how frequently such sequences can actually be observed.

The distribution function of the frequencies and peak heights of the synthetic sequences was constructed using the Kernel Density method [??] (Figure 7). The real peak is not far from the most probable synthetic peak. From the distribution function (Figure 7), we find that the probability density function at the most probable synthetic peak is proportional to 0.031, while it is proportional to 0.016 at the values of ω and h corresponding to the real peak. The ratio of these two quantities is close to one half. If we look at the separate distributions of frequencies and peak heights (Figures 9 and 10), we see that the frequency from the real sequence is slightly higher than the most probable synthetic frequency, and the peak height of the real sequence is almost the most probable synthetic peak height. Note that the frequencies from synthetic sequences have a rather narrow distribution, as expected from [??]. The regularity of oscillations observed in the real sequence (Figure 1, quantified by the peak height) is not surprising, due to the strong smoothing effect of integration [??]. When the same analysis procedure is applied to both the real data and the synthetic data, this kind of regularity is observed in both.

Figure 5. Plot of a quantity similar to that in Figure 2 from a synthetic sequence. (Axes: $t_f - t$ on a logarithmic scale; oscillations.)

Figure 6. Plot of a quantity similar to that in Figure 3 from a synthetic sequence. (Axes: ω; Lomb power.)

Figure 7. The distribution function of frequencies and peak heights from the synthetic sequences. The position of the real peak is marked by the vertical line. (Axes: ω; peak height.)

4. Synthetic Test II

4.1. Generation of Synthetic Data

For synthetic tests to be effective, the synthetic data should differ from the real ones by only one characteristic: the characteristic to be tested. Since in our problem we want to test log-periodicity in the oscillations around a power law, the synthetic data should be the same power law but with known noise (which could be additional data errors or random fluctuations). Since the power law of the cumulative Benioff strain is in fact implied by the magnitude distribution and the temporal spacing between events and their correlations, we have to generate synthetic magnitudes $m^s$ and times $t^s$ to get the same power law. $t^s$ and $m^s$ should be random numbers having the same probability distribution as $t$ and $m$, because in our observations we had control over neither $t$ nor $m$. In this light, the analysis in Section 3 is a bit oversimplified; we now refine it.

Figure 8. 2D map view of Figure 7. Each circle represents one synthetic peak. The frequency of the real peak is marked by the vertical line, and the peak height by the horizontal line. (Axes: ω; peak height.)

Figure 9. The distribution of the synthetic frequencies. The vertical line marks the position of the real frequency. The two diamonds mark the FWHM (full-width-half-maximum) of the distribution (the same for all subsequent similar plots). (Axes: ω; probability density.)

Figure 10. The distribution of the synthetic peak heights. The vertical line marks the position of the real peak height. (Axes: peak height; probability density.)

One difficulty in generating synthetic $t^s$ and $m^s$ is that the theoretical probability density function (pdf) of magnitudes and times for our real data is unknown. This problem can be solved by using the empirical pdf constructed from the real data. However, we should not use the empirical pdf directly, otherwise all features of the real data would be reproduced in the synthetic data. For example, the empirical distribution of the real time sequence (Figure 11) shows some regular oscillations (that could well be genuine physically based log-periodic oscillations) around its general trend.

If we used exactly this empirical distribution to generate the synthetic time sequences, all synthetic sequences would show similar regular oscillations: this is of course not appropriate, since what we want to test is precisely whether these oscillations are the result of noise. Therefore, we decided to use only the general trend of the experimental data to generate our synthetic samples. This general trend is obtained by smoothing the empirical distribution. A three-point moving average was applied repeatedly, 50 times (10 times for smoothing the cumulative distribution function of magnitudes). The number of times is not crucial as long as the oscillations are wiped out. The criteria for our choice are closeness to the empirical curve and lack of oscillations. Similar considerations were applied to the generation of the synthetic magnitude sequence. The assumption for the general trend is not crucial: a reasonable one, close to the empirical cumulative distribution curve but without the fluctuations, would suffice.

Figure 11. The normalized cumulative number of events up to time $t$ of the time sequence of the seismic precursors of the 1989 Loma Prieta earthquake (solid line connecting circles). The other solid line is the empirical curve repeatedly smoothed (50 times) by a 3-point moving average. (Axes: $t$; cumulative distribution function.)
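One possible reading of this smoothing-and-sampling step is sketched below. The text does not say exactly which array the moving average acts on; the sketch assumes it is applied to the sorted event times (the empirical quantile function, whose ordinates k/N are evenly spaced), that the two end points are held fixed, and that synthetic values are drawn by linear interpolation in the smoothed table. All function names are hypothetical.

```python
import random

def smooth_three_point(values, passes):
    """Apply a 3-point moving average `passes` times, keeping both end points fixed."""
    v = list(values)
    for _ in range(passes):
        inner = [(v[i - 1] + v[i] + v[i + 1]) / 3.0 for i in range(1, len(v) - 1)]
        v = [v[0]] + inner + [v[-1]]
    return v

def sample_from_quantiles(quantiles, n, rng):
    """Draw n values by linear interpolation in an evenly spaced, sorted quantile table."""
    last = len(quantiles) - 1
    draws = []
    for _ in range(n):
        pos = rng.random() * last        # position along the table, in [0, last)
        k = min(int(pos), last - 1)
        frac = pos - k
        draws.append(quantiles[k] + frac * (quantiles[k + 1] - quantiles[k]))
    return sorted(draws)
```

Under these assumptions, a synthetic time sequence would be obtained as `sample_from_quantiles(smooth_three_point(real_times, 50), len(real_times), random.Random(0))`, and a magnitude sequence in the same way with 10 passes.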

The next difficulty is that, for a sequence of times and magnitudes generated using this method, there is no guarantee that the cumulative Benioff strain will follow a power law. The time sequence more or less follows a power law (we have verified that the cumulative number of events of the seismic precursors of the 1989 Loma Prieta earthquake is similar to the cumulative Benioff strain in power law and log-periodic oscillations), but, when combined with the magnitude sequence to construct the whole Benioff strain curve, there is no obvious reason why we should always get a power law. It is natural to expect that the power law of the real sequence comes mainly from the fact that events occur more frequently with increasing magnitude (trend only) when closer to the main shock. To preserve this feature in the synthetic sequences, we decided to reorder the events in the synthetic sequence such that the event with the kth largest magnitude occurs at the same position in both the real and the synthetic cases (for example, if the event of second-largest magnitude is the jth event of the real sequence, it is also the jth event of each synthetic sequence). This reordering scheme is applied only to the magnitude sequence (the time sequence is an ordered sequence by definition). One example is shown in Figures 12 and 13.

Figure 12. Synthetic magnitude sequence (+). The real sequence is plotted with o. [Axes: t (horizontal), magnitude (vertical).]

Figure 13. Synthetic time sequence (+). The real sequence is plotted with o. [Axes: t (horizontal), cumulative number (vertical).]

We performed synthetic tests both with and without the reordering scheme. In fact, the results turned out to be almost identical.
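The reordering scheme amounts to a rank-matching permutation. The sketch below is one possible reading of it; the function name `reorder_like` and the five-event toy sequences are hypothetical, and ties are broken by position, which the text does not specify.

```python
def reorder_like(real_magnitudes, synthetic_magnitudes):
    """Place the k-th largest synthetic magnitude where the k-th largest real one sits."""
    if len(real_magnitudes) != len(synthetic_magnitudes):
        raise ValueError("sequences must have the same length")
    # Positions in the real sequence, ordered from largest to smallest magnitude
    # (sorted() is stable, so ties keep their original order).
    positions = sorted(range(len(real_magnitudes)),
                       key=lambda i: -real_magnitudes[i])
    ranked = sorted(synthetic_magnitudes, reverse=True)
    reordered = [0.0] * len(ranked)
    for rank, position in enumerate(positions):
        reordered[position] = ranked[rank]
    return reordered


# Toy magnitudes for illustration only.
real = [5.0, 5.3, 5.1, 5.8, 5.6]
synthetic = [5.2, 5.0, 5.7, 5.1, 5.4]
reordered = reorder_like(real, synthetic)
```

In this toy case the largest real magnitude is the fourth event, so the largest synthetic magnitude (5.7) is moved to the fourth position, and so on down the ranks.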

4.2. Extraction of Oscillations

The method of extracting oscillations from the cumulative Benioff strain is slightly different from that of Section 3.2; however, the difference turns out to be insignificant.

Two ways are possible to obtain the oscillations. The first involves extracting the best-fit power law from the real data. The drawback of this approach is that power law fits are often not as stable as fits including the log-periodic corrections. Sometimes (around 8% of all cases) the fit even converges to a tf smaller than the time of the last data point; tf − t is then negative and (tf − t)^(m−1) is complex since m < 1, which is of course unphysical. The advantage of this approach is that log-periodicity is not assumed in the first place. The second approach involves extracting the power law obtained from a best-fit power law with log-periodic oscillations [???]. The advantage here is that the fit always converges well, but now log-periodicity is somewhat assumed from the very beginning. The more positive viewpoint advocated in [???] to justify this procedure is that fitting with log-periodicity allows one to take the most probable noise into account [?] and thus to obtain a good pure power law representation by setting the coefficient C = 0. In practice, the results of either approach were very similar: in the following, we report only the results using the second one.

It is of course crucial to use exactly the same analysis procedure for both the real data and the synthetic data, otherwise features generated by the analysis procedure for the real data may not be detected by the synthetic tests.

The cumulative Benioff strain was first constructed from the magnitude sequence $m_i$:

$$\varepsilon_i = \sum_{j=1}^{i} 10^{0.75\, m_j} \tag{5}$$

and then normalized such that $\varepsilon_{\max} = 1$ (the unit was changed without influencing the conclusion). As in [?], the cumulative Benioff strain was then fitted to a power law with log-periodic oscillations.
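Equation (5) and the normalization can be written in a few lines. The magnitudes below are made up for illustration; only the exponent 0.75 and the normalization to a maximum of 1 come from the text.

```python
def cumulative_benioff_strain(magnitudes):
    """Normalized cumulative Benioff strain: running sum of 10**(0.75*m), scaled to end at 1."""
    total = 0.0
    cumulative = []
    for m in magnitudes:
        total += 10.0 ** (0.75 * m)
        cumulative.append(total)
    return [value / cumulative[-1] for value in cumulative]


# Hypothetical magnitude sequence, ordered in time.
magnitudes = [5.0, 5.2, 5.0, 5.1, 5.8]
strain = cumulative_benioff_strain(magnitudes)

# Relative size of the last increment (m = 5.8) compared with the first (m = 5.0).
jump_ratio = (strain[4] - strain[3]) / strain[0]
```

The last event, 0.8 magnitude units above the first, contributes an increment 10^0.6 (about 4) times larger, which illustrates how a few large late events can bend the cumulative curve upward.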


Figure 14. The fitting of the normalized cumulative Benioff strain of the 30 seismic precursors of the 1989 Loma Prieta earthquake to (1). [Axes: time (horizontal), normalized Benioff strain (vertical).]

The de-trended data were obtained by [???]

$$\mathrm{detrn} = \frac{\varepsilon(t) - A}{B\,(t_f - t)^{z}} \tag{6}$$

which should be either noise or a pure log-periodic cosine according to (1), and then analyzed by the procedure explained in Section 2.
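A sketch of the de-trending step (6), followed by a spectral scan, is given below. It assumes that the spectral procedure is a normalized Lomb periodogram of the de-trended data as a function of ln(tf − t), which the axis label of Figure 16 suggests but this section does not spell out. All parameter values and data are invented, and the sample times are chosen so that ln(tf − t) is evenly spaced, which keeps the toy spectrum clean.

```python
import math


def detrend(times, strain, A, B, tf, z):
    """Return ln(tf - t) and (strain - A) / (B * (tf - t)**z) for each data point."""
    xs = [math.log(tf - t) for t in times]
    ys = [(e - A) / (B * (tf - t) ** z) for t, e in zip(times, strain)]
    return xs, ys


def lomb_power(xs, ys, omega):
    """Normalized Lomb periodogram of (xs, ys) at angular frequency omega > 0."""
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / (n - 1)
    tau = math.atan2(sum(math.sin(2 * omega * x) for x in xs),
                     sum(math.cos(2 * omega * x) for x in xs)) / (2 * omega)
    cos_t = [math.cos(omega * (x - tau)) for x in xs]
    sin_t = [math.sin(omega * (x - tau)) for x in xs]
    yc = sum((y - mean) * c for y, c in zip(ys, cos_t))
    ysn = sum((y - mean) * s for y, s in zip(ys, sin_t))
    return (yc ** 2 / sum(c * c for c in cos_t)
            + ysn ** 2 / sum(s * s for s in sin_t)) / (2 * var)


# Made-up parameters and data, not the fitted Loma Prieta values.
A, B, tf, z, C, omega_true = 1.2, -0.3, 1989.8, 0.35, 0.05, 6.0
log_gaps = [0.6 + 3.3 * k / 59 for k in range(60)]       # evenly spaced ln(tf - t)
times = [tf - math.exp(x) for x in log_gaps]
strain = [A + B * (tf - t) ** z * (1 + C * math.cos(omega_true * math.log(tf - t)))
          for t in times]

xs, ys = detrend(times, strain, A, B, tf, z)
omegas = [1.0 + 0.1 * k for k in range(141)]             # scan omega from 1 to 15
peak_omega = max(omegas, key=lambda w: lomb_power(xs, ys, w))
peak_height = lomb_power(xs, ys, peak_omega)
```

For this noise-free toy signal the de-trended values reduce to 1 + C cos(ω ln(tf − t)), and the scan recovers a peak close to the input frequency. With real or synthetic catalogues the same two functions would be applied to the fitted A, B, tf and z.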

4.3. Results

4.3.1. The real sequence. The fit showed good agreement between the real data and the theoretical curve (Figure 14).

There was a peak at ω = 6.1 in the spectrum of the de-trended data (Figure 16), close to the best-fit parameter ω = 5.7.

4.3.2. Synthetic sequences. We analyzed 1000 synthetic sequences (one example in Figures 12 and 13) using the same procedure as that for the real sequence. The cumulative Benioff strain of the synthetic sequence showed obvious similarity to the real data (Figure 17).

We first checked whether the synthetic data gave rise to power laws. To do this, we fitted these data to a power law and measured the summed squared error (SSE) of the fit: in about 50% of the cases, the synthetic sequences had an SSE smaller than that of the real sequence, indicating that they were roughly as good power laws as the real data.
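One way to obtain such an SSE is sketched below. The optimizer used in the paper is not described in this section, so a plain grid search over (tf, m) with a closed-form least-squares solution for A and B is used as a stand-in. The grids and the exact power-law test data are hypothetical.

```python
def linear_fit_sse(times, strain, tf, m):
    """Least-squares A, B for strain ~ A + B*(tf - t)**m at fixed (tf, m), plus the SSE."""
    g = [(tf - t) ** m for t in times]
    n = len(g)
    g_mean = sum(g) / n
    e_mean = sum(strain) / n
    sgg = sum((gi - g_mean) ** 2 for gi in g)
    sge = sum((gi - g_mean) * (ei - e_mean) for gi, ei in zip(g, strain))
    B = sge / sgg
    A = e_mean - B * g_mean
    sse = sum((ei - A - B * gi) ** 2 for gi, ei in zip(g, strain))
    return A, B, sse


def fit_power_law(times, strain, tf_grid, m_grid):
    """Grid search over (tf, m); returns (A, B, tf, m, sse) with the smallest SSE."""
    best = None
    for tf in tf_grid:
        if tf <= max(times):
            continue  # tf must lie after the last data point, else (tf - t)**m is complex
        for m in m_grid:
            A, B, sse = linear_fit_sse(times, strain, tf, m)
            if best is None or sse < best[4]:
                best = (A, B, tf, m, sse)
    return best


# Exact power-law data with made-up parameters, used only to check the search.
times = [1940.0 + 48.0 * k / 29 for k in range(30)]
strain = [1.1 - 0.2 * (1990.0 - t) ** 0.4 for t in times]
tf_grid = [1988.5 + 0.5 * k for k in range(12)]          # 1988.5 .. 1994.0
m_grid = [0.1 * k for k in range(1, 10)]                 # 0.1 .. 0.9
A_hat, B_hat, tf_hat, m_hat, sse_hat = fit_power_law(times, strain, tf_grid, m_grid)
```

Applying the same function to the real sequence and to each synthetic sequence gives the SSE values whose comparison is described above. The guard on tf excludes the unphysical case discussed in Section 4.2.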

Figure 15. The de-trended data of the normalized Benioff strain of the 30 seismic precursors of the 1989 Loma Prieta earthquake. [Vertical axis: de-trended Benioff strain.]

Figure 16. Spectrum of the de-trended data in Figure 15. [Axes: ω (horizontal), normalized Lomb density (vertical).]

Figure 17. The fitting of the normalized cumulative Benioff strain of one synthetic sequence to (1). [Axes: t (horizontal), normalized Benioff strain (vertical).]

We note here that the ratio of the probability of observing a synthetic sequence with SSE similar to that from the real sequence to the probability of observing a synthetic sequence with the most probable SSE is around 79%. If we used SSE to measure the goodness of the power law model for our synthetic data, the real sequence would be very close to the most probable synthetic sequence.

We then compared the fit of a power law with log-periodic oscillations to the real and the synthetic sequences.

The SSEs of the synthetic sequences are in the range [0.005, 0.05], centered around the SSE from the real sequence (∼ 0.013). About 1/4 of the synthetic sequences have an SSE smaller than that from the real sequence. The ratio of the probability of observing a synthetic sequence with SSE similar to that from the real sequence to the probability of observing a synthetic sequence with the most probable SSE is around 96%. If we use SSE to measure the regularity of log-periodic oscillations, the real sequence is thus very close to the most probable synthetic sequence.

The tf from the synthetic sequences is distributed in [1988,1995] (Figure 18). The ratio of the probability of observing a synthetic sequence with tf similar to that from the real sequence to the probability of observing a synthetic sequence with the most probable tf is around 95%. Thus the apparently accurate prediction of the actual main-shock time for the 1989 Loma Prieta earthquake [?] might be due to chance. The width of the distribution (FWHM, full width at half maximum) is 6.8 years, narrower than that from the fit of a pure power law (7.8 years), suggesting that log-periodicity may improve power law fits by accounting for the most probable noise [?].

Figure 18. The distribution of main shock times (tf in (1)) from the synthetic sequences. The vertical line marks the value from the real sequence. [Axes: tc (horizontal), probability density (vertical).]
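The full width at half maximum quoted for the tf distribution can be read off a sampled density curve as sketched here. The linear interpolation between samples and the five-point toy curve are assumptions of this example; how the authors estimated the density is not stated in this section.

```python
def fwhm(xs, density):
    """Left and right half-maximum crossings of a single-peaked curve, and their distance."""
    peak = max(range(len(density)), key=lambda i: density[i])
    half = density[peak] / 2.0

    def crossing(i_below, i_above):
        # Linear interpolation between a sample below half maximum and its neighbour above.
        frac = (half - density[i_below]) / (density[i_above] - density[i_below])
        return xs[i_below] + frac * (xs[i_above] - xs[i_below])

    i = peak
    while i > 0 and density[i - 1] >= half:
        i -= 1
    left = crossing(i - 1, i) if i > 0 else xs[0]

    j = peak
    while j < len(density) - 1 and density[j + 1] >= half:
        j += 1
    right = crossing(j + 1, j) if j < len(density) - 1 else xs[-1]
    return left, right, right - left


# Toy density curve, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
density = [0.0, 0.2, 1.0, 0.6, 0.0]
left, right, width = fwhm(xs, density)
```

Here the half maximum is 0.5, crossed at about 1.375 on the left and 3.167 on the right, so the width is about 1.79.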

There is a well-defined peak in the distribution of the frequencies of the log-periodic oscillations from the fitting of the synthetic sequences (Figure 19). The ratio of the probability of observing a synthetic sequence with frequency similar to that from the real sequence to the probability of observing a synthetic sequence with the most probable frequency is around 81%.

We summarize the above results in Table 1.

We now turn to the statistics from the characterization of log-periodicity by spectral analysis of the de-trended data. If we only look at the distribution of peak heights, the ratio of the probability of observing a synthetic peak similar to the real peak to the probability of observing the most probable synthetic peak is around 99.9% (Figure 20).

If we only look at the frequencies, the ratio of the probability of observing a synthetic frequency similar to that of the real peak to the probability of observing the most probable synthetic frequency is around 56% (Figure 21).

Table 1. Parameters From the Fit of the Synthetic Data With PW (Pure Power Law) and PWLG (Power Law With Log-Periodic Oscillations)

Fit   Param  pr^a   pmp^b  ratio  mp^c     r^d      left^e   right^f  FWHM^g
PW    m      0.72   1.99   0.36   0.34     0.69     0.17     0.57     0.40
PW    tc     0.073  0.13   0.56   1994.3   1988.7   1988.4   1996.2   7.8
PW    SSE    20.0   24.9   0.79   0.028    0.037    0.016    0.052    0.036
PWLG  m      1.94   2.23   0.87   0.45     0.52     0.25     0.63     0.38
PWLG  tc     0.13   0.14   0.95   1990.5   1989.8   1988.1   1994.8   6.8
PWLG  SSE    50.0   51.8   0.96   0.015    0.017    0.0092   0.026    0.014
PWLG  ω      0.21   0.26   0.81   4.28     5.67     3.16     6.99     3.83
PWLG  C      2.84   6.71   0.42   -0.039   -0.083   -0.078   0.075    0.15

a. Value of the probability density function at a synthetic peak similar to the real peak.
b. Value of the probability density function at the most probable synthetic peak.
c. The most probable value from synthetic data.
d. The value from real data.
e. Value at the left point of the FWHM of the distribution of synthetic values.
f. Value at the right point of the FWHM of the distribution of synthetic values.
g. Full width at half maximum of a peak.

Figure 19. The distribution of frequencies of log-periodic oscillations from the fitting of the synthetic sequences. The vertical line marks the value from the real sequence. [Axes: ω (horizontal), probability density (vertical).]

Figure 20. The distribution of peak heights from the de-trended data. The vertical line marks the value from the real sequence. [Axes: p (horizontal), probability density (vertical).]

Figure 21. The distribution of frequencies from the de-trended data. The vertical line marks the value from the real sequence. [Axes: ω (horizontal), probability density (vertical).]

When we look at the joint distribution of peak heights and frequencies, the ratio of the probability of observing a peak similar to the real peak to the probability of observing the most probable synthetic peak is about 56% (Figures 22 and 23).

Figure 22. The distribution of peak heights and frequencies from the de-trended data. [Axes: peak height and ω.]

Figure 23. The 2D map view of Figure 22. The vertical and horizontal lines mark the values from the real sequence. [Axes: ω (horizontal), peak height (vertical).]

Not using the reordering scheme did not significantly change the above results, except that, for some synthetic sequences, the power law fit was poor.

Also, recall that the foregoing results involved fitting data to a power law with log-periodic oscillations, then de-trending. Fitting data to a pure power law instead produced very similar results.

We summarize all the results of the previous sections in Table 2.

5. Discussion

These synthetic tests were performed to determine whether it is possible to observe the reported log-periodicity [?] from integrated noise in power laws and, if so, with what probability. We found that, if we use the highest peak of the spectrum of the oscillations around the power law of the cumulative Benioff strain to quantify the log-periodicity in the oscillations, peaks similar to the peak observed from the real sequence (the real peak) were indeed frequently observed from the synthetic sequences. The probability of observing a synthetic peak similar to the real peak is more than 50% of the probability of observing the most probable synthetic peak.

It is reasonable to use the highest peak in the spectrum of a signal to quantify the most significant frequency component in the signal. The position of the peak is the frequency (ω), and the peak height (h) quantifies the regularity of that frequency component. If two signals have similar peaks in their spectra, they must have oscillations of similar frequency and regularity.

Table 2. Synthetic Tests of the Log-Periodicity Observed in the Benioff Strain of the Seismic Precursors of the 1989 Loma Prieta Earthquake

              EOPW^a                   De-trended data
          pr^b   pmp^c  ratio      pr     pmp    ratio
ω         0.11   0.11   0.99       0.12   0.22   0.56
h         0.19   0.20   0.93       0.12   0.12   1.00
(ω, h)    0.027  0.031  0.89       0.016  0.029  0.56
(ω, h)^d  0.020  0.025  0.81       0.16   0.23   0.73

a. Extracted oscillations using the best-fit pure power law.
b. Value of the probability density function at a synthetic peak similar to the real peak.
c. Value of the probability density function at the most probable synthetic peak.
d. No reordering.

Peaks similar to the real peak were observed frequently from the synthetic sequences. To quantify this frequency of observation, we constructed the probability density function p(ω, h) of the synthetic peaks in the space of (ω, h). From p(ω, h), we were able to obtain the probability of observing a synthetic peak similar to the real peak (ωr, hr), that is, p(ωr, hr) dω dh. This probability might be a small number, which is meaningful only when compared with the probability of observing the most probable synthetic peak, p(ωmp, hmp) dω dh. The ratio of the two probabilities quantifies well how frequently we observe the feature from synthetic data.
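The ratio of the two probabilities can be estimated from a two-dimensional histogram of the synthetic peaks, as in the sketch below. The histogram estimator, the bin widths, and the seven toy peaks are assumptions for illustration; the text does not say how p(ω, h) was constructed. Since both probabilities share the same cell size dω dh, the ratio reduces to a ratio of bin counts.

```python
import math
from collections import Counter


def bin_index(point, widths):
    """Index of the histogram cell containing `point`, for the given cell widths."""
    return tuple(math.floor(v / w) for v, w in zip(point, widths))


def density_ratio(synthetic_peaks, real_peak, widths):
    """Count in the real peak's cell divided by the count in the most populated cell."""
    counts = Counter(bin_index(p, widths) for p in synthetic_peaks)
    return counts[bin_index(real_peak, widths)] / max(counts.values())


# Toy (omega, h) peaks from imaginary synthetic sequences.
synthetic_peaks = [(4.2, 9.1), (4.4, 9.6), (4.1, 9.3), (4.3, 9.9),
                   (5.6, 9.2), (5.9, 9.7), (7.5, 3.0)]
real_peak = (5.7, 9.5)
ratio = density_ratio(synthetic_peaks, real_peak, widths=(1.0, 1.0))
```

In this toy case the real peak falls in a cell holding two synthetic peaks while the most populated cell holds four, giving a ratio of 0.5. With only a handful of samples the result depends strongly on the bin widths, so in practice many sequences (1000 in the tests above) are needed.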

The mechanism at the origin of log-periodicity in the synthetic data sets has been discussed in [??]. Briefly, log-periodicity results from the fact that taking the cumulative of a power law involves a low pass filtering step (reddening of the noise) which, in a finite sample, creates a maximum in the spectrum leading to a most probable log-frequency corresponding approximately to 1.5 cycles over the full sampled interval.

We looked into two quantities for log-periodicity in the oscillations around the power law of the cumulative Benioff strain. The extracted oscillations are the difference between the data and the best-fit power law. The de-trended data were obtained using (6). For both of them, the ratio of the two probabilities is bigger than 50%, which means that it is not only possible to observe that kind of log-periodicity in synthetic data, but also highly probable.

Our synthetic events and the real events have the same distribution in time and magnitude, and they were analyzed in exactly the same way. Since discrete scale invariance is not present in the synthetic data, the log-periodicity observed in the real sequence cannot be used as evidence for discrete scale invariance.

In fact, even the power law, used as evidence of ordinary scale invariance, could also be explained by other mechanisms. Indeed, the cumulative Benioff strains of the synthetic sequences in Section 4 do follow power laws similar to that of the real sequence. For the real sequence, the cumulative distribution function of event times is not significantly different from a straight line. The cumulative distribution of moments is not a power law either (it has an S shape instead of the usual power law shape). This is an ad hoc statement: for this sequence of magnitudes, the number of data points is small (only 30) and the magnitude cutoff is very high (5.0), so even if in general the moment distribution is a power law, large statistical fluctuations may make the moment distribution of this sequence depart from a power law. We use an empirical distribution function instead of the usual power law assumption to avoid dependence on that assumption; in fact, our method remains valid whatever the underlying distribution is. But for the real sequence, magnitudes tend to be bigger when closer to the main shock, especially for the last several events [?]. Since a small difference in magnitude translates into quite a big difference in the cumulative Benioff strain, the last several events make the would-be linear trend bend upward, which happens to be well described by a power law of small exponent. Since this increasing tendency of magnitudes was preserved in the generation of synthetic magnitudes, power laws are also good for describing the synthetic data. In fact, without the reordering scheme to preserve that feature of magnitudes, we were still able to obtain similar results. As long as the magnitudes are not exactly uniform, the largest magnitude will bend the would-be linear trend somewhere, and power laws with small exponents can still describe the data well.

The improved accuracy of the prediction of the main shock time of the 1989 Loma Prieta earthquake by consideration of log-periodic oscillations was regarded as evidence supporting the hypothesis in [?]. However, from the study presented here, it is not rare to obtain tf near that value from our synthetic data containing events in the range [1940,1988]. If we use a power law to describe the data, by definition tf should be slightly bigger than the time of the last data point. Indeed, from our simulations, we found that tf is distributed in [1988,1995], and the probability of observing a synthetic tf similar to the real tf is around 95% of the probability of observing the most probable synthetic tf. The point is that, given a sequence of events in that time range and a power law assumption, that kind of tf is not hard to find.

It is important to emphasize that the present study does not alter the usefulness of studying seismic precursors. However, the physical interpretation associated with the observations in [?] does not seem to be warranted, at least on the basis of the Loma Prieta case alone. Nevertheless, as pointed out in [?] and also found in this paper, log-periodic oscillations are robust features of power laws. The present analysis as well as those given in [?] suggest that, whatever their origin (noise or physical), they might still be used to improve the prediction of the main event. This is clearly what we observe in our synthetic tests performed on pure power laws without log-periodicity: a power law fit with log-periodicity gives a better estimate of tf than a pure power law without the log-periodic oscillations. The reason may be that, by fitting the most probable form of noise, the fit is more stable. It seems worthwhile to investigate this possibility further in future studies.

Acknowledgments. We are grateful to A. Johansen and C. Sammis for their help in retrieving the data and to Rick Schoenberg and Y.Y. Kagan as referees for constructive remarks.