## Abstract

The term ‘extreme ocean climate estimation’ refers to the assessment of the statistical distribution of extreme oceanographical geophysical variables. Components of the ocean climate are variables, such as the storm surge, wind velocity and significant wave height. Important characteristics of extreme ocean climate are the frequencies of the exceedances of ocean climate variables over selected thresholds. Assuming that exceedances are statistically independent of each other, their frequencies can be estimated using non-homogeneous Poisson processes. However, exceedances often exhibit temporal dependency because of the tendency of storms to gather in clusters. We assess the effect of these dependencies on the estimation of the rate of occurrence of extreme events. Using a database built under the HIPOCAS European project, which covers the Western Mediterranean Sea, we compare the performance of the non-homogeneous Poisson process approach versus a new model that allows for temporal dependency. We show that the latter outperforms the former in terms of the resulting goodness of fit and significance of the parameters involved.

## 1. Introduction

In the oceanographic context, the estimation of the frequency of extreme events is essential for the analysis of the coastal protection at a particular site. The coastal flooding risk, the reliability of a maritime work and the hydrodynamic conditions of valuable ecosystems are some examples of coastal features that are highly influenced by the rate of occurrence of major storms in a given area. Basically, we are concerned with the evolution in time *t* of sea state parameters, such as wind velocity, *W*(*t*), significant wave height, *H*(*t*), mean wave period, *T*(*t*) and surge, *S*(*t*). The latter term refers to the sea-level anomalies induced by meteorological forcing in the form of changes in air pressure and winds (Pugh 2004). We are also interested in temporal parameters, such as the duration of exceedances, *D*(*t*) (Sobey & Orloff 1999) and the rate of occurrence, *ν*(*t*), of a particular intensity level of *W*, *H* or *S*. The statistical distribution of (*W*, *H*, *T*, *S*, *D*, *ν*) at a given point in the sea can be called ocean climate.

Historically, the study of ocean climate has been carried out by means of instrumental devices for waves (e.g. buoys; see Massel 1996) and for sea level (e.g. tide gauges; see Pugh 2004). This information, although quantitatively suitable, is usually sparse in time and space and often has many missing data. In recent years, some non-intrusive tools have been developed; e.g. satellite data (Krogstad & Barstow 1999) and databases from hindcast numerical models (WASA 1998). In particular, the hindcast or reanalysis models allow a very detailed description in time and space of ocean climate. In general, atmospheric fields in time and space (Weisse & Feser 2003) drive state-of-the-art wave (WAMDI 1988) and sea-level numerical models (Alvarez *et al*. 1998). These numerically generated databases allow a spatial characterization of ocean climate over extensive areas. The area of interest in this paper is the Western Mediterranean basin. Reanalyses of wind waves and sea level in the Mediterranean are used within the framework of the HIPOCAS European project (Guedes Soares *et al*. 2002). These databases have been extensively validated (EPPE 2003; Garcia-Sotillo *et al*. 2005) and consist of a 44-year hindcast (1958–2001) of hourly data with a spatial resolution of 0.125° for waves, about 0.25° for surge and 0.5° for wind velocity. The purpose of this paper is to analyse the influence of the temporal dependence on the modelling of the rate of occurrence (the frequency) of extreme ocean climate events, focusing mainly on the analysis of extreme storm surge. The spatial variability of this temporal dependence is also considered.

Figure 1 shows the time-series of positive surges at Valencia (Spain) from 1958 to 2001. The figure also shows the exceedances over the 99.5% quantile of the surge empirical distribution. One can see that some exceedances are sufficiently far from each other to justify the hypothesis that they are statistically independent, at least as a useful approximation. However, clusters of exceedances appear from time to time in which the individual exceedances are so close that they cannot be assumed to be statistically independent (Hsing *et al*. 1988). For example, the three exceedances occurring on 18 (20 h), 21 (7 h) and 23 (18 h) January 1963 form one of these clusters (marked in the upper part of figure 1 as i_{10}), which is associated with a low barometric pressure disturbance moving eastward. An approach that has often been used to avoid this problem is to choose a time span (e.g. 3 days), such that extreme events separated by less than this period of time are considered as a ‘unique event’, whose magnitude and time of occurrence are considered to be those of the most extreme event in the cluster. This practice, however, has at least the following two flaws:

(i) by doing so, the number of extreme events is reduced artificially, with the consequence that some empirical evidence is disregarded (see figure 2);

(ii) it is not easy to determine the length of the time span appropriate for this purpose.
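The time-span practice described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `decluster`, the sample data, and the choice to link exceedances in a chain (each exceedance joins the current cluster when it follows the previous exceedance by less than the time span) are not taken from the source.

```python
def decluster(times, values, span):
    """Merge exceedances closer than `span` and keep each cluster's peak.

    times  -- exceedance times in days, sorted in ascending order
    values -- exceedance magnitudes, same length as `times`
    span   -- time span in days; span = 0 keeps every exceedance
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    peaks = []        # (time, value) of the most extreme event in each cluster
    best = None       # peak of the cluster currently being built
    last_time = None  # time of the most recent exceedance seen
    for t, v in zip(times, values):
        if best is not None and t - last_time < span:
            # Same cluster: keep the larger magnitude as the representative.
            if v > best[1]:
                best = (t, v)
        else:
            # Gap of at least `span`: close the previous cluster, start a new one.
            if best is not None:
                peaks.append(best)
            best = (t, v)
        last_time = t
    if best is not None:
        peaks.append(best)
    return peaks


# Hypothetical exceedances (days from an arbitrary origin, surge in metres).
example_times = [0.0, 2.5, 4.9, 40.0, 41.0, 90.0]
example_values = [0.31, 0.42, 0.35, 0.30, 0.29, 0.33]
example_peaks = decluster(example_times, example_values, span=3.0)
```

With a 3-day span the six hypothetical exceedances collapse into three events, which illustrates flaw (i): half of the observations no longer enter the fit.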

Throughout the paper we analyse the rate of occurrence of exceedances over the 99.5 and 95% quantiles of the empirical distributions of three variables; namely, the storm surge, wind velocity and significant wave height. In these studies, we compare the results provided by a standard non-homogeneous Poisson process approach (§2) versus those obtained with a new approach that accounts for the statistical dependence of temporally close storms (§3). To compare these two approaches, we have considered the set of data of the HIPOCAS project around the Western Mediterranean coastal area, including the Balearic Islands (see figure 3). In each of these nodes we have studied the extreme behaviour of the three variables. As we shall see throughout the paper, the results consistently demonstrate the superiority of the new model. For illustration, estimation of the rate of occurrence of the exceedances over the 99.5% quantile of the surge empirical distribution at a typical node, using the non-homogeneous Poisson process approach of §2, without time span and with time spans of 3, 6 and 10 days, yields the quantile–quantile (QQ) plots shown in figure 4. One can see that the goodness of fit improves as the length of the time span increases; this improvement is very remarkable when the time span increases from 0 to 3 days, somewhat less impressive from 3 to 6 days, and almost inappreciable from 6 to 10 days. However, the number of extreme events decreases (from 137 to 94, 83 and 80) as the length of the time span increases (from 0 to 3, 6 and 10 days, respectively). The same problem appears for other thresholds; e.g. the 95% quantile of the surge empirical distribution leads to figure 5, which shows even worse goodness of fit of the Poisson model for all the time spans considered. In contrast, the approach in §3, which takes the statistical dependence of temporally close storms into account, leads to the QQ plots of figure 6. 
These QQ plots are not only much more satisfactory than those in figures 4 and 5, respectively, but also take all the 137 and 802 exceedances into account. Both selected thresholds (99.5 and 95%) are important from an engineering point of view. Thus, the 99.5% quantile is often used as an appropriate threshold to determine exceedances when using extreme value models to design the structural defences of harbours (i.e. the reliability of maritime works; see Ferreira & Guedes Soares 1998; Goda 2000). Moreover, the threshold based on the 95% quantile can be used to define the harbour tranquility or offshore agitation (i.e. the operativity of maritime works; see Bruun 1981).

The non-homogeneous Poisson process approach is described and applied to the nodes of figure 3 in §2, whereas §3 describes the new model and reanalyses the same datasets taking temporal dependence into account. Concluding remarks are given in §4.

## 2. Fitting a non-homogeneous Poisson process

### (a) The model

Figure 7 shows the occurrence of extreme storm-surge events (exceedances) during 44 years in Valencia (Spain), based on the 99.5 and 95% quantiles of the surge empirical distribution. One can see that the rate of occurrence of exceedances presents a clear seasonal pattern. When using the 99.5% quantile, no single event occurs during a large period within the year, which includes summer. However, when using the 95% quantile, there are some scattered events even during summer. Let us suppose that the occurrence of these extreme events can be modelled as a non-homogeneous Poisson process with intensity *ν*(*t*) (see Davison & Smith 1990; Coles 2001; Smith 2001). Let *t*_{0} be the time at which the observation of the Poisson process starts and *S*_{T}=(*t*_{1}, …, *t*_{n}) be the times at which consecutive exceedances occur. Under the Poisson hypothesis, the cumulative distribution function of the time *t*_{i} at which the *i*th event occurs, conditioned on the time *t*_{i−1} of the previous event, is given by

$$F(t_i \mid t_{i-1}) = 1 - \exp\left\{-\int_{t_{i-1}}^{t_i} \nu(t)\,\mathrm{d}t\right\}, \qquad t_i > t_{i-1}. \tag{2.1}$$

Consequently, the joint probability density function (PDF) of *S*_{T}, given *t*_{0}, is

$$f(S_T \mid t_0) = \prod_{i=1}^{n} \nu(t_i)\exp\left\{-\int_{t_{i-1}}^{t_i} \nu(t)\,\mathrm{d}t\right\}. \tag{2.2}$$

To approximate the seasonal behaviour of the occurrence of exceedances, we shall consider the following models for *ν*(*t*), where *t* is expressed in years:

*Single harmonic model with long-term trend (SHM)*. The intensity *ν*_{S}(*t*) is given by equation (2.3), where the parameters *θ*_{0}, *θ*_{1}, *θ*_{2} must be such that *ν*_{S}(*t*)≥0 for all *t*.

*Double harmonic model with long-term trend (DHM)*. The intensity *ν*_{D}(*t*) is given by equation (2.4), where the parameters *θ*_{0}, …, *θ*_{4} must be such that *ν*_{D}(*t*)≥0 for all *t*.

*Trimmed single harmonic model with long-term trend (TSHM)*. The intensity *ν*_{TS}(*t*) is given by equation (2.5), where the parameters *θ*_{0}, *θ*_{1}, *θ*_{2} in (2.3) are such that *ν*_{TS}(*t*)>0 for some values of *t*.

*Trimmed double harmonic model with long-term trend (TDHM)*. The intensity *ν*_{TD}(*t*) is given by equation (2.6), where the parameters *θ*_{0}, …, *θ*_{4} in (2.4) are such that *ν*_{TD}(*t*)>0 for some values of *t*.

The factor exp(*θ*_{5}*t*) in the right-hand side of equations (2.3)–(2.6) accounts for temporal trends in the intensity of the Poisson process. Parallel detrended models can, obviously, be considered imposing that *θ*_{5}=0.
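For readers who want to experiment with the four intensity models, the sketch below shows one plausible parameterization. It is an assumption, not the authors' stated formula: a constant plus annual (and, for the double harmonic models, semi-annual) cosine and sine terms, multiplied by the trend factor exp(*θ*_{5}*t*), with the trimmed variants truncating the harmonic part at zero. The ordering of cosine and sine coefficients, the function names and the sample parameter values are all hypothetical.

```python
import math


def harmonic(t, theta, n_harmonics):
    """theta[0] + sum over k of theta[2k-1]*cos(2*pi*k*t) + theta[2k]*sin(2*pi*k*t).

    t is in years, so k = 1 is the annual cycle and k = 2 the semi-annual one.
    """
    total = theta[0]
    for k in range(1, n_harmonics + 1):
        total += theta[2 * k - 1] * math.cos(2.0 * math.pi * k * t)
        total += theta[2 * k] * math.sin(2.0 * math.pi * k * t)
    return total


def intensity(t, theta, trend=0.0, n_harmonics=1, trimmed=False):
    """Assumed form of the SHM, DHM, TSHM and TDHM intensities (events per year).

    theta       -- harmonic coefficients (3 values for single, 5 for double)
    trend       -- plays the role of theta_5; trend = 0 gives the detrended model
    n_harmonics -- 1 for the single harmonic models, 2 for the double ones
    trimmed     -- if True, negative harmonic values are replaced by zero
    """
    base = harmonic(t, theta, n_harmonics)
    if trimmed:
        base = max(0.0, base)
    return base * math.exp(trend * t)


# Hypothetical coefficients: the untrimmed harmonic goes negative in mid-year,
# so only the trimmed model is a valid (non-negative) intensity here.
theta_example = [1.0, 2.0, 0.0]
winter_rate = intensity(0.0, theta_example, trimmed=True)
summer_rate = intensity(0.5, theta_example, trimmed=True)
```

Under this assumed form, trimming yields an intensity that is exactly zero over part of the year, which is the kind of behaviour needed to represent a season with no exceedances.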

Substituting *ν*_{S}(*t*), *ν*_{D}(*t*), *ν*_{TS}(*t*) or *ν*_{TD}(*t*) for *ν*(*t*) in equation (2.2), and taking logarithms, leads to the log-likelihood function of the parameters *θ*_{i} given the sample *S*_{T},

$$\ell(\theta \mid S_T) = \sum_{i=1}^{n} \log \nu(t_i) - \int_{t_0}^{t_n} \nu(t)\,\mathrm{d}t, \tag{2.7}$$

which may be used to obtain maximum likelihood (ML) estimates of the parameters *θ*_{i}.
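The log-likelihood of a non-homogeneous Poisson process observed from *t*_{0} up to the last event is the sum of the log-intensities at the event times minus the integrated intensity. The sketch below evaluates it for an arbitrary intensity function using a simple trapezoidal rule; the helper names and the toy data are illustrative assumptions, and a real fit would hand this function to a numerical optimiser rather than to the coarse grid search shown here.

```python
import math


def integrate(f, a, b, steps=2000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


def poisson_loglik(nu, t0, event_times):
    """Sum of log nu(t_i) minus the integral of nu from t0 to the last event."""
    if not event_times:
        return 0.0
    log_terms = 0.0
    for t in event_times:
        rate = nu(t)
        if rate <= 0.0:
            # An event at a time of zero intensity has zero likelihood.
            return -math.inf
        log_terms += math.log(rate)
    return log_terms - integrate(nu, t0, event_times[-1])


# Toy check with a constant intensity c: the log-likelihood is n*log(c) - c*(t_n - t0),
# which is maximized at c = n / (t_n - t0) = 4 / 4 = 1 for the data below.
toy_events = [0.5, 1.0, 2.0, 4.0]
candidates = [0.5 + 0.1 * k for k in range(11)]
best_rate = max(
    candidates,
    key=lambda c: poisson_loglik(lambda t: c, 0.0, toy_events),
)
```

The constant-intensity case is used only because its maximum is known in closed form, which makes the sketch easy to check.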

### (b) Application to storm-surge data

Let us suppose that the extreme storm-surge events in the Valencia data of figure 7 can be considered statistically independent, so that they occur according to a non-homogeneous Poisson process with intensity given by one of the models (2.3)–(2.6). One can fit these models using ML estimators and, subsequently, assess the goodness of fit of the models (including the independence hypothesis) using QQ plots. For example, given *t*_{i−1} (*i*=1, …, *n*), equation (2.1) can be used to transform the random variables *t*_{i} (*i*=1, …, *n*) to the independently and uniformly distributed random variables

$$y_i = 1 - \exp\left\{-\int_{t_{i-1}}^{t_i} \nu(t)\,\mathrm{d}t\right\}, \qquad i = 1, \ldots, n. \tag{2.8}$$

A QQ plot can be obtained using the plotting points (*y*_{(i)}, *r*_{i}), where *y*_{(i)} is the *i*th sample order statistic for the transformed sample {*y*_{1}, …, *y*_{n}} and *r*_{i} is a function of the rank *i* of *y*_{(i)} (e.g. if the modified Kaplan–Meier score is used, *r*_{i}=(*i*−0.5)/*n*).
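The transformation behind these QQ plots can be sketched as follows: each waiting time is mapped through the conditional distribution function of the fitted process, and the sorted values are paired with plotting positions. The function names are hypothetical, and the plotting position (*i*−0.5)/*n* is one common choice assumed here for illustration.

```python
import math


def integrate(f, a, b, steps=2000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


def qq_points(nu, t0, event_times, steps=2000):
    """Return the plotting points (y_(i), r_i) for a fitted intensity nu.

    y_i = 1 - exp(-integral of nu from t_{i-1} to t_i) should look uniform
    on (0, 1) if the Poisson model, including independence, is adequate.
    """
    y = []
    previous = t0
    for t in event_times:
        y.append(1.0 - math.exp(-integrate(nu, previous, t, steps)))
        previous = t
    y.sort()
    n = len(y)
    # Assumed plotting position: (i - 0.5) / n for the 1-based rank i.
    return [(y[i], (i + 0.5) / n) for i in range(n)]


# Toy data for a unit-rate process: waiting times log(2) and log(4)
# map to y = 0.5 and y = 0.75 respectively.
toy_events = [math.log(2.0), math.log(2.0) + math.log(4.0)]
toy_points = qq_points(lambda t: 1.0, 0.0, toy_events)
```

Clusters of events produce many short waiting times and therefore an excess of transformed values near zero, which shows up as a departure from the diagonal of the QQ plot.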

The upper-left subplot of figure 4 provides the QQ plot obtained for the Valencia data with threshold placed at the estimated 99.5% quantile of the surge distribution. Because this subplot suggests a lack of fit, which is caused by the presence of too many values of *y*_{(i)} close to 0, one may conclude that the independence hypothesis cannot be accepted. We have repeated the analysis using non-null time spans of 3, 6 and 10 days. The remaining subplots of figure 4 provide the corresponding QQ plots. One can see that, as the time span increases, the fit improves. However, the number of data decreases considerably.

Figure 8 shows the estimates obtained for *ν*(*t*) during a period of four consecutive years using a time span of 10 days (so that the sample size is 80) and the four models in equations (2.3)–(2.6). One can see that the TSHM and TDHM in (2.5) and (2.6) appear to be best, suggesting that there is a large period of time within the year (including summer) during which no extreme events (at the 99.5% quantile) occur in Valencia.

### (c) Application to wind velocity and significant wave height data

We have repeated the analysis for the wind velocity and significant wave height data at the points shown in figure 3. The results are similar to those previously obtained for storm surge. For the sake of brevity, we do not show any QQ plots here. Nevertheless, we provide some summarizing results in §3.

## 3. Taking temporal dependence into account

### (a) The model

Figure 1 suggests that there is a tendency for extreme events (exceedances) to appear in clusters. This tendency induces temporal dependence between successive exceedances. Consequently, when the summer of a given year ends, we can start saying that during the next storm season the occurrence of exceedances will be determined by a non-homogeneous Poisson process with intensity *ν*_{1}(*t*). However, as soon as the first event (or any subsequent event) of the season occurs, we shall accept that the intensity of the process increases because of the occurrence of that event. This may be modelled by assuming that the intensity of the process increases (over *ν*_{1}(*t*)) by some amount *ν*_{2}(*t*) for some period of time immediately following the occurrence of each event.

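More specifically, assuming that *t*_{0} is a given instant during summer, the probability that no event occurs during the time interval (*t*_{0},*t*_{1}] is $\exp\left\{-\int_{t_0}^{t_1}\nu_1(t)\,\mathrm{d}t\right\}$. Therefore, the PDF of the time of occurrence of the first event is

$$f(t_1 \mid t_0) = \nu_1(t_1)\exp\left\{-\int_{t_0}^{t_1}\nu_1(t)\,\mathrm{d}t\right\}. \tag{3.1}$$

Similarly, given that the time of occurrence of the (*i*−1)th event is *t*_{i−1}, the PDF of the time of occurrence of the *i*th event is

$$f(t_i \mid t_{i-1}) = \left[\nu_1(t_i)+\nu_2(t_i \mid t_{i-1})\right]\exp\left\{-\int_{t_{i-1}}^{t_i}\left[\nu_1(t)+\nu_2(t \mid t_{i-1})\right]\mathrm{d}t\right\}, \qquad i = 2, \ldots, n. \tag{3.2}$$

Although the process we have just defined is not a process of independent increments, one can still compute the joint PDF of the times *S*_{T}=(*t*_{1}, …, *t*_{n}) at which consecutive extreme events occur by using

$$f(S_T \mid t_0) = f(t_1 \mid t_0)\prod_{i=2}^{n} f(t_i \mid t_{i-1}), \tag{3.3}$$

where the factors in (3.3) are given by (3.1) and (3.2).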

Consequently, if the baseline intensity *ν*_{1}(*t*) is given by one of the models (2.3)–(2.6), and the dependence-inducer intensity *ν*_{2}(*t*) is modelled as in equation (3.4), the log-likelihood function of the parameters *θ*_{i} and *γ*_{i}≥0 can be obtained by simply taking logarithms in (3.3). Maximization of this log-likelihood function produces ML estimates for *θ*_{i} and *γ*_{i}.
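A minimal sketch of such a log-likelihood is given below. The baseline intensity is passed in as a function, and the extra intensity after each event is taken, purely as an assumption made for illustration, to be an exponentially decaying term `gamma0 * exp(-gamma1 * (t - t_prev))`. That functional form, the function names and the toy data are hypothetical and should not be read as the model of equation (3.4).

```python
import math


def integrate(f, a, b, steps=2000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h


def excess_intensity(t, t_prev, gamma0, gamma1):
    """Assumed extra intensity after an event at t_prev (hypothetical form)."""
    return gamma0 * math.exp(-gamma1 * (t - t_prev))


def dependence_loglik(nu1, gamma0, gamma1, t0, event_times, steps=2000):
    """Log of the product of conditional densities of successive event times.

    The first event sees only the baseline nu1; every later event sees
    nu1 plus the excess intensity triggered by the previous event.
    """
    loglik = 0.0
    previous = t0
    for i, t in enumerate(event_times):
        is_first = (i == 0)

        def total(s, prev=previous, first=is_first):
            extra = 0.0 if first else excess_intensity(s, prev, gamma0, gamma1)
            return nu1(s) + extra

        rate = total(t)
        if rate <= 0.0:
            return -math.inf
        loglik += math.log(rate) - integrate(total, previous, t, steps)
        previous = t
    return loglik


# Toy clustered data (years): two tight groups of events.
clustered = [1.0, 1.01, 1.02, 3.0, 3.01]
baseline = lambda s: 1.0
loglik_independent = dependence_loglik(baseline, 0.0, 0.0, 0.0, clustered)
loglik_dependent = dependence_loglik(baseline, 50.0, 50.0, 0.0, clustered)
```

Setting `gamma0 = 0` recovers the Poisson log-likelihood, so the two models are nested; on the clustered toy data the dependent version attains a much higher log-likelihood.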

### (b) Application to storm-surge data

The extreme storm-surge events in the Valencia data of figure 7, which were analysed in §2*b* using the non-homogeneous Poisson process of §2*a*, can now be reanalysed replacing this Poisson process by the temporal dependence model just introduced. By doing so, we are able to consider the whole sample of 137 extreme events (for the 99.5% threshold) and 802 extreme events (for the 95% threshold) by using zero time span and, simultaneously, obtain much more satisfactory QQ plots than in §2*b*.

Thus, after fitting the temporal dependence model of §3*a* using ML methods, one can obtain QQ plots based on equations (3.1) and (3.2). More specifically, given *t*_{i−1} (*i*=1, …, *n*), the random variables *t*_{i} (*i*=1, …, *n*) can be transformed to the independently and uniformly distributed random variables

$$\psi_i = 1 - \exp\left\{-\int_{t_{i-1}}^{t_i}\left[\nu_1(t)+\nu_2(t \mid t_{i-1})\right]\mathrm{d}t\right\}, \qquad i = 1, \ldots, n, \tag{3.5}$$

where *ν*_{2}(*t*|*t*_{0})=0 for the first event. QQ plots can then be obtained using the plotting points (*ψ*_{(i)}, *r*_{i}) and otherwise applying the same procedure as in §2*b*.

For the extreme storm-surge events in the Valencia data of figure 7, the temporal dependence model of §3*a* leads to the QQ plots of figure 6, which are much more satisfactory than the corresponding QQ plots in figures 4 and 5. Figure 9 shows the estimates obtained for the intensity of the process during a period of four consecutive years using zero time span (so that the sample size is 137) and the TDHM in equation (2.6); the parameter estimates are , , , , , , and , all of them expressed as yr^{−1}. One can see that the rate of occurrence of extreme (at the 99.5% quantile) storm-surge events increases considerably for a short period immediately following the occurrence of each such event.

We have repeated the same analysis for all the points in figure 3. For the sake of brevity, we summarize the results in the map of figure 10, which shows half the likelihood ratio test statistic (i.e. the difference between the maximum values obtained for the log-likelihood function under the two models) to test the null hypothesis *H*_{0}:*γ*_{0}=*γ*_{1}=0. This hypothesis (of temporal independence) is rejected at all the points shown in the map at a significance level of 0.005.
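The decision rule behind such a map can be sketched as follows. The sketch assumes the usual chi-square reference distribution with two degrees of freedom (one per constrained parameter), which the text does not spell out; because the parameters *γ*_{i} are constrained to be non-negative, the null value lies on the boundary of the parameter space and the exact reference distribution may differ. The function names and the numerical inputs are hypothetical.

```python
import math


def half_lr_statistic(loglik_dependence, loglik_poisson):
    """Difference between the maximized log-likelihoods of the two models."""
    return loglik_dependence - loglik_poisson


def chi2_two_df_pvalue(half_statistic):
    """P(X > 2 * half_statistic) for X chi-square with 2 degrees of freedom.

    For 2 degrees of freedom the survival function is exp(-x / 2),
    so the p-value is simply exp(-half_statistic).
    """
    return math.exp(-max(0.0, half_statistic))


def rejects_independence(loglik_dependence, loglik_poisson, alpha=0.005):
    """True if the null hypothesis gamma_0 = gamma_1 = 0 is rejected at level alpha."""
    half = half_lr_statistic(loglik_dependence, loglik_poisson)
    return chi2_two_df_pvalue(half) < alpha


# Under this assumption the critical value for half the statistic is -log(alpha).
critical_half_statistic = -math.log(0.005)

# Hypothetical maximized log-likelihoods at two grid nodes.
node_a_rejected = rejects_independence(-410.0, -416.0)  # half statistic 6.0
node_b_rejected = rejects_independence(-410.0, -415.0)  # half statistic 5.0
```

Working with half the statistic is convenient here because, under the two-degree-of-freedom assumption, the mapped quantity can be compared directly with −log(*α*).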

### (c) Application to wind velocity and significant wave height data

We have repeated the analysis for the wind velocity and significant wave height data at the points shown in figure 3. The two maps of figure 11 show half the likelihood ratio test statistic to test the null hypothesis *H*_{0}:*γ*_{0}=*γ*_{1}=0 for the ‘99.5% threshold and zero time span’ extreme wind velocity events (upper map) and significant wave height events (lower map), based on the temporal dependence model of §3*a* and the TDH model of equation (2.6). Using a level of significance of 0.005, the null hypothesis (of temporal independence) is rejected at 98.62 and 56.40% of the points, respectively; whereas, for a significance level of 0.05, these percentages increase to 99.65 and 74.39%.

### (d) Application to 95% thresholds

We have performed the same analysis setting thresholds at the 95% quantiles of the surge, wind velocity and significant wave height empirical distributions. The resulting figures (not shown) indicate that the trimmed models in (2.5) and (2.6) provide estimates closer to those of their untrimmed versions in (2.3) and (2.4) than for the 99.5% threshold, which can be explained by the extreme events occurring during the summer season. Moreover, the DHMs in (2.4) and (2.6) significantly outperform their single harmonic counterparts in (2.3) and (2.5), as a consequence of the large number of exceedances resulting for the 95% threshold. Another consequence of this large number of exceedances is that the null hypothesis *H*_{0}:*γ*_{0}=*γ*_{1}=0 can be rejected for smaller values of the probability of the type I error at all the points of figure 3; in other words, the level of significance of the parameters accounting for temporal dependence is smaller for the 95% threshold than for the 99.5% threshold. Furthermore, the QQ plots obtained with the temporal dependence model of §3 (exemplified by the right part of figure 6) show a crucial improvement over the unacceptable QQ plots resulting from the non-homogeneous Poisson model of §2 (exemplified by figure 5), and this is a general pattern for all the points in figure 3.

## 4. Concluding remarks

We have compared two methods for the analysis of the rate of occurrence of extreme storm surge, wind velocity and significant wave height events. We have shown that the model that contains terms accounting for temporal dependence outperforms the non-homogeneous Poisson process approach based on the independence of successive extreme events.

Concerning extreme storm-surge events, figure 10 indicates that temporal dependence is more pronounced in the most protected areas of gulfs, such as the Strait of Gibraltar, the Levantine coast of Spain or the Gulf of Lion (in the south of France). This behaviour suggests that temporal dependence can be associated with persistent atmospheric situations in which low barometric pressures associated with storms cause accumulation of water in closed areas such as gulfs. In the area of the Strait of Gibraltar, temporal dependence may also be caused by the very frequent storms appearing in the Alboran Sea (north of Morocco and Algeria), which produce persistent easterly winds that easily generate extreme storm-surge events because of the funnel shape of the maritime area.

The same frequent storms in the region of Algeria appear to be the main cause of the strong temporal dependence of extreme winds off the north of Africa, as shown in the upper part of figure 11. Depending on the wind direction and the duration of the wind event (only persistent winds in open water can create the necessary fetch to produce the high waves associated with energetic sea states), extreme wind events may or may not be associated with extreme significant wave height events, so that the latter are less frequent than the former, as a comparison of the upper and lower maps in figure 11 suggests.

## Acknowledgments

The authors thank Puertos del Estado (of the Spanish Ministry of Fomento) for supplying the data from the HIPOCAS project and acknowledge the support of the Oficina Española de Cambio Climático (of the Spanish Ministry of Medio Ambiente). The work was partially funded by projects ENE2004-08172 and CGL2005-05365/CLI from the Spanish Ministry of Educación y Ciencia and by a project entitled ‘Estudio y determinación de regímenes de persistencia de oleaje en el litoral español’ from the Spanish Ministry of Fomento. F.J.M. and M.M. are indebted to the Spanish Ministry of Educación y Ciencia for funding them through the ‘Ramón y Cajal’ and FPI programs, respectively. A.L. acknowledges the support of the Spanish Dirección General de Investigación under grant MTM2005-00287. We thank three anonymous referees for their helpful comments.

## Footnotes

- Received June 15, 2005.
- Accepted December 21, 2005.

- © 2006 The Royal Society