## Abstract

The experimental evidence for the Hall–Petch dependence of strength on the inverse square-root of grain size is reviewed critically. Both the classic data and more recent results are considered. While the data are traditionally fitted to the inverse square-root dependence, they also fit well to many other functions, both power law and non-power law. Difficulties with the inverse square-root expression have been recognized for half a century. Its apparent success is now explained as an artefact of faulty data analysis. A Bayesian meta-analysis shows that the data strongly support the simple inverse or ln*d*/*d* expressions. Since these expressions derive from underlying theory, they are also more readily explicable. It is concluded that the Hall–Petch effect is not to be explained by the variety of theories found in the literature, but is a manifestation of, or is underlain by, the general size effect observed throughout micromechanics, owing to the inverse relationship between the stress required and the space available for dislocation sources to operate.

## 1. Introduction

In the years around 1950, two effects of size were identified in the strength of materials; both can be summarized as *smaller is stronger*. Hall [1] and Petch [2] found that the strength of iron and steel increases when the grain size is smaller. On the basis of the theoretical work on dislocation pile-up by Eshelby *et al.* [3], their work established experimentally the eponymous relationship:

$$\sigma(d) = \sigma_0 + \frac{k_{\mathrm{HP}}}{\sqrt{d}}, \tag{1.1}$$

where *d* is the grain size, *σ*(*d*) is the stress at yield or a flow stress at higher plastic strains, *σ*_{0} is the corresponding stress for large single crystals or very large grained material (we refer to it here as the bulk stress), and *k*_{HP} is a constant that may be predicted by theory or may be considered to be a material constant. This relationship was soon reported to apply quite generally to other metals; however, we show here quantitatively that the data do not, in fact, support equation (1.1).

On the other hand, Frank & van der Merwe [4] and later van der Merwe and co-workers, and especially Matthews and his co-workers, investigated theoretically and experimentally the elastic misfit strain that could be supported by thin epitaxial layers of one metal or semiconductor grown on another. By considering the force balance on threading dislocations or the minimum energy configuration of the system, Matthews developed the relationship between the maximum or critical thickness *h*_{c} for a given misfit *ε*_{0} [5]. For a 001-oriented layer, this is given as
*b* is the magnitude of the Burgers vector, *ν* is Poisson's ratio, and *θ* and *λ* are angles between the slip plane and the Burgers vector and the growth plane. Many versions of equation (1.2) were given subsequently by various authors [6].

Over the decades that followed, these two size effects were addressed by different communities, with very little interaction. Matthews' critical thickness theory was developed and applied largely within the semiconductor device community, in the context of the strained heterostructures required for, e.g., semiconductor lasers [7] and high-electron-mobility transistors [8]. This theory remains essentially correct. The principal modification relevant here was the realization that, for significant plastic relaxation of the elastic strain, a relaxation critical thickness needed to be defined, about four or five times the *h*_{c} of equation (1.2), to take account of the operation of dislocation sources [6,9–11]. From equation (1.2), elastic strain rather than stress (i.e. stress normalized by the relevant elastic modulus), and normalized size (*d* measured in units of Burgers vector *b* or lattice constant *a*_{0}), are the relevant parameters. These considerations lead to a general size-dependence equation:

$$\varepsilon(d) = \frac{\sigma(d)}{Y} = \varepsilon_0 + k\,\frac{\ln d}{d}, \tag{1.3}$$

where *k* is expected to be of the order of unity. In this paper, *d* will be the grain size in units of *a*_{0} and the bulk strength is described by the elastic strain *ε*_{0}=*σ*_{0}/*Y*. Equation (1.3) is theoretically applicable to any situation where a dimension (such as grain size) constrains the size of the dislocation sources that have to operate if plasticity is to occur, and their dislocation curvatures. We refer to it below as the size-effect equation from dislocation curvature (the EDC equation). Our Bayesian meta-analysis of a large body of Hall–Petch data in §3 shows that this equation is supported by the data.
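As a rough numerical illustration of the two competing expressions, the following sketch evaluates equation (1.1) and an assumed form of equation (1.3), namely *ε*_{0}+*k* ln*d*/*d* (or *k*/*d* without the logarithmic term), as suggested by the description in the text. The function names and the example grain size are hypothetical; the values *k*=0.72 and *a*_{0}=0.287 nm are those quoted in §2a.

```python
import math

def hall_petch(d, sigma0, k_hp):
    """Equation (1.1): sigma0 plus an inverse square-root term in grain size d."""
    return sigma0 + k_hp / math.sqrt(d)

def edc(d_norm, eps0=0.0, k=1.0, with_log=True):
    """Assumed form of equation (1.3): eps0 + k*ln(d)/d, or eps0 + k/d.

    d_norm is the grain size in units of the lattice constant a0;
    the result is a normalized strength (stress divided by Young's modulus).
    """
    size_term = math.log(d_norm) / d_norm if with_log else 1.0 / d_norm
    return eps0 + k * size_term

# Hypothetical example: a 10 micrometre grain in iron (a0 = 0.287 nm).
d_norm = 10e-6 / 0.287e-9
print(f"d/a0 = {d_norm:.3g}")
print(f"EDC line, k=0.72, eps0=0: {edc(d_norm, k=0.72):.3g}")
print(f"same without the ln(d) term: {edc(d_norm, k=0.72, with_log=False):.3g}")
```

With *ε*_{0}=0 the second function gives the line below which, according to the argument developed later, no data are expected to fall.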

Meanwhile, in the wider material science community, a number of theories were put forward to supplement the pile-up theory [3] in accounting for the inverse square root of *d* in equation (1.1) (see §4). Size effects became recognized in micromechanical testing generally, in nano-indentation [12], in thin wires under torsion [13], in thin foils in flexure [14] and most dramatically in micropillars under compression [15]. Despite a few key papers (such as that of Nix [16] applying critical thickness theory to thin films, that of Thompson [17] addressing grain-size effects in thin films in the framework of critical thickness theory, and our own [18] applying critical thickness theory to wire torsion and foil bending), theories of the micromechanical size effect proliferated in parallel with the various theories of the Hall–Petch effect. One symptom of this was the expression of the effect of the size of the specimen or of the loaded region in micromechanical testing as

$$\sigma(a) = \sigma_0 + k\,a^{-x}, \tag{1.4}$$

where *a* is some suitable characteristic dimension such as micropillar diameter or indentation contact radius. Much effort has been invested in finding appropriate values of the scaling exponent *x* for particular datasets, particular materials, for types of materials such as FCC or BCC metals, and for large collections of data (e.g. [19,20]). However, we have suggested that such efforts are in vain. Despite apparent good fits to equation (1.4) with various *x* in the range 0<*x*<1, we proposed that *x*=1 (or the equation (1.3)

Returning to the Hall–Petch effect, many authors have considered exponents other than *x*=1/2. Some proposed other exponents because they fit some datasets (e.g. *x*=1/4 [22], *x*=0.66 [23]). Baldwin [24] and Kocks [25,26] pointed out the difficulty (or impossibility) of deciding which exponent fits the data best. On theoretical grounds closely related to the Matthews theory of equation (1.2), Bragg, as early as 1942 [27], and Kocks [26] proposed *x*=1. Hirth acknowledged these earlier proposals, but did not consider them further and adopted *x*=1/2 [28]. Narutani & Takamura [29] showed data for large-grain nickel that fit *x*=1 better at high strain. More recently, Arzt [30] and Saada [31] recognized the theoretical arguments for *x*=1 but also the strength of the experimental evidence for *x*=1/2.

Until recently, it is only in the context of the relationship between subgrain or dislocation cell size and stress during work-hardening that *x*=1 has been considered seriously both experimentally [32–34] and theoretically [35–38]. Following Matthews [5], the argument of similitude notes that if a dislocation structure is at equilibrium under a stress *τ*, then if that structure is rescaled to a size *n* times smaller, the stress must now be *nτ*. Raj & Pharr [34] collated a large amount of experimental data and identified a correlation between the prefactor and the exponent, a correlation of the type that suggests that the range of fitting parameters is simply due to experimental error [39,40]. It may be thought surprising that none of these authors considered extending the argument from sub-grains to the Hall–Petch grain-size effect.

Recently, there has been renewed interest in data, simulation and theory suggesting *x*>1/2 for the Hall–Petch effect itself [41–44]. More typically, Hansen [45] concludes that although no mechanism has been quantified to the extent that it would verify equation (1.1), nevertheless equation (1.1) is empirical and has predictive capability. Indeed, Hahn & Meyers [44] note that equation (1.1), *x*=1/2, with the associated theory of pile-up, is so deeply embedded in the fabric of material science as to be indelible.

In a previous paper, we showed that the micromechanical data are consistent with equation (1.4) with *x*=1 and with equation (1.3) with *k*∼1. We drew attention to the complete lack of any data falling *under* the line of equation (1.3) with *k*∼1 and *ε*_{0}=0, implying that equation (1.3) thus describes the *minimum* strength owing to the size effect [21]. Other strengthening mechanisms then lead to data above the line, but if plasticity occurs through dislocation multiplication and motion, there are no weakening mechanisms to give data below the line. A collection of datasets from the literature displaying the Hall–Petch effect is likewise concentrated above this line. We proposed that this can be taken as experimental *support* for the applicability of equation (1.3) to the Hall–Petch effect, while the data are merely *consistent* with equation (1.1) [46]. It is this proposal that we develop here. In §2, we review the data, both those used in [46] and many additional datasets. We show that the analyses of the datasets taken individually provide no support for equation (1.1), and that these analyses can neither determine the value of *x* in equation (1.4) nor even show that equation (1.4) applies. In §3, we give a fully Bayesian analysis of the support that the data taken as a whole (i.e. a meta-analysis) give to the different hypotheses, equations (1.1) and (1.3). Finally, in §4, we compare the predictions of the different theories of the Hall–Petch effect with the data. We conclude that equation (1.3) and the theory from which it derives always apply. That is, it describes the size dependence of dislocation plasticity *in general*, and *specifically* in the grain-size effect, while of course underlying other effects which may also increase the strength of metals. The indelible may yet be erased.

## 2. Review and analysis of the data

In figure 1, we present 61 datasets, of which 17 were already considered in [46]. These datasets, among many others, are what is referred to by authors who say that the data support equation (1.1). We show in §2b that these datasets support neither equation (1.1) nor even equation (1.4). In §3, we use meta-analysis to show that the ensemble of data supports equation (1.3) very strongly.

### (a) Data selection and presentation

The data presented here have not been selected in any way. They are simply all the data we have found at the time of writing relevant to testing equation (1.1). Our literature search methods consist of using information from colleagues, following up references and citations and Internet search engine results. It is highly implausible that these methods would yield a selective sampling of the literature that could be biased against data supporting equation (1.1).

Most of the original authors plotted against the inverse square-root of grain size and reported straight-line fits in accordance with equation (1.1). Digitized and changed from the authors' units to SI units, all the full datasets are given in the electronic supplementary material. Normalized by Young's modulus *Y* and lattice constant *a*_{0}, they are plotted in figure 1, on double logarithmic axes because of the very wide range of data values. It is worth noting here that the exact value of *Y* for each metal is not important. The purpose is to facilitate comparisons by taking out the *known* differences between metals. We use average or representative values from handbooks for each metal, given below. Similarly, while the Burgers vector **b** for each metal might be known quite accurately, the relevant projection *b* may not be, so we use the handbook values of lattice constant as a proxy. The fits shown in figure 1 are described in §2b. The heavy lines are equation (1.3) drawn for *k*=0.72 and *ε*_{0}=0. Two key features of these plots, analysed below, are that the different fits for *x*=1/2 and *x*=1 diverge significantly only outside the range of each dataset (§2b), and that no data are found significantly below the equation (1.3) *ε*_{0}=0 line (§3).

For each dataset, we give in the electronic supplementary material the information in the original papers about the metallurgical processing, especially grain-size modification and determination, and the yield or flow stress determination, or we mention the absence of this information. However, we do not use this information. The original authors did not correct their raw data for any known effects that might follow from these variables, and it would not be appropriate, even if possible, to do so here.

The normalization constants used for iron and steel are *Y*=211 GPa, *a*_{0}=0.287 nm [46]. The data shown in figure 1*a* come from Hall [1,46] (the attribution to Dunstan & Bushby [46] indicating that we used these data in [46]), Fe(7); Petch [2,46], Fe(1); Armstrong *et al.* [47], Fe(6) (here and below, where there are multiple datasets under one key, it is because the authors reported data at various values of strain); Douthwaite [48], Fe(5); Douthwaite & Evans [49], Fe(2); Kashyap & Tangri [50], Fe(4); Aghaie-Khafri *et al.* [23], Fe(3).

For brass and copper, we used *Y*=115 GPa and *a*_{0}=0.361 nm [46]. Data shown in figure 1*b* come from Bassett & Davis [46,51], B(1), B(2) and Babyak & Rhines [46,52]. We took these data from Jindal & Armstrong [53], who plotted them against *d*^{−1/2}. Armstrong & Elban [54] also reported that Mathewson [22] fitted the inverse fourth-root to the data of Bassett & Davis [51]. Data come also from [46,47], B(4) and [46,48], B(5). The copper data in figure 1*c* are from Feltham & Meakin [55], Cu(3); Hansen & Ralph [56], Cu(1) (room temperature data) and Cu(2) (77 K data).

Some of the data in figure 1*d* are diamond point hardness (DPH) data, for which we divide by the Tabor factor of 2.8. The W (DPH) data come from Vashi *et al.* [46,57]; Cr (DPH) from Brittain *et al.* [46,58]; Ti(1) (DPH) from Hu & Cline [46,59]; and Ti(2) from Jones & Conrad [60]. For W, we used *Y*=411 GPa and *a*_{0}=0.316 nm; for Cr, *Y*=279 GPa and *a*_{0}=0.228 nm; for Ti, *Y*=116 GPa and *a*_{0}=0.295 nm [46].

For silver, we used *Y*=83 GPa and *a*_{0}=0.409 nm. The data in figure 1*e* are from Aldrich & Armstrong [61], Ag(1), Ag(2). They compared linear fits to *d*^{−1}, *d*^{−1/2} and *d*^{−1/3} and concluded that *d*^{−1/2} fitted best. They ruled out the *d*^{−1/3} fit on the grounds that it gives an unphysical negative intercept on the *y*-axis; note that the datasets Fe(3), Au, Al(4) and Al(5) do the same in the *d*^{−1/2} fits. The dataset Au, using *Y*=79 GPa and *a*_{0}=0.408 nm, is from Emery & Povirk [62]. The nickel data, with *Y*=200 GPa and *a*_{0}=0.352 nm, are from Thompson [63], Ni(1); Keller & Hug [64], Ni(2); Narutani & Takamura [29], Ni(3). Keller & Hug studied foils with a thickness to grain size ratio *t*/*d* between 1.3 and 15. At the yield stress, they observed normal Hall–Petch behaviour for *t*/*d*=15 and we use these data. For higher strain and smaller *t*/*d*, deviations from the normal Hall–Petch behaviour were observed, and explained in terms of the effect of the free surface on the work-hardening mechanisms. These data are not considered here.

Figure 1*f* shows data for aluminium, using *Y* =70 GPa and *a*_{0}=0.316 nm. They are from Carreker & Hibbard [65], Al(3); Hansen [66], Al(1) and Al(2); Tsuji *et al.* [67], Al(4); Yu *et al.*[68], Al(5) and Al(6).
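The normalization described above can be sketched as follows. The constants are those stated in the text for each metal; the function name, the dictionary keys and the choice of input units (MPa and micrometres) are assumptions made for illustration, and hardness values are assumed to have been converted to MPa already.

```python
# Normalization constants as quoted in the text:
# Young's modulus Y in GPa, lattice constant a0 in nm.
NORMALIZATION = {
    "Fe": (211.0, 0.287),        # iron and steel
    "brass_Cu": (115.0, 0.361),  # brass and copper
    "W": (411.0, 0.316),
    "Cr": (279.0, 0.228),
    "Ti": (116.0, 0.295),
    "Ag": (83.0, 0.409),
    "Au": (79.0, 0.408),
    "Ni": (200.0, 0.352),
    "Al": (70.0, 0.316),
}
TABOR_FACTOR = 2.8  # divisor applied to diamond point hardness data

def normalize(metal, stress_mpa, grain_size_um, is_hardness=False):
    """Return (stress / Y, grain size / a0) as dimensionless numbers."""
    y_gpa, a0_nm = NORMALIZATION[metal]
    if is_hardness:
        # Convert a hardness value to an equivalent flow stress.
        stress_mpa = stress_mpa / TABOR_FACTOR
    strain = (stress_mpa * 1e6) / (y_gpa * 1e9)
    d_norm = (grain_size_um * 1e-6) / (a0_nm * 1e-9)
    return strain, d_norm

# Hypothetical data point: 150 MPa at a grain size of 20 micrometres in steel.
print(normalize("Fe", 150.0, 20.0))
```

Because only approximate comparability between metals is sought, handbook values of *Y* and *a*_{0} are sufficient for this step.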

### (b) Fits to the data

Using the *Mathematica*^{©} function *NonlinearModelFit*, the data were fitted with equation (1.1) (HP fit), with equation (1.3) with *k* as a free-fitting parameter (EDC fit), and with equation (1.4) with *x*=1 (SI fit) and also with *x* a free-fitting parameter (EQ4 fit). Some (not all, for clarity) of these fits are shown in figure 1. Full details, fitting parameter values and *R*^{2} values are given in the electronic supplementary material.

All the fits are very good, with *R*^{2} values typically well over 0.999. However, the exponents returned by the EQ4 fit are scattered about *x*=1/2: 25 are more than and 25 less than 1/2. All 40 of the datasets returning *x*<0.7 have *R*^{2} values favouring the HP fit over the SI and EDC fits; only 15 with *x*>0.7 have *R*^{2} values favouring the SI or EDC fits. These observations might be taken to favour the HP fit. A detailed analysis of a few typical datasets, however, shows that it is not so.
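The kind of comparison described here can be sketched without *Mathematica*. The code below is not the authors' procedure: it swaps in ordinary linear least squares on the transformed variable *d*^{−x} for the fixed-exponent fits (HP, SI), and a simple grid scan over *x* for the free-exponent (EQ4) fit. All function names and the synthetic data are hypothetical.

```python
def linear_fit(u, y):
    """Ordinary least squares for y = c0 + c1*u; returns (c0, c1, r_squared)."""
    n = len(u)
    u_mean = sum(u) / n
    y_mean = sum(y) / n
    s_uu = sum((ui - u_mean) ** 2 for ui in u)
    s_uy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    c1 = s_uy / s_uu
    c0 = y_mean - c1 * u_mean
    ss_res = sum((yi - (c0 + c1 * ui)) ** 2 for ui, yi in zip(u, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return c0, c1, 1.0 - ss_res / ss_tot

def fit_fixed_exponent(d, y, x):
    """Fit y = y0 + k*d**(-x) for a given x (x=1/2 is HP, x=1 is SI)."""
    return linear_fit([di ** (-x) for di in d], y)

def fit_free_exponent(d, y, x_grid):
    """EQ4-style fit: scan x over a grid, keep the exponent with the highest R^2."""
    fits = ((x,) + fit_fixed_exponent(d, y, x) for x in x_grid)
    return max(fits, key=lambda fit: fit[3])  # (x, y0, k, r_squared)

# Hypothetical noise-free data obeying the inverse square-root law exactly.
grain_sizes = [10.0, 20.0, 50.0, 100.0, 200.0, 500.0]
stresses = [5.0 + 30.0 * d ** -0.5 for d in grain_sizes]

hp = fit_fixed_exponent(grain_sizes, stresses, 0.5)
si = fit_fixed_exponent(grain_sizes, stresses, 1.0)
eq4 = fit_free_exponent(grain_sizes, stresses, [i / 20 for i in range(2, 31)])
print("HP  R^2:", hp[2])
print("SI  R^2:", si[2])
print("EQ4 best exponent:", eq4[0])
```

Printing the two *R*^{2} values side by side for a real dataset gives a feel for how little the coefficient of determination discriminates between exponents over a limited range of grain size.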

We choose Cu(1) for this detailed analysis because the three datasets in it have relatively little scatter, Ag(1) because it was considered by Aldrich & Armstrong [61] as potentially fitting exponents of *x*=1/3, *x*=1/2 and *x*=1, and B(1) because, like many of the iron and steel datasets of figure 1*a*, it has unusually high values of *k*_{HP} and *k*, and a wide range of grain size. Additional fits were carried out, to models with different functional forms, namely
*R*^{2} coefficient of determination. The other seven, from SI through to EQ4, return 1−*R*^{2} values (the proportion of the variance in the data which is *not* explained by the model) that are very small, and very similar for the different models across each dataset. The 1−*R*^{2} values clearly reveal no evidence that the true dependence is a power-law dependence as in equation (1.4), rather than any other function that is monotonically decreasing with grain size, asymptotically to *σ*_{0} and so with some positive curvature, such as LOG and EXP. The silver data have more scatter than the other datasets, and therefore, consistently higher values of 1−*R*^{2}. The only significant feature of these fits is that the model EQ4 returns consistently low values for the exponent *x*—though with large uncertainties—and consequently, generally better *R*^{2} values even than the HP model.

There are, however, assumptions in the least-squares fitting procedures. The assumptions are that the grain sizes are as specified, that the scatter comes from Gaussian-distributed errors in the measurements of the yield or flow stresses, and that the least sum of squared residuals is an unbiased estimator. The effect of these assumptions is best demonstrated by setting up dummy datasets and subjecting them to the same fitting routines. First, the dummy datasets HP*y*Cu and HP*y*Ag, with random errors added to the *y* (stress) values:

$$\sigma_i = \sigma_0 + k_{\mathrm{HP}}\,d_i^{-1/2} + \varepsilon_i, \tag{2.1}$$

where the grain sizes *d*_{i} and the other parameters are taken from the Cu(1) 5% strain and the silver Ag(1) datasets and their HP fits. The random numbers *ε*_{i} are drawn from the normal distribution with mean *μ*=0 and standard deviation *σ*=2.8 MPa for copper (equal to the standard deviation of the residuals in the HP fit to the Cu(1) 5% strain data) and *σ*=16 MPa for silver. Generating 500 such datasets and fitting each with EQ4, mean values of the exponent *x* are obtained (table 1). Similarly, the dummy datasets SI*y*Cu and SI*y*Ag are set up according to equation (2.1) but with the SI fits (*k*/*d*). From table 1, for these four dummy datasets, set up according to the assumptions above, and specifically with the random error in the data attributed entirely to *σ* (*y*-axis values), the fitting returns exponents the same within error as those used to create the dataset. (Note that the error in the mean is the standard deviation divided by √500.) The *R*^{2} values are comparable with those of the real datasets.

The situation is quite different when we put the scatter on the grain sizes instead of the stresses. Now, the yield or flow stresses are taken to be definite. The errors in grain size measurement are expected to be proportional to grain size—i.e. a lognormal distribution—and so we set up dummy datasets as
*σ* and *d* axes exchanged) and fitted with the inversions of the functions HP and SI to obtain the parameters *k* and *σ*_{0}. The random parameters *ε*_{i} are drawn from the normal distribution with mean *μ*=0, standard deviation 0.11 for copper and 0.18 for silver, chosen to give the same variance on the residuals as the HP fits. Fitting 500 such datasets with the eight functions LIN to EQ4, *R*^{2} values are much as before. But the exponents found by fitting with EQ4 are now dramatically smaller than the values used to set up the datasets, 0.2–0.4 for the HP dummy datasets where the true *x*=1/2, and 0.3–0.6 for the SI datasets where the true *x*=1.
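A numerical experiment of this kind can be sketched as below. It is a simplified stand-in for the procedure in the text, not a reproduction of it: the stresses are exact, the reported grain sizes carry a lognormal error, and the exponent is found by a grid scan with vertical least squares. The grain sizes, the parameters *y*_{0}=1 and *k*=10, and all names are hypothetical; the standard deviation 0.18 is the value quoted for silver.

```python
import math
import random

def best_exponent(d, y, x_grid):
    """Grid exponent x minimizing the squared residuals of y = y0 + k*d**(-x)."""
    n = len(d)
    y_mean = sum(y) / n
    best_x, best_sse = None, float("inf")
    for x in x_grid:
        u = [di ** (-x) for di in d]
        u_mean = sum(u) / n
        s_uu = sum((ui - u_mean) ** 2 for ui in u)
        s_uy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
        k = s_uy / s_uu
        y0 = y_mean - k * u_mean
        sse = sum((yi - y0 - k * ui) ** 2 for ui, yi in zip(u, y))
        if sse < best_sse:
            best_x, best_sse = x, sse
    return best_x

def mean_fitted_exponent(true_d, true_x, sigma_ln_d, trials, rng, x_grid):
    """Average fitted exponent when reported grain sizes carry lognormal error."""
    y = [1.0 + 10.0 * di ** (-true_x) for di in true_d]  # exact stresses
    total = 0.0
    for _ in range(trials):
        # Multiply each true grain size by exp(normal error): a lognormal scatter.
        reported_d = [di * math.exp(rng.gauss(0.0, sigma_ln_d)) for di in true_d]
        total += best_exponent(reported_d, y, x_grid)
    return total / trials

rng = random.Random(0)
grid = [i / 20 for i in range(1, 41)]  # x from 0.05 to 2.0
sizes = [10.0, 20.0, 50.0, 100.0, 200.0, 500.0, 1000.0]
for true_x in (0.5, 1.0):
    fitted = mean_fitted_exponent(sizes, true_x, 0.18, 500, rng, grid)
    print(f"true x = {true_x}, mean fitted x = {fitted:.2f}")
```

Setting `sigma_ln_d` to zero recovers the true exponent, which provides a check on the fitting routine before grain-size scatter is introduced.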

### (c) Discussion of fits to the data

As noted in [46] for 17 datasets and as confirmed quantitatively in §2b for 61, fits of the data to equation (1.1) and fits to equation (1.3) (with or without the ln*d* term) are equally good. The rigorous statistical analysis given here confirms that the data cannot distinguish between these, nor between these and non-power-law models. From this analysis, there is no experimental support even for a power law with uncertain or variable exponent *x* as in equation (1.4). The most that can be said of the Hall–Petch effect from analysis of these datasets is that the strength decreases monotonically—but with positive curvature—as the grain size is increased.

The low values of the exponent obtained by fitting with equation (1.4), and the low values reported in the literature for the last 60 years, are fully explained by assuming a moderate random error in the grain size determination. That is demonstrated here by using dummy datasets in standard nonlinear least-squares fitting. An error analysis of the grain size determinations in the literature is not possible—Rhines [69] listed about 10 ways of determining the grain size, and most authors do not give this detail. A deeper mathematical understanding of the effect of grain-size variance on the least-squares fitted exponent can be obtained by further analysis, but that is outside the scope of this paper. The vertical least-square residuals are a biased estimator for nonlinear models, and equation (1.4) is nonlinear in *x*. The orthogonal least-square residuals estimator that is sometimes used will be similarly biased, and this is especially relevant when fitting is done by eye, as much of the earlier data would have been. There exist fitting procedures that can handle errors in the independent variable (here, grain size), e.g. Deming regression, but they require estimates of the errors which are not available here.
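For reference, Deming regression for a straight line can be sketched as follows. It illustrates why an error estimate is needed: the ratio `delta` of the variance of the *y* errors to the variance of the *x* errors must be supplied by the user. The function name is hypothetical, and applying it to Hall–Petch data would additionally require linearizing the model and propagating the grain-size errors, which is not attempted here.

```python
import math

def deming_fit(x, y, delta=1.0):
    """Deming regression of y = a + b*x; returns (intercept, slope).

    delta = (variance of errors in y) / (variance of errors in x).
    Requires a non-zero sample covariance between x and y.
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_mean) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)
    diff = s_yy - delta * s_xx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Hypothetical check on exactly collinear points y = 1 + 2x.
print(deming_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]))
```

With `delta=1.0` this reduces to orthogonal regression; without a defensible value of `delta` the result is as arbitrary as the choice of estimator.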

Simulation and modelling are beginning to be able to display the Hall–Petch effect and predict Hall–Petch slopes (e.g. [42,70,71]). In this case, there is no error bar on the grain size, and such work is indeed beginning to show that equation (1.3) is preferable to equation (1.1) [43].

The outcome of this section is to show that, without benefit of theory, the two- and three-parameter fits, SI, HP, EDC and EQ4, cannot determine the true functional form obeyed by the data, not even to confirm it to be a power law. In addition, an explanation is found for why equation (1.1) might be considered to be the best fit to the data even if the data actually obey equation (1.3). In the next section, we show that meta-analysis supported by theory can reach an unambiguous conclusion.

## 3. Bayesian meta-analysis of support for hypotheses

Previously [46], we reported an analysis of the statistical support that the data provide for the different theories, equations (1.1) and (1.3). We used the semi-Bayesian approach of calculating the likelihood *L* of the data under the two theories (which was inconclusive) and then using the Akaike information criterion (AIC), which provides a heavy weighting against theories with more free-fitting parameters. Since the Hall–Petch theory of equation (1.1) has two free-fitting parameters per dataset (*σ*_{0} and *k*_{HP}; 34 parameters for 17 datasets) while the theory underlying equation (1.3) has only one free-fitting parameter per dataset (*σ*_{0}) plus one (*k*∼1) for all datasets (18 parameters for 17 datasets), the AIC gave odds of many millions to one that the dislocation curvature theory is true and the Hall–Petch theory false.
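The order of magnitude of these odds can be checked with the standard definition AIC = 2*p* − 2 ln *L*, where *p* is the number of free parameters. The sketch below assumes, purely for illustration, that the two theories give exactly equal likelihoods (the text says only that the likelihood comparison was inconclusive), so that the parameter counts alone decide the outcome. The calculation in [46] may differ in detail.

```python
import math

def aic(num_params, log_likelihood):
    """Akaike information criterion: 2p - 2 ln L."""
    return 2.0 * num_params - 2.0 * log_likelihood

def aic_odds(aic_preferred, aic_other):
    """Relative likelihood of the preferred model: exp((AIC_other - AIC_preferred)/2)."""
    return math.exp((aic_other - aic_preferred) / 2.0)

log_l = 0.0  # assumed equal (and arbitrary) log-likelihood for both theories
odds = aic_odds(aic(18, log_l), aic(34, log_l))
print(f"odds favouring the 18-parameter theory: {odds:.3g} to 1")
```

Under this assumption the odds are e^{16}, about 8.9 million to one, which is consistent with the "many millions to one" quoted above.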

Here, we give a more fundamental, fully Bayesian analysis. Bayes' theorem may be expressed in the form

$$O(H\mid D) = O(H)\,\frac{P(D\mid H)}{P(D\mid \text{not-}H)}, \tag{3.1}$$

that is, the new odds on the hypothesis under test (*H*) being true when new data *D* are acquired are the prior odds, times the ratio of the probability of the new data under the hypothesis *H* and their probability if *H* is false. Here, we may take *H* to be the hypothesis that equation (1.3) is valid, and its negation not-*H* to be the hypothesis that equation (1.1) is valid. This applies very directly to our problem. In the absence of a theory constraining the values of *σ*_{0} and *k*_{HP} in equation (1.1), the experimentally determined values of yield or flow stress against grain size are expected to have a uniform probability distribution in the space of stress against grain size. The line of equation (1.3) with *σ*_{0}=0 divides this space into two halves, one below the 1/*d* line (with or without the ln*d* term makes no significant difference) and one above. Equation (1.3) asserts that the probability that data will be significantly below the line is zero, so the data should be concentrated into the half of the space above the line. So, defining *H* as the hypothesis that equation (1.3) is correct, we have a relative probability density of 2 for data above the 1/*d* line and 0 for data significantly below the 1/*d* line.

We apply Bayes' theorem iteratively for each dataset. We start by postulating a value for the prior probability *P*_{0} or odds *O*_{0} that *H* is true before any data are considered. Using only the Principle of Insufficient Reason, we would take *P*_{0}=1/2, i.e. even odds, *O*_{0}=1 to 1. On the other hand, we might consider that the probability that an equation that has stood for 60 years is false is very low, so perhaps we should take *P*_{0}=10^{−3} (*O*_{0}=999 to 1 against). The first dataset that falls above the 1/*d* line gives a value of 2 to the second term on the r.h.s., so that on the l.h.s. the odds against *H* halve, or equivalently the odds on *H* double. This becomes the first term of the r.h.s. (for *P*_{0}=1/2, *O*_{1}=2 to 1 on, *P*_{1}=2/3) when we consider the second dataset. As each successive dataset *i* falls above the 1/*d* line, *O*_{i}=2*O*_{i−1} and for *n* datasets *O*_{n}=2^{n}*O*_{0}. So, just 10 or 20 such datasets give overwhelming odds on *H*, whatever reasonable prior *P*_{0} we may have chosen. Here, we have 61 datasets, giving odds of 2^{61}*O*_{0} to one, which is overwhelming for any reasonable choice of prior *P*_{0}.

These odds on the equation (1.3) hypothesis can be reduced slightly by considering that not all the datasets are independent. If the data for the yield point or lowest strain fall on and above the line, it is predictable that the data for the same material at higher strains at the same grain sizes will also fall above, so the observation that this is so does not strengthen the hypothesis. This reduces the number of fully-independent datasets to 32, which still leaves overwhelming odds on *H*.
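The iterative update is simple enough to check directly. The sketch below works in odds, *O*_{0}=*P*_{0}/(1−*P*_{0}), and doubles them once per dataset falling above the line; the function names are hypothetical, and exact rational arithmetic is used only to avoid rounding.

```python
from fractions import Fraction

def posterior_odds(prior_probability, n_datasets, likelihood_ratio=2):
    """Odds on H after n datasets that each multiply the odds by likelihood_ratio."""
    prior_odds = prior_probability / (1 - prior_probability)
    return prior_odds * Fraction(likelihood_ratio) ** n_datasets

def odds_to_probability(odds):
    """Convert odds on H to the probability that H is true."""
    return odds / (1 + odds)

for prior in (Fraction(1, 2), Fraction(1, 1000)):
    for n in (1, 32, 61):
        odds = posterior_odds(prior, n)
        print(f"P0 = {prior}, n = {n}: odds on H = {float(odds):.3g} to 1")
```

Even with the sceptical prior *P*_{0}=10^{−3} and only the 32 independent datasets, the odds on *H* are 2^{32}/999, i.e. more than four million to one.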

The probability is not a step-function between 2 in the upper right and zero in the bottom left, below the equation (1.3) line. If that were so, any data in the bottom left would immediately give a probability of zero for the hypothesis *H*. In fact, experimental error, grain-size determination, grain-size distributions and non-dislocation-based plasticity at small grain sizes all have a non-vanishing probability of putting data a little below the line. Thus, the data seen there at ultra-fine grain sizes, Ni(1), Al(4), Al(5) and Al(6), may be accounted for by grain-boundary sliding, migration and diffusion which have been considered in connection with the inverse Hall–Petch effect (e.g. [74,75]). The surprise, then, is how few data are there, not how many.

Finally, since the conclusion of this section is that equation (1.3) (with *k*∼1 and variable *ε*_{0} or *σ*_{0}) provides the best description of the data, we should consider the significant number of datasets in figure 1 where the EDC-fitted value of *k* is much higher, Fe(1), Fe(3), B(1), B(2) and B(3). These datasets are as difficult to account for by the theories of equation (1.1) (see §4) as by the theory of equation (1.3). One speculation is suggested by the observation that these datasets are all for alloy metals, steels and brass. Microstructure in such metals has scope for size-effect lengths that may be much less than the grain size yet correlated with it. Alternatively, this may be a consequence of grain-size-dependent strain-hardening as in [29].

## 4. One-parameter Hall–Petch theories

While experimentally equation (1.1) is treated as if both *σ*_{0} and *k*_{HP} are free-fitting parameters, the theories which have been put forward to account for the inverse square-root law of equation (1.1) do of course make predictions for *k*_{HP}. And the phenomena in question, when they occur in practice, must contribute to the strength. It is appropriate, therefore, to compare their predictions of *k*_{HP} with the data, to test whether they are in fact supported by the data and whether they explain the data. Classic theories of the Hall–Petch inverse-square-root dependence on *d* (equation (1.1)) are shown schematically in figure 2*a*–*d* together with a schematic of the dislocation curvature theory leading to equation (1.3) in epitaxial layers (figure 2*e*) and in polycrystalline metals (figure 2*f*).

### (a) Dislocation pile-up model

This is the phenomenon most often used to account for the Hall–Petch equation (1.1). In this model, a dislocation source in a grain operates many times under an applied stress to produce a number of dislocations on the same glide plane (figure 2*a*). The leading dislocation experiences a force from the stress field, and also the forces from the following dislocations behind it, but it is blocked from further movement by the grain boundary. When the force on the leading dislocation is sufficient to stress the material at or beyond the grain boundary to theoretical strength (or some lower value), dislocations are produced in the neighbouring grain and large-scale plasticity becomes possible. Following Cottrell [76], Eshelby *et al.* [3] and Antolovich & Armstrong [77], the theory gives
*τ*_{C} is the critical shear stress at the grain boundary at which a dislocation is generated in the neighbouring grain. The maximum reasonable value of *τ*_{C} is the theoretical strength, less than *G*/10. For the use made of equation (4.1) (figure 3*a*), differences between *τ*/*G* and *σ*/*Y* are unimportant, likewise the approximation *b*∼*a*_{0}. Then equation (4.1) becomes
*ε*_{f}=*σ*_{f}*Y* ^{−1}, in normalized units as used in figures 1 and 3.

The data from §2 are compared with the prediction of the pile-up model in figure 3*a*. The shaded triangle below the solid line is the allowed region according to equation (1.1). Many of the datasets have slopes (values of *k*_{HP}) greater, even very much greater, than the predictions. Applying the same statistics as in §3, the odds *against* the pile-up model are greater than the odds *on* the hypothesis *H* that equation (1.3) is correct, since many data fall where their relative probability is much less than one-half. In fact, this is an exaggeration. Different datasets falling where their probabilities are low may not be independent events. Consider the *a priori* estimate of the (small) probability *P*_{0} that the model is correct but the parameter values in equation (4.1) have been wrongly estimated. Then all data, wherever they fall, are fully consistent with this hypothesis, which retains the probability *P*_{0} independent of the data.

Pile-up can of course occur, and will give rise to some (grain-size-dependent) strengthening. However, figure 3*a* shows that it cannot account for the greater part of the strength in most datasets; that is, it is a weak effect compared with the direct effect of grain size on the dislocation mechanisms that are required for plasticity (source operation, equation (1.3)). This conclusion is confirmed by discrete dislocation dynamics simulations of wires in torsion, in which pile-up can be encouraged by prohibiting cross-slip or reduced by allowing cross-slip. Torque-torsion curves did not change significantly with the amount of pile-up (J. Senger, D. Weygand, D. J. Dunstan 2012, personal communication).

### (b) Grain boundary ledge model

Li [78] sought a model that could explain the Hall–Petch behaviour in the majority of cases where there is no evidence of dislocation pile-up. He proposed that grain boundaries and sub-grain boundaries should emit dislocations (figure 2*b*). He showed that the stresses required are nearly the same (1) for a pile-up to drive a dislocation through a grain boundary, (2) for a pile-up to activate a source on the other side of the boundary and (3) to move dislocations in a forest formed by all the dislocations emitted by a tilt boundary. In model (3), the grain size dependence arises from the density of the forest. Murr [83] reported observations by electron microscopy supporting this model. The prediction of the model is
Here *α* is a constant of the order of 0.4, and the constant *m* is given by the expectation that the product *mb* will be in the range 0.02–0.2 [84–86]. Converting to *ε*=*σ*/*Y*∼*τ*/*G* and using *b*∼*a*_{0}, this becomes
Figure 3*a* shows the range where data are to be expected according to equation (4.4). This model is again inconsistent with much of the data, and it does not account for the wide scatter of the data. The same considerations apply to the probability that it is correct as for the pile-up model in §4*a*.

### (c) Plastic strain models

There is a class of theories which give the Hall–Petch coefficient as dependent on plastic strain (with no Hall–Petch effect at the yield point). This is too large a topic to deal with adequately here, so we consider one typical theory. Conrad and co-workers [82,84,87] developed a theory which gives very naturally the inverse square-root dependence on grain size of equation (1.1) when square-root strain-hardening occurs (see also [29,36]). Mobile dislocations account for the plastic strain *ε*_{pl}, and
Here *ρ*_{m} is the density of mobile dislocations, taken to be a fraction *ξ* of the total dislocation density *ρ*, so that *ρ*_{m}=*ξρ*. Using the Taylor (forest) hardening expression, substituting and rearranging, we have
Here *α* is the Taylor coefficient. Converting as for equation (4.4)
*c*) due to the reduced slip distance. This gives a Hall–Petch coefficient which vanishes at the yield point (*ε*_{pl}=0) and is otherwise proportional to the square-root of the plastic strain. The constant *α* is normally taken as about 0.3, while the constants *λ* and *ξ* are both of the order of, but less than, unity, so the factor *α*^{2}/*λξ* may be taken to be about unity. Then for the datasets reported at high plastic strains of approximately 0.2, the value of *k*_{HP} in equation (4.7) may be close to 0.5, while for the datasets reported near the yield point (*ε*_{pl}∼0.002) it will be below 0.05. These two possibilities are plotted on figure 3*a* (red chain-dotted lines). Clearly, this theory can be ruled out for the Hall–Petch effect near the yield point, but it survives as a candidate for explaining equation (1.1) behaviour at high plastic strains when square-root strain-hardening is observed.
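
The two numerical values quoted above can be checked with a short worked sketch. It rests on assumptions that are not confirmed by the text here: that the plastic strain takes the form *ε*_{pl}=*ρ*_{m}*bλd*, with *λd* the mean slip distance; that Taylor hardening is written with an offset *τ*_{0}; and that the normalized relation is expressed as *k*_{HP}(*a*_{0}/*d*)^{1/2}. It is an illustration of the scaling, not a reconstruction of equations (4.5)–(4.7).

```latex
% Assumed strain relation: \varepsilon_{pl} = \rho_m b \lambda d, with \rho_m = \xi\rho.
% Assumed Taylor hardening: \tau = \tau_0 + \alpha G b \sqrt{\rho}.
\begin{align}
\rho &= \frac{\rho_m}{\xi} = \frac{\varepsilon_{pl}}{\xi \lambda b d}, \\
\tau - \tau_0 &= \alpha G b \sqrt{\rho}
  = \alpha G \sqrt{\frac{b\,\varepsilon_{pl}}{\lambda \xi}}\; d^{-1/2}, \\
k_{HP} &\approx \sqrt{\frac{\alpha^{2}}{\lambda \xi}\,\varepsilon_{pl}}
  \quad \text{(normalized, with } \tau/G \sim \sigma/Y,\ b \sim a_0\text{)}, \\
\frac{\alpha^{2}}{\lambda\xi} \approx 1:\quad
k_{HP}(\varepsilon_{pl}=0.2) &\approx \sqrt{0.2} \approx 0.45, \qquad
k_{HP}(\varepsilon_{pl}=0.002) \approx \sqrt{0.002} \approx 0.045 .
\end{align}
```

Under these assumptions the coefficient scales as the square-root of the plastic strain, which gives a value close to 0.5 at high strain and below 0.05 near yield.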

Other theories, such as the plastic anisotropy theory of Ashby [81], give very similar expressions for *k*_{HP}, so the same comments apply. However, before leaving the topic for a fuller treatment elsewhere, it is worth noting that the datasets Fe(4), Fe(5), Fe(6), B(4), B(5), Cu(1), Cu(2), Ag(1,2), Al(1) and Al(2) all have data for different strains. In these datasets, any dependence of *k*_{HP} on the plastic strain is weak or absent. Only Ni(3), from Narutani & Takamura [29], shows the strong dependence expected from these theories, and these authors note that their data deviate from equation (1.1).

### (d) Elastic anisotropy model

This model was proposed by Kelly [79], Hirth [28] and Meyers & Ashworth [80]. Given the random orientation of grains, a homogeneous elastic stress field necessitates an inhomogeneous elastic strain field, resulting in gaps and overlaps between grains as shown in figure 2*d*. Here, a two-dimensional polycrystalline cubic material with non-zero anisotropy *C*=|*c*_{11}−*c*_{12}−2*c*_{44}| is shown elastically deformed under a uniform shear stress field. Deforming the grains to eliminate the gaps and overlaps results in inhomogeneous stress and strain fields. The resulting strain gradients require the creation of geometrically necessary dislocations (GNDs) if plastic deformation is to occur, with a consequent increase in strength. The grain-size dependence arises naturally, in that if the grains are smaller the strain gradients and the densities of GNDs will be proportionately larger. This model predicts that under suitable normalization *k*_{HP} will be proportional to the elastic anisotropy. The factor of proportionality is unspecified by the theory; it is phenomenological, depending on the characteristic length in the strain-gradient theory, which can only be found by experiment. In figure 3*b*, we plot the values of *k*_{HP} for the cubic metals against their normalized anisotropy parameters *C*/*Y*. While there is a considerable scatter of the data for metals where we have more than one value of *k*_{HP}, it is clear that there is no strict dependence, nor even a trend suggesting that *k*_{HP} depends upon *C*. This model is therefore neither consistent with nor explanatory of the data.
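
For readers who wish to reproduce the abscissa of such a plot, the following minimal sketch evaluates *C*=|*c*_{11}−*c*_{12}−2*c*_{44}| and *C*/*Y*. The function names are hypothetical, and the copper constants and the Young's modulus of about 130 GPa are approximate textbook values assumed for illustration only; they are not taken from the datasets analysed here.

```python
def cubic_anisotropy(c11, c12, c44):
    """Return C = |c11 - c12 - 2*c44| for a cubic crystal (same units as inputs)."""
    return abs(c11 - c12 - 2.0 * c44)


def normalized_anisotropy(c11, c12, c44, youngs_modulus):
    """Return C / Y, the normalized anisotropy parameter."""
    return cubic_anisotropy(c11, c12, c44) / youngs_modulus


# Isotropic limit: c44 = (c11 - c12) / 2, so C vanishes.
isotropic_c = cubic_anisotropy(200.0, 100.0, 50.0)

# Approximate room-temperature single-crystal constants for copper in GPa
# (illustrative assumed values, not the values used in the paper).
copper_c = cubic_anisotropy(168.4, 121.4, 75.4)

# Y of about 130 GPa is likewise an assumed, illustrative value.
copper_ratio = normalized_anisotropy(168.4, 121.4, 75.4, youngs_modulus=130.0)

print(f"isotropic C = {isotropic_c:.1f} GPa")
print(f"copper C = {copper_c:.1f} GPa, C/Y = {copper_ratio:.2f}")
```

An elastically isotropic material gives *C*=0, so on this model it would show no Hall–Petch strengthening at all, which is one way of seeing what the absence of a trend in figure 3*b* rules out.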

### (e) Discussion

The outcome of this section is that none of these theories explains the observed strength of metals as a function of grain size. They fail in a variety of ways, including unfulfilled predictions of parameter values and of functional dependence on known parameters, but most fundamentally they fail against the Bayesian criticism: none of them is consistent with the data all appearing above the equation (1.3) (*σ*_{0}=0) line of figure 1, with almost none below. The odds against them are thus consistently millions to one.

## 5. Conclusion

It is clear that there is neither experimental nor theoretical evidence for the 60-year-old Hall–Petch equation, equation (1.1). The role of errors in grain-size determination in approximately halving the apparent exponent in least-squares fitting has previously been overlooked, but is a very plausible explanation for the apparent agreement of the data with the equation (1.1) value of *x*=1/2 if the data actually obey *d*^{−1}. The wide range of experimentally reported values for the Hall–Petch constant, *k*_{HP}, for similar materials does not support equation (1.1), nor is it predicted by any of the theories in §4. On the other hand, the large body of experimental data is fully consistent with the size effect expected from dislocation curvature, equation (1.3), for the minimum strength expected for a given grain size. That consistency depends on the necessary caveats: non-dislocation-based plasticity such as grain-boundary sliding may take over at small grain sizes; other strengthening mechanisms may be correlated with grain size with or without being caused by grain size.

An argument in favour of this conclusion is that it brings the Hall–Petch effect under the umbrella of the size effect(s) generally, rather than being *sui generis* with its own unique inverse-square-root exponent and, therefore, a need for its own explanations. We argue that the underlying size-dependence that dictates the minimum strength for dislocation plasticity should be in the singular—the only size effect that will be necessarily present in all experimental situations is the Orowan size–stress relationship, aka Matthews critical thickness theory [5], aka the argument from similitude [37]: the size must be inversely proportional to dislocation curvature and hence to stress.

It might be considered that this does not matter. It might be pointed out that the Hall–Petch relation, equation (1.1), is a valid empirical relation and as such it is useful for prediction (for interpolation and extrapolation of material properties) whether or not it is theoretically correct [28]. That is certainly so for interpolation, for which it will be as useful as, but no more useful than, a smooth curve drawn through the data by hand. For extrapolation, however, it is a very dangerous approach.

One may also regret the loss of time (the wasted effort) in attempts to explain the inverse-square-root form of equation (1.1), and of course the parameter values therein too. On the other hand, one may anticipate theoretical and practical advances that may be made when it is considered that the grain-size effect operates through the same mechanism as other size effects and, therefore, may be combined with them, as in Ehrler coupling of structural size and grain size [88].

The other main conclusion from this work is that it can never be sufficiently strongly emphasized that a good fit of data to an equation or to a theory is of no significance unless it has been adequately considered what else might fit the data. Moreover, statistical methods such as least-squares fitting should always be tested with dummy data for which one knows what outputs should be obtained. This is a much more general conclusion, of interest to non-metallurgists as much as to metallurgists.
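
A dummy-data test of this kind can be sketched in a few lines. The example below is an illustration only and does not reproduce the analysis of this paper: it assumes dummy strengths that obey *d*^{−1} exactly, log-normal scatter in both the true and the measured grain sizes, and an ordinary least-squares fit on log–log axes. The function names and all parameter values are hypothetical.

```python
import math
import random


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def apparent_exponent(true_exponent, spread_true, spread_error, n=20000, seed=0):
    """Fit log(strength) against log(measured d) for dummy data.

    The dummy strength obeys strength = d_true**(-true_exponent) exactly;
    only the grain-size measurement carries (log-normal) error.
    """
    rng = random.Random(seed)
    log_d_true = [rng.gauss(0.0, spread_true) for _ in range(n)]
    log_strength = [-true_exponent * ld for ld in log_d_true]
    log_d_measured = [ld + rng.gauss(0.0, spread_error) for ld in log_d_true]
    return -ols_slope(log_d_measured, log_strength)


# Known input: the dummy data obey d**(-1), so a sound method should return x = 1.
exact = apparent_exponent(1.0, spread_true=1.0, spread_error=0.0)

# Assumed error spread equal to the true spread of log(d): the fit is attenuated.
noisy = apparent_exponent(1.0, spread_true=1.0, spread_error=1.0)

print(f"no error in d: x = {exact:.2f}")
print(f"error spread equal to true spread: x = {noisy:.2f}")
```

In this sketch the fitted exponent falls to about one-half only because the error spread was set equal to the true spread of ln *d*; smaller measurement errors attenuate the exponent less. The point is the procedure: feed the fitting method data whose true exponent is known, and check what it returns.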

## Data accessibility

All 61 datasets used are given in full in the electronic supplementary material.

## Authors' contributions

Y.L. carried out the data digitization and data analysis, and drafted the manuscript; A.J.B. and D.J.D. jointly conceived of the study, designed the study, coordinated the study and finalized the manuscript. All authors gave final approval for publication.

## Competing interests

We have no competing interests.

## Funding

Y.L. is grateful to the Chinese Scholarship Council for his PhD studentship. A.J.B. and D.J.D. acknowledge the EPSRC grant no. EP/C518004 under which this work was initiated.

## Acknowledgements

Too many colleagues have contributed valuable insights and comments to name them all. But one must be singled out. We are very grateful to Prof. Ron Armstrong, who appears frequently in the list of references, for his assistance in accessing the older literature, and especially for his encouragement and his enthusiasm for a re-appraisal of a topic with which he has been intimately concerned for well over 50 years.

- Received December 30, 2015.
- Accepted May 5, 2016.

- © 2016 The Author(s)

Published by the Royal Society. All rights reserved.