## Abstract

Almost all practical engineering applications are multi-physics in nature, and various physical phenomena usually interact and couple with each other. For instance, the resistivity of most conducting metals increases linearly with increases in the surrounding temperature resulting from Joule heating by electrical currents flowing through conductors. Therefore, in order to accurately characterize the performance of high-power integrated circuits (ICs), packages and printed circuit boards (PCBs), it is essential to account for both electrical and thermal effects and the intimate couplings between them. In this paper, we present non-conformal, non-overlapping domain decomposition methods (DDMs) for thermal-aware direct current (DC) IR drop co-analysis of high-power chip-package-PCBs. Here, IR stands for the finite resistivity (R) of metals and current (I) drawn off from the power/ground planes. The proposed DDM starts by partitioning the composite device into inhomogeneous sub-regions with temperature-dependent material properties. Subsequently, each sub-domain is meshed independently according to its own characteristic features. As a consequence, the troublesome mesh-generation task for complex ICs can be greatly subdued. The proposed thermal-aware DC IR drop co-analysis applies the non-conformal DDM for both conduction and steady-state heat-transfer analyses with a two-way coupling between them. Numerical examples, including an IC package and a chip-package-PCB, demonstrate the flexibility and potential of the proposed thermal-aware DC IR-drop co-analysis using non-conformal DDMs.

## 1. Introduction

Advanced packaging research includes various technologies, such as design, thermal management, design for test, fabrication and system integration technologies. Three-dimensional stacking technology provides a flexible strategy to achieve extremely high interconnect densities by chip stacking and offers an opportunity to go ‘beyond Moore's law’. The increase in clock frequency and edge rates, as well as the continuous downscaling of feature size and three-dimensional interconnect technologies in high-speed systems, result in power integrity (PI) and signal integrity (SI) as the major culprits causing chip failures.

One of the major PI issues is the IR drop (Heinrich 1992; Pannikkat *et al.* 2004; Shrivastavaa *et al.* 2004; Nithin *et al.* 2010), which is caused by the finite resistivity (R) of metals and current (I) drawn off from the power/ground planes. Chip designs are susceptible to IR drop, especially when the integrated circuit (IC) supply voltage *V*_{dd} decreases with the scaling of silicon processes. IR drop is proportional to the resistance of the power/ground plane. When the resistance of the power/ground plane increases, with the shrinking of complex geometries, the IR drops will in turn increase as well.

To further confound the difficult issue, the rise of the thermal temperature owing to the current carrying interconnects also has tremendous impact on the IC performance and reliability. Current flow in a very large-scale integration interconnect can cause power dissipation, which is referred to as Joule heating or self-heating. The Joule heating effects become increasingly significant with the shrinking scale of the silicon process because of the increase of on-chip power density, inclusion of more metal layers with higher densities and the use of dielectric materials with lower thermal conductivities. Subsequently, the temperature effects on the electrical performance of the three-dimensional system should be carefully considered, as well in the electrical designs (Jiang *et al.* 2010).

Moreover, owing to the linear temperature dependency of metal resistivity, the non-uniform temperature distribution affects the electrical performance of the power delivery network (PDN) and substantially increases the IR drops in the power/ground planes (Bracken *et al.* 2011; Farina *et al.* 2011). In particular, modules that draw more current from the PDN, owing to higher power demands, will suffer worse IR-drop effects. Furthermore, the decreases in IC supply voltage, together with larger IR drops, will significantly reduce the noise margins, consequently making electronic devices more vulnerable to direct current (DC) noises. To simulate this closed coupling loop between electrical and thermal analyses, we propose a non-overlapping, non-conformal domain decomposition method (DDM) for thermal-aware DC IR-drop co-analysis.

In their work, Shao *et al.* (2010) proposed the use of a non-overlapping, non-conformal DDM for modelling SI effects within complex ICs. Interested readers are therefore referred to this reference for details. The main emphasis of this contribution is to extend the application of the non-overlapping, non-conformal DDMs to analyse thermal-aware DC IR drop of printed circuit boards (PCBs) with IC and memory modules.

The rest of the paper is organized as follows: the design flow of the thermal-aware IR-drop analysis is first presented in §2. In §3, we present the DDM for thermal-aware IR-drop analysis with its notations and formulations. Also, the DC IR-drop analysis of a product-level IBM package example is included. We then discuss the on-board and on-package DC IR-drop analysis, with steady-state thermal effects taken into account, in §4. In §5, we illustrate the non-conformal DDM for a thermal-aware IR drop on a chip-package-PCB example. A summary and conclusion are given in §6.

## 2. Design flow of the thermal-aware IR-drop analysis

Today's low-voltage, high-current designs require accurate DC IR-drop analysis for off-chip power distribution systems in order to optimize end-to-end voltage margins for every device on the distribution. To characterize the DC IR drop accurately, a numerical method based on volume discretization is usually needed for solving the current distribution within a complex domain with various geometrical features such as via holes, copper islands, wires and via pads. In this section, we introduce a non-conformal DDM for conduction analysis, and demonstrate typical IR-drop simulation of a three-dimensional PDN with both PCB and package examples. Temperature-dependent resistivity is hereby taken into account in the Laplace equation, which enables simulation of the PDN with inhomogeneous resistivity.

To elucidate the issues involved, for example, we consider an elaborate chip-package-PCB model introduced in Xie *et al.* (2009), which includes heat sinks, thermal interface materials, through silicon vias (TSV), a controlled collapse chip connection to spread the heat from high heat flux areas, multiple chip modules and a PCB (figure 1). One of the major challenges relates to the complexity of the geometry. The package model includes two large-scale copper planes, and two chip modules mounted on top of the PCB. The chip modules contain many three-dimensional interconnects with small dimensions, such as TSVs and buried vias connected to the CPU and RAM, as shown in figure 1*b*. It will be daunting to generate a good-quality finite-element mesh for the entire package with all the small features, let alone solve it efficiently. Moreover, we should also mention that owing to the multi-scale nature of the model, the resulting mesh will have a wide range of element sizes. As a consequence, the application of the finite-element methods to such a discretization usually leads to an extremely ill-conditioned system matrix equation.

## 3. Thermal-aware IR-drop formulations

Owing to the intimate couplings between the Joule heating and the linear temperature dependence of resistivity, the non-uniform temperature profiles in high-performance ICs will significantly affect the IR drop, signal delay and system performance. Therefore, the analysis of DC IR drop must take the effect of thermal temperature distribution into account. During the thermal-aware IR-drop analysis, the temperature profile from steady-state thermal analysis is used to update the power grid resistance for IR drops at each simulation loop. Presented in figure 2 is a flow diagram for the thermal-aware IR-drop co-analysis. First, the IR-drop/conduction analysis is performed at room temperature for the initial voltage distribution. Subsequently, the computed dissipation owing to DC voltage leakage power is imposed as the heat source for steady-state thermal analysis. Moreover, the power profiles *P*=**J**⋅**E** of the chip and memory modules are included as heat sources in the thermal temperature computation. Once the temperature distribution within the device is calculated, it can be used to update material properties, specifically the conductivity (resistivity) of the conducting materials. With an elevated temperature on the chips, electrical resistance has a linear temperature dependence, as depicted in figure 2*b*. For instance, when the temperature of the device increases from room temperature to 80^{°}C, the electrical resistivity will increase by more than 40 per cent. Consequently, the updated values of the resistivity within the device lead to a new/updated voltage distribution. The fully coupled multi-physics, electric and thermal analysis is repeated until the temperature-dependent IR drop and thermal distributions converge with negligible errors.

To conduct the thermal-aware DC IR-drop analysis, we begin with the description of the resistivity of metal as a function of temperature. In figure 2*b*, we show resistivity of various types of metal, and their dependence on the temperature. In general, we have for a metal, the resistivity *ρ* at point (*x*,*y*,*z*) varies linearly with regard to the temperature *u*(*x*,*y*,*z*). Namely,
3.1
where *ρ*_{0} is the metal resistivity at the reference temperature, *u*_{0}, and *α* is the temperature coefficient of the resistivity with regard to the change in temperature.

### (a) IR-drop formulation: Ohm's law and Laplace's equation

To compute the steady-state voltage distribution within a conducting material, we start with the governing equation for the current density **J** as
3.2
In DC, the current flowing through a conductor obeys Ohm's law in the form
3.3
where *σ* is the temperature-dependent electrical conductivity of the conductors at point (*x*, *y*, *z*). The electric field **E** can be expressed in terms of the electric potential *ϕ* as
3.4
which leads to Laplace's equation of scalar electric potential. We have
3.5
with the boundary conditions
3.6
3.7
where *Γ*_{d} is the Dirichlet boundary surface where voltage on it is given, and *Γ*_{l} represents the impedance boundary surfaces to account for external loads such as chips and memories. *R*_{l} and *S*_{l} are the load resistance and cross-sectional area of *Γ*_{l}, respectively. Finally, *Γ*_{n} denotes the Neumann boundary surface through which no currents are allowed to flow.

The equivalent load resistance *R*_{l} is generally also temperature dependent. For simplicity, in this work, we treat it as constant and derive it from the power consumption in the chip/memory layer under ideal operating condition, namely without any IR drop. Hence, for example, with a power consumption of *P*_{l} on a load chip, we have
3.8
Here, we use the supply voltage *V*_{dd} to compute the equivalent load impedance *R*_{l} on the chips. Owing to the finite resistivity of the metal plane, the actual voltage *ϕ*_{via} at the via location will be smaller than *V*_{dd}, and subsequently, the amount of voltage delivered by the PDN to the chip modules may be insufficient for chips to function properly.

### (b) Decomposed boundary-value problem

Here, we introduce a non-conformal and non-overlapping DDM for the solution of DC IR-drop analysis. For simplicity, we only decompose the original problem into two sub-domains. The boundary-value problem (BVP) of IR drop on the PDN for the decomposed problem of figure 3 may then be written as
3.9
3.10
3.11
3.12
3.13
3.14
Here, *i*=1,2, and equations (3.13) and (3.14) are interface boundary conditions (transmission conditions) that impose the continuities of voltage *ϕ* and current *σ*(∂*ϕ*/∂*n*) at the interface *Γ* of adjacent sub-domains.

Herein, we adopt a non-conformal DDM approach such that the original problem can be divided into a number of sub-domains with smaller sizes, and each sub-domain can be meshed independently without consideration of conformity to the adjacent sub-domains. The boundary conditions, the continuities of both voltage and current across interfaces, will be imposed through the interior penalty (IP) (Arnold 1982; Houston *et al.* 2005; Rawat 2009) method.

#### (i) Interior penalty formulation

Regarding the decomposed BVP (3.9)–(3.14), with every *trial–function* pair, (*ϕ*_{1},*ϕ*_{2}), we have the following residuals:
3.15
3.16
3.17
3.18
3.19
On the Dirichlet surface, , the Dirichlet boundary conditions shall be enforced strongly as the essential boundary conditions. Namely, the boundary conditions *ϕ*_{i}=*ϕ*^{(i)}_{d} on *Γ*^{(i)}_{d} are satisfied exactly and will not result in any residuals for any trial function in the trial function space.

The above residuals can be interpreted as the error sources that support the differences between the exact voltage distribution and the trial numerical solution. In applying the Galerkin method, we shall test each of these residuals in spatial domains by test functions in its dual space. For example, the test function for the residual needs to be in the Sobolev space (Adams 1975) , the dual space of *H*^{−1}(*Ω*_{1}). To derive the weak form Galerkin statement, we follow the IP approach and begin by properly testing each of the residuals in (3.15)–(3.19) by test functions in its dual space. Following the IP formulation, with proper dual pairing for each of the residuals, we obtain the following weak Galerkin statement.Find , such that
3.20
where
3.21
and
3.22

Moreover, the last two terms on the left-hand side of (3.20), inspired by the IP methods, are included to penalize the discontinuities of the voltage and the current across interface *Γ*_{12}. We see their importance upon considering the boundary testings
3.23
and
3.24
The coefficients appearing in (3.23) and (3.24) can be chosen arbitrarily, other than zeros, to weakly enforce transmission conditions on the interface between sub-domains and thereby satisfy (3.13) and (3.14).

### (c) Volume penalty terms

We consider the volume penalty terms , in equation (3.20). Through the application of the Green theorem (or integration by parts), the expression can be re-written as 3.25 A similar expression exists for . Introducing a notation via 3.26 where 3.27

### (d) Surface penalty terms

Consider the surface penalty terms in (3.20). In the finite-element formulation, a popular choice is to set *c*_{1}=*c*_{2}=−1. We shall employ the same values for *c*_{1} and *c*_{2}, and have
3.28
with
3.29

Moreover, we have also used *c*_{3}=*c*_{4}=1 and *p*_{1}=−*σ*/*δ*, *p*_{2}=−*δ*/*σ* in the proposed formulation. From equations (3.23) and (3.24), the corresponding transmission conditions on *Γ*_{12} are therefore
3.30
where *δ* is a non-zero value. In the discretized version, we set *δ* to be the mesh size of the discretization to render the formulation coercive (Epshteyn & Riviere 200). Obviously, the enforcement of the Robin transmission conditions in (3.30) recovers the original boundary conditions described in (3.13) and (3.14) on the domain interface *Γ*_{12}.

Furthermore, we introduce two additional notations to simplify the Galerkin weak statement (3.20). They are 3.31 and 3.32

Finally, the Galerkin weak formulation for the general partition of the domain *Ω* into *M* sub-domains can be formally stated as the following. Find , such that
3.33
.

The extensions of *γ*_{0}〈*v*,*ϕ*〉, *γ*_{1}〈*v*,*ϕ*〉, *γ*_{2}〈*v*,*ϕ*〉 and 〈*v*,*ϕ*/*R*_{l}*S*_{l}〉_{Γl} to general *M* sub-domains are straightforward, and we shall omit the detailed descriptions here.

### (e) Finite-dimensional discretization

The corresponding discrete Galerkin weak statement for (3.33) can be written as the following. Find , such that
3.34
, with the finite-dimensional space . In our current implementation, is taken as the finite-dimensional subspace constructed from , a tetrahedral mesh of *Ω*_{i} with mesh size *δ*, and the Lagrangian linear interpolation polynomial within each tetrahedral element.

Subsequently, the discrete system (3.34) for *M* sub-domains can be expressed as the following matrix form:
3.35
where
3.36
In equation (3.35), denotes the column-coefficient vector of the restriction of the sought-after solution *ϕ*^{δ} in the sub-domain . We further partition into two groups: and for the *interior* and *boundary* unknowns, respectively. Moreover, the following expressions correlate the sub-matrices in (3.35) and (3.36) with the discrete Galerkin weak statement in (3.34)
3.37

### (f) Matrix solution procedure

In figure 4, we show pictorially our decomposition of the unknown coefficients vector, , within each sub-domain into two groups: the interior unknowns, and the boundary unknowns, . We illustrate our decomposition through a two-dimensional example shown in figure 4. For a triangle of the triangulation, if it has at least one edge residing on the sub-domain interface *Γ*_{ij}, we shall label all three nodes associated with the triangle as the boundary unknowns. All the rest nodes inside the sub-domain are, by default, labelled as the interior unknowns. A similar procedure is adopted for the three-dimensional applications.

In this section, we detail our solution procedure based on the DDM. To solve the matrix equation in (3.35), we first apply a block-diagonal preconditioner, which is similar to the additive Schwarz procedure (Lions 1990; Zhao *et al.* 2007). We have
3.38
where
3.39
The corresponding sub-matrices, and the right-hand sides, can be easily identified by comparing equation (3.35) with equation (3.38).

Next, we shall introduce a simple restriction operator, , *m*=1,2,…,*M*. Specifically, we have
3.40
Furthermore, it is easy to show that
3.41
Accordingly, we propose to solve the system matrix equation (3.38) in two steps.

—

*Solving the surface unknowns.*With the help of equation (3.41), it is not difficult to derive the following matrix equation in terms of only the surface unknowns: 3.42 with .—

*Recovering the sub-domain solutions.*Once the surface unknowns, , are computed, the interior solution for each sub-domain can be recovered via 3.43

By following the proposed matrix solution procedure, the construction of the preconditioner requires the inversion of each diagonal block matrices, namely, . In practice, these block matrices can be either factorized in a pre-processing step or solved via another preconditioned Krylov method (Saad 1981) at each DDM iteration (*inner-loop* iteration). In the current implementation, we have employed the multi-frontal solver (Amestoy *et al.* 2005) to factorize the subdomain matrix .

### (g) DC IR-drop analysis of an IBM package example

A Krylov subspace iterative method, generalized conjugate residual (GCR) (Eisenstat *et al.* 1983), is adopted for the solution of (3.38). The convergence criteria for the GCR(10) solver is defined as
3.44

In the following numerical studies, we use a relative residue, *ϵ*=10^{−6}. All computational statistics are reported using a workstation with two quad-core 64 bit Intel Xeon E5520 CPUs and 48 GB of RAM.

For a product-level package model such as the one shown in figure 5, it is very challenging to perform full-wave DC IR-drop analysis. The first challenge arises from its geometrical complexity, which contains tens of thousands of small entities, such as signal traces, three-dimensional interconnects (through-hole vias and buried vias, solder bumps and bump wires) and metallic power/ground layers. The via holes and other cut-outs on the PDN, which has the potential to induce Swiss cheese effects, are illustrated in figure 6. The IR drop owing to such effects can greatly compromise the performance of the high-speed IC packages and PCBs. As shown in figure 5, it contains about 40 000 geometrical entities. It will be extremely challenging to generate a mesh of the entire package with all these fine features. Secondly, even if such a mesh can be constructed, the resulting degrees of freedom (d.f.) will be astronomical. It requires excessive computation time and memory to simulate this product-level three-dimensional package model using conventional conformal finite-element analysis.

The proposed non-overlapping, non-conformal DDM addresses the abovementioned difficulties effectively. For this study, we partitioned the entire power and ground grids into 149 sub-domains, and each of them is discretized independently into a tetrahedral mesh. We may notice in figure 7 that the DDM exhibits fast convergence (75 Krylov iterations are required to converge to 10^{−6}). The computational statistics of IR-drop analysis of the IBM model are summarized in table 1. The preconditioner setup time indicates the CPU time required to factorize the 149 sub-domain matrices. As can be seen from table 1, the computation resources (both memory and CPU times) required to analyse the DC IR drop for this product-level multi-scale IC package example are quite modest.

Figure 8 shows the boundary conditions and the voltage distribution on the top layer. The results indicate that the left bottom corner of layer 1 will see 28 mV of voltage drop. If the nominal voltage of this rail is 1.0 V, the IR drop is 2.8 per cent of nominal. Moreover, the simulated IR drop result on the power grids and on the ground grids of the example are shown in figures 9 and 10, respectively.

## 4. Steady-state thermal formulations

Heat conduction is the process of heat transfer in a given region over time. The governing partial differential equation for the steady-state thermal analysis can be written as
4.1
where *q*=*q*_{d}+*q*_{e} includes the DC power dissipation *q*_{d}(=**J**⋅**E**=*σ***E**^{2}) and chip power consumption *q*_{e} generated in the medium. Moreover, *q*′ is the flow rate of heat energy through the outer surface, which is proportional to the negative temperature gradient across the surface,
4.2
with *κ* being the thermal conductivity. Consequently, the steady-state BVP for thermal analysis can be stated as
4.3
and
4.4
where *h* and *T*_{a} are the heat-transfer coefficient and ambient temperature, respectively.

### (a) Decomposed boundary-value problem for the thermal analysis

For simplicity and without loss of generality, we again decompose the original problem into two sub-domains, as illustrated in figure 11. Subsequently, the decomposed BVP for the steady-state thermal analysis can be written as 4.5 4.6 4.7 4.8 4.9

Following a similar process as in the non-conformal DDM formulation for IR drop, the discrete Galerkin weak formulation of the steady-state thermal analysis can be stated as the following. Find such that 4.10 , where 4.11 4.12 4.13 4.14

The discrete system (4.10) results in a matrix equation of the form 4.15 with 4.16

In equation (4.15), denotes the column-coefficient vector of the restriction of the sought-after solution *u*^{δ} in sub-domain . We further partition into two groups: and for the *interior* and *boundary* unknowns, respectively. Moreover, the following expressions correlate the sub-matrices in (4.15) and (4.16) with the discrete Galerkin weak statement in (4.10),
4.17

## 5. Numerical study: thermal-aware IR-drop analysis of printed circuit board example

In this section, we will use the power loads at the chip modules as boundary conditions to obtain a closed-form numerical expression for the voltage profile of a chip-package-PCB example. IR drop, starting from the power supply source, the voltage regulator module to the IC circuits, experiences three stages: on-board IR drop, on-package IR drop and on-chip IR drop. Here, we consider the on-board IR drop of the PDN of the model. Numerical simulations were conducted on a Linux workstation with two quad-core 64 bit Intel Xeon X5450 CPUs and 48 GB of RAM. All the numerical computations were implemented through the use of non-overlapping and non-conformal DDMs, which provide an effective way to alleviate the burden of mesh generation. All the metal layers were fabricated using copper with a conductivity of 5.96×10^{7} S m^{−1} at 20^{°}C. The ambient temperature at initial iteration is 20^{°}C. The heat-transfer coefficient between the model and air is 10 W (m^{2} K)^{−1}.

The computational statistics of the IR-drop analysis of the chip-package-PCB example are summarized in table 2. As seen from figure 12, we observed fast convergence (28 Krylov iterations for IR-drop analysis and 41 iterations for steady-state thermal analysis) of the proposed DDM approach. For IR-drop analysis, when resistivity changed with temperature, the matrix entry on the l.h.s. of equation (3.35) also changes. So, for each thermal-aware IR-drop analysis step, we need to factorize the sub-domain matrices again.

### (a) Example of chip-package-printed circuit board

The PDN in the chip-package-PCB model consists of two parallel copper planes and multiple vias. The vias are not only connected to CPU and RAM, but also served as thermal TSVs, as shown in figure 13. The heat sources of the current study, both on the chip layer, as well as on the memory layer of the stack, are shown in figure 14. The PCB plane in figure 1, with dimensions of 20×20 cm, together with the vias, is modelled at room temperature to compute the initial DC IR drop. The entire computational domain is decomposed into 115 sub-domains. Figure 15 shows the geometrically non-conformal sub-domain partitions and non-matching meshes on the sub-domain interfaces.

As shown in figure 16*a*, 2.5 V is assigned to the top left corner of the plane to mimic the ideal voltage supply. The resulting voltage drop across the plane is shown in figure 16*b*. In figure 17, the minimum voltage at the chip1 module is 2.47172 V, resulting in a maximum voltage drop of 28.28 mV from the power supply.

Figure 17*b* illustrates that the hot spot in the temperature profile exists in the chip1 module, which matches with its power map. After the convergence of the electrical–thermal co-analysis loop within four iterations (mean square root of voltage distribution difference between iterations smaller than 1.0^{−4}), final voltage distributions compared with the initial voltage profile of the two chip modules are shown in figure 18. The minimum voltage of the two chip modules with each iteration are illustrated in figure 19. Figure 18 shows that the IR drop at every TSV location is increased owing to the Joule heating effect on the PDN. The final voltage distributions of the CPU1 and RAM1 within the chip1 module have maximum IR drops of 34.9 and 33.1 mV, while the initial voltage distributions have maximum IR drops of 28.28 and 26.9 mV. Moreover, the final voltage distributions of the CPU2 and RAM2 within the chip2 module increase from 22.5 and 22.6 mV to 27.2 and 27.32 mV from the power supply to the chip2 module, respectively. Compared with the initial IR drop, the final IR drop increases by about 20 per cent.

## 6. Conclusion

We propose a multi-physics co-analysis technique to efficiently compute the thermal-aware DC IR drops on the power grid of high-power ICs. The DDM described herein is a type of non-overlapping and non-conformal DDM, and hence, it is effective in dealing with multi-scale problems. Each sub-region can be modularly treated in terms of discretization. The use of the proposed DDM is justified through a practical example of a product-level IC package model and a multi-scale chip-package-PCB example. The thermal-aware DC IR-drop analysis results demonstrate that owing to the Joule heating of the power grid, the IR drop may be adversely affected by up to 20 per cent.

## Acknowledgements

The authors would like to thank ANSYS/ANSOFT Corporation for their support on this work.

- Received December 2, 2011.
- Accepted January 13, 2012.

- This journal is © 2012 The Royal Society