November-December 1998
A Test Strategy to Verify the Reliability of Chip-Scale Packages
Asking what tests should be used for assuring CSP reliability forces us to focus on the key question: What do we mean by "reliability"?
By Dr. Per-Erik Tegehall, IVF, The Swedish Institute of Production Engineering Research, Molndal, Sweden
This paper outlines a test strategy for verifying the reliability of chip-scale packages. The emphasis in this test strategy is placed on understanding the root causes of failures (the physics of failure) to enable a rational selection of relevant test methods for reliability verification. Various failure mechanisms and adequate test methods for reliability verification will be described.
The limited availability of reliability data for chip-scale packages (CSPs) has slowed the widespread use of this package type. This is due to the few independent evaluations that have been performed and also because CSP reliability is a much more complex matter than the reliability of conventional package types such as PLCCs and QFPs.
An evaluation conducted by Newman and Yuan indicates that the reliability for CSPs may vary considerably for various package types. 2 Since CSPs include component types with probably more dissimilar characteristics than any two other IC package types, this result is not surprising. Therefore, CSP reliability must be verified for each CSP type.
The traditional way to assure reliability of components is to test according to a set of standardized test methods, such as those given in MIL-STD-883 3 and JESD22. 4 These methods are based on the experience of mature technologies, and the purpose is to ensure a uniform level of reliability.
However, during recent years the absolute accuracy of these tests has been questioned. The high acceleration factor of many of these tests may provoke failure mechanisms that are insignificant in actual use and so result in false conclusions. This risk increases when new, innovative technologies are used for manufacturing packages, as with CSPs.
Consider also that many CSPs will be employed in consumer products that will be used in rather benign environments over rather short expected lifetimes. The relevance of using the high-reliability test methods for these products is questionable. As Greathouse noted:5
"The cost drivers associated with the final system price are forcing a re-look at the real package reliability requirements for an application. There is a need to get the best possible product for an application, without paying high prices for components that exceed what is needed for your application. A set of modified standards is likely to come forward, based on the actual needs of the products, instead of a preestablished set of standardized tests, as has been the history with companies using Mil-Spec 883 standards."
JEDEC's Standard for Stress-Test-Driven Qualification of Integrated Circuits, JESD47, 6 reflects an increased awareness of this problem and cautions against the indiscriminate use of accelerated tests for qualification. It stresses that new qualification projects must be examined for potential new and unique failure mechanisms.
Methods Questioned
However, it is not only for the qualification of packages that the older test methods have been questioned. In 1995, the United States Department of Defense cancelled many of the MIL standards without replacement. Among those cancelled was the standard regulating production of printed circuit board assemblies, MIL-STD-2000. 7 The reason given for the cancellation was that standards consisting of "how-to" instructions are undesirable. Instead, future acquisitions should be based on performance terms.8
Asking what tests should be used for assuring CSP reliability forces us to focus on the key question: What do we mean by "reliability"? The definition given by the IPC is, "The ability of a product to function under given conditions and for a specified period of time without exceeding an acceptable failure level."9
According to this definition, it is impossible to evaluate the reliability of a package generally; the reliability has to be assessed for specific applications. A component found reliable for one application may be unreliable for another. Furthermore, reliability tests must be based on acceleration of the failure mechanisms that will be crucial for a specific product's reliability, and the tests must be continued until enough parts have failed for a life distribution to be calculated.10
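As an illustration of what calculating a life distribution can involve, the sketch below fits a two-parameter Weibull distribution to cycles-to-failure data using median-rank regression with Benard's approximation. The article does not prescribe a particular distribution or fitting method, so the Weibull choice, the function names and the sample data are all assumptions made for the example. The fit as written also assumes that every part on test has failed; a test stopped with survivors would need a method that handles censored data.

```python
import math


def fit_weibull(cycles_to_failure):
    """Estimate Weibull shape (beta) and scale (eta) by median-rank regression.

    Assumes a complete sample: every part on test has failed.
    """
    times = sorted(cycles_to_failure)
    n = len(times)
    if n < 2:
        raise ValueError("need at least two failures to fit a line")

    xs, ys = [], []
    for rank, t in enumerate(times, start=1):
        # Benard's approximation to the median rank of the i-th failure.
        f = (rank - 0.3) / (n + 0.4)
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))

    # Least-squares line y = beta * x + c, where c = -beta * ln(eta).
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    eta = math.exp(-intercept / beta)
    return beta, eta


def fraction_failed(cycles, beta, eta):
    """Weibull cumulative failure fraction after a given number of cycles."""
    return 1.0 - math.exp(-((cycles / eta) ** beta))


if __name__ == "__main__":
    # Hypothetical cycles-to-failure for eight test boards (illustrative only).
    observed = [1850, 2300, 2650, 2900, 3200, 3500, 3900, 4500]
    beta, eta = fit_weibull(observed)
    print(f"shape beta = {beta:.2f}, characteristic life eta = {eta:.0f} cycles")
    print(f"fraction failed at 1000 cycles = {fraction_failed(1000, beta, eta):.4f}")
```

The fitted distribution can then be compared against the acceptable failure level for the specific application, which is the comparison that the IPC definition calls for.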
Using the IPC's definition of reliability, the objective of reliability testing is to assess the failure rate during a specific product's lifetime. To obtain test results within a reasonable time period, accelerated tests are used. Acceleration of a test can be achieved in two ways, which may be combined:10 The frequency of the occurrence that causes failure can be accelerated or the severity of the conditions causing the failure can be increased.
Generally, a failure mechanism that is a continuous process—a corrosion process, for example—can best be accelerated by increasing the severity of the conditions causing failure. Alternatively, a failure mechanism that is caused by a number of events with rather short duration can best be accelerated by increasing the frequency of the events. The outline of an adequate test strategy for reliability verification of packages is discussed below.
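The two routes to acceleration can be put into numbers with a small sketch. For the severity route, the example uses the Arrhenius relation, a common model for thermally activated continuous processes; the article does not name a model, so this choice, the activation energy and all function names are assumptions for illustration. For the frequency route, the acceleration factor is taken simply as the ratio of event rates, which holds only if each event does the same damage in the test as in service.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_af(activation_energy_ev, use_temp_c, test_temp_c):
    """Severity acceleration for a thermally activated, continuous process."""
    use_k = use_temp_c + 273.15
    test_k = test_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / test_k)
    return math.exp(exponent)


def frequency_af(use_events_per_day, test_events_per_day):
    """Frequency acceleration for a mechanism driven by discrete events.

    Assumes each event causes the same damage in test and in service.
    """
    if use_events_per_day <= 0:
        raise ValueError("use_events_per_day must be positive")
    return test_events_per_day / use_events_per_day


def combined_af(severity_af, event_af):
    """Combine both routes, assuming the two effects are independent."""
    return severity_af * event_af


if __name__ == "__main__":
    # Hypothetical values: 0.7 eV activation energy, 40 C in use, 85 C in test.
    severity = arrhenius_af(0.7, use_temp_c=40.0, test_temp_c=85.0)
    # Hypothetical values: one power cycle per day in use, 24 per day in test.
    events = frequency_af(use_events_per_day=1.0, test_events_per_day=24.0)
    print(f"severity acceleration  = {severity:.1f}")
    print(f"frequency acceleration = {events:.1f}")
```

Multiplying the two factors is itself an assumption. When the mechanisms interact, the combined effect has to be established experimentally.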
Performance-Based Reliability Testing
Since the goal of reliability work is to assure the high quality of manufactured products, a useful methodology for adopting new technologies must be based on verifying performance. This paper is limited to discussing reliability verification of packages. A more general discussion of how to verify the reliability of printed circuit board assemblies can be found in Reference 11.
For package qualification, JEDEC has published a standard entitled, Failure-Mechanism Driven Reliability Qualification of Silicon Devices,12 which is presented as an alternative to traditional stress-driven qualification.
JEDEC points out that for the standard to reach its full effectiveness, the original equipment manufacturer (OEM) and the supplier must develop partnerships. The OEM should accept the reliability qualification process performed by the supplier, per the guidelines of the standard, in lieu of the specific (special) reliability qualification process that many OEMs require today.
This JEDEC standard represents a fundamental shift from regulating how manufacturing and qualification should be performed to a performance-based approach. Consequently, the manufacturer selects those materials and processes that he wants to use. However, the manufacturer must be able to adequately verify that he meets the buyer's requirements. The manufacturer is free to choose the verification methods as long as their relevance can be shown.
The reliability of a package at board level is affected by several factors. These include the materials and processes used both for manufacturing the package and for producing PWB assemblies. The use conditions and environmental exposures during the product's service life are another factor. In addition, the acceptable failure probability may differ considerably for various types of products.
Each of these constituents must be thoroughly understood and considered when assessing the reliability of a package,13 and it is normally not possible to evaluate the reliability of a package based on a single item. For example, stresses due to CTE mismatches between package and PWB substrate will be directly distributed into area array packages and may have a large impact on the reliability of internal interconnections. Therefore, when possible, packages should be mounted on test vehicles that are representative of all the materials and processes used for producing a product. In addition, a test program must be representative of the use conditions and environmental exposures during the product's entire service life.
The process of designing a relevant test program is often called tailoring and is part of a Design for Reliability (DfR) process that may involve the steps shown in Table 1:13,14
The key issues in this process involve understanding the mechanisms of the potential failure modes (physics of failure) and how to appropriately accelerate these in reliability tests. Since reliability testing has to be application specific, it will not be possible to have a number of fixed reliability tests to choose between for evaluation. In most cases, existing test methods will have to be altered to be adequate, or new test methods will have to be developed.
Usually, more than one failure mode will be crucial for the reliability of a package. The ideal solution is to find a test that accelerates all failure modes simultaneously, but that is seldom achievable. Tailoring a test program of consecutive tests is usually the only solution. Interaction between various failure modes must then be considered so that the order of test performance yields the same type of interaction as in manufacturing and service life.
Since component manufacturers lack much of the knowledge needed for tailoring a relevant test program, JESD34 points out that OEMs must be committed to the collection and analysis of field data and be prepared to share applicable data with component suppliers.12
It then becomes the responsibility of the component manufacturer to identify all potential physical failure mechanisms and evaluate their impact on reliability. Potential failure mechanisms must include mechanisms that may be introduced by subsequent levels of manufacturing, consistent with the intended use of the product. Failure mechanisms must also include those that may occur over the expected lifetime of the product in the intended field application conditions to which the product is expected to be subjected.
It will be the OEM's responsibility to compare the manufacturer's application assumptions against the intended application conditions. However, the present trend among OEMs to outsource much of the production to contract manufacturing services (CMS) makes the situation more complicated and requires that contract manufacturers become involved in this process.
It is easy to say that one should employ a performance-based reliability assessment approach, but quite difficult to realize. The largest obstacle is probably the mental metamorphosis required for adapting to a new situation where you are responsible for the reliability verification process. Everyone has become comfortable with the situation where someone else (the organizations which release standards, for example) seemingly takes full responsibility for this process in a way that makes it unnecessary for the other parties to be knowledgeable about reliability work. This has been convenient and most people will be very reluctant to forsake it. This reluctance will be due to a lack of knowledge, primarily of how to handle this new situation. In addition, for many application areas, the knowledge of which failure mechanisms are important for reliability and especially how to accelerate these is inadequate, not only at individual companies but world-wide.
Some of the more important failure mechanisms at board level affecting the reliability of CSPs are cracking of solder joints, interconnection failures inside packages and corrosion of circuitry on chips. These failure mechanisms and adequate test methods for evaluating the impact on reliability will be described.
Cracking of Solder Joints
Failure Mechanisms
IPC has published a guideline for accelerated reliability testing of surface mount solder attachment, IPC-SM-785. 9 A more thorough discussion of the mechanisms which cause solder attachment failures can be found in this guideline and elsewhere. 15,16 Therefore, only a short summary of the basic failure mechanisms will be given in this paper. The emphasis is instead placed on parameters which affect the fatigue of solder joints that are not covered in IPC-SM-785. The idea is to show the breadth and complexity of the knowledge required for performing an adequate DfR process.
The most common cause of solder-joint cracking is temperature change. This will cause strains in the solder joints due to thermal expansion mismatches between various materials resulting in fatigue of the solder joints. The larger the temperature change, thermal mismatch and size of the package, the fewer thermal cycles to failure. Leaded components usually have a much better resistance to fatigue of the solder joints, compared to leadless devices, due to the compliance afforded by the leads.
When a solder joint is loaded, the strain will first cause elastic deformation, followed by plastic deformation above the yield strength of the solder. If the loading of the solder joint is continued, a relaxation through creeping will occur. The time needed for complete relaxation is highly temperature-dependent. At temperatures above 100°C, relaxation will occur within a few minutes, whereas it can be neglected at temperatures below -20°C.
Below -20°C, the failure mechanism is primarily elastic/plastic fatigue. From -20°C up to +20°C, both elastic/plastic and creep fatigue are involved. At higher temperatures, creep fatigue is dominant. The creep of solder joints will cause structural changes that first will result in grain growth followed by the formation of microcracks. In time, these will grow until the solder joints are completely cracked. Although the grain growth process is a natural process, which may occur without any loading, the process is enhanced by strain loadings.
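The temperature regimes just described lend themselves to a simple check when tailoring a test: does the test profile bring in a fatigue regime that the product never sees in service? The sketch below encodes the -20°C and +20°C boundaries from the text and compares a use profile with a test profile. The function names, the treatment of the exact boundary values and the example profiles are assumptions made for illustration.

```python
PLASTIC = "elastic/plastic fatigue"
MIXED = "elastic/plastic and creep fatigue"
CREEP = "creep fatigue"


def dominant_mechanism(temp_c):
    """Dominant solder fatigue mechanism at a single temperature.

    The boundary values themselves are assigned to the mixed regime
    (an assumption; the text does not say where they belong).
    """
    if temp_c < -20.0:
        return PLASTIC
    if temp_c <= 20.0:
        return MIXED
    return CREEP


def regimes_spanned(t_min_c, t_max_c):
    """All fatigue regimes touched by a cycle between two temperature extremes."""
    if t_min_c > t_max_c:
        raise ValueError("t_min_c must not exceed t_max_c")
    regimes = set()
    if t_min_c < -20.0:
        regimes.add(PLASTIC)
    if t_min_c <= 20.0 and t_max_c >= -20.0:
        regimes.add(MIXED)
    if t_max_c > 20.0:
        regimes.add(CREEP)
    return regimes


def regimes_only_in_test(use_range, test_range):
    """Regimes that the test exercises but the service environment does not."""
    return regimes_spanned(*test_range) - regimes_spanned(*use_range)


if __name__ == "__main__":
    # Hypothetical profiles: 0 to 70 C in service, -55 to 125 C in test.
    extra = regimes_only_in_test(use_range=(0.0, 70.0), test_range=(-55.0, 125.0))
    print("regimes present only in the test:", sorted(extra))
```

A non-empty result does not by itself invalidate a test, but it flags that part of the damage accumulated in the test may come from a mechanism that is not relevant for the application.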
Wear-out of solder joints may also be affected by vibration and mechanical loading. 9 Very high stresses can be induced if the vibration frequency approaches the natural frequency of the board, causing large deflections. 17
If the board is fixed to avoid this problem, vibration will have little effect on crack initiation, but may contribute significantly to crack propagation. 18 The weight of components is also important for the susceptibility to damage of solder joints, since a large weight will increase the influence of vibration. When heat sinks are used, these may have a large impact on the weight of the components and, consequently, also on fatigue of the solder joints caused by vibration. 19 In addition, combined vibration and thermal cycling may cause synergistic effects with a faster propagation of cracks.
Properties Affecting Fatigue
Package properties and the design of solder lands play a major role in determining how the fatigue behavior of solder joints will affect area array components. Examples of package properties that have a large influence are size, pitch, CTE, local influence of chip on CTE,20 ball size and ball composition.21,22,23 Solder land parameters of importance are type of definition (solder mask versus non-solder mask defined) 24,25 and type of surface finish.
Nickel/gold and nickel/palladium finishes have been found to promote cracking. 26 It is thought that this cracking is due to enrichment of nickel phosphide in the intermetallic layer 27 or that nickel-tin intermetallic phases are weaker or more brittle than copper-tin intermetallic phases. 28 In the case of nickel/palladium finishes, it may also be due to formation of palladium-tin intermetallic particles.29
When flip chips are soldered to organic PWBs, the use of underfill is mandatory to achieve acceptable solder joint reliability. The function of the underfill is to redistribute the stress from the joints to the chip, substrate and underfill material.13 The use of underfill also improves the reliability of solder joints to area array components. The impact of an underfill on the solder fatigue life is influenced by a number of parameters: An underfill with high filler content is more effective in reducing creep strain than an underfill with low filler content, due to higher elastic stiffness.30 This reduction in creep strain is diminished if a soft solder-mask layer is present between the underfill and board substrate. A concentration of the creep strain will then occur at that part of the solder joint that is embedded in the solder mask. 30 Flux residues may have an effect like a soft solder mask on the strain distribution.25
Figure 1 illustrates the decreased protection against crack formation due to flux residues. In this context, the properties of the underfill are important. If an underfill with low CTE is used, the influence of flux residues is reduced (as shown in Figure 2). This may not be due completely to the low CTE for this underfill. (The Young's modulus of the underfill is probably also important, but information about it is missing for the tested underfills).
Conformal coatings may also affect the reliability of solder joints. It is well known that paraxylylene (Parylene) enhances the thermal fatigue life of solder joints. 31 Finite element calculations indicate that the extended solder life is the result of modification of the stress and strain fields in the solder joints using the paraxylylene coating.32
Accelerated Test Methods
Since the fatigue of solder joints is mainly caused by temperature variations, test methods for evaluating solder attachment failure are usually based on cyclic temperature variations. Acceleration is achieved by increasing the temperature swing (ΔT), as well as the frequency.
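A rough feel for how much a larger temperature swing shortens a test can be obtained from a simplified Coffin-Manson relation, in which the number of cycles to failure scales with the temperature swing raised to a negative power. The article does not give this formula, so the model, the exponent and the function names below are assumptions for illustration. Exponents around 2 are often quoted for tin-lead solder, but the value has to be established for the actual solder, package and temperature range, and the relation ignores dwell time and frequency effects.

```python
def coffin_manson_af(delta_t_use_c, delta_t_test_c, exponent=2.0):
    """Acceleration factor from temperature swing alone (simplified model).

    AF = (delta_T_test / delta_T_use) ** exponent, where the exponent is a
    placeholder that must be fitted to data for the joint in question.
    """
    if delta_t_use_c <= 0 or delta_t_test_c <= 0:
        raise ValueError("temperature swings must be positive")
    return (delta_t_test_c / delta_t_use_c) ** exponent


def equivalent_use_cycles(test_cycles, delta_t_use_c, delta_t_test_c, exponent=2.0):
    """Service cycles represented by a number of test cycles under the model."""
    return test_cycles * coffin_manson_af(delta_t_use_c, delta_t_test_c, exponent)


if __name__ == "__main__":
    # Hypothetical swings: 50 C in service, 100 C in the accelerated test.
    af = coffin_manson_af(delta_t_use_c=50.0, delta_t_test_c=100.0)
    print(f"acceleration factor = {af:.1f}")
    print(f"1000 test cycles correspond to about "
          f"{equivalent_use_cycles(1000, 50.0, 100.0):.0f} service cycles")
```

The estimate is only meaningful while the test and service conditions activate the same failure mechanism.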
According to IPC-SM-785, tests based on temperature variations can be divided into two groups, thermal shock tests and thermal cycling tests. Exposing the assemblies to rapid temperature changes which cause transient temperature gradients, warpages and stresses is defined as thermal shock.
As a rule of thumb, rates of temperature change in excess of 30°C/minute are required. Thermal cycling is defined as "the exposure of assemblies to cyclic temperature changes where the rate of temperature change is slow enough to avoid thermal shock," i.e. below 30°C/minute.
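The 30°C/minute rule of thumb can be applied directly to a planned chamber profile. The sketch below computes the average ramp rate from the temperature extremes and the transition time and labels the profile accordingly. The function names and example profiles are assumptions, and a rate of exactly 30°C/minute is treated here as thermal cycling, a boundary case that the text leaves open.

```python
SHOCK_THRESHOLD_C_PER_MIN = 30.0  # rule-of-thumb limit quoted in the text


def ramp_rate(t_low_c, t_high_c, transition_min):
    """Average rate of temperature change in degrees C per minute."""
    if transition_min <= 0:
        raise ValueError("transition_min must be positive")
    return abs(t_high_c - t_low_c) / transition_min


def classify_profile(t_low_c, t_high_c, transition_min):
    """Label a profile as thermal shock or thermal cycling by its ramp rate."""
    rate = ramp_rate(t_low_c, t_high_c, transition_min)
    if rate > SHOCK_THRESHOLD_C_PER_MIN:
        return "thermal shock"
    return "thermal cycling"


if __name__ == "__main__":
    # Hypothetical profiles (illustrative only).
    print(classify_profile(-40.0, 125.0, transition_min=2.0))
    print(classify_profile(0.0, 100.0, transition_min=10.0))
```

An average rate can hide a steeper initial transient, so the rate actually measured on the assembly is the better input where it is available.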
Since several thousand temperature cycles may be required before failure occurs, it is tempting to perform each cycle as fast as possible using a ΔT as large as possible to minimize the number of cycles required for failures. The most extreme test is when assemblies are dipped alternately in two liquids of very different temperatures. However, the more the test is accelerated, the larger the risk that irrelevant failure mechanisms will determine the test results. Therefore, a good understanding of the failure mechanism is imperative when performing accelerated testing of the fatigue properties of solder joints. (continued)
© 1998 ChipScale REVIEW