The New Economics of Semiconductor Manufacturing
The Toyota Production System has been applied to chip making. The electronics industry may never be the same
Image: Stuart Bradford
The semiconductor industry is undergoing a sea change. It's being split into haves and have-nots, and it has become much more difficult for everyone to make a profit. Never have so many smart people worked so hard for so little money.
Walk into a multibillion-dollar chip-fabrication plant--a fab--and you may very well get the impression that the industry is headed for a spectacular meltdown. One of the first things you'll see is a bay the size of two basketball courts packed with equipment for projecting a lithographic design onto wafers. Nearby, you'll find a towering bin, called a stocker, filled with wafers waiting to be processed by this equipment. The wafers are worth from US $10 million to $100 million--all of it idle inventory.
Why? To amortize the $5 billion investment in a fab over a five-year schedule costs more than $3 million a day. Conventional wisdom holds that to generate that much money you must keep all the equipment running all the time, even if that means creating large unused queues of wafers. What's more, to justify that scale, you have to produce a semiconductor product in volumes of at least 5000 to 10 000 wafers per month.
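The amortization arithmetic can be checked with a few lines of Python. This is a minimal sketch that assumes plain straight-line amortization over 365-day years; the function name and that method are illustrative assumptions, not the accounting used in the article. Straight-line division alone gives roughly $2.7 million a day, so the figure of more than $3 million presumably also reflects costs this sketch leaves out, such as financing.

```python
# Straight-line amortization of a fab investment (illustrative assumption:
# no financing cost, no residual value, 365-day years).
FAB_INVESTMENT_USD = 5_000_000_000  # from the article
SCHEDULE_YEARS = 5                  # from the article


def straight_line_daily_cost(investment_usd, years, days_per_year=365):
    """Return the investment spread evenly over every day of the schedule."""
    return investment_usd / (years * days_per_year)


daily_cost = straight_line_daily_cost(FAB_INVESTMENT_USD, SCHEDULE_YEARS)
print(f"Straight-line cost per day: ${daily_cost:,.0f}")
```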
More than anything else, Moore's Law has been responsible for the gigantic costs. It takes huge amounts of capital to support the incessant cycles of investment and obsolescence that keep Moore's Law on the march. That rapid cycling explains why a company's shining jewels can turn into white elephants in just five years.
Although industry giants like Intel and Samsung work on a vast scale and can therefore make these huge investments work for them, smaller companies (and even some sovereign states) can no longer afford to play the game. A massive restructuring in the industry is forcing them to consolidate or outsource production in order to gain sufficient scale to compete.
Every month new alliances and divestitures bring fresh evidence of this restructuring. In 2006, Texas Instruments announced that it would partner with foundries to codevelop future process technologies based on line widths (the smallest feature on a chip) of less than 45 nanometers. In 2003 and 2006, respectively, Motorola and Philips--iconic companies in the industry--spun off their semiconductor operations entirely. In 2006, LSI Logic (now LSI Corp.) acquired Agere Systems and continues to struggle. Intel sold its communications and application-processor business to Marvell Technology Group. Advanced Micro Devices, its cash flow and its competitiveness in question, acquired ATI Technologies in 2006.
All these strategic moves were meant to recover the growth and profitability of the past. But none of them have done so.
There is, however, a glimmer of hope, and it comes from an unlikely source: the Toyota Motor Corp. For more than 30 years, Toyota has followed a production system that has enabled it to increase quality, double capacity, produce a wider variety of models in a given factory, and change the mix on a dime. Last year Toyota made more cars than any other company, surpassing General Motors.
Even more important, Toyota's approach to mass production has produced bountiful profits. In 2005, it earned more than all the other auto manufacturers in the world combined. Yet although many scholars and executives have scrutinized Toyota's plants and production methods--GM went so far as to open a joint venture with Toyota in California--no one has yet been able to fully replicate its success.
In early 2007, we had the opportunity not merely to emulate Toyota's system but to apply its principles to a logic fab belonging to an integrated device manufacturer (IDM). As consultants, we are not at liberty to divulge the company's name; however, it's safe to say that the company is highly competitive--that is, it has survived and prospered by pursuing Moore's Law, always remaining at the forefront in technology and operational excellence. But Moore's Law was turning this jewel of a fab into a white elephant while the equipment was still relatively new.
In just seven months, the organization was able to reduce the manufacturing cost per wafer by 12 percent and the cycle time--the time it takes to turn a blank silicon wafer into a finished wafer, full of logic chips--by 67 percent. It did all this without investing in new equipment or changing the product design or technical specifications. And this short experiment has exposed only the tip of the iceberg. We believe that these early results point to what we call the new economics of semiconductor manufacturing and that this will have a profound and lasting effect on the industry and create new opportunities for growth.
The principles and philosophy of the Toyota Production System (TPS) that we applied were first described in 1999 by Steve Spear and Kent Bowen, then at the Harvard Business School, in their article “Decoding the DNA of the Toyota Production System” in the Harvard Business Review. They noted that Toyota trains its workers to treat any problem that arises as an opportunity to learn. Toyota designs and redesigns work according to a rigorous process to examine the current state of production and generate hypotheses on how to improve it, together with a highly specified expected outcome.
It's an empirical approach based on iterative experimentation, one that long escaped the many Toyota watchers who typically fell into the trap of confusing the company's tools--such as kanban cards, used to order parts--with its principles.
Spear and Bowen distilled TPS into four rules, which in summary are (1) highly specify activities, (2) clearly define the transfer of material and information, (3) keep the pathway for every product and service simple and direct, and (4) detect and solve problems where and when they happen, using the scientific method. When we present these rules, even in their fully detailed form, clients generally protest that they “do it that way already.” But on closer examination--while auditing their fabs--we often find something quite different [see sidebar, “The Toyota Production System Sanity Check”].
Here are examples from our work.
The first rule, on activities, states that “all work shall be highly specified as to the content, sequence, timing, and outcome.” At the fab we studied, maintenance technicians were supposed to clean the etch chamber from top to bottom, but we observed that sometimes they did it from bottom to top. That order wouldn't have been so bad if it had been followed consistently, because the behavior would have become a new set point around which further improvements could be based. But in fact, the method of cleaning changed unpredictably. There was so much random variability in the work that nothing could be learned from the results.
The next rule states that “every customer-supplier connection must be direct, and there must be an unambiguous yes-or-no way to send requests and receive responses.” This rule was violated when, for example, a worker--we'll call her Jane--operated a deposition-process machine that received wafers supplied by another worker, whom we'll call Bill. That arrangement made Jane Bill's “customer.” In a properly ordered system, he'd send her wafers when--and only when--they were needed. In practice, he sometimes had no wafers when she needed them, and at other times he encumbered her with wafers she could not use. Jane then had to throw those excess wafers onto the costly pile of inventory.
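The difference between pushing wafers and sending them only on request can be illustrated with a toy simulation. This is a sketch under loud assumptions: the wafer counts are invented for illustration, time is divided into equal periods, and in pull mode the supplier is assumed always able to fill the request exactly. None of these numbers come from the fab described here.

```python
def simulate(supplier_output, customer_demand, pull):
    """Track leftover wafers and starved periods at the customer's machine.

    supplier_output: wafers the supplier would push in each period (hypothetical).
    customer_demand: wafers the customer can process in each period (hypothetical).
    pull: if True, the supplier sends exactly what the customer requests.
    """
    inventory = 0          # wafers waiting in front of the customer's machine
    history = []           # leftover inventory at the end of each period
    starved_periods = 0    # periods in which the customer ran short
    for pushed, needed in zip(supplier_output, customer_demand):
        delivered = needed if pull else pushed
        available = inventory + delivered
        if available < needed:
            starved_periods += 1
        processed = min(available, needed)
        inventory = available - processed
        history.append(inventory)
    return history, starved_periods


# Hypothetical data: the customer needs 2 wafers every period,
# while the supplier pushes wafers in uneven lumps.
demand = [2, 2, 2, 2, 2, 2]
pushed = [3, 0, 4, 0, 3, 2]

push_history, push_starved = simulate(pushed, demand, pull=False)
pull_history, pull_starved = simulate(pushed, demand, pull=True)
print("push:", push_history, "starved periods:", push_starved)
print("pull:", pull_history, "starved periods:", pull_starved)
```

In the push case the customer is sometimes starved and sometimes left holding wafers she cannot use, which is the pattern described for Jane and Bill; in the idealized pull case neither happens.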
The third rule states that “the pathway for every product and service must be simple and direct.” This rule was violated when a cassette of wafers showed up at a bay equipped with 10 identical process tools, any one of which could be used to process the cassette. But which tool were the workers supposed to use? No one could say offhand because the path was not simple and direct, and the critical decision was left to the operator on the floor.
Why does this indeterminacy matter? First off, not knowing the direct path from the outset makes it hard to reconstruct that path later on, if a batch of wafers should show defects. Even worse, ignorance of the path prevents the system as a whole from discovering a defective machine in time to prevent a recurrence. Memory fades quickly, and when workers are able to avoid a malfunctioning machine without fixing it immediately, they lose opportunities to learn and improve. The Toyota system requires that the problem be addressed locally, immediately, and intensively by clearly defined actors, not just any “expert” who happens to be available.
The fourth rule states that “any improvement must be made in accordance with the scientific method under the guidance of a teacher, at the lowest possible level in the organization.” One worker found a better way to get a customer the material he needed. He helped the customer pull the material from his supplier instead of having the supplier push it to the customer, as had been done before. The worker had to analyze the current state of things, document it, and formulate a hypothesis that included an experiment with an expected outcome that could be measured and compared with the actual outcome. Such problem solving engages everyone and creates an army of scientists engaged in continual improvement and organizational learning.
Implementation of these ideas is harder than it may seem; it requires a certain adjustment of thinking. We have studied many companies that were trying to increase their operational efficiency. In the beginning, they typically lump TPS under the rubric of “lean” manufacturing. But while many lean techniques have great merit, the Toyota system is strikingly different.
Most semiconductor companies undertake elaborate theoretical work in offices and conference rooms, using computer simulations and spreadsheets, in the hope of determining fundamental mechanisms. They typically focus on creating big projects that promise to yield “silver bullet” solutions. While this approach can certainly have its rewards, it's not the Toyota way.
TPS constitutes a highly empirical method of managing a multistep manufacturing process. Such empiricism beats simulation, because no simulation model is sophisticated enough to capture the complexities inherent in semiconductor processing. The fastest way to develop process understanding is to execute lots of small, fast-paced scientific experiments on the factory floor. The factory is the laboratory. This is the essence of TPS--rapid, iterative, experimental problem solving.
Such experimentation, done in the course of regular production work on the factory floor, doesn't necessarily yield a final, perfect answer to a particular problem--no such perfection may exist. Nor does TPS use an off-the-shelf “cookbook” approach. Instead it adjusts to the strategic imperatives concerning cost, quality, flexibility, or any other metric the company wishes to emphasize at a given time. For this project, our client instructed us to concentrate on reducing cycle time and cost, both of which were critical to opening new markets for the fab.
In January 2007, we set to work on this collaborative effort. At its inception, the organization viewed the initiative to implement TPS as just another management flavor of the month. The prevailing attitude was to wait it out until the consultants left and things returned to normal. Thanks to the commitment, tenacity, and vision of the plant manager, this kind of resistance melted away.
To get the people at the fab to buy into our program, we formed a project team composed of eight people representing key functions, such as manufacturing management, equipment maintenance, finance, strategic planning, engineering, and fab floor personnel. As the first item of business, the team set a goal to cut costs by 12 percent and cycle time by 32 percent within the first six months. Moreover, we strove to build a learning organization that would be able to sustain the work long after we had left.
The next order of business was to train people in the Toyota system and in manufacturing science. In accordance with Toyota's principles, much of the training took place on the fab floor. The plant management played the role of new hires attempting to gain certification to process wafers in the fab under the supervision of a senior technician. In other words, they acted as apprentices. The basic premise behind this approach is that in order to mentor, coach, or teach someone to solve problems, you must have direct experience in solving them yourself.
Although at first this learning technique met with considerable resistance (and skepticism), it proved to be highly effective. Many members of the plant's management were very surprised, and exhausted, after experiencing what a typical day was really like for the people who actually make the product.
By August 2007, the organization had lowered cycle time in the fab by 67 percent and reduced costs by 12 percent. In addition, the number of products produced increased by 50 percent, and the production capacity increased by 10 percent, all without additional investment. If the fab continues on this journey of organizational learning and improves aspects such as equipment maintenance variability, we expect even bigger gains.
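One way to connect the cycle-time result to the idle inventory described earlier is Little's Law, a standard queueing result that this article does not itself invoke: in steady state, average work in process equals throughput times cycle time. The sketch below assumes steady state and, by default, unchanged wafer throughput; both are assumptions, and the function name is hypothetical.

```python
def wip_ratio(cycle_time_reduction, throughput_change=0.0):
    """Ratio of work in process after vs. before, by Little's Law.

    WIP = throughput * cycle_time, so the ratio is the product of the
    relative changes in throughput and in cycle time.
    """
    return (1.0 + throughput_change) * (1.0 - cycle_time_reduction)


# 67 percent cycle-time reduction (from the article), throughput assumed unchanged.
ratio = wip_ratio(0.67)
print(f"WIP after / WIP before = {ratio:.2f}")
```

Under those assumptions, the same output would be sustained with about a third of the wafers sitting in queues.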
The potential impact of the Toyota Production System is profound, because its improvements affect the general relationship between a factory's cost of additional production capacity and the average cost per unit. This relationship forms what economists call an economy-of-scale curve, and it applies to a number of capital-intensive businesses, including semiconductor and automobile manufacturing.
Let's examine the concept in detail. Imagine that it costs an average of $20 per chip to produce 2000 identical chips. If you then increase the volume to 4000 chips, the average unit cost drops to $12 per chip. Increase it further, say, to 6000 chips, and the cost per chip will drop to $10. This is a consequence of many factors, but mostly the rise in operational efficiency and manufacturing yields.
The major reason for increasing the size of a plant is to make full use of the lower unit cost that can be achieved at higher production volume, that is, economies of scale. Economies of scale exist when the factory's total capital and operating costs are increasing at a slower rate than its production volume.
Remember the old adage “You can't get something for nothing”? Well, there comes a point when you can't increase the output without making costs rise at an even faster rate. To take our earlier example, if you increase your output to 7000, the result may be that the cost per chip rises to $11. Increase the volume to 8000 and the cost per unit rises to $16 per chip. This rise comes because layers of management tend to grow as the workforce grows and because the burden of management increases as additional product types are assigned. Such “diseconomies of scale” explain the right-hand side of the U-shaped scale-to-cost curve [see graph, “TPS Lowers the Curve”].
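The example figures can be tabulated to show where the curve turns. The sketch below uses only the five volume-and-cost pairs from the example and applies the definition given above: a step shows economies of scale when total cost grows more slowly than volume. The variable and function names are illustrative.

```python
# Average cost per chip (US $) at each production volume, from the example.
unit_cost_by_volume = {2000: 20, 4000: 12, 6000: 10, 7000: 11, 8000: 16}


def classify_steps(cost_table):
    """Label each step between adjacent volumes as economy or diseconomy."""
    points = sorted(cost_table.items())
    labels = []
    for (v0, c0), (v1, c1) in zip(points, points[1:]):
        volume_growth = v1 / v0 - 1
        total_cost_growth = (v1 * c1) / (v0 * c0) - 1
        # Economies of scale: total cost rises more slowly than volume.
        if total_cost_growth < volume_growth:
            labels.append("economy")
        else:
            labels.append("diseconomy")
    return labels


steps = classify_steps(unit_cost_by_volume)
best_volume = min(unit_cost_by_volume, key=unit_cost_by_volume.get)
print("steps:", steps)
print("lowest unit cost at volume:", best_volume)
```

The first two steps are economies of scale and the last two are diseconomies, so the bottom of the U in this example sits at 6000 chips.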
Implementing TPS not only reduces the cost per unit at a given production volume, it also reduces the minimum number of units a fab needs to turn out to be cost-effective. That is, TPS moves the cost curve down and to the left.
Throughout the past 40 years, the only way to move the scale curve has been through the pursuit of Moore's Law, along with the enormous capital investment that this entails. Unfortunately, such spending pushes the curve not only down but also to the right [see graph, “New Opportunities for Profitable Growth”]. The result has been an increase in the minimum volume at which production is cost-effective. What all this means is that TPS will lower both the minimum cost and the volume of efficient production. When that happens, a lot of the great engineering ideas that have been shot down by the bean counters over the years will suddenly become attractive from a business perspective.
The new economics of semiconductor manufacturing now makes it possible to produce chips profitably in much smaller volumes. This effect may not be very important for the fabs that make huge numbers of high-performance chips, but then again, that segment will take up a declining share of the total market. This isn't because demand for those chips will shrink. Rather, demand will grow even faster for products that require chips with rapid time-to-market and lower costs, such as consumer electronics.
Competition is shifting toward a new playing field. Now what matters is making a large variety of products, each product in small volumes and each perhaps for only a short time. Examples of these growing markets include cellphones and MP3 players, which are subject to trends in fashion. Then there are the thousands of chips that are increasingly finding their way into our homes, offices, automobiles--and into every nook and cranny of our lives.
You often hear executives in the semiconductor industry sighing for the next great vehicle for industry growth, like the PC in the 1990s and the minicomputer before that. Well, perhaps the next killer application won't be one thing but rather scores or hundreds of things, none of which require the raw performance that only the biggest, most technically advanced fabs can provide. Perhaps what the next wave of killer apps requires is a new business model, made possible by such things as TPS.
Throughout history, business models that reduced the minimum effective size of factories have transformed entire industries. Steelmaking was transformed by the minimill's ability to efficiently produce small batches of steel, business computing by a succession of ever-smaller machines starting from mainframes for payrolls and ultimately leading to the personal computer, and photographic film processing by fully automated one-hour film-processing machines, which were then replaced by digital photography. Because these transformations offered customers entirely new ways of doing things--rather than simply making the existing model work a bit better--we call them disruptions. The agents of disruption are invariably business models (although these models often come with a new technology wrapped inside).
Toyota's system has transformed the automobile industry. Fifty years ago, the industry offered far fewer car models because its scale curve was high--you had to sell a lot of units of a given model to be cost-effective. For example, in the 1950s, Chevrolet sold 1.5 million Impalas per year, a number that was considered high but not extraordinary. Now the industry regards 250 000 units per year as high, and many models sell at only a fifth to a tenth of that rate.
This change came about because of the decline in the minimum economic scale of a car factory. Some companies have handled the transition better than others. GM, once the paragon of massive mass production, posted a record $39 billion loss for 2007, providing yet more evidence of how hard it can be to emulate Toyota.
But there's more. To gain full benefit from the advances made on the manufacturing side, you may also need to restructure product development and design, purchasing, marketing, service, and other aspects of your company. That is, you must create a new business model.
Consider how the emergence of standardized modular components has made it possible for a technically untrained person to select among them and order a precisely configured computer, which a company can assemble and deliver in three days. This business model, made famous by Dell, has created new markets, industries, and subindustries.
Now imagine this modular design idea being extended to semiconductor devices. If that happened, even MBAs might be capable of specifying the components of their very own chips to be delivered to their doorsteps. Well, maybe not MBAs, but you get the picture.
About the Author
CLAYTON M. CHRISTENSEN is part of the multidisciplinary team that wrote this month's feature on applying Toyota's production methods to semiconductor manufacturing. Christensen is a bestselling author and a professor at Harvard Business School. Semiconductor consultant STEVEN KING holds a bachelor's in engineering from Worcester Polytechnic Institute. MATT VERLINDEN, also a consultant, is a graduate of the MIT Sloan School of Management. WOODWARD YANG is a professor of electrical engineering and computer science at Harvard.
To Probe Further
To learn more about factory management science, see Operations, Strategy, and Technology: Pursuing the Competitive Edge, by Robert H. Hayes, Gary P. Pisano, David M. Upton, and Steven C. Wheelwright, Wiley 2004. For a detailed description of the method of continuous improvement described in this article, see “Decoding the DNA of the Toyota Production System,” by Steven Spear and H. Kent Bowen, Harvard Business Review, 1999.
The concept of a disruptive technology and how it affects entire industries is laid out in The Innovator's Dilemma, by Clayton M. Christensen, Harper Business Essentials, 2003.