Leslie Fiering is the computer coroner. An analyst at Gartner, the Stamford, Conn.-based technology-research firm, Fiering has the job of figuring out why some computers seem to last forever while others give up the ghost after only a few weeks. “Motherboards are the biggest problem,” she says, referring to the main circuit board found inside both desktop and laptop PCs. “More functions are being integrated into the motherboard, so if one component pops, the entire system can go down.”
Fiering notes that PCs, at least from a hardware perspective, are more reliable than ever. In fact, PC vendors have slashed annual failure rates (AFRs), the number of machines that experience a significant hardware malfunction during their first year, by about 25 percent over the past few years. On the other hand, AFRs are averages, which means that some enterprises will experience failure rates well above the published figure.
Measuring real-world PC durability can be tricky, however. Desktop machines are generally more reliable than on-the-move, heat-trapping notebooks. The smallest notebooks and handheld computers, which tend to be dropped and generally knocked about, are the most failure-prone. “Ultraportables tend to be a little more fragile than big, heavy machines,” Fiering says.
Contrary to prevailing wisdom, system pedigree has little impact on reliability. Matthew Wilkins, a senior analyst at iSuppli, an El Segundo, Calif.-based company that tracks PC component vendors, dismisses the notion that “brand name” PCs are inherently superior to systems made by lesser known vendors. “Both major PC OEMs [original equipment manufacturers] and ‘white box’ vendors have access to the same components,” he says. “You cannot really claim that a PC from a smaller white box vendor is less reliable.”
Fiering says CFOs are largely unaware of how system failures can affect acquisition and maintenance costs. Once CFOs begin to comprehend the importance of PC failure rates, she notes, especially in enterprises that buy thousands of PCs each year, they will likely put extra pressure on chief information officers to spot problems and hold PC suppliers responsible.
Yet many enterprises don’t even bother to track their PC failures, claims Rob Ayoub, an analyst at Frost & Sullivan, a technology research company headquartered in Palo Alto, Calif. “They’re flying blind,” he says. “To hold the vendor accountable, they need to at least note what failed, the surrounding circumstances, and the age of the machine.”
A lack of adequate benchmarking means that many enterprises don’t understand the full cost of PC failures. “Even if you’re under warranty and the system is repaired for free, there is still administrative overhead: dealing with the user, lost employee productivity, time spent restoring the data, and so [on],” says Fiering. “In the case of a notebook, there may be the expense of getting a restored system to an employee in a remote place.”
Fiering suggests that enterprises buying 5,000 or more machines at a time can gain control over system failure costs by negotiating a service-level agreement (SLA) that holds the vendor to a predetermined quality benchmark. “It’s a fairly new idea,” she says, adding that “clearly, the vendors are resistant to it.”
Financial remuneration shouldn’t be the goal when negotiating an SLA. Instead, Fiering recommends that enterprises negotiate a spare-system pool of one to three PCs for every 100 machines bought. If the SLA’s AFR cap is exceeded, the vendor becomes responsible for contributing additional machines to the pool. For enterprises that repair their own PCs, the vendor can instead be bound to supply extra spare parts as a buffer stock.
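The arithmetic behind such an SLA is straightforward. The sketch below, using the article’s rough figures (one to three spares per 100 machines, a vendor top-up when the agreed AFR cap is exceeded), is purely illustrative; the function names and the sample cap of 5 percent are assumptions, not terms from any actual vendor contract.

```python
def spare_pool_size(machines_bought: int, spares_per_100: int = 2) -> int:
    """Initial spare-system pool negotiated into the SLA.

    spares_per_100 would typically fall in the 1-3 range the article cites.
    """
    return machines_bought * spares_per_100 // 100


def vendor_topup(machines_bought: int, failures_year_one: int,
                 afr_cap: float) -> int:
    """Extra machines the vendor owes if the observed AFR exceeds the cap.

    afr_cap is a fraction, e.g. 0.05 for a hypothetical 5% annual cap.
    """
    allowed_failures = int(machines_bought * afr_cap)
    return max(0, failures_year_one - allowed_failures)


if __name__ == "__main__":
    bought = 5000  # the article's minimum deal size for negotiating an SLA
    print(spare_pool_size(bought))                   # 100 spares at 2 per 100
    print(vendor_topup(bought, 300, afr_cap=0.05))   # 50 extra machines owed
```

Under these assumed numbers, a 5,000-machine purchase starts with a pool of 100 spares, and 300 first-year failures against a 5 percent cap (250 allowed) would obligate the vendor to add 50 machines.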
Since a batch of bad components can cause trouble for any PC vendor, Fiering says, CFOs shouldn’t be too hard on CIOs who negotiate deals for machines that turn out to be lemons. “Look at the millions of batteries that have been recalled in the past few months from really high-end vendors,” she says. “I mean, stuff happens.”