Bugs In Firmware Are Here To Stay - Can Companies Deal With It?

The co-CEO of Research in Motion, Jim Balsillie, has admitted to the Washington Post that the recent release of the Blackberry Storm was buggy, and that they knew it. It was pushed out to make sure it was in the shops for Black Friday – one of America's biggest days for consumer electronics sales – after the planned shipping date in October was missed. And he ominously warned that shipping with imperfect software was the future of electronics. He's right – and let me explain why.

New devices being launched with firmware that isn't quite right, in the hope that it will be fixed with some post-release updates? Doesn't that sound familiar to Symbian users? Indeed it does, and while I think Balsillie's phrasing was a touch melodramatic, what he says is certainly true.

The days of a perfect operating system at launch are long gone – even Palm, when they released their first organiser back in 1996, had to ship it with a software patch.

Why is this? It's to do with complexity. The modern mobile operating system is huge, measured in tens of megabytes. That's a far cry from ten years ago, when the rock solid reputations of Palm OS and Psion's SIBO (a forerunner to Symbian OS) were made. They had very few external devices to cope with beyond a serial port and user input via either the screen or the keyboard. Now you have multiple ways of connecting your smartphone to other computers and networks – Bluetooth, sometimes infrared, GSM, GPRS, EDGE, 3G, Wi-Fi, NFC... That adds a lot of complexity to the inputs that the OS has to handle.

Bundled applications are perceived as part of the OS, and rightly so. If the Contacts app doesn't work, how well a call sounds doesn't matter, because you can't get to Auntie Jean's phone number to call her. That means that as well as the OS, all the applications need testing to ensure they don't do anything silly, that they act as the user should reasonably expect, and don't interfere with other programs or tasks.

Firmware updating - Nokia Style

That's a big task. With around 50 applications, just sorting out the interactions between one program and any combination of the others is potentially 2 to the power of 49, which is a rather large 563 trillion combinations. Now test all that again, but with the MP3 player running in the background.
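
As a back-of-the-envelope check on that scale, here is a minimal Python sketch. It assumes 50 bundled applications, as in the text, and the function names are purely illustrative. It counts two things: the distinct pairs of programs, and the combinations of the other 49 programs that could be running alongside any one of them (2 to the power of 49, or roughly 5.6 x 10^14).

```python
from math import comb

APP_COUNT = 50  # assumption: roughly 50 bundled applications, as in the text


def pair_count(apps: int) -> int:
    """Number of distinct pairs of applications."""
    return comb(apps, 2)


def background_combinations(apps: int) -> int:
    """Number of subsets of the *other* applications that could be running
    alongside any one application (each one is either running or not)."""
    return 2 ** (apps - 1)


if __name__ == "__main__":
    print(f"{pair_count(APP_COUNT):,} pairs")
    print(f"{background_combinations(APP_COUNT):,} background combinations")
```

Checking every pair is a tractable job; checking every combination is not, and that second number is the one that defeats brute force testing.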

There's no way to brute force check that a smartphone will work; the maths is just too much. Then start adding in unpredictable users, varying conditions for connectivity, and the physical changes that the weather can bring on an icy day compared to the summer sun... A certain amount of conjecture has to be employed, along with programs designed to catch errors and be graceful about them.

I say this not to give anyone a free pass because the task is complicated (all the worthwhile ones usually are), but to give you some idea of the scale and complexity involved. Now think about how few failures there are relative to the complexity of your smartphone and you'll begin to see that they are actually incredibly stable.

Smartphones will be released with problems in the firmware – the problem comes with human nature, and two areas can have a dramatic effect on the perception of a device. The first is the release schedule. Everyone wants a stable phone, but the real world gets in the way. Deadlines have to be met for marketing, for sales teams and for the press wanting early review units. If there is a “date of no return”, such as RIM experienced with Black Friday, then the device is going to go out whatever its condition, and the team will have to work on the problems while it is in the stores and in end-users' hands.

How much will RIM lose in bad press over the Storm? And how much would they have lost if they had missed the Black Friday sales and taken a PR hit for not having the device out? That's not an engineering call; it's a management call, and one that every company has to face up to.

What the company does when bugs are found 'in the wild' is the second area, and the one where things need to change. Rather than hiding everything behind a wall of silence and then, at some almost random point in time, providing a new version with no indication of what has changed, there needs to be a little bit more honesty and openness.

We can see what the bugs are – we have the phone and the bugs are in front of us. What we want to know is (a) that you know about them and (b) when we are likely to see a fix for them. Yes, it will mean admitting you are not perfect, but as the Open Source world has shown, being open about bugs builds more trust than hiding them away. When a new firmware is released, supply a changelog publicly, rather than having one leaked in the feeding frenzy of hacker web sites.

And make sure that people know about the updates and can get them. It is a business reality that each network gets a custom build of the firmware for Nokia devices, but that makes it a pain to update the devices with recent firmwares, because the network has to control the roll-out. This is one area where Apple have got it right – they control the roll-out of the firmwares across all devices worldwide and everyone gets them at the same time.

The situation has certainly improved over the last few years, with over the air firmware updates being the big breakthrough. But there are still many issues, including presentation, implementation and the perception that updates are technically hard to do. Both the innovation and the process need to improve and become more user friendly, and a touch of education couldn't hurt either.

Firmwares will be released with errors – that's not a slur on a company. It's what they do after the release that's important.

-- Ewan Spence, Jan 2009.


Published by Ewan Spence at 10:23 UTC, January 28th
