Saturday, January 3, 2009

What killed the Zune 30?

We've been hearing a lot about Microsoft Zune 30s freezing. Microsoft has now said that the cause was a leap year bug.

And indeed it is! From Pastie (start at line 249):

//------------------------------------------------------------------
//
// Function: ConvertDays
//
// Local helper function that split total days since Jan 1, ORIGINYEAR into
// year, month and day
//
// Parameters:
//
// Returns:
// Returns TRUE if successful, otherwise returns FALSE.
//
//------------------------------------------------------------------
BOOL ConvertDays(UINT32 days, SYSTEMTIME* lpTime)
{
    int dayofweek, month, year;
    UINT8 *month_tab;

    //Calculate current day of the week
    dayofweek = GetDayOfWeek(days);

    year = ORIGINYEAR;

    while (days > 365)
    {
        if (IsLeapYear(year))
        {
            if (days > 366)
            {
                days -= 366;
                year += 1;
            }
        }
        else
        {
            days -= 365;
            year += 1;
        }
    }


    // Determine whether it is a leap year
    month_tab = (UINT8 *)((IsLeapYear(year))? monthtable_leap : monthtable);

    for (month=0; month<12; month++)
    {
        if (days <= month_tab[month])
            break;
        days -= month_tab[month];
    }

    month += 1;

    lpTime->wDay = days;
    lpTime->wDayOfWeek = dayofweek;
    lpTime->wMonth = month;
    lpTime->wYear = year;

    return TRUE;
}


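Why is this bad? Well, 2008 was a leap year, so it had 366 days. On December 31, 2008, once the loop had counted its way up from ORIGINYEAR to 2008, days was exactly 366. Let's step through the lines of code that caused the problem, with those values substituted in.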

    //Calculate current day of the week
    dayofweek = GetDayOfWeek(366);

    year = 2008;

    while (366 > 365)
    {
        if (IsLeapYear(2008))
        {
            if (366 > 366)
            {
                days -= 366;
                year += 1;
            }
        }
        else
        {
            days -= 365;
            year += 1;
        }
    }

As you can see, the while loop condition is true: the day is day 366, and that's greater than 365. And yes, 2008 is a leap year. But 366 will never be greater than 366, so the inner if is skipped. It has no else branch, so neither days nor year changes.

Therefore the loop condition never evaluates to false, and the loop runs forever. Your Zune hangs.
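One way to get rid of the hang, as a minimal sketch and not the fix that anyone actually shipped: compare days against the length of the current year and leave the loop as soon as it fits. Here `year_from_days` and `is_leap_year` are hypothetical stand-ins for the year loop in ConvertDays and for IsLeapYear, the value 1980 for ORIGINYEAR is an assumption, and so is the idea that days counts from 1 (day 1 being Jan 1 of ORIGINYEAR).

```c
#include <assert.h>

#define ORIGINYEAR 1980  /* assumption: the excerpt does not show this value */

/* Hypothetical stand-in for IsLeapYear(): Gregorian leap year rule. */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/*
 * Hypothetical rewrite of the year loop in ConvertDays.
 * On entry *days is a 1-based day count from Jan 1, ORIGINYEAR.
 * On return *days is the 1-based day within the returned year.
 */
static int year_from_days(unsigned int *days)
{
    int year = ORIGINYEAR;

    assert(*days >= 1);  /* assumed precondition: day 1 is Jan 1, ORIGINYEAR */

    for (;;) {
        unsigned int year_len = is_leap_year(year) ? 366u : 365u;

        if (*days <= year_len)
            break;              /* includes day 366 of a leap year */

        *days -= year_len;
        year += 1;
    }
    return year;
}
```

Under these assumptions December 31, 2008 is day count 10593. For that input the sketch returns 2008 and leaves 366 in `*days`, where the original loop never returns.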

Guess Freescale, the makers of the MC13783 (the power management chip whose real-time clock this driver code is for), had a programmer who didn't understand boundary conditions.

Update: Another blogger has now gone and suggested a few bug fixes for the Zune issue. Nice going :-)

2 comments:

  1. Although the code can be written more efficiently, it's a classic example of the off-by-1 class of errors.

    Bugs like these are easily caught by unit tests. For this function, the unit test would be fairly easy to write and it would be fairly easy to completely test the next hundred years or so.

  2. take a look at test driving the bug out of the zune code: http://www.renaissancesoftware.net/blog/archives/38

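The first commenter's point about unit tests can be made concrete with a sweep over a century of day counts. This is only a sketch under stated assumptions, not the test from the linked post: `year_loop_terminates` is a hypothetical copy of the original year loop with an iteration cap added, so that a hang shows up as a failed check instead of freezing the test run. `MAX_STEPS`, `is_leap_year`, the value 1980 for ORIGINYEAR and the 1-based day count are all assumptions.

```c
#include <assert.h>

#define ORIGINYEAR 1980   /* assumption: the excerpt does not show this value */
#define MAX_STEPS  1000   /* hypothetical cap, far above the ~100 iterations a century needs */

/* Hypothetical stand-in for IsLeapYear(): Gregorian leap year rule. */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Original year loop plus a step cap. Returns 1 if the loop exits, 0 if it is stuck. */
static int year_loop_terminates(unsigned int days)
{
    int year = ORIGINYEAR;
    int steps = 0;

    while (days > 365) {
        if (++steps > MAX_STEPS)
            return 0;               /* treat as an infinite loop */

        if (is_leap_year(year)) {
            if (days > 366) {
                days -= 366;
                year += 1;
            }
        } else {
            days -= 365;
            year += 1;
        }
    }
    return 1;
}

/* Try every day count in ORIGINYEAR .. ORIGINYEAR+99.
   Returns the first day count that hangs, or 0 if none does. */
static unsigned int first_hanging_day(void)
{
    unsigned int total = 0;

    for (int y = ORIGINYEAR; y < ORIGINYEAR + 100; y++)
        total += is_leap_year(y) ? 366u : 365u;

    for (unsigned int d = 1; d <= total; d++)
        if (!year_loop_terminates(d))
            return d;

    return 0;
}

/* The unit test itself: it passes only if no day count in the century hangs. */
void test_year_loop_over_century(void)
{
    assert(first_hanging_day() == 0);
}
```

With these assumptions the test fails straight away: the first day count that gets stuck is 366, the last day of 1980, and the same thing happens on the last day of every later leap year, 2008 included.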