hitcounter
dpointer
Sunday, July 11, 2004
 
Column Label of Spreadsheet

If you once work with spreadsheet application, like Microsoft Excel, you'll notice that unlike rows, columns are labelled not by numbers, but by text. Column 1 is A, column 2 is B, and so on. Since we have only A to Z, column 26 is still Z but the next one would be a two-character label, i.e starts from AA, AB, and so on. When this reaches ZZ, the label starts over with AAA, AAB, until ZZZ, and restarts over, and so on, and so on.

So if I give you a column index, say column 1977, how could you tell me its label? The following code is the answer:

QString columnLabel( unsigned column )
{
  QString str;
  unsigned digits = 1;
  unsigned offset = 0;
  
  column--;
  for( unsigned limit = 26; column >= limit+offset; limit *= 26, digits++ )
    offset += limit;
      
  for( unsigned c = column - offset; digits; --digits, c/=26 )
    str.prepend( QChar( 'A' + (c%26) ) );
    
  return str;
}

So it looks a lot simpler than it could be, right? Note that I wrote the code with Qt's QString and QChar classes in mind, because at the time I did it for KSpread. So column 1977 is actually BXA.

The corner case of course when there's integer overflow duing limit calculation. If you track the algorithm, you'll know that this means this code fragment works find until column 4058115285 (which is MCNIOCC). I doubt that at the moment there's an application which can handle so big worksheet, so in the mean time we're on the safe side.

The task to improve the code, for example handling the above mentioned corner case (possible solution: switch to floating-point number, with performance sacrifice), porting it to pure C, C++ string class, or even glib's g_string, are as usual given as exercises for the reader.


Comments:
With the hip-and-hurra of UNICODE -- which pushes even variable name into some cryptics style (Arabic, Hebrew, Asia Letters, and so on) -- it is not so crazy idea to implement column name with broader set of new letter. So not only 26 ASCII that by some peoples thought as too US-centric. For example, how if column is named with alpha, betha, gamma, etc. with their own native symbol? Maybe this is not bad idea and not only lucky for Greek, but also for most mathematicians who daily deal with those symbols.
 
Years ago, Lotus Improv had used the idea of naming columns in a more reasonable manner (e.g SALES, PRICE, TOTAL, etC), along with its unique and different spreadsheet point of view. Unfortunately, Improv refused to exist anymore and yet nobody comes up with similar approach.
 
Hi,

I am average programmer, and mostly work with C, so pardon the question:

I notice that the function returns a local variable str.
Is this safe in C++? Doesn't it go out of scope?

Rgds,
Rachman
 
str will go out of scope, but a copy of it will be created as the return value. Since QString uses shallow-copy trick, this is fast and still memory-efficient. If you look at Qt API, there are thousands of member functions returning a QString.
 
Post a Comment

<< Home

Powered by Blogger