2010-08-09

Counting from Zero

A pal new to programming wrote me:
I'm reading that C programming book you bought me and I'm very annoyed by this counting starting at zero. Can you explain?

I remember annoying people in class counting one as zero.

I'm a little cranky this morning.

My reply…

Keep in mind that C is dumb. And old.
In the old days, computers had little memory. Computers were slow. Compilers were simple with limited capabilities. So, C was built to meet the needs of computers, not humans.

C counts from zero, most visibly when accessing arrays. The computer languages that survived over the decades tend to follow in the footsteps of C, to one degree or another, so many of them count from zero too.

The issue is:
  • Zero-based vs One-based
  • Index vs Ordinal

Imagine you are on your iPhone, standing on the corner of 1st and Main in Mukilteo, calling Bob to ask which house on the block is his. He could say either:

  • Go to the 3rd house. (ordinal, by position in order)
  • Move your body forward, deeper into the block, by 2 more houses. (index, moving a cursor n units along a group)


    An array in C, and in most languages that follow it, is a contiguous block of memory, a group of octets in most modern computers. Contiguous is the operative word. In C, the programmer accesses the data by telling the computer two things:

    (a) Tell where the beginning of the block of memory is located. The name of the array does that, instructing the computer to move its attention to a certain position in memory.

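    (b) Tell how many octets to move forward from that starting point. If you want the first octet, you say "Move 0" because the computer is already at the first octet. If you want the second octet, you say "Move forward 1 unit" and that gets you to the 2nd octet. (Strictly, C counts the move in items rather than octets; in the example here each item fills exactly one octet, so the two are the same.)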
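Steps (a) and (b) can be sketched in C roughly as follows. The helper name `item_at` and its parameters are made up for illustration; the point is only that "start plus a number of jumps" is what an index means.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the item that sits `jumps` steps past
 * the start of a block holding `count` one-octet items. */
unsigned char item_at(const unsigned char *start, size_t count, size_t jumps)
{
    assert(jumps < count);    /* stay inside the block */
    return *(start + jumps);  /* C treats this the same as start[jumps] */
}
```

Asking for `item_at(myData, 8, 0)` makes no jumps and returns the first item; `item_at(myData, 8, 1)` makes one jump and returns the second.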



    Say we have an array of 8 small numbers in an array named "myData". Say the array starts with octet # 4,054. This array uses octets numbered 4054-4061 as pictured above.

    Programmers usually think of that name "myData" as containing the array. But actually it doesn't contain the array; it just leads us *to* the array. That variable "myData" is just a single number (using 2 octets in this example, # 2,011 & 2,012), the number of the memory location where the array begins. It leads us to the first bit of the first octet of our array. Note that the arrow above points to the front of the first octet, not the contents inside.
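A rough C sketch of "the name leads us to the array" follows. One caveat: in C the array name itself is not stored as a separate number; it is converted to the address of the first item in most expressions. The separate pointer variable `where` (a name assumed here) is the closest match to the stored location described above, and `octets_from_start` is likewise a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Eight small numbers, one octet each. */
unsigned char myData[8] = {3, 1, 4, 1, 5, 9, 2, 6};

/* A separate variable that holds only the address where myData begins. */
const unsigned char *where = myData;

/* How many octets lie between the start of the block and a given item. */
size_t octets_from_start(const unsigned char *start, const unsigned char *item)
{
    assert(item >= start);  /* the item must not sit before the start */
    return (size_t)(item - start);
}
```

Here `where` and `&myData[0]` are the same address, and `octets_from_start(myData, &myData[2])` is 2: the third item sits two octets past the beginning.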

    Begins is another operative word. If we have the beginning of a contiguous group, then we can bounce to and fro, reaching any item in the group. We just need to know how many jumps to make. If you want the first item, you make no jumps (index = 0). If you want the 3rd item, you make two jumps (index = 2).

    Think of each octet as a house on Bob's block. If you are standing at the corner, you are already at the first house. If Bob lives in that first house, you don't have to walk further, that is, no jumps required (index = 0). If Bob lives at the 3rd house, you need 2 jumps (index = 2).

    What makes arrays fast and simple for the computer is this jumping around: jumping x octets forward or backward to reach data, with no other tracking, considerations, or overhead.

    If we access by an index number (zero-based counting), we are asking to move our attention/cursor/marker inwards so many units into a group. If we access by an ordinal number (one-based counting), we are asking for a particular position within that group.
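The two ways of counting differ by exactly one, which a pair of small C functions can show. The function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Ordinal ("the 3rd house") to index ("2 jumps from the corner"). */
size_t ordinal_to_index(size_t ordinal)
{
    assert(ordinal >= 1);  /* there is no "0th" position */
    return ordinal - 1;
}

/* Index ("2 jumps from the corner") back to ordinal ("the 3rd house"). */
size_t index_to_ordinal(size_t index)
{
    return index + 1;
}
```

So Bob's 3rd house is `ordinal_to_index(3)`, which is index 2, and index 0 is the 1st house.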

    Do data groups have to be zero-based? No.

    Modern languages and frameworks, such as REALbasic, Java, and Cocoa, offer other data structures besides arrays, sometimes known as "Collections". Some collections are ordinal, one-based; others, such as Java's List, still count from zero. These structures take more memory and run slower than a bare array, but today's computers can afford that. Collections are much easier on human brains, so they make programmers more productive and help them write better code, at the cost of more work for the computer.
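To make the extra work concrete, here is a minimal sketch of a one-based structure layered over a plain C array. The type `OneBasedList` and the function `one_based_get` are hypothetical and not taken from any real library; real collections do considerably more, but the shape is similar: remember the count, check the request, then translate the ordinal into an index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-based wrapper: the raw block plus a count to track. */
typedef struct {
    const unsigned char *items;  /* where the block begins */
    size_t count;                /* how many items it holds */
} OneBasedList;

/* Fetch the item at a one-based position. Returns false when the
 * position is outside 1..count; otherwise stores the item in *out. */
bool one_based_get(const OneBasedList *list, size_t ordinal, unsigned char *out)
{
    assert(list != NULL && out != NULL);
    if (ordinal < 1 || ordinal > list->count) {
        return false;                    /* the check a bare array skips */
    }
    *out = list->items[ordinal - 1];     /* ordinal to index, then jump */
    return true;
}
```

The struct carries the count alongside the data (more memory) and every access pays for a range check and a subtraction (more work), which is the trade described above.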
