Pointers.

Pointers are at the heart of C. When you crack this subject, you have got the worst of C behind you. Before you tackle pointers though, you should get a grip on arrays.

Also:

Zeroth Principles.

The CPU accesses memory, and can read a value from, or write a value to, any address. It addresses the memory by address number only and writes the values as numbers only. For example, it might write the number 1 to the address 2. But as humans, programming the CPU based entirely on numbers is painful; we associate values with names, not numbers, as in "The total is 100" or "A=1". The C compiler helps us associate addresses with names. In fact, it automatically assigns the next available address to a name as soon as you declare it, and it sets aside enough space in memory for the size of the variable by advancing its "next available address" marker by that size. For example:

char c;

Allocates space for a single character. However,

int i;

saves room for an entire integer, however big that is on the system for which it's being compiled; anything from 2 to 8 bytes, 16 to 64 bits.

It's important to remember that c, or i, or whatever the name of the variable might be, is just a label for an address number. When we assign a value to the variable, the compiler is just putting that value into memory at the address assigned to the name.

If i was assigned to address 2, then

i = 1;

puts a 1 into the memory starting at address 2, using as many bytes as an integer requires. If an integer is 32 bits, or 4 bytes, then address 2 would hold 1, and addresses 3, 4, and 5 would all hold 0. Or... it might put 0 into addresses 2 through 4 and the 1 might end up at address 5. The difference is called "endianness" and it just describes the order multi-byte values are stored in. To make it easy for us, as we are used to writing numbers left to right, let's go with the second (0,0,0,1) version which is called "big endian". The first version (1,0,0,0) is called "little endian". Anyway...

By default, when the name is referenced, the C compiler uses it to reference the value at that address. We can "see" the address, instead of the value, by prefacing the name with an & symbol. If i was assigned to address 2, then &i will return the value 2; not the value at address 2, but the actual address: 2.

& returns the address of a normal variable.

Now, there is a different class of variable whose value is itself an address: The Pointer.

Pointers are variables that point to locations in memory. They are stored just like regular variables, with an address of their own, except that the value they hold is the address of another location in memory. So when we reference a pointer variable, by default, we get that second location, that "value which is an address". For example, if we define:

int* pi = &i;

and i was assigned address 2, so that &i resolves to 2, at which is stored the value 1, then pi will point to the address of i: pi will return 2 where i returns 1. However, just like regular variables, we can get to the other meaning with a prefix; in this case, it is *. So *pi will return 1; the value at the address pointed to by pi.

* returns the value at the address held in a pointer variable

Note that while regular, non-pointer, variables will require different amounts of space in memory depending on the size of the variable, pointers to data generally all take the same amount of space, because they are always holding a memory address. So on a typical 32 bit system, a char would take a single byte, an int would (most often) take 32 bits and a long long 64 bits, but a pointer to any of those will take the same 32 bits required to address a location in memory.

There is another, trickier way to get a pointer: define an array. For example:

#define STR_LEN 10
char pc[STR_LEN];

Will define a series of 10 characters which will be placed in memory one after the next; used in most expressions, the name pc then acts as a pointer to the first of them. We use a define to associate the text "STR_LEN" with the value 10 so that we can check anything going into pc to ensure it is not longer than 10 characters, including the \0 that ends a string. Failing to do this is a root cause of many crimes.

We can access the 2nd character of pc by asking for pc[1]. Note that the first character of pc is pc[0] as arrays are 0 indexed in C. If this is horribly confusing you can write pc[2-1] as a way of both knowing that you asked for the 2nd character, and addressing the fact that the C index is always 1 less.

Just like the * prefix, the [] suffix changes a pointer into a reference to its value. For example, you can also address the 2nd character of pc by asking for *(pc + 1) as this takes the starting address, pc, adds 1, then converts that into the value at that address.

[] returns the value at an array index

Note that C understands the size of the values stored at an address pointed to by a pointer. For example, an array of integers:

int pi[STR_LEN];

Would be 40 bytes long (10 integers, each 4 bytes long, assuming a 32 bit int). But the 2nd integer is still pi[1] and not pi[4]. And (get this) pi + 1 is an address 4 bytes past pi, not 1. (!)

After defining an array, it is often handy to define a second pointer into the array to help point to the elements of the array. This can be a little confusing, and isn't necessary, but on a simple compiler it can be a bit faster. For example:

char c[] = "Hello World\0";
char* pc = c;
...
for (int i = 0; c[i] != '\0'; i++) {
    printf("%c", c[i]);
}

will print out all the characters in c, which will be Hello World as expected. The \0 at the end signals the for loop to stop (a string literal already ends with a hidden \0, so the one typed explicitly here is redundant but harmless). However, on each pass through the loop, the CPU must check for that \0, multiply i by the size of a char (which is 1, so no multiply is actually needed), add that to the base address of c, get that value, and print it, then increment i. If we instead do:

while (*pc) {
    printf("%c", *pc++);
}

We get the same thing, but all we had to do was get the value at pc, check to see if it was zero, and if not, print it, and increment pc to point to the next character.

As we mentioned above, if the size of the data in the array is 1, this doesn't really save anything. But if you have an array of ints, or doubles, and your C compiler doesn't optimize it for you, it can make things a bit faster.


First Principles.

To understand pointers, it may be worth understanding how normal variables are stored.

What does the following program really mean?


        int main(void)
        {
          int Length;     /* storage reserved, value undefined */
          return 0;
        }

In my mind, it means, reserve enough storage to hold an integer and assign the variable name 'Length' to it. Assuming int is a 32 bit value, that would mean taking up 4 bytes. The data held in this storage is undefined. Graphically it looks like:


  Addr Data
  ---- ----
 | F1 | ??  <------- LENGTH --------- 
 | F2 | ??
 | F3 | ??
 | F4 | ??
 | F5 | ??  <------- (next free space)
   ...

To put a known value into 'Length' we code,


        int main(void)
        {
          int Length;
          Length = 20;
          return 0;
        }

The decimal value 20 (Hex 14) is placed into the storage location (shown here in big endian order).


  Addr Data
  ---- ----
 | F1 | 00  <------- LENGTH --------- 
 | F2 | 00
 | F3 | 00
 | F4 | 14
   ...

Finally, if the program is expanded to become


        #include <stdio.h>

        int main(void)
        {
          int Length;
          Length = 20;
          printf("Length is %d\n", Length);
          /* %p expects a void pointer, hence the cast */
          printf("Address of Length is %p\n", (void *)&Length);
          return 0;
        }
      

The output would look something like this .....

    
      Length is 20
      Address of Length is 0xF1
      

Please note the '&Length' in the second printf statement. The & means address of Length. If you are happy with this, you should push on to the pointers below.


Pointer definition.

A pointer contains an address that points to data. An example of code defining a pointer could be...


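        int main(void)
        {
          int Width;
          int* pWidth;
          pWidth = &Width;      /* pWidth now holds the address of Width */
          *pWidth = 34;         /* stores 34 in Width, via the pointer   */
          return 0;
        }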

A graphical representation could be...


  Addr Data
  ---- ----
 | F1 | 00  <------- Width --------- 
 | F2 | 00
 | F3 | 00
 | F4 | 22
 | F5 | 00  <------- pWidth --------- 
 | F6 | 00
 | F7 | 00
 | F8 | F1
   ...

Unlike the Length = 20 example above, the storage named 'pWidth' does NOT contain 34 (22 in Hex); it contains the address where the value 34 can be found. The final program is...


        #include <stdio.h>

        int main(void)
        {
          int Width;
          int* pWidth;
          pWidth = &Width;
          *pWidth = 34;

          /* %p expects a void pointer, hence the casts */
          printf("  Data stored at *pWidth is %d\n", *pWidth);
          printf("       Address of pWidth is %p\n", (void *)&pWidth);
          printf("Address stored at pWidth is %p\n", (void *)pWidth);
          return 0;
        }
				  

The program would output something like this.


          Data stored at *pWidth is 34
	       Address of pWidth is 0xF5
	Address stored at pWidth is 0xF1

A pointer can point to any data type, e.g. int, float, char. When defining a pointer you place an * (asterisk) character between the data type and the variable name. Here are a few examples.


        int main(void)
        {
          int    count;         /* an integer variable                    */
          int   *pcount;        /* a pointer to an integer variable       */
          float  miles;         /* a floating point variable              */
          float *m;             /* a pointer to a floating point variable */
          char   ans;           /* character variable                     */
          char  *charpointer;   /* pointer to a character variable        */
          return 0;
        }


Pointers to arrays

When looking at arrays we had a problem accessing the data within a two dimensional character array. This is what the code looked like.

        int main(void)
        {
          char colours[][6]={"red","green","blue"};
          return 0;
        }

The code above has defined an array of 3 rows, each a 6 character array holding one string and its terminating \0. We can access the individual characters with the following syntax.

	printf ("%c \n", colours[0][0]);

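and colours[0] can be passed to printf with %s to print a whole string, but every row takes up 6 bytes however short its string is. By using pointers as below, each string takes only the space it needs.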

        int main(void)
        {
          char *colours[]={"red","green","blue"};
          return 0;
        }

This now defines an array of 3 pointers, each pointing to the first character of a string stored elsewhere in memory. So

	printf("%s \n", colours[1]);

will print green.


Char Arrays versus Char pointers

What is the difference between these two lumps of code?

	
        #include <stdio.h>

        int main(void)
        {
          char colour[]="red";
          printf("%s \n",colour);
          return 0;
        }


        #include <stdio.h>

        int main(void)
        {
          char *colour="red";
          printf("%s \n",colour);
          return 0;
        }

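The answer is, not a great deal, at first sight! They both print the word red because in both cases 'printf' is being passed a pointer to a string. The difference is in how the storage is defined. The first reserves a 4 byte array of its own (r, e, d and the terminating \0) which the program may change. The second reserves only a pointer, aimed at a string constant which must not be modified, although the pointer itself can be aimed elsewhere.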

The pointer can also point to dynamically allocated memory. See the malloc function for details.


Void Pointers

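There are times when you write a function but do not know the datatype of the data it will be given or will return. When this is the case, you can use a void pointer (void *), which can hold the address of any type of data but must be converted back to a typed pointer before the data can be read.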


Pointers to pointers

To be done ....


Pointers to functions

Here are a few examples. Simple example. Example passing 'int' variables. Example passing 'char' and 'char *' variables.


See Also:

VOID keyword. function arguments. linked lists. Strings. Arrays.




Martin Leslie 06-Feb-96