Linked Lists

A Tutorial by John Findlay

Creating a new Node
 
Adding a Node to the list
 
Searching through the nodes
 
Removing a node from the list

 

Linked list's are extensively used throughout the programming industry and are particularly useful when one needs an expandable list.

What is a linked list? A linked list is a collection of structures called nodes that are linked by one member of the structure pointing to the next structure in the list. For a double linked list the structure uses two members, one pointing to the previous node and one pointing to the next node.

At least one end of the list needs to be terminated in some way so that the end of the list can be determined. This termination node can be terminated anyway you choose but normally a NULL pointer is used meaning that there is no next node or no pervious node in the double linked list.

Below is a diagram that shows a visual representation of a single linked list together with a minimum structure that could be used to implement a list. The diagram does not show the end node which would have a null pointer.

Single Linked List.

typedef struct tagNODE {
    NODE * pNext;         // pointer to next node
} NODE;

Of course that structure is not very useful because it doesn't hold any information except the pointer to the next node. Normally one would have perhaps a pointer to some data or even the data itself in each node.

As well as a single linked list one can use a double linked list - below is a representation of both the structure and a diagram showing the relationship between nodes. This diagram does show the end nodes, each having a null pointer.

Double Linked List.

typedef struct tagNODE {
    NODE * pPrev;         // pointer to previous node
    NODE * pNext;         // pointer to next node
} NODE;

This tutorial will use the double linked list as this is the most complicated to implement - having understood the double linked list there will be no problems implementing a single linked list.

The example accompanying this tutorial will be my tmalloc library that can be found in jlib in the libraries section of this web site. tmalloc uses a double linked list to log all calls to malloc, calloc and realloc. As each log occurs the list has to be expanded by adding a new node to the double linked list. 

The tricky part of managing a listing list is adding and removing nodes and searching the list's nodes, this is where most beginners find problems. The tmalloc example strictly speaking could be done with a single linked list but we will implement a double linked list for the reason stated earlier, that is, it is the more difficult of the two.

The tmalloc library works by replacing the standard defines of malloc, calloc and realloc with tmalloc, tcalloc and trealloc where the double linked list is used to log all calls to those run time library functions.

Let's start with the user of the tmalloc library calling malloc. Remember malloc has been re-defined as tmalloc so the call is redirected. First, here's the node structure used by the tmalloc library's linked list.

typedef struct tagNODE {
    NODE * pPrev;
    NODE * pNext;
    size_t size; 
    int    type;
    char   filename[32]; 
    int    linenum;
    void * pMem;
} NODE;

pPrev        previous node in heap
pNext        next node in heap
size         size of requested block of memory
type         "MALLOC = 1", "CALLOC = 2", "REALLOC = 3"
filename     file/module name where the call was made
linenum      line number where the call was made
pMem         pointer to user memory (returned to user)

malloc is re-defined like so in tmalloc.h

#define malloc(n) tmalloc( n, __FILE__, __LINE__ )

when the user calls malloc, tmalloc is called instead

char * myMem = malloc(value); 

actually calls tmalloc(value, __FILE__, __LINE__);

 

Creating a new Node

Here is the tmalloc function

void * tmalloc( size_t size, char * pFile, int nLine )
{
    void * ret;
    NODE * pNode;

    pNode = malloc( sizeof(NODE) + size );
    if (pNode) {
        AppendToLinkedList( pNode );
        pNode->size = size;
        pNode->type = MALLOC;
        strnCpy(pNode->filename, ExtractFileName(pFile), 30);
        pNode->linenum = nLine;
        // The user mem is + sizeof(NODE)
        // it will be on a 4 byte boundary
        pNode->pMem = pNode+1;
        ret = pNode->pMem;
    }else {
        ret = NULL;
    }

    return ret;
}

Note: strnCpy is an own function that will always append a null char at the last position of the copy. (position 30 in this case)

First tmalloc tries to allocate the requested size plus the size of the NODE structure. If successful the node is passed to AppendToLinkedList() where the nodes are linked together. Then various information is stored in the node structure. The size of the requested block, which call was used to allocate the memory (MALLOC, CALLOC or REALLOC) , the name of the translation unit and the line number where the call to malloc (tmalloc) was made.

The value returned to the caller (pNode->pMem) is the address of the start of the allocated memory block plus an offset of sizeof(NODE). So each node stores information about the call to malloc (tmalloc) and the returned address to the caller. The return address is the address where the user stores his/her data as usual when allocating memory with malloc. If the call to malloc fails, NULL is returned.

 

Adding a Node to the list

Here is the call to append the node to the list.

static NODE * pHeadNode = NULL;

static void AppendToLinkedList( NODE * pCurrent )
{
    if (!pHeadNode) {              // Add first node
        pCurrent->pPrev = NULL;    // no previous node
        pCurrent->pNext = NULL;    // no next node
    } else { // else other nodes
        pCurrent->pPrev  = pHeadNode;
        pHeadNode->pNext = pCurrent;
        pCurrent->pNext  = NULL;    // no more nodes
    }
    // make current node head of list
    pHeadNode = pCurrent;
}

If pHeadNode is NULL, which will be the case for the first call to malloc, calloc or realloc, we set ->pPrev and ->pNext both to NULL. This means there is only one node in the list and just before we exit the function pHeadNode is made to equal the current node being added to the list.

Else we add a node that is not the first node to the list. So, ->pPrev must now point to the previous node which was stored in the variable pHeadNode and the previous node's pointer to the next node, ->pNext, must point to this current node. Then the current node's pointer to the next node must be set to NULL.

    pCurrent->pPrev  = pHeadNode;
    pHeadNode->pNext = pCurrent;
    pCurrent->pNext  = NULL;    // no more nodes

Lastly, as when the first node was being added, pHeadNode is made equal to the current node. 

This process is repeated for each node added to the list. If the preceding is not clear go through it again until it is clear.

 

Searching through the nodes

One needs to be able to find any particular node in the linked list - the tmalloc library uses the function called IsValidPointer() to search through the list before accessing a particular node.

Basically, searching through the nodes requires starting at one end and traversing through the nodes until the right one is found. This is very fast as the operation only requires setting a pointer, in this case to the previous node. One can of course have the head node at the beginning and search through by using the pointer to the next node.

static BOOL IsValidPointer( void * pMem )
{
    BOOL fFlag = FALSE;

    NODE * pFindNode = (NODE*)pMem-1;

    NODE * pNode = pHeadNode;
    // walk through the list to find the node
    while( pNode ){
        if ( pNode == pFindNode ){
            fFlag = TRUE;
            break;
        }
        pNode = pNode->pPrev;
    }
    return fFlag; // if not found return FALSE
}

In our case, implementing tmalloc, the pointer to user memory is passed, so the first operation retrieves the pointer to the node for the user memory block.

    NODE * pFindNode = (NODE*)pMem-1;

Then we traverse the nodes until the right node is found using a while loop, exiting when found and returning TRUE, if the node is not found we return FALSE.

The node is a pointer to struct NODE that contains all the information we stored earlier when creating the node. The routine IsValidPointer() is also used in the tmalloc library when the caller requests to free the memory, if the node does not match any node in the list the free operation is aborted and a message box is displayed informing the user that the memory can't be free-ed.

 

Removing a node from the list

When the user of the tmalloc library free's a block of memory a node should be removed from the list. This is accomplished by adjusting the ->pPrev and ->pNext pointers and is probably the most tricky part of managing a linked list. This is done when a call to free() (tfree) is used.

First a call is made to IsValidPointer, trying to free a pointer that has not been allocated will often cause a crash, depending on what OS you are using.

Then the address of the node is calculated from the pointer to the user memory and passed to RemoveFromLinkedList().

void tfree( void * pMem, char * pFile, int nLine )
{
    // free accepts a NULL pointer.
    if (!pMem)
        return;

    if (IsValidPointer(pMem)) {
        NODE * pNode = (NODE*)pMem-1;
        RemoveFromLinkedList( pNode );
        free(pNode);
    }else{
        char str[256], str1[32];
        strcpy(str, "Trying to free from invalid pointer from -> ");
        strnCpy(str1, ExtractFileName(pFile), 30);
            sprintf( str + strlen(str), "%s %s %d\n\r\n\r%s", str1, "at line ",
            nLine, "Aborting operation!" );
        MessageBox(NULL, str, "TMalloc Error. . . ", MB_OK);
    }
}

Here is the RemoveFromLinkedList function.

static void RemoveFromLinkedList( NODE * pNode )
{
    if (!pNode->pNext && !pNode->pPrev){ // only one node left
        pHeadNode = NULL;                // no nodes left
        return;
    }else if (!pNode->pPrev){            // delete first node
        (pNode->pNext)->pPrev = NULL;
    }else if (!pNode->pNext){            // delete last node
        pHeadNode = pNode->pPrev;
        (pNode->pPrev)->pNext = NULL;
    }else{                               // delete other nodes
        (pNode->pPrev)->pNext = pNode->pNext;
        (pNode->pNext)->pPrev = pNode->pPrev;
    }
}

As you can see it is a little more complex than the other calls to add or search the nodes. 

IF the ->pPrev and ->pNext pointers are both NULL there is only one node, set the pHeadNode to NULL and return.

ELSE IF there is no previous node is must be the first node so set the next node's ->pPrev to NULL.

ELSE IF there is no next node is must be the last node so set pHeadNode to previous node and set the previous nodes ->pNext to NULL.

ELSE set the previous node's ->pNext to this nodes ->pNext and set next node's ->pPrev to this node's ->pPrev.

As said before, if the preceding is not clear go through it again until it is clear.

Please take a look at the tmalloc library sources - run the code and see how this linked list is implemented.

tmalloc sources


END

Back to main page