pmhash - Man Page

general purpose hashing routines

C Synopsis

#include <pcp/pmapi.h>
#include <pcp/libpcp.h>

void __pmHashInit(__pmHashCtl *hcp);
int __pmHashPreAlloc(int hsize, __pmHashCtl *hcp);
int __pmHashAdd(unsigned int key, void *data, __pmHashCtl *hcp);
int __pmHashDel(unsigned int key, void *data, __pmHashCtl *hcp);
__pmHashNode *__pmHashSearch(unsigned int key, __pmHashCtl *hcp);
__pmHashNode *__pmHashWalk(__pmHashCtl *hcp, __pmHashWalkState state);
void __pmHashWalkCB(__pmHashWalkCallback cb, void *cdata, const __pmHashCtl *hcp);
void __pmHashFree(__pmHashCtl *hcp);
void __pmHashClear(__pmHashCtl *hcp);

cc ... -lpcp

Caveat

This documentation is intended for internal Performance Co-Pilot (PCP) developer use.

These interfaces are not part of the PCP APIs that are guaranteed to remain fixed across releases, and at some point in the future they may not work or may provide different semantics.

Description

The __pmHash group of routines implement a generalized suite of linear hashing services based on a key that is an unsigned int and data that is an opaque pointer to information the caller wishes to associate with key.

The data type of key makes is suitable for hashed access based on metric identifier (pmID), instance domain number (pmInDom) or internal instance identifiers within an instance domain.

Multiple hash tables may exist, each identified by a hash control struct (__pmHashCtl) hcp which is declared by the caller and initialized by calling __pmHashInit. Refer to the Hash Control and Hash Node sections below for more information on the hash table internal data structures.

The hash table is initially empty but dynamically grows by approximate doubling in size as entries are added, and this may cause some organizational overhead in relinking the hash chains when the hash table grows. This overhead can be avoided by optionally calling __pmHashPreAlloc with an initial hash table size of hsize entries, but this must be done after calling __pmHashInit and before any entries are added.

Entries are added to a hash table by calling __pmHashAdd. The opaque data is typically a pointer to a block of additional information to be associated with the key, although it may be NULL if there is no additional information required or currently available.

Although usually not required, duplicate key values can be stored in a hash table and __pmHashAdd will silently add these (presumably with a different data pointer). If uniqueness of keys is required, it is necessary to call __pmHashSearch first to determine that there is no entry for key, before calling __pmHashAdd.

Entries may be removed from a hash table using __pmHashDel where the entry to be deleted is the first one with a matching key and data pointer. See the note above about duplicate keys to understand why the data parameter is needed.

__pmHashSearch finds the first entry with a matching key. If duplicate keys are being stored, then the caller will have to follow the hp->next chain looking for additional entries with the same key. Refer to the Hash Control and Hash Node sections below for more information on the hash table internal data structures.

__pmHashWalk provides a stateful interface to walk each node in the hash table. It is first called with state set to PM_HASH_WALK_START to retrieve the first entry and then repeatedly called with state set to PM_HASH_WALK_NEXT to retrieve subsequent entries.

__pmHashWalkCB provides an alternative method to traverse the hash table. The callback function cb is called with two arguments, a pointer to the current hash entry and cdata (the latter allows the caller to pass auxiliary information into the callback function, but can be NULL if this is not required). The callback function must return one of the following __pmHashWalkState values:

PM_HASH_WALK_NEXT: continue the traversal
PM_HASH_WALK_STOP: terminate the traversal
PM_HASH_DELETE_NEXT: delete the current node from the hash table and continue the traversal
PM_HASH_DELETE_STOP: delete the current node from the hash table and terminate the traversal

__pmHashFree will release all storage associated with the hash table and return the hash table to the empty state, just like after __pmHashInit has been called. But __pmHashFree cannot free any storage attached via the data argument to __pmHashAdd calls. So the most appropriate way to clean up the hash table is to first traverse the table releasing any data and then call __pmHashFree as the example below shows.

__pmHashCtl	hash;

__pmHashWalkState
mycallback(const __pmHashNode *hp, void *cp)
{
    (void)cp;
    if (hp->data) {
	/*
	 * free() if malloc'd or some datum-specific
	 * method, e.g. __pmFreeProfile()
	 */
        free(hp->data);
    }
    return PM_HASH_WALK_NEXT;
}

    __pmHashWalkCB(mycallback, NULL, &hash);
    __pmHashFree(&hash);
}

__pmHashClear returns the hash table to the empty state, just like after __pmHashInit has been called. Beware that __pmHashClear does not release any storage associated with hash entries, and so risks leaking memory, however the following example shows how to release all memory in a single traversal of the hash table with __pmHashWalkCB before calling __pmHashClear.

__pmHashCtl	hash;

__pmHashWalkState
mycallback(const __pmHashNode *hp, void *cp)
{
    (void)cp;
    if (hp->data) {
	/*
	 * free() if malloc'd or some datum-specific
	 * method, e.g. __pmFreeProfile()
	 */
        free(hp->data);
    }
    /*
     * compared to the previous example, this difference
     * is important and frees each hash node
     */
    return PM_HASH_DELETE_NEXT;
}

    __pmHashWalkCB(mycallback, NULL, &hash);
    __pmHashClear(&hash);
}

Hash Control

The __pmHashCtl struct is defined as:

typedef struct __pmHashCtl {
    int                 nodes;
    int                 hsize;
    __pmHashNode        **hash;
    __pmHashNode        *next;
    unsigned int        index;
} __pmHashCtl;

The hash table hash contains hsize entries, each of which may point to a linked list of hash nodes. The total number of hash nodes is held in nodes. The index and next fields are used to maintain state during hash table walk operations.

Hash Node

The __pmHashNode struct is defined as:

typedef struct __pmHashNode {
    struct __pmHashNode *next;
    unsigned int        key;
    void                *data;
} __pmHashNode;

Each node holds the key, the opaque pointer (data) and next implements the linked list of hash nodes from each entry in the hash table.

Diagnostics and Return Values

__pmHashPreAlloc returns -1 if the hash table is not empty, else a value < 0 to indicate an error, probably from calloc(3), that can be turned into an error message by calling pmErrStr(3).

__pmHashAdd returns 1 for success, else a value < 0 to indicate an error, probably from calloc(3).

Return values from __pmHashDel are 0 if no matching entry is found, else 1 if a matching entry was deleted.

__pmHashSearch returns with a pointer to the entry if found, else NULL.

__pmHashWalk returns with a pointer to the next entry if found, else NULL when all entries have been traversed.

Referenced By

The man pages __pmHashAdd(3), __pmHashClear(3), __pmHashDel(3), __pmHashFree(3), __pmHashInit(3), __pmHashPreAlloc(3), __pmHashSearch(3), __pmHashWalk(3) and __pmHashWalkCB(3) are aliases of pmhash(3).

PCP Performance Co-Pilot