mysql内核分析--innodb哈希表的内部实现(上)

cooldatabase

浏览: 203595 次
性别:
来自: 杭州

最近访客更多访客>>

shouwang361

dongguangming88

qiuxia812913

liexusong001

博主相关

博客

微博

相册

留言

关于我

文章分类

社区版块

存档分类

2010-04 ( 23)
2010-03 ( 94)
2009-12 ( 26)
更多存档...

博客分类：

mysql

MySQL 数据结构算法 .net Blog

1.哈希表的概述

   hash表的实现是innodb的基础功能之一，通过关键值进行映射，从而迅速进行查询、插入、删除的操作。

   hash表算法，在数据库内核里面被广泛的使用，举个例子，这个结构将会在下文中继续使用的。

/* Data structure for a column in a table */

struct dict_col_struct{

       hash_node_t   hash;       /* hash chain node */

       ulint        ind; /* table column position (they are numbered

                            starting from 0) */

       ulint        clust_pos;/* position of the column in the

                            clustered index */

       ulint        ord_part;/* count of how many times this column

                            appears in ordering fields of an index */

       const char*    name;      /* name */

       dtype_t           type;       /* data type */

       dict_table_t*   table;       /* back pointer to table of this column */

       ulint        aux; /* this is used as an auxiliary variable

                            in some of the functions below */

};

   从数据结构的名称上来看，这个关于列结构的，具有相同hash值的col结构，通过hash字段进行连接。该字段的定义如下：

typedef void* hash_node_t;

col结构里面的其它字段表明该列的一些属性：name表示列名，type表明列的值类型，table表明该列所属的表结构。

2.数据结构

typedef struct hash_table_struct hash_table_t;

typedef struct hash_cell_struct hash_cell_t;

typedef void* hash_node_t;

/* The hash table structure */

struct hash_table_struct {

       ibool              adaptive;/* TRUE if this is the hash table of the

                            adaptive hash index */

       ulint        n_cells;/* number of cells in the hash table */

       hash_cell_t*    array;      /* pointer to cell array */

       ulint        n_mutexes;/* if mutexes != NULL, then the number of

                            mutexes, must be a power of 2 */

       mutex_t* mutexes;/* NULL, or an array of mutexes used to

                            protect segments of the hash table */

       mem_heap_t**       heaps;     /* if this is non-NULL, hash chain nodes for

                            external chaining can be allocated from these

                            memory heaps; there are then n_mutexes many of

                            these heaps */

       mem_heap_t* heap;

       ulint        magic_n;

};

struct hash_cell_struct{

       void*      node;      /* hash chain node, NULL if none */

};

1）创建hash表

n_cells表明hash表的大小，我们都知道hash表常用的是进行素数的模操作。先看下创建hash表的函数的实现。

/*****************************************************************

Creates a hash table with >= n array cells. The actual number of cells is

chosen to be a prime number slightly bigger than n. */

hash_table_t*

hash_create(

/*========*/

                     /* out, own: created table */

       ulint n)    /* in: number of array cells */

{

       hash_cell_t*    array;

       ulint        prime;

       hash_table_t* table;

       ulint        i;

       hash_cell_t*    cell;



       prime = ut_find_prime(n);

       table = mem_alloc(sizeof(hash_table_t));

       array = ut_malloc(sizeof(hash_cell_t) * prime);



       table->adaptive = FALSE;

       table->array = array;

       table->n_cells = prime;

       table->n_mutexes = 0;

       table->mutexes = NULL;

       table->heaps = NULL;

       table->heap = NULL;

       table->magic_n = HASH_TABLE_MAGIC_N;



       /* Initialize the cell array */

       for (i = 0; i < prime; i++) {

              cell = hash_get_nth_cell(table, i);

              cell->node = NULL;

       }

       return(table);

}

在创建hash表的时候，我们提供的常用是一个普通的数字，来指明hash表的大小。这个n的值可能不是素数，所以需要通过ut_find_prime(n)来产生一个稍大于n的素数，当然这个素数不一定是大于n的最小素数。ut_find_prime的具体实现见文件ut0rnd.c。

接着调用mem_alloc函数来分配hash表结构，为该结构分配空间，这个比较简单。接着分配数量为prime个的hash单元，并通过后面的循环语句将单元的node指针置为空。

2)基础的函数

提供值之后，我们需要进行映射，获取对应的单元的编号。

/******************************************************************

Calculates the hash value from a folded value. */

UNIV_INLINE

ulint

hash_calc_hash(

/*===========*/

                            /* out: hashed value */

       ulint        fold, /* in: folded value */

       hash_table_t* table)      /* in: hash table */

{

       return(ut_hash_ulint(fold, table->n_cells));

}

其中ut_hash_ulint函数的实现如下：

/***********************************************************

The following function generates a hash value for a ulint integer

to a hash table of size table_size, which should be a prime

or some random number for the hash table to work reliably. */

UNIV_INLINE

ulint

ut_hash_ulint(

/*=========*/

                            /* out: hash value */

       ulint    key,            /* in: value to be hashed */

       ulint    table_size)       /* in: hash table size */

{

       key = key ^ UT_HASH_RANDOM_MASK2;

       return(key % table_size);

}

通过模操作获得了对应的hash表单元的数值，然后就可以通过该数值找到hash表的该单元节点，调用hash_get_nth_cell函数，函数实现如下：

/****************************************************************

Gets the nth cell in a hash table. */

UNIV_INLINE

hash_cell_t*

hash_get_nth_cell(

/*==============*/

                            /* out: pointer to cell */

       hash_table_t*        table,       /* in: hash table */

       ulint              n)    /* in: cell index */

{

       ut_ad(n < table->n_cells);

       return(table->array + n);

}

本文来自CSDN博客，转载请标明出处：http://blog.csdn.net/whyangwanfu/archive/2008/09/21/2958762.aspx

分享到：

日志系统原理 | Mysql查询优化器浅析

2010-03-30 16:36
浏览 1278
评论(0)
分类:数据库
查看更多

发表评论

您还没有登录,请您登录后再发表评论

最近访客更多访客>>

博主相关

文章分类

社区版块

存档分类

最新评论