侵入式链表(Intrusive Linked List)

设计思想

常用的链表是非侵入式链表,它的每个节点包含数据和指向下一个节点的指针(和一个指向前一个节点的指针)

1
2
3
4
5
struct ListNode {
int data;
struct ListNode *next;
struct ListNode *prev;
}ListNode;

在这种链表结构中,data是固定的,即一个链表中,每个节点存储的数据类型必须相同,这样的链表泛化能力比较差。

注:C++语言可以使用模板实现通用:

1
2
3
4
5
template <typename T>
struct ListNode {
struct ListNode *next; // link区域
T data; // data区域
};

但这只不过是将重写代码的工作交给编译器完成,本质上数据和链表仍然是耦合的。


侵入式链表是在其内部直接包含链表节点:

1
2
3
4
5
6
7
8
struct ListLink {
struct ListLink *next, *prev;
}ListLink;

struct ListNode {
data_type data;
struct ListLink list;
}ListNode;

侵入式指的是对象的内部状态由外部对象管理

在非侵入式链表中,每个节点都是一个独立的对象,包含指向下一个节点的指针,链表本身维护各节点的连接关系(节点自己不知道它在链表中的位置)。因此,无需了解节点的内部结构,只需调用链表提供的方法来插入、删除或遍历节点。

而在侵入式链表中,每个节点都包含指向下一个节点的指针,链接关系由节点维护。因此在访问链表时,必须直接访问节点结构体来获取指向下一个节点的指针。虽然这种做法可能增加了代码的复杂性,但它相较于非侵入式链表提供了更高的性能和泛化能力。

数据的存储与访问

Linux内核代码/include/linux/types.h中定义了一种没有数据域的链表

1
2
3
struct list_head {
struct list_head *next, *prev;
};

可以很容易的将这种链表结构体包含在任意数据的结构体中,统一组织一个链表中的所有节点,各节点中可以存储不同的数据,例如

1
2
3
4
5
6
7
8
9
struct ListNodeInt {
int data;
list_head link;
}ListNodeInt;

struct ListNodeDouble {
double data;
list_head link;
}ListNodeDouble;

初始化

完整代码见/include/linux/list.h

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
/*
* Circular doubly linked list implementation.
*
* Some of the internal functions ("__xxx") are useful when
* manipulating whole lists rather than single entries, as
* sometimes we already know the next/prev entries and we can
* generate better code by using them directly rather than
* using the generic single-entry routines.
*/

#define LIST_HEAD_INIT(name) { &(name), &(name) }

// 定义并初始化一个链表头
#define LIST_HEAD(name) \
struct list_head name = LIST_HEAD_INIT(name)

/**
* INIT_LIST_HEAD - Initialize a list_head structure
* @list: list_head structure to be initialized.
*
* Initializes the list_head to point to itself. If it is a list header,
* the result is an empty list.
*/

// 运行时链表初始化,链表头的指针指向自己
static inline void INIT_LIST_HEAD(struct list_head *list)
{
WRITE_ONCE(list->next, list);
WRITE_ONCE(list->prev, list);
}

inline将函数内容直接嵌入至函数的调用点,代替跳转。对于频繁调用的函数,通过嵌入可以减少函数调用的开销, 提高程序效率,但只适合代码量较小的函数。

删除节点

/include/linux/poison.h文件中定义了两个用户不可访问的地址LIST_POISON1LIST_POISON2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
/*
* Architectures might want to move the poison pointer offset
* into some well-recognized area such as 0xdead000000000000,
* that is also not mappable by user-space exploits:
*/
#ifdef CONFIG_ILLEGAL_POINTER_VALUE
# define POISON_POINTER_DELTA _AC(CONFIG_ILLEGAL_POINTER_VALUE, UL)
#else
# define POISON_POINTER_DELTA 0
#endif

/*
* These are non-NULL pointers that will result in page faults
* under normal circumstances, used to verify that nobody uses
* non-initialized list entries.
*/
#define LIST_POISON1 ((void *) 0x100 + POISON_POINTER_DELTA)
#define LIST_POISON2 ((void *) 0x122 + POISON_POINTER_DELTA)

在删除节点后,将其指针指向这两个位置,用来在调试时检测非法操作。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
/*
* Delete a list entry by making the prev/next entries
* point to each other.
*
* This is only for internal list manipulation where we know
* the prev/next entries already!
*/
static inline void __list_del(struct list_head * prev, struct list_head * next)
{
next->prev = prev;
WRITE_ONCE(prev->next, next);
}

/*
* Delete a list entry and clear the 'prev' pointer.
*
* This is a special-purpose list clearing method used in the networking code
* for lists allocated as per-cpu, where we don't want to incur the extra
* WRITE_ONCE() overhead of a regular list_del_init(). The code that uses this
* needs to check the node 'prev' pointer instead of calling list_empty().
*/
static inline void __list_del_clearprev(struct list_head *entry)
{
__list_del(entry->prev, entry->next);
entry->prev = NULL;
}

static inline void __list_del_entry(struct list_head *entry)
{
if (!__list_del_entry_valid(entry))
return;

__list_del(entry->prev, entry->next);
}

/**
* list_del - deletes entry from list.
* @entry: the element to delete from the list.
* Note: list_empty() on entry does not return true after this, the entry is
* in an undefined state.
*/
static inline void list_del(struct list_head *entry)
{
__list_del_entry(entry);
entry->next = LIST_POISON1;
entry->prev = LIST_POISON2;
}

查找节点

因为这种链表的链域仅保存了前后节点的地址,对这种链表进行操作时,无法直接得知内部成员变量的地址偏移量。

Linux内核代码中给出两个函数,用于地址偏移量计算:

  • offsetof:计算结构体成员变量相对于结构体的偏移
1
#define offsetof(type, member) (size_t) &((type*)0)->member
  • container_of:通过成员变量,结构体成员的地址以及结构体的类型来获取结构体的首地址
1
2
3
#define container_of(ptr, type, member) ({ \
const typeof( ((type *)0)->member ) *__mptr = (ptr); \
(type *)( (char *)__mptr - offsetof(type,member) );})
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
/**
* list_entry - get the struct for this entry
* @ptr: the &struct list_head pointer.
* @type: the type of the struct this is embedded in.
* @member: the name of the list_head within the struct.
*/
#define list_entry(ptr, type, member) \
container_of(ptr, type, member)

/**
* list_for_each_entry - iterate over list of given type
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_struct within the struct.
*
* 正向遍历整个链表, 得到list
*/
#define list_for_each_entry(pos, head, member) \
for (pos = list_entry((head)->next, typeof(*pos), member); \
&pos->member != (head); \
pos = list_entry(pos->member.next, typeof(*pos), member))

/**
* list_for_each_entry_reverse - iterate backwards over list of given type.
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_struct within the struct.
*
* 反向遍历整个链表, 得到list
*/
#define list_for_each_entry_reverse(pos, head, member) \
for (pos = list_entry((head)->prev, typeof(*pos), member); \
&pos->member != (head); \
pos = list_entry(pos->member.prev, typeof(*pos), member))

应用

在通信场景中,每个消息携带的数据量可能不一样。可以考虑使用链域和不定长的数据域拼接,用于管理不定长消息。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
template <unsigned int N>
struct Message {
int size;
char data[N];

static void init(void *Message) {
reinterpret_cast<Message *>(Message)->size = N;
}
};

int main() {
auto node1 = (list_head *) malloc(sizeof(list_head) + sizeof(Message<1024 * 1>));
auto node2 = (list_head *) malloc(sizeof(list_head) + sizeof(Message<1024 * 2>));
auto node3 = (list_head *) malloc(sizeof(list_head) + sizeof(Message<1024 * 3>));

Message<1024 * 1>::init(node1 + 1);
Message<1024 * 3>::init(node2 + 1);
Message<1024 * 2>::init(node3 + 1);

list_head Message_list;

// ...

free(node1);
free(node2);
free(node3);
return 0;
}

在使用malloc分配内存时, 申请的大小是sizeof(link) + sizeof(data),这样在物理上数据和只有link的侵入式链表的节点在一块连续的内存上。在链表视角相当于 在每个节点下面挂载了一个隐藏的消息数据

1
2
3
4
5
6
7
8
        +-------+    +-------+           +-------+
List: | next | -> | next | -> ... -> | next |
+-------+ +-------+ +-------+
+-------+ +-------+ +-------+
Data: |payload| |payload| |payload|
+-------+ | | | |
| | +-------+
+-------+

参考资料