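Python Internals (1) - Garbage Collection
First, an overview of the mainstream garbage collection mechanisms. The original text is at http://www.zhihu.com/question/20018826/answer/28892543
- Reference counting:
The basic idea is to attach a counter to every object that records how many references point to it. Each time a new reference points to the object, the counter is incremented; each time a reference to the object is cleared or redirected to another object, the counter is decremented. When the counter reaches 0, the object is deleted automatically.
Reference counting has two advantages: 1) it is relatively simple and needs little run-time support, so it can be implemented in languages with no native GC; 2) an object is freed the moment it becomes garbage, so normal execution is not interrupted by extra pauses. Its Achilles' heel is circular references: if object A holds a reference to object B while object B holds a reference to object A, the counters never reach zero. Reference counting also slows normal execution (every reference assignment has to update a counter), especially in multithreaded environments (updating a counter requires locking or other synchronization).
- Mark-sweep:
The basic idea is to allocate on demand and, once free memory runs out, start from the references in registers and on the program stack and traverse the graph whose nodes are objects and whose edges are references. Every reachable object is marked, then the memory space is swept and every unmarked object is freed.
Mark-sweep has no trouble with circular references, and it does not affect normal execution performance while GC is not running. Its problem is that when memory is exhausted and GC is triggered, the program must pause for a while so memory can be swept; with a large heap and many objects, this pause can be long.
- Copying:
The basic idea is to split the whole memory space into two halves, call them A and B. All objects are allocated in A. When A fills up, the collector again starts from the references in registers and on the program stack, traverses the graph whose nodes are objects and whose edges are references, copies every reachable object to B, and then swaps the roles of A and B.
Compared with mark-sweep, the main drawback of copying is that half of the space always sits idle. A subtler drawback is that the way it uses memory can conflict with existing paging and cache replacement mechanisms. It has one big advantage, though: live objects are always packed tightly in memory, so allocation becomes trivial, just a pointer bump. For allocation-heavy workloads this is a considerable performance win. In addition, because it does not sweep the whole memory space, when few objects survive and most are garbage (some languages tend this way), the pause caused by GC is shorter than with mark-sweep.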
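To make the mark-sweep description more concrete, here is a minimal toy sketch in Python. It is only an illustration under my own assumptions: the "heap" is a plain dict mapping hypothetical object names to the names they reference, and the `roots` list stands in for registers and the program stack. A real collector works on raw memory, not on dicts.

```python
# Toy heap: each object name maps to the names it references.
heap = {
    "a": ["b"],
    "b": ["c"],
    "c": [],
    "d": ["e"],   # d and e form a cycle that no root can reach
    "e": ["d"],
}
roots = ["a"]     # stands in for registers and the program stack


def mark(heap, roots):
    """Return the set of object names reachable from the roots."""
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in marked:
            continue
        marked.add(name)
        stack.extend(heap[name])
    return marked


def sweep(heap, marked):
    """Delete every unmarked object and return the names that were freed."""
    freed = sorted(name for name in heap if name not in marked)
    for name in freed:
        del heap[name]
    return freed


freed = sweep(heap, mark(heap, roots))
print("freed:", freed)
print("live:", sorted(heap))
```

Note that the unreachable cycle between `d` and `e` is freed without any special handling, which is exactly the case that reference counting cannot deal with.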
Python relies mainly on reference counting for garbage collection. Here is a simple C program I wrote to simulate it.
#include <stdio.h>
#include <stdlib.h>

#define PyDataType int
#define INT 1
#define FLOAT 2

// ref_count is the reference counter.
// data_type is the data type; this is only a simulation, the real
// implementation is different.
#define PyObject_HEAD \
    int ref_count;    \
    PyDataType data_type;

// int object
typedef struct {
    PyObject_HEAD
    int value;
} PyIntObject;

// float object
typedef struct {
    PyObject_HEAD
    float value;
} PyFloatObject;

int main(void) {
    PyIntObject *py_int_10;
    py_int_10 = (PyIntObject *) malloc(sizeof(PyIntObject));
    py_int_10->data_type = INT;
    py_int_10->ref_count = 0;
    py_int_10->value = 10;

    PyIntObject *py_int_20;
    py_int_20 = (PyIntObject *) malloc(sizeof(PyIntObject));
    py_int_20->data_type = INT;
    py_int_20->ref_count = 0;
    py_int_20->value = 20;

    // a points to 10
    PyIntObject *a, *b;
    a = py_int_10;
    // increment the reference count
    py_int_10->ref_count++;

    // b points to 20
    b = py_int_20;
    py_int_20->ref_count++;

    // rebind a to 20
    a = py_int_20;
    // decrement the reference count of 10 and increment that of 20
    py_int_10->ref_count--;
    // free the memory of 10 once nothing refers to it
    if (py_int_10->ref_count == 0) {
        free(py_int_10);
    }
    py_int_20->ref_count++;

    printf("a: %d, b: %d\n", a->value, b->value);
    printf("ref count: %d\n", py_int_20->ref_count);
    // release the remaining object before exiting
    free(py_int_20);
    return 0;
}
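The same bookkeeping can be observed from Python itself with sys.getrefcount(). This is a small sketch, assuming CPython and Python 3 syntax (the `Value` class is a hypothetical stand-in made up for the example). Absolute counts include temporary references and differ between interpreter versions, so the sketch only looks at differences.

```python
import sys


class Value:
    """Stand-in for a heap object (hypothetical class for this sketch)."""


ten = Value()
base = sys.getrefcount(ten)      # the absolute value includes temporary references

a = ten                          # a new name points at the object: count + 1
after_bind = sys.getrefcount(ten)

a = None                         # the name is rebound elsewhere: count - 1
after_rebind = sys.getrefcount(ten)

print(after_bind - base, after_rebind - base)
```

Binding a second name to the object raises the count by one, and rebinding that name lowers it again, which mirrors the `ref_count++` and `ref_count--` steps in the C simulation.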
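An obvious drawback of reference counting is that it cannot handle circular references, for example:
l = []
l.append(l)
del l
Reference counting alone can never reclaim this l, because its reference count can no longer drop to zero. To solve the circular reference problem, CPython provides the gc module, whose main job is to detect garbage objects caught in reference cycles and clean them up. Its implementation brings in the other two mainstream garbage collection techniques: mark-and-sweep and generational collection.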
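The effect can be checked with a weak reference. The sketch below assumes CPython and Python 3 syntax; `Node` is a hypothetical class introduced only because a plain list cannot be weakly referenced. The automatic collector is switched off for the duration so that only the explicit gc.collect() call can free the cycle.

```python
import gc
import weakref


class Node:
    """Hypothetical container used only for this sketch."""


gc.disable()                 # keep the automatic collector out of the way

node = Node()
node.me = node               # the object refers to itself
probe = weakref.ref(node)    # lets us observe whether the object is still alive

del node                     # the name is gone, but the self-reference remains
alive_after_del = probe() is not None

gc.collect()                 # the cycle detector finds and frees the object
alive_after_collect = probe() is not None

gc.enable()
print(alive_after_del, alive_after_collect)
```

After `del node` the object is still alive because its own attribute keeps the count above zero; only the cycle detector reclaims it.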
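The gc module is the interface Python exposes to its garbage collection mechanism. It contains functions that control how the collector operates and functions that inspect its state.
Tracing References
The gc module has two functions, get_referents(*objs) and get_referrers(*objs). The first returns a list of objects directly referred to by any of the arguments, that is, the objects that objs refer to. The second returns the list of objects that directly refer to any of objs, that is, the objects that refer to objs.
Code example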
import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

print
print 'three refers to:'
for r in gc.get_referents(three):
    pprint.pprint(r)
The output is
$ python gc_get_referents.py

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)

three refers to:
{'name': 'three', 'next': Graph(one)}
<class '__main__.Graph'>
In this example, the instance three holds references to its instance dictionary (through the __dict__ attribute) and to its class.
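Since the two function names are easy to mix up, here is a short sketch of the direction each one follows. It uses Python 3 syntax (the rest of this article uses Python 2), and `target` and `container` are names made up for the illustration.

```python
import gc

target = ["payload"]
container = {"key": target}      # container -> target

# Objects that `container` refers to (outgoing edges).
referents = gc.get_referents(container)
# Objects that refer to `target` (incoming edges).
referrers = gc.get_referrers(target)

print(any(obj is target for obj in referents))
print(any(obj is container for obj in referrers))
```

get_referents() walks from an object to what it points at, while get_referrers() searches the tracked objects for anything pointing at the argument, so the second call is usually the more expensive one.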
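The next example uses a Queue to perform a breadth-first traversal of all the references in order to find cycles. Each item in the queue is a tuple containing the reference chain so far and the next object to examine. It starts from the instance three and then walks everything it refers to.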
#!/usr/bin/env python
# encoding: utf-8
#
# Copyright (c) 2010 Doug Hellmann. All rights reserved.
#
"""Show the objects with references to a given object.
"""

import gc
import pprint
import Queue


class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None

    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next

    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

print

seen = set()
to_process = Queue.Queue()

# Start with an empty object chain and Graph three.
to_process.put(([], three))

# Look for cycles, building the object chain for each object found
# in the queue so the full cycle can be printed at the end.
while not to_process.empty():
    chain, next = to_process.get()
    print "get:", chain, next
    chain = chain[:]
    chain.append(next)
    print "chain", chain
    print 'Examining:', repr(next)
    print "add id", id(next)
    seen.add(id(next))
    for r in gc.get_referents(next):
        # Skip references that are not interesting here (strings and classes)
        if not (isinstance(r, basestring) or isinstance(r, type)):
            print "refer to:", r
            if id(r) in seen:
                print
                print 'Found a cycle to %s:' % r
                for i, link in enumerate(chain):
                    print ' %d: ' % i,
                    pprint.pprint(link)
            else:
                print "put:", chain, r
                to_process.put((chain, r))
    print "---------------"
The output is shown below. Lines starting with # are annotations added for explanation, not program output.
Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)
# Initial state: the next object to examine is three
get: [] Graph(three)
chain [Graph(three)]
Examining: Graph(three)
# Record the id of three as examined
add id 4392667024
# three refers to this object
refer to: {'name': 'three', 'next': Graph(one)}
# So the next object to examine is {'name': 'three', 'next': Graph(one)}
put: [Graph(three)] {'name': 'three', 'next': Graph(one)}
---------------
get: [Graph(three)] {'name': 'three', 'next': Graph(one)}
chain [Graph(three), {'name': 'three', 'next': Graph(one)}]
Examining: {'name': 'three', 'next': Graph(one)}
add id 140688010354752
# {'name': 'three', 'next': Graph(one)} refers to one
refer to: Graph(one)
put: [Graph(three), {'name': 'three', 'next': Graph(one)}] Graph(one)
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}] Graph(one)
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one)]
Examining: Graph(one)
add id 4392615504
refer to: {'name': 'one', 'next': Graph(two)}
# The reference chain is now three -> {'name': 'three', 'next': Graph(one)} -> one
put: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one)] {'name': 'one', 'next': Graph(two)}
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one)] {'name': 'one', 'next': Graph(two)}
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}]
Examining: {'name': 'one', 'next': Graph(two)}
add id 140688010307184
refer to: Graph(two)
# The chain grows to three -> {'name': 'three', 'next': Graph(one)} -> one -> {'name': 'one', 'next': Graph(two)}
put: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}] Graph(two)
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}] Graph(two)
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two)]
Examining: Graph(two)
add id 4392665936
refer to: {'name': 'two', 'next': Graph(three)}
# The chain keeps growing
put: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two)] {'name': 'two', 'next': Graph(three)}
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two)] {'name': 'two', 'next': Graph(three)}
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two), {'name': 'two', 'next': Graph(three)}]
Examining: {'name': 'two', 'next': Graph(three)}
add id 140688010362416
refer to: Graph(three)
# This id has been seen before, so a reference cycle has been found
Found a cycle to Graph(three):
 0: Graph(three)
 1: {'name': 'three', 'next': Graph(one)}
 2: Graph(one)
 3: {'name': 'one', 'next': Graph(two)}
 4: Graph(two)
 5: {'name': 'two', 'next': Graph(three)}
---------------
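For readers on Python 3, the same breadth-first idea can be condensed into a small helper. This is a sketch under my own assumptions rather than a port of the script above: `reaches_itself` is a hypothetical name, it uses collections.deque instead of Queue, and it only answers whether the starting object can reach itself instead of printing the chain. The exact objects returned by gc.get_referents() for an instance vary between CPython versions (the attribute dictionary may or may not appear as a separate step), but the reachability answer does not depend on that.

```python
import gc
from collections import deque


class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None


def reaches_itself(start):
    """Breadth-first search over gc.get_referents(); True if start is on a cycle."""
    seen = {id(start)}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for ref in gc.get_referents(current):
            if ref is start:
                return True
            # Skip strings and classes, and anything already visited.
            if isinstance(ref, (str, type)) or id(ref) in seen:
                continue
            seen.add(id(ref))
            queue.append(ref)
    return False


one, two, three = Graph("one"), Graph("two"), Graph("three")
one.next, two.next, three.next = two, three, one
lone = Graph("lone")

print(reaches_itself(three), reaches_itself(lone))
```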
Forcing Garbage Collection
Code example
import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

print

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Show the effect of garbage collection
for i in range(2):
    print 'Collecting %d ...' % i
    n = gc.collect()
    print 'Unreachable objects:', n
    print 'Remaining Garbage:',
    pprint.pprint(gc.garbage)
    print
Output
$ python gc_collect.py

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)

Collecting 0 ...
Unreachable objects: 6
Remaining Garbage:[]

Collecting 1 ...
Unreachable objects: 0
Remaining Garbage:[]
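In this example, the cycle is cleared the first time collection runs, since nothing refers to the Graph nodes except the nodes themselves. collect() returns the number of unreachable objects it found. "Unreachable" means the program can no longer get at the three instances, because the names that referred to them were set to None. Here the value is 6, because there are three instances plus their instance attribute dictionaries.
If Graph has a __del__() method, however, the garbage collector cannot break the cycle.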
import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print '%s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Show the effect of garbage collection
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)
Output
$ python gc_collect_with_del.py

Graph(one).next = Graph(two)
Graph(two).next = Graph(three)
Graph(three).next = Graph(one)
Collecting...
Unreachable objects: 6
Remaining Garbage:[Graph(one), Graph(two), Graph(three)]
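Because more than one object in the cycle has a finalizer, the garbage collector cannot determine the order in which the objects should be finalized and freed, so to be safe it leaves them as they are.
Once the cycle is broken, the Graph instances can be collected.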
import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Collecting now keeps the objects as uncollectable
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

# Break the cycle
print
print 'Breaking the cycle'
gc.garbage[0].set_next(None)
print 'Removing references in gc.garbage'
del gc.garbage[:]

# Now the objects are removed
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)
Output
$ python gc_collect_break_cycle.py

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)

Collecting...
Unreachable objects: 6
Remaining Garbage:[Graph(one), Graph(two), Graph(three)]

Breaking the cycle
Linking nodes Graph(one).next = None
Removing references in gc.garbage
Graph(two).__del__()
Graph(three).__del__()
Graph(one).__del__()

Collecting...
Unreachable objects: 0
Remaining Garbage:[]
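Because gc.garbage holds references to the objects left over from the previous collection, it has to be cleared after the cycle is broken. That lowers the reference counts so the objects can be finalized and freed.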
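The gc.garbage behavior shown above is what the Python 2 interpreter used in this article does. As far as I know, since CPython 3.4 (PEP 442) cycles whose objects define __del__ are finalized and collected like any other cycle, so nothing is left in gc.garbage. The sketch below assumes CPython 3.4 or later; the `finalized` list is a helper made up for the example to record which finalizers ran.

```python
import gc

finalized = []


class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __del__(self):
        finalized.append(self.name)


gc.collect()                       # start from a clean slate

one, two = Graph("one"), Graph("two")
one.next, two.next = two, one      # a cycle whose members both define __del__
del one, two

gc.collect()
leftover = [obj for obj in gc.garbage if isinstance(obj, Graph)]
print(sorted(finalized), leftover)
```

On such interpreters the manual cycle breaking from the previous example should no longer be necessary, although the order in which the finalizers run within a cycle is still not something to rely on.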
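Finding References to Objects That Cannot Be Collected
This example creates a reference cycle, then works through the Graph instances left in gc.garbage to find the parent nodes that refer to them and remove those references.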
import gc
import pprint
import Queue

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Collecting now keeps the objects as uncollectable
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

REFERRERS_TO_IGNORE = [ locals(), globals(), gc.garbage ]

def find_referring_graphs(obj):
    print 'Looking for references to %s' % repr(obj)
    referrers = (r for r in gc.get_referrers(obj)
                 if r not in REFERRERS_TO_IGNORE)
    for ref in referrers:
        if isinstance(ref, Graph):
            # A graph node
            yield ref
        elif isinstance(ref, dict):
            # An instance or other namespace dictionary
            for parent in find_referring_graphs(ref):
                yield parent

# Look for objects that refer to the objects that remain in
# gc.garbage.
print
print 'Clearing referrers:'
for obj in gc.garbage:
    for ref in find_referring_graphs(obj):
        ref.set_next(None)
        del ref # remove local reference so the node can be deleted
    del obj # remove local reference so the node can be deleted

# Clear references held by gc.garbage
print
print 'Clearing gc.garbage:'
del gc.garbage[:]

# Everything should have been freed this time
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)
Collection Thresholds and Generations
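The garbage collector maintains three lists of objects, each of which is called a "generation". When the objects in a generation are examined, each one is either collected or moved on to the next generation, until it reaches the stage where it is kept permanently.
The frequency of collection can be tuned, and it depends on the number of object allocations and deallocations. When the number of allocations minus the number of deallocations exceeds the threshold of a generation, a collection is run. The current thresholds can be inspected with the get_threshold() function.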
import gc

print gc.get_threshold()
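It returns the threshold of each generation.
$ python gc_get_threshold.py

(700, 10, 10)
The thresholds can be changed with the set_threshold() function. This example reads the generation 0 threshold from the command line and then allocates objects.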
import gc
import pprint
import sys

try:
    threshold = int(sys.argv[1])
except (IndexError, ValueError, TypeError):
    print 'Missing or invalid threshold, using default'
    threshold = 5

class MyObj(object):
    def __init__(self, name):
        self.name = name
        print 'Created', self.name

gc.set_debug(gc.DEBUG_STATS)

gc.set_threshold(threshold, 1, 1)
print 'Thresholds:', gc.get_threshold()

print 'Clear the collector by forcing a run'
gc.collect()
print

print 'Creating objects'
objs = []
for i in range(10):
    objs.append(MyObj(i))
Different threshold values change how often collection happens, which is visible once debugging is turned on.
$ python -u gc_threshold.py 5

Thresholds: (5, 1, 1)
Clear the collector by forcing a run
gc: collecting generation 2...
gc: objects in each generation: 144 3163 0
gc: done, 0.0004s elapsed.

Creating objects
gc: collecting generation 0...
gc: objects in each generation: 7 0 3234
gc: done, 0.0000s elapsed.
Created 0
Created 1
Created 2
Created 3
Created 4
gc: collecting generation 0...
gc: objects in each generation: 6 4 3234
gc: done, 0.0000s elapsed.
Created 5
Created 6
Created 7
Created 8
Created 9
gc: collecting generation 2...
gc: objects in each generation: 5 6 3232
gc: done, 0.0004s elapsed.
A smaller threshold makes collection run more often.
$ python -u gc_threshold.py 2

Thresholds: (2, 1, 1)
Clear the collector by forcing a run
gc: collecting generation 2...
gc: objects in each generation: 144 3163 0
gc: done, 0.0004s elapsed.

Creating objects
gc: collecting generation 0...
gc: objects in each generation: 3 0 3234
gc: done, 0.0000s elapsed.
gc: collecting generation 0...
gc: objects in each generation: 4 3 3234
gc: done, 0.0000s elapsed.
Created 0
Created 1
gc: collecting generation 1...
gc: objects in each generation: 3 4 3234
gc: done, 0.0000s elapsed.
Created 2
Created 3
Created 4
gc: collecting generation 0...
gc: objects in each generation: 5 0 3239
gc: done, 0.0000s elapsed.
Created 5
Created 6
Created 7
gc: collecting generation 0...
gc: objects in each generation: 5 3 3239
gc: done, 0.0000s elapsed.
Created 8
Created 9
gc: collecting generation 2...
gc: objects in each generation: 2 6 3235
gc: done, 0.0004s elapsed.
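When experimenting with thresholds it is easy to leave the interpreter in an unusual state, so a cautious pattern is to save the current values and restore them afterwards. The following is a minimal sketch in Python 3 syntax. It also prints gc.get_count(), which reports the current counters that are compared against the thresholds. The default values and the meaning of the second and third thresholds may differ between interpreter versions, so no specific numbers are assumed here.

```python
import gc

original = gc.get_threshold()
print("thresholds:", original)
print("current counts:", gc.get_count())

gc.set_threshold(5, 1, 1)          # make generation 0 collections far more frequent
lowered = gc.get_threshold()
print("lowered:", lowered)

gc.set_threshold(*original)        # restore the interpreter's settings
restored = gc.get_threshold()
print("restored:", restored)
```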
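Debugging
The set_debug() function of Python's gc module accepts flags that configure the garbage collector. Debugging information is written to stderr.
The DEBUG_STATS flag turns on statistics reporting, showing the number of objects tracked in each generation and the time each collection took.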
import gc

gc.set_debug(gc.DEBUG_STATS)

gc.collect()
The output shows two separate runs of the collector: one when it is invoked explicitly, and one when the Python interpreter exits.
$ python gc_debug_stats.py

gc: collecting generation 2...
gc: objects in each generation: 667 2808 0
gc: done, 0.0011s elapsed.
gc: collecting generation 2...
gc: objects in each generation: 0 0 3164
gc: done, 0.0009s elapsed.
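The debug flags are bit masks, so they can be combined with | and read back with gc.get_debug(). The sketch below, in Python 3 syntax, saves the current flags, enables two of them for a single collection, and then restores the previous setting so the extra output does not leak into the rest of the program. It only uses DEBUG_STATS and DEBUG_UNCOLLECTABLE. DEBUG_OBJECTS and DEBUG_INSTANCES, used elsewhere in this article, are Python 2 flags and to my knowledge are not available in Python 3.

```python
import gc

previous = gc.get_debug()                 # remember the current flags

gc.set_debug(gc.DEBUG_STATS | gc.DEBUG_UNCOLLECTABLE)
active = gc.get_debug()
stats_enabled = bool(active & gc.DEBUG_STATS)

gc.collect()                              # statistics are written to stderr

gc.set_debug(previous)                    # switch the extra output back off
print(stats_enabled, gc.get_debug() == previous)
```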
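Enabling DEBUG_COLLECTABLE and DEBUG_UNCOLLECTABLE makes the collector report whether each object it examines can or cannot be collected. These flags can be combined with DEBUG_OBJECTS so that information about each object is printed as it is examined.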
import gc

flags = (gc.DEBUG_COLLECTABLE |
         gc.DEBUG_UNCOLLECTABLE |
         gc.DEBUG_OBJECTS
         )
gc.set_debug(flags)

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
        print 'Creating %s 0x%x (%s)' % (self.__class__.__name__, id(self), name)
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

class CleanupGraph(Graph):
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
one.set_next(two)
two.set_next(one)

# Construct another node that stands on its own
three = CleanupGraph('three')

# Construct a graph cycle with a finalizer
four = CleanupGraph('four')
five = CleanupGraph('five')
four.set_next(five)
five.set_next(four)

# Remove references to the graph nodes in this module's namespace
one = two = three = four = five = None

print

# Force a sweep
print 'Collecting'
gc.collect()
print 'Done'
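The output shows that the Graph instances one and two form a reference cycle but can still be collected, because they have no finalizer of their own and their only incoming references come from other objects that can be collected. Although CleanupGraph has a finalizer, three is still reclaimed as soon as its reference count drops to zero. In contrast, four and five form a reference cycle and cannot be freed.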
$ python -u gc_debug_collectable_objects.py

Creating Graph 0x10045f750 (one)
Creating Graph 0x10045f790 (two)
Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(one)
Creating CleanupGraph 0x10045f7d0 (three)
Creating CleanupGraph 0x10045f810 (four)
Creating CleanupGraph 0x10045f850 (five)
Linking nodes CleanupGraph(four).next = CleanupGraph(five)
Linking nodes CleanupGraph(five).next = CleanupGraph(four)
CleanupGraph(three).__del__()

Collecting
gc: collectable <Graph 0x10045f750>
gc: collectable <Graph 0x10045f790>
gc: collectable <dict 0x100360a30>
gc: collectable <dict 0x100361cc0>
gc: uncollectable <CleanupGraph 0x10045f810>
gc: uncollectable <CleanupGraph 0x10045f850>
gc: uncollectable <dict 0x100361de0>
gc: uncollectable <dict 0x100362140>
Done
The DEBUG_INSTANCES flag does the same for instances of old-style classes, that is, classes that do not inherit from object.
import gc

flags = (gc.DEBUG_COLLECTABLE |
         gc.DEBUG_UNCOLLECTABLE |
         gc.DEBUG_INSTANCES
         )
gc.set_debug(flags)

class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None
        print 'Creating %s 0x%x (%s)' % (self.__class__.__name__, id(self), name)
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

class CleanupGraph(Graph):
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
one.set_next(two)
two.set_next(one)

# Construct another node that stands on its own
three = CleanupGraph('three')

# Construct a graph cycle with a finalizer
four = CleanupGraph('four')
five = CleanupGraph('five')
four.set_next(five)
five.set_next(four)

# Remove references to the graph nodes in this module's namespace
one = two = three = four = five = None

print

# Force a sweep
print 'Collecting'
gc.collect()
print 'Done'
In this case, the dict objects that hold the instance attributes are not included in the output.
$ python -u gc_debug_collectable_instances.py

Creating Graph 0x1004687a0 (one)
Creating Graph 0x1004687e8 (two)
Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(one)
Creating CleanupGraph 0x100468878 (three)
Creating CleanupGraph 0x1004688c0 (four)
Creating CleanupGraph 0x100468908 (five)
Linking nodes CleanupGraph(four).next = CleanupGraph(five)
Linking nodes CleanupGraph(five).next = CleanupGraph(four)
CleanupGraph(three).__del__()

Collecting
gc: collectable <Graph instance at 0x1004687a0>
gc: collectable <Graph instance at 0x1004687e8>
gc: uncollectable <CleanupGraph instance at 0x1004688c0>
gc: uncollectable <CleanupGraph instance at 0x100468908>
Done