virusdefender's blog ʕ•ᴥ•ʔ

Python Internals (1) - Garbage Collection

First, an overview of the mainstream garbage collection techniques. The original text is at http://www.zhihu.com/question/20018826/answer/28892543

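  1. Reference counting:

The basic idea is to attach a counter to every object that records how many references point to it. Each time a new reference points to the object, the counter is incremented; each time a reference to it is cleared or redirected to another object, the counter is decremented. When the counter reaches 0, the object is deleted automatically.

Reference counting has two advantages: 1) it is relatively simple and needs little run-time support, so it can be implemented in languages with no native GC; 2) an object is freed the instant it becomes garbage, so normal program execution is not interrupted by extra pauses. Its Achilles' heel is circular references: if object A holds a reference to object B while B holds a reference to A, the counters are helpless. Reference counting also slows down normal execution (every reference assignment has to update a counter), especially in multithreaded environments (updating the counter requires locking).

  2. Mark-sweep.

The basic idea is to allocate on demand and, once no free memory is left, start from the references held in registers and on the program stack and traverse the graph whose nodes are objects and whose edges are references. Every reachable object is marked; then the memory space is swept and every unmarked object is freed.

Mark-sweep handles circular references without trouble, and it has no effect on normal execution performance as long as GC is not triggered. Its problem is that when memory runs out and GC is triggered, the program must be paused for a while to sweep memory, and with a large heap and many objects that pause can be long.

  3. Copying.

The basic idea is to split the whole memory space into two halves, call them A and B. All objects are allocated in A. When A fills up, the collector again starts from the references in registers and on the program stack, traverses the graph whose nodes are objects and whose edges are references, and copies every reachable object to B; then A and B swap roles.

Compared with mark-sweep, the main drawback of copying is that half of the space always sits idle. A subtler drawback is that the way it uses memory can conflict with existing paging and cache replacement mechanisms. But it has one big advantage: all objects are always packed tightly in memory, so allocation becomes trivial and only requires moving a pointer. Where allocation is frequent, the performance gain is considerable. In addition, because the collector never sweeps the whole memory space, when few objects survive and most are garbage (some languages tend this way), the GC pause is shorter than with mark-sweep.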
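To make the mark and sweep phases concrete, here is a minimal toy sketch in Python 3. It is only an illustration on an explicit object graph: the names Obj, mark and sweep are made up for this example, and a real collector works on raw memory rather than on a Python list.

```python
# Toy mark-sweep over an explicit object graph (illustrative only).

class Obj:
    """A pretend heap object that may reference other heap objects."""
    def __init__(self, name):
        self.name = name
        self.refs = []        # outgoing references (edges of the graph)
        self.marked = False


def mark(roots):
    """Mark phase: flag everything reachable from the roots."""
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if not obj.marked:
            obj.marked = True
            stack.extend(obj.refs)


def sweep(heap):
    """Sweep phase: keep marked objects, drop the rest, reset the marks."""
    survivors = [obj for obj in heap if obj.marked]
    for obj in survivors:
        obj.marked = False
    return survivors


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
heap = [a, b, c, d]
a.refs.append(b)              # a -> b, reachable from the root
c.refs.append(d)              # c <-> d: a cycle nobody else points to
d.refs.append(c)

mark(roots=[a])
heap = sweep(heap)
print([obj.name for obj in heap])   # ['a', 'b']
```

The c/d cycle is dropped even though each of the two objects still has one incoming reference, which is exactly the case reference counting cannot handle.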

Python mainly relies on reference counting for garbage collection. Here is a simple C program I wrote to simulate it.

#include <stdio.h>
#include <stdlib.h>

#define PyDataType int
#define INT 1
#define FLOAT 2


// ref_count: the reference counter
// data_type: the type tag; this is only a simulation, the real implementation differs
#define PyObject_HEAD \
    int ref_count;    \
    PyDataType data_type;


// int object
typedef struct {
    PyObject_HEAD
    int value;
} PyIntObject;


// float object
typedef struct {
    PyObject_HEAD
    float value;
} PyFloatObject;


int main(void) {
    PyIntObject *py_int_10;
    py_int_10 = (PyIntObject *) malloc(sizeof(PyIntObject));
    py_int_10->data_type = INT;
    py_int_10->ref_count = 0;
    py_int_10->value = 10;

    PyIntObject *py_int_20;
    py_int_20 = (PyIntObject *) malloc(sizeof(PyIntObject));
    py_int_20->data_type = INT;
    py_int_20->ref_count = 0;
    py_int_20->value = 20;

    // a points to 10
    PyIntObject *a, *b;
    a = py_int_10;
    // increment the reference count
    py_int_10->ref_count++;

    // b points to 20
    b = py_int_20;
    py_int_20->ref_count++;

    // rebind a to 20
    a = py_int_20;
    // decrement 10's reference count, increment 20's
    py_int_10->ref_count--;
    // free 10's memory once nothing refers to it
    if (py_int_10->ref_count == 0) {
        free(py_int_10);
        py_int_10 = NULL;
    }
    py_int_20->ref_count++;

    printf("a: %d, b: %d\n", a->value, b->value);
    printf("ref count: %d\n", py_int_20->ref_count);
    // end of the simulation: release the remaining object
    free(py_int_20);
    return 0;
}
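The same bookkeeping can be observed from Python itself. The following is a small sketch, assuming CPython and Python 3 syntax; sys.getrefcount() reports one extra reference for its own argument, so only the differences between calls are meaningful.

```python
import sys

value = object()
base = sys.getrefcount(value)          # includes a temporary reference held by the call

a = value                              # a new reference: the count goes up by one
after_bind = sys.getrefcount(value)

a = None                               # the reference is dropped: the count goes back down
after_unbind = sys.getrefcount(value)

print(after_bind - base, after_unbind - base)
```

On CPython the two printed differences should be 1 and 0, mirroring the ref_count++ and ref_count-- steps in the C simulation.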

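An obvious drawback of reference counting is that it cannot handle circular references, for example

l = []
l.append(l)
del l

Reference counting alone can never reclaim this l, because its reference count can no longer drop to zero. To solve the circular reference problem, CPython provides a dedicated module, gc, whose main job is to detect garbage objects caught in reference cycles and clear them. Its implementation brings in two more mainstream garbage collection techniques: mark-sweep and generational collection.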
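A quick way to watch this happen is to use a weak reference as a probe. This is a minimal sketch assuming CPython and Python 3; the Node class is made up for the example (a plain list cannot be weakly referenced, so a small class stands in for it).

```python
import gc
import weakref


class Node:
    """Plain container object; instances support weak references."""
    def __init__(self):
        self.me = None


gc.disable()                  # keep the automatic cycle collector out of the picture
node = Node()
node.me = node                # the object now refers to itself
probe = weakref.ref(node)     # lets us check liveness without keeping the object alive

del node                      # drop the only outside reference
alive_after_del = probe() is not None

gc.collect()                  # the cycle detector finds and frees the object
alive_after_collect = probe() is not None
gc.enable()

print(alive_after_del, alive_after_collect)   # True False
```

After del the object is still alive because its own attribute keeps the count above zero; only the explicit gc.collect() reclaims it.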

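The gc module is the interface Python exposes to its garbage collector. It contains functions that control garbage collection and functions that inspect the collector's state.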

Tracing references

The gc module has two functions, get_referents(*objs) and get_referrers(*objs). The first returns a list of objects directly referred to by any of the arguments, that is, the objects that objs refer to. The second returns the list of objects that directly refer to any of objs, that is, the objects that hold a reference to objs.
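Since the two names are easy to mix up, here is a tiny sketch (Python 3 syntax assumed; inner and outer are names invented for the example) that shows which direction each function looks in.

```python
import gc

inner = ["payload"]
outer = [inner]               # outer refers to inner

# Objects that `outer` refers to: this should include `inner`.
referents = gc.get_referents(outer)
# Objects that refer to `inner`: this should include `outer`.
referrers = gc.get_referrers(inner)

found_inner = any(r is inner for r in referents)
found_outer = any(r is outer for r in referrers)
print(found_inner, found_outer)   # True True
```

get_referrers() usually returns more than the one container shown here, for instance the namespace dictionary that holds the name inner, which is why the check uses identity rather than comparing whole lists.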

Code example

import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

print
print 'three refers to:'
for r in gc.get_referents(three):
    pprint.pprint(r)

The output is

$ python gc_get_referents.py

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)

three refers to:
{'name': 'three', 'next': Graph(one)}
<class '__main__.Graph'>

In this example, the instance three holds a reference to its instance dictionary (its __dict__ attribute) and to its class.

The next example uses a Queue to run a breadth-first search over all references in order to find cycles. Each item in the queue is a tuple containing the reference chain so far and the next object to examine. The search starts from the instance three and walks through everything it refers to.

#!/usr/bin/env python
# encoding: utf-8
#
# Copyright (c) 2010 Doug Hellmann.  All rights reserved.
#
"""Show the objects with references to a given object.
"""
# end_pymotw_header

import gc
import pprint
import Queue


class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None

    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next

    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

print

seen = set()
to_process = Queue.Queue()

# Start with an empty object chain and Graph three.
to_process.put(([], three))

# Look for cycles, building the object chain for each object found
# in the queue so the full cycle can be printed at the end.
while not to_process.empty():
    chain, next = to_process.get()
    print "get:", chain, next
    chain = chain[:]
    chain.append(next)
    print "chain", chain
    print 'Examining:', repr(next)
    print "add id", id(next)
    seen.add(id(next))
    for r in gc.get_referents(next):
        # Skip references that are not interesting (strings and classes)
        if not (isinstance(r, basestring) or isinstance(r, type)):
            print "refer to:", r
            if id(r) in seen:
                print
                print 'Found a cycle to %s:' % r
                for i, link in enumerate(chain):
                    print '  %d: ' % i,
                    pprint.pprint(link)
            else:
                print "put:", chain, r
                to_process.put((chain, r))
    print "---------------"

The output is

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)
# Initial state: the next object to examine is three
get: [] Graph(three)
chain [Graph(three)]
Examining: Graph(three)
# Record that three's id has been examined
add id 4392667024
# three refers to this object
refer to: {'name': 'three', 'next': Graph(one)}
# The next object to examine is {'name': 'three', 'next': Graph(one)}
put: [Graph(three)] {'name': 'three', 'next': Graph(one)}
---------------
get: [Graph(three)] {'name': 'three', 'next': Graph(one)}
chain [Graph(three), {'name': 'three', 'next': Graph(one)}]
Examining: {'name': 'three', 'next': Graph(one)}
add id 140688010354752
# {'name': 'three', 'next': Graph(one)} refers to one
refer to: Graph(one)
put: [Graph(three), {'name': 'three', 'next': Graph(one)}] Graph(one)
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}] Graph(one)
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one)]
Examining: Graph(one)
add id 4392615504
refer to: {'name': 'one', 'next': Graph(two)}
# The chain is now three -> {'name': 'three', 'next': Graph(one)} -> one
put: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one)] {'name': 'one', 'next': Graph(two)}
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one)] {'name': 'one', 'next': Graph(two)}
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}]
Examining: {'name': 'one', 'next': Graph(two)}
add id 140688010307184
refer to: Graph(two)
# The chain becomes three -> {'name': 'three', 'next': Graph(one)} -> one -> {'name': 'one', 'next': Graph(two)}
put: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}] Graph(two)
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}] Graph(two)
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two)]
Examining: Graph(two)
add id 4392665936
refer to: {'name': 'two', 'next': Graph(three)}
# The chain keeps growing
put: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two)] {'name': 'two', 'next': Graph(three)}
---------------
get: [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two)] {'name': 'two', 'next': Graph(three)}
chain [Graph(three), {'name': 'three', 'next': Graph(one)}, Graph(one), {'name': 'one', 'next': Graph(two)}, Graph(two), {'name': 'two', 'next': Graph(three)}]
Examining: {'name': 'two', 'next': Graph(three)}
add id 140688010362416
refer to: Graph(three)
# This id has been seen before, so there is a circular reference
Found a cycle to Graph(three):
  0: Graph(three)
  1: {'name': 'three', 'next': Graph(one)}
  2: Graph(one)
  3: {'name': 'one', 'next': Graph(two)}
  4: Graph(two)
  5: {'name': 'two', 'next': Graph(three)}
---------------
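For readers on Python 3, the same search can be packaged as a small function. This is only a sketch under a few assumptions: it uses collections.deque instead of the Queue module, the helper name find_cycle is made up, and it reports a cycle only when the walk gets back to the starting object. Depending on the CPython version, the instance dictionaries may or may not show up as intermediate links, so the example filters the chain down to Graph nodes.

```python
import gc
from collections import deque


class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __repr__(self):
        return "Graph(%s)" % self.name


def find_cycle(start):
    """Breadth-first search over gc.get_referents(); returns the chain of
    objects that leads back to `start`, or None if there is no such cycle."""
    seen = {id(start)}
    queue = deque([[start]])
    while queue:
        chain = queue.popleft()
        for ref in gc.get_referents(chain[-1]):
            if isinstance(ref, (str, type)):
                continue                  # skip names and classes
            if ref is start:
                return chain              # walked back to where we began
            if id(ref) not in seen:
                seen.add(id(ref))
                queue.append(chain + [ref])
    return None


one, two, three = Graph("one"), Graph("two"), Graph("three")
one.next, two.next, three.next = two, three, one

cycle = find_cycle(three)
nodes_in_cycle = [obj for obj in cycle if isinstance(obj, Graph)]
print(nodes_in_cycle)   # [Graph(three), Graph(one), Graph(two)]
```

Every chain stored in the queue keeps its objects alive, so the ids in seen cannot be reused by other objects while the search is running.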

Forcing garbage collection

Code example

import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

print

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Show the effect of garbage collection
for i in range(2):
    print 'Collecting %d ...' % i
    n = gc.collect()
    print 'Unreachable objects:', n
    print 'Remaining Garbage:',
    pprint.pprint(gc.garbage)
    print

Output

$ python gc_collect.py

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)

Collecting 0 ...
Unreachable objects: 6
Remaining Garbage:[]

Collecting 1 ...
Unreachable objects: 0
Remaining Garbage:[]

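In this example, the cycle is broken the first time garbage collection runs, because nothing refers to the Graph nodes except the nodes themselves. collect() returns the number of unreachable objects it found ("unreachable" here means that the three instances can no longer be accessed, since the names that referred to them were rebound to None). In this example the value is 6: the three class instances plus their instance attribute dictionaries.

If Graph has a __del__() method, however, the garbage collector (in the Python 2 used for these examples) cannot break the cycle.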

import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print '%s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Show the effect of garbage collection
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

Output

$ python gc_collect_with_del.py

Graph(one).next = Graph(two)
Graph(two).next = Graph(three)
Graph(three).next = Graph(one)
Collecting...
Unreachable objects: 6
Remaining Garbage:[Graph(one), Graph(two), Graph(three)]

Because more than one object in the cycle has a finalizer, the garbage collector cannot determine a safe order in which to finalize and free them, so to be safe it leaves them as they are.
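This limitation is specific to older interpreters. Since Python 3.4 (PEP 442), CPython can finalize and collect cycles whose objects define __del__(). The following sketch assumes Python 3.4 or later and default debug flags; it records the finalizer calls in a list (finalized is a name made up for the example) instead of printing them.

```python
import gc

finalized = []                 # names of the nodes whose __del__ has run


class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __del__(self):
        finalized.append(self.name)


one, two, three = Graph("one"), Graph("two"), Graph("three")
one.next, two.next, three.next = two, three, one
one = two = three = None       # only the cycle keeps the nodes alive now

gc.collect()
leftover = [obj for obj in gc.garbage if isinstance(obj, Graph)]
print(sorted(finalized))       # ['one', 'three', 'two']
print(leftover)                # []
```

The order in which the three finalizers run is not specified, which is why the names are sorted before printing. On the Python 2 interpreter used in this post, the nodes end up in gc.garbage instead.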

Once the cycle is broken, the Graph instances can be collected.

import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Collecting now keeps the objects as uncollectable
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

# Break the cycle
print
print 'Breaking the cycle'
gc.garbage[0].set_next(None)
print 'Removing references in gc.garbage'
del gc.garbage[:]

# Now the objects are removed
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

Output

$ python gc_collect_break_cycle.py

Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(three)
Linking nodes Graph(three).next = Graph(one)

Collecting...
Unreachable objects: 6
Remaining Garbage:[Graph(one), Graph(two), Graph(three)]

Breaking the cycle
Linking nodes Graph(one).next = None
Removing references in gc.garbage
Graph(two).__del__()
Graph(three).__del__()
Graph(one).__del__()

Collecting...
Unreachable objects: 0
Remaining Garbage:[]

Because gc.garbage still holds references to the objects found by the previous collection, it has to be cleared after the cycle is broken. Only then do the reference counts drop to zero so that the objects are finalized and freed.
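The underlying rule is that, once a cycle is broken, plain reference counting is enough and the objects are freed the moment their last reference disappears. Here is a minimal sketch of that, assuming CPython and Python 3, with the cycle collector switched off and a weak reference used as a probe.

```python
import gc
import weakref


class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None


gc.disable()                        # rely on reference counting only
one, two = Graph("one"), Graph("two")
one.next, two.next = two, one       # one <-> two
probe = weakref.ref(two)

one.next = None                     # break the cycle by hand
del two                             # the last reference to `two` is gone
freed_two = probe() is None

gc.enable()
print(freed_two)                    # True
```

No call to gc.collect() is needed here, because after one.next is cleared nothing but the local name keeps two alive.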

Finding references to objects that cannot be collected

This example creates a reference cycle, then walks through the Graph instances left in gc.garbage to find the parent nodes that refer to them and remove those references.

import gc
import pprint

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
three = Graph('three')
one.set_next(two)
two.set_next(three)
three.set_next(one)

# Remove references to the graph nodes in this module's namespace
one = two = three = None

# Collecting now keeps the objects as uncollectable
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

REFERRERS_TO_IGNORE = [ locals(), globals(), gc.garbage ]

def find_referring_graphs(obj):
    print 'Looking for references to %s' % repr(obj)
    referrers = (r for r in gc.get_referrers(obj)
                 if r not in REFERRERS_TO_IGNORE)
    for ref in referrers:
        if isinstance(ref, Graph):
            # A graph node
            yield ref
        elif isinstance(ref, dict):
            # An instance or other namespace dictionary
            for parent in find_referring_graphs(ref):
                yield parent

# Look for objects that refer to the objects that remain in
# gc.garbage.
print
print 'Clearing referrers:'
for obj in gc.garbage:
    for ref in find_referring_graphs(obj):
        ref.set_next(None)
        del ref # remove local reference so the node can be deleted
    del obj # remove local reference so the node can be deleted

# Clear references held by gc.garbage
print
print 'Clearing gc.garbage:'
del gc.garbage[:]

# Everything should have been freed this time
print
print 'Collecting...'
n = gc.collect()
print 'Unreachable objects:', n
print 'Remaining Garbage:',
pprint.pprint(gc.garbage)

Collection Thresholds and Generations

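The garbage collector maintains three lists of objects, each called a "generation". As the objects in a generation are examined, each one is either collected or moved into the next generation, until it reaches the stage where it is kept permanently.

How often the collector runs can be tuned, and it depends on the number of object allocations and deallocations. When the number of allocations minus the number of deallocations exceeds the threshold for generation 0, a collection is triggered; the thresholds for the older generations count how many times the younger generation has been collected. The current thresholds can be inspected with get_threshold().

import gc

print gc.get_threshold()

It returns the threshold for each generation.

$ python gc_get_threshold.py

(700, 10, 10)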
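The thresholds are easier to read next to the live counters that they are compared against, which gc.get_count() returns. A short sketch in Python 3 syntax; the actual numbers vary between interpreter versions and runs, so none are shown here.

```python
import gc

thresholds = gc.get_threshold()    # one threshold per generation
counts = gc.get_count()            # current counters for the same generations

for gen, (count, limit) in enumerate(zip(counts, thresholds)):
    print("generation %d: count=%d threshold=%d" % (gen, count, limit))
```

When the first counter passes the first threshold, a collection of generation 0 becomes due.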

The thresholds can be changed with set_threshold(). This example reads the generation 0 threshold from the command line and then allocates objects.

import gc
import pprint
import sys

try:
    threshold = int(sys.argv[1])
except (IndexError, ValueError, TypeError):
    print 'Missing or invalid threshold, using default'
    threshold = 5

class MyObj(object):
    def __init__(self, name):
        self.name = name
        print 'Created', self.name

gc.set_debug(gc.DEBUG_STATS)

gc.set_threshold(threshold, 1, 1)
print 'Thresholds:', gc.get_threshold()

print 'Clear the collector by forcing a run'
gc.collect()
print

print 'Creating objects'
objs = []
for i in range(10):
    objs.append(MyObj(i))

Different threshold values change how often collection runs, which is visible once debugging is turned on.

$ python -u gc_threshold.py 5

Thresholds: (5, 1, 1)
Clear the collector by forcing a run
gc: collecting generation 2...
gc: objects in each generation: 144 3163 0
gc: done, 0.0004s elapsed.

Creating objects
gc: collecting generation 0...
gc: objects in each generation: 7 0 3234
gc: done, 0.0000s elapsed.
Created 0
Created 1
Created 2
Created 3
Created 4
gc: collecting generation 0...
gc: objects in each generation: 6 4 3234
gc: done, 0.0000s elapsed.
Created 5
Created 6
Created 7
Created 8
Created 9
gc: collecting generation 2...
gc: objects in each generation: 5 6 3232
gc: done, 0.0004s elapsed.

A smaller threshold makes collection run more often.

$ python -u gc_threshold.py 2

Thresholds: (2, 1, 1)
Clear the collector by forcing a run
gc: collecting generation 2...
gc: objects in each generation: 144 3163 0
gc: done, 0.0004s elapsed.

Creating objects
gc: collecting generation 0...
gc: objects in each generation: 3 0 3234
gc: done, 0.0000s elapsed.
gc: collecting generation 0...
gc: objects in each generation: 4 3 3234
gc: done, 0.0000s elapsed.
Created 0
Created 1
gc: collecting generation 1...
gc: objects in each generation: 3 4 3234
gc: done, 0.0000s elapsed.
Created 2
Created 3
Created 4
gc: collecting generation 0...
gc: objects in each generation: 5 0 3239
gc: done, 0.0000s elapsed.
Created 5
Created 6
Created 7
gc: collecting generation 0...
gc: objects in each generation: 5 3 3239
gc: done, 0.0000s elapsed.
Created 8
Created 9
gc: collecting generation 2...
gc: objects in each generation: 2 6 3235
gc: done, 0.0004s elapsed.
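On Python 3.3 and later, the effect of a threshold can also be measured in code rather than read from stderr, by registering a function in gc.callbacks. The sketch below rests on a few assumptions: Python 3, the automatic collector enabled, and made-up names on_gc and runs. How many collections are triggered depends on the interpreter version, so only the fact that some happen is relied on.

```python
import gc

runs = []                                  # generations collected while we allocate


def on_gc(phase, info):
    # Entries in gc.callbacks are called with phase "start" or "stop"
    # and a dict that includes the generation being collected.
    if phase == "start":
        runs.append(info["generation"])


old_thresholds = gc.get_threshold()
gc.collect()                               # start from a clean slate
gc.callbacks.append(on_gc)
gc.set_threshold(5, 1, 1)                  # very aggressive, like the runs above

objs = [[i] for i in range(100)]           # lists are tracked by the collector

gc.callbacks.remove(on_gc)
gc.set_threshold(*old_thresholds)          # put the previous settings back
print("collections triggered:", len(runs))
```

Restoring the old thresholds at the end matters in a real program, since a threshold this low makes the collector run constantly.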

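Debugging

The gc module's set_debug() accepts flags that configure the garbage collector's debugging output. The debugging information is written to stderr.

The DEBUG_STATS flag turns on statistics reporting, showing the number of objects tracked in each generation and the time each collection took.

import gc

gc.set_debug(gc.DEBUG_STATS)

gc.collect()

This example prints two separate collection runs: one from the explicit call and one when the Python interpreter exits.

$ python gc_debug_stats.py

gc: collecting generation 2...
gc: objects in each generation: 667 2808 0
gc: done, 0.0011s elapsed.
gc: collecting generation 2...
gc: objects in each generation: 0 0 3164
gc: done, 0.0009s elapsed.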
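On Python 3.4 and later, similar per-generation statistics are available as data through gc.get_stats(), without parsing stderr. A short sketch assuming Python 3.4+; the variable name all_stats is made up for the example.

```python
import gc

gc.collect()                       # make sure at least one full collection has run
all_stats = gc.get_stats()         # one dict per generation

for gen, stats in enumerate(all_stats):
    # Each dict has the keys "collections", "collected" and "uncollectable".
    print(gen, stats["collections"], stats["collected"], stats["uncollectable"])
```

These counters are cumulative since interpreter start, so they complement the per-run output of DEBUG_STATS rather than replace it.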

Enabling DEBUG_COLLECTABLE and DEBUG_UNCOLLECTABLE makes the collector report, for every object it examines, whether that object can be collected. They can be combined with DEBUG_OBJECTS so that information is printed about each object as it is examined.

import gc

flags = (gc.DEBUG_COLLECTABLE |
         gc.DEBUG_UNCOLLECTABLE |
         gc.DEBUG_OBJECTS
         )
gc.set_debug(flags)

class Graph(object):
    def __init__(self, name):
        self.name = name
        self.next = None
        print 'Creating %s 0x%x (%s)' % (self.__class__.__name__, id(self), name)
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

class CleanupGraph(Graph):
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
one.set_next(two)
two.set_next(one)

# Construct another node that stands on its own
three = CleanupGraph('three')

# Construct a graph cycle with a finalizer
four = CleanupGraph('four')
five = CleanupGraph('five')
four.set_next(five)
five.set_next(four)

# Remove references to the graph nodes in this module's namespace
one = two = three = four = five = None

print

# Force a sweep
print 'Collecting'
gc.collect()
print 'Done'

The output shows that the Graph instances one and two form a reference cycle but can still be collected, because they have no finalizer of their own and their only incoming references come from other collectable objects. Although CleanupGraph has a finalizer, three is reclaimed as soon as its reference count drops to zero. In contrast, four and five form a reference cycle and cannot be freed.

$ python -u gc_debug_collectable_objects.py

Creating Graph 0x10045f750 (one)
Creating Graph 0x10045f790 (two)
Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(one)
Creating CleanupGraph 0x10045f7d0 (three)
Creating CleanupGraph 0x10045f810 (four)
Creating CleanupGraph 0x10045f850 (five)
Linking nodes CleanupGraph(four).next = CleanupGraph(five)
Linking nodes CleanupGraph(five).next = CleanupGraph(four)
CleanupGraph(three).__del__()

Collecting
gc: collectable <Graph 0x10045f750>
gc: collectable <Graph 0x10045f790>
gc: collectable <dict 0x100360a30>
gc: collectable <dict 0x100361cc0>
gc: uncollectable <CleanupGraph 0x10045f810>
gc: uncollectable <CleanupGraph 0x10045f850>
gc: uncollectable <dict 0x100361de0>
gc: uncollectable <dict 0x100362140>
Done

The DEBUG_INSTANCES flag can be used to track instances of old-style classes, the kind that do not inherit from object.

import gc

flags = (gc.DEBUG_COLLECTABLE |
         gc.DEBUG_UNCOLLECTABLE |
         gc.DEBUG_INSTANCES
         )
gc.set_debug(flags)

class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None
        print 'Creating %s 0x%x (%s)' % (self.__class__.__name__, id(self), name)
    def set_next(self, next):
        print 'Linking nodes %s.next = %s' % (self, next)
        self.next = next
    def __repr__(self):
        return '%s(%s)' % (self.__class__.__name__, self.name)

class CleanupGraph(Graph):
    def __del__(self):
        print '%s.__del__()' % self

# Construct a graph cycle
one = Graph('one')
two = Graph('two')
one.set_next(two)
two.set_next(one)

# Construct another node that stands on its own
three = CleanupGraph('three')

# Construct a graph cycle with a finalizer
four = CleanupGraph('four')
five = CleanupGraph('five')
four.set_next(five)
five.set_next(four)

# Remove references to the graph nodes in this module's namespace
one = two = three = four = five = None

print

# Force a sweep
print 'Collecting'
gc.collect()
print 'Done'

In this case, the dict objects that hold the instance attributes are not included in the output.

$ python -u gc_debug_collectable_instances.py

Creating Graph 0x1004687a0 (one)
Creating Graph 0x1004687e8 (two)
Linking nodes Graph(one).next = Graph(two)
Linking nodes Graph(two).next = Graph(one)
Creating CleanupGraph 0x100468878 (three)
Creating CleanupGraph 0x1004688c0 (four)
Creating CleanupGraph 0x100468908 (five)
Linking nodes CleanupGraph(four).next = CleanupGraph(five)
Linking nodes CleanupGraph(five).next = CleanupGraph(four)
CleanupGraph(three).__del__()

Collecting
gc: collectable <Graph instance at 0x1004687a0>
gc: collectable <Graph instance at 0x1004687e8>
gc: uncollectable <CleanupGraph instance at 0x1004688c0>
gc: uncollectable <CleanupGraph instance at 0x100468908>
Done
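DEBUG_OBJECTS and DEBUG_INSTANCES are Python 2 flags and are not part of the Python 3 gc module. One flag that does exist in both is DEBUG_SAVEALL, which keeps every unreachable object in gc.garbage instead of freeing it, so the objects can be inspected from code. A minimal sketch assuming Python 3; saved_names is a name made up for the example.

```python
import gc


class Graph:
    def __init__(self, name):
        self.name = name
        self.next = None

    def __repr__(self):
        return "Graph(%s)" % self.name


gc.collect()                        # clear out unrelated garbage first
old_flags = gc.get_debug()
gc.set_debug(gc.DEBUG_SAVEALL)      # keep unreachable objects in gc.garbage

one, two = Graph("one"), Graph("two")
one.next, two.next = two, one
one = two = None                    # the cycle is now unreachable

gc.collect()
saved_names = sorted(repr(obj) for obj in gc.garbage if isinstance(obj, Graph))
print(saved_names)                  # ['Graph(one)', 'Graph(two)']

gc.set_debug(old_flags)             # restore the previous settings
del gc.garbage[:]                   # drop the saved references so they can be freed
```

Clearing gc.garbage at the end is important: while the list holds the objects they stay alive, just like the uncollectable nodes earlier in this post.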

Main reference: http://pymotw.com/2/gc/index.html
