Module implementing Cyclops to stare at cycles in Python data structures.
Cyclops started life as a faster implementation of Lars Marius Garshol's Plumbo.py, but almost instantly diverged due to feature bloat.
The only object of real interest is class CycleFinder. First get an instance of that:
import Cyclops z = Cyclops.CycleFinder()
This creates a cycle-tracker with an empty "root set". Then tell z which objects you're curious about (more on that later).
After running your regular code, tell z to look for cycles:
Then various methods can be used to get useful reports. Which are *most* useful you'll just have to play with. If you have a few small cycles, show_cycles is very revealing; if you have many cycles, show_sccs and show_arcs *may* cut to the heart of the problem quickly.
There are two ways to add objects to the root set:
To see the current root set,
Finally, to empty the root set again:
Notes:
CHANGING THE SET OF INTERESTING TYPES
Methods of a CycleFinder object can be used to alter/query its view of the set of types find_cycles "chases":
CHANGING THE SET OF INTERESTING CYCLES
Sometimes there are "expected" cycles you would like to ignore; e.g., this can happen if you install a module-chaser, because there are cycles in Python's implementation you typically don't care about (for example, if your module imports sys, there's a cycle because sys.modules points back to your module!).
CASE STUDY
Below is the driver I used to track cycles in IDLE; it's a replacement for IDLE's idle.py.
At first it didn't install function or module chasers, or a cycle filter, and printed everything. This turned up a bunch of easy cases, and the show_sccs output was surprisingly revealing (all the cycles fell into a handful of SCCs, which corresponded to distinct cycle-creating IDLE subsystems). show_arcs was also very helpful in getting the big picture. show_cycles output was too voluminous to be helpful.
After those cycles were broken, the job got harder. A module chaser was added, which turned up another class of cycles, and then a function chaser turned up 100s more. Most of these involved expected cycles due to Python's implementation, so a cycle filter was installed to ignore cycles that didn't contain at least one class instance. The remaining cycles were isolated special cases, and only show_cycles output was of real use.
After all cycles were purged, IDLE was still leaking, so driver output was added to display the root-set objects still alive at the end. This turned up many cases where objects were living only because registration in the root set was keeping them alive. So a loop was added to the driver that repeatedly purges dead root-set objects and tries again. The __del__ methods of the purged roots caused other root objects to become trash, and after several iterations of this the output reaches a steady state.
IDLE is still leaking (as of 18-Jul-1999), but ever less so, and while Cyclops is no longer finding cycles, the driver's "live at the end" output is still the best clue I've got for guessing what to do next.
Interesting: At the start of this, it turned out that almost all cycles were reachable from places outside themselves. That is, they would not have been considered trash even if Python used a mark-&-sweep form of garbage collection. IDLE's problems, in large part inherited from Tkinter, are simply that "everything points to everything else". The good news is that Guido was able to clean most of this up just by adding reference- purging code to widgets' explicitly-called destroy() methods.
#! /usr/bin/env python import PyShell import Cyclops import types def mod_refs(x): return x.__dict__.values() def mod_tag(x, i): return "." + x.__dict__.keys()[i] def func_refs(x): return x.func_globals, x.func_defaults def func_tag(x, i): return (".func_globals", ".func_defaults")[i] def instance_filter(cycle): for obj, index in cycle: if type(obj) is types.InstanceType: return 1 return 0 # Note: PyShell binds ModifiedInterpreter.locals to __main__.__dict__, # and __main__ is *us*. So if we don't keep the CycleFinder instance # out of the global namespace here, z starts chewing over its own # instance attributes. Nothing breaks, but the output is at best # surprising. def hidez(): z = Cyclops.CycleFinder() z.chase_type(types.ModuleType, mod_refs, mod_tag) z.chase_type(types.FunctionType, func_refs, func_tag) z.install_cycle_filter(instance_filter) z.run(PyShell.main) z.find_cycles() z.show_stats() z.show_cycles() # z.show_cycleobjs() # z.show_sccs() z.show_arcs() while 1: print "*" * 70 print "non-cyclic root set objects:" sawitalready = {} numsurvivors = numdead = 0 for rc, cyclic, x in z.get_rootset(): if not sawitalready.has_key(id(x)): sawitalready[id(x)] = 1 if rc == 0: print "DEAD", numdead = numdead + 1 z.show_obj(x) elif not cyclic: numsurvivors = numsurvivors + 1 z.show_obj(x) x = None print numdead, "dead;", numsurvivors, "non-cycle & alive" if numdead == 0: break print "releasing dead root-set objects and trying again" z.find_cycles(1) z.show_stats() hidez()
_ClassType |
_InstanceType |
__version__ |
_class_refs |
_class_tag |
_default_refs_dispatcher |
_default_tag_dispatcher |
_quickrepr |
_repr |
_tuple_refs |
_tuple_tag |
CycleFinder | Class for finding cycles in Python data structures. |
X | Class for the test |
Y | Empty class for the test. |
_CyclopsRepr | A class to make short string representations of objects. |
_dict_refs | Function returning a sequence of all objects directly reachable from x. |
_dict_tag | Function returning a string describing how the reference was obtained from x. |
_instance_method_refs | Function returning a sequence of all objects directly reachable from x. |
_instance_method_tag | Function returning a string describing how the reference was obtained from x. |
_instance_refs | Function returning a sequence of all objects directly reachable from x. |
_instance_tag | Function returning a string describing how the reference was obtained from x. |
_list_refs | Function returning a sequence of all objects directly reachable from x. |
_list_tag | Function returning a string describing how the reference was obtained from x. |
_test | Function to test the module. |
typename_address_cmp | Method to do some special comparison based on type or name and address. |
Class for finding cycles in Python data structures.
See Cyclops module description for details.
None |
CycleFinder | Create a cycle finder with empty root set. |
__find_cycles | Private method to find cycles. |
__init_tracer | Private method to initialize the tracer. |
__obj2arcname | Private helper method. |
__reset | Private method to perform a reset. |
__study_cycle | Private method to analyze a cycle. |
_print_cycle | Method to print a cycle. |
_print_separator | Method to print a separator line. |
chase_type | t, t_refs_func, t_tag_func -> chase type t. |
clear | Remove all internal references to external objects. |
dont_chase_type | t -> remove type t from the set of chased types. |
find_cycles | purge_dead_roots=0 -> look for cycles, return true if found. |
get_chased_types | Return the set of chased types, as a list. |
get_rootset | Return the root set, as a list of (rc, cyclic?, obj) tuples. |
install_cycle_filter | filter_func=None -> a way to ignore "expected" cycles. |
register | obj -> add object obj to the root set. |
run | func, args=(), kwargs={} -> add objects to root set by magic. |
show_arcs | compare=None -> print unique arc types in cycles. |
show_cycleobjs | compare=typename_address_cmp -> print all objects in cycles. |
show_cycles | Print all cycles to stdout. |
show_obj | obj -> print short description of obj to sdtout. |
show_sccs | compare=typename_address_cmp -> print SCCs. |
show_stats | Print statistics for the last run of find_cycles. |
Create a cycle finder with empty root set.
Private method to find cycles.
Private method to initialize the tracer.
Private helper method.
Private method to perform a reset.
Private method to analyze a cycle.
Method to print a cycle.
Method to print a separator line.
t, t_refs_func, t_tag_func -> chase type t.
See module docstring for details.
Remove all internal references to external objects.
Empties the root set. Does not change the set of types this CycleFinder chases. Does not change the cycle filter in effect.
t -> remove type t from the set of chased types.
See module docstring for details.
purge_dead_roots=0 -> look for cycles, return true if found.
Identify all cycles among objects reachable from the root set. Return true iff at least one cycle is found.
This should be called before any of the show_XXX methods. Note that it's OK to add more objects to the root set and call it again, or to change the set of chased types, etc. find_cycles starts over from scratch each time it's called.
If optional arg purge_dead_roots is true (default false), before searching for cycles the root set is purged of all objects that the preceding run of find_cycles determined had a true refcount of 0 (that is, the root set objects that are still alive only because they appear in the root set). Purging these allows their finalizers to get invoked, which may allow a cascade of other objects (including cycles) to go away too.
See also method install_cycle_filter.
Return the set of chased types, as a list.
Return the root set, as a list of (rc, cyclic?, obj) tuples.
Should be called after find_cycles. For each object in the root set, returns a triple consisting of
filter_func=None -> a way to ignore "expected" cycles.
See module docstring for details. This is a callback function invoked whenever find_cycles() finds a cycle; the cycle is ignored unless the callback returns true.
obj -> add object obj to the root set.
func, args=(), kwargs={} -> add objects to root set by magic.
Function func is invoked with arguments args and keyword arguments kwargs. For the duration of the call, each class instance initialized by an __init__ call is automatically added to the root set. The result of invoking func is returned.
compare=None -> print unique arc types in cycles.
See module docstring for details. Briefly, each arc in a cycle is categorized by the type of the source node, the kind of arc (how we got from the source to the destination), and the type of the destination node. Each line of output consists of those three pieces of info preceded by the count of arcs of that kind. By default, the rows are sorted first by column 2 (source node type), then by columns 3 and 4.
compare=typename_address_cmp -> print all objects in cycles.
Prints to stdout. Each distinct object find_cycles found in a cycle is displayed. The set of objects found in cycles is first sorted by the optional "compare" function. By default, objects are sorted using their type name as the primary key and their storage address (id) as the secondary key; among objects of instance type, sorts by the instances' class names; among objects of class type, sorts by the classes' names.
Print all cycles to stdout.
obj -> print short description of obj to sdtout.
This is of the form
where
compare=typename_address_cmp -> print SCCs.
Prints to stdout. Shows the objects in cycles partitioned into strongly connected components (that is, the largest groupings possible such that each object in an SCC is reachable from every other object in that SCC). Within each SCC, objects are sorted as for show_cycleobjs.
Print statistics for the last run of find_cycles.
Class for the test
None |
X | |
__repr__ |
Empty class for the test.
None |
None |
A class to make short string representations of objects.
The std repr.repr sorts dicts by keys, but we refer to the keys via numeric subscript in cycle reports, so first a derived class that leaves dicts in raw order. Also, instances, instance methods and classes frequently appear in cycle reports, so avoid chopping their reprs at all. We're only trying to prevent massive expense for massive lists, tuples & dicts.
repr_instance_method |
_CyclopsRepr | Constructor |
repr_class | Method overriding the base class formatter. |
repr_dictionary | Method overriding the base dictionary formatter. |
repr_instance | Method overriding the base instance formatter. |
Constructor
Method overriding the base class formatter.
Method overriding the base dictionary formatter.
Method overriding the base instance formatter.
Function returning a sequence of all objects directly reachable from x.
Function returning a string describing how the reference was obtained from x.
Function returning a sequence of all objects directly reachable from x.
Function returning a string describing how the reference was obtained from x.
Function returning a sequence of all objects directly reachable from x.
Function returning a string describing how the reference was obtained from x.
Function returning a sequence of all objects directly reachable from x.
Function returning a string describing how the reference was obtained from x.
Function to test the module.
Method to do some special comparison based on type or name and address.