Modulegraph2 internals

Warning

Everything documented in this file is private to the implementation of modulegraph and should not be relied on by users of the module.

Please file an issue if you need functionality documented here, and include the use case you need it for.

Package structure

The modulegraph2 package consists of a number of submodules that implement the behavior, with the code split into logical modules.

All submodules with names that start with an underscore are private, as are all names defined in those modules unless they are explicitly exported by the package __init__.py file.

Module “_ast_tools”: working with the AST for a module

Tools for working with the AST for a module. This currently just defines a function for extracting information about import statements from the AST.

modulegraph2._ast_tools.extract_ast_info(node: AST) → Iterator[ImportInfo]

Scan the AST for a module to look for import statements.

The AST scanner gives the most detailed information about import statements, and includes information about renames (import ... as ...), and the location of imports (global, in a function, in a try/except statement, in a conditional statement).

The scanner explicitly manages a work queue and will not recurse to avoid exhausting the stack.

Parameters

node – The AST for a module

Returns

An iterator that yields information about all located import statements
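
The work-queue approach described above can be illustrated with a small sketch built on the standard ast module. The function name scan_imports and the tuple layout it yields are hypothetical and chosen for illustration only; this is not the implementation of extract_ast_info, and it marks every part of a try statement as "in try/except" for simplicity.

```python
import ast
from collections import deque
from typing import Iterator, Tuple


def scan_imports(tree: ast.AST) -> Iterator[Tuple[str, int, bool, bool, bool]]:
    """Yield (module, level, in_function, in_conditional, in_tryexcept)."""
    # Each work item carries a node plus the context flags inherited from
    # its ancestors, so the scan never needs to recurse.
    queue = deque([(tree, False, False, False)])
    while queue:
        node, in_def, in_if, in_try = queue.popleft()
        if isinstance(node, ast.Import):
            for alias in node.names:
                yield alias.name, 0, in_def, in_if, in_try
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import name"
            yield node.module or "", node.level, in_def, in_if, in_try
        else:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                in_def = True
            elif isinstance(node, ast.If):
                in_if = True
            elif isinstance(node, ast.Try):
                in_try = True
            for child in ast.iter_child_nodes(node):
                queue.append((child, in_def, in_if, in_try))
```

Because the queue is processed first-in first-out, results arrive level by level rather than in source order; a caller that needs source order would have to sort on line numbers.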

Module “_bytecode_tools”: working with the bytecode for a module

Tools for working with the bytecode for a module. This currently just defines a function for extracting information about import statements and the use of global names.

modulegraph2._bytecode_tools._all_code_objects(code: code) → Iterator[Tuple[code, List[code]]]

Yield all code objects directly and indirectly referenced from code (including code itself).

This code explicitly manages a work queue and does not recurse, to avoid exhausting the stack.

Parameters

code – A code object

Returns

An iterator that yields tuples (child_code, parents), where parents is a list of all code objects in the reference chain from code.
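
A minimal sketch of such a traversal, assuming nested code objects are found through co_consts. The name iter_code_objects is hypothetical, and the ordering of parents (outermost first) is an assumption of this sketch, not something the description above specifies.

```python
from collections import deque
from types import CodeType
from typing import Iterator, List, Tuple


def iter_code_objects(code: CodeType) -> Iterator[Tuple[CodeType, List[CodeType]]]:
    """Yield (child_code, parents) for code and everything nested in it."""
    queue = deque([(code, [])])
    while queue:
        current, parents = queue.popleft()
        yield current, parents
        # Nested functions, classes and comprehensions are stored as
        # code objects among the constants of the enclosing code object.
        for const in current.co_consts:
            if isinstance(const, CodeType):
                queue.append((const, parents + [current]))
```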

modulegraph2._bytecode_tools._extract_single(code: code, is_function_code: bool, is_class_code: bool)

Extract import information from a single bytecode object (without recursing into child objects).

Parameters
  • code – The code object to process

  • is_function_code – True if this is a code object for a function or anything in a function.

  • is_class_code – True if this is the code object for a class

modulegraph2._bytecode_tools._is_code_for_function(code: code, parents: List[code], func_codes: Set[code])

Check if this is the code object for a function or inside a function

Parameters
  • code – The code object to check

  • parents – List of parents for this code object

  • func_codes – Set of code objects that are directly for functions

modulegraph2._bytecode_tools.extract_bytecode_info(code: code) → Tuple[List[ImportInfo], Set[str], Set[str]]

Extract interesting information from the code object for a module or script

Returns a tuple of three items: (1) a list of all imports, (2) the set of global names written, and (3) the set of global names read.
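
To illustrate the kind of information that can be read from bytecode, here is a rough sketch using the standard dis module. The function scan_single_code is hypothetical; it inspects a single code object only, returns plain module names instead of ImportInfo values, and makes no claim about which opcodes the real scanner looks at.

```python
import dis
from types import CodeType
from typing import List, Set, Tuple


def scan_single_code(code: CodeType) -> Tuple[List[str], Set[str], Set[str]]:
    """Return (imported modules, global names written, global names read)."""
    imports: List[str] = []
    written: Set[str] = set()
    read: Set[str] = set()
    for instr in dis.get_instructions(code):
        if instr.opname == "IMPORT_NAME":
            imports.append(instr.argval)
        elif instr.opname in ("STORE_NAME", "STORE_GLOBAL"):
            written.add(instr.argval)
        elif instr.opname in ("LOAD_NAME", "LOAD_GLOBAL"):
            read.add(instr.argval)
    return imports, written, read
```

Note that an import statement also stores the imported name, so the module name shows up in the set of written names as well.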

Module “_callback_list”: working with lists of callbacks

class modulegraph2._callback_list.CallbackList

A sequence of callbacks that are called in reverse order of insertion.

The class is a generic type with the type of the callback functions as the type parameter.

__call__(*args, **kwds)

Call every callback in the callback list with the given arguments. The callbacks are called in reverse order of insertion.

The result of the called functions is ignored.

Parameters
  • args – Positional arguments for the function

  • kwds – Keyword arguments for the function

_callbacks: List[T]

add(function: T)

Add a function to the callback list.

Parameters

function – Function to be added

clear()

Clear the callback list
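
The behavior described for this class can be sketched as follows. ReverseCallbackList is a hypothetical stand-in written for illustration; it omits the generic type parameter and is not the actual class.

```python
from typing import Any, Callable, List


class ReverseCallbackList:
    """Hypothetical stand-in: call callbacks newest-first, ignore results."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[..., Any]] = []

    def add(self, function: Callable[..., Any]) -> None:
        # Insert at the front so that iteration runs newest-first.
        self._callbacks.insert(0, function)

    def clear(self) -> None:
        self._callbacks.clear()

    def __call__(self, *args: Any, **kwds: Any) -> None:
        for function in self._callbacks:
            function(*args, **kwds)  # the return value is deliberately dropped
```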

class modulegraph2._callback_list.FirstNotNone

A sequence of callbacks that are called in reverse order of insertion, and where the first result is used.

The class is a generic type with the type of the callback functions as the type parameter.

__call__(*args, **kwds)

Call the functions in the callback list in reverse order of addition. Return the first result that is not None.

Parameters
  • args – Positional arguments for the callback

  • kwds – Keyword arguments for the callback

Returns

The first result that is not None, or None if every callback returns None

_callbacks: List[T]

add(function: T)

Add function to the callback list

Parameters

function – Function to add to the list

clear()

Clear the callback list
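
A comparable sketch for the "first result wins" variant. FirstResultList is a hypothetical name, and stopping at the first result that is not None (so that older callbacks are not called at all) is an assumption of this sketch.

```python
from typing import Any, Callable, List, Optional


class FirstResultList:
    """Hypothetical stand-in: newest-first, return the first non-None result."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[..., Any]] = []

    def add(self, function: Callable[..., Any]) -> None:
        self._callbacks.append(function)

    def clear(self) -> None:
        self._callbacks.clear()

    def __call__(self, *args: Any, **kwds: Any) -> Optional[Any]:
        # Walk the list backwards: the most recently added callback goes first.
        for function in reversed(self._callbacks):
            result = function(*args, **kwds)
            if result is not None:
                return result
        return None
```

This shape suits hook chains in which a later, more specific hook may override an earlier, more general one.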

modulegraph2._callback_list.as_T(function)

MyPy helper: cast function to type T.

This is used to give the __call__ attribute of CallbackList and FirstNotNone the right type.

Parameters

function – Function to cast to type T

Module “_depinfo”: information about a dependency

Public API

This module defines the following public API:

Private API

modulegraph2._depinfo.from_importinfo(import_info: ImportInfo, in_fromlist: bool, name: Optional[str])

Create a DependencyInfo instance from an ImportInfo and additional information.

Parameters
  • import_info – The ImportInfo found by the AST or bytecode scanners

  • in_fromlist – True if the import refers to a name in the name list of a from ... import ... statement.

  • name – Rename for the module (import ... as name), None when there was no as clause.

Module “_distributions”: Package distributions

This module contains functions and classes that are used to process information about package distributions (the stuff on PyPI).

Public API

This module defines the following public API:

Private API

modulegraph2._distributions._cached_distributions

Global variable used by modulegraph2.all_distributions() to cache the distributions found on sys.path. This is used both for performance and to ensure module graphs end up with one PyPIDistribution per distribution found.

modulegraph2._distributions.create_distribution(distribution_file: str) → PyPIDistribution

Create a distribution object for a given dist-info directory.

Parameters

distribution_file – Filename for a dist-info directory

Returns

A PyPIDistribution for distribution_file

modulegraph2._distributions.distribution_for_file(filename: Union[str, PathLike], path: Optional[Iterable[str]]) → Optional[PyPIDistribution]

Find a distribution for a given file, for installed distributions.

Parameters
  • filename – Filename to look for

  • path – Module search path (defaults to sys.path)

Returns

The distribution that contains filename, or None
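
The same kind of lookup can be approximated with the standard importlib.metadata module. This is a sketch under stated assumptions, not how distribution_for_file is implemented: the function find_distribution_name is hypothetical, it returns the distribution name instead of a PyPIDistribution, and it does no caching.

```python
import importlib.metadata
import os
import sys
from typing import Iterable, Optional


def find_distribution_name(filename: str, path: Optional[Iterable[str]] = None) -> Optional[str]:
    """Return the name of the installed distribution that lists filename."""
    target = os.path.realpath(filename)
    search_path = list(sys.path if path is None else path)
    for dist in importlib.metadata.distributions(path=search_path):
        # dist.files is None when the distribution has no file listing.
        for entry in dist.files or []:
            if os.path.realpath(str(entry.locate())) == target:
                return dist.metadata["Name"]
    return None
```

Scanning every file of every distribution on each call is slow, which is one reason to cache the discovered distributions.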

Module “_dotbuilder”: Outputting graphviz files

Export functions for creating Graphviz files.

Note

This module is fairly experimental at this point. At some time a generic version of this module will be added to the objectgraph package, with modulegraph2 specific functionality in this module.

modulegraph2._dotbuilder.export_to_dot(file: TextIO, graph: ModuleGraph, format_node: Callable[[NODE_TYPE], Dict[str, Union[str, int]]], format_edge: Callable[[NODE_TYPE, NODE_TYPE, Set[EDGE_TYPE]], Dict[str, Union[str, int]]], group_nodes: Callable[[ModuleGraph], Iterator[Tuple[str, str, Sequence[NODE_TYPE]]]]) → None

Write a dot (Graphviz) version of the graph to file.

The arguments “format_node” and “format_edge” specify callbacks to format nodes and edges that are generated.

These return a dict with the following keys (all optional): - …

modulegraph2._dotbuilder.format_attributes(callable, *args)

Format the results of callable in the format expected by Graphviz.
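
As an illustration of what such formatting might involve, the sketch below turns the dict returned by a callback into a Graphviz attribute list. The function format_attrs_sketch and its exact output layout are assumptions made for this example and are not taken from the module.

```python
from typing import Callable, Dict, Union

AttrValue = Union[str, int]


def format_attrs_sketch(callback: Callable[..., Dict[str, AttrValue]], *args: object) -> str:
    """Render the dict returned by callback(*args) as a Graphviz attribute list."""
    attributes = callback(*args)
    if not attributes:
        return ""
    parts = []
    for key, value in attributes.items():
        # Quote every value and escape embedded double quotes.
        text = str(value).replace('"', '\\"')
        parts.append(f'{key}="{text}"')
    return " [" + ", ".join(parts) + "]"
```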

Module “_graphbuilder”: Support functions for building dependency graphs

Tools for building the module graph

modulegraph2._graphbuilder._contains_datafiles(directory: Path)

Returns true iff directory contains package data.

This works both when directory is a path on a filesystem and when directory points into a zipped directory.

modulegraph2._graphbuilder.node_for_spec(spec: ModuleSpec, path: List[str]) → Tuple[BaseNode, Iterable[ImportInfo]]

Create the node for a ModuleSpec and locate related imports

modulegraph2._graphbuilder.relative_package(importing_module: BaseNode, import_level: int)

Module “_htmlbuilder”: Outputting HTML files

Support code for generating HTML output from a module graph

Note

This module is fairly experimental at this point. At some time a generic version of this module will be added to the objectgraph package, with modulegraph2 specific functionality in this module.

modulegraph2._htmlbuilder.export_to_html(file: TextIO, graph: ModuleGraph) → None

Write an HTML version of the graph to file.

The arguments “format_node” and “format_edge” specify callbacks to format nodes and edges that are generated.

These return a dict with the following keys (all optional): - …

Module “_implies”: Implied dependencies for stdlib

Definitions related to implied imports.

The implies in this file are based on manual inspection of the stdlib sources for CPython.

class modulegraph2._implies.Alias

Alias for a module name

Aliases are used in “implies” configuration to specify that a module name is actually an alias for some other name. An instance of this class is used to represent the name of the other module.

modulegraph2._implies.STDLIB_IMPLIES

Implies dictionary for the standard library.

modulegraph2._implies.STDLIB_PLATFORM_IMPLIES

Updates to the base STDLIB_IMPLIES for specific platforms.

modulegraph2._implies.STDLIB_VERSION_IMPLIES

Updates to the base STDLIB_IMPLIES for specific Python releases.
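
How a base dictionary and platform-specific updates could be combined is sketched below. Everything here is hypothetical: the AliasSketch type, the shape of the dictionaries, and the example entries are made up for illustration and do not reproduce the contents of STDLIB_IMPLIES or its companions.

```python
from typing import Dict, Sequence, Union


class AliasSketch(str):
    """Hypothetical marker type: the string value names the real module."""


ImpliesValue = Union[None, AliasSketch, Sequence[str]]

# Made-up example entries, not the real stdlib table.
BASE_IMPLIES: Dict[str, ImpliesValue] = {
    "_json": ["json.decoder"],
    "os.path": AliasSketch("posixpath"),
}
PLATFORM_IMPLIES: Dict[str, Dict[str, ImpliesValue]] = {
    "win32": {"os.path": AliasSketch("ntpath")},
}


def implies_for(platform: str) -> Dict[str, ImpliesValue]:
    """Return the base table with the updates for platform applied on top."""
    merged = dict(BASE_IMPLIES)
    merged.update(PLATFORM_IMPLIES.get(platform, {}))
    return merged
```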

Module “_importinfo”: Information about edges in the dependency graph

class modulegraph2._importinfo.ImportInfo(import_module: import_name, import_level: int, import_names: Set[import_name], star_import: bool, is_in_function: bool, is_in_conditional: bool, is_in_tryexcept: bool)

Information about an import statement found in the AST for a module or script

import_module

The name of the module being imported: import import_module or from import_module import ....

Type

modulegraph2._importinfo.import_name

import_level

Number of dots at the start of an imported name, 0 for global imports, 1 or higher for relative imports.

Type

int

import_names

The set of names imported with from import_module import ....

Type

Set[modulegraph2._importinfo.import_name]

star_import

True if this describes from import_module import *.

Type

bool

is_in_function

True if this describes an import statement in a function definition

Type

bool

is_in_conditional

True if this describes an import statement in either branch of a conditional statement.

Type

bool

is_in_tryexcept

True if this describes an import statement in the try or except blocks of a try statement.

Type

bool

property is_global

True if this describes an import statement at module level (and hence affects the set of globals in the module).

property is_optional

True if this describes an import statement that might be optional.
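
One plausible way to derive the two properties from the location flags is sketched below. The class ImportFlags is hypothetical and the exact rules are assumptions based on the descriptions above, not a copy of the implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImportFlags:
    """Hypothetical reduced record holding only the location flags."""

    is_in_function: bool
    is_in_conditional: bool
    is_in_tryexcept: bool

    @property
    def is_global(self) -> bool:
        # Assumption: only imports outside function bodies bind module globals.
        return not self.is_in_function

    @property
    def is_optional(self) -> bool:
        # Assumption: any of the three contexts makes the import possibly optional.
        return self.is_in_function or self.is_in_conditional or self.is_in_tryexcept
```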

modulegraph2._importinfo.create_importinfo(name: Tuple[str, Optional[str]], fromlist: Optional[Iterable[Tuple[str, Optional[str]]]], level: int, in_def: bool, in_if: bool, in_tryexcept: bool)

Create an ImportInfo instance.

Parameters
  • name – imported name

  • fromlist – The “from” list of an import statement (or None)

  • level – The import level, 0 for global imports and a positive number for relative imports.

  • in_def – Import statement inside a function definition

  • in_if – Import statement inside either branch of an if-statement

  • in_tryexcept – Import statement in the try or except blocks of a try statement.

Returns

A newly created ImportInfo instance.

class modulegraph2._importinfo.import_name

Value representing an imported name. The string value itself is the imported name, the asname attribute contains a rename from an “as” clause.

asname

Renamed name from an “as” clause

Type

Optional[str]
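
A string value that carries an extra attribute can be built by subclassing str. The sketch below shows the general technique; ImportedName is a hypothetical name and the constructor signature is an assumption.

```python
from typing import Optional


class ImportedName(str):
    """Hypothetical str subclass: the value is the name, asname the rename."""

    asname: Optional[str]

    def __new__(cls, name: str, asname: Optional[str] = None) -> "ImportedName":
        # str is immutable, so the value must be set in __new__.
        self = super().__new__(cls, name)
        self.asname = asname
        return self
```

Instances compare and hash like ordinary strings, so they can be stored in sets and looked up by plain name.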

Module “_modulegraph”: The main module graph and builder

This module contains the definition of the ModuleGraph class.

Public API

This module defines the following public API:

Private API

modulegraph2._modulegraph.ProcessingCallback

alias of Callable[[ModuleGraph, BaseNode], None]

modulegraph2._modulegraph.MissingCallback

alias of Callable[[ModuleGraph, Optional[BaseNode], str], Optional[BaseNode]]

modulegraph2._modulegraph.DEFAULT_DEPENDENCY

A frozen dataclass representing information about the dependency edge between two graph nodes.

modulegraph2._modulegraph.is_optional

True if the import appears to be optional

modulegraph2._modulegraph.is_global

True if the import affects global names in the module

modulegraph2._modulegraph.in_fromlist

True if the name is imported in the name list of a from ... import ... statement

modulegraph2._modulegraph.imported_as

Rename for this module (import ... as imported_as), None when there is no as clause.

The ModuleGraph class also contains private methods, documented below:

ModuleGraph._create_missing_module(importing_module: Optional[BaseNode], module_name: str) → BaseNode

Create a MissingModule node for ‘module_name’, after checking if one of the missing hooks can provide a node.

Parameters
  • importing_module – The node that triggered the import.

  • module_name – The name that cannot be resolved.

Returns

A new node, either the result of one of the hooks or a new MissingModule.

ModuleGraph._run_stack() → None

Process all items in the delayed work queue, until there is no more work.

ModuleGraph._implied_references(importing_module: Optional[BaseNode], module_name: str) → Optional[BaseNode]

Check implied actions for module_name.

Parameters
  • importing_module – Module triggering the import

  • module_name – The name that should be imported

Returns

A node if there are implied actions, or None otherwise.

ModuleGraph._load_module(importing_module: Optional[BaseNode], module_name: str) → BaseNode

Add a node for a specific module.

The module must not already be part of the graph, and module_name must be an absolute name (not a relative import).

This not only adds the loaded module to the graph, but also pushes functions onto the work stack that will process the import statements in module_name.

Parameters
  • importing_module – The node triggering this import

  • module_name – The name to be loaded

Returns

A new node for module_name

ModuleGraph._load_script(script_path: PathLike) → Script

Add a Script node to the graph.

The graph must not contain a script with script_path as its filesystem location. This also pushes work to the stack to process import statements in the script.

Parameters

script_path – Filesystem path for a script

Raises

ModuleGraph._process_import_list(node: BaseNode, imports: Iterable[ImportInfo]) → None

Schedule processing of all imports and the finalizer when that’s done.

Parameters
  • node – The node for which the import list is processed

  • imports – The imports to process

ModuleGraph._find_or_load_module(importing_module: Optional[BaseNode], module_name: str, *, link_missing_to_parent: bool = True) → BaseNode

Locate the node for module_name, creating a new node if necessary.

Parameters
  • importing_module – The node that triggers this import

  • module_name – The name to load

  • link_missing_to_parent – If true the function will link a MissingModule node for module_name to importing_module.

Returns

The node for module_name.

ModuleGraph._process_import(importing_module: BaseNode, import_info: ImportInfo) → None

Process a single import.

This locates the node for the imported name (possibly pushing work to the stack to process its imports), and schedules a call to process the name list of the import statement when the imported name is fully processed.

Parameters
  • importing_module – The node that this import pertains to.

  • import_info – Information about an import

ModuleGraph._process_namelist(importing_module: Union[Module, Package], imported_module: BaseNode, import_info: ImportInfo)

Process the name list for an import statement (‘from … import name_list’).

If imported_module is a package, any imported names are assumed to be modules unless there is clear evidence to the contrary. For regular modules, any imported names are assumed to refer to data (and won’t result in MissingModule nodes in the graph when names cannot be located).

Parameters
  • importing_module – The module triggering the import

  • imported_module – The imported module

  • import_info – Information about the import

Building the graph

The graph building algorithm explicitly manages a work stack to avoid exhausting the Python call stack. It is a stack and not a queue because imports need to be processed depth first to be able to process from ... import ... statements correctly.

The diagram below is an overview of the important method interactions while building a dependency graph.

The call graph implies recursion through _find_or_load_module, but that recursion is managed explicitly through a work stack that is processed iteratively. The intention is to avoid exhausting the stack with convoluted code bases.
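
The idea of replacing recursion with an explicit work stack can be sketched with a toy example. The IMPORTS table, the build_order function, and the use of closures as work items are all assumptions made for this illustration; they are not the data structures used by ModuleGraph.

```python
from typing import Callable, Dict, List

# Hypothetical toy data: module name -> names it imports.
IMPORTS: Dict[str, List[str]] = {
    "app": ["pkg", "util"],
    "pkg": ["pkg.core"],
    "pkg.core": [],
    "util": [],
}


def build_order(root: str) -> List[str]:
    """Return the order in which modules finish processing."""
    finished: List[str] = []
    seen = {root}
    stack: List[Callable[[], None]] = []

    def load(name: str) -> None:
        # Push the finalizer first so it runs after all imports of name.
        stack.append(lambda: finished.append(name))
        for imported in reversed(IMPORTS[name]):
            if imported not in seen:
                seen.add(imported)
                stack.append(lambda imported=imported: load(imported))

    load(root)
    while stack:
        stack.pop()()  # run the most recently pushed work item
    return finished
```

For the toy data, build_order("app") finishes pkg.core before pkg, and pkg before app: a module is finalized only after everything it imports has been processed, and load is never called recursively.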

Module “_nodes”: Definition of graph nodes

Public API

This module defines the following public API:

Private API

This module does not have a private API.

Module “_swig_support”: Hooks for dealing with SWIG

Support code that deals with SWIG.

modulegraph2._swig_support.swig_missing_hook(graph: ModuleGraph, importing_module: Optional[BaseNode], missing_name: str) → Optional[BaseNode]

Hook function to be used with ModuleGraph.add_missing_hook.

This hook detects when a module in a package uses SWIG to load an extension module in the same package using Python 2-style implicit relative imports (that don’t work in Python 3).

Adds this extension module as a global module to the graph, which corresponds to the Python 3 semantics of the import statement used in the code.

Module “_utilities”: Utility functions

Some useful utility functions.

Public API

This module defines the following public API:

Private API

modulegraph2._utilities.split_package(name: str) → Tuple[Optional[str], str]

Return (package, name) given a fully qualified module name.

package is None for top-level modules.
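
The described result can be reproduced for absolute names with str.rpartition. The function below is a hypothetical sketch; it makes no attempt to handle relative names with leading dots and may differ from split_package in such edge cases.

```python
from typing import Optional, Tuple


def split_package_sketch(name: str) -> Tuple[Optional[str], str]:
    """Split an absolute dotted name into (package, last component)."""
    package, _, module = name.rpartition(".")
    # rpartition yields an empty package when there is no dot at all.
    return (package or None, module)
```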

class modulegraph2._utilities.FakePackage(path: List[str])

Instances of these can be used to represent a fake package in sys.modules.

Used as a workaround to fetch information about modules in packages when the package itself cannot be imported for some reason (for example due to having a SyntaxError in the module __init__.py file).
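
The workaround relies on the import system consulting sys.modules before importing a parent package. The sketch below shows the technique with the standard importlib.util.find_spec function; FakePackageSketch and find_submodule_spec are hypothetical names, and the real class may carry more than a __path__ attribute.

```python
import importlib.util
import sys
from typing import List


class FakePackageSketch:
    """Hypothetical stand-in: an object with just a __path__ attribute."""

    def __init__(self, path: List[str]) -> None:
        self.__path__ = path


def find_submodule_spec(package_name: str, package_dirs: List[str], submodule: str):
    """Find the spec for package_name.submodule without running __init__.py."""
    saved = sys.modules.get(package_name)
    sys.modules[package_name] = FakePackageSketch(package_dirs)  # type: ignore[assignment]
    try:
        return importlib.util.find_spec(f"{package_name}.{submodule}")
    finally:
        # Restore sys.modules so the fake entry does not leak.
        if saved is None:
            del sys.modules[package_name]
        else:
            sys.modules[package_name] = saved
```

Because the fake entry is already present in sys.modules, find_spec uses its __path__ to locate the submodule and never executes the package's own __init__.py.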