Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer.EngineColoringDeployer
- class Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer.EngineColoringDeployer(graph: onnx_graphsurgeon.ir.graph.Graph, deploymentPlatform: Deeploy.DeeployTypes.DeploymentPlatform, inputTypes: typing.Dict[str, typing.Type[Deeploy.AbstractDataTypes.Pointer]], loweringOptimizer: Deeploy.DeeployTypes.TopologyOptimizer, scheduler: typing.Callable[[onnx_graphsurgeon.ir.graph.Graph], typing.List[typing.List[onnx_graphsurgeon.ir.node.Node]] | typing.List[onnx_graphsurgeon.ir.node.Node]] = <function EngineColoringDeployer.<lambda>>, name: str = 'DeeployNetwork', default_channels_first: bool = True, deeployStateDir: str = 'DeeployState', engineMapperCls: typing.Type[Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper] = <class 'Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper'>)
Bases:
NetworkDeployer
Methods
- __init__(graph: onnx_graphsurgeon.ir.graph.Graph, deploymentPlatform: Deeploy.DeeployTypes.DeploymentPlatform, inputTypes: typing.Dict[str, typing.Type[Deeploy.AbstractDataTypes.Pointer]], loweringOptimizer: Deeploy.DeeployTypes.TopologyOptimizer, scheduler: typing.Callable[[onnx_graphsurgeon.ir.graph.Graph], typing.List[typing.List[onnx_graphsurgeon.ir.node.Node]] | typing.List[onnx_graphsurgeon.ir.node.Node]] = <function EngineColoringDeployer.<lambda>>, name: str = 'DeeployNetwork', default_channels_first: bool = True, deeployStateDir: str = 'DeeployState', engineMapperCls: typing.Type[Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper] = <class 'Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper'>)
Initialize a new NetworkDeployer
- Parameters:
graph (gs.Graph) – The raw neural network graph to be deployed, e.g. an output from Quantlib
deploymentPlatform (DeploymentPlatform) – The target deployment platform
inputTypes (Dict[str, Type[Pointer]]) – A mapping of global network inputs to Deeploy datatypes
loweringOptimizer (TopologyOptimizer) – A topology optimizer used to transform the network into a representation that can be mapped to NodeMappers
scheduler (Callable[[gs.Graph], Schedule]) – Method to topologically sort the graph into the order of execution
name (str) – Prefix to avoid name conflicts between Deeploy code and other code
default_channels_first (bool) – Whether data layout is CxHxW, i.e. channels are first, or HxWxC, i.e. channels are last
deeployStateDir (str) – Directory where intermediate states are saved
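The scheduler argument may be any callable that maps the graph to a valid execution order (a list of nodes, or a list of node lists for parallel steps). As a purely illustrative sketch, assuming a toy adjacency-dict graph in place of an onnx_graphsurgeon Graph, a Kahn-style topological sort looks like this:

```python
from collections import deque
from typing import Dict, List

def toy_scheduler(graph: Dict[str, List[str]]) -> List[str]:
    """Kahn's algorithm: return node names in a valid execution order.

    `graph` maps each node name to the names of nodes consuming its
    output -- a toy stand-in for gs.Graph, purely for illustration.
    """
    indegree = {n: 0 for n in graph}
    for consumers in graph.values():
        for c in consumers:
            indegree[c] += 1
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for c in graph[node]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    if len(order) != len(graph):
        raise RuntimeError("graph contains a cycle")
    return order

# A diamond-shaped network: conv feeds two branches that join in an add.
print(toy_scheduler({
    "conv": ["relu", "pool"],
    "relu": ["add"],
    "pool": ["add"],
    "add": [],
}))  # ['conv', 'relu', 'pool', 'add']
```

A real scheduler would take a gs.Graph and return its Node objects; the default scheduler provided by the class already performs such an ordering.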
- __init__(graph, deploymentPlatform, ...[, ...]) – Initialize a new NetworkDeployer
- backEnd([verbose]) – API hook to generate code once kernel implementations are picked and tiling, memory allocation, and other low-level optimizations have been done
- bind() – Bind the entire network layer-by-layer
- codeTransform([verbose]) – Apply code transformations on every layer's execution block
- exportDeeployState(folderPath, fileName) – Export compressed network context and neural network graph
- frontEnd() – API hook to prepare the graph to be deployed and build the initial NetworkContext
- generateBufferAllocationCode() – Generate code to allocate space for the global input and output buffers of the network
- generateBufferDeAllocationCode() – Generate code to deallocate all global buffers
- generateBufferInitializationCode() – Generate forward-declaration code for all buffers used during inference
- generateEngineInitializationCode() – Generate initialization code for all compute engines
- generateFunction([verbose]) – Helper function to prepare deployment and return generated function code
- generateGlobalDefinitionCode() – Generate all global definition code for inference
- generateIOBufferInitializationCode() – Generate initialization code for global network inputs and outputs
- generateIncludeString() – Generate platform-dependent include directives
- generateInferenceCode() – Generate the actual inference function for the entire network
- generateInferenceInitializationCode() – Generate initialization code, including static memory allocation and other setup tasks
- getParameterSize() – Return the byte size of all static network parameters (weights, biases, ...)
- getTotalSize() – Return the total size of the network, consisting of all parameters and intermediate buffers
- importDeeployState(folderPath, fileName) – Override this container's graph and context with loaded compressed artifacts
- inputs() – Return a list of all VariableBuffers that are also global inputs of the network
- lower(graph) – Apply the lowering optimizer
- midEnd() – API hook used after finalizing kernel selection; hoist transient buffers and perform low-level code optimizations (e.g. tiling and static memory allocation)
- numberOfOps(verbose) – Return the total number of operations per network inference
- outputs() – Return a list of all VariableBuffers that are also global outputs of the network
- parse([default_channels_first]) – Parse the full network by iteratively exploring mapping and binding options with backtracking
- prepare([verbose]) – API hook to perform the entire deployment process to the point where generated code may be extracted
Attributes
- worstCaseBufferSize – Return the worst-case buffer size occupied by the network implementation
- lower(graph: Graph) Graph
Apply the lowering optimizer
- Parameters:
graph (gs.Graph) – Unmodified input neural network graph
- Returns:
Neural network graph that is deployable with the DeploymentPlatform’s Mapping
- Return type:
gs.Graph
- backEnd(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))
API hook to generate code once kernel implementations are picked and tiling, memory allocation, and other low-level optimizations have been done.
- Parameters:
verbose (CodeGenVerbosity) – Control verbosity of generated code
- bind() bool
Bind the entire network layer-by-layer
- Returns:
Return true if binding was successful
- Return type:
bool
- Raises:
RuntimeError – Raises a RuntimeError if the network has not been parsed or if there exists no valid binding
- codeTransform(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))
Apply code transformations on every layer’s execution block
- Parameters:
verbose (CodeGenVerbosity) – Control code generation verbosity
- Raises:
RuntimeError – Raises a RuntimeError if the entire network is not bound
- exportDeeployState(folderPath: str, fileName: str)
Export compressed network context and neural network graph
- Parameters:
folderPath (str) – path to directory where to save context and graph
fileName (str) – prefix to use when saving artifacts
- frontEnd()
API hook to prepare the graph to be deployed and build the initial NetworkContext
- generateBufferAllocationCode() str
Generates code to allocate space for the global input and output buffers of the network
- Returns:
Allocation code for global IO buffers
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateBufferDeAllocationCode() str
Generates code to deallocate all global buffers
- Returns:
Code to deallocate buffers
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateBufferInitializationCode() str
Generates forward-declaration code for all buffers used during inference
- Returns:
Returns forward-declaration code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateEngineInitializationCode() str
Generate initialization code for all compute engines
- Returns:
Initialization code for all engines
- Return type:
str
- generateFunction(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None)) str
Helper function to prepare deployment and return generated function code
- generateGlobalDefinitionCode() str
Generate all global definition code for inference
- Returns:
Global Definition code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateIOBufferInitializationCode() str
Generate initialization code for global network inputs and outputs
- Returns:
Initialization code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateIncludeString() str
Generate code to include platform-dependent includes
- Returns:
Include code
- Return type:
str
- generateInferenceCode() str
Generate the actual inference function for the entire network
- Returns:
The full inference method
- Return type:
str
- Raises:
ValueError – Raises a ValueError if network is not parsed and bound
- generateInferenceInitializationCode() str
Generate initialization code, including static memory allocation and other setup tasks
- Returns:
Initialization code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- getParameterSize() int
Return the BYTE size of all static network parameters (weights, biases, parameters,…)
- Returns:
Size of all network parameters
- Return type:
int
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- getTotalSize() int
Returns total size of the network, consisting of all parameters and intermediate buffer size
- Returns:
Total network size
- Return type:
int
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- importDeeployState(folderPath: str, fileName: str)
Override this container’s graph and context with loaded compressed artifacts
- Parameters:
folderPath (str) – Path to the artifact directory
fileName (str) – prefix of the saved artifacts
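In both exportDeeployState and importDeeployState, folderPath and fileName act as a target directory plus an artifact-name prefix. The round trip can be sketched with a toy analogue; the gzip/pickle serialization and the file naming below are illustrative assumptions, not Deeploy's actual on-disk format:

```python
import gzip
import os
import pickle
import tempfile

def export_state(folderPath: str, fileName: str, ctx: dict, graph: dict) -> None:
    """Toy analogue of exportDeeployState: compress and save context + graph."""
    os.makedirs(folderPath, exist_ok=True)
    for suffix, obj in (("ctx", ctx), ("graph", graph)):
        # One compressed artifact per object, sharing the fileName prefix.
        path = os.path.join(folderPath, f"{fileName}_{suffix}.pkl.gz")
        with gzip.open(path, "wb") as f:
            pickle.dump(obj, f)

def import_state(folderPath: str, fileName: str):
    """Toy analogue of importDeeployState: load what export_state saved."""
    out = []
    for suffix in ("ctx", "graph"):
        path = os.path.join(folderPath, f"{fileName}_{suffix}.pkl.gz")
        with gzip.open(path, "rb") as f:
            out.append(pickle.load(f))
    return tuple(out)

# Round-trip demo with toy stand-ins for the context and graph:
with tempfile.TemporaryDirectory() as d:
    export_state(d, "DeeployNetwork", {"buffers": 3}, {"nodes": ["conv", "add"]})
    ctx, graph = import_state(d, "DeeployNetwork")
    print(ctx, graph)  # {'buffers': 3} {'nodes': ['conv', 'add']}
```

The real methods operate on the deployer's NetworkContext and gs.Graph rather than plain dicts.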
- inputs() List[VariableBuffer]
Return a list of all VariableBuffers that are also global inputs of the network
- Returns:
Global inputs
- Return type:
List[VariableBuffer]
- midEnd()
API hook to be used after finalizing kernel selection; hoist transient buffers, and perform low-level code optimizations (e.g. tiling and static memory allocation)
- numberOfOps(verbose: bool) int
Returns the total number of operations per network inference
- Parameters:
verbose (bool) – Control whether the number of operations is printed to STDOUT for each operator
- Returns:
Number of operations (1 MAC = 2 Ops) per network inference
- Return type:
int
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
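The convention above counts one multiply-accumulate as two operations. A back-of-the-envelope sketch of this counting for common layers (hypothetical helper names, not part of the Deeploy API):

```python
def matmul_ops(m: int, n: int, k: int) -> int:
    """Ops for an (m x k) @ (k x n) matrix multiply, counting 1 MAC as 2 ops."""
    return 2 * m * n * k

def conv2d_ops(out_h: int, out_w: int, out_ch: int,
               in_ch: int, k_h: int, k_w: int) -> int:
    """Ops for a dense 2D convolution under the same 1 MAC = 2 ops convention."""
    macs = out_h * out_w * out_ch * in_ch * k_h * k_w
    return 2 * macs

# A 3x3 convolution producing a 16x16x8 feature map from 4 input channels:
# 16*16*8*4*9 = 73728 MACs, i.e. 147456 ops.
print(conv2d_ops(16, 16, 8, 4, 3, 3))  # 147456
```

numberOfOps sums such per-operator counts over the whole scheduled network.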
- outputs() List[VariableBuffer]
Return a list of all VariableBuffers that are also global outputs of the network
- Returns:
Global outputs
- Return type:
List[VariableBuffer]
- parse(default_channels_first: bool = True) bool
Parses the full network by iteratively exploring mapping and binding options with backtracking
- Parameters:
default_channels_first (bool) – Whether the default data layout is CxHxW or HxWxC
- Returns:
Returns a boolean to indicate whether parsing was successful
- Return type:
bool
- Raises:
RuntimeError – Raises a RuntimeError if backtracking was exhausted without finding a mapping solution
- prepare(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))
API hook to perform the entire deployment process to the point where generated code may be extracted
- Parameters:
verbose (CodeGenVerbosity) – Control verbosity of generated code
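prepare ties the hooks above together. Assuming it invokes them in the front-to-back order their descriptions imply (frontEnd, then midEnd, then backEnd), the control flow can be sketched with stubs that only record call order; this is an illustration of the hook sequence, not Deeploy's implementation:

```python
class DeployerSketch:
    """Illustrative stand-in for the NetworkDeployer prepare() pipeline.

    The hook names mirror the documented API; the bodies are stubs that
    merely record the order in which the hooks fire.
    """

    def __init__(self):
        self.calls = []

    def frontEnd(self):
        # Prepare the graph and build the initial NetworkContext.
        self.calls.append("frontEnd")

    def midEnd(self):
        # After kernel selection: hoist transient buffers, tile, allocate.
        self.calls.append("midEnd")

    def backEnd(self, verbose=None):
        # Emit code once all low-level optimizations are done.
        self.calls.append("backEnd")

    def prepare(self, verbose=None):
        self.frontEnd()
        self.midEnd()
        self.backEnd(verbose)
        return self.calls

print(DeployerSketch().prepare())  # ['frontEnd', 'midEnd', 'backEnd']
```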
- property worstCaseBufferSize
Return the worst-case buffer size occupied by the network implementation