Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer.EngineColoringDeployer
- class Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer.EngineColoringDeployer(graph: onnx_graphsurgeon.ir.graph.Graph, deploymentPlatform: Deeploy.DeeployTypes.DeploymentPlatform, inputTypes: typing.Dict[str, typing.Type[Deeploy.AbstractDataTypes.Pointer]], loweringOptimizer: Deeploy.DeeployTypes.TopologyOptimizer, scheduler: typing.Callable[[onnx_graphsurgeon.ir.graph.Graph], typing.List[typing.List[onnx_graphsurgeon.ir.node.Node]] | typing.List[onnx_graphsurgeon.ir.node.Node]] = <function EngineColoringDeployer.<lambda>>, name: str = 'DeeployNetwork', default_channels_first: bool = True, deeployStateDir: str = 'DeeployState', engineMapperCls: typing.Type[Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper] = <class 'Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper'>)
Bases:
NetworkDeployer
Methods
- __init__(graph: onnx_graphsurgeon.ir.graph.Graph, deploymentPlatform: Deeploy.DeeployTypes.DeploymentPlatform, inputTypes: typing.Dict[str, typing.Type[Deeploy.AbstractDataTypes.Pointer]], loweringOptimizer: Deeploy.DeeployTypes.TopologyOptimizer, scheduler: typing.Callable[[onnx_graphsurgeon.ir.graph.Graph], typing.List[typing.List[onnx_graphsurgeon.ir.node.Node]] | typing.List[onnx_graphsurgeon.ir.node.Node]] = <function EngineColoringDeployer.<lambda>>, name: str = 'DeeployNetwork', default_channels_first: bool = True, deeployStateDir: str = 'DeeployState', engineMapperCls: typing.Type[Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper] = <class 'Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper'>)
Initialize a new NetworkDeployer
- Parameters:
graph (gs.Graph) – The raw neural network graph to be deployed, e.g. an output from Quantlib
deploymentPlatform (DeploymentPlatform) – The target deployment platform
inputTypes (Dict[str, Type[Pointer]]) – A mapping of global network inputs to Deeploy datatypes
loweringOptimizer (TopologyOptimizer) – A topology optimizer used to transform the network into a representation that can be mapped to NodeMappers
scheduler (Callable[[gs.Graph], Schedule]) – Method to topologically sort the graph into the order of execution
name (str) – Prefix to avoid name conflicts between Deeploy code and other code
default_channels_first (bool) – Whether data layout is CxHxW, i.e. channels are first, or HxWxC, i.e. channels are last
deeployStateDir (str) – Directory where intermediate states are saved
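The scheduler argument may be any callable that maps the graph to a valid execution order (a list of nodes, or a list of node lists for parallel steps). As a purely illustrative sketch, assuming a toy adjacency-dict graph in place of an onnx_graphsurgeon Graph, a Kahn-style topological sort looks like this:

```python
from collections import deque
from typing import Dict, List

def toy_scheduler(graph: Dict[str, List[str]]) -> List[str]:
    """Kahn's algorithm: return node names in a valid execution order.

    `graph` maps each node name to the names of nodes consuming its
    output -- a toy stand-in for gs.Graph, purely for illustration.
    """
    indegree = {n: 0 for n in graph}
    for consumers in graph.values():
        for c in consumers:
            indegree[c] += 1
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for c in graph[node]:
            indegree[c] -= 1
            if indegree[c] == 0:
                ready.append(c)
    if len(order) != len(graph):
        raise RuntimeError("graph contains a cycle")
    return order

# A diamond-shaped network: conv feeds two branches that join in an add.
print(toy_scheduler({
    "conv": ["relu", "pool"],
    "relu": ["add"],
    "pool": ["add"],
    "add": [],
}))  # ['conv', 'relu', 'pool', 'add']
```

A real scheduler would take a gs.Graph and return its Node objects; the default scheduler provided by the class already performs such an ordering.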
- __init__(graph, deploymentPlatform, ...[, ...]) – Initialize a new NetworkDeployer
- backEnd([verbose]) – API hook to generate code once kernel implementations are picked and tiling, memory allocation, and other low-level optimizations have been done
- bind() – Bind the entire network layer-by-layer
- codeTransform([verbose]) – Apply code transformations on every layer's execution block
- exportDeeployState(folderPath, fileName) – Export compressed network context and neural network graph
- frontEnd() – API hook to prepare the graph to be deployed and build the initial NetworkContext
- generateBufferAllocationCode() – Generate code to allocate space for the global input and output buffers of the network
- generateBufferDeAllocationCode() – Generate code to deallocate all global buffers
- generateBufferInitializationCode() – Generate forward-declaration code for all buffers used during inference
- generateEngineInitializationCode() – Generate initialization code for all compute engines
- generateFunction([verbose]) – Helper function to prepare deployment and return generated function code
- generateGlobalDefinitionCode() – Generate all global definition code for inference
- generateIOBufferInitializationCode() – Generate initialization code for global network inputs and outputs
- generateIncludeString() – Generate platform-dependent include directives
- generateInferenceCode() – Generate the actual inference function for the entire network
- generateInferenceInitializationCode() – Generate initialization code, including static memory allocation and other setup tasks
- getParameterSize() – Return the byte size of all static network parameters (weights, biases, ...)
- getTotalSize() – Return the total size of the network, consisting of all parameters and intermediate buffers
- importDeeployState(folderPath, fileName) – Override this container's graph and context with loaded compressed artifacts
- inputs() – Return a list of all VariableBuffers that are also global inputs of the network
- lower(graph) – Apply the lowering optimizer
- midEnd() – API hook used after finalizing kernel selection; hoist transient buffers and perform low-level code optimizations (e.g. tiling and static memory allocation)
- numberOfOps(verbose) – Return the total number of operations per network inference
- outputs() – Return a list of all VariableBuffers that are also global outputs of the network
- parse([default_channels_first]) – Parse the full network by iteratively exploring mapping and binding options with backtracking
- prepare([verbose]) – API hook to perform the entire deployment process to the point where generated code may be extracted
Attributes
- worstCaseBufferSize – Return the worst-case buffer size occupied by the network implementation
- lower(graph: Graph) Graph
Apply the lowering optimizer
- Parameters:
graph (gs.Graph) – Unmodified input neural network graph
- Returns:
Neural network graph that is deployable with the DeploymentPlatform’s Mapping
- Return type:
gs.Graph
- backEnd(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))
API hook to generate code once kernel implementations are picked and tiling, memory allocation, and other low-level optimizations have been done.
- Parameters:
verbose (CodeGenVerbosity) – Control verbosity of generated code
- bind() bool
Bind the entire network layer-by-layer
- Returns:
Return true if binding was successful
- Return type:
bool
- Raises:
RuntimeError – Raises a RuntimeError if the network has not been parsed or if there exists no valid binding
- codeTransform(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))
Apply code transformations on every layer’s execution block
- Parameters:
verbose (CodeGenVerbosity) – Control code generation verbosity
- Raises:
RuntimeError – Raises a RuntimeError if the entire network is not bound
- exportDeeployState(folderPath: str, fileName: str)
Export compressed network context and neural network graph
- Parameters:
folderPath (str) – path to directory where to save context and graph
fileName (str) – prefix to use when saving artifacts
- frontEnd()
API hook to prepare the graph to be deployed and build the initial NetworkContext
- generateBufferAllocationCode() str
Generates code to allocate space for the global input and output buffers of the network
- Returns:
Allocation code for global IO buffers
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateBufferDeAllocationCode() str
Generates code to deallocate all global buffers
- Returns:
Code to deallocate buffers
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateBufferInitializationCode() str
Generates forward-declaration code for all buffers used during inference
- Returns:
Returns forward-declaration code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateEngineInitializationCode() str
Generate initialization code for all compute engines
- Returns:
Initialization code for all engines
- Return type:
str
- generateFunction(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None)) str
Helper function to prepare deployment and return generated function code
- generateGlobalDefinitionCode() str
Generate all global definition code for inference
- Returns:
Global Definition code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateIOBufferInitializationCode() str
Generate initialization code for global network inputs and outputs
- Returns:
Initialization code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- generateIncludeString() str
Generate code to include platform-dependent includes
- Returns:
Include code
- Return type:
str
- generateInferenceCode() str
Generate the actual inference function for the entire network
- Returns:
The full inference method
- Return type:
str
- Raises:
ValueError – Raises a ValueError if network is not parsed and bound
- generateInferenceInitializationCode() str
Generate initialization code, including static memory allocation and other setup tasks
- Returns:
Initialization code
- Return type:
str
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- getParameterSize() int
Return the BYTE size of all static network parameters (weights, biases, parameters,…)
- Returns:
Size of all network parameters
- Return type:
int
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- getTotalSize() int
Returns total size of the network, consisting of all parameters and intermediate buffer size
- Returns:
Total network size
- Return type:
int
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
- importDeeployState(folderPath: str, fileName: str)
Override this container’s graph and context with loaded compressed artifacts
- Parameters:
folderPath (str) – Path to the artifact directory
fileName (str) – prefix of the saved artifacts
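In both exportDeeployState and importDeeployState, folderPath and fileName act as a target directory plus an artifact-name prefix. The round trip can be sketched with a toy analogue; the gzip/pickle serialization and the file naming below are illustrative assumptions, not Deeploy's actual on-disk format:

```python
import gzip
import os
import pickle
import tempfile

def export_state(folderPath: str, fileName: str, ctx: dict, graph: dict) -> None:
    """Toy analogue of exportDeeployState: compress and save context + graph."""
    os.makedirs(folderPath, exist_ok=True)
    for suffix, obj in (("ctx", ctx), ("graph", graph)):
        # One compressed artifact per object, sharing the fileName prefix.
        path = os.path.join(folderPath, f"{fileName}_{suffix}.pkl.gz")
        with gzip.open(path, "wb") as f:
            pickle.dump(obj, f)

def import_state(folderPath: str, fileName: str):
    """Toy analogue of importDeeployState: load what export_state saved."""
    out = []
    for suffix in ("ctx", "graph"):
        path = os.path.join(folderPath, f"{fileName}_{suffix}.pkl.gz")
        with gzip.open(path, "rb") as f:
            out.append(pickle.load(f))
    return tuple(out)

# Round-trip demo with toy stand-ins for the context and graph:
with tempfile.TemporaryDirectory() as d:
    export_state(d, "DeeployNetwork", {"buffers": 3}, {"nodes": ["conv", "add"]})
    ctx, graph = import_state(d, "DeeployNetwork")
    print(ctx, graph)  # {'buffers': 3} {'nodes': ['conv', 'add']}
```

The real methods operate on the deployer's NetworkContext and gs.Graph rather than plain dicts.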
- inputs() List[VariableBuffer]
Return a list of all VariableBuffers that are also global inputs of the network
- Returns:
Global inputs
- Return type:
List[VariableBuffer]
- midEnd()
API hook to be used after finalizing kernel selection; hoist transient buffers, and perform low-level code optimizations (e.g. tiling and static memory allocation)
- numberOfOps(verbose: bool) int
Returns the total number of operations per network inference
- Parameters:
verbose (bool) – Control whether the number of operations is printed to STDOUT for each operator
- Returns:
Number of operations (1 MAC = 2 Ops) per network inference
- Return type:
int
- Raises:
RuntimeError – Raises a RuntimeError if network is not parsed and bound
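The convention above counts one multiply-accumulate as two operations. A back-of-the-envelope sketch of this counting for common layers (hypothetical helper names, not part of the Deeploy API):

```python
def matmul_ops(m: int, n: int, k: int) -> int:
    """Ops for an (m x k) @ (k x n) matrix multiply, counting 1 MAC as 2 ops."""
    return 2 * m * n * k

def conv2d_ops(out_h: int, out_w: int, out_ch: int,
               in_ch: int, k_h: int, k_w: int) -> int:
    """Ops for a dense 2D convolution under the same 1 MAC = 2 ops convention."""
    macs = out_h * out_w * out_ch * in_ch * k_h * k_w
    return 2 * macs

# A 3x3 convolution producing a 16x16x8 feature map from 4 input channels:
# 16*16*8*4*9 = 73728 MACs, i.e. 147456 ops.
print(conv2d_ops(16, 16, 8, 4, 3, 3))  # 147456
```

numberOfOps sums such per-operator counts over the whole scheduled network.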
- outputs() List[VariableBuffer]
Return a list of all VariableBuffers that are also global outputs of the network
- Returns:
Global outputs
- Return type:
List[VariableBuffer]
- parse(default_channels_first: bool = True) bool
Parses the full network by iteratively exploring mapping and binding options with backtracking
- Parameters:
default_channels_first (bool) – Whether the default data layout is CxHxW or HxWxC
- Returns:
Returns a boolean to indicate whether parsing was successful
- Return type:
bool
- Raises:
RuntimeError – Raises a RuntimeError if backtracking was exhausted without finding a mapping solution
- prepare(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))
API hook to perform the entire deployment process to the point where generated code may be extracted
- Parameters:
verbose (CodeGenVerbosity) – Control verbosity of generated code
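prepare ties the hooks above together. Assuming it invokes them in the front-to-back order their descriptions imply (frontEnd, then midEnd, then backEnd), the control flow can be sketched with stubs that only record call order; this is an illustration of the hook sequence, not Deeploy's implementation:

```python
class DeployerSketch:
    """Illustrative stand-in for the NetworkDeployer prepare() pipeline.

    The hook names mirror the documented API; the bodies are stubs that
    merely record the order in which the hooks fire.
    """

    def __init__(self):
        self.calls = []

    def frontEnd(self):
        # Prepare the graph and build the initial NetworkContext.
        self.calls.append("frontEnd")

    def midEnd(self):
        # After kernel selection: hoist transient buffers, tile, allocate.
        self.calls.append("midEnd")

    def backEnd(self, verbose=None):
        # Emit code once all low-level optimizations are done.
        self.calls.append("backEnd")

    def prepare(self, verbose=None):
        self.frontEnd()
        self.midEnd()
        self.backEnd(verbose)
        return self.calls

print(DeployerSketch().prepare())  # ['frontEnd', 'midEnd', 'backEnd']
```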
- property worstCaseBufferSize
Return the worst-case buffer size occupied by the network implementation