Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer.EngineColoringDeployerWrapper

class Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer.EngineColoringDeployerWrapper(deployer: Deeploy.DeeployTypes.NetworkDeployer, engineMapperCls: Type[Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper] = EngineMapper)

Bases: EngineColoringDeployer, NetworkDeployerWrapper

Methods

__init__(deployer: Deeploy.DeeployTypes.NetworkDeployer, engineMapperCls: Type[Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses.EngineMapper] = EngineMapper) → None

Initialize a new NetworkDeployer

Parameters:
  • graph (gs.Graph) – The raw neural network graph to be deployed, e.g. an output from Quantlib

  • deploymentPlatform (DeploymentPlatform) – The target deployment platform

  • inputTypes (Dict[str, Type[Pointer]]) – A mapping of global network inputs to Deeploy datatypes

  • loweringOptimizer (TopologyOptimizer) – A topology optimizer used to transform the network into a representation that can be mapped to NodeMappers

  • scheduler (Callable[[gs.Graph], Schedule]) – Method to topologically sort the graph into the order of execution

  • name (str) – Prefix to avoid name conflicts between Deeploy code and other code

  • default_channels_first (bool) – Whether data layout is CxHxW, i.e. channels are first, or HxWxC, i.e. channels are last

  • deeployStateDir (str) – Directory where intermediate states are saved
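
Example: a minimal construction sketch. Only EngineColoringDeployerWrapper, EngineMapper, and the constructor signature above come from this page; the inner platform-specific deployer and the helper that builds it are assumptions.

    from Deeploy.DeeployTypes import NetworkDeployer
    from Deeploy.EngineExtension.NetworkDeployers.EngineColoringDeployer import EngineColoringDeployerWrapper
    from Deeploy.EngineExtension.OptimizationPasses.TopologyOptimizationPasses.EngineColoringPasses import EngineMapper

    # `buildPlatformDeployer` is a hypothetical helper that constructs a
    # platform-specific NetworkDeployer with the parameters listed above.
    innerDeployer: NetworkDeployer = buildPlatformDeployer()

    # Wrap the deployer so that engine-coloring passes run during deployment.
    deployer = EngineColoringDeployerWrapper(innerDeployer, engineMapperCls=EngineMapper)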

__init__(deployer[, engineMapperCls])

Initialize a new NetworkDeployer

backEnd([verbose])

API hook to generate code once kernel implementations are picked and tiling, memory allocation, and other low-level optimizations have been done.

bind()

Bind the entire network layer-by-layer

codeTransform([verbose])

Apply code transformations on every layer's execution block

exportDeeployState(folderPath, fileName)

Export compressed network context and neural network graph

frontEnd()

API hook to prepare the graph to be deployed and build the initial NetworkContext

generateBufferAllocationCode()

Generates code to allocate space for the global input and output buffers of the network

generateBufferDeAllocationCode()

Generates code to deallocate all global buffers

generateBufferInitializationCode()

Generates forward-declaration code for all buffers used during inference

generateEngineInitializationCode()

Generate initialization code for all compute engines

generateFunction([verbose])

Helper function to prepare deployment and return generated function code

generateGlobalDefinitionCode()

Generate all global definition code for inference

generateIOBufferInitializationCode()

Generate initialization code for global network inputs and outputs

generateIncludeString()

Generate the platform-dependent include statements

generateInferenceCode()

Generate the actual inference function for the entire network

generateInferenceInitializationCode()

Generate initialization code, including static memory allocation and other setup tasks

getParameterSize()

Return the BYTE size of all static network parameters (weights, biases, parameters,...)

getTotalSize()

Returns the total size of the network, consisting of all parameters and the intermediate buffer size

importDeeployState(folderPath, fileName)

Override this container's graph and context with loaded compressed artifacts

inputs()

Return a list of all VariableBuffers that are also global inputs of the network

lower(graph)

Apply the lowering optimizer to the graph

midEnd()

API hook to be used after kernel selection is finalized; hoists transient buffers and performs low-level code optimizations (e.g. tiling and static memory allocation).

numberOfOps(verbose)

Returns the total number of operations per network inference

outputs()

Return a list of all VariableBuffers that are also global outputs of the network

parse([default_channels_first])

Parses the full network by iteratively exploring mapping and binding options with backtracking

prepare([verbose])

API hook to perform the entire deployment process to the point where generated code may be extracted

Attributes

bound

parsed

prepared

transformed

worstCaseBufferSize

Return the worst-case buffer size occupied by the network implementation

backEnd(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))

API hook to generate code once kernel implementations are picked and tiling, memory allocation, and other low-level optimizations have been done.

Parameters:

verbose (CodeGenVerbosity) – Control verbosity of generated code
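
Example: an illustrative sketch of driving the deployment hooks manually, in their implied front/mid/back order; prepare() is the usual single-call entry point. The CodeGenVerbosity import path is an assumption.

    from Deeploy.DeeployTypes import CodeGenVerbosity

    deployer.frontEnd()   # build the initial NetworkContext from the graph
    deployer.midEnd()     # hoist transient buffers, tiling, static memory allocation
    deployer.backEnd(verbose=CodeGenVerbosity(tilingProfiling=None))   # generate code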

bind() → bool

Bind the entire network layer-by-layer

Returns:

Return true if binding was successful

Return type:

bool

Raises:

RuntimeError – Raises a RuntimeError if the network has not been parsed or if no valid binding exists

codeTransform(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))

Apply code transformations on every layer’s execution block

Parameters:

verbose (CodeGenVerbosity) – Control code generation verbosity

Raises:

RuntimeError – Raises a RuntimeError if the entire network is not bound

exportDeeployState(folderPath: str, fileName: str)

Export compressed network context and neural network graph

Parameters:
  • folderPath (str) – Path to the directory where the context and graph are saved

  • fileName (str) – Prefix used for the saved artifacts
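
Example: a round-trip sketch for the state export/import pair; the directory and file-name prefix are placeholders.

    stateDir = "./deeployStates"   # placeholder directory
    prefix = "testNetwork"         # placeholder artifact prefix

    # Persist the compressed context and graph ...
    deployer.exportDeeployState(stateDir, prefix)

    # ... and restore them later, e.g. into a freshly created deployer.
    deployer.importDeeployState(stateDir, prefix)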

frontEnd()

API hook to prepare the graph to be deployed and build the initial NetworkContext

generateBufferAllocationCode() → str

Generates code to allocate space for the global input and output buffers of the network

Returns:

Allocation code for global IO buffers

Return type:

str

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

generateBufferDeAllocationCode() → str

Generates code to deallocate all global buffers

Returns:

Code to deallocate buffers

Return type:

str

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

generateBufferInitializationCode() → str

Generates forward-declaration code for all buffers used during inference

Returns:

Returns forward-declaration code

Return type:

str

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

generateEngineInitializationCode() → str

Generate initialization code for all compute engines

Returns:

Initialization code for all engines

Return type:

str

generateFunction(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None)) → str

Helper function to prepare deployment and return generated function code
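
Example: extracting the generated code to a file; the output path is a placeholder.

    from pathlib import Path

    code = deployer.generateFunction()   # prepare deployment and return the generated code
    Path("Network.c").write_text(code)   # placeholder output file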

generateGlobalDefinitionCode() → str

Generate all global definition code for inference

Returns:

Global Definition code

Return type:

str

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

generateIOBufferInitializationCode() → str

Generate initialization code for global network inputs and outputs

Returns:

Initialization code

Return type:

str

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

generateIncludeString() → str

Generate the platform-dependent include statements

Returns:

Include code

Return type:

str

generateInferenceCode() → str

Generate the actual inference function for the entire network

Returns:

The full inference method

Return type:

str

Raises:

ValueError – Raises a ValueError if network is not parsed and bound
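
Each generate*Code() hook returns a string; the sketch below stitches several of them into a single source string. The concatenation order is illustrative only and not prescribed by the API.

    parts = [
        deployer.generateIncludeString(),              # platform-dependent includes
        deployer.generateGlobalDefinitionCode(),       # global definitions
        deployer.generateBufferInitializationCode(),   # buffer forward declarations
        deployer.generateEngineInitializationCode(),   # engine initialization
        deployer.generateInferenceCode(),              # the inference function itself
    ]
    source = "\n".join(parts)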

generateInferenceInitializationCode() → str

Generate initialization code, including static memory allocation and other setup tasks

Returns:

Initialization code

Return type:

str

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

getParameterSize() → int

Return the BYTE size of all static network parameters (weights, biases, parameters,…)

Returns:

Size of all network parameters

Return type:

int

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound

getTotalSize() → int

Returns the total size of the network, consisting of all parameters and the intermediate buffer size

Returns:

Total network size

Return type:

int

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound
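
Example: a size-reporting sketch using the query methods above (the network must be parsed and bound).

    paramBytes = deployer.getParameterSize()   # static parameters only
    totalBytes = deployer.getTotalSize()       # parameters + intermediate buffers
    print(f"parameters: {paramBytes} B, total: {totalBytes} B")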

importDeeployState(folderPath: str, fileName: str)

Override this container’s graph and context with loaded compressed artifacts

Parameters:
  • folderPath (str) – Path to the artifact directory

  • fileName (str) – Prefix of the saved artifacts

inputs() → List[VariableBuffer]

Return a list of all VariableBuffers that are also global inputs of the network

Returns:

Global inputs

Return type:

List[VariableBuffer]

lower(graph: Graph) → Graph

Apply the lowering optimizer to the graph

Parameters:

graph (gs.Graph) – Unmodified input neural network graph

Returns:

Neural network graph that is deployable with the DeploymentPlatform’s Mapping

Return type:

gs.Graph
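
Example: applying the lowering optimizer to a freshly imported graph; the model path is a placeholder and the onnx/onnx_graphsurgeon imports are assumptions.

    import onnx
    import onnx_graphsurgeon as gs

    graph = gs.import_onnx(onnx.load("network.onnx"))   # placeholder model path
    loweredGraph = deployer.lower(graph)   # deployable with the platform's Mapping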

midEnd()

API hook to be used after kernel selection is finalized; hoists transient buffers and performs low-level code optimizations (e.g. tiling and static memory allocation).

numberOfOps(verbose: bool) → int

Returns the total number of operations per network inference

Parameters:

verbose (bool) – Control whether the per-operator operation counts are printed to STDOUT

Returns:

Number of operations (1 MAC = 2 Ops) per network inference

Return type:

int

Raises:

RuntimeError – Raises a RuntimeError if network is not parsed and bound
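
Example: usage sketch.

    ops = deployer.numberOfOps(verbose=False)
    macs = ops // 2   # 1 MAC = 2 Ops, as defined above
    print(f"{ops} ops (~{macs} MACs) per inference")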

outputs() → List[VariableBuffer]

Return a list of all VariableBuffers that are also global outputs of the network

Returns:

Global outputs

Return type:

List[VariableBuffer]
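
Example: iterating over the global I/O buffers. The buffers are printed directly; no VariableBuffer fields are assumed.

    for buf in deployer.inputs():
        print("input :", buf)
    for buf in deployer.outputs():
        print("output:", buf)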

parse(default_channels_first: bool = True) → bool

Parses the full network by iteratively exploring mapping and binding options with backtracking

Parameters:

default_channels_first (bool) – Whether the default data layout is CxHxW or HxWxC

Returns:

Returns a boolean to indicate whether parsing was successful

Return type:

bool

Raises:

RuntimeError – Raises a RuntimeError if backtracking was exhausted without finding a mapping solution
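
Example: usage sketch; parse() raises a RuntimeError once backtracking is exhausted.

    try:
        ok = deployer.parse(default_channels_first=True)   # CxHxW layout by default
    except RuntimeError:
        ok = False   # backtracking exhausted, no mapping solution found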

prepare(verbose: CodeGenVerbosity = CodeGenVerbosity(tilingProfiling=None))

API hook to perform the entire deployment process to the point where generated code may be extracted

Parameters:

verbose (CodeGenVerbosity) – Control verbosity of generated code
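
Example: the single-call deployment path. The CodeGenVerbosity import path and the boolean nature of the prepared flag (listed under Attributes above) are assumptions.

    from Deeploy.DeeployTypes import CodeGenVerbosity

    deployer.prepare(verbose=CodeGenVerbosity(tilingProfiling=None))
    assert deployer.prepared   # assumed boolean state flag
    inferenceCode = deployer.generateInferenceCode()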

property worstCaseBufferSize

Return the worst-case buffer size occupied by the network implementation