matrix transpose
Module: Matrix Functions
Modules
Name |
---|
matrix transpose Kernels |
Functions
Name | |
---|---|
void | plp_mat_trans_f32(const float *__restrict__ pSrc, uint32_t M, uint32_t N, float *__restrict__ pDst) Glue code for matrix transpose of 32-bit floating-point matrices. |
void | plp_mat_trans_f32_parallel(const float *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, float *__restrict__ pDst) Glue code for parallel matrix transpose of 32-bit floating-point matrices. |
void | plp_mat_trans_i16(const int16_t *__restrict__ pSrc, uint32_t M, uint32_t N, int16_t *__restrict__ pDst) Glue code for matrix transpose of 16-bit integer matrices. |
void | plp_mat_trans_i16_parallel(const int16_t *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, int16_t *__restrict__ pDst) Glue code for parallel matrix transpose of 16-bit integer matrices. |
void | plp_mat_trans_i32(const int32_t *__restrict__ pSrc, uint32_t M, uint32_t N, int32_t *__restrict__ pDst) Glue code for matrix transpose of 32-bit integer matrices. |
void | plp_mat_trans_i32_parallel(const int32_t *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, int32_t *__restrict__ pDst) Glue code for parallel matrix transpose of 32-bit integer matrices. |
void | plp_mat_trans_i8(const int8_t *__restrict__ pSrc, uint32_t M, uint32_t N, int8_t *__restrict__ pDst) Glue code for matrix transpose of 8-bit integer matrices. |
void | plp_mat_trans_i8_parallel(const int8_t *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, int8_t *__restrict__ pDst) Glue code for parallel matrix transpose of 8-bit integer matrices. |
Detailed Description
This module contains the glue code for matrix transpose. The kernels themselves are in the module matrix transpose Kernels.
The transpose of a matrix of shape MxN is another matrix of shape NxM, in which rows and columns are swapped:
pDst[n, m] = pSrc[m, n]
There are functions for 32-, 16-, and 8-bit integer data types, as well as for 32-bit floating-point. These functions can also be used for fixed-point matrices, since a transpose only moves elements and performs no arithmetic on them.
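The index relation above can be sketched in plain C. This is a minimal reference sketch, not the library's kernel: the name `ref_mat_trans_i32` is hypothetical, and it assumes both matrices are stored contiguously in row-major order, so that element `[m, n]` of the MxN input sits at flat index `m * N + n`.

```c
#include <stdint.h>

/* Hypothetical reference routine (not part of the library).
 * Assumes row-major, contiguous storage for both matrices:
 * pSrc holds M rows of N elements, pDst holds N rows of M elements. */
void ref_mat_trans_i32(const int32_t *pSrc, uint32_t M, uint32_t N, int32_t *pDst) {
    for (uint32_t m = 0; m < M; m++) {
        for (uint32_t n = 0; n < N; n++) {
            /* pDst[n, m] = pSrc[m, n], written with flat indices */
            pDst[n * M + m] = pSrc[m * N + n];
        }
    }
}
```

Under these assumptions, a 2x3 input stored as `{1, 2, 3, 4, 5, 6}` yields the 3x2 output `{1, 4, 2, 5, 3, 6}`. The source and destination buffers must not overlap, which matches the `__restrict__` qualifiers in the documented signatures.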
Functions Documentation
function plp_mat_trans_f32
void plp_mat_trans_f32(
const float *__restrict__ pSrc,
uint32_t M,
uint32_t N,
float *__restrict__ pDst
)
Glue code for matrix transpose of 32-bit floating-point matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- pDst Points to the output matrix of shape NxM
Return: none
Par: This function will use plp_mat_trans_i32s_xpulpv2 for its computation.
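A 32-bit integer kernel can serve floating-point data because a transpose only relocates elements and never interprets their values. The sketch below illustrates that idea with a type-agnostic routine that copies elements byte by byte. The name `ref_mat_trans_bytes` and its `elemSize` parameter are hypothetical, row-major storage is assumed, and this is not how the library implements the call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type-agnostic transpose (not part of the library).
 * Moves elemSize-byte elements without interpreting them, so the same
 * routine works for float, int32_t, or 32-bit fixed-point data.
 * Assumes row-major, contiguous, non-overlapping buffers. */
void ref_mat_trans_bytes(const void *pSrc, uint32_t M, uint32_t N,
                         size_t elemSize, void *pDst) {
    const unsigned char *src = (const unsigned char *)pSrc;
    unsigned char *dst = (unsigned char *)pDst;
    for (uint32_t m = 0; m < M; m++) {
        for (uint32_t n = 0; n < N; n++) {
            size_t srcIdx = (size_t)m * N + n; /* element [m, n] of the input  */
            size_t dstIdx = (size_t)n * M + m; /* element [n, m] of the output */
            memcpy(dst + dstIdx * elemSize, src + srcIdx * elemSize, elemSize);
        }
    }
}
```

Calling it with `elemSize = sizeof(float)` on a float matrix preserves every bit pattern exactly, which is why no separate floating-point kernel is needed for this operation.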
function plp_mat_trans_f32_parallel
void plp_mat_trans_f32_parallel(
const float *__restrict__ pSrc,
uint32_t M,
uint32_t N,
uint32_t nPE,
float *__restrict__ pDst
)
Glue code for parallel matrix transpose of 32-bit floating-point matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- nPE Number of cores to use for computation
- pDst Points to the output matrix of shape NxM
Return: none
Par: This function will use plp_mat_trans_i32p_xpulpv2 for its computation.
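To illustrate what `nPE` controls, here is one plausible way to divide a transpose among cores. It is an assumption for illustration only: the names `ref_mat_trans_i32_worker` and `ref_mat_trans_i32_parallel_sim` are hypothetical, the row-interleaved split is not confirmed by this page, and the cores are simulated by a sequential loop rather than a real cluster fork.

```c
#include <stdint.h>

/* Hypothetical per-core worker (not part of the library).
 * Core coreId handles input rows coreId, coreId + nPE, coreId + 2*nPE, ...
 * Assumes nPE >= 1 and row-major, non-overlapping buffers. */
void ref_mat_trans_i32_worker(const int32_t *pSrc, uint32_t M, uint32_t N,
                              uint32_t nPE, uint32_t coreId, int32_t *pDst) {
    for (uint32_t m = coreId; m < M; m += nPE) {
        for (uint32_t n = 0; n < N; n++) {
            pDst[n * M + m] = pSrc[m * N + n];
        }
    }
}

/* Sequential stand-in for dispatching the worker on nPE cores.
 * On real hardware each iteration would run on its own core. */
void ref_mat_trans_i32_parallel_sim(const int32_t *pSrc, uint32_t M, uint32_t N,
                                    uint32_t nPE, int32_t *pDst) {
    for (uint32_t coreId = 0; coreId < nPE; coreId++) {
        ref_mat_trans_i32_worker(pSrc, M, N, nPE, coreId, pDst);
    }
}
```

Each input row maps to a distinct output column, so the workers write disjoint elements of `pDst` and need no synchronization among themselves. If `nPE` exceeds `M`, the surplus workers simply find no rows to process.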
function plp_mat_trans_i16
void plp_mat_trans_i16(
const int16_t *__restrict__ pSrc,
uint32_t M,
uint32_t N,
int16_t *__restrict__ pDst
)
Glue code for matrix transpose of 16-bit integer matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- pDst Points to the output matrix of shape NxM
Return: none
function plp_mat_trans_i16_parallel
void plp_mat_trans_i16_parallel(
const int16_t *__restrict__ pSrc,
uint32_t M,
uint32_t N,
uint32_t nPE,
int16_t *__restrict__ pDst
)
Glue code for parallel matrix transpose of 16-bit integer matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- nPE Number of cores to use for computation
- pDst Points to the output matrix of shape NxM
Return: none
function plp_mat_trans_i32
void plp_mat_trans_i32(
const int32_t *__restrict__ pSrc,
uint32_t M,
uint32_t N,
int32_t *__restrict__ pDst
)
Glue code for matrix transpose of 32-bit integer matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- pDst Points to the output matrix of shape NxM
Return: none
function plp_mat_trans_i32_parallel
void plp_mat_trans_i32_parallel(
const int32_t *__restrict__ pSrc,
uint32_t M,
uint32_t N,
uint32_t nPE,
int32_t *__restrict__ pDst
)
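Glue code for parallel matrix transpose of 32-bit integer matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- nPE Number of cores to use for computation
- pDst Points to the output matrix of shape NxM
Return: none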
function plp_mat_trans_i8
void plp_mat_trans_i8(
const int8_t *__restrict__ pSrc,
uint32_t M,
uint32_t N,
int8_t *__restrict__ pDst
)
Glue code for matrix transpose of 8-bit integer matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- pDst Points to the output matrix of shape NxM
Return: none
function plp_mat_trans_i8_parallel
void plp_mat_trans_i8_parallel(
const int8_t *__restrict__ pSrc,
uint32_t M,
uint32_t N,
uint32_t nPE,
int8_t *__restrict__ pDst
)
Glue code for parallel matrix transpose of 8-bit integer matrices.
Parameters:
- pSrc Points to the input matrix of shape MxN
- M Height of the input matrix and width of the output matrix
- N Width of the input matrix and height of the output matrix
- nPE Number of cores to use for computation
- pDst Points to the output matrix of shape NxM
Return: none
Updated on 2023-03-01 at 16:16:32 +0000