matrix transpose

Modules

Name
matrix transpose Kernels

Functions

	Name
void	plp_mat_trans_f32(const float *__restrict__ pSrc, uint32_t M, uint32_t N, float *__restrict__ pDst) Glue code for matrix transpose of 32-bit floating-point matrices.
void	plp_mat_trans_f32_parallel(const float *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, float *__restrict__ pDst) Glue code for parallel matrix transpose of 32-bit floating-point matrices.
void	plp_mat_trans_i16(const int16_t *__restrict__ pSrc, uint32_t M, uint32_t N, int16_t *__restrict__ pDst) Glue code for matrix transpose of 16-bit integer matrices.
void	plp_mat_trans_i16_parallel(const int16_t *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, int16_t *__restrict__ pDst) Glue code for parallel matrix transpose of 16-bit integer matrices.
void	plp_mat_trans_i32(const int32_t *__restrict__ pSrc, uint32_t M, uint32_t N, int32_t *__restrict__ pDst) Glue code for matrix transpose of 32-bit integer matrices.
void	plp_mat_trans_i32_parallel(const int32_t *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, int32_t *__restrict__ pDst) Glue code for parallel matrix transpose of 32-bit integer matrices.
void	plp_mat_trans_i8(const int8_t *__restrict__ pSrc, uint32_t M, uint32_t N, int8_t *__restrict__ pDst) Glue code for matrix transpose of 8-bit integer matrices.
void	plp_mat_trans_i8_parallel(const int8_t *__restrict__ pSrc, uint32_t M, uint32_t N, uint32_t nPE, int8_t *__restrict__ pDst) Glue code for parallel matrix transpose of 8-bit integer matrices.

Detailed Description

This module contains the glue code for matrix transpose. The kernels themselves are in the module matrix transpose Kernels.

The transpose of a matrix of shape MxN is another matrix of shape NxM in which rows and columns are swapped:

pDst[n, m] = pSrc[m, n]
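The index mapping above can be sketched as a plain C loop. This is a hypothetical reference routine (`ref_mat_trans_i32` is not part of the library), and it assumes row-major storage, which this page does not state explicitly:

```c
#include <stdint.h>

/* Hypothetical reference routine, not a library function.
 * Assumption: row-major storage, so element (m, n) of the MxN source
 * is pSrc[m * N + n] and element (n, m) of the NxM destination
 * is pDst[n * M + m]. */
static void ref_mat_trans_i32(const int32_t *pSrc, uint32_t M, uint32_t N,
                              int32_t *pDst) {
    for (uint32_t m = 0; m < M; m++) {
        for (uint32_t n = 0; n < N; n++) {
            pDst[n * M + m] = pSrc[m * N + n];
        }
    }
}
```

Under that assumption, a 2x3 input stored as 1, 2, 3, 4, 5, 6 becomes a 3x2 output stored as 1, 4, 2, 5, 3, 6. The source and destination buffers must not overlap, which matches the restrict qualifiers in the documented signatures.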

There are functions for 32-, 16-, and 8-bit integer data types, as well as for 32-bit floating-point. These functions can also be used for fixed-point matrices, since a transpose only reorders elements and never changes their values.
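To illustrate the fixed-point case, the sketch below transposes raw 16-bit words and then interprets them as Q1.15 values. The Q1.15 format, `ref_mat_trans_i16`, and `q15_to_double` are assumptions chosen for illustration and are not taken from the library; row-major storage is assumed as well.

```c
#include <stdint.h>

/* Hypothetical reference transpose over raw 16-bit words (row-major assumed). */
static void ref_mat_trans_i16(const int16_t *pSrc, uint32_t M, uint32_t N,
                              int16_t *pDst) {
    for (uint32_t m = 0; m < M; m++) {
        for (uint32_t n = 0; n < N; n++) {
            pDst[n * M + m] = pSrc[m * N + n];
        }
    }
}

/* Assumed Q1.15 interpretation: real value = raw / 2^15. */
static double q15_to_double(int16_t raw) {
    return (double)raw / 32768.0;
}
```

Because the raw words are copied unchanged, an element holding 0x4000 (0.5 in Q1.15) still reads as 0.5 after the transpose. No rescaling or shift argument is needed, which is consistent with the documented functions taking only the dimensions.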

Functions Documentation

function plp_mat_trans_f32

void plp_mat_trans_f32(
    const float *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    float *__restrict__ pDst
)

Glue code for matrix transpose of 32-bit floating-point matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
pDst Points to the output matrix of shape NxM

Return: none

Par: This function will use plp_mat_trans_i32s_xpulpv2 for its computation.

Glue code for matrix transpose of 32-bit floating-point matrices.
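The note that the floating-point function uses a 32-bit integer kernel is plausible because a transpose copies elements without doing arithmetic on them. The sketch below shows the idea by moving each float as an opaque 32-bit word. It is not the library's implementation: `ref_mat_trans_f32_via_words` is a hypothetical name, and row-major storage and a 32-bit float are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit type on the target. */
_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical sketch: transpose a float matrix by copying 32-bit words.
 * memcpy is used for the reinterpretation to avoid aliasing problems. */
static void ref_mat_trans_f32_via_words(const float *pSrc, uint32_t M,
                                        uint32_t N, float *pDst) {
    for (uint32_t m = 0; m < M; m++) {
        for (uint32_t n = 0; n < N; n++) {
            uint32_t bits;
            memcpy(&bits, &pSrc[m * N + n], sizeof bits); /* read raw word  */
            memcpy(&pDst[n * M + m], &bits, sizeof bits); /* write raw word */
        }
    }
}
```

Every bit pattern survives the copy, including negative zero and NaN payloads, so the result is identical to a transpose written directly in terms of float.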

function plp_mat_trans_f32_parallel

void plp_mat_trans_f32_parallel(
    const float *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    uint32_t nPE,
    float *__restrict__ pDst
)

Glue code for parallel matrix transpose of 32-bit floating-point matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
nPE Number of cores to use for computation
pDst Points to the output matrix of shape NxM

Return: none

Par: This function will use plp_mat_trans_i32p_xpulpv2 for its computation.

Glue code for parallel matrix transpose of 32-bit floating-point matrices.
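This page does not describe how the work is divided among the nPE cores. One common scheme, shown here purely as an assumption, is to give each core an interleaved set of source rows. The names `ref_mat_trans_i32_core`, `ref_mat_trans_i32_simulated`, and `core_id` are hypothetical, and the cores are simulated sequentially so the sketch runs on any host.

```c
#include <stdint.h>

/* Hypothetical per-core share of the work (row-major storage assumed):
 * core `core_id` handles source rows core_id, core_id + nPE, ... */
static void ref_mat_trans_i32_core(const int32_t *pSrc, uint32_t M, uint32_t N,
                                   uint32_t nPE, uint32_t core_id,
                                   int32_t *pDst) {
    for (uint32_t m = core_id; m < M; m += nPE) {
        for (uint32_t n = 0; n < N; n++) {
            pDst[n * M + m] = pSrc[m * N + n];
        }
    }
}

/* Sequential stand-in for a parallel launch: runs each core's share in turn.
 * nPE must be at least 1, otherwise no rows are processed. */
static void ref_mat_trans_i32_simulated(const int32_t *pSrc, uint32_t M,
                                        uint32_t N, uint32_t nPE,
                                        int32_t *pDst) {
    for (uint32_t core_id = 0; core_id < nPE; core_id++) {
        ref_mat_trans_i32_core(pSrc, M, N, nPE, core_id, pDst);
    }
}
```

In this scheme each core writes a disjoint set of destination elements, so no synchronization is needed inside the loop. When nPE exceeds M, the extra cores simply find no rows to process.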

function plp_mat_trans_i16

void plp_mat_trans_i16(
    const int16_t *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    int16_t *__restrict__ pDst
)

Glue code for matrix transpose of 16-bit integer matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
pDst Points to the output matrix of shape NxM

Return: none

Glue code for matrix transpose of 16-bit integer matrices.

function plp_mat_trans_i16_parallel

void plp_mat_trans_i16_parallel(
    const int16_t *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    uint32_t nPE,
    int16_t *__restrict__ pDst
)

Glue code for parallel matrix transpose of 16-bit integer matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
nPE Number of cores to use for computation
pDst Points to the output matrix of shape NxM

Return: none

Glue code for parallel matrix transpose of 16-bit integer matrices.

function plp_mat_trans_i32

void plp_mat_trans_i32(
    const int32_t *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    int32_t *__restrict__ pDst
)

Glue code for matrix transpose of 32-bit integer matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
pDst Points to the output matrix of shape NxM

Return: none

Glue code for matrix transpose of 32-bit integer matrices.

function plp_mat_trans_i32_parallel

void plp_mat_trans_i32_parallel(
    const int32_t *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    uint32_t nPE,
    int32_t *__restrict__ pDst
)

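Glue code for parallel matrix transpose of 32-bit integer matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
nPE Number of cores to use for computation
pDst Points to the output matrix of shape NxM

Return: none

Glue code for parallel matrix transpose of 32-bit integer matrices.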

function plp_mat_trans_i8

void plp_mat_trans_i8(
    const int8_t *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    int8_t *__restrict__ pDst
)

Glue code for matrix transpose of 8-bit integer matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
pDst Points to the output matrix of shape NxM

Return: none

Glue code for matrix transpose of 8-bit integer matrices.

function plp_mat_trans_i8_parallel

void plp_mat_trans_i8_parallel(
    const int8_t *__restrict__ pSrc,
    uint32_t M,
    uint32_t N,
    uint32_t nPE,
    int8_t *__restrict__ pDst
)

Glue code for parallel matrix transpose of 8-bit integer matrices.

Parameters:

pSrc Points to the input matrix of shape MxN
M Height of the input matrix and width of the output matrix
N Width of the input matrix and height of the output matrix
nPE Number of cores to use for computation
pDst Points to the output matrix of shape NxM

Return: none

Glue code for parallel matrix transpose of 8-bit integer matrices.

Updated on 2023-03-01 at 16:16:32 +0000