Snitch Runtime
sync.h File Reference

This file provides functions to synchronize Snitch cores.

#include <math.h>


Functions

volatile uint32_t * snrt_mutex ()
 Get a pointer to a mutex variable.
 
void snrt_mutex_acquire (volatile uint32_t *pmtx)
 Acquire a mutex, blocking.
 
void snrt_mutex_ttas_acquire (volatile uint32_t *pmtx)
 Acquire a mutex, blocking.
 
void snrt_mutex_release (volatile uint32_t *pmtx)
 Release a previously-acquired mutex.
 
void snrt_cluster_hw_barrier ()
 Synchronize cores in a cluster with a hardware barrier, blocking.
 
void snrt_inter_cluster_barrier ()
 Synchronize one core from every cluster with the others.
 
void snrt_global_barrier ()
 Synchronize all Snitch cores.
 
void snrt_partial_barrier (snrt_barrier_t *barr, uint32_t n)
 Generic software barrier.
 
uint32_t snrt_global_all_to_all_reduction (uint32_t value)
 Perform a global sum reduction, blocking.
 
void snrt_global_reduction_dma (double *dst_buffer, double *src_buffer, size_t len)
 Perform a sum reduction among clusters, blocking.
 

Detailed Description

This file provides functions to synchronize Snitch cores.

Function Documentation

◆ snrt_cluster_hw_barrier()

void snrt_cluster_hw_barrier ( )
inline

Synchronize cores in a cluster with a hardware barrier, blocking.

Note
Synchronizes all (both DM and compute) cores. All cores must invoke this function, or the calling cores will stall indefinitely.

◆ snrt_global_all_to_all_reduction()

uint32_t snrt_global_all_to_all_reduction ( uint32_t value)
inline

Perform a global sum reduction, blocking.

All cores participate in the reduction and synchronize globally to wait for the reduction to complete. The synchronization is performed via snrt_global_barrier.

Parameters
value  The value to be summed.
Returns
The result of the sum reduction.
Note
Every Snitch core must invoke this function, or the calling cores will stall indefinitely.

◆ snrt_global_barrier()

void snrt_global_barrier ( )
inline

Synchronize all Snitch cores.

Synchronization is performed hierarchically. Within a cluster, cores are synchronized through a hardware barrier (see snrt_cluster_hw_barrier). Clusters are synchronized through a software barrier (see snrt_inter_cluster_barrier).

Note
Every Snitch core must invoke this function, or the calling cores will stall indefinitely.

◆ snrt_global_reduction_dma()

void snrt_global_reduction_dma ( double * dst_buffer,
double * src_buffer,
size_t len )
inline

Perform a sum reduction among clusters, blocking.

The reduction is performed in a logarithmic fashion, along a binary tree. At every level of the tree, half of the active clusters participate as senders and the other half as receivers. Each sender uses the DMA to send its data to the destination buffer of its respective receiver. The receiver then reduces each element in its destination buffer with the respective element in its source buffer, and proceeds to the next level of the tree.

Parameters
dst_buffer  The pointer to the calling cluster's destination buffer.
src_buffer  The pointer to the calling cluster's source buffer.
len  The number of elements (doubles) in each buffer.
Note
The destination buffers must lie at the same offset in every cluster's TCDM.
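
The tree-shaped reduction described above can be illustrated with a small host-side simulation. This is only a sketch under stated assumptions, not the runtime's implementation: the clusters are modeled as rows of two plain arrays, the DMA transfer is modeled as a copy loop, and the names sketch_tree_reduce, NUM_CLUSTERS and LEN are hypothetical. The description does not say which buffer holds the running sum between levels; the sketch accumulates into the source buffer so that the next level can forward it, and leaves the final result in cluster 0.

```c
#include <stddef.h>

#define NUM_CLUSTERS 4 /* hypothetical cluster count for the simulation */
#define LEN 3          /* hypothetical number of elements per buffer */

/* Simulate a binary-tree sum reduction over NUM_CLUSTERS buffers.
 * src[c] and dst[c] stand in for cluster c's source and destination buffers. */
void sketch_tree_reduce(double src[NUM_CLUSTERS][LEN],
                        double dst[NUM_CLUSTERS][LEN]) {
    /* stride doubles at every level: 1, 2, 4, ... */
    for (size_t stride = 1; stride < NUM_CLUSTERS; stride *= 2) {
        /* Active clusters at this level are the multiples of stride.
         * Those that are not multiples of 2*stride act as senders. */
        for (size_t c = 0; c < NUM_CLUSTERS; c += stride) {
            if (c % (2 * stride) != 0) {
                size_t receiver = c - stride;
                /* Stand-in for the DMA transfer into the receiver's dst buffer */
                for (size_t i = 0; i < LEN; i++) dst[receiver][i] = src[c][i];
            }
        }
        /* Receivers combine the received data with their own partial sum
         * (assumption: the running sum is kept in the source buffer). */
        for (size_t c = 0; c < NUM_CLUSTERS; c += 2 * stride) {
            if (c + stride < NUM_CLUSTERS) {
                for (size_t i = 0; i < LEN; i++) src[c][i] += dst[c][i];
            }
        }
    }
}
```

With four clusters, the first level pairs clusters (1 to 0) and (3 to 2), and the second level pairs (2 to 0), so the reduction completes in two levels instead of three sequential additions.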

◆ snrt_inter_cluster_barrier()

void snrt_inter_cluster_barrier ( )
inline

Synchronize one core from every cluster with the others.

Implemented as a software barrier.

Note
One core per cluster must invoke this function, or the calling cores will stall indefinitely.

◆ snrt_mutex_acquire()

void snrt_mutex_acquire ( volatile uint32_t * pmtx)
inline

Acquire a mutex, blocking.

Test-and-set (TAS) implementation of a lock.

Parameters
pmtx  A pointer to a variable which can be used as a mutex, i.e. one to which all cores have a reference and which resides at a memory location that supports atomic accesses. It can be declared e.g. as static volatile uint32_t mtx = 0;.
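
For readers unfamiliar with the technique, a test-and-set lock can be sketched on a host machine with C11 atomics. This is an illustration of the general strategy only and makes no claim about how snrt_mutex_acquire is implemented; the names sketch_tas_try_acquire, sketch_tas_acquire and sketch_tas_release are hypothetical, and the lock word convention (0 = free, 1 = held) is an assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One test-and-set attempt: atomically write 1 to the lock word and
 * report whether the previous value was 0 (i.e. the lock was free). */
bool sketch_tas_try_acquire(atomic_uint *m) {
    return atomic_exchange(m, 1u) == 0u;
}

/* Blocking acquire: repeat the atomic exchange until it succeeds. */
void sketch_tas_acquire(atomic_uint *m) {
    while (!sketch_tas_try_acquire(m)) {
        /* spin */
    }
}

/* Release: mark the lock word as free again. */
void sketch_tas_release(atomic_uint *m) {
    atomic_store(m, 0u);
}
```

Every failed attempt in this scheme still performs an atomic write, which is the cost that the TTAS variant tries to reduce.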

◆ snrt_mutex_ttas_acquire()

void snrt_mutex_ttas_acquire ( volatile uint32_t * pmtx)
inline

Acquire a mutex, blocking.

Same as snrt_mutex_acquire, but acquires the lock using a test-and-test-and-set (TTAS) strategy.
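
The difference from plain test-and-set can be sketched with C11 atomics on a host machine. As before, this is a generic illustration of the TTAS strategy rather than the runtime's code; sketch_ttas_acquire and sketch_ttas_release are hypothetical names, and the lock word convention (0 = free, 1 = held) is an assumption.

```c
#include <stdatomic.h>

/* Blocking acquire using test-and-test-and-set. */
void sketch_ttas_acquire(atomic_uint *m) {
    for (;;) {
        /* Test: wait with read-only accesses until the lock looks free. */
        while (atomic_load(m) != 0u) {
            /* spin */
        }
        /* Test-and-set: only now attempt the atomic exchange.
         * Another core may have won the race, in which case we retry. */
        if (atomic_exchange(m, 1u) == 0u) {
            return;
        }
    }
}

/* Release: mark the lock word as free again. */
void sketch_ttas_release(atomic_uint *m) {
    atomic_store(m, 0u);
}
```

While the lock is held, waiters in this scheme issue only loads, so the atomic exchange is attempted just when the lock has been observed free.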

◆ snrt_partial_barrier()

void snrt_partial_barrier ( snrt_barrier_t * barr,
uint32_t n )
inline

Generic software barrier.

Parameters
barr  Pointer to a barrier variable.
n  Number of harts that must enter the barrier before they are released.
Note
Exactly the specified number of harts must invoke this function, or the calling cores will stall indefinitely.
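
One common way to build such a counting barrier is sketched below with C11 atomics, for illustration on a host machine. It is not the runtime's implementation: sketch_barrier_t and its fields cnt and iteration are hypothetical and are not claimed to match the layout of snrt_barrier_t.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical barrier state for this sketch. */
typedef struct {
    atomic_uint cnt;       /* harts that have arrived in the current round */
    atomic_uint iteration; /* number of completed rounds */
} sketch_barrier_t;

/* Block until n harts have called this function on the same barrier. */
void sketch_partial_barrier(sketch_barrier_t *barr, uint32_t n) {
    /* Remember the current round before announcing arrival. */
    unsigned round = atomic_load(&barr->iteration);

    if (atomic_fetch_add(&barr->cnt, 1u) + 1u == n) {
        /* Last hart to arrive: reset the counter for the next round
         * first, then release the waiters by advancing the round. */
        atomic_store(&barr->cnt, 0u);
        atomic_fetch_add(&barr->iteration, 1u);
    } else {
        /* Everyone else spins until the round number changes. */
        while (atomic_load(&barr->iteration) == round) {
            /* spin */
        }
    }
}
```

The sketch also shows why the count must match exactly: with fewer than n callers the round number never advances and the waiters spin forever, and an extra caller would be counted toward a following round that may never complete.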