NDP TX synchronization
When transmitting data from software to the device (the TX DMA direction), the driver receives data that the user has stored in RAM and tells the DMA Module where the data is located and how large it is. Once the DMA Module has finished transmitting the data from RAM, it signals the driver, which frees the data from RAM. The driver then tells the user that the transmission is complete and waits for more data to transmit.
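
The hand-over described above can be sketched as a small model. This is only an illustration under stated assumptions: `struct tx_request`, `dma_transmit`, `driver_tx` and `count_done` are hypothetical names, the DMA Module is replaced by a function that sums the bytes, and the whole flow runs synchronously, whereas real DMA completion is signalled asynchronously.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical request: where the data is, how big it is, whom to notify. */
struct tx_request {
	void *data;
	size_t len;
	void (*done)(void *user);   /* user notification callback */
	void *user;
};

/* Stand-in for the DMA Module: "transmits" by summing the bytes. */
static unsigned dma_transmit(const void *data, size_t len)
{
	const unsigned char *p = data;
	unsigned sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += p[i];
	return sum;
}

/* Driver side: pass address and length to the DMA stand-in,
 * then free the data and notify the user. */
static unsigned driver_tx(struct tx_request *req)
{
	unsigned sum = dma_transmit(req->data, req->len);

	free(req->data);
	req->data = NULL;
	if (req->done)
		req->done(req->user);
	return sum;
}

/* Example notification: count completed transmissions. */
static void count_done(void *user)
{
	++*(int *)user;
}
```

The user keeps no reference to the data after the call; ownership moves to the driver until the completion callback fires.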
Pointers description
lib-drv sync:
___C__ ___B___ ___A___
/ \ / \ / \
lib: HWPTR RHP SWPTR
older >----|-------|---------|--------|> newer
drv: HWPTR SWPTR
empty space for new data
data in NDP buffer, not yet synced with driver
hardware is aware of this data, but it has not been transferred yet
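
The regions in the diagram are distances between pointers on a ring, so they can be computed with wrap-around arithmetic. The sketch below assumes a power-of-two buffer size; `ring_dist` and `ring_free` are illustrative names, not functions of libnfb or the driver.

```c
#include <stdint.h>

/* Number of items from `from` (older) up to `to` (newer) on a ring.
 * Assumes `size` is a power of two, so masking handles the wrap-around. */
static inline uint32_t ring_dist(uint32_t from, uint32_t to, uint32_t size)
{
	return (to - from) & (size - 1);
}

/* Space left for new data when the occupied part spans hwptr..swptr.
 * One slot stays unused so that a full ring can be told apart from an
 * empty one, which gives the (size - 1) maximum. */
static inline uint32_t ring_free(uint32_t hwptr, uint32_t swptr, uint32_t size)
{
	return (size - 1) - ring_dist(hwptr, swptr, size);
}
```

For example, with a 16-item ring, `ring_dist(14, 2, 16)` wraps around the end of the buffer and yields 4.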
Example of TX synchronization run
start: the actual position of HWPTR and SWPTR is undefined:
lib: ? ? ? older >···································> newer
first lock: libnfb requests the driver to lock part of the NDP buffer. It typically requests the maximum space (buffer_size - 1). The position of HWPTR doesn’t matter; only the requested length is decisive:
lib: HWPTR SWPTR older >|---------------------------------|> newer
pointer sync: the driver:
checks that no lock is currently held
checks the boundaries of the request
sets sync.hwptr to the last position assigned to the hardware
clamps sync.swptr according to the free space:
older >··|-----------------------------|··> newer drv: HWPTR SWPTR
fill & wait: software fills the data placeholders and calls ndp_tx_burst_put. This doesn’t flush the data, and the lock is not released.
lib: HWPTR RHP SWPTR older >··|————————|--------------------|··> newer
fill & flush: software fills further data placeholders and calls ndp_tx_burst_put and ndp_tx_burst_flush. This flushes the data, but the software still holds the lock:
lib: RHP+HWPTR SWPTR older >················|---------------|··> newer
desc fill: the driver creates descriptors from the packet headers and passes the SDP to the hardware:
older >··|—————————————|---------------|··> newer drv: HDP SDP
try lock: libnfb requests a lock on additional data space:
lib: SWPTR RHP+HWPTR older >-----|··········|------------------> newer
clamp: the driver partially rejects and clamps this request, because the requested space in the ring buffer has not been freed yet:
lib: SWPTR RHP+HWPTR older >--|·············|------------------> newer
HDP update: the hardware transmits some data and updates the HDP:
older >-----|——————————|------------------> newer drv: HDP SDP
try lock: same as in step 7:
lib: SWPTR RHP+HWPTR older >-----|··········|------------------> newer
sync: the driver detects the HDP update and sets the SWPTR according to the maximum free space:
lib: SWPTR RHP+HWPTR older >-----|··········|------------------> newer
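
The run above can be approximated with a small model of the driver-side sync. It is a sketch under assumptions, not the driver's code: the structure `tx_channel`, the functions `tx_sync` and `tx_hw_done`, the 16-item ring and the single-writer simplification are all hypothetical. The model applies the steps listed for the pointer sync: boundary check, setting hwptr to the last position assigned to the hardware, and clamping swptr to the free space.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 16u              /* hypothetical buffer size, power of two */
#define RING_MASK (RING_SIZE - 1u)

struct tx_channel {
	uint32_t hdp;       /* oldest item the hardware has not transmitted yet */
	uint32_t sdp;       /* last position assigned to the hardware */
	uint32_t lock_end;  /* end of the currently locked span */
	bool locked;
};

/* On entry *hwptr..*swptr is the request from the library; on return it is
 * what the driver granted. Returns the granted length, or -1 if the flushed
 * data lies outside the locked span. */
static int tx_sync(struct tx_channel *ch, uint32_t *hwptr, uint32_t *swptr)
{
	uint32_t want = (*swptr - *hwptr) & RING_MASK;
	uint32_t space;

	if (ch->locked) {
		/* Data up to *hwptr was flushed; it must lie inside the lock. */
		uint32_t flushed = (*hwptr - ch->sdp) & RING_MASK;

		if (flushed > ((ch->lock_end - ch->sdp) & RING_MASK))
			return -1;
		ch->sdp = *hwptr;  /* descriptors created, SDP passed to hardware */
	}

	/* Clamp the request to the space the hardware has already freed. */
	space = (RING_SIZE - 1u) - ((ch->sdp - ch->hdp) & RING_MASK);
	if (want > space)
		want = space;

	*hwptr = ch->sdp;
	*swptr = (ch->sdp + want) & RING_MASK;
	ch->lock_end = *swptr;
	ch->locked = want != 0;
	return (int)want;
}

/* The hardware transmitted n items and moved the HDP. */
static void tx_hw_done(struct tx_channel *ch, uint32_t n)
{
	ch->hdp = (ch->hdp + n) & RING_MASK;
}
```

In this model a repeated maximum-size request is clamped until `tx_hw_done` moves the HDP, after which the same request is granted more space.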
Example of TX multiple writers
start condition: app0 holds the lock, app1 has just started
app1 try lock: app1 requests the lock:
app1:HWPTR SWPTR older >|---------------------------------|> newer
app1 sync: the driver finds that a lock exists and is not held by app1, so it refuses to lock (swptr == hwptr):
app1: SWPTR+HWPTR older >---------|-------------------------> newer
app0 fill & wait: software fills the data placeholders and calls ndp_tx_burst_put. This doesn’t flush the data, and the lock is not released.
lib: HWPTR RHP SWPTR older >·········|————————|--------------------|··> newer
app0 fill & flush & unlock: software fills further data placeholders and calls ndp_tx_burst_put and ndp_tx_burst_flush. This flushes the data and releases the lock:
lib: HWPTR+SWPTR older >······················|···················> newer
desc fill: the driver creates descriptors from the packet headers and passes the SDP to the hardware:
drv: HWPTR SWPTR older >········|—————————————|---------------|··> newer drv: HDP SDP
app1 try lock: same as in step 2:
app1:HWPTR SWPTR older >|---------------------------------------|> newer
app1 sync: the driver finds that no lock exists and lets the request of app1 go through:
app1: SWPTR HWPTR older >-------|··············|------------------> newer
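
The ownership rule in the multiple-writers example can be sketched as follows. The names `tx_channel`, `tx_sync` and `NO_OWNER` are hypothetical, application identity is reduced to an integer id, and clamping to free space is left out to keep the focus on the lock. A request from a non-owner is answered with swptr == hwptr; a sync from the owner with swptr == hwptr releases the lock.

```c
#include <stdint.h>

#define RING_MASK 15u      /* hypothetical 16-item ring */
#define NO_OWNER  (-1)

struct tx_channel {
	int owner;      /* id of the application holding the lock, or NO_OWNER */
	uint32_t pos;   /* last position assigned to the hardware */
};

/* Returns the granted length; 0 means no lock is held by `app` afterwards. */
static uint32_t tx_sync(struct tx_channel *ch, int app,
			uint32_t *hwptr, uint32_t *swptr)
{
	uint32_t len;

	/* Somebody else holds the lock: refuse with an empty span. */
	if (ch->owner != NO_OWNER && ch->owner != app) {
		*hwptr = *swptr = ch->pos;
		return 0;
	}

	/* The owner reports how far it has flushed. */
	if (ch->owner == app)
		ch->pos = *hwptr & RING_MASK;

	len = (*swptr - *hwptr) & RING_MASK;
	*hwptr = ch->pos;
	*swptr = (ch->pos + len) & RING_MASK;

	/* An empty span (swptr == hwptr) gives the lock up. */
	ch->owner = len ? app : NO_OWNER;
	return len;
}
```

With this rule app1 simply repeats its request until app0 has flushed and unlocked; no separate unlock call is needed in the model.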
Function call map
This section is a cheat sheet of how functions call each other when DPDK, Libnfb and the NDP driver are used together.
Each of these parts uses a different structure to describe the block of memory that holds a packet.
Between the DMA Module and the NDP driver, “descriptors” are used.
Descriptors are optimized for minimal PCI overhead.
The NDP driver and Libnfb pass this information using the “Header Buffer” and the “Offset Buffer” (the two buffers share control and thus function as a single buffer).
The buffers and the pointers to these buffers are accessible both to the NDP driver and (through vmap) to Libnfb.
This is basically the only way these two sides communicate.
Otherwise they run independently in parallel.
Libnfb communicates with the user using the “ndp_packet” structure. In DPDK, memory blocks are managed using a structure called “Mbuf”.
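
To illustrate how two buffers that share control can act as a single buffer, the sketch below keeps a header array and an offset array under one index. All layouts and names here (`hdr_item`, `hdr_off_ring`, `pkt_view`, `ring_put`, the field widths, the ring size) are assumptions made for the example and do not reproduce the real Header Buffer, Offset Buffer or ndp_packet definitions.

```c
#include <stdint.h>

#define SLOTS 8u   /* hypothetical ring size, power of two */

/* Hypothetical per-packet record in the header array. */
struct hdr_item {
	uint16_t frame_len;
	uint16_t flags;
};

/* Two arrays indexed by the same pointer behave as one ring. */
struct hdr_off_ring {
	struct hdr_item hdr[SLOTS];
	uint64_t off[SLOTS];   /* offset of the packet data in the data buffer */
	uint32_t rhp;          /* next slot to fill */
};

/* Stand-in for the user-facing packet description. */
struct pkt_view {
	uint64_t data_offset;
	uint16_t data_length;
};

/* Copy one packet description into both arrays and advance the shared index.
 * Returns the slot that was written. */
static uint32_t ring_put(struct hdr_off_ring *r, const struct pkt_view *p)
{
	uint32_t slot = r->rhp & (SLOTS - 1);

	r->hdr[slot].frame_len = p->data_length;
	r->hdr[slot].flags = 0;
	r->off[slot] = p->data_offset;
	r->rhp = (r->rhp + 1) & (SLOTS - 1);
	return slot;
}
```

Because only one index is advanced, a reader that follows that index always sees a matching header and offset pair.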
TX
rte_eth_tx_burst (dpdk/lib/librte_ethdev/rte_ethdev.h)
 -+
  |
  V
nfb_eth_ndp_tx (dpdk/drivers/net/nfb/nfb_tx.h)
 - set as dev->tx_pkt_burst
 - main TX transmission function
 - finds out how many more packets have been sent successfully
 - frees the corresponding number of Mbufs (sent packets are freed from memory)
 - copies info from the input array of Mbufs to its Mbuf Buffer (to be able to free them later) and to an array of ndp_packets
 ==============-+
                |
                V
(nc_)ndp_(v2_)tx_burst_get (swbase/libnfb/include/netcope/ndp_tx.h)
 - copies info from ndp_packets to the Header and Offset Buffer
 - shifts rhp
 ==========-+
            |
            V
(nc_)ndp_tx_burst_put (swbase/libnfb/include/netcope/ndp_tx.h)
   -+
 -+ |
  | |
  V V
(nc_)ndp_(v2_)tx_burst_flush (swbase/libnfb/include/netcope/ndp_tx.h)
 - sets sync.hwptr and sync.swptr to rhp (unlocks the TX Channel for other applications)

ndp_channel_txsync (drivers/kernel/drivers/nfb/ndp/channel.c)
 -+
  |
  V
ndp_ctrl_tx_set_swptr (swbase/drivers/kernel/drivers/nfb/ndp/ctrl_ndp.c)
 - is set as ndp_ctrl_tx_ops.set_swptr
 - creates descriptors from new items in the Header and Offset Buffer
 - shifts sdp and propagates it to HW
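
The bookkeeping listed under nfb_eth_ndp_tx (find out how many packets were sent, free that many buffers, remember the new ones so they can be freed later) can be sketched as a deferred-free queue. This is a simplified model, not the DPDK driver: `tx_queue`, `reclaim`, `tx_burst` and `PENDING_MAX` are hypothetical, plain `malloc`/`free` stands in for Mbuf allocation, and the number of newly sent packets is passed in as a parameter instead of being read from the hardware.

```c
#include <stdint.h>
#include <stdlib.h>

#define PENDING_MAX 8u   /* hypothetical capacity of the pending list */

struct tx_queue {
	void *pending[PENDING_MAX]; /* buffers handed over, not yet confirmed sent */
	uint32_t head;              /* index of the oldest pending buffer */
	uint32_t count;             /* number of pending buffers */
	uint32_t freed_total;       /* statistics: buffers freed so far */
};

/* Free the `sent` oldest pending buffers. */
static void reclaim(struct tx_queue *q, uint32_t sent)
{
	for (; sent > 0 && q->count > 0; sent--) {
		free(q->pending[q->head]);
		q->head = (q->head + 1) % PENDING_MAX;
		q->count--;
		q->freed_total++;
	}
}

/* One burst: first reclaim what was sent since the last call, then remember
 * the new buffers for a later free. Returns the number of buffers accepted. */
static uint32_t tx_burst(struct tx_queue *q, uint32_t newly_sent,
			 void **bufs, uint32_t n)
{
	uint32_t i;

	reclaim(q, newly_sent);
	for (i = 0; i < n && q->count < PENDING_MAX; i++) {
		q->pending[(q->head + q->count) % PENDING_MAX] = bufs[i];
		q->count++;
	}
	return i;
}
```

Freeing at the start of the next burst keeps the transmit path free of completion callbacks, at the cost of holding sent buffers a little longer.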