The PCIe module
The PCIe module handles all PCIe communication. Its task is to forward/transform PCIe transactions for the DMA controller and the MI bus. The architecture of the PCIe module is divided into two main parts: PCIE_CORE and PCIE_CTRL. Its diagram is shown below.
Note
The PCIe module can support more than one PCIe endpoint. In this case, the individual parts of the PCIe module are duplicated for each PCIe endpoint. Some PCIe Hard IPs also support bifurcation.
Selecting a PCIe configuration
Before running the FPGA firmware compilation, you can select the target PCIe configuration using the makefile parameter PCIE_CONF. Without this parameter, the card's default configuration is selected automatically. Only some FPGA cards support multiple PCIe configurations. If you enter an unsupported value (for example, PCIE_CONF=1xGen1x16), the console lists the configurations supported on the target FPGA card.
Examples of some allowed configurations:
PCIE_CONF=1xGen3x16: Single PCIe slot in Gen3 x16 mode.
PCIE_CONF=2xGen4x8x8: Two PCIe slots in Gen4 x8x8 (bifurcation) mode.
PCIE_CONF=2xGen5x8x8: Two PCIe slots in Gen5 x8x8 (bifurcation) mode.
PCIE_CONF=1xGen3x8LL: Single PCIe slot in Gen3 x8 Low-Latency mode (for Xilinx UltraScale+ only).
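The example values above appear to follow a regular naming scheme. The following Python sketch parses that scheme for illustration; the pattern is inferred from the listed examples only, and the names `PcieConf` and `parse_pcie_conf` are hypothetical and not part of the NDK build system. It checks syntax only: whether a value is actually supported still depends on the target FPGA card.

```python
import re
from dataclasses import dataclass

# Pattern inferred from the documented examples (an assumption):
# <endpoints>xGen<generation>x<lanes>[x<lanes>...][LL]
_CONF_RE = re.compile(r"^(\d+)xGen(\d+)((?:x\d+)+)(LL)?$")


@dataclass(frozen=True)
class PcieConf:
    endpoints: int      # leading number, e.g. 2 in "2xGen4x8x8"
    generation: int     # PCIe generation, e.g. 4
    lanes: tuple        # lane width per endpoint, e.g. (8, 8)
    low_latency: bool   # True when the value ends with "LL"


def parse_pcie_conf(value: str) -> PcieConf:
    """Split a PCIE_CONF value into its parts; raise ValueError on bad syntax."""
    m = _CONF_RE.match(value)
    if m is None:
        raise ValueError(f"unrecognized PCIE_CONF value: {value!r}")
    # group(3) looks like "x8x8"; drop the empty string before the first "x".
    lanes = tuple(int(n) for n in m.group(3).split("x") if n)
    return PcieConf(int(m.group(1)), int(m.group(2)), lanes, m.group(4) is not None)


if __name__ == "__main__":
    for conf in ("1xGen3x16", "2xGen4x8x8", "1xGen3x8LL"):
        print(conf, parse_pcie_conf(conf))
```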
The PCIe Core (PCIE_CORE)
The PCIe Core varies according to the PCIe Hard IP or FPGA used. It contains the instance(s) of the PCIe Hard IP, an adapter that converts the AXI/Avalon-ST buses to MFB buses, the Vendor-Specific Extended Capability (VSEC) registers (implemented in the PCI_EXT_CAP module), which mainly hold the DeviceTree firmware description, and additional configuration logic. Thus, the main purpose of the PCIe Core is to unify the buses and provide the necessary information about the active PCIe link.
Supported PCIe Hard IP
A list of the supported PCIe Hard IPs is below. You can select the target architecture by setting the NDK parameter PCIE_MOD_ARCH. According to this parameter, the correct PCIE_CORE module variant is used and the VHDL generic PCIE_ENDPOINT_TYPE is set appropriately.
The PCIe Control unit (PCIE_CTRL)
The PCIe Control unit always includes the MI Transaction Controller (MTC), which transforms the associated PCIe memory transactions into read or write requests on the MI bus. For a read request, the MI response is also transformed back into a PCIe completion transaction and sent to the host PC. PCIe transactions targeting the BAR0 address space are assigned to the MTC module. If the NDK uses a DMA controller that requires its own BAR, the PCIe transactions targeting the DMA-BAR address space (BAR2) are routed directly to the DMA module. This functionality must be enabled via the DMA_BAR_ENABLE parameter.
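The routing rule described above can be summarized as a small behavioral model. This is only a sketch in Python, not the VHDL implementation; the function name `route_host_request` and the `"UNROUTED"` result are hypothetical, and the text does not say what happens to requests that hit any other BAR.

```python
def route_host_request(bar: int, dma_bar_enable: bool) -> str:
    """Return the unit that receives a host memory request hitting `bar`.

    Behavioral model of the documented rule only:
      * BAR0 always goes to the MI Transaction Controller (MTC),
      * BAR2 goes to the DMA module when DMA_BAR_ENABLE is set.
    """
    if bar == 0:
        return "MTC"
    if bar == 2 and dma_bar_enable:
        return "DMA"
    # Not covered by the documentation; placeholder value (assumption).
    return "UNROUTED"


if __name__ == "__main__":
    for enable in (False, True):
        for bar in (0, 2):
            print(f"DMA_BAR_ENABLE={enable} BAR{bar} ->",
                  route_host_request(bar, enable))
```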
Note
We assume that 64-bit PCIe BARs are used, meaning that half of them are available at most (BAR0, BAR2, and BAR4). You can find more information in the PCIe specification.
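The arithmetic behind this note: a PCIe endpoint has six 32-bit BAR registers, and a 64-bit BAR occupies two consecutive registers, so only every second index can start a BAR. A minimal Python illustration (the function name is hypothetical):

```python
def usable_64bit_bars(num_bar_registers: int = 6) -> list:
    """Indices at which a 64-bit BAR can start when every BAR is 64-bit.

    Each 64-bit BAR consumes registers i and i+1, so the valid start
    indices are the even ones that still leave room for the upper half.
    """
    return list(range(0, num_bar_registers - 1, 2))


if __name__ == "__main__":
    print(usable_64bit_bars())  # [0, 2, 4]
```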
By default, this unit also contains the PTC module, which transforms memory requests (in a simplified format) coming from the DMA into the desired PCIe format and vice versa. The PTC module also implements a completion buffer and handles the allocation of the PCIe TAGs, etc. The PTC can be disabled using the PTC_DISABLE parameter, in which case the DMA requests (in the PCIe transaction format) are directly forwarded to the PCIe Hard IP and vice versa.
The PCIe module entity
- ENTITY PCIE IS
- Generics
Generic
Type
Default
Description
=====
BAR base address configuration
=====
=====
BAR0_BASE_ADDR
std_logic_vector(31 downto 0)
X"01000000"
BAR1_BASE_ADDR
std_logic_vector(31 downto 0)
X"02000000"
BAR2_BASE_ADDR
std_logic_vector(31 downto 0)
X"03000000"
BAR3_BASE_ADDR
std_logic_vector(31 downto 0)
X"04000000"
BAR4_BASE_ADDR
std_logic_vector(31 downto 0)
X"05000000"
BAR5_BASE_ADDR
std_logic_vector(31 downto 0)
X"06000000"
EXP_ROM_BASE_ADDR
std_logic_vector(31 downto 0)
X"0A000000"
=====
MFB configuration
=====
=====
CQ_MFB_REGIONS
natural
2
CQ_MFB_REGION_SIZE
natural
1
CQ_MFB_BLOCK_SIZE
natural
8
CQ_MFB_ITEM_WIDTH
natural
32
RC_MFB_REGIONS
natural
2
RC_MFB_REGION_SIZE
natural
1
RC_MFB_BLOCK_SIZE
natural
8
RC_MFB_ITEM_WIDTH
natural
32
CC_MFB_REGIONS
natural
2
CC_MFB_REGION_SIZE
natural
1
CC_MFB_BLOCK_SIZE
natural
8
CC_MFB_ITEM_WIDTH
natural
32
RQ_MFB_REGIONS
natural
2
RQ_MFB_REGION_SIZE
natural
1
RQ_MFB_BLOCK_SIZE
natural
8
RQ_MFB_ITEM_WIDTH
natural
32
=====
Other configuration
=====
=====
DMA_PORTS
natural
2
Total number of DMA_EP, DMA_EP=PCIE_EP or 2*DMA_EP=PCIE_EP
PCIE_ENDPOINT_TYPE
string
"P_TILE"
Connected PCIe endpoint type
PCIE_ENDPOINT_MODE
natural
0
Connected PCIe endpoint mode: 0=x16, 1=x8x8, 2=x8
PCIE_ENDPOINTS
natural
1
Number of PCIe endpoints
PCIE_CLKS
natural
2
Number of PCIe clocks per PCIe connector
PCIE_CONS
natural
1
Number of PCIe connectors
PCIE_LANES
natural
16
Number of PCIe lanes in each PCIe connector
PCIE_GEN
natural
4
PCIe generation number
CARD_ID_WIDTH
natural
0
Width of CARD/FPGA ID number
PTC_DISABLE
boolean
false
Disables the PTC module and allows the DMA module to connect directly to the RQ and RC interfaces of the PCIe IP.
DMA_BAR_ENABLE
boolean
false
Enables the CQ/CC interface for the DMA-BAR; requires DMA_PORTS=PCIE_ENDPOINTS
XVC_ENABLE
boolean
false
Enables the XVC IP (Xilinx only)
MISC_TOP2PCIE_WIDTH
natural
1
Width of MISC signal between Top-Level FPGA design and PCIE core logic
MISC_PCIE2TOP_WIDTH
natural
1
Width of MISC signal between PCIE core logic and Top-Level FPGA design
DEVICE
string
"STRATIX10"
FPGA device
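As a worked example of how the MFB generics combine: the data port widths in the port list below are the product REGIONS * REGION_SIZE * BLOCK_SIZE * ITEM_WIDTH. The Python sketch below evaluates that product for the default values in the table above and also encodes the DMA_BAR_ENABLE condition. The helper names are hypothetical and only mirror the table; they are not part of the NDK.

```python
def mfb_data_width(regions: int, region_size: int,
                   block_size: int, item_width: int) -> int:
    """Width in bits of one MFB data word for a single DMA port."""
    return regions * region_size * block_size * item_width


# Default (REGIONS, REGION_SIZE, BLOCK_SIZE, ITEM_WIDTH) from the generics table.
MFB_DEFAULTS = {
    "CQ": (2, 1, 8, 32),
    "RC": (2, 1, 8, 32),
    "CC": (2, 1, 8, 32),
    "RQ": (2, 1, 8, 32),
}


def dma_bar_allowed(dma_ports: int, pcie_endpoints: int) -> bool:
    """Condition stated for DMA_BAR_ENABLE: DMA_PORTS = PCIE_ENDPOINTS."""
    return dma_ports == pcie_endpoints


if __name__ == "__main__":
    for name, cfg in MFB_DEFAULTS.items():
        print(f"{name}: {mfb_data_width(*cfg)} bits per word")
    # With the defaults DMA_PORTS=2 and PCIE_ENDPOINTS=1 the condition fails.
    print("DMA-BAR allowed with defaults:", dma_bar_allowed(2, 1))
```

With the default values, every MFB interface is 512 bits wide per DMA port.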
Port
Type
Mode
Description
=====
CLOCKS AND RESETS
=====
=====
PCIE_SYSCLK_P
std_logic_vector(PCIE_CONS*PCIE_CLKS-1 downto 0)
in
Clock from PCIe port, 100 MHz
PCIE_SYSCLK_N
std_logic_vector(PCIE_CONS*PCIE_CLKS-1 downto 0)
in
PCIE_SYSRST_N
std_logic_vector(PCIE_CONS-1 downto 0)
in
PCIe reset from PCIe port
INIT_DONE_N
std_logic
in
nINIT_DONE output of the Reset Release Intel Stratix 10 FPGA IP
PCIE_USER_CLK
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
PCIe user clock and reset
PCIE_USER_RESET
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
DMA_CLK
std_logic
in
DMA module clock and reset
DMA_RESET
std_logic
in
=====
PCIE SERIAL INTERFACE
=====
=====
PCIE_RX_P
std_logic_vector(PCIE_CONS*PCIE_LANES-1 downto 0)
in
Receive data
PCIE_RX_N
std_logic_vector(PCIE_CONS*PCIE_LANES-1 downto 0)
in
PCIE_TX_P
std_logic_vector(PCIE_CONS*PCIE_LANES-1 downto 0)
out
Transmit data
PCIE_TX_N
std_logic_vector(PCIE_CONS*PCIE_LANES-1 downto 0)
out
=====
Configuration status interface (PCIE_USER_CLK)
=====
=====
PCIE_LINK_UP
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
PCIe link up flag per PCIe endpoint
PCIE_MPS
slv_array_t(PCIE_ENDPOINTS-1 downto 0)(3-1 downto 0)
out
PCIe maximum payload size
PCIE_MRRS
slv_array_t(PCIE_ENDPOINTS-1 downto 0)(3-1 downto 0)
out
PCIe maximum read request size
PCIE_EXT_TAG_EN
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
PCIe extended tag enable (8-bit tag)
PCIE_10B_TAG_REQ_EN
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
PCIe 10-bit tag requester enable
PCIE_RCB_SIZE
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
PCIe RCB size control
CARD_ID
slv_array_t(PCIE_ENDPOINTS-1 downto 0)(CARD_ID_WIDTH-1 downto 0)
in
Card ID / PCIe Device Serial Number
=====
DMA RQ MFB+MVB interface (PCIE_CLK or DMA_CLK)
=====
PTC ENABLE: MFB+MVB bus for transferring RQ PTC-DMA transactions. The MFB+MVB bus is clocked at DMA_CLK.
PTC DISABLE: MFB bus only, for transferring RQ PCIe transactions (format according to the PCIe IP used). Compared to the standard MFB specification, it does not allow gaps (SRC_RDY=0) inside transactions and requires that the first transaction in a word starts at byte 0. The MFB bus is clocked at PCIE_CLK.
DMA_RQ_MFB_DATA
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS*RQ_MFB_REGION_SIZE*RQ_MFB_BLOCK_SIZE*RQ_MFB_ITEM_WIDTH-1 downto 0)
in
DMA_RQ_MFB_META
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS*PCIE_RQ_META_WIDTH-1 downto 0)
in
DMA_RQ_MFB_SOF
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS-1 downto 0)
in
DMA_RQ_MFB_EOF
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS-1 downto 0)
in
DMA_RQ_MFB_SOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS*max(1,log2(RQ_MFB_REGION_SIZE))-1 downto 0)
in
DMA_RQ_MFB_EOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS*max(1,log2(RQ_MFB_REGION_SIZE*RQ_MFB_BLOCK_SIZE))-1 downto 0)
in
DMA_RQ_MFB_SRC_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
in
DMA_RQ_MFB_DST_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
out
DMA_RQ_MVB_DATA
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS*DMA_UPHDR_WIDTH-1 downto 0)
in
DMA_RQ_MVB_VLD
slv_array_t(DMA_PORTS-1 downto 0)(RQ_MFB_REGIONS-1 downto 0)
in
DMA_RQ_MVB_SRC_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
in
DMA_RQ_MVB_DST_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
out
=====
DMA RC MFB+MVB interface (PCIE_CLK or DMA_CLK)
=====
PTC ENABLE: MFB+MVB bus for transferring RC PTC-DMA transactions. The MFB+MVB bus is clocked at DMA_CLK.
PTC DISABLE: MFB bus only, for transferring RC PCIe transactions (format according to the PCIe IP used). Compared to the standard MFB specification, it does not allow gaps (SRC_RDY=0) inside transactions and requires that the first transaction in a word starts at byte 0. The MFB bus is clocked at PCIE_CLK.
DMA_RC_MFB_DATA
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS*RC_MFB_REGION_SIZE*RC_MFB_BLOCK_SIZE*RC_MFB_ITEM_WIDTH-1 downto 0)
out
DMA_RC_MFB_META
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS*PCIE_RC_META_WIDTH-1 downto 0)
out
DMA_RC_MFB_SOF
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS-1 downto 0)
out
DMA_RC_MFB_EOF
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS-1 downto 0)
out
DMA_RC_MFB_SOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS*max(1,log2(RC_MFB_REGION_SIZE))-1 downto 0)
out
DMA_RC_MFB_EOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS*max(1,log2(RC_MFB_REGION_SIZE*RC_MFB_BLOCK_SIZE))-1 downto 0)
out
DMA_RC_MFB_SRC_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
out
DMA_RC_MFB_DST_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
in
DMA_RC_MVB_DATA
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS*DMA_DOWNHDR_WIDTH-1 downto 0)
out
DMA_RC_MVB_VLD
slv_array_t(DMA_PORTS-1 downto 0)(RC_MFB_REGIONS-1 downto 0)
out
DMA_RC_MVB_SRC_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
out
DMA_RC_MVB_DST_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
in
=====
DMA CQ MFB interface - DMA-BAR (PCIE_CLK)
=====
MFB bus for transferring CQ DMA-BAR PCIe transactions (format according to the PCIe IP used). Compared to the standard MFB specification, it does not allow gaps (SRC_RDY=0) inside transactions and requires that the first transaction in a word starts at byte 0.
DMA_CQ_MFB_DATA
slv_array_t(DMA_PORTS-1 downto 0)(CQ_MFB_REGIONS*CQ_MFB_REGION_SIZE*CQ_MFB_BLOCK_SIZE*CQ_MFB_ITEM_WIDTH-1 downto 0)
out
DMA_CQ_MFB_META
slv_array_t(DMA_PORTS-1 downto 0)(CQ_MFB_REGIONS*PCIE_CQ_META_WIDTH-1 downto 0)
out
DMA_CQ_MFB_SOF
slv_array_t(DMA_PORTS-1 downto 0)(CQ_MFB_REGIONS-1 downto 0)
out
DMA_CQ_MFB_EOF
slv_array_t(DMA_PORTS-1 downto 0)(CQ_MFB_REGIONS-1 downto 0)
out
DMA_CQ_MFB_SOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(CQ_MFB_REGIONS*max(1,log2(CQ_MFB_REGION_SIZE))-1 downto 0)
out
DMA_CQ_MFB_EOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(CQ_MFB_REGIONS*max(1,log2(CQ_MFB_REGION_SIZE*CQ_MFB_BLOCK_SIZE))-1 downto 0)
out
DMA_CQ_MFB_SRC_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
out
DMA_CQ_MFB_DST_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
in
=====
PCIE CC MFB interface - DMA-BAR (PCIE_CLK)
=====
MFB bus for transferring CC DMA-BAR PCIe transactions (format according to the PCIe IP used). Compared to the standard MFB specification, it does not allow gaps (SRC_RDY=0) inside transactions and requires that the first transaction in a word starts at byte 0.
DMA_CC_MFB_DATA
slv_array_t(DMA_PORTS-1 downto 0)(CC_MFB_REGIONS*CC_MFB_REGION_SIZE*CC_MFB_BLOCK_SIZE*CC_MFB_ITEM_WIDTH-1 downto 0)
in
DMA_CC_MFB_META
slv_array_t(DMA_PORTS-1 downto 0)(CC_MFB_REGIONS*PCIE_CC_META_WIDTH-1 downto 0)
in
DMA_CC_MFB_SOF
slv_array_t(DMA_PORTS-1 downto 0)(CC_MFB_REGIONS-1 downto 0)
in
DMA_CC_MFB_EOF
slv_array_t(DMA_PORTS-1 downto 0)(CC_MFB_REGIONS-1 downto 0)
in
DMA_CC_MFB_SOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(CC_MFB_REGIONS*max(1,log2(CC_MFB_REGION_SIZE))-1 downto 0)
in
DMA_CC_MFB_EOF_POS
slv_array_t(DMA_PORTS-1 downto 0)(CC_MFB_REGIONS*max(1,log2(CC_MFB_REGION_SIZE*CC_MFB_BLOCK_SIZE))-1 downto 0)
in
DMA_CC_MFB_SRC_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
in
DMA_CC_MFB_DST_RDY
std_logic_vector(DMA_PORTS-1 downto 0)
out
=====
MI32 interfaces (MI_CLK)
=====
MI - Root of the MI32 bus tree for each PCIe endpoint (connection to the MTC)
MI_DBG - MI interface to PCIe registers (currently only debug registers)
MI_CLK
std_logic
in
MI_RESET
std_logic
in
MI_DWR
slv_array_t (PCIE_ENDPOINTS-1 downto 0)(32-1 downto 0)
out
MI_ADDR
slv_array_t (PCIE_ENDPOINTS-1 downto 0)(32-1 downto 0)
out
MI_BE
slv_array_t (PCIE_ENDPOINTS-1 downto 0)(32/8-1 downto 0)
out
MI_RD
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
MI_WR
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
out
MI_DRD
slv_array_t (PCIE_ENDPOINTS-1 downto 0)(32-1 downto 0)
in
MI_ARDY
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
in
MI_DRDY
std_logic_vector(PCIE_ENDPOINTS-1 downto 0)
in
MI_DBG_DWR
std_logic_vector(32-1 downto 0)
in
MI_DBG_ADDR
std_logic_vector(32-1 downto 0)
in
MI_DBG_BE
std_logic_vector(32/8-1 downto 0)
in
MI_DBG_RD
std_logic
in
MI_DBG_WR
std_logic
in
MI_DBG_DRD
std_logic_vector(32-1 downto 0)
out
MI_DBG_ARDY
std_logic
out
MI_DBG_DRDY
std_logic
out
=====
MISC SIGNALS (the clock signal is not defined)
=====
=====
MISC_TOP2PCIE
slv_array_t(PCIE_ENDPOINTS-1 downto 0)(MISC_TOP2PCIE_WIDTH-1 downto 0)
in
Optional signal for MISC connection from Top-Level FPGA design to PCIE core.
MISC_PCIE2TOP
slv_array_t(PCIE_ENDPOINTS-1 downto 0)(MISC_PCIE2TOP_WIDTH-1 downto 0)
out
Optional signal for MISC connection from PCIE core to Top-Level FPGA design.
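The width expressions used in the port list can be evaluated with a short Python sketch. Two assumptions are made here and are not confirmed by this page: `log2` in the port types is taken to be a ceiling logarithm, and the 3-bit PCIE_MPS and PCIE_MRRS values are taken to carry the raw encoding from the PCIe Device Control register (0 = 128 B up to 5 = 4096 B). All function names are hypothetical.

```python
def ceil_log2(n: int) -> int:
    """Ceiling of log2(n) for n >= 1 (assumed meaning of log2 in the port types)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return (n - 1).bit_length()


def sof_pos_width(regions: int, region_size: int) -> int:
    """Per-port width of *_MFB_SOF_POS: REGIONS*max(1,log2(REGION_SIZE))."""
    return regions * max(1, ceil_log2(region_size))


def eof_pos_width(regions: int, region_size: int, block_size: int) -> int:
    """Per-port width of *_MFB_EOF_POS: REGIONS*max(1,log2(REGION_SIZE*BLOCK_SIZE))."""
    return regions * max(1, ceil_log2(region_size * block_size))


def decode_payload_size(code: int) -> int:
    """Decode a 3-bit PCIe size field into bytes, assuming the raw
    Device Control register encoding (0 -> 128 B, ..., 5 -> 4096 B)."""
    if not 0 <= code <= 5:
        raise ValueError("reserved encoding")
    return 128 << code


if __name__ == "__main__":
    # Default generics: REGIONS=2, REGION_SIZE=1, BLOCK_SIZE=8.
    print("SOF_POS width:", sof_pos_width(2, 1))
    print("EOF_POS width:", eof_pos_width(2, 1, 8))
    print("MPS code 2 ->", decode_payload_size(2), "bytes")
```

Under these assumptions, the default generics give a 2-bit SOF_POS and a 6-bit EOF_POS per DMA port.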