Introducing the 2x QSFP28 FMC

Up to 100Gbps per port, on a wide range of FPGA dev boards

May 28, 2026 · 11 minutes

In this post I want to tell you about our latest-and-greatest new product: the 2x QSFP28 FMC. This is an FPGA Mezzanine Card (FMC) with a high-pin-count (HPC) connector and as the name suggests it has 2 QSFP28 cages for mating with QSFP/QSFP+/QSFP28 modules. Each QSFP28 cage connects to four gigabit transceiver lanes and supports up to 28Gbps per lane - that means that each cage can support an aggregated link speed of up to 100Gbps.

A quick look

photo of two FMCs - one connected to the VEK280 with QSFP28 loopback modules plugged in, the other sitting beside it component-side up

This was the test setup we used to do our IBERT testing - more on that further down. On the top side of the board we put the QSFP28 cages and all of the active components. Most of the ICs are level translators, but there is also a jitter-attenuating clock multiplier (Si5328) and its 114.285MHz crystal. A 3.3VDC switching regulator, fed from the 12VDC supply on the FMC connector, powers the QSFP28 modules. If you want a labelled view of which part is which, just check out the detailed description page of the datasheet.

the same setup as above, but giving a better look at the bottom side (solder side) of the FMC

The bottom side is where we put most of the labelling, the test points and the bicolor user LEDs. The LEDs are bottom-entry parts but they’re soldered on the top side of the board. I like to keep as few parts as possible on the bottom side of the board so that there’s less chance of them breaking off if the card gets dropped or something falls on it (not a rare occurrence on the typical cluttered desk of an electrical engineer).

closeup of the component-side-up FMC

The image above gives you a closer look at the top side. You can see the 2x QSFP28 cages, the high pin count FMC connector, and the supporting circuitry between them.

What’s on the board

Here’s a quick summary of the features:

  • 2x QSFP28 cages compatible with QSFP, QSFP+ and QSFP28 modules
  • 4 lanes per port at up to 25Gbps each (100Gbps aggregate per port)
  • Jitter-attenuating clock multiplier (Si5328) with support for recovered clock and SyncE applications
  • Voltage translators supporting a wide range of VADJ I/O voltages (1.2V to 3.3V)
  • High pin count FMC connector, pinout conforming to the VITA 57.1 FMC standard
  • Bicolor user LEDs (one per port) that you can drive from the FPGA
  • Test points to aid debugging (they connect to the QSFP28 module’s low-speed management I/Os)

The FMC connector provides power and presents the following I/O to the FPGA: 8 gigabit serial lanes (4 per port), QSFP management I/O (MODPRS_L, RESET_L, LPMODE, INT_L, MODSEL_L), I2C buses for the EEPROM, each QSFP module and the clock multiplier, an LVDS recovered clock from the FPGA into the clock multiplier, an LVDS configurable clock back out, a clock loss alarm, and the drive signals for the user LEDs.
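To give a feel for what the per-module I2C bus is used for, here is a minimal sketch that decodes two fields from a dump of a QSFP module's memory map: the identifier byte and the vendor name. The byte offsets and identifier codes are taken from the SFF-8636 and SFF-8024 specs as I understand them, and the `dump` buffer is made-up data. How you actually read the bytes (AXI IIC, Linux i2c-dev, bare metal) depends on your design and isn't shown.

```python
# Identifier codes per SFF-8024 (only the ones relevant to this board).
IDENTIFIERS = {0x0C: "QSFP", 0x0D: "QSFP+", 0x11: "QSFP28"}

def decode_qsfp(dump: bytes) -> dict:
    """Decode identifier and vendor name from a dump of lower memory
    plus upper page 00h (assumed SFF-8636 layout)."""
    if len(dump) < 164:
        raise ValueError("need at least bytes 0..163 of the memory map")
    # Byte 0: module identifier
    ident = IDENTIFIERS.get(dump[0], f"unknown (0x{dump[0]:02X})")
    # Bytes 148..163: vendor name, ASCII, space padded
    vendor = dump[148:164].decode("ascii", errors="replace").strip()
    return {"identifier": ident, "vendor": vendor}

# Made-up example data: a QSFP28 module from a hypothetical vendor.
dump = bytearray(256)
dump[0] = 0x11
dump[148:164] = b"EXAMPLE VENDOR".ljust(16)
print(decode_qsfp(bytes(dump)))
```

In a real design you would typically check MODPRS_L and assert MODSEL_L for the port before attempting the read.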

One thing worth pointing out: the 2x QSFP28 FMC needs 8 gigabit transceivers to support both ports (DP0-DP7), so it requires a high pin count (HPC) or FMC+ connector. Low pin count (LPC) connectors only route a single transceiver (DP0) - they won’t work with this board. If your carrier only has DP0-DP3 routed on its HPC, you’ll still be able to use Port 0.
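The connector rule above can be written down as a tiny check. This is only a sketch: Port 0 on DP0-DP3 comes from the paragraph above, while Port 1 on DP4-DP7 is my inference from the DP0-DP7 range, so confirm it against the datasheet pinout before relying on it.

```python
# Assumed lane mapping: Port 0 on DP0-DP3 (stated above),
# Port 1 on DP4-DP7 (inferred, check the datasheet pinout).
PORT_LANES = {0: range(0, 4), 1: range(4, 8)}

def usable_ports(routed_dp: set[int]) -> list[int]:
    """Return the ports whose four lanes are all routed on the carrier."""
    return [port for port, lanes in PORT_LANES.items()
            if all(lane in routed_dp for lane in lanes)]

print(usable_ports({0}))             # LPC carrier, DP0 only
print(usable_ports({0, 1, 2, 3}))    # HPC with DP0-DP3 routed
print(usable_ports(set(range(8))))   # HPC or FMC+ with DP0-DP7 routed
```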

What can you do with it?

Here are a few application ideas that we designed this for:

  • 100G Ethernet: The most obvious one. Insert a QSFP28 module and you’ve got 100GbE (4x 25G lanes). With two cages on the board, that’s potentially 200G of Ethernet bandwidth from a single FMC.
  • 40G Ethernet: Plug in a QSFP+ module and you have 40GbE (4x 10G lanes). Useful when your carrier doesn’t quite have the transceiver speed for 100G, or when you just need 40G.
  • Multiple lower-speed Ethernet links: Each lane in a QSFP cage is independent, so a single QSFP28 cage can support 4x 25G or 4x 10G Ethernet links via breakout cables. Two cages gives you 8 independent Ethernet links - handy for switch/router type applications, network test equipment, or anywhere you need a lot of ports.
  • InfiniBand and other high-speed protocols: The QSFP/QSFP28 form factor is widely used for InfiniBand (EDR at 25Gbps per lane, FDR at 14Gbps), Fibre Channel and other high-speed serial protocols.
  • Synchronous Ethernet (SyncE): The Si5328 jitter-attenuating clock multiplier makes the board well-suited to SyncE applications where network nodes need to be precisely synchronized to a master clock. I’ve been meaning to do a ref design for this because the Quad SFP28 FMC is also SyncE capable.
  • High-speed data acquisition / streaming: If you’ve got an RFSoC or other data-heavy FPGA design and you need to stream large volumes of data out to a host or another node somewhere (could even be kilometers away), QSFP28 is perfect for this. You can use Ethernet, Aurora, or a custom protocol.
  • Chip-to-chip / board-to-board links: Two of these FMCs connected back-to-back over QSFP28 cables (passive copper DACs work great over short distances) gives you a clean 4-lane 28Gbps interconnect between two FPGA boards. You can already do this with our MCIO PCIe FMC but with QSFP28 you can use an optical link and put some real distance between the boards if you need to.
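For the Ethernet cases in the list above, the "4x 25G" and "4x 10G" figures are payload rates; the lanes actually run a little faster because of 64b/66b line coding. The sketch below works through that arithmetic using the per-lane line rates from IEEE 802.3 (25.78125Gbps and 10.3125Gbps). The 28Gbps limit is the per-lane figure quoted for this board; everything else is illustrative.

```python
# (line rate per lane in Gbps, payload bits, coded bits) for 64b/66b links
LINKS = {
    "100GbE (100GBASE-R, 4 lanes)": (25.78125, 64, 66),
    "40GbE (40GBASE-R, 4 lanes)": (10.3125, 64, 66),
}
MAX_LANE_GBPS = 28.0  # per-lane limit quoted for this FMC

def payload_gbps(line_rate, data_bits, coded_bits, lanes=4):
    """Aggregate data rate after removing line-coding overhead."""
    return line_rate * lanes * data_bits / coded_bits

for name, (rate, data_bits, coded_bits) in LINKS.items():
    total = payload_gbps(rate, data_bits, coded_bits)
    fits = rate <= MAX_LANE_GBPS
    print(f"{name}: {rate} Gbps/lane -> {total:.1f} Gbps, "
          f"within lane limit: {fits}")
```

Both cases sit comfortably under the 28Gbps per-lane limit, which is why a carrier's transceiver speed, not the FMC, usually decides whether you get 100G or 40G.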

28Gbps signal integrity

This is the second Opsero product designed for 28Gbps serial links (the first was the Quad SFP28 FMC) and this time we put even more effort (and expense!) into getting the signal integrity where it should be (see the lessons learned below). By the way, 28Gbps is pretty much the limit of these FMC connectors, so they’re no use for the 112Gbps (PAM4) GTM transceivers on AMD Versal Premium devices - but I’m going off track. A quick summary of the choices:

  • Material: Panasonic Megtron-6 laminate and prepreg, designed for high-speed and low-loss. The high-speed traces are short (this is a small FMC after all) but why take the risk of using FR4 at these frequencies?
  • Stackup: 12 copper layers with grounded reference planes either side of the high-speed signal layers.
  • Routing: Curved high-speed traces (no angles), and saw-tooth bumps for P/N length matching where required.
  • Backdrilling: Used to remove the via stubs on all of the high-speed traces.
  • Voids and vias: Antipads around the HS via barrels, four ground return vias placed symmetrically around every layer transition, and we kept the number of layer transitions to a minimum.

If you want the full reasoning behind these choices, my Quad SFP28 FMC post goes into more detail and points to the DesignCon papers that we leaned on.

The proof is in the eye diagrams. Simulations are nice but when you make physical products, they have to actually work.

The eye diagrams

We ran IBERT on the VCK190 development board with the 2x QSFP28 FMC and a pair of QSFP28 loopback modules plugged into both cages. Each lane was driven with a PRBS31 pattern at 28Gbps, error free.
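"Error free" only means something in relation to how many bits were sent. A common rule of thumb is that if N bits pass with zero errors, you can claim BER < -ln(1 - CL) / N at confidence level CL. The sketch below turns that around to estimate how long a lane has to run at 28Gbps. The target BER values are examples only, not figures from our test run.

```python
import math

def bits_for_confidence(target_ber, confidence=0.95):
    """Bits that must pass with zero errors to claim BER < target_ber."""
    return -math.log(1.0 - confidence) / target_ber

def seconds_required(target_ber, line_rate_bps, confidence=0.95):
    """Error-free run time needed on one lane at the given line rate."""
    return bits_for_confidence(target_ber, confidence) / line_rate_bps

for ber in (1e-12, 1e-15):
    t = seconds_required(ber, 28e9)
    print(f"BER < {ber:g} at 95% confidence: {t:,.0f} s error-free per lane")
```

At 28Gbps, a BER of 1e-12 takes a bit under two minutes per lane to demonstrate, while 1e-15 takes over a day, so it's worth deciding on the target before starting the run.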

IBERT results at 28Gbps for the 2x QSFP28 FMC

Wide open eyes on all 8 lanes at 28Gbps. We’ve also tested this on the VEK280, and we have customers using it on other target boards already.

Compatible boards

We’ve verified compatibility for the 2x QSFP28 FMC on quite a few AMD/Xilinx development boards including:

  • Zynq UltraScale+: ZCU102 (HPC0 and HPC1), ZCU106 (HPC0), ZCU111, ZCU208, ZCU216, Avnet UltraZed EV
  • UltraScale+: VCU118
  • Versal: VCK190, VEK280, VHK158, VMK180, VPK120, VPK180

What I mean by “verified” is that we’ve made sure that those boards satisfy all of the requirements of the FMC card, including VADJ voltage, and GT and I/O pin assignments as well as mechanical constraints. Hardware tests have so far been limited to VCK190, VEK280 and ZCU216 but when we get the ref design up-and-running, that list will grow longer.

On the boards with FMC+ connectors and 28Gbps-capable transceivers (the RFSoCs and the Versal boards), you can hit the full 2x 100G. On the HPC carriers limited to 10Gbps transceivers, you’ll still get 2x 40G, which is a good deal of bandwidth.

The full table with notes, supported port counts and link speeds is on the compatibility page of the datasheet. We’ll be releasing reference designs for these boards over the coming months.

Lessons learned

Getting a new product designed, built and put on the market is not always easy. I like to share the stories and the lessons learned when I can.

If you’re connected to me on LinkedIn, you may have noticed that I started sharing photos and making announcements of this product early last year - so what happened? Problems happened.

The first prototypes looked good. Up to 10Gbps per lane, they were good. However, when we pushed it up to 28Gbps, bit errors started to appear, and two lanes completely dropped out. Clearly, we had a signal integrity problem.

Lesson 1: Minimize use of the outer layers for high-speed traces

After breaking-out from the pins, drop to the inner layers as soon as you can. Actually I already knew about this best-practice from one of Don Telian’s DesignCon papers, but I didn’t take it seriously enough (obviously). So Don’s reasoning is that the outer copper layers of the PCB have a “roughness” to them that the inner layers don’t have. This roughness improves component adherence - but it also contributes to losses on high frequency traces. The inner layers are smoother and less lossy, so use them for the large majority of the trace path. But this wasn’t the cause of my problem.
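To see why surface roughness matters at these speeds, it helps to look at the skin depth: the thin layer at the conductor surface where most of the high-frequency current flows. The sketch below uses the standard textbook formula for a good conductor with a typical resistivity value for copper. This is general background rather than something taken from Don's paper, and the frequencies are just the Nyquist frequencies of 10Gbps and 28Gbps NRZ signals.

```python
import math

MU_0 = 4 * math.pi * 1e-7   # permeability of free space, H/m
RHO_COPPER = 1.68e-8        # typical resistivity of copper, ohm*m

def skin_depth_m(freq_hz, resistivity=RHO_COPPER):
    """Skin depth of a non-magnetic good conductor, in metres."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU_0))

for gbps in (10, 28):
    nyquist_hz = gbps * 1e9 / 2  # fundamental of an NRZ 1010... pattern
    depth_um = skin_depth_m(nyquist_hz) * 1e6
    print(f"{gbps} Gbps NRZ ({nyquist_hz / 1e9:.0f} GHz): "
          f"skin depth = {depth_um:.2f} um")
```

At 14GHz the skin depth is roughly half a micron, which is smaller than the surface profile of many copper foils, so the current has to follow the bumps and the trace gets lossier.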

In my case, the two lanes that “completely dropped out” at 28Gbps were traces that I had extended on the top layer into a region where the QSFP28 loopbacks had a metallic lip that hovered just above the surface of the PCB. Those traces (microstrips) were designed for 100 ohm differential impedance, and their dimensions were calculated on the assumption that they have open air above them - not metal. Basically this metal lip was changing the impedance of those traces in that small section and causing a significant discontinuity at 28Gbps, hence reflections and the lanes dropping out. I broke the metal lips off - the lanes came up.
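As a rough illustration of what an impedance discontinuity does, the sketch below computes the reflection coefficient and return loss at the boundary between a 100 ohm line and a lower-impedance section. The impedance values are hypothetical: I didn't measure the impedance under the lip, and getting a real number would need a field solver. The formula also ignores the length of the section, which matters a lot in practice.

```python
import math

def reflection_coefficient(z0, z_section):
    """Voltage reflection coefficient going from z0 into z_section."""
    return (z_section - z0) / (z_section + z0)

def return_loss_db(gamma):
    """Return loss in dB for a given reflection coefficient."""
    return -20.0 * math.log10(abs(gamma))

Z0 = 100.0  # designed differential impedance, ohms
for z in (90.0, 80.0, 70.0):  # hypothetical impedance under the metal lip
    gamma = reflection_coefficient(Z0, z)
    print(f"{z:.0f} ohm section: gamma = {gamma:+.3f}, "
          f"return loss = {return_loss_db(gamma):.1f} dB")
```

Metal close above a microstrip adds capacitance and pulls the impedance down, so the reflection coefficient is negative in all three cases.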

Metallic lip on the QSFP28 modules

The image above shows two photos: on the left, a profile shot of the QSFP28 loopback module showing the protective metallic lip; on the right, the same profile shot but of the QSFP28 module plugged into a socket.

So I would add to Don Telian’s guidance by saying that you shouldn’t route too far on outer layers because you can’t always control what goes on up there - maybe there’ll be a wire harness sitting close to your traces, maybe you’ll need a heat-sink or a shield in that area later, who knows? On inner layers you can control what is above and below your traces; on outer layers you can’t, and anything that gets too close to those traces can significantly change their characteristics at these frequencies.

Lesson 2: At 28Gbps, simulate

So with a re-spin I could fix the worst two lanes. Great. But I still had bit errors to clean up on the other lanes. I needed EM simulation and advice from a signal integrity expert.

I took this problem to E3 Designers, a company that was highly recommended to me. I spoke with Dan Binnun, the lead designer and SI specialist. Dan analysed the key elements: the PCB stackup, the skew compensation (P/N length matching), the transitional vias, the QSFP28 socket and the FMC connector. He came back to me with detailed recommendations for how to design the skew compensation, the transitional vias, and the escape routing for each connector. All of these recommendations were backed up by Ansys EM simulations.

Here are some of the layout improvements that led to a single successful re-spin and our stunning IBERT results above:

  • Skew compensation using a saw-tooth pattern. In other high speed designs, we’d always used curved bumps, so I was surprised by this recommendation. But the “teeth” were much easier to work with and produced tighter length matching. It’s not obvious from the screen-shot, but the trace widths of the “teeth” are slightly larger than the trace widths of the main striplines. This is calculated to minimize the impedance disruption.
    Saw-tooth skew compensation pattern
  • Improved QSFP28 escape routing. Considering our need to keep a certain distance between the P/N vias, and the fact that a few of its pins are very close to cage mounting holes, the QSFP28 connector is surprisingly tight, and knowing the best escape pattern is not obvious.
    QSFP28 escape routing pattern
  • Improved FMC escape routing for the gigabit transceivers. This escape pattern is actually a recommendation from Samtec. This is another non-intuitive break-out pattern and Dan Binnun’s simulations confirmed that it did produce the best results. The main advantage of this breakout is that it aligns all the ground and signal vias on every odd row so that all the even rows are left open, creating wide paths to draw out multiple GT lanes away from the FMC connector.
    FMC gigabit transceiver escape routing pattern
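To put the skew compensation in perspective, here is a back-of-the-envelope sketch of how much of a bit period a P/N length mismatch eats up at 28Gbps. The dielectric constant of 3.4 is my assumption for a Megtron-6 stripline (the real value depends on the construction and frequency), and the mismatch lengths are arbitrary examples, not measurements from this board.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in mm/ps

def skew_ps(mismatch_mm, dk):
    """Intra-pair skew from a P/N length mismatch in a stripline."""
    return mismatch_mm * math.sqrt(dk) / C_MM_PER_PS

def unit_interval_ps(gbps):
    """Duration of one bit at the given line rate."""
    return 1000.0 / gbps

DK_ASSUMED = 3.4  # assumed dielectric constant, check your stackup
ui = unit_interval_ps(28)
for mismatch_mm in (0.05, 0.25, 1.0):
    s = skew_ps(mismatch_mm, DK_ASSUMED)
    print(f"{mismatch_mm} mm mismatch: {s:.2f} ps skew "
          f"({100 * s / ui:.1f}% of a {ui:.1f} ps UI)")
```

With a bit period of under 36ps, a millimetre of mismatch is already a sizeable chunk of the eye, which is why the tighter matching from the saw-tooth pattern is worth having.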

The message that I want to convey here is: be careful at 28Gbps+ because tiny details can come back to bite you. In fact, Dan would say that you should be simulating for even lower speeds, like 16Gbps. At these frequencies, leaning on best practices might not be enough, and you’ll likely save yourself some time and money by working with a company like E3 Designers before the layout even begins.

Where to get one

The 2x QSFP28 FMC is available directly from the Opsero website. It’s also listed at Digi-Key - the initial Digi-Key stock sold out quickly, but more stock will be coming online next week.

As always, if you have any questions about the board, compatibility with your carrier, or anything else, feel free to reach out.