Key Challenges and Innovations for 800G and 1.6T Networking

This article is Part 2 of a two-part series on 1.6T Networks and Hyperscale Data Centers

In Part One, we explored the vital role of data centers and edge computing in enabling emerging technologies. The demand for faster data speeds that these technologies place on networks has created a need for 800G and 1.6T transceivers. I also explained the basics of data center infrastructure and how the Institute of Electrical and Electronics Engineers (IEEE) and the Optical Internetworking Forum (OIF) create the standards that govern the physical layer transceivers and interfaces connecting data centers and edge computing networks.

In Part Two, we will look at the innovations required to increase Ethernet speeds from 400 Gb/s (400G) to 800G and 1.6 Tb/s (1.6T) to meet the high demands of emerging technologies. The main challenges to faster Ethernet speeds are:
• Speeds/capacity
• Signal integrity
• Power efficiency

Increasing Speed and Data Capacity

One way to increase a network's aggregate data rate is to use more parallel lanes. Today's 400G systems use 56 GBd PAM4 lanes (112 Gb/s per lane). The first generation of 800G will likely consist of eight 112 Gb/s lanes, for an aggregate data rate of 800 Gb/s. However, doubling the data rate per lane is more efficient than adding lanes. The per-lane data rate can be raised by increasing either the symbol (or baud) rate or the number of bits per symbol.

Increasing the symbol rate pushes bits through the channel faster, potentially increasing signal degradation. Moving to a higher-order pulse amplitude modulation scheme (PAM-N) sends more bits per symbol, but the margin for error shrinks and the decision thresholds tighten.
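The two levers can be sketched with some quick arithmetic. This is a rough back-of-the-envelope sketch; the rates are nominal and ignore FEC and encoding overhead:

```python
import math

def lane_rate_gbps(symbol_rate_gbd: float, pam_levels: int) -> float:
    """Raw lane rate: symbol rate times bits per symbol (log2 of PAM levels)."""
    return symbol_rate_gbd * math.log2(pam_levels)

def aggregate_rate_gbps(lanes: int, symbol_rate_gbd: float, pam_levels: int) -> float:
    """Aggregate rate across parallel lanes."""
    return lanes * lane_rate_gbps(symbol_rate_gbd, pam_levels)

# Eight 56 GBd PAM4 lanes -> first-generation 800G
print(aggregate_rate_gbps(8, 56, 4))    # 896.0 (nominally "800G" after overhead)

# Doubling the symbol rate to 112 GBd yields 224 Gb/s lanes -> 1.6T over eight lanes
print(aggregate_rate_gbps(8, 112, 4))   # 1792.0
```

The same 224 Gb/s lane rate could instead come from more bits per symbol, e.g. `lane_rate_gbps(56, 16)` also returns 224.0, which is exactly the symbol-rate-versus-modulation tradeoff the standards bodies weigh.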

The IEEE and OIF will consider the tradeoffs of each method of implementation when defining the 800G and 1.6T standards. Both groups have set out to define 800G and 1.6T over 224 Gb/s lanes.

Here are some of the challenges and potential solutions to achieving 224 Gb/s lane rates:

Faster Switch Silicon SerDes

Faster networking switch chips are essential to increasing lane speeds. High-speed application-specific integrated circuits (ASICs) enable low-latency switching between elements in a server rack and the data center. From 2010 to 2022, switch silicon bandwidth increased from 640 Gb/s to 51.2 Tb/s, driven by successive improvements in CMOS process technology.

The SerDes (serializer/deserializer) speed and the number of SerDes lanes (I/O pins) define a chip's bandwidth. A 51.2 Tb/s chip has 512 lanes of 100 Gb/s SerDes. That chip can support 128 ports (four lanes of 100 Gb/s each) of 400G Ethernet. The next generation of switch silicon will double the bandwidth once again: 102.4T switches will have 512 lanes of 200 Gb/s SerDes. These switches will support 800G and 1.6T over 224 Gb/s lanes.
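As a quick sanity check on those figures, chip bandwidth and port count follow directly from the lane count and SerDes speed. This is illustrative arithmetic only; real switches support mixed port configurations:

```python
def switch_bandwidth_tbps(serdes_lanes: int, serdes_gbps: int) -> float:
    """Total switch bandwidth = number of SerDes lanes x speed per lane."""
    return serdes_lanes * serdes_gbps / 1000

def port_count(serdes_lanes: int, serdes_gbps: int, port_gbps: int) -> int:
    """How many Ethernet ports of a given speed the chip can drive."""
    return serdes_lanes * serdes_gbps // port_gbps

print(switch_bandwidth_tbps(512, 100))   # 51.2  (current generation)
print(port_count(512, 100, 400))         # 128 ports of 400G
print(switch_bandwidth_tbps(512, 200))   # 102.4 (next generation)
print(port_count(512, 200, 1600))        # 64 ports of 1.6T
```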

Higher-Order Modulation (PAM)

Increasing the symbol rate can cause signal degradation as the data moves faster through the channel. As maintaining the signal integrity of high-speed digital communications has become more complex, the standards organizations have moved to higher-order modulation schemes. 400G Ethernet uses four-level pulse amplitude modulation (PAM4) SerDes to achieve a 100 Gb/s data rate at a symbol rate of 50 GBd. PAM4 signaling allowed 400G networks to use four 100 Gb/s lanes instead of eight 50 Gb/s lanes.

There is a tradeoff to PAM signaling. Sending more bits per symbol lowers the noise margin for each symbol. With non-return-to-zero (NRZ) signaling, the voltage range distinguishing a zero bit from a one bit is wide. As the number of bits per symbol increases, the thresholds get tighter and noise immunity is reduced. A level of noise that would not close an eye diagram at 50 GBd NRZ (meaning the receiver can clearly distinguish between bit levels) can cause trouble for a receiver trying to interpret a 50 GBd PAM4 symbol.
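A first-order way to see the penalty: for the same peak-to-peak swing, PAM-N stacks N−1 eyes vertically where NRZ has one, so each eye's opening shrinks accordingly. The sketch below uses this idealized model and ignores equalization and coding gain:

```python
import math

def relative_eye_height(pam_levels: int) -> float:
    """Vertical eye opening of PAM-N relative to NRZ at the same peak swing.
    PAM-N packs (N - 1) eyes into the swing NRZ uses for one."""
    return 1.0 / (pam_levels - 1)

def snr_penalty_db(pam_levels: int) -> float:
    """Ideal amplitude penalty versus NRZ, in dB."""
    return 20 * math.log10(pam_levels - 1)

for levels in (2, 4, 6, 8):  # NRZ, PAM4, PAM6, PAM8
    print(levels, relative_eye_height(levels), round(snr_penalty_db(levels), 2))
```

By this model, a PAM4 eye is one-third the height of an NRZ eye (roughly a 9.5 dB penalty), which is why Figure 1 shows tighter margins, and why PAM6/PAM8 are being deferred to future generations.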

Figure 1: PAM4 signals have smaller eye heights and therefore tighter design margins regarding noise and jitter.

For now, the industry is likely to retain PAM4 and look instead to other methods of maintaining data integrity at high speeds. Future generations of the standards may adopt higher-order modulation schemes (PAM6 or PAM8).

Maintaining Signal Integrity with Forward Error Correction (FEC)

In most high-speed data standards, finely tuned equalizers in the transmitter and receiver ensure that signals transmitted through a channel can be interpreted on the other end, compensating for signal degradation in the channel. However, as faster speeds push physical limits further, more complex approaches become necessary. One such solution is forward error correction.

Forward error correction is a technique of transmitting redundant data to help a receiver piece together a signal that may have corrupted bits. FEC algorithms are usually good at recovering data frames when random errors occur, but are less effective against burst errors, when entire frames are lost. Losing whole data frames makes it harder for the receiver to reconstruct the signal.
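The random-versus-burst distinction is easy to demonstrate with a toy code. The sketch below uses a simple triple-repetition code, which is far weaker than the Reed-Solomon FEC real Ethernet PHYs use, purely to show why scattered errors are correctable while a burst is not:

```python
def encode(bits, copies=3):
    """Toy repetition FEC: transmit each bit three times."""
    return [b for bit in bits for b in [bit] * copies]

def decode(coded, copies=3):
    """Majority vote over each group of repeated bits."""
    out = []
    for i in range(0, len(coded), copies):
        group = coded[i:i + copies]
        out.append(1 if sum(group) > copies // 2 else 0)
    return out

data = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1]
coded = encode(data)

# Scattered single-bit errors land in different groups: the vote recovers them.
scattered = coded[:]
for i in (2, 10, 25):
    scattered[i] ^= 1
print(decode(scattered) == data)   # True

# A burst flips whole groups of repeated bits: the vote is overwhelmed.
burst = coded[:]
for i in range(6, 12):
    burst[i] ^= 1
print(decode(burst) == data)       # False
```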

224 Gb/s transceivers need stronger FEC algorithms to transmit and receive data successfully. Each FEC architecture trades off coding gain, overhead, latency, and power efficiency. Test and measurement developers are working on FEC-aware receiver test solutions to identify when frame losses occur and help debug them.

Figure 2: Types of FEC architectures and their tradeoffs. Credit: Cathy Liu, Broadcom

Solving Power Efficiency with Optical Modules

Perhaps the most difficult challenge facing data centers is power consumption. Data centers consume around 1% of the world’s total generated power. Data center operators need to scale processing capacity without proportionally increasing the power consumption. A key component of power efficiency is the optical module.

While power consumption per bit has decreased over time, power consumption per optical module has increased with each successive generation. 100G quad small form factor pluggable (QSFP28) modules used less than 5W of power, but 400G QSFP-DD (double density) modules use up to 14W.

As optical module designs mature, they become more efficient. 800G QSFP-DD modules are expected to debut at a power consumption of 17W and fall to around 10W as the technology matures. Generally, power consumption per bit is decreasing. However, with an average of 50,000 optical modules in each data center, the rising per-module power consumption remains a concern.
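The per-bit trend behind those figures is easy to verify. Note that the 100G power value below is an assumed representative number, since the text says only "less than 5W":

```python
# (data rate in Gb/s, module power in W); 4.5 W for 100G is an assumed
# representative value within the "less than 5W" range quoted above.
modules = {
    "100G QSFP28":           (100, 4.5),
    "400G QSFP-DD":          (400, 14.0),
    "800G QSFP-DD (early)":  (800, 17.0),
    "800G QSFP-DD (mature)": (800, 10.0),
}

for name, (rate_gbps, power_w) in modules.items():
    # W / (Gb/s) = nJ per bit; x1000 converts to pJ per bit
    pj_per_bit = power_w / rate_gbps * 1000
    print(f"{name}: {pj_per_bit:.1f} pJ/bit")
```

Energy per bit falls from roughly 45 pJ/bit at 100G to about 21 pJ/bit for early 800G modules, even as the power of each module climbs, which is exactly the tension the paragraph describes.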

Figure 3: Power consumption of optical module generations. Credit: Supriyo Dey, Eoptolink

To increase power efficiency, developers are working on alternative optical module architectures. Co-packaged optics move the optics next to the ASIC, eliminating the optical retimer and performing optoelectronic conversion inside the package. The tradeoff is that power dissipation is concentrated inside the ASIC package. Although pluggable optics are likely to remain in use for 800G systems, later versions of the 800G and 1.6T standards may adopt co-packaged optics as the technology matures.

Figure 4: Pluggable and co-packaged optics. Credit: Tony Chan Carusone, Alphawave IP

Timeline to 800G and 1.6T

While there is no way to predict the future exactly, we can make some observations based on the current state of networking R&D. 2022 saw the final release of the OIF's 112 Gb/s standard and the IEEE's 802.3ck (400G) standard. These standards provide the groundwork for defining 800G over 112 Gb/s lanes. The first 51.2T switch silicon was also released in 2022, enabling 64 ports of 800G Ethernet, and validation began on the first 800G transceivers.

This year, the standards organizations will release the first versions of the IEEE 802.3df and OIF 224 Gb/s standards, which will give developers a better indication of how 800G and 1.6T systems might be constructed using 112 Gb/s and 224 Gb/s lanes. The 800G rollout over eight 112 Gb/s lanes is coming.

In the next two years, expect the IEEE and OIF to finalize the physical layer standards and look for more news about co-packaged optics, 1.6T transceivers, and 224 Gb/s SerDes switch silicon. These developments will set the stage for the final validation push for 800G and 1.6T using 224 Gb/s lanes.

Figure 5: Projected timeline for 800G and 1.6T developments

For now, 400G is undergoing mass deployment. Operators will upgrade hyperscale data centers to support the current wave of demand, but ultimately, they only buy time until the next inevitable speed grade. By 2025, we could see 448 Gb/s SerDes chips (100T ASICs) on the market. We could be talking about 3.2T networks by then.

Data centers will always need more efficient and scalable data technologies. 1.6T networks will enable 5G, AI, and IoT applications to fully mature by processing mountains of data at lightning speeds. Today’s developers have their sights on that near future, already hard at work crafting the invisible backbone of tomorrow’s connected society.

But don't just take it from me. Take the 1.6T Ethernet in the Data Center course on Keysight University to get exclusive insights from a variety of industry experts from across the networking ecosystem.