Sailesh: What are some of the biggest macro trends influencing data center design today?
Rami: One of the most apparent trends is how the data center is evolving to optimize every piece of its design around purpose-built computing. The industry used to build on the classic von Neumann model, in which the processing, memory, storage and IO units are all locked together as a unit, with x86-architecture CPUs (Intel Xeon) dominating the processing part. Think of the classic “big rack” architecture, where scaling a workload simply meant adding more of the same homogeneous racks to the installation. The rise of workload-based computing, driven initially by cloud services and more recently by artificial intelligence, has spurred new innovations in data center architectures. What we're seeing now are architectures with more heterogeneous, purpose-built types of computing that incorporate everything from systems-on-chip and GPUs to tensor processors. There's a whole new range of chips that are much better suited to specific workloads and applications such as speech and pattern recognition, visual inferencing and data analytics. The interlocked resources of memory, storage and IO are also being broken out to become more fungible resources that can be shared dynamically at rack-level scale and beyond.
Sailesh: What are the technology drivers behind this trend?
Rami: One of the main drivers is the move to a much more distributed computing architecture in which a growing number of microtasks are farmed out to specialized, more optimized types of compute resources. This is great for improving data center efficiency, but it also puts a lot of emphasis on the interconnects that link all of these resources and move data in and out of memory and storage and between all the different processing elements. This is the area where my group focuses. We don't make the CPU, GPU, memory or storage; what we make are the chips that move all the data around, whether from SSDs, DRAM or the network to the xPU elements and back.
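As a conceptual illustration of that microtask model, here is a minimal sketch of routing tasks to the compute resource best suited to each workload. The workload classes, accelerator names and dispatch table below are hypothetical examples, not a real scheduler or a Renesas API:

```python
# Illustrative sketch: farming microtasks out to purpose-built compute.
# All names below are invented for illustration.

from dataclasses import dataclass

@dataclass
class Microtask:
    workload: str   # e.g. "speech", "visual_inference", "analytics"
    payload: bytes

# Dispatch table: each workload class maps to the compute pool best suited to it.
ACCELERATOR_FOR = {
    "speech": "tensor_processor",
    "visual_inference": "gpu",
    "analytics": "cpu_cluster",
}

def dispatch(task: Microtask) -> str:
    """Pick a compute pool for a task; fall back to general-purpose CPUs."""
    target = ACCELERATOR_FOR.get(task.workload, "cpu_cluster")
    # In a real data center, this is the point where data would move across
    # the interconnect fabric to the chosen pool.
    return target

print(dispatch(Microtask("speech", b"...")))  # -> tensor_processor
```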
Sailesh: Tell us more about these interconnect challenges.
Rami: Where that's concerned, we’re looking at better ways of moving data that break out of the von Neumann interlock I mentioned. The first principle is bandwidth, bandwidth and more bandwidth. The modern data center must be built to scale from peta- to exa- to zettabyte capabilities. Heterogeneously distributed computing models, coupled with the massive amounts of data needed by modern neural network training algorithms, place a huge strain on the interconnect webbing of a data center. The second principle is data integrity. Modern data centers are modeling themselves on the old-world telecom networks in terms of five-nines (99.999%) reliability and data security. The third is layering fungibility of resources into the interconnect. New protocols such as CXL, or Compute Express Link, are paving the way for breaking out traditionally captive resources like memory and unlocking true composability in the data center. And, finally, a little further into the future is adding compute power into the interconnect structure itself and bringing intelligence closer to where the data resides.
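For a rough sense of what five-nines means in practice, here is a quick back-of-the-envelope calculation. The figures follow directly from the definition of availability; they are not numbers from the interview:

```python
# Back-of-the-envelope: allowed downtime at "N nines" of availability.
MINUTES_PER_YEAR = 365.25 * 24 * 60

for nines in (3, 4, 5):
    availability = 1 - 10 ** -nines            # e.g. 0.99999 for five nines
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{nines} nines ({availability:.5f}): "
          f"~{downtime_min:.1f} minutes of downtime per year")

# Five nines allows only about 5.3 minutes of downtime per year.
```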
Sailesh: And what does that look like to the data center designer?
Rami: The data center architect of 10 years ago focused on building the most efficient “jack of all trades” data center they possibly could. Workload optimization was possible but not easy, and the number of workloads to optimize for was comparatively small. Flexibility and scalability could only be achieved by overprovisioning servers, so power delivery and cooling became the dominant concerns. Today, the options available for optimizing data center resources to a set of workloads or tasks are myriad. A mix of CPU, GPU and FPGA resources can be used to create the most efficient data processing farms possible for a variety of applications. Those resources can also be containerized and offered as services directly to customers with a high degree of customization, as popularized by Amazon’s Elastic Compute Cloud (EC2). Layered around these heterogeneous compute resources are memory, storage and IO components that are increasingly shared among all of them. This “composable” aspect of the infrastructure is what makes the interconnect webbing a frontier of innovation, and one where Renesas is leading.
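To make the composability idea concrete, here is a minimal sketch of a shared rack-level resource pool being “composed” into very different servers. The class names, pool names and capacities are invented for illustration, not drawn from any real composability framework:

```python
# Illustrative sketch of composable infrastructure: memory is drawn from a
# shared rack-level pool rather than being captive to one server.
# Names and capacities are invented for illustration.

from dataclasses import dataclass

@dataclass
class ResourcePool:
    name: str
    capacity: int            # abstract units (GB of memory, say)
    allocated: int = 0

    def take(self, amount: int) -> int:
        """Carve a slice out of the shared pool for one composed server."""
        if self.allocated + amount > self.capacity:
            raise RuntimeError(f"pool {self.name!r} exhausted")
        self.allocated += amount
        return amount

# One shared memory pool serving many heterogeneous compute nodes.
memory_pool = ResourcePool("rack_memory", capacity=8192)  # e.g. 8 TB

# Compose two very different servers from the same pool.
gpu_node_mem = memory_pool.take(2048)   # memory-hungry training node
cpu_node_mem = memory_pool.take(256)    # lightweight analytics node
print(memory_pool.allocated, "of", memory_pool.capacity, "GB composed out")
```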
Sailesh: What other developments are you seeing in the area of data center processors?
Rami: We're seeing a much richer field of computing SoCs, some of which are being designed in-house by the big data center players. Nvidia now has a whole range of GPU options that it's offering into the data center market, so we're seeing that universe expand more and more. We’re also seeing a constant race to increase processor core counts, which are now in the hundreds in some cases. Beyond extending the number of cores, it's also about optimizing the different CPU SKUs for specific workloads.
Sailesh: Is this creating ripple effects for interconnect design?
Rami: The general complaint from the industry is that while compute core counts are scaling rapidly, memory, storage and IO scaling are not keeping pace. Memory is by far the worst offender, as DRAM process migration becomes increasingly challenging. The way the CPU vendors are trying to manage that gap is by adding more and more memory channels in parallel, which just doesn't scale nearly as fast as people would like. So, we're building chips that allow DIMMs and DRAM to run with much higher bandwidth than they would natively attached, by running them through our own memory interconnect. Another approach to addressing this longer term is CXL, which moves away from the traditional DDR bus to a serial high-speed interconnect that rides on top of the PCIe Gen 5 electrical layer, with protocol hooks that enable load-store memory semantics and cache coherency between multiple compute resources. On the optical side of the world, where we also play, there's a move to transition from the more traditional intra-data-center interconnect, which is based on PAM-4 signaling, to something that looks more like a long-haul optical link, or what we call Coherent Lite. In this case, we do optical QAM modulation and coherent detection, which enhances transmission bit rates and distances.
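Some rough numbers behind that gap, using public DDR5 and PCIe Gen 5 spec figures; the specific speed grades chosen are illustrative, not from the interview:

```python
# Back-of-the-envelope bandwidth comparison (public spec figures; the
# exact speed grades are chosen for illustration).

# DDR5-4800: 4800 MT/s on a 64-bit (8-byte) channel.
ddr5_channel_gbs = 4800e6 * 8 / 1e9              # ~38.4 GB/s per channel
print(f"DDR5-4800 channel: {ddr5_channel_gbs:.1f} GB/s")

# Scaling by adding parallel channels, e.g. 8 channels on a big server CPU.
print(f"8 DDR5 channels:   {8 * ddr5_channel_gbs:.1f} GB/s")

# CXL rides on PCIe Gen 5 electricals: 32 GT/s per lane, 128b/130b encoding.
pcie5_x16_gbs = 32e9 * (128 / 130) * 16 / 8 / 1e9  # ~63 GB/s per direction
print(f"PCIe Gen 5 x16:    {pcie5_x16_gbs:.1f} GB/s per direction")

# On the optical side: PAM-4 carries 2 bits per symbol, while coherent
# 16-QAM carries 4 bits per symbol per polarization, which is part of why
# Coherent Lite links can push higher bit rates over longer distances.
print("bits/symbol: PAM-4 = 2, 16-QAM (per polarization) = 4")
```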
Sailesh: And finally, you mentioned hyperscale computing companies are pulling more processor design in house. How might that affect companies like Renesas?
Rami: This may seem counterintuitive, but it’s actually providing more opportunities for us. The big companies that host public clouds are starting to build their own silicon internally, with the intent of differentiating themselves from the traditional CPU players and from their competitors. This has been an outstanding opportunity to work hand-in-hand with them and enable that differentiation with hooks in our own silicon. It used to be that we’d just meet with a couple of CPU or memory guys and we’d have the whole picture. Now there's a much larger set of people we have to touch during product definition and design. It’s created a much richer ecosystem and more opportunities for us to differentiate. We have 20 years of leadership experience with all of these partners, and now we have the added benefit of strong engagements from other business units acting as force multipliers for the larger Renesas presence.