Featured post

Transaction Recording In Verilog Or System Verilog

As there is not yet a standard for transaction recording in Verilog or VHDL, ModelSim includes a set of system tasks to perform transac...

Saturday, 29 October 2011

Xilinx’s New Virtex-7 2000T FPGA with equivalent of 20 million ASIC gates

Xilinx has announced the first shipments of its Virtex-7 2000T Field Programmable Gate Array (FPGA). The Virtex-7 2000T is the world’s highest-capacity programmable logic device – it contains 6.8 billion transistors, providing customers access to 2 million logic cells. This is equivalent to 20 million ASIC gates, which makes these devices ideal for system integration, ASIC replacement, and ASIC prototyping and emulation.
This capacity is made possible by Xilinx’s Stacked Silicon Interconnect technology – also referred to as 2.5D ICs. The simplest packaging technology is to have a single die in the package. The next step up the “complexity ladder” is to have multiple die is the same package, but for all of these die to be attached directly to the package substrate. In this case, compared to the tracks on the die, the tracks on the package substrate are relatively large, slow, and driving signals onto them consumes a lot of power.
What Xilinx are doing is to go one more step up the technology ladder to use a special layer of silicon known as a "silicon interposer" combined with Through-Silicon Vias (TSVs). In this first incarnation of the technology, four FPGA die are attached to the silicon interposer, which – in addition to connecting the FPGAs to each other – provides connections to the package as illustrated below.


In the case of the Virtex-7 2000T, the FPGA die are implemented at the 28 nm technology node, while the passive silicon interposer is implemented at the 65 nm technology node. Implementing the large silicon interposer at this higher node reduces costs and increases yield without significantly degrading performance.
One way to think about this is that the silicon interposer essentially adds four additional tracking layers that can be used to connect the FPGAs to each other with more than 10,000 connections between each pair of adjacent die!
On top of this, Through-Silicon Vias (TSVs) are used to pass signals through the silicon interposer to C4 bumps on the bottom of the interposer. These bumps are then used to connect the interposer to the package substrate.

A view of Xilinx’s Virtex-7 2000T device showing the
packaging substrate (bottom), silicon interposer (middle),
and four FPGA die (top).

Compared with having to use standard I/O connections to integrate two FPGAs together on a circuit board, this stacked silicon interconnect technology is said to provide over 100X the die-to-die connectivity bandwidth-per-watt, at one-fifth the latency, without consuming any of the FPGAs' high-speed serial or parallel I/O resources.
Of particular interest to designers is the fact that, despite being composed of four die, the Virtex-7 2000T preserves the traditional FPGA use model in that users will program the device as one extremely large FPGA with the Xilinx tool flow and methodology.
Xilinx’s first application of 2.5D IC stacking gives customers twice the capacity of competing devices and leaps ahead of what Moore’s Law could otherwise offer in a monolithic 28-nanometer (nm) FPGA. Xilinx says that its customers can use Virtex-7 2000T FPGAs to replace large capacity ASICs to achieve overall comparable total costs in a third of the time, creating integrated systems that increase system bandwidth and reduce power by eliminating I/O interconnect, and accelerating the prototyping and emulation of advanced ASIC systems.

xilinx-bc-02A top and bottom view of Xilinx’s Virtex-7 2000T
the world’s highest-capacity FPGA using
Stacked Silicon Interconnect technology.

“The Virtex-7 2000T FPGA marks a major milestone in Xilinx’s history of innovation and industry collaboration,” said Victor Peng, Xilinx Senior Vice President, Programmable Platforms Development. “Of significance to our customers is the fact that Stacked Silicon Interconnect technology offers capacities that otherwise wouldn’t be possible in an FPGA for at least another process generation. They can immediately add new functionality to existing designs while forgoing an ASIC, cost reduce a 3 or 5 FPGA solution into a single FPGA or move ahead with prototyping and building system emulators using our largest FPGAs at least a year earlier than typical for a new generation.”
Historically, the largest devices that make up an FPGA family are the last to be made available to customers.  This is a result of the time it takes a new semiconductor process to ramp up and support the yields per wafer that make the largest devices economically viable. Xilinx’s Stacked Silicon Interconnect technology overcomes the challenges of yielding defect-free, large monolithic die by building the world’s largest capacity programmable logic device from four separate FPGA die interconnected upon a passive silicon interposer.
“ARM is pleased to work with Xilinx in deploying the class-leading Virtex-7 2000T device into our validation infrastructure,” said John Goodenough, Vice President Design Technology and Automation, ARM. “The new device underpins a flexible, yet targeted, emulation architecture and delivers a significant capacity improvement, allowing us to more easily run complete system verification and validation for our next generation processors.”
The Virtex-7 2000T device also provides equipment manufacturers with an integration platform that will help them overcome the challenges of lowering power while increasing performance and capabilities. By eliminating the I/O interfaces between different ICs on a circuit board, a system’s overall power consumption can be reduced considerably.
Consider the following example provided by Xilinx that compares a single Virtex-7 2000T with four of the largest monolithic ICs as illustrated below:



Actually, this is not really a fair comparison, because in terms of capacity the Virtex-7 2000T is equivalent to only around two of the largest monolithic ICs. But even comparing to two monolithic ICs results in a significant power advantage. (Having said this, I’d be interested to know just what was being exercised in this example – Logic? Memory? DSP slices? SERDES channels? – and at what frequency.)
Customers can also lower bill-of-material, test and development cycle costs when fewer IC devices are required on a circuit board. Because the die align side by side on a silicon interposer, this technology avoids the power and reliability issues that can result from stacking multiple dies on top of each other.  As was previously noted, the interposer includes over 10,000 high speed interconnects between each die enabling the high-performance integration required for a wide range of applications.
The Virtex-7 2000T FPGA gives customers the capacity, performance and power typically only found in large capacity ASICs, with the added benefits of re-programmability. In addition to having 1,954,560 logic cells, the Virtex-7 2000T device includes configurable logic blocks totaling 305,400 CLB slices and max distributed RAM of 21,550 Kbits. It has 2,160 DSP slices, 1,292 x 36Kb BRAMs (giving a total of 46,512 Kb of BRAM), 24 clock management tiles, four PCIe blocks and 36 GTX transceivers (each capable of 12.5 Gbits/second). It also has 24 I/O banks and a total of 1,200 user I/Os.
For the growing number of systems and markets where economics work against ASIC development, the Virtex-7 2000T FPGA offers a unique, scalable alternative to the risk of re-spins and more than $50 million in non-reoccurring engineering (NRE) costs of a 28nm custom-made IC.
All Xilinx 28nm devices – Artix-7, Kintex-7, Virtex-7 FPGAs, and the Zynq-7000 EPP – share a unified architecture that supports design and IP reuse within and across families. They are all built on TSMC’s 28nm HPL (low power with HKMG) process to deliver FPGAs that consume 50 percent less static power than competing devices. Because lower static power becomes increasingly important as device capacity goes up, 28nm HPL is a key factor behind the Virtex-7 2000T device’s lower power consumption compared to designs implemented in multiple FPGAs.


Please provide valuable comments and suggestions for our motivation. Feel free to write down any query if you have regarding this post.