## A Nanocore/CMOS Hybrid System-on-Package (SoP) Architecture for Future Nanoelectronic Systems

Roshan Weerasekera, Jian Liu, Li-Rong Zheng and Hannu Tenhunen Laboratory of Electronics and Computer Systems (LECS) KTH Microelectronics and Information Technology, 164 40 Kista, Sweden. roshan|jianliu|lrzheng|hannu@imit.kth.se

#### **ABSTRACT**

Recent results show that as the minimum feature size used in semiconductor device fabrication moves towards the nanometre scale, several physical and economic limits jeopardize device behaviour, binary logic, and the lithography techniques currently in use. To get past this "brick wall" and sustain Moore's Law, novel nanoelectronic devices are becoming increasingly popular and promising. However, interconnecting nano-devices into complex electronic systems has not yet been demonstrated. In this paper, we propose a Nanocore/CMOS Hybrid System-on-Package (SoP) architecture that is suitable for any emerging nanotechnology.

*Keywords*: AET cell, error-tolerant, hybrid, system-on-package, nanocore, nanoelectronic

#### 1 INTRODUCTION

During the last fifty years, the continuous reduction of transistor gate length has provided enormous benefits for all information technologies, including information storage, processing and transfer, by increasing their performance and reliability. However, recent studies point to serious challenges as device gate length is reduced further towards sub-50 nm feature sizes. Quantum effects and growing complexity have a profound impact on Integrated Circuit (IC) design and, given fast-moving market demand, they also affect the choice of the best architecture at design time [1]. As a result, maintaining Moore's law through the next decade (2010-2020) does not appear feasible through simple development of the major technologies and devices available today.

Nanoelectronics has achieved several breakthroughs and promises to overcome many of the limitations intrinsic to current semiconductor devices and manufacturing. Resonant Tunnelling Diodes (RTDs), Single Electron Tunnelling Devices (SETs), and Quantum Cellular Automata (QCAs) are a few examples of successful nanoelectronic devices [1][2]. However, interconnecting nanodevices into larger circuits is still one of the most difficult challenges [3]. The key challenges for interconnecting nanoelectronics can be summarized as follows:

- 1. *Interconnect delay*: The interconnect delay of a 0.07 µm wide, 100 µm long copper wire is about 0.7 ns, roughly a thousand times larger than the gate delay of a nano-device; i.e. the performance gain obtained from the devices will collapse entirely because of the excessive interconnect delay.
- 2. *Signal integrity*: The currents carried by single electrons (or packets of electrons) are so tiny that randomly distributed background charges and their fluctuations (induced by thermal effects, switching, crosstalk, etc.) could destroy the information carried by the logic gates.
- 3. *Defect-free chip*: The fabrication process in nanotechnology is inherently non-deterministic and prone to higher defect rates. It is likely that defect-free nanoelectronic chips will be impossible at tera-scale integration density.
- 4. *Power management*: The extremely low power consumption of each device does not mean low power consumption for the whole chip. In fact, tera-scale integration will result in a power density near 100 W/cm<sup>2</sup>; the question is how to deliver hundreds of amperes of current at a supply voltage below 1 V, or even in the mV range, and how to manage the power supply noise so that the single-electron signals are not destroyed.
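The figures quoted in items 1 and 4 can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: the wire delay, delay ratio and power density are taken from the text, while the supply voltages looped over and the function names are assumptions chosen for illustration.

```python
# Back-of-the-envelope check of the figures quoted in the challenge list.
# Inputs marked "from the text" come from the paper; everything else is an
# illustrative assumption.

WIRE_DELAY_S = 0.7e-9             # 100 um copper wire delay (from the text)
DELAY_RATIO = 1000                # wire delay vs. nano-device gate delay (from the text)
POWER_DENSITY_W_PER_CM2 = 100.0   # tera-scale integration (from the text)


def implied_gate_delay(wire_delay_s: float, ratio: float) -> float:
    """Gate delay implied by a wire delay that is `ratio` times larger."""
    return wire_delay_s / ratio


def supply_current_per_cm2(power_density_w: float, supply_voltage_v: float) -> float:
    """Current per square centimetre needed to deliver the power: I = P / V."""
    return power_density_w / supply_voltage_v


if __name__ == "__main__":
    gate_delay = implied_gate_delay(WIRE_DELAY_S, DELAY_RATIO)
    print(f"implied gate delay: {gate_delay * 1e12:.2f} ps")
    # Hypothetical supply voltages, chosen only to show the trend.
    for volts in (1.0, 0.5, 0.1):
        amps = supply_current_per_cm2(POWER_DENSITY_W_PER_CM2, volts)
        print(f"{volts:.1f} V supply -> {amps:.0f} A per cm^2")
```

The trend is the point of item 4: at a fixed power density, every reduction of the supply voltage raises the current that the power distribution network must carry.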

We believe that future nano-systems could be built either on nano-CMOS style technologies or on completely new, emerging approaches that are not known at the time of this publication. In recent years, mesh-type programmable circuits, for example neuromorphic networks in a crossbar structure, have been proposed as a viable solution for circuit-level architectures, but not at system level, because they do not address the design complexity problem. Despite the progress in nano-interconnects and self-assembly, there is no viable physical solution for the global interconnections of a large nano-system using these wires, particularly when performance is demanded. Both hybrid approaches proposed in [3] and [7] elucidate the integration of circuitry with a prefabricated CMOS substrate, but a method to achieve system-level fault tolerance has not been discussed.

We propose a nanocore/CMOS hybrid system-on-package (SoP) approach that can cope with these limitations and hence make effective use of emerging technologies.

#### 2 STRATEGY OF FAULT TOLERANT ARCHITECTURE

Today's microelectronic circuit design relies heavily on abstractions because of increased design complexity. This abstraction-based design methodology has been successful primarily because of accurate modelling and simulation at the different abstraction levels. Motivated by this, our primary strategy is to create new abstractions at the architecture level, and this architecture must be scalable. Scalability here implies:

- (a.) architectural and system scalability with respect to complexity,
- (b.) performance scalability with respect to geometrical scaling in underlying device and circuit structures, and
- (c.) design effort scalability with respect to increased functionality.

Furthermore, dealing with the physical effects for producing error- and fault-free pre-tested circuits will be increasingly difficult and cost intensive. Thus, our idea is to partly move error and fault tolerance issues to system design and architecture level from today's low level testing and testability design.

In network-on-a-chip style array structures in particular, dynamic management of these issues can be established as part of the on-chip services provided by the chip hardware and firmware. For implementing "infinitely" scalable systems, fine-grained homogeneous array processing seems to be a reasonable solution. In homogeneous array-style solutions, cell functionality can easily be relocated because the seed cells are identical, should reconfiguration be needed for error tolerance or system optimization. Of course, this also requires a methodology for handling error-tolerance issues. When a dynamically reconfigurable and error-tolerant system is designed only at hardware level, the basic array cells become too complicated and time-consuming to design and very inefficient to implement. We believe these issues can be tackled by developing software-type system objects that implement such properties mainly at system/architecture level, leaving some support functionality to be implemented at lower levels.

Two alternative approaches exist for implementing reconfigurable dynamic platform architectures: the Cellular Neural Network (CNN) and the Autonomous Error-Tolerant (AET) cellular network architecture used in this work. The CNN array concept is already used successfully for some parallel computation tasks with current technologies, using both analogue and digital processing. Scaling these implementations to future tera-scale nano-systems is yet to be investigated, particularly the intercellular interconnect schemes. In our AET concept, cells are physically autonomous and flexible, and the overall network is homogeneous, with identical cells and constant-pattern symmetric wiring, which implies strict constraints on intercellular connection schemes. Both the CNN and AET approaches seem to have the potential to become mainline scenarios for future nanoscale systems [6].

#### 3 NANOCORE/CMOS HYBRID SOP ARCHITECTURE

#### 3.1 AET cell structure

A single AET cell, depicted in Figure 1, has a hexagonal shape, which increases symmetry and homogeneity, because the hexagon is the regular polygon with the most sides that can fill a plane regularly. A plane filled with hexagons maps onto itself when it is rotated by only 60°. This is why a hexagon is a natural choice when implementing algorithms that could spread fractally. There are six symmetrical directions in which to proceed, compared with three or four directions when triangular or square shapes are used for single cells. Each hexagonal (AET) cell consists of a nanocore, CMOS cell peripherals, and their interface circuits.
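The six symmetrical directions and the 60° rotational symmetry can be illustrated with a small sketch. It uses axial `(q, r)` coordinates, a common convention for hexagonal grids that is assumed here for illustration and is not taken from the source; the function names are likewise hypothetical.

```python
# Hexagonal cell addressing in axial (q, r) coordinates.
# The coordinate convention and all names are illustrative assumptions.

# The six neighbour offsets of a hexagonal cell.
HEX_DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def hex_neighbours(cell):
    """Return the six cells adjacent to `cell`."""
    q, r = cell
    return [(q + dq, r + dr) for dq, dr in HEX_DIRECTIONS]


def rotate_60(cell):
    """Rotate a cell position by 60 degrees about the origin cell."""
    q, r = cell
    return (-r, q + r)


def hex_distance(a, b):
    """Number of cell-to-cell hops between two cells."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


if __name__ == "__main__":
    origin = (0, 0)
    ring = set(hex_neighbours(origin))
    # Rotating the neighbourhood by 60 degrees maps it onto itself.
    print({rotate_60(c) for c in ring} == ring)
```

Because every cell sees the same six offsets, the same wiring pattern and the same neighbour-handling logic can be reused for every cell in the array.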



Figure 1: Structure of AET cell (a) Cross-sectional view (b)

The nanocore is dedicated to local computing and could be non-FET or non-silicon. An efficient architecture for the nanocore is not well known at the time of this publication, but there is some promising work in the literature proposing crossbar architectures. In our method, the nanocore is not restricted to a particular nanotechnology, but we assume that it consists of nano-devices in a crossbar. Error control coding can be implemented at the interface between the nanocore and CMOS, most likely in threshold-logic circuits (majority voting logic). In addition, Reference [7] describes that it is advantageous to drive the nanocore with a signal swing different from the CMOS operating voltage levels; CMOS level shifters are therefore proposed to drive the crossbar input lines, and sense amplifiers to restore the crossbar output signals to CMOS voltage levels. The nano/CMOS interface in the AET cell discussed in this work might consist of circuits of that nature.
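Majority voting at the nanocore/CMOS interface can be sketched in software. The example below assumes, purely for illustration, that the nanocore produces an odd number of redundant copies of each output word; the source does not specify the redundancy scheme, and the function names are hypothetical.

```python
# Majority voting over redundant nanocore outputs.
# The redundancy scheme and all names are illustrative assumptions.

def majority_vote(bits):
    """Threshold logic: output 1 when more than half of the inputs are 1."""
    if len(bits) % 2 == 0:
        raise ValueError("use an odd number of redundant copies to avoid ties")
    return 1 if sum(bits) > len(bits) // 2 else 0


def vote_word(copies):
    """Bitwise majority over redundant copies of a word (equal-length bit lists)."""
    return [majority_vote(column) for column in zip(*copies)]


if __name__ == "__main__":
    # Three copies of a 4-bit word, each corrupted in a different bit position.
    copies = [
        [1, 0, 1, 1],
        [1, 1, 1, 0],
        [0, 0, 1, 1],
    ]
    print(vote_word(copies))
```

With three copies, any single faulty copy per bit position is masked; tolerating more faults requires more copies, at a corresponding area cost.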



Figure 2: Cellular Cluster [6]



Figure 3: Fractal Cluster [6]

#### 3.2 Cellular Cluster and the Fractal Cluster

If future system function designers still work inside a cell (relying mainly on IP re-use), reasonably assumed to be at the ten-million-gate level as in today's VLSI chips, one extra abstraction level (i.e. the cell) does not seem to be enough for a trillion-device chip. We therefore propose another abstraction, namely the cellular cluster (Figure 2). A nano-system will thus be a network of thousands of fractal cellular clusters (Figure 3). The cell structures are similar to those in [6]; the main difference is that all our cells are implemented in a nanocore/CMOS hybrid architecture, i.e. the core of each AET cell could consist of non-FET, non-silicon nano-devices, while all cell I/Os and intercellular communication are implemented in the CMOS substrate with fixed connections.
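A rough estimate shows why a second abstraction level is needed. Assuming, purely for illustration, that one gate corresponds to about one device, that a cell holds about $10^{7}$ gates, and that "thousands" of clusters means between $10^{3}$ and $10^{4}$, the hierarchy works out as

$$
N_{\text{cells}} \approx \frac{10^{12}\ \text{devices}}{10^{7}\ \text{devices per cell}} = 10^{5},
\qquad
\frac{N_{\text{cells}}}{N_{\text{clusters}}} \approx \frac{10^{5}}{10^{3}\ \text{to}\ 10^{4}} \approx 10\ \text{to}\ 100\ \text{cells per cluster}.
$$

These ratios are order-of-magnitude assumptions rather than values given in the source; they only indicate that a flat array of about $10^{5}$ cells would be hard to manage without grouping cells into clusters.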

The cellular cluster should be able to monitor its environment through sensors and take actions, for example switching off a failed cell and assigning its task to an empty cell. We therefore propose the cell of origin as the cluster manager, which senses its neighbouring cells and interacts with other clusters. Each cellular cluster consists of a homogeneous cellular array fabric dedicated to certain functions such as memory, signal processing, or computing. Thus, the whole chip is somewhat like a human brain, in which each region (and its cells) is dedicated to a certain function such as language, memory or sight, but we expect the AET cellular network to be more powerful because it is networked by dedicated high-performance inter-cellular interconnections.

#### 3.3 Co-existence of Nano devices and CMOS

Since non-silicon devices such as molecular electronics are used for the nanocore, our idea is to use a system-on-package (SoP) solution [10]. Here, we use silicon as the integration platform on which CMOS cellular I/Os and other circuits can be fabricated. Nano-cores are grown on top of these CMOS circuits and are connected to them through vertical nano-wires. The essential tasks are then to conceive and design new seamless interconnections, such as micro-coaxial vias, and package structures that are compatible with future nano-fabrication technologies.

#### 3.4 Fault Tolerance and Reconfigurability

Our AET cellular network is based on autonomously working cells and cell clusters. The network is used for system configuration management. It aims to map application functionality to implementation architectures, providing mechanisms for error tolerance, self-protection and re-configuration as required. This implies that a cellular cluster should be able to monitor its environment through sensors and take actions, for example switching off a failed cell, finding an empty cell and establishing a new communication link. It should therefore contain application information and an autonomous controller for decision making, performance analysis and logic re-configurability. We propose to assign the cell of origin as the cluster manager, which is responsible for these tasks and interacts with other clusters. Cell temperature and power consumption are the two basic environmental parameters that will be monitored. In order to protect the system effectively from failure, we propose a power distribution strategy that employs local power plants implanted in each autonomous cell. Each local power plant is equipped with power regulator circuits, power switching circuits, and current and temperature monitoring circuits [5]. In this way, each cell becomes failure-aware and self-protected and also works independently, thus providing error-tolerance capability for the whole system. Adjacent cells keep each other informed of their status so that new communication links can be configured when needed. All these circuits should preferably be implemented in the CMOS cell peripherals in order to guarantee their quality and predictability.
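The supervision behaviour described above (monitor temperature and current, switch off a failed cell, move its task to an empty cell) can be sketched as follows. This is a minimal software model under stated assumptions, not the authors' implementation: the class names, the threshold values and the first-fit choice of spare cell are all hypothetical.

```python
# Minimal model of a cluster manager supervising AET cells.
# All names, thresholds and policies are illustrative assumptions.
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

MAX_TEMPERATURE_C = 85.0  # hypothetical shutdown threshold
MAX_CURRENT_MA = 10.0     # hypothetical shutdown threshold


@dataclass
class Cell:
    cell_id: int
    task: Optional[str] = None
    powered: bool = True
    temperature_c: float = 25.0
    current_ma: float = 0.0

    def is_healthy(self) -> bool:
        """A cell is healthy when it is powered and within both limits."""
        return (self.powered
                and self.temperature_c <= MAX_TEMPERATURE_C
                and self.current_ma <= MAX_CURRENT_MA)


@dataclass
class ClusterManager:
    cells: Dict[int, Cell] = field(default_factory=dict)

    def find_empty_cell(self) -> Optional[Cell]:
        """Return the first healthy cell that has no task assigned."""
        for cell in self.cells.values():
            if cell.task is None and cell.is_healthy():
                return cell
        return None

    def supervise(self) -> List[Tuple[str, int, Optional[int]]]:
        """Switch off failing cells and migrate their tasks.

        Returns (task, failed_cell_id, new_cell_id) tuples; new_cell_id is
        None when no spare cell was available.
        """
        moves = []
        for cell in list(self.cells.values()):
            if cell.powered and not cell.is_healthy():
                cell.powered = False              # local power switch opens
                task, cell.task = cell.task, None
                if task is not None:
                    spare = self.find_empty_cell()
                    if spare is not None:
                        spare.task = task
                    moves.append((task, cell.cell_id,
                                  spare.cell_id if spare else None))
        return moves
```

In the proposed architecture these decisions would be taken by the cell of origin using the monitoring circuits of each local power plant; the sketch only makes the decision sequence explicit.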

#### 3.5 Interconnections and Intercellular communication

The AET cells are physically autonomous and flexible, and the overall network is homogeneous (with identical cells within a cluster) with constant-pattern symmetric wiring, which implies strict constraints on intercellular connection schemes. Despite the locality of computing within each cell, the overall performance of the AET cellular network depends strongly on the performance of the intercellular and inter-cluster interconnects. In this task, we develop a hierarchical analysis approach that can optimally define the cell size and cluster size for a specific technology so that the best performance can be achieved [8]. It is reasonable to assume that each AET cell contains a nano-core that can be synthesized by design automation tools, or alternatively that each AET cell is an autonomous synchronous subsystem. Based on this, we can estimate the circuit complexity and size of a nano-core as well as its performance. The bandwidth of the intercellular and inter-cluster interconnects will then be defined [9].

Because nano-devices lack significant voltage gain, we propose CMOS cell I/Os for intercellular communication. A second reason is that CMOS offers better wires than nano-wires, which is very important for high-performance inter-cellular communication. Even with better wires, the interconnections will be very prone to errors caused by surrounding noise. We plan to use spectrum modulation/demodulation schemes to overcome the noise issue. Furthermore, we propose to use a feedback loop for the communication link. This feedback loop does not need to be high speed, but it sends information about the signal quality of the established link and hence allows the signaling circuits to self-adjust via transmitter/receiver equalization. Finally, the overall networked system does not need to be synchronized; however, a clock above 10 GHz will still be used as a global reference or as the local clocking net in each cell.
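The low-speed feedback loop can be pictured as a simple search: the transmitter tries an equalization setting, the receiver reports a quality figure, and the loop stops once the quality is good enough. The sketch below assumes a scalar quality metric and a discrete list of settings; both are hypothetical, since the source does not define the metric or the adaptation algorithm.

```python
# Low-speed feedback loop for link self-adjustment.
# The quality metric, settings and names are illustrative assumptions.
from typing import Callable, Iterable, Optional, Tuple


def adapt_equalization(
    measure_quality: Callable[[int], float],
    settings: Iterable[int],
    target: float,
) -> Optional[Tuple[int, float]]:
    """Try settings in order until the reported link quality meets `target`.

    `measure_quality(setting)` stands in for the receiver's feedback message.
    Returns the best (setting, quality) pair seen, or None if no setting
    was tried.
    """
    best: Optional[Tuple[int, float]] = None
    for setting in settings:
        quality = measure_quality(setting)
        if best is None or quality > best[1]:
            best = (setting, quality)
        if quality >= target:
            break  # link is good enough; stop adjusting
    return best


if __name__ == "__main__":
    # Toy channel whose quality peaks at setting 3 (purely hypothetical).
    def toy_quality(setting: int) -> float:
        return 1.0 - abs(setting - 3) / 10.0

    print(adapt_equalization(toy_quality, range(8), target=0.95))
```

Because only one quality report is needed per setting, the feedback channel can run far slower than the data link, which matches the observation that the loop does not need to be high speed.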

#### 4 SUMMARY AND DISCUSSION

For different nanotechnologies, we need to identify which issues become critical when changing to another technology and what should be modified. It is more important to generalize the design method than to work through specific design examples. As our AET cell consists of cell I/Os in CMOS and a nano-core, the most critical part when the technology is altered will be the interface circuits between them. For example, when a nano-core changes from nano-CMOS to non-FET devices such as a quantum dot array, the interface will be totally different. The first still uses physical wires, only smaller, and its interface circuits will resemble a majority-voting multiplexer, whereas in the second the quantum dots are interconnected via near-field coupling and no wires exist. The interface will hence be completely new.

#### 5 CONCLUSION

It is well understood that a major challenge of the future is dealing with the complexity of systems comprising billions of devices and their interconnections. We propose a highly scalable, AET cell based system architecture that also reduces the design complexity problem. The nanocore/CMOS hybrid SoP architecture proposed in this work exploits the benefits of both emerging nanotechnologies and existing CMOS technology. Furthermore, it can be extended to any up-and-coming nanotechnology.

#### REFERENCES

- [1] International Technology Roadmap for Semiconductors, 2003 (http://public.itrs.net).
- [2] R. Compano et al., "Technology Roadmap for Nanoelectronics", European Commission, 2000.
- [3] K.K. Likharev, "CMOL: A New Concept for Nanoelectronics", 12<sup>th</sup> Int. Symp. Nanostructures: Physics and Technology, Russia, June 21-25, 2004.
- [4] Karl F. Goser et al., "Aspects of Systems and Circuits for Nanoelectronics", Proceedings of the IEEE, 85, 558-573, 1997.
- [5] T. Nurmi et al., "Power Management of the Autonomous Error-Tolerant Cell", ASIC/SoC Conf., Sep. 2002.
- [6] T. Valtonen et al., "An Autonomous Error-Tolerant Cell for Scalable Network on Chip Architectures", 19th Norchip Conf., Nov. 2001.
- [7] Matthew M. Ziegler et al., "CMOS/NANO Co-Design for Crossbar-Based Molecular Electronic Systems", IEEE Trans. on Nanotechnology, 2, 217-230, Dec. 2003.
- [8] J. Liu et al., "Interconnect intellectual property for Network-on-Chip (NoC)", Journal of Systems Architecture, 50, 65-79, Feb. 2004.
- [9] D. Pamunuwa et al., "Maximizing Throughput over Parallel Wire Structures in the Deep Submicrometer Regime", IEEE Transactions on VLSI Systems, 11(2), 224-243, April 2003.
- [10] L.-R. Zheng et al., "Cost-performance trade-off analysis in Radio and mixed-signal System-on-Package design", *IEEE Trans. on Adv. Packaging*, Special Issue on SoP, 27(2), May 2004.