# An Architecture of Quantum CPU

Víctor H. Téllez <sup>1,3</sup>, Antonio Campero<sup>2</sup>, Cristina Iuga<sup>2</sup>, Gonzalo Duchen <sup>3</sup>

<sup>1</sup>Electrical Engineering Departament, <sup>2</sup>Chemical Departament, DCBI, Universidad Autonoma Metropolitana Iztapalapa, Av. San Rafael Atlixco 186, Col. Vicentina, Iztapalapa 09340 D.F. Mexico, email: vict@xanum.uam.mx

<sup>3</sup>SEPI, ESIME Culhuacan, Av. Santa Ana 1000, Col. San Francisco Culhuacan, C.P. 04430, D.F.

## **ABSTRACT**

We present a quantum CPU (central processing unit) which is based on the quantum gate set. Using the Von Neuman model, the quantum CPU has development on a classical Digital Signal Processor (DSP TI6711). The DSP simulated different gates, as the Feynman gate, and the set of the different gates make possible the quantum CPU. Generically, hardware of the quantum CPU are modeled in terms of quantum spins (qubits) that evolve in time according to the time-dependent Schrodinger equation (TDSE). Furthermore, we will try quantum error-correcting code to fight de coherence and operational errors.

Keywords: processing unit simulator, Short's factorization, , parallel processing, pipeline processing

#### INTRODUCTION

The organization of a digital computer talks about to the logical units that compose it (like the ALU, Registers, Control, the unit of memory and the unit of input/output), the functions that make, her operation and the form in which they are related and communicates with others. The architecture of the Quantum Central Processing Unit(QCPU) focuses in the form to construct each one of these units logics so that they make the functions specified by his organization, as well as the way in which these units are going to communicate to interact among them.

However, simulations often require more computational power than is usually available on sequential computers. Therefore, we have developed the simulation method for parallel computers. That is, we have developed a general-purpose simulator for quantum algorithms and circuits on the parallel computer, Symmetric Multi-Processor.

A quantum computer consists of quantum set gates which obey the laws of quantum mechanics. The complexity of a quantum system is exponential with respect to the number of particles. Performing computation using these quantum particles results in an exponential amount of calculation in a polynomial amount of space and time [1] [2] [3]. This quantum parallelism is only applicable in a limited domain. Prime factorization is one such problem which can make effective use of quantum parallelism [4]. This is an

important problem because the security of the RSA publickey cryptosystem relies on the fact that prime factorization is computationally difficult [5]. Errors limit the effectiveness of any physical realization of a quantum computer. These errors can accumulate over time and render the calculation useless [6]. The simulation of quantum circuits is a useful tool for studying the feasibility of quantum computers [7]. Simulations inject errors at each step of the calculation and can track their accumulation. Because of the exponential behavior of quantum systems, simulating them on conventional computers requires an exponential amount of operations and storage. For this reason, to simulate problems of interesting size we must employ the use of parallel supercomputers. In this paper we describe a pipeline simulator (using the Digital Signal Processor) which models the accumulation of errors in a Quantum Central Processing Unit (QCPU).

## **METHODOLOGY**

The von Neumann architecture is a computer design model that uses a single storage structure to hold both instructions and data. The term describes such a computer, which implements a Universal Turing machine, and the common "referential model" of specifying sequential architectures, in contrast with parallel architectures.

The separation of storage from the processing unit is implicit in the von Neumann architecture. The term "stored-program computer" is generally used to mean a computer of this design (Figure 1).

Iit's possible to be observed in figure 1, the components that they integrate to the model are:

Arithmetic Logic Unit (ALU), Memory, and Control, it is also observed that to the ALU they integrate a registry called accumulator, and we can see the input and output. The memory and control it's an important component of the Von Neuman model.

For the development of the ALU, we use an adder, using the Feynman gate. A 2\*2 Feynman gate, also called CNOT, output is given by the equation  $(A, B) \div (P = A, Q = A r B)$ .

This gate is one through because it passes one of its inputs. Every linear reversible function can be built by using only 2\*2 Feynman gates and inverters (Figure 2).



Figure 1. Von Neuman Model

The detailed cost of a reversible gate depends on any particular realization technology of quantum logic. According to Perkowski et al. , it is assumed that the [8] cost of every 2\*2 is the same. A 1\*1 cost nothing, since it can be always included to arbitrary 2\*2 gate that precedes or follows it. Thus, every permutation quantum gate will be build from 1\*1 and 2\*2 quantum primitives and its cost calculated as a total sum of 2\*2 gates.



Figure 2. Feynman gate, used as adder for Von Neuman Model

Part of the ALU it's some gates set like:

**Toffoli gate:** A 3\*3 Toffoli gate, also called controlled controlled-NOT (CCNOT), output is given by the equation  $(A, B, C) \div (P = A, Q = B, R = ABr C)$ . This gate is two through because two of its outputs are identical of its inputs. Using the well known realization of Toffoli gate with truly quantum 2\*2 primitives, as shown in Figure 3 according to Perkowski et al., the cost of Toffoli gate is [10] five 2\*2 gates, or simply, 5. In Figure 3 V is a square-root-of-NOT gate (unitary matrix V) and V is its hermitian.

Thus +VV creates a unitary matrix of NOT gate and VV = 1 (an + + identity matrix, describing just a quantum wire).



Figure 3. Toffoli gate

**Fredkin gate:** A 3\*3 Fredkin gate output is given by the equation  $(A, B) \div (P = A, Q = A*B + AC, R = A*C + AB)$ . This gate is conservative, that is, the Hamming weight (number of logical ones) of its input equals the Hamming weight of its output. The Fredkin gate is sometimes called the controlled permutation gate (Fig. 4). The cost of Fredkin gate is exactly the same as the cost of Toffoli gate and not more than of it, as some [8] authors assumed and as may be suggested by classical binary schema of such gates, where the Toffoli gate includes a single Davio gate, while the Fredkin gate includes two multiplexers.



Figure 4. Fredkin Gate

**The Register Accumulator** on the Von Neuman Model is the Qbit state in the register give for the next equation [9]:

$$\left|\phi\right\rangle = \sum_{i=0}^{2^n-1} \alpha_i \left|i\right\rangle \text{ where } \alpha_i \in C, \sum_{i=0}^{2^n-1} \left|\alpha_i\right|^2 = 1$$
 (1)

That is, the state of an n-qubit register is represented by a unit-length complex vector on  $H_{2^n}$ . In a classical computer, to store a complex number  $\alpha = x + iy$ , one require to store a pair of real numbers (x, y).

The system are modeled in term of Quantum Spin (QBits), that envolve in time according to the time-dependant Schrödinger equation (TDSE)

$$i\frac{\partial}{\partial t}|\phi(t)\rangle = H(t)|\phi(t)\rangle$$
 (1)

in units such that  $\hbar = 1$  and where

$$|\Phi(t)| = a(\downarrow, \downarrow, \dots, \downarrow; t) |\downarrow, \downarrow, \dots, \downarrow + + a(\uparrow, \downarrow, \dots, \downarrow; t) |\uparrow, \downarrow, \dots, \downarrow + \dots + a(\uparrow, \uparrow, \dots, \uparrow; t) |\uparrow, \uparrow, \dots, \uparrow \qquad (2)$$

describes the state of the whole QC at the time t. The complex coefficients  $a(\downarrow, \downarrow, ..., \downarrow; t), ..., a(\uparrow, \uparrow, ..., \uparrow; t)$  completely specify the state of the quantum system. The time-dependent Hamiltonian H(t) take the form [11]

$$H(t) = -\sum_{j,k=1}^{L} \sum_{\alpha=x,y,z} J_{j,k,\alpha}(t) S_{j}^{\alpha} S_{k}^{\alpha}$$

$$-\sum_{j=1}^{L} \sum_{\alpha=x,y,z} \left( h_{j,\alpha,0}(t) + h_{j,\alpha,1}(t) sen(f_{j,\alpha}t + \varphi_{j,\alpha}) \right) S_{j}^{\alpha}$$
(3)

Where the first sum runs over all pairs P of QBits,  $S_j^\alpha$  denotes the  $\alpha$ -th component of the spin ½ operator representing the j-th QuBit,  $J_{j,k,\alpha}(t)$  determines the strength of the interaction between the QBits labeled j and k,  $h_{j,\alpha,0}(t)$  and  $h_{j,\alpha,1}(t)$  are static and periodic field acting on the j-th spin respectively. The frequency and the phase of the periodic field are denoted by  $f_{j,\alpha}$  and  $\varphi_{j,\alpha}$ . The number of QBits is L and the dimension of the Hilbert space  $D=2^L$ . Hammiltonian (Eq 3) is sufficiently general to capture the salient features of the most physical models of the QC. Interactions between QBits that involve different spin components have been left out in equation 3 because we are no aware of a candidate technology of QC where these would be important. Incorporating these interactions requires some trivial additions to the simulator program.

Procedures to construct unconditionally stable, accurate and efficient algorithms to solve the TDSE of a wide variety of continuum and lattice models have been reviewed elsewhere [12, 13]. A detailed account of the applications of this approach to two-dimensional quantum spin models can be found in [14]. According to the equation 2 the time evolution of the QC, the solution of TDSE (Equation 1), is determined by the unitary transformation

$$U(t+\tau,t) = \exp_{+}\left(-i\int_{t}^{t+\tau} H(u)du\right)$$
 (4),

where exp denotes the time-ordered exponential function. Using the semi-group property of  $U(t+\tau,t)$  we can write

$$U(t+\tau,t) = U(t+m\delta,t+(m-1)\delta)...U(t+2\delta,t+\delta)U(t+\delta,t)$$
 (5)

where  $\tau = m\delta(m \ge 1)$ . In general the first step is to replace each  $U(t+(n+1)\delta,t+n\delta)$  by symmetries Suzuki product-formula approximation [13,14].

#### DEVELOPMENT

For the development of the project, we use the DSP TI TMS6711, with architectural optimizations to speed up processing. This DSP can be connected on classical personal computer for transfer the data between them, and the architectural features is the next:

#### Program flow:

- Floating-point unit integrated directly into the data-path.
- Pipelined architecture
- Highly parallel accumulator and multiplier
- Special looping hardware. Low-overhead or Zerooverhead looping capability

#### Memory architecture:

- DSPs often use special memory architectures that are able to fetch multiple data and/or instructions at the same time:
- Harvard architecture
- Modified von Neumann architecture
- Use of direct memory access
- Memory-address calculation unit

#### Data operations:

- Arithmetic is often used to speed up arithmetic processing.
- Single-cycle operations to increase the benefits of pipelining.

#### Instruction sets:

- Multiply-accumulate (MAC) operations, which are good for all kinds of matrix operations, such as convolution for filtering, dot product, or even polynomial evaluation (see Horner scheme, also fused multiply-add).
- Instructions to increase parallelism: SIMD, VLIW, superscalar architecture.
- Specialized instructions for modulo addressing in ring buffers and bit-reversed addressing mode for FFT cross-referencing.
- Digital signal processors sometimes use timestationary encoding to simplify hardware and increase coding efficiency

This DSP can be programed with a friendly Software (Code Composer Studio) with the with the algorithms of the quantum mechanics systems and numerical methods to solve the states of the QBits.

## **DISCUSION**

This work presents a proposal on a QCPU, using the Von Neuman Model. Where all the components of the Von Neuman Model can be simulated, but the only difficult on make the simulation's is the integration of all units that integrate it (like Memory, ALU, and control). The, control unit, in classical Von Neuman Model, is a unit sequencer that handled all units. But in QCPU, these units will be complex, due to the parallelism that the QBit's work. And for this moment we don't have a candidate model for this unit.

## **CONCLUSION**

We keep working on make a better proposal for a QCPU, the next step it's:

- to make a better selection on control unit hardware
- make a good selection of Qgates for the operation of quantum circuits
- THE set instruction for the good operation of the QCPU
- To improve the simulation on better classical computers

#### REFERENCES

- [1] Timoshenko and Woinowski-Krieger, "Theory of Plates and Shells," McGraw-Hill, 415-4215, 1959.
- [2] D. Snow, M. Major and L. Green, Microelectronic Engr. 30, 969, 1996.
- [3] Landauer, R., 1961. Irreversibility and heat generation in the computing process. IBM J. Res. Develop., 5: 183-191.
- [4] The Physics of Quantum information, Edited by D. Bouwmeester, A. Ekert,
- and A. Zeilinger (Springer-Verlag, Berlin, 2000).
- [5] C. H. Bennett and D. P. DiVincenzo, Nature (London) 404, 247 (2000).
- [6] A. Barenco et al., Phys. Rev. A 52, 3457 (1995).
- [7] M. A. Nielsen and Isaac L. Chuang, Phys. Rev. Lett. 79, 321 (1997).
- [8] G. Vidal, L. Masanes and J. I. Cirac, Phys. Rev. Lett. 88, 047905 (2002)
- [9] K. Obenland and A. Despain. A parallel quantum computer simulator, 1998.
- [10] Perkowski, M. et al., 2003. A hierarchical approach to computer-aided design of quantum circuit. In: 6<sup>th</sup> Intl. Symp. on Rep. Methodol. Future Comp. Technol., pp: 201-209.
- [11]H. D. Raedt, "Product formula algorithm for solving the time dependant Shoeringer equation", Comp. Phys. Rep. 7, 1-72, 1987.
- [12]H. D. Raedt, "Computer simulation of Quantum phenomena in Nano-Scale device", 107-146, Annual

reviews of computational Physics IV, Ed. D. Stauffer, World Scientific, 1996.

[13]P. de Vries, H D. Raedt, "Solution of the time-dependant Shoeringer equation for two bidimensional spin ½ Heisenberg systems", Phys. Rev B. 47-7929-7937, 1993. M. Suzuki, S. Miyashita, A. Kuroda, "Monte Carlo [14] Simulation of Quantum Spin Systems", Prog. Theor.

Phys., 58, 1377-1387, 1977.