We propose a quantum algorithm for the lattice Boltzmann (LB) method to simulate fluid flows in the low Reynolds number regime. First, we encode the particle distribution functions (PDFs) as probability amplitudes of the quantum state and demonstrate the need to control the state of the ancilla qubit during the initial state preparation. Second, we express the LB algorithm as a matrix-vector product by neglecting the quadratic non-linearity in the equilibrium distribution function, wherein the vector represents the PDFs, and the matrix represents the collision and streaming operators. Third, we employ classical singular value decomposition to decompose the non-unitary collision and streaming operators into a product of unitary matrices. Finally, we show the importance of having a Hadamard gate between the collision and the streaming operations. Our approach has been tested on linear/linearized flow problems such as the advection-diffusion of a Gaussian hill, Poiseuille flow, Couette flow, and lid-driven cavity problems. We provide counts for two-qubit controlled-NOT and single-qubit U gates for test cases involving 9–12 qubits with grid sizes ranging from 24 to 216 points. While the gate count aligns closely with theoretical limits, the high number of two-qubit gates on the order of necessitates careful attention to circuit synthesis.
I. INTRODUCTION
Recent advancements in quantum computing algorithms and the availability of actual quantum processor units (QPUs) offer an exciting opportunity for the field of computational fluid dynamics (CFD) to explore the potential of quantum computing. The CFD workflow generally involves three key steps: mesh generation (for a given geometry), flow modeling (governing equations, discretization, and numerical methods), and post-processing of the results (analysis and visualization). Each of these tasks can demand high computational resources in terms of memory and speed, depending on the problem being solved.
For instance, a three-dimensional direct numerical simulation (DNS) of turbulent flow for a Reynolds number of would require approximately grid points.21,41 Storing a flow variable at each of these points using 64-bit precision floating values would require approximately bits of memory, which is equivalent to eight thousand petabytes ( GB) of memory. Quantum superposition and the encoding of classical information on a quantum processor as probability amplitudes suggest the number of qubits required would be approximately 60 [ ]. Thus, in terms of computing memory, there is a promising opportunity to perform large-scale CFD simulations on QPUs.17 However, the QPUs are still in their early stages of development (limited numbers of noisy qubits), and the potential benefits of quantum speed-up have yet to be fully established.8,30
Present-day qubits in quantum computers can be affected by errors and imperfections, such as decoherence, gate errors, and measurement errors. In particular, decoherence times are an important factor in determining the performance and feasibility of quantum computations, wherein decoherence refers to the process by which a qubit loses its quantum properties and behaves more classically due to environmental interaction. Thus, the current noisy intermediate-scale quantum (NISQ) devices are characterized by having a limited number of qubits and relatively short coherence times, which impact their computational capabilities.
Typical efforts to develop a quantum CFD involve either (i) the application of Schrödinger equation to describe fluid motion or (ii) the development of algorithms for solving classical fluid mechanic equations for implementation on QPUs. In the former case, an analogy between the Navier–Stokes equation (NSE) and the Schrödinger equation has been derived through Madelung transformation.28 The latter case focuses on the applicability of existing CFD algorithms portable to QPUs, which received much more attention due to the extensive advancement in CFD algorithms over the past few decades. The main steps to perform CFD on QPUs involve encoding classical data (such as velocity and pressure fields), designing circuits representing differential or integral operators, and employing measurement techniques to convert quantum data back into classical variables.
The encoding of classical data requires the preparation of quantum states, which begins with initializing all qubits in their ground state. This is followed by the application of a specific set of quantum gates designed to manipulate the qubits into the desired quantum state. The resulting state vector contains the probability amplitudes corresponding to the classical data. A key challenge in preparing an arbitrary quantum state for a system comprising n qubits is the exponential increase in the number of required two-qubit controlled-NOT (CNOT) gates. Bharadwaj and Sreenivasan6 have investigated two approaches relevant to CFD that demonstrate reduced CNOT gate requirements, suggesting potential applicability in NISQ devices. In transient flow simulations, reinitialization of the quantum state is necessary at each time step to incorporate updated information, which involves applying a series of quantum gates. This reinitialization process becomes increasingly resource-intensive as both the system size and the number of iterations increase. Moreover, quantum measurements are essential for extracting classical information at each iteration; however, these measurements lead to the collapse of the quantum state, necessitating further reinitialization. This phenomenon introduces additional computational overhead, as multiple measurements may be needed to obtain reliable results, particularly when dealing with complex nonlinear partial differential equations.6,8,38,41
In CFD, the governing equations can be discretized and expressed in matrix-vector form. For explicit time-marching schemes, the discretized equation will be of the form , where matrix A represents the discrete differential/integral operator and vector represents the flow variable at time t. In this context, the matrix-vector product can be viewed as a quantum state evolution, where qubits in a registry are initialized with state vector and result in state vector containing the probability amplitudes in the chosen basis state. If the matrix A is not unitary, it can be decomposed as a linear combination of Pauli matrices, or a Hamiltonian can be constructed as which can then be converted to a unitary matrix10,15,27 as .
For implicit schemes, the equation becomes . In this case, quantum linear solver algorithms such as the Harrow–Hassidim–Lloyd (HHL) algorithm6,7,14,19,25 or the variational quantum linear solver (VQLS)9,20,31 can be used. VQLS belongs to a class of hybrid algorithms known as variational quantum algorithms (VQAs) that utilize classical optimization schemes on CPUs to provide circuit parameters implementable on NISQ devices. Thus, VQLS performs two tasks simultaneously: quantum circuit parametrization and solution to linear system of equations. In summary, any CFD algorithm can be viewed as a Hamiltonian simulation on QPUs.
Instead of discretizing the NSE, an alternative approach known as the lattice Boltzmann method (LBM), which is based on kinetic theory, has been shown to effectively solve the NSE in the incompressible limit. It is important to note that LBM is not restricted to incompressible flows; it can also simulate compressible flows.1,16,43 LBM describes the evolution of particle distribution functions (PDFs) on a Cartesian grid with a predefined lattice structure. The moments of these PDFs yield the macroscopic flow variables of primary interest, such as density and velocity. Due to particle interactions with their nearest neighbors, LBM demonstrates high parallel efficiency on GPUs. However, the drawback comes in terms of computational memory. For instance, in an isothermal and incompressible flow simulation with N grid points and lattice directions, storing variables for present and future time steps in LBM will be , whereas, in traditional schemes, it would be ; D-number of physical dimension. The most common lattice structures for 2D and 3D simulations are D2Q9 and D3Q19, which give the scaling factor and 19, respectively.
Before the development of the quantum lattice Boltzmann method (QLBM), Yepez44 proposed the quantum lattice-gas model (QLG) for fluid flow simulations that can be implemented using a hybrid quantum-classical nuclear magnetic resonance (NMR) computing methodology, commonly known as Type-II quantum computer (T2QC). Developed in the early 2000s, a T2QC consists of an array of small quantum computers linked by classical communication channels. Several QLG algorithms were developed for T2QC.4,26,45 Additionally, Pravia et al.32 successfully implemented the quantum lattice gas algorithm for the one-dimensional diffusion equation on T2QC. By using two-qubit processors assigned to each of the 16 grid points, they found that the experimental measurements at seven computational time steps aligned well with both the simulation and analytical solutions. This motivated further developments in the QLBM.
Typically, the LBM algorithm is divided into collision and streaming operations. The collision operator is local and has a non-linearity in the velocity, whereas the streaming operator is non-local and linear. Significant work on QLBM can be categorized based on (i) tackling the non-linearity,21,35,40 (ii) circuit design of QLBM,11,12 (iii) implementation of streaming without collision,36,42 and (iv) encoding.37 The fundamental challenge in QLBM involves handling non-linearity in the equilibrium distribution function and its computation through unitaries, which are linear operators.41 One way to tackle non-linearity is to perform Carleman linearization (CL) of the LB equations, resulting in an infinite dimensional linear system.21,22,25,34,35 At second-order truncation, the qubit requirement will increase by a factor of two before CL. Instead of CL which utilize amplitude encoding, Ref. 40 designed a quantum circuit based on computational basis encoding, which performs quantum floating point arithmetic and thereby evaluates the non-linear term. Excluding the non-linear term, Mezzacapo et al.29 developed a quantum simulator suitable for trapped ion and superconducting quantum systems and attempted to solve a two-dimensional advection-diffusion equation (ADE). The collision operator was decomposed as a sum of unitaries, and the streaming operator using the D2Q4 lattice was represented as shifted diagonal matrices. Later, Budinski11 presented a complete circuit with single/ two qubit gates for solving ADE with the D1Q3 and D2Q5 lattice model along with linearized equilibrium distribution function (EDF) and constant flow velocity assumptions. An extension to stream function-vorticity formulation of NSE using QLBM was also attempted with the D2Q5 lattice.12 In contrast to amplitude encoding, which occupies the whole of Hilbert space, Schalkers and Möller37 argued that quantum parallelism is best exploited when we utilize the least of Hilbert space. Instead of PDFs, if the velocity were encoded in qubits, neither amplitude encoding nor computational basis encoding technique would permit the collision and streaming to be unitary at once.
Based on the above discussion, it is evident that there is no comprehensive quantum model for LBM that encompasses collision, streaming, boundary conditions, and external forcing terms for solving various fluid flow problems in the existing literature. Moreover, treating boundary conditions as a separate operator, dealing with non-orthogonal states, and the possibility of taking measurements at each time step present challenges. As discussed earlier, the measurement-induced collapse of the quantum state further complicates matters, as each measurement alters the state and can potentially introduce errors. Thus, performing simulating without intermediate initialization/measurements is an active field of research in QCFD.41
In this study, we propose the QLB algorithm to solve the 2D low Reynolds number flows with external forcing and wall boundary conditions. Our approach involves representing the streaming and boundary conditions as a unified matrix operator. The objective of this study is to develop QLBM as a Hamiltonian simulation. Our work primarily focuses on implementing the QLBM without making any intermediate assumptions, except for neglecting the non-linear term in the equilibrium distribution function. We note that upon neglecting the non-linearity, the linear lattice Boltzmann equation will not recover the incomplete Navier–Stokes equations. However, the linear model is not affected by the nonlinear error terms in the momentum equation.24 Thus, the test cases considered herein are linear/linearized flows with low Mach ( ) and low Reynolds ( ) flows. The quantum circuit presented here for the collision and streaming remain same throughout the simulation. However, initialization of the quantum state based on the constructed PDFs at the start of each simulation time step is required.
II. LATTICE BOLTZMANN METHOD
A. Boundary conditions
III. QUANTUM LB ALGORITHM
The proposed quantum algorithm consists of three major steps, viz., (i) matrix-vector representation of the LB algorithm, (ii) decomposition of non-unitary LB matrices into unitaries using the singular value decomposition (SVD), and (iii) development of quantum circuit with suitable encoding of PDFs. In the following, we will explain each of these steps in turn.
A. Matrix-vector representation
Let be the total number of grid points, where and are the number of grid points along x and y axes, respectively. Then the total number of DFs will be , where is the number of directions defined in the DmQn lattice model. At first, the DFs are arranged in a vector form according to their lattice direction, , where the superscript refers to the direction. The index of any element in can be written as , where is the co-ordinates and is the direction. Next, the entries for the collision and streaming matrices will be derived, wherein the row and column indices of the constructed matrix follow the same order in which the DFs are arranged. In general, every element of the resultant vector from the matrix-vector product can be expressed as a linear combination of the elements of the input vector with weights being the corresponding row. Thus, each can be written as , where represents the element at the ith row and jth column of the matrix.
1. Collision
2. Streaming and boundary conditions
The streaming matrix S consists of blocks with the size of each block . In the absence of boundary conditions, the streaming operator shifts the DFs along the direction of propagation. This results in a permutation matrix, i.e., a square binary matrix where each row/column contains one entry equal to 1, the remaining entries are zero. As mentioned earlier, the index of each row i can be decomposed as . For a fully periodic domain, the column index j such that can be computed by replacing the co-ordinate by , where L is the domain length. In the case of a wall boundary, the column index will be computed as , where and is the reflection of the direction .
As an illustration, we choose the D2Q9 lattice model, which is a commonly used lattice for two dimensional LB fluid flow simulations.24 The lattice parameters are given in Table I. The first block of matrix S will be identity, as the lattice velocity will be zero. In a fully periodic case, the number of elements in the streaming matrix will be the size of diagonal entries and the row major ordering of grid points provide the shifted elements closer to the diagonal [Fig. 1(a)]. The matrix shapes in Fig. 1 are obtained for the test cases presented in Sec. IV with , and the grid sizes are given in Table IV.
Lattice parameters for the D2Q9 model.
i . | . | . |
---|---|---|
0 | 4/9 | |
1,2,3,4 | 1/9 | |
5,6,7,8 | 1/36 |
i . | . | . |
---|---|---|
0 | 4/9 | |
1,2,3,4 | 1/9 | |
5,6,7,8 | 1/36 |
Streaming matrices for test cases with qubits using the D2Q9 lattice model. (a) A fully periodic domain, (b) periodic along the x axis and bounded by stationary walls in the y axis, (c) periodic along the x axis and bounded by stationary (bottom) and moving walls (top) in the y axis, and (d) bounded by a moving top wall and stationary walls on other three sides.
Streaming matrices for test cases with qubits using the D2Q9 lattice model. (a) A fully periodic domain, (b) periodic along the x axis and bounded by stationary walls in the y axis, (c) periodic along the x axis and bounded by stationary (bottom) and moving walls (top) in the y axis, and (d) bounded by a moving top wall and stationary walls on other three sides.
B. Unitary decomposition
Neither the collision nor the streaming matrix generated in Sec. III A is unitary. There are several methods to decompose non-unitary matrices into unitary matrices. One common approach is to express them as a weighted sum of unitaries, known as the linear combination of unitaries (LCU) method,15 or as a product of unitaries, such as through singular value decomposition (SVD).18 The number of terms required in an LCU representation of a matrix generally depends on three factors: (i) the size of the matrix, (ii) the sparsity or structure of the matrix, and (iii) the precision and algorithmic context of the LCU application. For a general matrix, the worst-case scenario typically requires terms. However, for sparse matrices, the number of terms can be proportional to the number of non-zero entries, resulting in , where k is the number of non-zero terms. In contrast to LCU, the number of matrices produced in SVD is always three with two of them being unitary. However, Bharadwaj and Sreenivasan8 utilized a four-unitaries approach, wherein a non-unitary matrix can be approximated as a sum of four unitaries using an expansion parameter. Thus, finding the efficient decomposition of a given non-unitary matrix is highly crucial and is a challenging area of research.2,5,8
In the present work, we use the singular value decomposition (SVD) to transform collision/streaming matrices as a combination of unitary matrices. Specifically, a matrix A can be decomposed into , where U, V are unitary and D is a diagonal matrix but need not be an unitary. Again, this presents two approaches for representing D: either as a sum or as a product of unitaries. Previously, Schlimgen et al.38 introduced a dilation-based algorithm based on SVD (product of unitaries) to simulate a non-unitary diagonal operator. In our work, we utilized LCU to decompose the diagonal matrix as a sum of two unitary matrices. Before we decompose D, the norm of the elements in D may not be unity. Hence, we normalize it and store the factor for the post multiplication of resultant vector. Now, D can be decomposed as , where are unitary matrices given by . Thus, A will be written in terms of unitaries . Consequently, the total number of unitary operators in our quantum LB algorithm is eight. In contrast, in the generic LCU approach, this count depends on the size of the matrix, which scales with the problem size.
C. Quantum algorithm
The algorithm consists of three steps: encoding, circuit construction, and measurement. Herein, we encode the PDFs as the sole variable using the amplitude encoding technique. The collision and streaming operations are performed via the quantum circuits built from the unitaries. Finally, we measure and evaluate the probability amplitudes resulting in the PDFs for the next time step. The distinct advantage of the present algorithm is that the circuit is built once for the entire simulation. This means the circuit built at the start of the simulation can be iterated using a for loop, with the statevector re-initialized at each time step of the simulation. The complete circuit model for implementing QLBM is given in Fig. 2.
Quantum circuit for LBM, wherein the collision and streaming blocks are repeated for n time steps.
Quantum circuit for LBM, wherein the collision and streaming blocks are repeated for n time steps.
1. Encoding
2. Quantum circuits
However, before proceeding to streaming after collision, the Hadamard needs to be applied on the ancilla, i.e., circuit has to be applied. In the Appendix, we show its necessity. Upon performing streaming the real part in the first block of statevector results in the desired PDF for the next simulation time step.3
3. Measurement and flow variables
IV. RESULTS AND DISCUSSION
The proposed algorithm is implemented using IBM's Qiskit quantum computing SDK. Since we are interested in obtaining PDFs, we used the state vector simulator in Qiskit-Aer package.33 The unitaries derived in Eq. (14) and (16) are applied in the Qiskit-simulator as a custom gate operation. Four test cases with different boundary conditions were chosen for study. The D2Q9 lattice is used unless otherwise mentioned.
A. Advection-diffusion of a Gaussian hill
Comparison of concentration profiles obtained from QLBM with analytical solution24 for the advection-diffusion of Gaussian hill of grid size at different time steps. The parameters are: relaxation time , , , and .
Comparison of concentration profiles obtained from QLBM with analytical solution24 for the advection-diffusion of Gaussian hill of grid size at different time steps. The parameters are: relaxation time , , , and .
Comparison of concentration contours obtained from QLBM with analytical solution (bold lines)24 for the advection-diffusion of Gaussian hill of grid size . The chosen parameters are specified in caption of Fig. 3.
B. Plane Poiseuille flow
Relative error norm obtained for the Poiseuille flow test case with different grid sizes.
. | Grid . | # Timesteps . | (%) . |
---|---|---|---|
9 | 500 | 2.34 | |
10 | 700 | 1.35 | |
11 | 900 | 0.6 | |
12 | 1200 | 0.56 |
. | Grid . | # Timesteps . | (%) . |
---|---|---|---|
9 | 500 | 2.34 | |
10 | 700 | 1.35 | |
11 | 900 | 0.6 | |
12 | 1200 | 0.56 |
The horizontal velocity profile of plane Poiseuille flow obtained from QLBM for four different qubit counts is compared with analytical solution.24 The values of are 0.001, 0.008, 0.0005, and 0.0003, respectively, for the 9, 10, 11, and 12. The grid size and the final time step are given in Table II.
The horizontal velocity profile of plane Poiseuille flow obtained from QLBM for four different qubit counts is compared with analytical solution.24 The values of are 0.001, 0.008, 0.0005, and 0.0003, respectively, for the 9, 10, 11, and 12. The grid size and the final time step are given in Table II.
C. Plane Couette–Poiseuille flow
The horizontal velocity profile of plane Couette flow obtained from QLBM for grid size of with is compared with analytical solution24 at the end of time step .
The horizontal velocity profile of plane Couette flow obtained from QLBM for grid size of with is compared with analytical solution24 at the end of time step .
D. Lid driven cavity
The lid-driven cavity is a standard CFD benchmark problem for validating various numerical schemes for low-speed flows. A square cavity with two side walls and the bottom are considered to be rigid, whereas the top wall is moving with a tangential velocity . Before proceeding with QLBM, we performed the classical LB simulation with and without non-linear terms in the equilibrium distribution function. We choose the grid size of . The lid velocity is chosen as and . This gives , and . This setup yielded no significant difference between the LB simulations with and without non-linearity at low Re. However, we observed a significant difference when , necessitating a larger grid size to maintain stability. Hence, in order to validate the QLBM simulation with the coarse grid in the absence of an analytical solution, we used icoFOAM solver from OpenFOAM-v2312 to obtain the reference solution.23 Simulation time of with is used in OpenFOAM, whereas QLBM simulation is performed for 1000 time steps. Figure 7 shows the streamlines obtained from QLBM after 1000 time steps. Figure 8 shows the comparison of QLBM results with OpenFOAM for horizontal ( ) and vertical ( ) velocity profiles extracted, respectively, along the y- and x-axes at the center of the domain. The relative difference is found to be .
Lid-driven cavity case in the grid. Streamlines for obtained from QLBM at the end of time step .
Lid-driven cavity case in the grid. Streamlines for obtained from QLBM at the end of time step .
Lid-driven cavity case in the grid. Comparison of x and y velocities from QLBM with OpenFOAM solution. The x and y velocities are obtained along the y and x axes, respectively.
Lid-driven cavity case in the grid. Comparison of x and y velocities from QLBM with OpenFOAM solution. The x and y velocities are obtained along the y and x axes, respectively.
E. Quantum gates and computational complexity
Implementing the present algorithm on real QPU hardware is challenging for several reasons. First, when applying Hadamard gates to all qubits including the ancilla, all possible states in the Hilbert space have equal probabilities. Since we encode PDFs onto these probabilities, we must prepare the state of qubits before the start of QLBM. The number of CNOT gates required for state preparation39 will be . In the present algorithm, at each time step, LBM is expressed as a product of six unitary matrices, three each for collision and streaming operation. Of the six unitaries, two are diagonal matrices. Each of these matrix decomposed as controlled-NOT (CNOT) and U-gates using Qiskit decomposition tool, which uses the quantum Shannon decomposition (QSD) algorithm. In QSD, the circuit for n-qubit operator is decomposed into three rotational gates and qubits operators; and this undergoes recursively until the circuit is made of single and two qubit gates.39 Table III shows the gate count obtained from the Poiseuille flow case with four different qubit counts. For each additional qubit, gate counts of both CNOT and U gate increases by a factor of 4, which matches with the theoretical estimate of . Each diagonal operator over qubits will be decomposed into gates, and the general unitary operator require gates. Thus, the total gate count for two diagonal matrices and four generic unitaries is of . Next, we report on the influence of different streaming-cum-boundary operators on the gate count. Table IV shows the gate count obtained for all test cases with . The difference is observed in the streaming part where the gate count of fully bounded case (Lid-cavity) is four times that of fully periodic case (ADE).
Gate count obtained from the Poiseuille flow test case solved for different grid sizes (single time step).
. | Grid points . | CNOT . | U . | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
. | . | . | . | . | . | . | . | . | . | . | . | . | . |
9 | 24 | 31 020 | 5 325 | 31 020 | 31 020 | 5 325 | 16 505 | 62 168 | 4 100 | 62 168 | 62 160 | 4 069 | 31 435 |
10 | 50 | 124 844 | 12 362 | 124 844 | 124 844 | 12 362 | 64 959 | 249 942 | 9 854 | 249 944 | 249 775 | 9 599 | 126 602 |
11 | 112 | 500 908 | 27 619 | 500 908 | 500 908 | 27 619 | 256 491 | 1 002 326 | 20 901 | 1 002 327 | 1 001 847 | 20 838 | 504 379 |
12 | 216 | 2 006 700 | 61 307 | 2 006 700 | 2 006 700 | 61 307 | 1 016 860 | 4 014 422 | 45 816 | 4 014 424 | 4 013 528 | 45 561 | 2 014 951 |
. | Grid points . | CNOT . | U . | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
. | . | . | . | . | . | . | . | . | . | . | . | . | . |
9 | 24 | 31 020 | 5 325 | 31 020 | 31 020 | 5 325 | 16 505 | 62 168 | 4 100 | 62 168 | 62 160 | 4 069 | 31 435 |
10 | 50 | 124 844 | 12 362 | 124 844 | 124 844 | 12 362 | 64 959 | 249 942 | 9 854 | 249 944 | 249 775 | 9 599 | 126 602 |
11 | 112 | 500 908 | 27 619 | 500 908 | 500 908 | 27 619 | 256 491 | 1 002 326 | 20 901 | 1 002 327 | 1 001 847 | 20 838 | 504 379 |
12 | 216 | 2 006 700 | 61 307 | 2 006 700 | 2 006 700 | 61 307 | 1 016 860 | 4 014 422 | 45 816 | 4 014 424 | 4 013 528 | 45 561 | 2 014 951 |
Gate count obtained for all test cases with (single time step).
Case . | Grid points . | CNOT . | U . | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
. | . | . | . | . | . | . | . | . | . | . | . | . | . |
ADE | 100 | 500 908 | 27 619 | 500 908 | 500 908 | 27 619 | 134 397 | 1 002 327 | 21 093 | 1 002 325 | 1 002 066 | 20 838 | 257 167 |
Poiseuille | 112 | 256 491 | 1 002 326 | 20 901 | 1 002 327 | 1 001 847 | 20 838 | 504 379 | |||||
Couette | 112 | 500 908 | 1 002 324 | 20 901 | 1 002 327 | 1 002 328 | 21 861 | 1 002 327 | |||||
Cavity | 100 | 500 908 | 1 002 326 | 21 093 | 1 002 328 | 1 002 328 | 21 861 | 1 002 322 |
Case . | Grid points . | CNOT . | U . | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
. | . | . | . | . | . | . | . | . | . | . | . | . | . |
ADE | 100 | 500 908 | 27 619 | 500 908 | 500 908 | 27 619 | 134 397 | 1 002 327 | 21 093 | 1 002 325 | 1 002 066 | 20 838 | 257 167 |
Poiseuille | 112 | 256 491 | 1 002 326 | 20 901 | 1 002 327 | 1 001 847 | 20 838 | 504 379 | |||||
Couette | 112 | 500 908 | 1 002 324 | 20 901 | 1 002 327 | 1 002 328 | 21 861 | 1 002 327 | |||||
Cavity | 100 | 500 908 | 1 002 326 | 21 093 | 1 002 328 | 1 002 328 | 21 861 | 1 002 322 |
Another challenging aspect of the quantum simulation described by generic unitary matrices is the system memory of classical computers. The quantum state described by the probability amplitudes will be stored as an array of complex numbers. A single qubit system described by two complex numbers requires two 64-bit precision floating values, i.e., bits bytes of memory. An arbitrary unitary matrix representing a quantum operation requires bits bytes of memory. In case of LBM with D2Q9 lattice with qubits, which can handle a grid size of , storing statevector and the single unitary matrix would require 0.017, and 18 000 Gigabytes (GB) of memory, respectively. Because our algorithm comprises of six unitaries, simulation with larger qubits are not feasible, unless the matrix can be built directly from the universal quantum gate set. It is worth to mention the importance of non-linear term on the computational requirements. For instance, the Carleman system at second order truncation35 requires , whereas the present work requires . Also, the computational cost of SVD on the classical computer is of , which we have not taken into account.
V. SUMMARY AND OUTLOOK
We presented a comprehensive quantum model for LBM that encompasses collision, streaming, boundary conditions, and external forcing terms for solving various fluid flow problems in the low Re regime of . As a result of neglecting the non-linear term in the equilibrium function, the LB algorithm is written in terms of a matrix-vector product. Furthermore, we combined the streaming operation and boundary conditions as a single matrix operator. We utilized classical SVD to decompose the non-unitaries, which followed by representing the diagonal operator as a sum of unitaries. Thus, we have eight unitary operators describing the LB algorithm for a single time step. Compared to the generic linear combination of the unitaries approach, where the number of unitaries depends on the problem size, the fixed number of unitaries is a key advantage of the present algorithm. The algorithm has been tested with four different CFD benchmark problems performed using the “statevector” simulator. The results are in good agreement with the reference solutions.
The present QLB algorithm has certain limitations. First, the quantum state evolution performed with any of eight unitary matrices generated in a single time step of QLBM results in a larger circuit depth of (minimum estimate among all test cases), which is not feasible in present day QPUs, as they are made of qubits with shorter decoherence times and their two qubit gates are noisy/error-prone. However, the quantum circuit presented here for the collision and streaming remain same throughout the simulation. This allows us to explore various optimization strategies to obtain the shallow circuit depth, which in turn has a potential to be implementable on NISQ devices. However, initialization of the quantum state based on the constructed PDFs at the start of each simulation time step is required. Second, simulations with more than 15 qubits using the D2Q9 model are not feasible due to higher memory requirements. Even though the first limitation is hardware-related, circuit depth can be optimized by building a parameterized quantum circuit or performing a unitary circuit synthesis with more efficient algorithms. This helps efficiently implement the Carleman system for LBM, which includes non-linear terms, as it requires at least twice the resources required by the present algorithm. Further works include testing the present algorithm in real hardware or simulation with noise models. This helps to quantify (i) the single qubit noise on the final distribution functions and (ii) affect the number of measurements required to construct the state vector.
ACKNOWLEDGMENTS
This research was kindly supported by the Quantum Computing Consortium that has been funded by the MAGNET program of the Israel Innovation Authority. The authors acknowledge the referees for the comments/suggestions which greatly improved the work.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
E. Dinesh Kumar: Conceptualization (equal); Methodology (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Steven H. Frankel: Conceptualization (equal); Funding acquisition (lead); Supervision (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available within the article.
APPENDIX: DERIVATIONS OF QUANTUM STATE EVOLUTION
Finally, the real part in the first block of state vector results in the desired PDF for the next simulation time step.