Over the past two decades, digital microfluidic biochips have been in great demand for safety-critical and biomedical applications and have become increasingly important in point-of-care analysis, drug discovery, and immunoassays, among other areas. However, for complex bioassays, finding routes for the transportation of droplets in an electrowetting-on-dielectric digital biochip while maintaining their discreteness is a challenging task. In this study, we propose a deep reinforcement learning-based droplet routing technique for digital microfluidic biochips. The technique is implemented on a distributed architecture to optimize the possible paths for predefined source–target pairs of droplets. The actors of the technique calculate the possible routes of the source–target pairs and store the experience in a replay buffer, and the learner fetches the experiences and updates the routing paths. The proposed algorithm was applied to benchmark suites I and III as two different test benches, and it achieved significant improvements over state-of-the-art techniques.
HIGHLIGHTS
Deep-reinforcement learning approach in a distributed framework for droplet transportation on digital microfluidic biochips.
Optimizing the droplet transportation time for homogeneous multiple droplets by sharing the common electrodes.
A sequence of intelligent trade-offs to handle the fluidic constraints for parallel droplet transportation.
I. INTRODUCTION
Recently, microfluidic technology has progressed rapidly and become an appropriate replacement for expensive and cumbersome benchtop laboratory equipment to facilitate higher throughput and accuracy.1 Biochips have emerged as pivotal devices for biomedical research, point-of-care analysis, and clinical diagnosis; they reduce reagent consumption, allow greater control for mixing and manipulating samples, and integrate and automate multiple assays in parallel to achieve higher precision and throughput. The fluid sample in a biochip can be in the form of either a continuous flow or discrete droplets: the former type of biochip is known as a continuous-flow microfluidic biochip, and the latter is a digital microfluidic biochip (DMFB).2–4 The advantages of DMFBs are individual sample addressing, reagent isolation, and compatibility with array-based techniques used for biochemical and biomedical applications,5,6 and they have applications in areas such as protein and glucose analysis, tissue engineering, and drug delivery and discovery.7–12
For different bioassay operations such as multiplexed in vitro diagnosis or polymerase chain reaction (PCR), droplets must be transported from a reservoir to a module, from one module to another, or from a module to a sink reservoir.13,14 This transportation process is known as routing. For multiplexed in vitro diagnosis, multiple droplets must be transported consecutively to reduce the use of memory resources13 (which are formed dynamically in the biochip to store the droplets temporarily), and route optimization increases the lifetime of a biochip by reducing its electrode damage.15
Because of fluidic constraints, a given electrode cannot transport multiple droplets simultaneously; instead, optimal synchronized scheduling is required to preserve droplet discreteness and satisfy resource constraints. Some bioassay operations (e.g., PCR, colorimetric protein assay16) require droplets with dissimilar chemical properties; droplets generated from different sources (with different chemical properties) are called heterogeneous droplets, and the drawback of routing them is that it is difficult to avoid contamination, or at least to minimize the contamination zones, while also reducing routing time and resources. By contrast, droplets from the same source are called homogeneous droplets, and the main benefit of their routing is increased electrode sharing, whose main advantages are the minimal use of electrode pins and shorter droplet arrival times.
Some bioassays (e.g., detecting a patient’s serum-blood-glucose level) require higher accuracy and precision;17 in those cases, samples are dispensed to the biochip to carry out the scheduled bioassay on each sample, thereby requiring homogeneous droplet routing. Therefore, herein we consider only homogeneous droplet routing. In transporting these droplets, some crucial situations may occur, such as (i) unwanted mixing due to sharing the same electrode, (ii) parallel movement in opposite directions, and (iii) droplet sticking to avoid unexpected mixing. The literature contains several techniques for solving these issues, but those techniques compromise on the entire routing time, the use of electrodes, or both.
Deep reinforcement learning (RL) is a popular machine learning (ML) approach used widely in web analysis, image processing, natural language processing, and data mining, among other areas. In RL, an agent learns from the environment by transitioning between states according to the rewards it receives. To the best of our knowledge, Liang et al.18 were the first to apply RL to droplet routing in DMFBs, and the technique has been extended to microelectrode dot array platforms.19,20 Herein, we report our development of a double RL-based optimized droplet routing algorithm based on an Ape-X deep Q-network (DQN) to overcome the issues of homogeneous droplet routing within a satisfactory time using fewer resources. The effectiveness of the proposed technique is measured in terms of latest arrival time, number of electrodes used, etc., and simulation results show that it outperforms state-of-the-art approaches. Our main contributions are as follows: (i) a unique technique for optimizing the transportation time of homogeneous multiple droplets by sharing common electrodes; (ii) a method that resolves homogeneous droplet routing issues with minimal electrode usage; (iii) a pipeline of intelligent techniques to handle fluidic constraints such as collision, friction, and deadlock.
The rest of this paper is organized as follows. In Sec. II, we review relevant previous work. In Sec. III, we describe the basic operations and constraints of DMFB routing. In Sec. IV, we present the proposed droplet-routing technique. In Sec. V, we summarize the outcomes of using the proposed method. Finally, we conclude in Sec. VI.
II. PREVIOUS WORK
In the past two decades, several researchers have tried to minimize paths for multi-droplet transportation while satisfying fluidic constraints. A DMFB uses electrowetting to manipulate nanoliter or picoliter droplets containing biochemical samples on a two-dimensional array of electrodes,8 and the literature contains some well-known biochip frameworks.5,21,22 Biochips are used for fast and secure analytics and bioassay operations, so the sample droplets must be transported from source to target as quickly as possible. Because some bioassays must be executed several times for more-accurate diagnostic analysis,17 the additional job of a biochip is to move the biological sample quickly and carefully to the desired position therein.
Yuh et al.23 proposed a droplet routing technique based on a network-flow algorithm to minimize electrode use for optimal fault tolerance and enhanced reliability and performance. First, they identified a set of noninterfering nets and considered optimal global-routing paths for the nets by the network-flow approach; next, they incorporated a detailed routing, designing a polynomial-time algorithm using the global-routing paths with a negotiation-based technique. However, the process suffered because the dependency graph was designed by considering only the fluidic constraints, and so the final schedule could not handle high-level routing. Cho and Pan24 presented a droplet routing algorithm for enhanced timing and fault tolerance by tuning the droplet movement greedily; they introduced a concession technique to resolve deadlock situations in which one droplet backed off to allow others to pass, but their approach could not accomplish efficient timing. Huang et al.25 proposed another droplet routing process by using a fast routability and performance-driven approach; they defined an entropy-based routing technique for better routability and introduced a routing compaction method using dynamic programming to reduce the latest arrival time.
Keszocze et al.26 introduced a method known as exact routing to address blockages in biochips; their approach also guaranteed exact solutions to determine a routing path with the fewest time steps. Pan and Samanta27 proposed a multi-droplet routing technique by calculating the Manhattan distance between source and target; they also reduced the pin count by using the dependency graph of nonconflicting electrodes, but resource utilization was compromised. Pan and Samanta28 also proposed an advanced strategy for DMFB droplet routing, one based on ant colony optimization: their algorithm generated two ants for each source, and these traversed horizontally and vertically in a rectangular bounding box constructed to restrict the ant movements; in the next phase, the ants moved based on a pheromone deposition function and resource utilization. The approach explored both single and multiple ant systems to address detours from conflicting zones toward the destination by searching for the best possible route.
Chakraborty et al.29 described a two-phase heuristic routing technique involving a concurrent routing approach based on an interfacing index function; they showed a priority-based scheduling technique using the routable ratio to avoid deadlocks in droplet routing, and they formulated the path of overlapping droplets as a satisfiability problem and solved it with an SAT-based engine. Bhattacharya et al.30 proposed a two-stage routing algorithm for simultaneous droplet transportation by resolving the faults and conflicts of nets; their main focus was transporting homogeneous droplets by sharing paths among multiple nets. Shi et al.31 applied a support vector machine-based algorithm with reliability constraint to DMFBs to minimize the number of control pin actuations. Finally, Liang et al.18 showed a deep RL-based technique for droplet transportation to bypass degraded electrodes for clinical diagnosis; they also applied a convolutional neural network for large DMFBs and found effective outcomes.
III. DMFB ROUTING
For a bioassay operation, a droplet must be moved from one electrode (source) to another electrode (destination) by an actuating chain of adjacent electrodes. Droplet transportation can be of two types: from source electrode to target electrode (known as a 2-pin net), or from two source electrodes to an intermediate electrode (which acts as a mixer) and then to the target (known as a 3-pin net); see Fig. 1(a). In a 3-pin net, the fluidic constraints ensure mathematically that only the relevant intermediate electrode is used for a particular mixing operation.
Fluidic constraints. Initially, we assume the positions of two independent droplets Dα and Dβ at time t to be (xα(t), yα(t)) and (xβ(t), yβ(t)), respectively, where x and y refer to rows and columns, respectively. The fluidic constraints on these two droplets guarantee that they never mix by maintaining a minimum distance between them.27 Fluidic constraints are of the following two types:
static constraint: |xα(t) − xβ(t)| ≥ 2 or |yα(t) − yβ(t)| ≥ 2;
dynamic constraint: |xα(t + 1) − xβ(t)| ≥ 2 or |yα(t + 1) − yβ(t)| ≥ 2 or |xβ(t + 1) − xα(t)| ≥ 2 or |yβ(t + 1) − yα(t)| ≥ 2.
Figure 1(b) shows the electrodes blocked temporarily because of fluidic constraints for a droplet at position (x, y) in the biochip. The region of these blocked electrodes is known as the critical zone of the particular droplet.
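As an illustration, the following minimal Python sketch (ours, not part of any published implementation; the function names and (row, col) tuples are assumptions) checks the two constraints for a pair of droplets:

```python
def static_ok(a, b):
    """Static constraint: droplets at time t must be at least 2 cells apart."""
    return abs(a[0] - b[0]) >= 2 or abs(a[1] - b[1]) >= 2

def dynamic_ok(a_cur, a_nxt, b_cur, b_nxt):
    """Dynamic constraint: a droplet's next cell must keep a 2-cell margin from
    the other droplet's current cell (checked in both directions)."""
    return ((abs(a_nxt[0] - b_cur[0]) >= 2 or abs(a_nxt[1] - b_cur[1]) >= 2)
            and
            (abs(b_nxt[0] - a_cur[0]) >= 2 or abs(b_nxt[1] - a_cur[1]) >= 2))

print(static_ok((3, 3), (3, 5)))                   # True (columns differ by 2)
print(dynamic_ok((3, 3), (3, 4), (3, 5), (3, 6)))  # False (next cell adjacent)
```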
Routing path compaction. Some fluid-handling operations (e.g., dispensing, mix-splitting, and storage) must be executed to perform a critical bioassay in a DMFB, and these involve transporting multiple droplets between multiple sources and targets. Routing multiple droplets from sources to destinations faces major setbacks from collision, friction, deadlock, and crossover (the last only for heterogeneous droplet routing).30 A protection zone, known as a hard blockage, can be incorporated to avoid the unwanted mixing of droplets during their movement through neighboring electrodes.30
Collision. In this situation, two droplets appear in the same electrode by violating the fluidic constraints as shown in Fig. 2(a).
Crossover. A crossover occurs when a single electrode is used to transport two droplets at different times while satisfying the fluidic constraints.
Stall. To avoid a collision, one droplet must be halted in a safe electrode for a stipulated time to allow another droplet to pass the particular location where the fluidic constraints might otherwise be violated. This situation is known as a stall, and an example of collision avoidance using stall is shown in Fig. 2(b).
Friction. If two droplets are moving through adjacent electrodes in opposite directions, one droplet could enter the other’s critical zone. Known as friction, this unique phenomenon is shown in Fig. 3(a), where the shaded squares show the friction points for the two droplets. Not all occurrences of friction can be handled using stall.
Detour. Manhattan distance can be used to find the optimal path length from source to destination for a specific droplet.27 To avoid hard blockage, collision, and friction, the technique of detour can be used. Instead of the shortest path (calculated by Manhattan distance), a longer path and more resources are used. An example of friction avoidance using detour is shown in Fig. 3(b), where the droplet with source–target pair (S2, D2) reaches the target electrode using a detour path.
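As a small illustration (names and values assumed), the Manhattan distance gives the optimal path length, and a detour simply trades extra steps and electrodes for constraint compliance:

```python
# Minimal sketch: optimal path length vs. a detour around a blockage.
def manhattan(src, dst):
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])

src, dst = (0, 0), (4, 3)
optimal = manhattan(src, dst)        # 7 steps along any shortest route
detour_len = optimal + 2             # e.g., a 2-step detour around a blockage
print(optimal, detour_len)           # 7 9
```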
Deadlock. Deadlock occurs when multiple (at least three) droplets become stuck because each must satisfy the fluidic constraints. Call the three droplets α, β, and γ. In Fig. 4(a), α(S1, D1) and β(S3, D3) use neighboring columns for transportation. β uses a detour to avoid friction, but γ(S2, D2) would collide with the detour path of β; this situation could be overcome by detouring γ. At the same time, the detour path of γ may cause another collision with α, and such an occurrence might deadlock α, β, and γ. Deadlock can be overcome by exploring a further detour path, which increases resource utilization. Deadlock and its recovery using detour are illustrated in Fig. 4, where Fig. 4(b) shows the detour path of γ that avoids the deadlock.
IV. PROPOSED ROUTING TECHNIQUE
We propose a double RL-based droplet routing algorithm built on the Ape-X DQN algorithm.32 The proposed droplet routing process is divided into two phases: (i) the distributed Ape-X DQN algorithm is used to establish the routes between all source–target pairs; and (ii) deadlocks and collisions among routes are resolved using an intelligent trade-off involving stalling and detouring some droplets. RL is an ML paradigm that solves sequential decision problems by learning optimal policies through a cumulative reward system. The Ape-X DQN algorithm was formed by combining the DQN algorithm33 with the double Q-learning algorithm34 and the dueling network architecture35 to establish the learning algorithm and function approximator. We use the Ape-X approach to find droplet routes in DMFBs, and herein we apply it to an m × n rectangular DMFB environment because this board layout corresponds exactly to a maze with blockages.
Droplet routing in a DMFB is an NP-complete problem because of its multi-objective optimization behavior.36 Because the droplet routing problem is a state-space optimization problem, it also has more than one solution path for any source–target pair.37 A 32 × 32 DMFB involves 1024 possible states with four possible moves/actions per state, so it is necessary to calculate, hold, and update a maximum of 1024 × 4 = 4096 values. Temporal-difference (TD) Q-learning is a popular ML approach that can be implemented efficiently to solve droplet routing problems in large DMFBs.36 Herein, we combine the techniques proposed by Rajesh and Pyne36 and Horgan et al.32 for converting the DMFB board layout into a learning environment and defining the actor policy, respectively.
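As a hedged illustration of this tabular formulation (our names and hyperparameters, not the authors' settings), the following sketch holds the 1024 × 4 Q-table and performs one TD update:

```python
import numpy as np

# A 32 x 32 DMFB yields 1024 states x 4 actions = 4096 tabular Q-values.
n_states, n_actions = 32 * 32, 4
Q = np.zeros((n_states, n_actions))
lr, gamma = 0.1, 0.95                    # learning rate and discount factor

def td_update(s, a, r, s_next):
    """One temporal-difference Q-learning update for transition (s, a, r, s_next)."""
    Q[s, a] += lr * (r + gamma * Q[s_next].max() - Q[s, a])

td_update(s=0, a=3, r=-0.05, s_next=1)   # e.g., a move right into a free electrode
```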
Various Q-learning approaches have been implemented in state-of-the-art methods, but overestimated Q-values are problematic for a limited DMFB architecture.36 To overcome the overestimation problem, Van Hasselt et al.34 designed the double-DQN (DDQN) algorithm by combining double Q-learning with the DQN method to produce better-estimated values than those from DQN. Horgan et al.32 proposed the Ape-X DQN algorithm to scale up the learning of DQN and deep deterministic policy gradient (DDPG) through a distributed architecture. Their algorithm has two parts, acting and learning: in the acting part, stepping through the environment, they evaluated a policy in a deep neural network and preserved the observed data in a replay buffer; in the learning part, data were sampled from that memory to update the policy parameters. Inspired by this, we have developed an Ape-X DQN-based droplet routing algorithm for DMFBs. To implement the algorithm, we use the two key components of a DQN, namely a target network and an experience replay buffer, in our network formulation.
In DMFB routing, the DDQN network approximates and updates the Q-values of each state, while the DQN computes the Q-values for all possible actions given a specific state as input. The Ape-X DQN approach runs these DQN networks in a distributed architecture, with more than one actor running on a CPU to generate the Q-values of all possible actions of different states. The method is shown schematically in Fig. 5. The Ape-X DQN applies double Q-learning with multi-step bootstrap targets as the learning procedure using the following equation:

Gt = Rt+1 + γRt+2 + ⋯ + γ^(n−1)Rt+n + γ^n q(St+n, argmaxa q(St+n, a; θ); θ−), (1)

where t is the time index of each experience sampled from the replay buffer starting with state St and action At, n is the number of bootstrap steps, γ is the discount factor, and θ and θ− are the parameters of the actual and target networks, respectively. When the current state S transitions to the new state S′ under action A, the agent receives a reward r = R(S, A).
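A minimal sketch of computing the Eq. (1) target for one sampled experience follows; q_actual and q_target stand for the two networks' action values at St+n, and all names and numbers are illustrative assumptions:

```python
import numpy as np

def multi_step_target(rewards, q_actual, q_target, gamma=0.95):
    n = len(rewards)                                  # rewards = [R_{t+1}, ..., R_{t+n}]
    g = sum(gamma ** k < 1 and 0 or 0 for k in [])    # placeholder removed below
    g = sum(gamma ** k * r for k, r in enumerate(rewards))
    a_star = int(np.argmax(q_actual))                 # action chosen by the actual network (theta)
    return g + gamma ** n * q_target[a_star]          # evaluated by the target network (theta-)

print(multi_step_target([-0.05, -0.05, 10.0],
                        q_actual=np.array([0.2, 1.3, 0.4, 0.9]),
                        q_target=np.array([0.1, 1.1, 0.5, 0.8])))  # ~9.87
```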
A. Environment setup
We consider the DMFB architecture in the form of a matrix or 2D array, with the positions of hard blockages assigned the value 0 because blocked electrodes cannot engage in droplet transportation except in a dedicated preplanned operation. The positions of free electrodes are assigned the value 1, and the source and destination positions of each net are predefined. A 5 × 5 model is shown in Fig. 6. We model the DMFB droplet routing problem as a Markov decision process involving a state space (S), an action space (A) for all states, and a reward function r: S × A → ℝ. Each electrode position of the DMFB is considered as an individual state, and the action space is defined as A = {Mup, Mdown, Mleft, Mright} because a droplet or agent can move in one of four directions from the present electrode. An agent receives a reward value for each state transition, and when it arrives at the destination state, the execution of the responsible actor is finished. The agent tries to maximize the accrued rewards for optimality under the given ɛ-greedy policy.
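A minimal sketch of this encoding follows (the layout values are illustrative, not the exact Fig. 6 model):

```python
import numpy as np

# 0 marks a hard blockage, 1 a free electrode, on a 5 x 5 grid.
layout = np.array([[1, 1, 1, 1, 1],
                   [1, 0, 0, 1, 1],
                   [1, 1, 1, 1, 1],
                   [1, 1, 0, 0, 1],
                   [1, 1, 1, 1, 1]])

# Each electrode (row, col) is one MDP state; the action space A holds the
# four moves {Mup, Mdown, Mleft, Mright} as (row, col) offsets.
ACTIONS = {"Mup": (-1, 0), "Mdown": (1, 0), "Mleft": (0, -1), "Mright": (0, 1)}

def state_id(pos, n_cols=5):
    """Flatten an electrode position into a single state index."""
    return pos[0] * n_cols + pos[1]

print(state_id((2, 3)))   # state 13 of the 25 states
```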
The proposed algorithm calculates the routes for all source–target pairs separately via the actors, and after all the routes have been determined, route compaction is executed. The route-finding phase starts from the source electrode of an agent, which moves from electrode to electrode while trying to reach the destination electrode. Each move of an agent follows the predefined reward scheme given in Table I, and we apply the reward structure shown in Fig. 7 as described by Rajesh and Pyne.36 The policy π is registered as π(s) = ak, where ak ∈ A. By calculating the reward value, the agent either reaches the destination by traversing the electrodes or exceeds a precalculated threshold of 1.75 × Disi, where Disi is the Manhattan distance of agent i. We apply a higher threshold than the 1.5 × Disi used by Rajesh and Pyne36 to give the route-finding process more flexibility and benefit the route compaction.
| Reward function | Reward/penalty value | Operation |
|---|---|---|
| r(M(di)) | +10.0 | Target electrode |
| | −0.05 | Open adjacent electrode |
| | −0.20 | Visited electrode |
| | −0.75 | Blocked module |
| | −0.75 | Out-of-boundary electrode |
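A minimal sketch of one environment step under the Table I reward scheme follows (the layout and action names repeat the earlier environment sketch for self-containment; all names are assumptions):

```python
import numpy as np

layout = np.ones((5, 5), dtype=int)
layout[1, 1:3] = layout[3, 2:4] = 0       # hard blockages
target = (4, 4)                           # destination electrode of the net
ACTIONS = {"Mup": (-1, 0), "Mdown": (1, 0), "Mleft": (0, -1), "Mright": (0, 1)}

def step(pos, action, visited):
    """Apply one move and return (new_position, reward) per Table I."""
    r, c = pos[0] + ACTIONS[action][0], pos[1] + ACTIONS[action][1]
    if not (0 <= r < layout.shape[0] and 0 <= c < layout.shape[1]):
        return pos, -0.75                 # out-of-boundary electrode
    if layout[r, c] == 0:
        return pos, -0.75                 # blocked module
    if (r, c) == target:
        return (r, c), 10.0               # target electrode
    if (r, c) in visited:
        return (r, c), -0.20              # already-visited electrode
    return (r, c), -0.05                  # open adjacent electrode

print(step((4, 3), "Mright", visited=set()))   # ((4, 4), 10.0)
```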
B. Training process
The training of the network requires a large number of iterations in the simulation process because the numbers of environmental states (e) and Q-values (q) are huge. We create two types of network, actual and target, which are updated separately and denoted by θ and θ−, respectively. The experience replay buffer holds the status of each previous episode in the form of the four-tuple et = (st, at, rt, st+1) at time t, where st is the present state, at is the action, rt is the reward, and st+1 is the next state reached by taking action at. The network is updated by applying TD learning with a multi-step bootstrap target, and the loss function is defined as

lt(θ) = ½(Gt − q(St, At; θ))², (2)

where Gt is the multi-step target of Eq. (1).
A queue is used to implement the experience replay buffer. A distributed in-memory store with individual capacity b is used to hold the replay data according to their key values. Here, we implement proportional prioritization38 using the TD error δk. The probability of sampling key value k is P(k) = pk^α / Σj pj^α, where α is the control factor, and pk is the priority of k and is defined as pk = |δk|. To implement the multi-step (n-step) transition, each agent retains a circular buffer of capacity n to store the five-tuple (St, At, Rt:t+n, γt:t+n, q(St, *)). In each iteration, new data are appended, and the discount factor γt:t+n and sliced return Rt:t+n are calculated for all records. When the buffer is full, its first element is combined with the current state St+n and the estimated value q(St+n) to continue the n-step transition with valid data. The constructed data are first stored in a local buffer and then moved to the replay buffer.
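A minimal sketch of this proportional sampling rule follows (the small additive constant is our assumption to avoid zero probabilities):

```python
import numpy as np

def sampling_probs(td_errors, alpha=0.6):
    """Key k gets priority p_k = |delta_k| and probability p_k^alpha / sum_j p_j^alpha."""
    p = np.abs(td_errors) + 1e-6
    return p ** alpha / np.sum(p ** alpha)

rng = np.random.default_rng(0)
probs = sampling_probs(np.array([0.5, 2.0, 0.1, 1.2]))
k = rng.choice(len(probs), p=probs)       # key of the replay entry to fetch
print(probs.round(3), k)
```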
The replay buffer uses three inputs, i.e., the model, the variable max_size, and the discount factor γ for the n agents (droplets), as shown in Fig. 5. The variable max_size stores the maximum count of allowable episodes. The neural network and replay memory are initialized with random weights and random samples, respectively. Each agent uses an ɛ-greedy policy to traverse the DMFB states (electrodes). The training process is executed via the episodes, and a counter training_count monitors each iteration of the distributed model. The training process calculates the Q-values using Eq. (1), with training driven by the difference between the predicted and target values. To stabilize the training process, we periodically copy the parameters of the predicted network (θt) to the target network, as specified by Rajesh and Pyne.36 At each iteration, an agent either applies the action with the maximum Q-value or takes a random action based on the ɛ-greedy policy: if the generated random number is less than ɛ, then a random action is taken; otherwise, the action with the highest Q-value is chosen. For example, if ɛ = 0.3, then 30% of the moves are selected randomly and the rest are based on the highest Q-value. The execution process of an agent is terminated if the win rate reaches 100% or the path (sourcei, targeti) crosses the predefined limit 1.75 × Disi.
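A minimal sketch of this ɛ-greedy selection rule (names assumed):

```python
import numpy as np

def epsilon_greedy(q_values, epsilon=0.3, rng=np.random.default_rng()):
    """With probability epsilon explore randomly; otherwise exploit the best Q-value."""
    if rng.random() < epsilon:            # e.g., 30% of the moves are random
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))       # greedy move

print(epsilon_greedy(np.array([0.2, 1.3, 0.4, 0.9])))   # usually 1
```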
C. Route compaction
The proposed algorithm finds the routes of all the source–target pairs for the DMFB droplet routing. The actors are responsible for training the network in the distributed architecture and providing the route of each droplet. Under the ɛ-greedy policy, each source–target pair accomplishes p routes; we use p ≤ 4 to provide detour options for a particular route if it cannot satisfy the fluidic constraints by stalling alone. The algorithm strictly checks all possible routes for violations of the fluidic constraints described in Sec. III. To overcome a collision, stalling is applied to the shorter route (according to the Manhattan distance). Because friction and deadlock cannot be solved by stalling, a detour is preferred, and for each such constraint, the algorithm must recalculate a new route. After all the fluidic constraints are satisfied, the total used electrodes (UE), latest arrival time (TL), and average arrival time (TA) are calculated over all the routes. The proposed method is described in Algorithm 1, which covers the environment configuration, route establishment (route finding and compaction), and the calculation of the measuring factors. The algorithm is divided into two parts: steps 1–22 are for the actors, and the rest are for route compaction.
Require: 1. Number of nets (n), where n = x + 3y, x = count of 2-pin nets, and y = count of 3-pin nets. 2. M hard blockages (modules) with positions. 3. 2D DMFB layout with predefined source and destination of each net.
Ensure: Compacted paths for all droplets
1: Initialize the Ape-X DQN network parameters. ⊳ Parameters are discussed in Sec. V
2: Initialize the actual network and the target network with respective parameters θ and θ−.
3: Calculate the Manhattan distance Disi for each net ni.
4: Sort the nets in non-increasing order of their Manhattan distances.
5: for (each net i = 1 to n) do ⊳ Route finding for each net
6: while (episode < threshold and win rate < 100%) do
7: Select an action ak = π(s) for state s using the ɛ-greedy policy.
8: Apply the action in the environment and store the transition in the local buffer.
9: if (local buffer size ≥ b) then
10: Get the buffered data of b to φ. ⊳ Batch of multi-step transitions
11: Calculate the priorities of φ for experience to ρ.
12: Update the replay buffer by (φ, ρ).
13: end if
14: end while
15: Copy the prioritized sample from the batch of transitions to (id, φ).
16: Apply the leaky rectified linear unit activation function as the learning rule and compute the loss lt using Eq. (2).
17: Update the network θt+1 by lt and θt.
18: Calculate the priorities for experience using |δk| and store them in ρ.
19: Update the replay memory with (id, ρ).
20: Remove old experience from the replay memory.
21: Enlist the route coordinates in the path-list of net i.
22: end for
23: Initialize the variables TL = 0, TA = 0, UE = 0 and the time counter t = 0.
24: while (all droplets have not reached their corresponding targets) do
25: Move each droplet through its respective path-list and update t = t + 1.
26: Recalculate TL and UE.
27: Check the fluidic constraints as described in Sec. III.
28: if (violation found) then
29: Apply stall and/or detour.
30: Go to step 24.
31: end if
32: end while
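A minimal sketch of the compaction loop of steps 23–32 follows, with our simplifications: only the static constraint is checked, the stall preference is a plain length heuristic, and the detour branch (taken when stalling cannot resolve a violation) is omitted; all names are assumptions.

```python
def violates(a, b):
    """Static fluidic constraint violated if the droplets are within one cell."""
    return abs(a[0] - b[0]) < 2 and abs(a[1] - b[1]) < 2

def compact(paths):
    """paths: dict net -> list of (row, col); returns the latest arrival time."""
    t = 0
    while any(t < len(p) - 1 for p in paths.values()):
        t += 1
        pos = {i: p[min(t, len(p) - 1)] for i, p in paths.items()}
        changed = True
        while changed:                     # re-check the step until it is clean
            changed = False
            for i in paths:
                for j in paths:
                    if i < j and violates(pos[i], pos[j]):
                        # Stall the shorter route by repeating its previous cell.
                        k = i if len(paths[i]) < len(paths[j]) else j
                        if t < len(paths[k]):
                            paths[k].insert(t, paths[k][t - 1])
                            pos[k] = paths[k][t]
                            changed = True
    return t

paths = {0: [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)],
         1: [(2, 2), (2, 1), (2, 0)]}
print(compact(paths))   # 5: droplet 1 stalls until droplet 0 has passed
```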
An example with 12 nets illustrates the simulated result of a complex bioassay on benchmark suite I.39 Figure 8(a) shows the estimated paths for the complex bioassay of case 1 from benchmark suite I on a 12 × 12 2D microarray involving 12 2-pin test droplets; red rectangles show possible collision regions, and green rectangles show possible friction sites. We developed a customized simulator to implement the proposed algorithm on the various given test cases. The simulated result is shown in Fig. 8(b), where the green arrows show the detour paths taken to avoid collision or friction; at a given electrode position, a number in black is the acceptable timestamp, and a number in red highlights a stall position and time.
V. RESULTS
The proposed Ape-X DQN-based droplet routing technique was implemented in Python 3.5. The method was carried out in two phases to implement the routing of droplets for the given nets: first, the technique identified possible paths using the Ape-X DQN algorithm, and then the optimized solution was found after the completion of compaction. To train the network, its parameters were initialized as follows: two actors (each with a single CPU) to put the data in the replay buffer; a discount factor of γ = 0.95; a TD-error control factor of α = 0.6; and a replay memory capacity of max_size = 1000. As the activation function, we used a leaky rectified linear unit (LReLU),40 and the capacity of the distributed memory was b = 250. The learner started updating only after at least 50 000 iterations had filled the replay buffer. The algorithm was executed using the Adam optimizer41 with a learning rate of 10−4, a decay of 0.95, and an exploration factor of ɛ = 0.01. In our experiment, we assumed p = 4.
The algorithm was simulated on the widely used test benches39 named benchmark suite I and benchmark suite III. The results are compared with those from the following existing methods: a high-performance droplet routing algorithm (High performance),24 a genetic algorithm (GA),42 a fast routing algorithm (Fast routing),25 and a Q-learning algorithm (Q-learning)36 for benchmark suite I; and a heuristic approach (Heuristic),43 an exact routing algorithm (Exact routing),44 and a multi-level approach (Multi-level routing)45 for benchmark suite III.
Table II gives the simulated results based on benchmark suite I, and Table III compares them with those from the methods in Refs. 24, 25, 36, and 42. As given in Table III, the proposed technique outperforms the existing ones in most cases: the results show average improvements of 33.27%, 3.13%, 22.93%, and 1.85% in latest arrival time and 9.75%, 31.74%, 4.26%, and 0.34% in total used electrodes over High performance, GA, Fast routing, and Q-learning, respectively. Table IV gives the obtained results of different bioassays, i.e., invitro_1, invitro_2, protein_1, and protein_2 (benchmark suite III), with 11, 15, 64, and 78 sub-problems, respectively; the results highlight the better performance of the proposed technique in terms of latest arrival time and average arrival time compared to Heuristic, Exact routing, and Multi-level routing.
| Test name | Test bench | Nets | Blockage % | TLᵃ | TAᵇ | UEᶜ |
|---|---|---|---|---|---|---|
| Test 1 | 12 × 12_1 | 12 | 5.6 | 32 | 5.04 | 62 |
| Test 2 | 12 × 12_2 | 12 | 6.2 | 33 | 9.85 | 62 |
| Test 3 | 12 × 12_3 | 12 | 7.6 | 35 | 9.35 | 57 |
| Test 4 | 12 × 12_4 | 12 | 7.6 | 26 | 5.30 | 59 |
| Test 5 | 16 × 16_1 | 16 | 6.6 | 32 | 9.54 | 96 |
| Test 6 | 16 × 16_2 | 16 | 5.5 | 34 | 9.65 | 95 |
| Test 7 | 16 × 16_3 | 16 | 10.5 | 37 | 16.33 | 94 |
| Test 8 | 16 × 16_4 | 16 | 10.2 | 36 | 9.25 | 95 |
| Test 9 | 16 × 16_5 | 16 | 15.2 | 35 | 8.67 | 92 |
| Test 10 | 16 × 16_6 | 16 | 15.2 | 38 | 9.21 | 92 |
| Test 11 | 24 × 24_1 | 24 | 11.1 | 43 | 9.52 | 214 |
| Test 12 | 24 × 24_2 | 24 | 10.1 | 50 | 17.45 | 219 |
| Test 13 | 24 × 24_3 | 24 | 15.5 | 48 | 19.67 | 220 |
| Test 14 | 24 × 24_4 | 24 | 15.8 | 49 | 12.65 | 218 |
| Test 15 | 24 × 24_5 | 24 | 20.7 | 53 | 18.33 | 210 |
ᵃ Latest arrival time (TL).
ᵇ Average arrival time (TA).
ᶜ Total used electrodes (UE).
(Columns are grouped by method: High performance,24 GA,42 Fast routing,25 DDQN,36 and Proposed; "Impr." denotes the improvement (%) of the proposed technique over that method.)

| Test name | TL | Impr.ᵃ | UE | Impr.ᵇ | TL | Impr.ᶜ | UE | Impr.ᵈ | TL | Impr.ᵉ | UE | Impr.ᶠ | TL | Impr.ᵍ | UE | Impr.ʰ | TL | UE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Test 1 | 100 | 68.00 | 67 | 7.46 | 36 | 11.11 | 92 | 32.61 | 39 | 17.95 | 73 | 15.07 | 30 | −6.67 | 60 | −3.33 | 32 | 62 |
| Test 2 | ⋯ | ⋯ | ⋯ | ⋯ | 32 | −3.13 | 86 | 27.91 | 47 | 29.79 | 65 | 4.62 | 31 | −6.45 | 62 | 0.00 | 33 | 62 |
| Test 3 | ⋯ | ⋯ | ⋯ | ⋯ | 39 | 10.26 | 90 | 36.67 | 41 | 14.63 | 58 | 1.72 | 35 | 0.00 | 60 | 5.00 | 35 | 57 |
| Test 4 | 70 | 62.86 | 64 | 7.81 | 20 | −30.00 | 79 | 25.32 | 38 | 31.58 | 71 | 16.90 | 28 | 7.14 | 57 | −3.51 | 26 | 59 |
| Test 5 | 78 | 58.97 | 118 | 18.64 | 29 | −10.34 | 141 | 31.91 | 40 | 20.00 | 100 | 4.00 | 30 | −6.67 | 91 | −5.49 | 32 | 96 |
| Test 6 | 55 | 38.18 | 119 | 20.17 | 38 | 10.53 | 154 | 38.31 | 47 | 27.66 | 98 | 3.06 | 35 | 2.86 | 98 | 3.06 | 34 | 95 |
| Test 7 | 89 | 58.43 | 113 | 16.81 | 42 | 11.90 | 147 | 36.05 | 44 | 15.91 | 91 | −3.30 | 41 | 9.76 | 102 | 7.84 | 37 | 94 |
| Test 8 | 41 | 12.20 | 94 | −1.06 | 35 | −2.86 | 148 | 35.81 | 49 | 26.53 | 96 | 1.04 | 38 | 5.26 | 90 | −5.56 | 36 | 95 |
| Test 9 | ⋯ | ⋯ | ⋯ | ⋯ | 28 | −25.00 | 135 | 31.85 | 49 | 28.57 | 91 | −1.10 | 38 | 7.89 | 86 | −6.98 | 35 | 92 |
| Test 10 | 77 | 50.65 | 110 | 16.36 | 41 | 7.32 | 154 | 40.26 | 51 | 25.49 | 94 | 2.13 | 41 | 7.32 | 90 | −2.22 | 38 | 92 |
| Test 11 | 47 | 8.51 | 249 | 14.06 | 60 | 28.33 | 304 | 29.61 | 56 | 23.21 | 228 | 6.14 | 44 | 2.27 | 232 | 7.76 | 43 | 214 |
| Test 12 | 52 | 3.85 | 219 | 0.00 | 50 | 0.00 | 280 | 21.79 | 62 | 19.35 | 231 | 5.19 | 47 | −6.38 | 210 | −4.29 | 50 | 219 |
| Test 13 | 52 | 7.69 | 247 | 10.93 | 49 | 2.04 | 286 | 23.08 | 62 | 22.58 | 221 | 0.45 | 49 | 2.04 | 224 | 1.79 | 48 | 220 |
| Test 14 | 57 | 14.04 | 234 | 6.84 | 60 | 18.33 | 321 | 32.09 | 64 | 23.44 | 219 | 0.46 | 53 | 7.55 | 225 | 3.11 | 49 | 218 |
| Test 15 | 63 | 15.87 | 230 | 8.70 | 65 | 18.46 | 313 | 32.91 | 64 | 17.19 | 227 | 7.49 | 54 | 1.85 | 228 | 7.89 | 53 | 210 |
ᵃ Average improvement in TL over High performance: 33.27%.
ᵇ Average improvement in UE over High performance: 9.75%.
ᶜ Average improvement in TL over GA: 3.13%.
ᵈ Average improvement in UE over GA: 31.74%.
ᵉ Average improvement in TL over Fast routing: 22.93%.
ᶠ Average improvement in UE over Fast routing: 4.26%.
ᵍ Average improvement in TL over DDQN: 1.85%.
ʰ Average improvement in UE over DDQN: 0.34%.
(TL/TA columns are grouped by method: Heuristic,43 Exact routing,44 Multi-level,45 and Proposed.)

| Test bench | Dimensions | Max. no. of droplets | TL | TA | TL | TA | TL | TA | TL | TA |
|---|---|---|---|---|---|---|---|---|---|---|
| invitro_1 | 16 × 16 | 5 | 15.33 | 9.55 | 15 | 9.09 | 16.67 | 9.67 | 15.67 | 9.43 |
| invitro_2 | 14 × 14 | 6 | 11.33 | 7.29 | 11.67 | 7.12 | 14.33 | 8 | 11.33 | 7.24 |
| protein_1 | 21 × 21 | 6 | 18.67 | 12.3 | 17.67 | 12.47 | 19.66 | 13 | 18.67 | 12.24 |
| protein_2 | 13 × 13 | 6 | 16.67 | 7.48 | 16.67 | 7.54 | 16.33 | 8 | 16.33 | 7.27 |
Figure 9 shows the results of comparing the proposed technique with High performance, GA, Fast routing, and Q-learning in terms of latest arrival time and total used electrodes on benchmark suite I; as can be seen, the proposed technique gives better outcomes in most cases. Figure 10 shows the results of comparing the proposed technique with Heuristic, Exact routing, and Multi-level routing in terms of latest arrival time and average arrival time on benchmark suite III; again, the proposed technique gives better results in most cases.
VI. CONCLUSION
Herein, we proposed a neural network-based technique for homogeneous droplet routing in DMFBs. First, the possible paths are obtained using Ape-X DQN actors, and then route compaction is applied to obtain the optimal path; routing issues such as collision, friction, and deadlock are resolved during this second phase. The outcomes for the presented test cases showed the near-optimal performance of the proposed deep learning approach. However, issues such as large-scale distributed architectural implementation and pin actuation with wire planning for dynamic routing provide significant scope for future work.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
DATA AVAILABILITY
Data sets are not applicable here; the method was tested on the two benchmark suites available in Ref. 39.
REFERENCES
Basudev Saha received a B.Sc. in Computer Science from the University of North Bengal in India. He received his M.C.A. and M.Tech. degrees in Information Technology from the University of Calcutta in India. He was previously an Assistant Professor in the Department of Computational Science at Brainware University in India and is currently a research scholar in the Department of Computer Science and Technology at the University of North Bengal. His research interests include machine learning, deep learning, microfluidic systems, and digital microfluidic biochips.
Bidyut Das received a Ph.D. in Computer Science and Engineering from the Maulana Abul Kalam Azad University of Technology in India. He is currently an Associate Professor in the Department of Information Technology at Haldia Institute of Technology in India. He was awarded a gold medal for his master’s degree in Computer Science. He received an Inspire Fellowship from the Department of Science & Technology of the Government of India, and a postdoctoral fellowship from the Indian Institute of Technology Guwahati. His research interests include ICT-based teaching and learning, natural language processing, computer vision, deep learning, and digital microfluidic biochips.
Mukta Majumder is an Assistant Professor in the Department of Computer Science and Technology, University of North Bengal, Siliguri, West Bengal, India. Prior to this, he served as an Assistant Professor at the Computer Centre of Vidyasagar University. His research interests include natural language processing, machine learning, ICT-based teaching, learning, and assessment, and microfluidic systems and biochips.