The classification of higher-order photon emission is becoming important as more methods are developed for deterministic multiphoton generation. The widely used second-order correlation g(2) is not sufficient to determine the quantum purity of higher photon Fock states. Traditional characterization methods require a large number of photon detection events, which increases measurement and computation time. Here, we demonstrate a machine learning model based on a 2D Convolutional Neural Network (CNN) for rapid classification of multiphoton Fock states up to |3⟩ with an overall accuracy of 94%. By fitting the g(3) correlation with simulated photon detection events, the model performs efficiently, particularly with sparse correlation data, requiring only 800 co-detection events to achieve an accuracy of 90%. Using the proposed experimental setup, this CNN classifier opens up the possibility of quasi-real-time classification of higher photon states, which holds broad applications in quantum technologies.
I. INTRODUCTION
Recently, quantum light has been undergoing rapid advancements and playing a pivotal role in the development of quantum technologies.1–5 At the quantum level, the photon nature of light is characterized as discrete packets of energy,6,7 offering remarkable precision, sensitivity, and enhanced communication security beyond what classical optics can achieve8–10 and thereby facilitating the application of quantum systems in diverse fields such as metrology,11–14 computer science,15–17 and communication.18–21
However, due to the typically low emission rate and detection inefficiency, experimental observation of quantum emitters often necessitates a lengthy experiment time and produces large datasets with a high level of noise,22–24 which renders the fitting process computationally expensive.25,26 The identification of multiphoton states is commonly addressed by assembling multiple single-photon detectors that are time-correlated.27–31 This exacerbates the challenge of data analysis, as traditional computational techniques such as the Levenberg–Marquardt (L-M) method32 require extensive co-detection events to achieve satisfactory accuracy.26
Over the past decade, novel data-driven formalisms such as Machine Learning (ML) have introduced new possibilities in quantum photonics experiments.33–37 Specialized in analyzing large and sparse datasets, ML models have provided speedup by orders of magnitude in certain quantum measurements38,39 and show potential in overcoming the inherent limitations of conventional fitting methods particularly in the low-photon flux regime.40 For example, a Convolutional Neural Network (CNN)-based algorithm was developed for rapid classification of single photon emitters in the NV center of nanodiamonds.41 Compared to the L-M method, the accuracy is improved with the CNN model by recognizing subtle features extracted from sparse correlation data. A single artificial neuron model was developed to reduce the required average number of photons down to less than one for distinguishing thermal light from coherent light in low-light measurements.42 A study by Cortes et al.43 demonstrated that employing statistical learning methods for the reconstruction of g(2) data can substantially accelerate the data acquisition process from few-shot measurements.
While considerable efforts were directed toward single-photon emitters,44–47 the emission of multiple, indistinguishable photons is also becoming favorable for quantum systems,48,49 making multiphoton emitters promising candidates for quantum applications such as Boson sampling.50 Because the commonly adopted g(2) correlation proves inadequate for detecting the photon “superbunching” in higher Fock states, higher-order correlations must be introduced.49,51,52
In this study, we present a 2D CNN-based ML model for rapid classification of multiphoton states, including photon Fock states up to |3⟩ and coherent states of laser emission. The time-dependent photon detection data are simulated, and by mixing each Fock state with the corresponding coherent state, the quantum purity of emitters can be manipulated. The g(3) correlation is computed from the simulated data and fitted using a supervised machine learning model to return the photon state classification results. Through model training and optimization, the average accuracy of classification surpasses 90% for all the Fock states, with an overall accuracy of 94%. The model performs effectively with sparse datasets, requiring only 800 photon detection events to achieve a 90% average accuracy. Finally, we propose an experimental setup for quasi-real-time photon state classification accelerated by the ML model. This is the first time a 2D CNN algorithm has been employed for identifying multiphoton states, and it shows enhanced accuracy and data efficiency.
II. METHODS
To simulate a photon correlation experiment within an extended Hanbury Brown and Twiss (HBT) scheme53 shown in Fig. 7, the Monte Carlo method is used to generate photon detection events with arrival timestamps. The simulation model consists of three primary parts: photon stream emission, transport, and detection as the output. The model is built upon the TensorFlow Probability (TFP) library and its TensorFlow Distributions (TFD) module.54
Figure 1 illustrates the simulation of an imperfect |2⟩ state emitter as an example. To simulate the photon emission, the TFD categorical function is used to generate a list of light source labels, shown by the top row in Fig. 1(a). Each label represents either a quantum emission in the |n⟩ state (red) or a coherent laser emission |α⟩ (blue) with an average photon number of n. For an ideal quantum emitter, no laser labels are included, whereas non-ideal quantum emitters contain a mixture of quantum emission labels and laser labels that are randomly distributed within the list. The quality of a quantum emitter can be assessed by the portion of quantum emission labels in the list, controlled by a simulation parameter called “quantum light probability” (QLP) that ranges from 0 to 1. In practical experiments, the QLP reflects the cumulative effect of various factors, such as the background signal of classical light, that impact the correlation result g(k)(0). As shown in Fig. 1, the |2⟩ state emitter with a QLP of 0.5 results in an even distribution of half quantum emission labels and half laser labels. Each light label is then replaced by an integer representing the number of photons in that emission, illustrated by the second row in Fig. 1(a). Quantum emission labels are simply replaced by the integer 2 (in red), representing the emission of two identical photons from the |2⟩ emitter. Laser labels are replaced using the TFD Poisson function, which generates integers (in blue) following a Poissonian distribution with an average value of 2. This difference arises because the photon number of each emission from a coherent laser follows a Poissonian distribution, shown by the blue plot on the right of Fig. 1(a), whereas it is constant for quantum emission, as represented by the red plot.
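The emission step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it uses NumPy in place of TFP, replaces the TFD categorical draw with an equivalent Bernoulli draw, and the function name `simulate_emission` and its parameters are hypothetical.

```python
import numpy as np

def simulate_emission(n, qlp, num_pulses, rng):
    """Return per-pulse photon numbers for a |n> emitter mixed with a laser.

    Each pulse is quantum light with probability `qlp`; otherwise it is
    coherent light with mean photon number `n` (assumed mixing model).
    """
    # Source labels: True = Fock-state emission, False = laser emission.
    is_quantum = rng.random(num_pulses) < qlp
    # Laser pulses carry a Poisson-distributed photon number with mean n.
    laser_photons = rng.poisson(lam=n, size=num_pulses)
    # Fock-state pulses always carry exactly n photons.
    photons = np.where(is_quantum, n, laser_photons)
    return photons, is_quantum

rng = np.random.default_rng(seed=0)
photons, is_quantum = simulate_emission(n=2, qlp=0.5, num_pulses=10_000, rng=rng)
print(photons[:10], is_quantum[:10])
```

With a QLP of 0.5, roughly half of the pulses carry exactly two photons, and the rest are Poisson-distributed around two.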
Simulation of the HBT experiment, g(3) correlation algorithm, and rapid Fock state classification based on machine learning. (a) Schematic view of the Monte Carlo simulation. The categorical function produces a list of label markers distinguishing between quantum light and laser, where in this instance, “n” represents a photon at Fock state |2⟩, while “α” means a coherent state |α⟩ with the average number of photons set to 2. The portion of quantum light labels and laser labels in the list is determined by the quantum light probability, set here at 50%, indicating an equal mixture of both. Each marker is then replaced by the actual photon number on the second row, depending on its distribution: each quantum light “n” is replaced by 2, from its delta distribution shown by the red plot on the right, while each “α” is replaced with an integer generated by a Poissonian distribution function with an average of 2, as illustrated by the blue plot. (b) g(3) correlation algorithm. In the purple rectangle, the second row with photon numbers is equally split into three rows using a multinomial function, representing photon events detected by three virtual detectors labeled as “d1,” “d2,” and “d3.” The yellow lines linking detection events from different detectors illustrate the algorithm of g(3) correlation associated with a particular data point marked in purple on the g(3) heat map to the right. Here, τ12 and τ13, representing the time differences between detectors 1 and 2 and detectors 1 and 3 are −1 and 2, respectively. (c) CNN-based Fock state classification. The g(3) correlation results are fed into a CNN model that is pre-trained with similar simulated data, and the model yields the classification result.
The detected photon events are analyzed using the g(3) function given by Eq. (2). The time differences τ12 and τ13 between detectors 1 and 2 and detectors 1 and 3, respectively, are the variables used to compute the normalized g(3)(τ12, τ13) correlation, shown by the heat map on the right of Fig. 1(b). The data point at (τ12 = −1, τ13 = 2) marked in purple on the heat map is chosen to demonstrate the g(3) algorithm depicted with the yellow stripes in Fig. 1(b). With these specific time differences, the d2 vector is shifted forward by 1 unit relative to d1, while the d3 vector is shifted backward by 2 units. The element-wise product of the three vectors is summed, divided by the vector length, and normalized by its mean, giving the purple-marked pixel in Fig. 1(b). The g(3) function implements the above algorithm for all τ12 and τ13 values, ranging from −16 to 16 with an interval of 1, and returns a two-dimensional matrix of g(3) correlation results. A 2D CNN model is developed for photon state classification based on the correlation results, shown in Fig. 1(c). In the given example, the |2⟩ state quantum emitter is successfully identified among four options: a laser in a coherent state and quantum emitters in the |1⟩, |2⟩, and |3⟩ states.
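The correlation procedure above can be sketched as follows. Several choices here are assumptions rather than details from the text: NumPy replaces TFP, the shift is circular (`np.roll`), the sign convention of the delays is arbitrary, and the normalization divides by the product of the three mean count rates. The helper names `split_three_ways` and `g3_matrix` are hypothetical.

```python
import numpy as np

def split_three_ways(photons, rng):
    """Route every photon of each pulse to one of three detectors at random."""
    counts = rng.multinomial(photons, [1 / 3, 1 / 3, 1 / 3])  # shape (N, 3)
    return counts[:, 0], counts[:, 1], counts[:, 2]

def g3_matrix(d1, d2, d3, max_delay=16):
    """Normalized g3(tau12, tau13) for integer delays in [-max_delay, max_delay]."""
    delays = np.arange(-max_delay, max_delay + 1)
    # Pre-shift the detector records: shifted[t] = d[t + tau] (circular wrap).
    d2_shifted = [np.roll(d2, -tau) for tau in delays]
    d3_shifted = [np.roll(d3, -tau) for tau in delays]
    # Assumed normalization: product of the three mean count rates.
    norm = d1.mean() * d2.mean() * d3.mean()
    g3 = np.empty((delays.size, delays.size))
    for i, s2 in enumerate(d2_shifted):
        for j, s3 in enumerate(d3_shifted):
            g3[i, j] = np.mean(d1 * s2 * s3) / norm
    return g3

rng = np.random.default_rng(seed=1)
photons = np.full(20_000, 2)  # ideal |2> emitter: two photons per pulse
d1, d2, d3 = split_three_ways(photons, rng)
g3 = g3_matrix(d1, d2, d3)
print(g3.shape)  # (33, 33)
```

For this ideal |2⟩ input, the zero-delay element `g3[16, 16]` is exactly zero, because two photons can never trigger three detectors in the same pulse.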
According to Eq. (1), to characterize a multiphoton Fock state |n⟩, a k-th order correlation with k > n is required to ensure that the central point g(k)(0) remains zero. Although the presented algorithm is extensible to higher-order correlations, due to the rapid scaling of computational complexity with the order of correlation,26 g(3) is chosen for categorizing and characterizing multiphoton states within a manageable computation time. Whereas g(2) takes a single time difference τ as its variable, the g(3) function has two variables, τ12 and τ13, represented by the x and y axes, being the time differences between detectors 1 and 2 and detectors 1 and 3, respectively. The third time difference τ23, which can be derived from the other two as τ23 = τ13 − τ12, is not considered as a variable in this context.
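As a worked illustration of the zero-delay values, the sketch below uses the standard result for an ideal Fock state, g(k)(0) = n!/[(n − k)! n^k] for k ≤ n and 0 otherwise, together with g(k)(0) = 1 for a coherent state. Treating the imperfect emitter as a statistical mixture with equal mean photon number, weighted by the QLP, is an assumption, as is the correspondence with Eq. (1); the function names are hypothetical.

```python
from math import factorial

def fock_gk0(n, k):
    """Zero-delay k-th order correlation of an ideal Fock state |n>."""
    if k > n:
        return 0.0  # fewer photons than detectors: no k-fold coincidence
    return factorial(n) / (factorial(n - k) * n**k)

def mixed_gk0(n, k, qlp):
    """Assumed mixture of |n> (weight qlp) and a coherent state with mean n."""
    return qlp * fock_gk0(n, k) + (1.0 - qlp) * 1.0

for n in (1, 2, 3):
    print(f"|{n}>: g2(0) = {fock_gk0(n, 2):.3f}, g3(0) = {fock_gk0(n, 3):.3f}")
```

Under these assumptions, a |3⟩ emitter at a QLP of 0.5 gives `mixed_gk0(3, 2, 0.5)` ≈ 0.83.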
Table I provides theoretical g(2)(0) and g(3)(0) values for Fock states |1⟩, |2⟩, and |3⟩ when mixed with corresponding coherent states |α⟩. The quantum light probability (QLP) represents the portion of Fock states in the mixture, e.g., QLP = 1 denotes an ideal quantum state, while QLP = 0 denotes a coherent state. During the simulation, coherent states with an average photon number of 3 or less are modeled, which is often achieved by strongly attenuating a classical source, such as a laser, when measuring quantum emission.58 The g(3) results in Fig. 2 align well with the values in Table I: First, the zero-delay value is 0, as the correlation order (3) is greater than the photon number of the Fock state (2). Second, the data within the cross-pattern and the anti-diagonal that are in green approach 0.5 due to the special values of the time differences. For the vertical (or horizontal) stroke of the cross, the time difference τ12 (or τ13) is 0. Meanwhile, for the anti-diagonal (bottom left to top right), the two time differences are equal, signifying that τ23 is 0. With one time difference being zero, the g(3) correlation can be treated as equivalent to a g(2), which correlates only two detectors with one time-difference variable. Hence, the value of 0.5 on the heatmap can be explained with g(2)(0) = 1 − 1/n = 0.5 for the |2⟩ state (n = 2). Finally, the remaining data points are normalized with respect to the average photon number, typically resulting in a normalization factor of 1. To better visualize the 2D g(3) matrix, three cross sections marked with colored arrowheads on the heatmap are plotted as bar charts on the right side of Fig. 2. Each represents the evolution of g(3) as a function of only τ12, with τ13 being set at special values: τ13 = −τ12 for black bars, τ13 = 0 for purple, and τ13 = τ12 for yellow, corresponding to the main diagonal, the horizontal stroke in the cross-pattern, and the anti-diagonal on the heatmap, respectively.
Theoretical g(2) and g(3) values of Fock states when mixing with the corresponding coherent states. The quantum light probability indicates the portion of quantum light in the simulation when it is mixed with a corresponding coherent state. A probability of 0.5 (in green) signifies a light source composed of equal parts of quantum light and laser, which is set to be the critical point distinguishing between a quantum emitter and a coherent laser. Values below 0.5 (in blue) or above 0.5 (in red) are labeled as a laser or quantum light, respectively.
An example of g(3) correlation results from simulated data for a |2⟩ Fock state without mixing with a coherent state. Left: Normalized g(3) correlation in a heat map, as a function of τ12 and τ13, which represent the time differences between detectors 1 and 2 and detectors 1 and 3, respectively. Right: 2D bar plots of g(3) correlation, as a function of only τ12, while τ13 is set to be either negative τ12, 0, or equal to τ12, corresponding to black, magenta, and yellow bars. The trace of each bar plot is indicated by the arrows within the same color on the heat map.
In contrast to the second-order correlation, which has only one critical point g(2)(0) for categorization, the third-order correlation g(3) features multiple critical points: one at g(3)(0) and multiple at g(3)(0, τ ≠ 0), effectively serving as g(2)(0). In accordance with Eq. (1), the complete g(3) cross correlation enables the identification of unknown Fock state emission and the assessment of its quantum purity. Even for Fock states higher than |2⟩, for which g(3)(0) is no longer 0, the state can still be determined from the g(3)(0) and g(2)(0) values. Meanwhile, the dynamics of excitons can be further explored through other correlation algorithms, such as two-time second-order auto-correlation.59 In addition, the two-dimensional nature of the g(3) data proves advantageous for CNN models that are adept at extracting spatial features from images.60,61
While the g(3) critical values can typically be determined using conventional fitting methods, the experimental data often exhibit significant sparsity due to the limited number of photon detection events, which calls for more advanced data fitting techniques. To efficiently categorize photon states by fitting the g(3) data, a machine learning model is developed based on the open-source API Keras.62 The model architecture and layer operations are illustrated in Fig. 3. The model mainly comprises 2D convolution layers (Conv2D), 2D max pooling layers (MaxPool), a 2D global average pooling layer (AvePool), and dense layers. The output shape of each layer is noted in parentheses.
Schematics of the CNN model. Conv2D is for the 2D convolution layer, MaxPool is for the 2D max pooling layer, AvePool is for the 2D global average pooling layer, and ×2 represents a layer or sequence of operations (in brackets) that is repeated twice. The shape of output data from each layer is written in parentheses. The dashed lines illustrate the connection between input data points and respective output data during layer operations. The initial input data consist of g3 correlation results in a 2D matrix, shown by the white cells, while red cells represent zero paddings. The Conv2D layer produces 3D matrices of data with a depth of 64, equal to the number of convolutional kernels (blue dashed lines represent one of the kernels). The data shape is condensed to one dimension at the AvePool layer, and the final dense layer outputs a score vector, indicating the confidence of predictions for each light source category.
Before passing to any layers, the correlation results are truncated and rescaled to improve the model’s performance. The first 32 elements in each dimension of the original 33 × 33 g(3) matrix are retained so that the truncated matrix, with a shape of 32 × 32, aligns with the preference of CNN layers for input data dimensions that are multiples of 2. In cases of simulation with few-shot data, where normalized g(3) results may contain extremely high values, the rescaling ensures that all the g(3) elements fall within the range of 0–1. The preprocessed datasets, depicted by the white square cells in the first diagram of Fig. 3, have a spatial dimension of 32 × 32 and a depth of 1.
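The preprocessing step can be sketched as below. The text does not specify the rescaling method, so min-max scaling is an assumption here, and `preprocess_g3` is a hypothetical helper name.

```python
import numpy as np

def preprocess_g3(g3):
    """Truncate a 33x33 g3 matrix to 32x32 and rescale it to the range 0-1."""
    truncated = np.asarray(g3, dtype=float)[:32, :32]  # keep first 32 elements per axis
    lo, hi = truncated.min(), truncated.max()
    if hi == lo:
        scaled = np.zeros_like(truncated)  # flat input: avoid division by zero
    else:
        scaled = (truncated - lo) / (hi - lo)  # assumed min-max rescaling
    return scaled[..., np.newaxis]  # add the depth-1 channel axis

rng = np.random.default_rng(seed=2)
raw = rng.random((33, 33)) * 50.0  # stand-in for a noisy few-shot g3 matrix
x = preprocess_g3(raw)
print(x.shape)  # (32, 32, 1)
```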
As the core of the CNN model, a 2D convolutional layer connecting to the input layer extracts the spatial features. The input data are zero-padded on the outer border, as indicated by the red square cells in Fig. 3, to prevent rapid degradation of the information at the original borders and to maintain the output dimension of the Conv2D layer at 32 × 32. Each Conv2D layer contains 64 convolutional kernels, and each kernel is a 3 × 3 matrix that spatially slides across the data array with a step of 1. The blue dashed boxes in Fig. 3 illustrate the sum-product calculation of a convolutional kernel, with the dashed lines connecting to the cells representing the results. This operation is performed by all 64 kernels, with each kernel capturing a unique feature from the input, producing the Conv2D output with a depth of 64. Zero-padding is also applied to the Conv2D output data to maintain the layer dimension. Two Conv2D layers are sequentially stacked and followed by a 2D max pooling layer, which calculates the maximum value of each pooling patch to spatially downsample the data. Three patches are depicted by the red dashed boxes in Fig. 3 for visualization, and this process is repeated for all 64 slices in depth. The layer dimension is halved with a step size of 2 for max pooling, resulting in an output shape of 16 × 16 × 64. The number of trainable parameters is significantly reduced by the downsampling so that a relatively low model capacity is maintained without losing essential information.63 The architecture enclosed in the square bracket in Fig. 3, composed of two consecutive convolutional layers and one MaxPool layer, is executed twice for learning hierarchical features.
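To make the layer shapes concrete, the following NumPy sketch traces one forward pass through the convolution and pooling stages. It is an illustration only and not the Keras model itself: the kernels are random, batch normalization is omitted, and the 2 × 2 pooling patch size is an assumption (the text states only the step size of 2).

```python
import numpy as np

def conv2d_same(x, kernels):
    """3x3 convolution, stride 1, zero padding, ReLU.

    x: (H, W, C_in); kernels: (3, 3, C_in, C_out).
    """
    h, w, _ = x.shape
    padded = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero border keeps H x W
    out = np.zeros((h, w, kernels.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            # Sum-product of each shifted input window with one kernel tap.
            out += padded[dy:dy + h, dx:dx + w, :] @ kernels[dy, dx]
    return np.maximum(out, 0.0)

def max_pool_2x2(x):
    """Maximum over non-overlapping 2x2 patches (halves H and W)."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

rng = np.random.default_rng(seed=3)
x = rng.random((32, 32, 1))  # preprocessed g3 input
k_first = rng.normal(size=(3, 3, 1, 64))
k_rest = [rng.normal(size=(3, 3, 64, 64)) for _ in range(3)]

block1 = max_pool_2x2(conv2d_same(conv2d_same(x, k_first), k_rest[0]))
block2 = max_pool_2x2(conv2d_same(conv2d_same(block1, k_rest[1]), k_rest[2]))
features = block2.mean(axis=(0, 1))  # global average pooling

print(block1.shape)    # (16, 16, 64)
print(block2.shape)    # (8, 8, 64)
print(features.shape)  # (64,)
```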
The following global average pooling layer computes the average for each slice in depth, shown by the gray cells in the figure. Each average reflects the importance of an extracted feature from the neural network. This array of feature scores is analyzed by two dense layers, with each neuron fully connected to all the neurons from the previous layer. The output layer computes scores for four potential photon states: the coherent state, as well as Fock states |1⟩, |2⟩, and |3⟩. Each score indicates the model’s confidence in predicting a specific photon state, and the state with the highest score is determined as the outcome. As the QLP represents the deviation of correlation results from the theoretical values of an ideal |n⟩ quantum emitter due to experimental factors, the exact value is not included in the machine learning outcome. Instead, by discretizing QLPs into bins, the regression task is reformulated into a classification problem, which aligns better with the CNN’s specialty.64
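The labeling and decision rules can be sketched as follows. The QLP threshold of 0.5 follows the text; the softmax normalization of the output scores is an assumption, and the names `label_from_qlp` and `predict` are hypothetical.

```python
import math

CLASSES = ("coherent", "|1>", "|2>", "|3>")

def label_from_qlp(n, qlp):
    """Class index for a simulated |n> emitter: QLP >= 0.5 counts as quantum light."""
    return n if qlp >= 0.5 else 0

def softmax(scores):
    """Convert raw scores to confidences that sum to 1 (assumed output activation)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(scores):
    """Return the class with the highest confidence and that confidence."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best]

print(label_from_qlp(2, 0.45))  # 0
print(label_from_qlp(2, 0.50))  # 2
print(predict([0.1, 0.2, 3.0, 0.5])[0])  # |2>
```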
Among the wide variety of machine learning methods, algorithms with straightforward architectures, such as k-Nearest Neighbors65 (k-NN), Logistic Regression66 (LG), and Gradient Boosting67 (GB), have been implemented for classification problems68–70 and have also been applied to quantum and photonics research.71,72 While these simple classifiers offer better interpretability73 and lower computational requirements74 compared to the proposed 2D-CNN model, they are less suitable and may have limited performance for photon correlation problems. As shown in the supporting information of Ref. 41, for a binary decision problem of single-photon emission based on 1D g(2) correlation, the CNN model demonstrates the highest overall fitting accuracy, outperforming other algorithms including GB and a voting classifier that combines LG and k-NN. In addition, the accuracy of the CNN improves with more training data, while neither LG nor k-NN shows further noticeable improvements.
To enhance the experimental determination of photon states, it is essential to achieve fast and accurate predictions from sparse and noisy experimental data, while the cost of the model training process is not the main concern. One drawback of simple ML algorithms such as k-NN is that prediction with large training datasets becomes computationally expensive due to the iterative comparison process.75 Algorithms such as LG and GB require upfront feature engineering to adapt to the non-linear relationship between g(3) results and photon Fock states. This process is particularly challenging for real-time fitting with noisy correlation data due to signal obscuration and reduced feature quality. In contrast, the complexity of neural networks for decision-making is a trade-off for efficiently capturing spatial hierarchies and patterns, especially with sufficient training data.74,76,77 By adopting architectures of well-developed models and utilizing novel hyperparameter optimization frameworks such as KerasTuner,78 the CNN becomes a feasible and efficient ML technique for physicists.
This CNN model consists of 11 layers (excluding the input layer) and contains over 150 000 trainable parameters. Except for the dense layers, batch normalization is applied to the output of each layer (not shown in the diagram). The Adam method is chosen as the model optimizer, and all CNN layers use ReLU (Rectified Linear Unit) as the activation function. The g(3) data are simulated for |1⟩, |2⟩, and |3⟩ Fock states, with two variable parameters: QLP and the number of detection events. The QLP has 21 possible values ranging from 0 to 1 with intervals of 0.05, to represent the quantum purity of the emitter. Qualified quantum emitters require a QLP greater than or equal to 0.5, while a QLP less than 0.5 is categorized as a coherent state. The number of detection events, which simulates the experiment time, ranges from 100 to 100 000. Values from 100 to 10 000 are taken at intervals of 100, while values from 10 000 to 100 000 are taken at intervals of 1000, totaling 190 values. A total of 11 970 cases are simulated, with 100 measurements for each case, resulting in 1 197 000 g(3) correlation datasets. 70% of the datasets are used for model training, with a batch size of 32 and a single epoch, since the large amount of data makes further iterations unnecessary. 20% of the data are validation sets for hyperparameter tuning, while the remaining 10% are used for testing the accuracy. Considering this study’s emphasis on improving fitting accuracy for sparse data, the model’s evaluation primarily relies on the required number of photon detection events to achieve satisfactory predictions, instead of computational performance metrics such as computation time.
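The size of the simulation grid and the resulting class balance can be checked with a short enumeration. This is a sketch under the stated parameter ranges; QLP values are stored as integer percentages to avoid floating-point comparison issues, and the 70/20/10 split is illustrated on case indices rather than on individual datasets.

```python
from collections import Counter

fock_states = (1, 2, 3)
qlp_percent = range(0, 101, 5)  # 0.00, 0.05, ..., 1.00 (21 values)
event_counts = (list(range(100, 10_000, 100))
                + list(range(10_000, 100_001, 1_000)))  # 99 + 91 = 190 values

cases = [(n, q, e) for n in fock_states for q in qlp_percent for e in event_counts]

# Label rule from the text: QLP < 0.5 -> coherent (class 0), otherwise Fock |n>.
class_counts = Counter(n if q >= 50 else 0 for n, q, _ in cases)

n_train = len(cases) * 70 // 100
n_val = len(cases) * 20 // 100
n_test = len(cases) - n_train - n_val

print(len(event_counts))           # 190
print(len(cases))                  # 11970
print(sorted(class_counts.items()))  # [(0, 5700), (1, 2090), (2, 2090), (3, 2090)]
print(n_train, n_val, n_test)      # 8379 2394 1197
```

The enumeration shows that the coherent class contains more cases than any single Fock class, because it collects the low-QLP cases of all three emitters.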
III. RESULTS
The average accuracy of classification results is shown in Fig. 4, as a function of variables including the photon state, QLP, and the number of detection events. The left bar chart displays the average accuracy for all cases in each photon state, with the coherent state plotted in blue and the Fock states |1⟩, |2⟩, and |3⟩ plotted in red. An accuracy above 90% is achieved for all photon states, with the accuracy for the coherent state and Fock state |3⟩ being higher than for |1⟩ and |2⟩. The middle and right plots describe the accuracy over all the photon states as functions of QLP and the number of events. A significant drop in accuracy is observed when the QLP approaches 0.5. This is because QLP = 0.5 is defined as the threshold distinguishing between quantum light (Fock state) labels and non-quantum light (coherent state) labels. Near this decision boundary, the similarity in the correlation results poses a challenge for accurate classification, which is further indicated in Fig. 5. As shown on the right, the average accuracy across all light categories improves from 72% to 95% as the number of simulation events increases from 100 to 10 000. The accuracy boundary of 90%, achieved with only 800 events, is depicted by the green vertical line.
Accuracy of the CNN-based photon state classifier. Left: Overall accuracy for each light source category. Blue: Laser with a coherent state. Red: Quantum light Fock states. Middle: Averaged accuracy over all light source categories as a function of quantum light probability, which indicates the portion of quantum light in the simulation when it is mixed with a corresponding coherent state. A probability of 0.5 signifies a light source composed of equal parts of quantum light and laser, while 0.0 or 1.0 indicates a pure laser or pure quantum light, respectively. Right: Averaged accuracy over all light source categories as a function of number of detection events. The number of detections is in logarithmic scale for better visualization. The green vertical line indicates the 90% accuracy boundary.
Accuracy distribution of photon Fock state classification, as functions of quantum light probability and the number of detections for each Fock state case. From top to bottom: Accuracy distributions for light sources, comprising |3⟩, |2⟩, or |1⟩ Fock states of quantum light, respectively, each combined with its corresponding coherent state |α⟩, where the average photon number is 3, 2, or 1, respectively. For each case, a quantum light probability of 0.5 signifies a light source composed of equal parts of quantum light and laser, while 0.0 or 1.0 indicates a pure laser or pure quantum light, respectively. Green vertical line: boundary of 90% averaged accuracy over all light source categories, at 800 detection events.
To explicitly illustrate the model’s performance on a case-by-case basis, Fig. 5 presents accuracy heatmaps for the three Fock states, where each pixel is colored according to the average accuracy of all cases with a specific QLP and number of events. The accuracy of coherent state classification is included in each Fock state plot, where QLP is less than 0.5, instead of in a separate plot. In general, the accuracy is significantly enhanced with an increasing number of events across most QLPs. A consistent accuracy drop occurs at the decision boundary with QLP approaching 0.5, which aligns with the trend in the middle panel of Fig. 4. For all three cases, the performance decreases to the left of the green line (the 90% overall accuracy boundary from Fig. 4, right), where datasets contain fewer detection events and have QLPs higher than 0.5. While a relatively low accuracy seems reasonable with few-shot data, the “vertical” asymmetry of accuracy comes from the large data sparsity and uncertainty with higher QLPs. In photon statistics, coherent states with a larger variance are more likely to emit higher photon states,6 which produce recognizable features in the correlation. For higher QLPs, the occurrence of coherent states drops, consequently leading to less distinctive features in the correlation results for recognition. As the number of simulation events increases, the g(3) results tend to stabilize for both coherent states and Fock states, which leads to a quick disappearance of the triangular region with low accuracy.
For the |1⟩ case, the low-accuracy triangle vanishes earlier than for |2⟩ and |3⟩ with growing detection events. According to Table I, the |1⟩ state exhibits larger variations in g(2)(0) and g(3)(0) values with changing QLP, which contributes greatly to the CNN classification. Conversely, the smaller differences for |2⟩ and |3⟩ make them less conducive to recognition with fewer events, resulting in a lower rate of increase in accuracy. At the decision boundary with a QLP of 0.5, the accuracy of the |3⟩ state exhibits a higher rate of increase compared to |1⟩. This is because the relatively higher number of photons from the |3⟩ state facilitates pattern recognition by the model and consequently requires fewer events for accurate classification.
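For orientation, the ideal zero-delay correlations of the pure states can be computed from the textbook expression g(k)(0) = n(n−1)…(n−k+1)/n^k for a Fock state |n⟩, with g(k)(0) = 1 for a coherent state. The sketch below covers only these pure-state limits; the mixed-source values in Table I follow from Eq. (1), which is not reproduced here, and the helper name `fock_gk0` is hypothetical.

```python
from math import prod

def fock_gk0(n, k):
    """Ideal zero-delay k-th order correlation g^(k)(0) of a pure Fock state |n>.

    g^(k)(0) = n(n-1)...(n-k+1) / n^k, which vanishes whenever k > n.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return prod(n - j for j in range(k)) / n**k

COHERENT_GK0 = 1.0  # Poissonian light: g^(k)(0) = 1 for every order k

for n in (1, 2, 3):
    g2, g3 = fock_gk0(n, 2), fock_gk0(n, 3)
    # Gap to the coherent value: the full range available as the QLP varies
    print(f"|{n}>: g2(0)={g2:.3f}, g3(0)={g3:.3f}, gap in g2 = {COHERENT_GK0 - g2:.3f}")
```

In these ideal limits the gap between the Fock and coherent g(2)(0) values is 1 for |1⟩, 1/2 for |2⟩, and 1/3 for |3⟩, consistent with the larger variation noted for the |1⟩ state.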
For the |2⟩ state, the accuracy decreases with more detection events, particularly at intermediate QLPs. Some mispredictions of coherent states can be attributed to the confusion near the decision boundary; as mentioned earlier, the majority of mispredictions involve the |3⟩ state, potentially caused by overfitting.79,80 Given that |2⟩ and |3⟩ exhibit similar values of g(2)(0), the decision boundary g(2)(0) = 0.83 [determined by Eq. (1)] for distinguishing |3⟩ and its coherent state is misused for identifying |2⟩. Consequently, for the |2⟩ cases with g(2)(0) greater than 0.83 (or the QLP exceeds 0.3), misclassification as |3⟩ results in a noticeable drop in accuracy, even with increased numbers of detection events. While the two states possess different g(3)(0) values, the contribution of this single element to the classifier is overshadowed by the abundance of g(2)(0) elements in the correlation matrix. Given that quantum correlation experiments typically involve few photon co-detection events,58 this paper primarily focuses on datasets containing fewer than 10 000 detection events.
To visualize the prediction distribution over photon states, a 3D bar chart of the confusion matrix is shown in Fig. 6. The diagonal elements in green indicate the average accuracy for each state, while the off-diagonal elements in yellow describe the distribution of misclassification. The highest accuracy achieved is 98.7% for the coherent state, while the accuracy for Fock states remains above 90%. This is attributed to more datasets that are simulated for coherent states, which boosts the ML algorithm during the training process. The model showcases a balanced performance in recognizing the Fock states, with the most common incorrect prediction being the coherent state, owing to their similarity near the decision boundary. Another notable misclassification is observed between |2⟩ and |3⟩, which arises from the minor difference in the g(3)(0) values as discussed earlier. Other elements in the error matrix are below 0.1% and can be considered negligible.
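A row-normalized confusion matrix of this kind can be assembled from true and predicted labels as in the sketch below. The class ordering, the function name `confusion_matrix`, and the toy labels are assumptions for illustration and do not reproduce the reported numbers.

```python
import numpy as np

LABELS = ["coherent", "|1>", "|2>", "|3>"]  # assumed class order

def confusion_matrix(y_true, y_pred, n_classes):
    """Row-normalized confusion matrix: entry [i, j] is the fraction of
    class-i samples that were predicted as class j."""
    counts = np.zeros((n_classes, n_classes), dtype=int)
    # Unbuffered accumulation so repeated (true, predicted) pairs all count
    np.add.at(counts, (np.asarray(y_true), np.asarray(y_pred)), 1)
    row_sums = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_sums, 1)  # avoid division by zero for empty rows

# Toy labels (hypothetical): one |1> sample and one |2> sample are misclassified
cm = confusion_matrix([0, 0, 1, 1, 2, 3], [0, 0, 1, 0, 3, 3], n_classes=len(LABELS))
print(np.round(cm, 2))
```

The diagonal of the returned array gives the per-class accuracy, and each off-diagonal entry gives the share of one class that was assigned to another.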
Confusion matrix of the classifier in a 3D bar plot. The correct predictions are represented by the diagonal elements in green, while the off-diagonal elements, indicating incorrect predictions, are depicted in yellow. Blue labels: Laser with a coherent state. Red labels: Quantum light Fock states.
IV. CONCLUSION
In contrast to the fitting methods that can be implemented directly, the machine learning algorithm requires an additional training phase prior to data analysis. However, this trade-off can yield a substantial enhancement of fitting accuracy and efficiency.39 By achieving an average accuracy of 90% with only 800 simulated events, the ML model demonstrates the capability for reliable classification without requiring lengthy measurement time. Another advantage of ML algorithms is their ability to be optimized for recognizing specific data patterns.81 As 2D CNNs have been extensively developed for recognizing spatial features,61 the integration of g(3) with a 2D CNN holds great potential for a novel quantum light classifier, especially in the multiphoton regime.
In Fig. 7, we propose an implementation of the 2D CNN model in quasi-real-time photon state distribution measurements. Examples of different light emitters are included in Fig. 7(a). The schematic at the top shows single-photon emission from exciton recombination in a two-level system, such as a semiconductor quantum dot. The red dots represent the single photons emitted periodically in time, with the photon Fock state labeled as |1⟩. The middle section illustrates a two-photon emitter that emits two indistinguishable photons, represented by the red dots in pairs, upon each excitation, with a Fock state of |2⟩. The photon stream from a continuous-wave laser is shown at the bottom, where each emission occurs randomly in time, exhibiting a coherent photon state |α⟩. These light sources are coupled into an optical fiber system with an HBT configuration, indicated by the dark blue wavy lines in Fig. 7(b). The incident beam path is evenly split into four output paths by the 50:50 beam splitters (BS). Each output is attached to an avalanche photodiode (APD) that records the photon arrival time and transmits the data to the correlation board (CB). The g(3) correlation data are computed online and fed into a Tensor Processing Unit (TPU), serving as a machine learning hardware accelerator. The ML model is pre-trained and optimized to provide real-time photon state classification results, as shown in Fig. 7(c). Because the model fits sparse datasets rapidly, the measurement time on individual samples can be substantially reduced, and immediate feedback from the ML analysis enables real-time parameter optimization.
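As a rough illustration of the online correlation step, the sketch below accumulates unnormalized triple coincidences from three detector channels on a two-dimensional delay grid. It is an assumption-laden sketch rather than the proposed firmware: it uses three of the four channels, treats arrival times as sorted arrays, omits normalization to g(3), and the name `g3_histogram` and its parameters are hypothetical.

```python
import numpy as np

def g3_histogram(t_a, t_b, t_c, max_delay, n_bins):
    """Unnormalized triple-coincidence counts on a (tau_ab, tau_ac) grid.

    t_a, t_b, t_c: sorted 1D arrays of arrival times, in the unit of max_delay.
    Axis 0 bins the delay B - A, axis 1 bins the delay C - A.
    """
    t_a, t_b, t_c = (np.asarray(t, dtype=float) for t in (t_a, t_b, t_c))
    edges = np.linspace(-max_delay, max_delay, n_bins + 1)
    hist = np.zeros((n_bins, n_bins))
    for t0 in t_a:
        # Clicks on B and C within +/- max_delay of the reference click on A
        lo_b = np.searchsorted(t_b, t0 - max_delay)
        hi_b = np.searchsorted(t_b, t0 + max_delay, side="right")
        lo_c = np.searchsorted(t_c, t0 - max_delay)
        hi_c = np.searchsorted(t_c, t0 + max_delay, side="right")
        d_b = t_b[lo_b:hi_b] - t0
        d_c = t_c[lo_c:hi_c] - t0
        if d_b.size and d_c.size:
            # Every (B, C) pair around this A click is one triple coincidence
            pairs_b = np.repeat(d_b, d_c.size)
            pairs_c = np.tile(d_c, d_b.size)
            h, _, _ = np.histogram2d(pairs_b, pairs_c, bins=[edges, edges])
            hist += h
    return hist

# Toy timestamps (hypothetical, arbitrary units): two triple coincidences
hist = g3_histogram([0.0, 10.0], [0.5, 11.5], [1.5, 10.5], max_delay=2.0, n_bins=4)
print(hist)
```

A two-dimensional array of this form is the kind of image-like input that a 2D CNN can consume; the exact preprocessing used for the classifier is not specified by this sketch.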
Experiment layout for measuring quasi-real-time Fock state distribution enabled by machine learning. (a) Light sources including quantum light emitters, specifically two-level systems in a semiconductor quantum dot, with potential photon Fock states such as |1⟩ or |2⟩ (red dots), and an attenuated laser in a coherent state |α⟩ (blue dots) with a Poissonian photon distribution. (b) Optical setup. The photon emission is coupled into an optical fiber and is then equally split into four paths by the 50:50 fiber Beam Splitter (BS). Each path leads to an Avalanche Photodiode (APD) that is connected to the Correlation Board (CB). (c) Correlation data will be analyzed by an optimized CNN model in the Tensor Processing Unit (TPU), serving as a specialized hardware accelerator, which returns the predicted light source category.
With the development of ML software,54,62,82 further enhancement of the presented prototype becomes possible. For example, 2D locally connected layers can be a promising substitute for the CNN layers for capturing the patterns (shown in Fig. 2, left) that are spatially fixed in correlation data. In these layers, the parameter-sharing scheme between convolutional kernels is relaxed, which facilitates the recognition of specific structures.83 However, the model’s capacity will be increased by incorporating multiple such layers. Moreover, an ensemble of similar models can be developed, with each model trained for specific cases, to enhance the overall performance.
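The capacity increase can be made concrete by counting trainable parameters. The sketch below compares a standard 2D convolution, whose kernel is shared across positions, with a locally connected layer that holds a separate kernel and bias at every output position. The 32 × 32 single-channel input, 3 × 3 kernel, and 16 filters are hypothetical values chosen only for illustration.

```python
def conv2d_params(k, c_in, c_out):
    """Trainable parameters of a k x k convolution: one shared kernel plus biases."""
    return k * k * c_in * c_out + c_out

def locally_connected2d_params(h, w, k, c_in, c_out):
    """Trainable parameters of a locally connected layer with unshared weights,
    assuming 'valid' padding, stride 1, and a bias per output position and filter."""
    out_h, out_w = h - k + 1, w - k + 1
    return out_h * out_w * (k * k * c_in * c_out + c_out)

# Hypothetical layer: 32 x 32 single-channel input, 3 x 3 kernels, 16 filters
print(conv2d_params(3, 1, 16))                       # 160
print(locally_connected2d_params(32, 32, 3, 1, 16))  # 144000
```

Under these assumptions, the locally connected layer carries 900 times as many parameters as the convolution, one copy per output position, which is why stacking several such layers raises the model capacity quickly.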
The presented simulation model can be tailored to simulate broader scenarios involving random coherent light with low average photon numbers mixed with quantum emission. This adaptation is particularly relevant because quantum light detection often involves laser emission as the excitation source of quantum emitters. In addition, this model can be further modified to compute dynamic correlations, such as the conditional auto-correlation function (CACF) for studying heralded photons.84 Although the landscape of quantum emitters has been enriched by emerging 2D materials with functional heterostructures and satisfactory fabrication efficiency, such as graphene,85 transition metal dichalcogenides (TMDs),47,86,87 and moiré superlattices,88 the scalability of photonic circuits is often impeded by the rare and random occurrence of quantum emitters,47 which are thought to originate from defects or strain in these atomically thin layers.89,90 This study offers insights into Fock state mapping on 2D materials, rapidly determining the spatial location and assessing the quantum purity of potential quantum emitters.
Recently, novel methods for multiphoton preparation have been demonstrated. Reference 91 proposed a theoretical model for the deterministic generation of large Fock states up to |100⟩ based on the resonant interaction of a coherent state with two-level systems. The experimental generation of eight indistinguishable photons using temporal-to-spatial demultiplexing with single-photon sources has also been demonstrated.92 The proposed combination of g(3) correlation and a 2D CNN introduces new possibilities for categorizing and characterizing multiphoton states.
In addition, this approach opens up the possibility for optimizing photosynthesis through real-time quantum feedback. Reference 93 has shown that the implantation of biocompatible quantum dots (BQDs) enables quantum measurements for monitoring intracellular environments, effectively overcoming the persistent challenge of strong autofluorescence background in traditional spectroscopy of plant cells.94
In this article, we present a machine learning model for quasi-real-time categorization of photon states with g(3) correlation and propose the implementation in experiments. This methodology introduces new feasible solutions for identifying quantum emitters and holds broad applications in quantum metrology within the field of nanophotonics.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Guangpeng Xu: Data curation (lead); Formal analysis (lead); Software (lead); Validation (lead); Visualization (lead); Writing – original draft (lead). Jeffrey Carvalho: Writing – original draft (supporting). Chiran Wijesundara: Writing – original draft (supporting). Tim Thomay: Conceptualization (lead); Funding acquisition (lead); Project administration (lead); Software (supporting); Supervision (lead); Visualization (supporting); Writing – original draft (supporting).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.