This study develops mathematical tools and approaches to investigate spatiotemporal patterns of firearm acquisition in the U.S. complemented by hypothesis testing and statistical analysis. First, state-level and nation-level instant background check (BC) data are employed as proxy of firearm acquisition corresponding to 1999–2021. The relative-phase time-series of BC in each U.S. state is recovered and utilized to calculate the time-series of the U.S. states’ synchronization degree. We reveal that U.S. states present a high-level degree of synchronization except in 2010–2011 and after 2018. Comparing these results with respect to a sitting U.S. president provides additional information: specifically, any two presidential terms are characterized by statistically different synchronization degrees except G. W. Bush’s first term and B. H. Obama’s second term. Next, to detail variations of BC, short-time Fourier transform, dimensionality reduction techniques, and diffusion maps are implemented within a time-frequency representation. Firearm acquisition in the high frequency band is described by a low-dimensional embedding, in the form of a plane with two embedding coordinates. Data points on the embedding plane identify separate clusters that signify state transitions in the original BC data with respect to different time windows. Through this analysis, we reveal that the frequency content of the BC data has a time-dependent characteristic. By comparing the diffusion map at hand with respect to a presidential term, we find that at least one of the embedding coordinates presents statistically significant variations between any two presidential terms except B. H. Obama’s first term and D. J. Trump’s pre-COVID term. The results point at a possible interplay between firearm acquisition in the U.S. and a presidential term.

With a rate of firearm ownership of 1.21 firearms per capita, the U.S. has the largest firearm prevalence among all developed countries in the globe. There are more guns than people. Are gun purchases independently made throughout the country, or is there some form of coordination, collective behavior that drives firearm acquisition in U.S. states? Addressing this question is the first objective of this paper. Next, we investigate how patterns of firearm acquisition evolve over time. We bring forward methodological advances in statistics and engineering to answer the above two questions with data covering the last 22 years. Overall, this effort reveals compelling evidence that the U.S. states act in a highly coordinated manner when it comes to acquiring firearms and that such coordination has a time-dependent character. In view of these, we perform statistical analysis to compare this time-dependent coordination between the U.S. presidents. Results suggest that there exists a possible interplay between a sitting U.S. president and temporal patterning of the BC data.

## I. INTRODUCTION

According to a report from the U.S. Firearms Commerce, private citizens in the U.S. own over $393$ million firearms.^{1} This figure corresponds to $1.21$ firearms per capita, leading to a well-known fact in the U.S.: *we have more guns than people*. A decade ago, gun ownership in the U.S. was $0.9$ per capita:^{2} thus, gun ownership has risen over the last ten years even when adjusted against population growth. These numbers suggest a unique stand of the U.S. in terms of firearm prevalence given that in other high-income countries, such as Germany, France, Canada, Italy, Australia, and Finland, ownership averages only around 0.27 per capita, even with a large number of civilian firearm holdings.^{2}

The publicly available National Instant Criminal Background Check (BC) data are a reliable resource that can help characterize firearm prevalence in the U.S. However, neither every background check results in a firearm acquisition nor does each background check necessarily result in only a single firearm acquisition.^{3} Despite these limitations, BC data have been extensively utilized by researchers as a proxy of firearm acquisition.^{4–11} The data exhibit a clearly increasing trend, especially in the past decade, qualitatively supporting some of the survey results cited above. BC data have been utilized by Timsina *et al.*,^{4} for example, to study the relationship between firearm acquisition and youth gun carrying. Likewise, Porfiri *et al.*^{5,11} examined the BC data to show how fear of stricter gun regulations after a mass shooting causes an increase in firearm purchases with nation- and state-level analyses. More recently, Schleimer *et al.*^{6} and Lang and Lang^{7} studied excess firearm purchases during the COVID-19 pandemic. Many authors also utilized BC data as a proxy to investigate firearm-related homicides, suicides, and accidental deaths.^{8–10}

BC data are available at the nation-level and at the state-level, each set of data enabling different research questions. The nation-level BC data are rich and informative, being available at a daily resolution, but it is only an aggregate of the state-level data. Studies based on monthly resolution state-level data can provide knowledge about how each state contributes to firearm acquisition in the U.S. and, therefore, help inform state-level policy making and characterize state-to-state interactions. Given that the U.S. states are diverse in many aspects, including socio-economic factors, political views, and firearm laws, and that they may even react differently in different political climates, it is of interest to understand whether or not U.S. states behave similarly or differently in their firearm acquisition. Synchronization characteristics of U.S. states in their BC data can provide valuable insight into these similarities and differences.

Political, economical, and social climates can potentially influence the BC data. For example, nation-level BC data have experienced a dramatic increase immediately after B. H. Obama won the election in 2008. According to one source, this was possibly because gun-purchasing communities feared of stricter firearm regulations during President Obama’s term.^{12} Luca *et al.*^{13} found the party in power at the state-level to be influential on the enactment of firearm-related laws upon occurrence of mass shootings. In addition, Eshbaugh-Soha and Peake^{14} offered evidence about particular characteristics of three presidents (R. W. Reagan, W. J. Clinton, and G. W. Bush) when setting agendas on unemployment, inflation, and international economics. Connecting these ideas together, a natural question that arises is how firearm acquisition patterns across states compare to presidential terms. With BC data available since 1999, such an investigation can cover five terms of U.S. Presidency, namely, two terms each of G. W. Bush and B. H. Obama and one term of D. J. Trump.

Here, we study synchronization patterns in BC data across states. Synchronization is calculated based on the phase between the states, where phase is defined as the instantaneous angle between two oscillators, both at the same frequency. Specifically, phase time-series of each U.S. state are extracted from annual oscillations in the data. These time-series are then used to calculate synchronization time-series among the U.S. states based on the Kuramoto order parameter.^{15} Analyzing nation-level BC data at relatively high frequencies offers opportunities in understanding firearm acquisition characteristics among the U.S. states. Indeed, the fast-varying content in the detrended and seasonally adjusted BC data was recently employed in an information theory-based approach^{5} to investigate causality between time-series of BC, media output, and mass shootings. Recognizing the rich information contained in the fast-varying content of BC data, we investigate the energy at relatively high frequencies in the spectrum of nation-level BC data using short-time Fourier transform. However, the arising power spectrum data are high dimensional and opaque to interpretation, thereby requiring a dimensionality reduction technique with clustering analysis to detect variations.

This study is focused on explaining whether or not U.S. states present a synchronized behavior in their firearm acquisition and how firearm acquisition compares with respect to the most recent five U.S. presidential terms. To this end, literature on the main methods used in this paper is reviewed in Sec. II, and data are described in Sec. III. Research methods and results pertaining to synchronization based on state-level BC data are provided in Sec. IV A and Sec. V A, with their robustness assessment included in Appendix A. Time and frequency domain analysis of nation-level BC data based on short-time Fourier transform, dimensionality reduction, and clustering is presented in Secs. IV B and V B. The paper ends with a discussion in Sec. VI.

## II. LITERATURE REVIEW

### A. Phase synchronization and the Kuramoto order parameter

Interest in studying synchronization possibly started with the demonstration of how oscillations of two pendulums closely positioned on a foundation eventually reach a “common rhythm” due to weak mechanical coupling between them via the foundation.^{16} Study of synchronization is much broader than only mechanical systems. For example, synchronization appears in economics,^{17} neuroscience^{18–20} and human social behavior.^{21–23} Based on the nature of data available, various mathematical tools, including phase synchronization,^{24} cross correlation,^{18} event synchronization,^{25} and nonlinear interdependence,^{26} are commonly utilized for investigating synchronization.

Given that BC data for most U.S. states exhibit annual oscillations, a natural choice for our study is to analyze such oscillations in terms of their phase synchronization. Phase synchronization can be studied with oscillatory systems in a narrow frequency band.^{27} This phenomenon is associated with the phase-locking degree and can be understood as the relative phase between these systems.^{21,28} A well-established approach to measure the coherence of phase among a group of oscillatory units is to first recover their instantaneous phase and then utilize this phase information to calculate the Kuramoto order parameter.^{15,22,29,30} The magnitude $r$ of this order parameter will then help quantify the instantaneous degree of synchronization among the units in the group (see Methods and Analysis in Sec. IV).

The quantity $r$, which we call synchronization degree, varies between zero and one, where larger values indicate a larger relative degree of synchronization. This mathematical construct has been utilized with promising results in the literature. For example, in an experiment where a number of participants sat together and rocked their chairs at the same time, the authors reveal a distinctive synchronization degree for the movement of rocking chairs utilizing $r$ in statistical tests.^{22} In another study^{31} based on multiple simulations, $r$ is used to determine the effects of frequency mismatches on the synchronization of coupled oscillators.

### B. Hilbert transform for phase recovery

Before phase synchronization among oscillating units can be studied, the phase of each of them needs to be calculated. This can be done by following two common techniques, namely, Hilbert transforms^{32–34} or wavelet transforms.^{19,35} These two transforms are intrinsically related,^{27,36–38} and their premise is the definition of an analytical signal from the given time-series. This signal is a complex function whose amplitude and phase provide an estimate of the instantaneous amplitude and phase of the time-series.

Hilbert transform has proven to be a powerful tool in phase recovery in diverse research fields, from neuroscience,^{38,39} signal processing,^{40} and acoustics^{41} to aeronautics^{34} and human behavior.^{21} In terms of computational implementation, it can be calculated via several algorithms, such as the fast Fourier transform^{32} and discrete convolution.^{42} The Hilbert transform is applicable to time-series with narrow frequency bands,^{43} but real-world data, such as the BC time-series at hand, are not necessarily narrow band. For this reason, a common practice is to first pre-process the time-series via filtering.^{27} Other limitations in implementing the Hilbert transform include the influence of the product theorem of the Hilbert transform^{44,45} and edge effects.^{44} In Sec. IV, we focus on Hilbert transforms, and, to address these limitations, we include an equivalent analysis with wavelet transforms in Appendix A.

### C. Diffusion maps for clustering

By applying mathematical and computational tools to analyze raw data, we often generate high-dimensional data that must be systematically interpreted to explain the underlying characteristics of the original, raw data. However, this is a challenging task due to the high-dimensional nature of the generated data. Originating from the notion of random walks, diffusion maps provide a useful and practical approach to analyze high-dimensional data in an effective manner via dimensionality reduction and spectral clustering.^{46} Diffusion maps have been instrumental in helping explain high-dimensional data arising in the study of animal behavior,^{47,48} nonlinear dynamical system,^{49} network anomaly detection,^{50} and document classification.^{51}

Technically speaking, diffusion maps extract the intrinsic geometry in a given data set by constructing a network between the points in the data set and identifying pair-wise transition probabilities between any pairs of such points. This is achieved based on a so-called “diffusion distance,” which, in some sense, combines all the paths of random walk transitions between the points. As per the definition of “diffusion distance,”^{46,52} diffusion maps offer a robust means to reveal the geometry underlying the data set even in the presence of noise and perturbations. In this sense, they are more preferred over other commonly utilized linear methods, such as principal component analysis,^{53} and nonlinear dimensionality reduction methods, such as ISOMAP.^{51,54} Readers are referred to Kolpas *et al.*^{47} and Aureli *et al.*^{48} where authors demonstrate the efficacy of utilizing diffusion maps in studying collective motion.

Considering that real-world data, such as the BC time-series at hand, may be subject to noise and uncertainty, the choice of diffusion maps is well justified. Performing the diffusion map analysis of the BC time-series over large windows, on the order of years, will also help reduce the effects of noise. Research in collective group behavior provides further support and parallelism for the current study. For example, diffusion maps are utilized in Ref. 48 to reveal typical coherent patterns of animal groups and their transitions between these patterns. The current effort has analogous elements to the cited study in that we aim to reveal whether or not there exist certain patterns of behavior among the U.S. states in their BC data and how these patterns, if they exist, are similar or different in a presidential term.

## III. DATA

National instant criminal background check (BC) data available publicly are utilized for analysis.^{55} These data are available in two forms: (a) for each state at a monthly resolution and (b) aggregate of all the states as national data at a daily resolution. These data are plotted in Appendix B. The data range from $1/1999$ to $4/2021$ is 268 months in total. The state of Hawaii is not included in our study since BC data for this state are not available. Although the state of Connecticut has missing data from $1/2000$ to $8/2001$, this state is still included by considering zero values in place of the missing data.

When necessary, the data are treated in time windows corresponding to a presidential term. Since the last year of D. J. Trump’s term might be impacted by the COVID-19 pandemic, only the first three years of his Presidency are considered in our calculations. Data corresponding to D. J. Trump’s fourth year in the Office are separately treated. G. W. Bush’s first and second terms are, respectively, labeled with Bush (1) (1/2001–12/2004) and Bush (2) (1/2005–12/2008); B. H. Obama’s first and second terms are denoted by Obama (1) (1/2009–12/2012) and Obama (2) (1/2013–12/2016), and the first three years of D. J. Trump’s term as Trump (pre-COVID-19) (1/2017–12/2019).

## IV. METHODS AND ANALYSIS

Two main directions are pursued: (1) using state-level monthly BC data, we calculate phase time-series of each U.S. state to estimate synchronization degree among the states (Sec. IV A); and (2) using nation-level daily BC data, we investigate how the energy in the BC data distributes over time and frequency and implement diffusion maps and clustering techniques to explain how this power shift is similar or different in presidential terms (Sec. IV B). In the following, we provide the mathematical procedures to pursue (1)–(2). In all statistical analyses, we provide proper terminology when constructing hypotheses and comparing presidential terms vis-à-vis research directions (1) and (2); however, these analyses should not be intended as causal links from a president or a presidential term to the outcomes of (1) and (2).

### A. Analysis of state-level monthly background check data

By observing BC time-series (see Appendix B), a cyclic pattern is clear for most U.S. states where BC reaches peaks in winters and valleys in summers. That is, BC data exhibit a seasonality effect in *annual oscillations* with a frequency of $1/12\u22480.0833$ [1/months]; by factoring in some level of margin, we consider annual oscillations to be in the range of $[0.06,0.11]$ [1/months]. With data sampled every month, the frequency spectrum of BC data is in the range of $[0,0.5]$ 1/months. Hence, the frequency range of annual oscillations is about $10%$ of the whole spectrum. Next, BC time-series are detrended to ensure stationarity for power spectral density estimation. On average, the power of *annual oscillations* is found to be $35%$ of the total power for all U.S. states considered.

Since annual oscillations are the most prevalent for their relatively large power ($35%$) in a relatively small frequency band ($10%$), we focus on them for the study of synchronization. To this end, we view BC data of each U.S. state as an oscillator unit. Our approach is sketched in Fig. 1 and is made of three main steps (1)–(3):

- (1)
pre-process raw BC data by

*reflection*and*filtering*(see Sec. IV A 1), - (2a)
construct an analytical signal by implementing Hilbert transform on the pre-processed BC data of each U.S. state (see Sec. IV A 2),

- (2b)
compute phase time-series of each U.S. state (see Sec. IV A 2),

- (3a)
calculate the synchronization degree $r$ time-series based on the definition of the Kuramoto order parameter (see Sec. IV A 2), and

- (3b)
shuffle the phase samples from (2a) in multiple trials to generate surrogate data and repeat step (3a) on these data to obtain a distribution for $r$ (see Sec. IV A 3).

At step (1), data are *pre-processed* by *reflection* and *pre-filtering*. Reflection is required to minimize artifacts in phase calculations in step (2).^{44} Pre-filtering to a desired narrow band^{27,36} is necessary since the Hilbert transform is applicable to only narrowband signals.^{27} At step (2), the discrete-time Hilbert transform on the pre-processed BC data yields an analytical signal for each U.S. state [step (2a)], from which one can calculate the instantaneous phase time-series [step (2b)]. Phase values are aggregated to compute the $r$ time-series magnitude of the Kuramoto order parameter [step (3a)]. The $95$th percentile of $r$ distributions from step (3b) based on surrogate data can then be compared against $r$ from step (3a), in each presidential term, to exclude that synchronization of U.S. states is due to chance.

Once steps (1)–(3) are completed, we state the hypothesis that *the synchronization degree $r$ depends on the presidential term*. One-way ANOVA $F$-test^{56} is conducted with all $r$ values in their respective presidential terms. The null hypothesis is that *the population means of $r$ in the five terms are all equal*. A small $p$-value ($p<0.05$) indicates that the null hypothesis can be rejected, and hence, different presidential terms have impact on the value of $r$. If the null hypothesis is rejected, then a *post hoc* test is conducted with Tukey’s approach between pairs of terms to reveal which two terms are different from each other $(p<0.05)$.

#### 1. Pre-processing and pre-filtering of data

In the implementation of the Hilbert transform, one must mitigate errors due to *edge effects* caused by the finite length of time-series to be analyzed. This issue can be addressed by following a procedure called *reflection* or *symmetric extension*.^{44,57} Let the original time-series of BC be denoted as $y[n]$ with $n=1,\u2026,Nt$ and $Nt$ being the total number of months included in the analysis. We process the time-series as follows:

The output of this procedure, $y^[n]$, is a time-series where the original BC time-series $y[n]$ is placed in between its mirrored duplicates. In what follows, $y^[n]$ is utilized for further analysis.

Since the Hilbert transform is effective on narrowband time-series, the data $y^[n]$ should further be pre-processed via a bandpass filter.^{27,36} Practically speaking, it is difficult to interpret the meaning of the phase if $y^[n]$ contains frequencies that span annual or even slower long-term oscillations as well as faster seasonal oscillations. To address this issue, one should focus on a particular frequency band. Given that annual oscillations are dominant in the BC data (see Sec. IV A), a bandpass filter is designed to extract these oscillations. Specifically, we implement a fixed-order finite impulse filter (FIR filter), chosen for its excellent linear phase response with respect to frequency. The filter bandpass corresponds to the frequency range $[0.06,0.11]$ [1/months]. By filtering $y^[n]$, we obtain $y^f[n]$, which is then used in the Hilbert transform as explained in Subsection IV A 2.

#### 2. Instantaneous phase and synchronization degree

The Hilbert transform can help reveal the instantaneous phase of a given signal $S[n]$. In principle, this transformation can be seen as a way to create an analytical signal $z[n]$ with complex entries, whose real part is $S[n]$ and imaginary part is a signal that is $90\xb0$ phase shifted with respect to $S[n]$. The phase of these complex entries determines the instantaneous phase of $S[n]$. The key characteristics of $z[n]$ in the frequency domain are that it has twice the magnitude at positive frequencies compared to the original signal, and its entire negative frequency spectrum is filtered out.

A signal $z[n]$ with the above-listed spectrum characteristics can be obtained by applying a Fourier transform followed by an inverse Fourier transform,^{32} without having to manipulate the signal in the time domain. For this, we introduce $Y[m]$ as the $Ns$-point fast Fourier transform of $y^f[n]$ with length $Ns$ and construct the frequency domain signal $Z[m]$,

where $Ns$ is even. The magnitude is set to zero for negative frequencies and doubled for positive frequencies. By applying next an $Ns$-point inverse discrete-time Fourier transform to $Z[m]$, we ultimately recover the analytical signal $z[n]=zR[n]+izI[n]$, where $i$ is the imaginary number. The phase of $z[n]$ is obtained by

which is the instantaneous phase of $y^f[n]$.

Working with the BC data set, we have a total of $N$ phase time-series, $\varphi j[n]$, $j=1,\u2026,N$, and $N$ is the number of states. Then, the complex-form Kuramoto order parameter^{15} is given by

where the instantaneous magnitude $r[n]=|k[n]|$ measures the *coherence of phases*, which we refer to as synchronization degree among the U.S. states. The parameter $r$ takes values between $0$ and $1$, where $r=1$ means that units are perfectly synchronized with all their phases identical, while $r=0$ means no coherence among them with their phases uniformly distributed in their respective ranges (in the limit $N\u2192\u221e$). The magnitude $r$ of the Kuramoto order parameter is originally defined for a large number of oscillators, $N\u2192\u221e$.^{15,23} However, $r$ has also been utilized in experimental work with a finite number of oscillators.^{22}

Finally, the last step is to remove the data that were inserted into the analysis to avoid edge effects (see Sec. IV A 1)^{44} and adjust the time delay introduced by the FIR filter.^{33} With this, the segment of $r[n]$ of interest is given by $r[n],n=Nt+o2+1,\u2026,2Nt+o2$, where $o$ is the order of the FIR filter and the data at $n0=Nt+o2+1$ corresponds to January 1999.

#### 3. Permutation test for the synchronization degree

As a rule of thumb, the magnitude of the Kuramoto order parameter satisfying $r>0.8$ is indicative of strong synchronization among coupled dynamical systems with large coupling strengths.^{58,59} On the other hand, one should assess whether or not such $r$ values can occur just by chance due to the specific structure of the data.

To generate chance values, one should create surrogate data through permutation of phase values of each U.S. state. Actual $r$ values calculated in Sec. IV A 2 should then be contrasted against surrogate data to evaluate how likely/unlikely these $r$ values are to happen due to chance. Given a time window of a U.S. president, this window contains $T$ number of phase values of $49$ states. We can think that these data form $49$ by a $T$-dimensional matrix, $\Phi ={\varphi j[n]}$, where $\varphi j[n]$ is the phase of state $j$ at time $n$, with $n=1,\u2026,T$ and $j=1,\u2026,49$. Surrogate data are generated by following these steps:

shuffle by randomly selecting one entry from each row of $\Phi $ to obtain 49 phase values, each from a different state, and collect these values in a vector $J$ with 49 entries of phase values;

use the phase entries in $J$ to compute one $r$ value; and

repeat (a) and (b) in multiple trials to generate the distribution of $r$.

The above procedure is repeated separately for each time window of Presidency, yielding a distribution of $r$ based on surrogate data (see Fig. 2). Next, we calculate the $95$th percentile of each distribution and state that values of $r$ obtained in Sec. IV A 2 that are above this percentile in a particular time window are not to be attributed to chance.

### B. Analysis of nation-level daily background check data

BC data carry rich information in the high frequency spectrum,^{5} and hence, understanding how the power in these data distributes and potentially shifts over time with the presidency is of strong interest. This question can be studied with BC data available at a daily resolution at the nation-level, where considered frequencies span seasonal and higher-frequency oscillations with periods in the range of $[20,110]$ days (or, equivalently, frequencies in the range of $[0.27,1.5]$ 1/month). First, the national BC data must be pre-filtered by removing the frequency components at relatively lower frequencies. This pre-filtering is performed based on fixed-order finite impulse response filter, following a procedure similar to that in Sec. IV A 1.

Next, the short-time Fourier transform (STFT) of the filtered data is obtained. Specifically, a time segment with a fixed size of $730$ days (i.e., two years) is moved over the data by $30$ day increments; that is, adjacent time segments have an overlap of $700$ days (see details in Sec. IV B 1). In each time segment, the power spectral density (PSD) is calculated based on the Fourier transform. This calculation leads to a 3D visualization of the power with respect to frequency and time.

Due to the high-dimensional nature of the STFT data, a dimensionality reduction/clustering procedure is adopted. To this end, we utilize an approach based on diffusion mapping, where the intrinsic geometry in a given data set can be constructed as a network between the points in the data set and by identifying pair-wise transition probabilities between any pairs of such points. By combining all the paths of random walk transitions between the points, one can define what is known as the “diffusion distance” and express the geometry of the data in the diffusion space. This representation can be further reduced to a lower dimension by observing that only a few of the dimensions are sufficient to explain the main features of this geometry. Low-dimensional embedding of the original data provides a practical means to view the structure of the data.

The procedure starting with national BC data and ending with the low-dimensional embedding space is sketched in Fig. 3 and can be summarized in the following steps:

nation-level daily BC data are filtered to the target frequency band;

the filtered series is divided into multiple equal-length overlapping time segments (see Sec. IV B 1 for details);

short-time Fourier transform is implemented separately on all time segments, and the Fourier coefficients corresponding to each frequency are obtained. For a time segment $i$, these coefficients are squared to obtain the power at each frequency and collected in a vector $V(i)$. Next, each entry of $V(i)$ is normalized by the sum of its entries (see Sec. IV B 1 for details);

dimensionality reduction for $V(i)$ is performed by diffusion maps to find low-dimensional embedding and analyze the low-dimensional character of the data set. More specifically, at this step, a similarity matrix is constructed based on the Euclidean distance between pairs of these vectors, $V(i)$ and $V(j)$. This similarity matrix is normalized to obtain the transition probability $M(i,j)$ from point $i$ to point $j$, where $M$ is the Markov matrix. The eigenvalues and eigenvectors of matrix $M$ inform whether or not a low-dimensional representation of the information in $V$ is possible. If there is a spectral separation among the eigenvalues of $M$, then such a low-dimensional representation is feasible where an embedding coordinate $Q(i)$ is obtained corresponding to $V(i)$ (see Sec. IV B 2 for details).

It is possible that the data points in the embedding coordinates associated with a presidential term would form a coherent cluster and separate from the clusters of data points corresponding to other terms. In each coordinate separately, a one-way ANOVA F-test is conducted with data grouped by their respective presidential terms. The null hypothesis is that *the population means of data in an embedding coordinate in five presidential terms are all equal*. A small $p$-value ($p<0.05$) indicates that the null hypothesis can be rejected, and hence, different presidential terms have impact on the samples in that coordinate. If the null hypothesis is rejected, then a *post hoc* test is conducted with Tukey’s approach between pairs of terms to reveal which two terms are different from each other $(p<0.05)$. We remark here that the above dimensionality reduction and clustering analyses are conducted on the STFT data, not on BC time-series. The reason for this choice is that relevant information lies in the frequency domain.

#### 1. Windowing and short-time Fourier transform

Denote with $pf[n]$ the pre-filtered data of the original time-series of daily national BC data $p[n],n=1,\u2026,L$, where $L$ is the total length in days. We use the FIR filter introduced in Sec. IV A 1, but since the focus is on higher frequencies, the passband here is associated with oscillations of periods $[20,110]$ days. Next, filtered national BC data are divided into time segments and the Fourier transform in each segment is expressed by the short-time Fourier transform (STFT),

where $STFT[\u22c5]$ denotes the short-time Fourier transform, $\omega $ is the frequency, and $w[n]$ is the windowing function. A typical windowing function smooths the values of pre-filtered time-series to avoid spectral leakage in short-time Fourier transforms.^{60} However, BC data potentially carry seasonal characteristics, and hence, the length of the time segment should be sufficiently large to capture such effects. In this study, the segment size is selected with a length of two years ($Nw=730$ days).

BC data present characteristically different frequency properties in winters vs summers. Therefore, when an edge-smoothing windowing function (such as the Blackman-Harris or Hamming window) is implemented on these data, the edges of the time segment corresponding to winters and summers will be smoothed as the time segment slides. Since winters and summers have different frequency properties, this implementation can lead to undesired loss of information. There are possibly two ways to fix this issue: one is to change the length of the time segment and the other is to use a different windowing technique. If the time segment is enlarged, it will register loss of time resolution, which is not preferred since we need sufficient time resolution to be able to resolve presidential terms. If the length of the time segment is decreased, we will experience a loss of frequency resolution. Since BC data present seasonal effects, the time segment should not be selected smaller than one year.

A more convenient way to fix the above issue is to use a windowing technique that does not have an edge-smoothing feature. For this purpose, we utilize a rectangular window. However, this window is prone to spectral leakage, meaning that some portion of the power in the fundamental frequency of BC data will leak to the rest of the frequency spectrum. We calculate the leakage factor as $9.3%$, indicating that only $9.3%$ of the total spectral power is spread over the side lobes of the spectrum. We suggest that this is a reasonable trade-off since it is more important to avoid artifacts due to edge-smoothing that can mask the seasonality effects in the data. Moreover, since the BC data here are pre-filtered before being convoluted with a rectangular window, we find that the power in the frequency spectrum of interest still dominates the power in the remainder of the spectrum. As a result, some level of spectral leakage does not significantly affect the analysis.

With the above understanding, the rectangular window is defined next,

where $Nw$ is the length of each time segment. Here, the adjacent time segments are apart from each other by $30$ days. That is, for all selection of $m\u2208[m1,m2,\u2026,mT]$, $mi+1\u2212mi=30$, $m1=365.5$, $T=248$. Furthermore, in a time segment $i$ of the data, the power distribution of frequency components is collected in a high-dimensional vector $V(i)$ and each entry of $V(i)$ is normalized as follows: $V(i)=1\u2211\omega P2(mi,\omega )[P2[mi,\omega 0],P2[mi,\omega 1],\u2026,P2[mi,\omega Nyquist]]$.

#### 2. Spectral clustering with diffusion maps

For the obtained set of vectors $V(i)\u2208RNb,i=1,\u2026,T$, where $Nb=366$ is the number of discrete Fourier transform bins, we introduce the dimensionality reduction method, diffusion maps, to find the low-dimensional embedding of the data. This method is executed as explained next.^{46,52,61} We define a similarity matrix $W\u2208RT\xd7T$,

where $d(V(i),V(j))$ is the Euclidean distance between the vectors $V(i)$ and $V(j)$, with $\sigma =0.2dmax$, where $dmax$ is the maximum Euclidean distance between two different vectors in the dataset $V(i)$. Clearly, $W(i,j)=W(j,i)$; that is, $W$ is symmetric. By normalizing $W$ by the diagonal matrix $D$, we obtain the Markov matrix $M=D\u22121W$, where $D(i,i)=\u2211jW(i,j)$. Then, after transforming $M$ into $Ms=D1/2MD\u22121/2$, we can write $Ms=D1/2D\u22121WD\u22121/2=D\u22121/2WD\u22121/2$, and therefore, it is easy to see that $Ms$ is a symmetric matrix. Due to symmetry, this matrix has real eigenvalues, and it can be decomposed as $Ms=S\Lambda S\u22a4$, where $\Lambda $ is a block-diagonal matrix containing the eigenvalues $\lambda k$ of $Ms$, $S$ is a matrix with orthogonal eigenvectors $\gamma i$ of $Ms$, and $\u22a4$ denotes matrix transposition.

Moreover, we can write $M=D\u22121/2S\Lambda S\u22a4D1/2=(D\u22121/2S)\Lambda (D1/2S\u22a4)\u22121$. Thus, $Ms$ has the same eigenvalues as $M$ and the eigenvectors of $M$ and $Ms$ are related as $\varphi i=D\u22121/2\gamma i$. Noting that the largest eigenvalue of $M$ is one by definition, the remaining eigenvalues are to be used to determine how to proceed with dimensionality reduction. Specifically, one sorts the eigenvalues in descending order as $\lambda 2,\u2026,\lambda T$ and inspects the relative distance, i.e., the spectral gap between them. This will inform which one of the larger eigenvalues is to be selected over smaller ones for dimensionality reduction, and the eigenvectors corresponding to the selected eigenvalues will define the low-dimensional embedding coordinates.^{46–48} For $V(i)$, the coordinates of the corresponding points in the low-dimensional space will be denoted by $Q(i)=(\varphi 2(i),\varphi 3(i),\u2026,\varphi k(i))$, considering $k\u22121$ eigenvectors are kept in the low-dimensional representation.

## V. RESULTS

### A. The synchronization degree varies with the presidential term

The time-trace of the synchronization degree $r$ of the U.S. states is shown in Fig. 4. We caution the reader that synchronization is associated with the relative phase between the U.S. states, but seasonality characteristics of the BC data do not automatically imply synchronization.

During G. W. Bush’s terms, the $r$ value is always at relatively high levels with a mean of $r=0.939$. One year after B. H. Obama takes the Office, however, the value of $r$ starts dropping until 2011 when it reaches its lowest value of $r=0.826$ in that term. Then, it grows back to values to around the same levels of G. W. Bush’s terms. Throughout B. H. Obama’s second term, the value of $r$ holds steady at relatively high levels. In D. J. Trump's pre-COVID-19 term, the value of $r$ holds steady initially, but within a year, it drops to $r=0.635$. The drop continues until D. J. Trump’s last year in the Office during the COVID-19 pandemic. In permutation tests, all these values are different from chance values marked with horizontal dashed lines in Fig. 4. See also Appendix A where an alternative analysis provides support for the robustness of results in Fig. 4.

A one-way ANOVA $F$-test is performed next to study how synchronization of U.S. states compares with presidential terms. We find that $r$ is affected by presidential terms ($F(4,223)=22.93$, $p<0.001$; see also the violin plots in Fig. 5). Furthermore, *post hoc* comparisons identify that synchronization in the following pairs of presidential terms is statistically different ($p<0.05$): Bush (1)–Bush (2), Bush (1)–Obama (1), Bush (1)–Trump (pre-COVID-19), Bush (2)–Obama (2), Bush (2)–Trump (pre-COVID-19), Obama (1)–Obama (2), Obama (1)–Trump (pre-COVID-19), and Obama (2)–Trump (pre-COVID-19). That is, synchronization among the U.S. states in annual oscillations in their BC data is statistically different between any two terms of Presidency except when comparing Bush (2) with Obama (1) and Bush (1) with Obama (2).

### B. Nation-level background check data embed on a low-dimensional manifold that varies with the presidential term

We present in Fig. 6 the eigenvalues of matrix $Ms$ associated with the diffusion map in descending order. A spectrum gap is obvious in this figure between the second eigenvalue $\lambda 2=0.75$ and the remaining smaller eigenvalues and a smaller gap between the 3rd eigenvalue $\lambda 3=0.45$ and the remaining smaller eigenvalues. This result suggests that we can view the high-dimensional STFT data in a two-dimensional embedding space with respect to the second and third eigenvectors, $\varphi 2$ and $\varphi 3$, of the diffusion map. In Fig. 7, the embedding space is depicted where each data point in this space is representative of a time segment in which PSD was calculated. These points collectively describe a low-dimensional representation of the original high-dimensional STFT data. Recalling that the STFT data are made of multiple time segments, the STFT data points in the $i$th time segment correspond to a single data point on the 2D embedding plane marked by $Q(i)=(\varphi 2(i),\varphi 3(i))$.

Analogous to the study of animal groups,^{47} the diffusion map here provides a representation of collective behavior of firearm acquisition in two independent coordinates $\varphi 2(i)$ and $\varphi 3(i)$ associated with $\lambda 2$ and $\lambda 3$, see Coifman and Lafon,^{46} where each coordinate provides critical information about the data. Moreover, since each data point in Fig. 7 corresponds to a time segment $i$ in which PSD is performed, we can color these data points depending on which U.S. presidential term they belong to. In Fig. 7, a clustering feature is obvious based on color coding of the markers. Clearly, Bush (2) separates out from the markers associated with Obama (1)–(2) and Trump (pre-COVID-19). Markers corresponding to Obama (2) are also separate from the data associated with Obama (1) and Trump (pre-COVID-19).

Specifically, in the first coordinate $\varphi 2(i)$, we find that data are distinguished in two separate clusters; one corresponds to G. W. Bush’s both terms ($\varphi 2(i)<0.5$) and the other covers B. H. Obama’s two terms combined with that of D. J. Trump’s ($\varphi 2(i)>0.5$). Furthermore, we observe that the data associated with G. W. Bush’s second term are much more spread along this coordinate, suggesting that the evolution of firearm acquisition during this term exhibits different characteristics with respect to G. W. Bush’s first term. With regard to the coordinate $\varphi 3(i)$, we gather additional information for the point cloud at $\varphi 2(i)>0.5$. More specifically, $\varphi 3(i)$ indicates that this point cloud is characterized by three separate clusters, where one is related to only B. H. Obama’s second term ($\varphi 3(i)<\u22121$), the other with B. H. Obama’s first term combined with D. J. Trump’s ($2>\varphi 3(i)>\u22121$), and the third one with black markers associated with the transition from G. W. Bush’s second term to B. H. Obama’s first term ($\varphi 3(i)>2$). With the help of diffusion maps, the diffusion coordinates help reveal relevant information as to the similarities and differences in the data associated with presidents’ terms (color coded).

Last, following the methods presented in Sec. IV B, data are statistically analyzed along each embedding coordinate $\varphi 2$ and $\varphi 3$ separately. One-way ANOVA in each embedding coordinate points at variations with the presidential term ($F(4,108)=596.24$, $p<0.001$ for $\varphi 2$; $F(4,108)=278.57$, $p<0.001$ for $\varphi 3$). *Post hoc* tests reveal that any two presidential terms are statistically different ($p<0.05$) in at least one of the embedding coordinates except between Obama (1) and Trump (pre-COVID-19); see also Fig. 8.

## VI. DISCUSSION

With its unique legal landscape forged by historical elements, the U.S. is a country where private citizens satisfying certain conditions are eligible to purchase firearms. Since data as to how many firearm sales transactions are made in the U.S. are not publicly available, researchers have turned toward tracking background check data, available from the Federal Bureau of Investigation (FBI), as a proxy of firearm acquisition in the U.S.^{10} These data only capture the number of background checks performed: a background check may not necessarily indicate a firearm purchase, and a single background check can lead to multiple firearm purchases.^{3} Despite these limitations, BC data are still considered to be a reliable resource with which researchers can perform various analyses associated with firearm prevalence in the U.S.^{4–11}

There are various reasons that studying firearm prevalence in the U.S. is of value to the research community. For example, it is of strong interest to understand why affinity to firearms is much stronger in the U.S. than it is in other comparable benchmark countries.^{2} This question relates to how humans choose to acquire firearms or not, and how potentially their socio-economical status, country’s history and legal landscape, educational level, income level, and political views contribute to such choices.^{62} Moreover, humans react only to certain events, and they react differently toward different events. How such reactions contribute to their choices can guide us to better understand human behavior and elucidate how such decisions contribute to governance, legal landscape, political debate, and ultimately influence firearm acquisition in the U.S.

In this paper, we studied spatiotemporal patterns of BC data in the U.S. and compared the results with presidential terms. This was done by taking a rigorous mathematical/computational approach on the available BC data. Although the U.S. states have their unique characters, some are similar to some others in many ways, such as geographically, politically, and socio-economically. We investigated whether or not there exists coordination among the states in terms of their BC data. With annual oscillations predominantly present in the data, the focus was to understand how such oscillations compare in terms of their relative phases with respect to each other. To study these phase relationships, we calculated a metric called a synchronization degree, with strong roots in the study of synchronization in diverse fields of nonlinear dynamical systems.^{15} We determined that the U.S. states are highly synchronized in their BC data except for about two years during the first term of B. H. Obama and during almost all the pre-COVID term of D. J. Trump. Alternative approaches were also provided to show evidence that the results have a strong degree of robustness. Statistical tests indicate that synchronization levels among the U.S. states vary with presidential terms except when comparing the second term of G. W. Bush with the first term of B. H. Obama and the first term of G. W. Bush with the second term of B. H. Obama.

Although we have not looked into the root causes of synchronization, it is reasonable to assume that nation-level common drivers contribute to orchestrated behaviors of firearm-purchasing communities. In view of Rogers^{63} and Melgar,^{64} for example, it is tenable that firearm sales follow seasonality cycles with peaks during the holiday season. The presence of such peaks in the BC data^{55} aligning with the holiday season further supports seasonality effects (see Appendix B for state-level BC data). On the other hand, root causes of synchronization cannot be attributed to only common drivers. More specifically, published work indicates key coupling among some U.S. states, for example, in terms of business cycles,^{65} house prices^{66} as well as firearm acquisition.^{11} Hence, it is plausible that synchronization among the U.S. states in terms of BC data is a combination of the effects of common drivers as well as state-to-state couplings.

It is critical to note that seasonality in the data does not imply synchronization. Synchronization is associated with the phase difference between time-series of the same frequency; in this vein, temporary loss of synchronization is indicative of increased variability in the phase of oscillations. Such a disarray in the BC data is detected around 2010–2011 after one and a half years into the first term of B. H. Obama and in a much more pronounced manner during the pre-COVID term of D. J. Trump. While synchronization sheds light on the degree of organization/disorganization among the U.S. states in their BC data, such an analysis has further implications, in particular, in our pursuit to thoroughly understand variations in firearm acquisition in the U.S. For example, it would be possible to investigate the relationship of synchronization, or lack thereof, to various attributes and/or extrinsic effects. In the current study, we have compared synchronization with presidents’ terms, and opportunities in this direction include the study of which U.S. states might be potentially influencing each other (network characterization, network identification), which states might be behaving more independently in their firearm acquisition (leaders vs followers), and how legal landscape influences the behavior of those states (stricter vs permissive laws).

BC data contain rich information, especially at relatively high frequencies. This was leveraged, for example, in a recent study where information carried by high frequency noise has been statistically studied to explain the causal links between background checks, media output, and mass shootings.^{5} Motivated by this observation, we examined nation-level BC data at relatively high frequencies with the goal of understanding how the energy in the frequency spectrum possibly shifts over time and how such an energy-shift compares with a presidential term. Short-time Fourier transform was combined with diffusion mapping and clustering techniques to compactly represent the dominant geometric features of the BC data on two-dimensional diffusion coordinates. This representation in essence demonstrates how the energy in the BC data diffuses over time. Specifically, in diffusion coordinates, we observe data clusters from one term to another term of a president. Statistical tests indicate that such clusters are different between any pairs of presidential terms in at least one of the diffusion coordinates except between the first term of B. H. Obama and the pre-COVID term of D. J. Trump.

The nation-level analysis indicates the differences and similarities of firearm sales in the U.S. vis-à-vis a sitting president’s term. One possible explanation for this is a president’s ability to directly interact with masses and to potentially steer the trajectory of gun laws either toward stricter or permissive regulations. One piece of evidence to these arguments is in line with Ref. 12, where it is argued that the election of B. H. Obama created a perception that stricter firearm laws will be implemented.

We identify two main limitations of our research. First, since the actual figures of firearm purchases are not publicly available, we have relied on BC data as proxy of firearm acquisition in the U.S., following the practices of previous publications. Second, as mentioned in the study of nation-level BC data, we chose to accept some level of error in our analysis due to spectral leakage in the implementation of short-time Fourier transform operations. By assessing that leakage was less than $10%$, we decided that this was an acceptable level of error as it helped to avoid the influence of seasonal differences in firearm purchases on clustering.

Although this study focuses on firearm prevalence, it can potentially translate to elucidate the relationship between firearm acquisition and gun-related harms. To better understand the complex mechanisms that lead to a harm, researchers in public health, criminology, policy making, and engineering have been studying the nature of shootings,^{67–69} impacts of firearm-related harm on healthcare,^{70} psychological drivers of violence,^{71,72} effectiveness of law and regulations on firearm,^{73} moderating role of media on violence,^{74} and root causes of firearm acquisition.^{5,75} In criminology, specifically, studies include the investigation of how gun carrying and drug dealing are related,^{76,77} spatiotemporal spreading characteristics of gun violence,^{78} analysis on spatial clustering effects of gun violence within urban environments,^{79} and how certain urban physical features of properties could attract gun crimes.^{80}

We suggest several research directions in future work. Although we have not performed a causality analysis, future studies should look into whether or not there exists any causal link from a president’s political agenda to firearm prevalence and vice versa. Moreover, the analysis of the BC data can be expanded into frequency bands other than those studied in this effort. While we have not studied in what ways each presidential term could have influenced U.S. states’ synchronization in their BC data nor have we looked into similarities/differences between a Democratic and Republican president, future work can analyze the presidents’ agendas, public statements, and support for stricter/permissive gun laws to unveil the root causes of loss of synchronization. Referring to possible similarities/differences in firearm-purchasing communities, another direction of research could focus on characterizing those communities in their firearm acquisition with respect to relevant national and local events.

In conclusion, we put forth a mathematical/computational approach to study BC data at the nation- and state-level and revealed the synchronization characteristics of U.S. states and how energy contained in the BC data shifts over time. These results are backed by statistical tests and robustness analysis, all supporting the existence of time-dependent coordination among the U.S. states. The results further point out that this time dependence compares with presidential terms; that is, there exists a possible interplay between a president’s term and the complex dynamics associated with firearm acquisition in the U.S. This interplay is possibly due to similarities and differences in the legal, economical, and social climates in the U.S. corresponding to different presidential terms.

## ACKNOWLEDGMENTS

This study has been supported in part by the U.S. National Science Foundation under Award No. CMMI-1953135.

## AUTHOR DECLARATIONS

### Conflict of Interest

The authors have no conflicts to disclose.

### Author Contributions

**Xu Wang:** Formal analysis; Validation; Writing – original draft; Investigation (equal); Software; Writing – review & editing (supporting). **Rifat Sipahi:** Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Supervision (equal); Writing – review & editing (equal). **Maurizio Porfiri:** Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Supervision (equal); Writing – review & editing (equal).

## DATA AVAILABILITY

The data that support the findings of this study are openly available in the Federal Bureau of Investigation website at https://www.fbi.gov/services/cjis/nics, Ref. 55.

### APPENDIX A: ROBUSTNESS ANALYSIS

The phase calculated based on the principles of Hilbert transform can be influenced by how rapidly the amplitudes of the time signal are changing,^{44,45} and hence, this issue should be addressed. Moreover, an alternative approach would help provide a degree of robustness on the reliability of the results. To this end, the wavelet transform can be leveraged to estimate the degree of synchronization $r$.

#### 1. Influence of variant amplitudes

To investigate whether or not amplitude changes cause substantial variations in phase calculations, one can calculate $r$ by normalizing the amplitudes of BC data after pre-filtering and reflection procedures and before applying the Hilbert transform. That is, one should normalize first all the oscillation amplitudes in the filtered signal $y^f[n]$ such that amplitudes do not change over time. This is an adjustment that only scales the amplitudes and, therefore, should not influence the phase calculations. The amplitude-tuned signal then reads

where $ni$ is the $i$th zero point of $y^f[n]$, $Fi$ is the maximum absolute value of $y^f[n]$ in the range of $[ni,ni+1]$, and $Myf^$ is the maximum absolute value of $y^f[n]$. The signal $yt^[n]$, which is adjusted for uniform amplitude, is utilized to compute the phase based on Hilbert transform. This phase is then used to calculate the synchronization degree $r$ (blue in Fig. 9). In this figure, we compare $r$ with that in Fig. 4 (red). Given that the two curves of $r$ are almost identical to each other, we conclude that $r$ calculation is only negligibly affected when amplitudes are normalized, providing support for the consistency of results.

#### 2. Synchronization degree based on a wavelet transform

An alternative Morse wavelet transform can be implemented on the frequency domain representation of time-series data to obtain $\varphi Ws(t)$, the corresponding phase series. The procedure is as follows. Based on Ref. 35, for a square-integrable signal $x(t)$, the wavelet transform is defined as

where $s$ is the normalization constant, $\psi (t)$ is the selected wavelet function in the time domain, $\Psi (\omega )$ is the Fourier transform of $\psi (t)$, $X(\omega )$ is the Fourier transform of $x(t)$, and $(\u22c5)\u2217$ denotes the complex conjugate of $(\u22c5)$.

In this paper, we implement the wavelet in the frequency domain $\omega $, and the type of wavelet function selected is the Morse wavelet given by

where $U(\omega )$ is the unit-step function, $P2$ is the time-bandwidth product, and $\gamma $ represents the symmetry of this wavelet. With this setting, complex number $Ws(t)$ can be calculated as

from which the phase $\varphi Ws(t)$ can be extracted.

Specifically, utilizing discrete version of wavelet transform in MATLAB on state-level BC time-series, phase time-series for each U.S. state are obtained. These phase time-series are then utilized to compute $r[n]$. Since we utilize the reflected BC data in this procedure, the parts of $r[n]$ associated with the reflected data are neglected. Here, wavelet parameters are selected as $\gamma =3$ for a maximum degree of symmetry of shape of the Morse wavelet^{35} and $P2=30$, which are appropriate values to balance the two needs for a narrow frequency passband width and short decay time. Next, the central frequency of the wavelet is adjusted with different selections of the normalization parameter $s$.

Since our focus is on BC data associated with annual oscillations, the central frequency is selected for an oscillation period in the range of $[10.6,13.7]$ months, which is slightly narrower than that used in the FIR filter in Sec. IV A 1. To study the variability in the results, we sample this range into $18$ linearly spaced data points, each describing a Morse wavelet with a different central frequency. For each of the wavelets, phase calculations are performed and a synchronization degree $r[n]$ is obtained. The family of $18$ different curves is obtained as depicted in Fig. 10.

Figure 10 provides support for the degree of robustness of results in Fig. 4. It also reveals that at two critical time points, the results have more sensitivity, or, equivalently, there is more variation in the family of curves in Fig. 10. These time points arise around the valley in the winter of $2010$ and after the spring of $2018$. Nevertheless, the trend lines still clearly indicate the presence of the valley around $2010$ in Obama (1) and a substantial drop in $r$ in Trump (pre-COVID-19).

### APPENDIX B: STATE-LEVEL BACKGROUND CHECKS

The plots of BC data by state (the state of Hawaii is excluded) are provided here for direct observation and capture of the synchronization in Figs. 11–13.

## REFERENCES

*29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society*(IEEE, 2007), Vol. 2007, pp. 3864–3867.

*Principal Manifolds for Data Visualization and Dimension Reduction*(Springer, Berlin, 2008), pp. 238–260.

**138**, 135–144 (2017).