RNA translation to protein is paramount to creating life, yet RNA and protein correlations vary widely across tissues, cells, and species. To investigate these perplexing results, we utilize a time-series fixation method that combines static stimulation and a programmable formaldehyde perfusion to map pseudo-Signaling with Omics signatures (pSigOmics) of single-cell data from hundreds of thousands of cells. Using the widely studied nuclear factor kappa B (NFκB) mammalian signaling pathway in mouse fibroblasts, we discovered a novel asynchronous pseudotime regulation (APR) between RNA and protein levels in the quintessential NFκB p65 protein using single molecule spatial imaging. Prototypical NFκB dynamics are successfully confirmed by the rise and fall of NFκB response as well as A20 negative inhibitor activity by 90 min. The observed p65 translational APR is evident in both statically sampled timepoints and dynamic response gradients from programmable formaldehyde fixation, which successfully creates continuous response measurements. Finally, we implement a graph neural network model capable of predicting APR cell subpopulations from GAPDH RNA spatial expression, which is strongly correlated with p65 RNA signatures. Successful decision tree classifiers on Potential of Heat-diffusion for Affinity-based Trajectory Embedding embeddings of our data, which illustrate partitions of APR cell subpopulations in latent space, further confirm the APR patterns. Together, our data suggest an RNA-protein regulatory framework in which translation adapts to signaling events and illuminates how immune signaling is timed across various cell subpopulations.
INTRODUCTION
Nuclear factor kappa B (NFκB) is a transcription factor signaling pathway that is central to innate immune signaling in humans and other mammals. Its activity has been implicated in cell differentiation, fate, and apoptosis.1 While typically responsible for pro-inflammatory activation, its dysregulation has been associated with cancers, rheumatoid arthritis, Alzheimer's, autoimmune disorders, and others.1–4 Although its proteomic activity has been studied for decades, the role of the transcriptome in the signaling pathway has not been examined. The transcriptome has been known to play a larger role than sole translation; the existence of non-coding and other RNAs in disease pathology such as cancer has been acknowledged.5 In fact, RNA-protein correlations vary wildly across cells, tissues, and animals; it is reported that some correlation figures only have 40% explanatory power of their associated proteins.6 Disruptions of these processes have been shown to engender cancer states.7 Therefore, investigating translational regulation may reveal how RNAs and associated proteins interact at the subcellular level, which likely plays a large role in signaling mechanisms and disease pathologies.
For decades, NFκB has been an extensively studied signaling pathway. Originally discovered in B cells, it has been found in almost all cell types and affects a wide variety of other signaling pathways.8,9 Earlier laboratory tools used blotting and other in vitro protein assays to isolate interacting candidates. More elaborate in vivo mouse models established the essential nature of NFκB and tied its role to the skin, skeleton, brain, and other disease pathologies. Then, the advancement of tools such as single-cell live fluorescent imaging enabled real-time measurements of dynamical NFκB responses and how a single signaling pathway contains redundancy and modularity to control different types of responses. Furthermore, bioinformatics techniques involving machine learning and protein modeling have identified more roles of different NFκB members, such as Iκκ,10 to uncover more signaling relationships at a higher throughput than in vivo models. Together, these tools have provided a more comprehensive picture of NFκB signaling and its importance in human health.
While pro-inflammatory mechanisms have been discovered, NFκB negative inhibitors have also been examined to investigate how signaling events lessen.11–13 A20 and IκBα proteins are prominent players in inhibiting phosphorylation events to reduce inflammatory effects.14 However, elucidating this negative feedback mechanism at the spatial transcriptomic and proteomic levels has not been accomplished thus far. Such measurements would highlight translational control of signaling events in space and time. In this study, we use A20 protein activity to confirm the negative inhibition of our NFκB measurements.
Microfluidics, involving the continuous treatment and monitoring of live cells, offers a significant advantage over in vitro and in vivo counterparts due to its ability to measure cell dynamics in real time. For example, NFκB oscillatory responses can be recorded and watched in live images with a microfluidic chip.15,16 Using live reporter proteins, including p65, and a fluorescent imaging system, researchers were able to measure different NFκB responses, in terms of duration and amplitude, in single cells to the perfusion of stimulating media with controlled valves.16,17 Continuous imaging provided real-time video of oscillatory response in individual cells. It was recently discovered that fibroblasts exhibit different NFκB responses at the single-cell level based on not only the stimulus concentration but also the position and distance to the stimulus source.16 This could mimic conditions where a macrophage secretes stimulating cytokines from a point source infection and may reveal how different cells can respond in similar ways to the same cytokine stimulus. While microfluidics provides precise control over stimulus, concentration, distance, and high-resolution, single-cell proteomic readouts, these setups are limited to 3 fluorescent proteins at a time and only monitor a single field of view (around 20–30 cells). Additionally, such methods cannot be multiplex labeled to highlight subcellular, spatial biomolecular patterns. Thus, such methods are unfavorable for studying spatial variability and translational patterns directly.
Single-cell RNA sequencing (scRNA-seq) has complemented live cell studies by profiling whole cell transcriptomes at various timepoints (“sample collection times”) and positions within tissues or cells.16 Such methods have revealed pro-inflammatory gene hotspots and their timing in NFκB signaling. For example, A20 genes were localized closely with the stimulus, whereas RANTES, Casp4, and other genes persisted at high levels regardless of distance to stimulus.16 Another study of NFκB scRNA-seq and live cell imaging has segregated heterogeneous subpopulations based on distinct initial responses and transcriptome.18 With increasing NFκB stimulation, transcriptional analyses revealed the upregulation of NFκB inhibitors including IκBα, A20, and IκBz. Finally, end point RNA fluorescence in situ hybridization (FISH) confirmed a few of these transcriptional patterns.
Although spatial biology studies have revealed specific RNA and protein molecules that play important roles,19 they (1) cannot discern the temporal axis necessary to study translational dynamics or (2) are limited to very few cells at a time. In situ dynamical studies have only recently emerged and can provide potential single-cell differences in this signaling response. Meanwhile, recent advances in single molecule detection assays, such as fluorescence in situ hybridization (FISH), sequential FISH (seqFISH+), and multiplex immunofluorescence, have enabled the profiling of up to thousands of molecules across thousands of cells.20–24 Thus, live cell NFκB studies to date have lacked significant cell throughput and continuous spatial pattern measurements of single biomolecules (e.g., RNA). Live cells present difficulties in spatial multiplexing to capture their transient, dynamic responses. However, fixed animal tissues or cells permit multiplexing and more complete assays but only provide a snapshot in time.
Hence, to overcome the cell throughput limitations of live cell imaging and spatial tagging of single molecules, we propose programmable fixation with seqFISH. We present our timed and continuous fixation method, pSigOmics, which involves concepts from scRNA sequencing collected at different sample times (termed sample collection time). Previous work in this arena has suggested that reconstructing continuous responses from discrete sample collections is possible.25 To increase throughput, we implemented a timed fixation method that stimulated batches of 01-3T3 mouse fibroblasts after various stimulation times followed by fixation. Each batch was visualized along a pseudotime trajectory in which different cells from each timepoint were organized along an artificial time axis according to known stimulation patterns. Cells from discrete timepoints could also be concatenated along the pseudotime axis by normalized median p65 protein behavior.
With this high cell throughput data, we discovered strong asynchronous pseudotime regulation (APR) between RNA and protein levels in single cells over time. We then trained a graphneural network (GNN) model to predict APR cell types from subcellular, spatial RNA locations. Finally, we investigated how well these APR subpopulations partitioned in latent space with Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE).26 Together, our results demonstrate a translational oscillatory lag and illustrate a new way to study single-cell signaling in high cell throughput with multiplexed methods.
RESULTS
Programmable fixation enables high cell throughput and multiplex advantages over live imaging
Compared to live cell imaging, pSigOmics measures multiplexed RNA and protein detection in up to 8965 cells per timepoint and 89 665 cells in total that were fixed after various stimulation times. Similar to sample collection timepoints in scRNA sequencing, pSigOmics utilizes replicates of cells in different wells, stimulates them in various time intervals, and fixes the samples accordingly after stimulation (Fig. 1). By fixing samples collected from different wells, multiplex spatial omics profiled p65 RNA, GAPDH RNA, and p65 protein in thousands of single cells per timepoint [Fig. 1(a)].16 Additionally, to create a more continuous stimulant response gradient, we slowly perfused 4% paraformaldehyde (PFA) at 25 μl/min through six microchannels containing statically stimulated cells from 0 to 45 min microchannels. A detailed comparison of static and programmable experimental parameters can be found in Table S1. Sequential imaging of p65 protein reporters, p65 RNA, and finally GAPDH RNA allowed the detection of RNA and protein signatures in the same single cells. The subsequent analysis will focus on the interrelation of p65 protein and p65 RNA while the GAPDH RNA was used as a positive control to confirm cell viability and robust single molecule mRNA detection capabilities in this multiplexed data acquisition pipeline. Finally, A20 protein signatures were used to measure negative NFκB inhibitor activity. Our complete multiplex target panel can be found in Fig. 2.
As time progressed from 0 to 105 min, the p65 protein transcription factor appeared to translocate from the cytosol to the nucleus more extensively under cytokine treatment than unstimulated baselines, thus agreeing well with expected NFκB behavior (Fig. 3). Note that p65-reporter signals were only imaged after fixation with formaldehyde (no permeabilization) for all samples to prevent reporter leaking (Fig. 2). To ensure proper single-cell characterization, we compared segmented 3D cell masks using (1) GAPDH RNA or (2) segmentation antibodies including Vim, αSMA, Hsp47, and S100A4 [Figs. S1(a) and S1(b)]. Both masks were comparable but the segmentation antibodies, which highlight the fibroblast membranes, generated finer cytosolic edges, and thus we chose the segmentation antibody masks for further analyses. Additionally, we verified that p65 and GAPDH RNA dot counting was accurate enough to profile single cells [Fig. S1(c)]. Our static fixation method preserved the typical p65 translocation from the nucleus to cytosol akin to live imaging of p65 reporters (Fig. 3). Interestingly, as an important advancement of this pSigOmics platform, the spatial distribution of RNA transcripts for p65 exhibited heterogeneity across cytoplasmic and nuclear volumes of individual cells. Thus, the density and spatial co-localization of p65 RNA and proteins need to be investigated to dissect how p65 RNA encoding regulates corresponding proteins as a function of time, which we observe to be mostly APR.
pSigOmics recapitulates live cell signaling dynamics
To compare pSigOmics cell response to known live cell behavior, we visualized p65 protein subcellular localization over discrete time-series data. We first observed cytokine-stimulated subtypes in which p65 protein expresses the highest in the nucleus, nuclear membrane, or cytosol [Fig. 4(a)]. Thus, we categorized cells by the highest median p65 protein signal in each subcellular region. Nonresponding cells had the highest p65 protein signal in the cytosol, partially responding cells had the highest signal in the nuclear membrane, and fully responding cells had the highest protein signal in the nucleus. Additionally, we observed some cells expressing high p65 protein and low p65 RNA, while others express low p65 protein and high p65 RNA. To explore this interesting phenomenon, we gathered all p65 protein and p65 RNA content in each cell's cytosol, nuclear membrane, and nucleus regions to project all 45 000+ cells in latent space on a uniform manifold and projection (UMAP) representation [Fig. 4(b)]. Timepoint coloring showed a gradient starting from the top right and moving toward the bottom left. Stimulation classifications were also colored, showing a gradual transition from nonresponding cells toward the left to partially and fully responding cells toward the right. Meanwhile, to quantify p65 RNA and p65 protein APR, cells were colored by their whole cell p65 protein to p65 RNA difference [Fig. 4(b), bottom]. This difference calculation measures each single cell's translational “offset” to gauge the population's p65 translational variability. Smaller p65 protein to p65 RNA difference values highlighted cells with low protein and high RNA while larger ratio values highlighted cells with high protein and low RNA patterns. Similarly, p65 protein to RNA ratios can be observed in Fig. S2(a). Single-cell examples of both these APR patterns were visualized to confirm this quantification [Fig. 4(b), bottom]. Additional principal component analysis (PCA) latent space embeddings can be observed in Fig. S3. PCA plots show a slight gradient pattern from top left to bottom right in terms of stimulation time, subtypes, and p65 RNA-protein difference [Fig. S3(a)]. The scree plot indicates that adding principal components only slightly increases the explained variance ratio of the data because PC1 is dominant [Fig. S3(a)]. When investigating which features contribute most to PC1, we discovered that p65 protein, especially in the nuclear membrane and cytosolic regions, dominated over the other features, which is logical because p65 protein translocation across the nuclear membrane differentiates the stimulating cell subtypes [Fig. S3(b)]. Overall, single-cell APR patterns are evident in most latent space embeddings but are most prominent in UMAP.
Stimulation subtypes aided in quantifying signaling dynamics [Fig. S4(a)]. Over time, for fully responding cells, there are fewer partially responding cells and more fully responding cells. The median p65 protein signal in each cell region also shows a stronger protein signal in the nuclear membrane and nucleus over time compared to the cytosol [Fig. S4(a)]. Additionally, the NFκB response, computed by the ratio of median nucleus to cytosolic p65 protein, appears to peak and fall more than the unstimulated baseline over time in concordance with previous studies [Figs. S4(b) and S5].16 To support the consistency of these patterns, we stimulated the same 01-3T3 cells on coverslips for 0–50 min in 10-min intervals and repeated this experiment for six batches. We mixed various static timepoints from different batches in silico to ascertain the same resulting NFκB response pattern (Fig. S6). Furthermore, we considered the NFκB inhibition activity via A20 deubiquitinase expression, which appeared to be more active and colocalized with p65 Protein at TNFα 90 min (Fig. S7). Thus, we conclude that NFκB activity successfully peaks and terminates during the static time course. Finally, we investigated the p65 RNA signaling relationship with metabolic signatures, e.g., GAPDH RNA [Fig. S4(c)]. p65 RNA and GAPDH RNA dots exhibited a strong positive association [Fig. S4(c), top row]. Simultaneously, p65 and GAPDH RNA demonstrated similar associations, thus confirming the strong link between NFκB signaling and glycolytic metabolism. To further support this relationship, we implemented a GNN on all multiplexed signals excluding p65 RNA and protein to predict each cell's APR subpopulation label—either low p65 protein and high RNA or high protein and low RNA [Figs. S8(a) and 8(b)]. Generally, GNN classifiers outperformed machine learning classifiers for TNFα, IL1β, and unstimulated baseline cells [Fig. S8(c), Table S2]. Furthermore, GNN classifiers solely trained on GAPDH RNA still successfully discriminated both APR subpopulation classes, suggesting that metabolism is driving this signaling difference among cells [Fig. S8(d), Table S3]. PHATE ordering features also demonstrated successful APR classification with a decision tree classifier, thus highlighting the behavioral differences of these cell subpopulations (Fig. S9, Videos S1–S4).26
pSigOmics reveals RNA-protein APR across time
To further elucidate if all cells exhibited p65 RNA and p65 protein APR, median p65 protein and sum p65 RNA signals were plotted by cell region for TNFα and IL1β stimulations compared to unstimulated baselines (Fig. 5). Strikingly, all cell regions and whole cell patterns exhibited an oscillatory phase lag between p65 protein and p65 RNA. Compared to the unstimulated baseline, cytokine stimulations appeared to amplify the p65 protein to larger oscillations, which is expected of transient, pro-inflammatory signaling behavior like NFκB. This significant p65 protein activity seems to slightly delay p65 RNA upregulation in the cytosol, nuclear membrane, and whole cell patterns. The larger, later p65 RNA peak may be attributable to the depletion of higher p65 protein activity and how cells maintain translational balance during this signaling event. Unstimulated baseline cells exhibit smaller amplitude p65 RNA-protein APR over time. When ordering cells from various 0–50 min timepoints along the same pseudotime axis, we still observed lags in single-cell p65 RNA and protein levels (Fig. S10). We applied the same pseudotime ordering to cells stimulated for 0–105 min and discovered similar peak-to-peak p65 RNA and protein lags (Fig. S11). When each timepoint is examined in isolation, the least squares linear regression between p65 RNA and protein levels appears to vary (Fig. S12). However, note that this analysis only captures APR within each timepoint rather than across timepoints.
From this general APR trend, we grouped cells based on APR patterns by grouping the whole cell p65 protein to p65 RNA ratio by quartiles [Fig. 6(a)]. Cells in the lower quartile were classified as low p65 protein and high p65 RNA while cells in the upper quartile were considered high p65 protein and low p65 RNA. The interquartile range was defined to consist of medium p65 RNA and medium p65 protein cells. Finally, cells from each category were counted and segregated for later analysis.
RNA protein APR vary by subpopulation
To investigate if subpopulations also exhibited APR behavior, we separated p65 RNA and p65 protein behavior by subpopulation [Fig. 6(b)]. Cells in the bottom quartile were low p65 protein and high p65 RNA (22 415 cells) while cells in the top quartile were high p65 protein and low p65 RNA (22 417 cells). Finally, cells in the interquartile range were considered medium p65 protein and medium p65-(44 833 cells). This partition is not unique; other partitions of this continuous cell's RNA and protein distribution are possible. Large changes in p65 protein appear to precede large changes in p65 RNA, thus implying that protein signaling activity is upregulating its downstream translation. Representative, stimulated cells are shown on the right of Fig. 6(b). The association of cell-stimulated subtypes with APR subpopulations is shown in Fig. S13, suggesting more nonresponding and partially responding cells in the medium RNA and medium protein group at earlier and later timepoints while more fully responding and high/low p65 RNA and protein groups at the middle 15–60 min timepoints.
Programmable fixation in helical pattern creates a continuous gradient of cell response
Beyond our static studies, we aimed to create a more dynamic continuum of cell response via perfusion. Briefly, we statically stimulated 6 microchannels containing cells in 0, 25, and 45 min statically stimulated TNFα (10 ng/ml) in duplicate: A0, B0, C25, D25, E45, and F45 (Fig. 7). Gradually, from static 0 to 45 min, the p65 protein translocates from the cytosol (0 min start) to the nucleus (45 min start) in Fig. 7(a). The exact static, perfusion, and fixation time for the 6 microchannels is shown in Fig. 7(b) in the top barplot. At 25 μl/min of 4% PFA perfusion, the fixation buffer traversed a single microchannel in 6 and 17 min between adjacent microchannels [Fig. 7(b), left schematic]. Quantitatively, the NFκB response also exhibits a gradual increase over time [Fig. 7(b), bottom boxplot]. In total, we multiplexed 4521 cells for A0, 2605 cells for B0, 2963 cells for C25, 2911 cells for D25, 4266 for E45, and 1998 for F45. Interestingly, microchannels A0, C25, D25, and E45 exhibit smaller, oscillatory fluctuations in time, thus capturing the shuttling of the p65 protein reporter between cytosol and nucleus. Thus, the pSigOmics programmable fixation experiment successfully creates a more continuous gradient of NFκB response. Compared to traditional live cell imaging, our programmable fixation method possesses certain advantages such as multiplexing ability, higher temporal resolution, and higher cell throughput (Table S4).
The programmable experiment also highlighted p65 RNA and protein APR differences as observed in the static experiments. The classical NFκB response gradually increases over the elapsed time [Fig. 8(a)] and exhibits the single-cell p65 RNA and protein difference across all microchannels and stimulation responding classes in latent space [Fig. 8(a), tSNE plots]. During this continuous response gradient, median p65 RNA and protein levels exhibit tighter co-regulation with a finer p65 RNA and protein lag, especially within single microchannels when plotted as a function of perfusion distance [Fig. 8(b)]. Thus, programmable fixation highlights the finer gradient of p65 RNA and protein APR.
Together, our pSigOmics results capture the spatial variance of NFκB time-series signaling. We propose a biological mechanism in which cytokines (TNFα, IL1β) bind to fibroblast receptors, initiate the NFκB phosphorylation cascade, and engender APR translational control of p65 [Fig. 9(a)]. While numerous single-cell studies struggle to correlate RNA and protein (or show a positive linear relationship), our results highlight a fascinating APR of p65 RNA and protein in time.27 When considered in this context, there must be some downstream inhibition of p65 RNA from its corresponding protein expression/translocation. To support this hypothesis, we used our pSigOmics method to (1) sample static timepoints and (2) perfuse fixative to create a dynamic time gradient [Fig. 9(b)]. Both approaches reveal a striking APR p65 RNA to protein pattern [Fig. 9(c)], which may be implicated in cyclic pro-inflammatory behavior and determine the efficacy of timed drug interventions.
DISCUSSION
We have proposed and validated a high cell throughput method of studying translational behavior in single cells in Fig. 1. By measuring tens of thousands of cells, we have uncovered an inverse RNA-protein APR pattern that was previously thought to be uncorrelated or strongly correlated. This heterogeneous response is appropriate to the native in situ fibroblast environment in which the host will need early and late responders for wound healing and building connective tissue.
One key consideration for the programmable fixation approach is the concentration and types of chemical fixatives. In the current demonstration, the pSigOmics approach was implemented on cells fixed with 4% formaldehyde. HCR RNA FISH protocols yield better results with 4% fixation and thus RNA counts are more reliable. In separate static experiments from 0 to 50 min, three batches were fixed with 1.6% formaldehyde, and another 3 were fixed with 4% formaldehyde. The combined data yielded consistent NFκB results (Fig. S6). The pSigOmics programmable fixation method could easily consider other fixatives such as methanol, glutaraldehyde, or acetone. When using the same fixative across different batches, we expect comparable results to the ones presented here provided each batch is treated with the same fixative.
The presented pSigOmics could impact the studies of human inflammatory responses that are relevant to many diseases. Therapeutic dose concentrations and timing are difficult to precisely tune in to different patients. Furthermore, the translational timing of inflammatory response is not well understood. With pSigOmics, we revealed that translational expression behaves in an oscillatory, APR RNA-protein relationship in response to external cytokine stimulation. Hence, this aspect of immune stimulation could be analogous to disease and dosage timing in human patients. For example, the delayed timing of types I and III interferon activity in controlling epithelial cell infection rate has been implicated in SARS-CoV-2.27 Therefore, pSigOmics robustly illustrates how immune responses vary in space and time in high cell throughput, which opens the door to uncovering signaling dynamics in other disease states.
Traditional cell signaling studies have involved in vitro cell alterations, in vivo animal models, or live cell imaging. These methods typically lack throughput (limited to a few tens of cells) or resolution (subcellular behavior is difficult or impossible to measure). Our presented pSigOmics method is a scalable, high cell throughput method that utilizes timed fixation of fully responding cells that can be spatially multiplexed. pSigOmics samples both static timepoints and a continuous gradient of cell responses via programmable fixation. Strikingly, when applied to cytokine-stimulated NFκB fibroblasts, pSigOmics revealed RNA and protein APR over time, which is contrary to conventional wisdom and published literature—certainly a new line of translational regulation of how RNA and protein can even be asynchronously regulated. To quantify this unexpected but observed APR, we successfully predicted APR subpopulations with GNN training on subcellular RNA spatial locations. Furthermore, we quantified p65 RNA and protein phase lag and used PHATE to segregate the APR subpopulations in latent space. Therefore, pSigOmics enables a high cell throughput study of single-cell signaling with programmable fixation and spatial multiplexing.
METHODS
Cultures
01-3T3 mouse fibroblasts containing H2B-GFP and p65 Protein RFP reporters were obtained from Dr. Savas Tay at the University of Chicago.16 The cell population was expanded on a T-75 flask until 70%–80% confluent.
Static cell stimulations
For static samples, two 12-well glass bottom plates were coated with poly-L-lysine. Upon confluency, cells were seeded onto coverslips, adhered overnight, and stimulated the following day with 10 ng/ml of TNFα (BioTechne 410-MT), 1 ng/ml of IL1β (BioTechne 401-ML-010), or DMSO unstimulated baseline (to the same dilution as TNFα). Wells were stimulated with one of these conditions for 0–105 min in 15-min intervals, inclusive. Finally, cells were fixed with 4% formaldehyde.
A separate set of static samples was used for PHATE embeddings. For these samples, 22 × 22 mm2 glass coverslips were coated with poly-L-lysine within six well plates. Cells were seeded onto coverslips, adhered overnight, and stimulated the following day with 10 ng/ml of TNFα (BioTechne 410-MT). At each 10-min interval from 0 to 50 min, inclusive, cells were fixed with 4% formaldehyde. In total, there were six replicate batches of these cells from various passages. These batches were used for computational batch mixing (Fig. S6) and PHATE analyses (Fig. S9).
Programmable stimulations with fixative perfusion
For the six programmable microchannels in helix, cells were seeded on Ibidi treated μ-Slide I Luer with 0.6 mm height (Ibidi Cat. No. 80186). After overnight adherence, each pair of microchannels was stimulated for 0, 25, or 45 min at 37 °C with 10 ng/ml TNFα in duplicate. We aimed to slowly perfuse formaldehyde through them to create a continuous gradient of cell response. We attempted many variations of formaldehyde perfusion setups, but the formation of bubbles during flow presented difficulties. First, we tried simple horizontal flow between microchannels, but bubbles would remain stuck inside the microchannel (Video S5). Next, we turned each microchannel upside down at a tilt and perfused each microchannel upward while arranging them in a helical pattern [Fig. 7(a)]. To minimize bubble formation, we diluted the 4% formaldehyde fixative with 0.1% Tween 20, which successfully prevented bubbles (Video S6). Finally, to ensure Tween 20 did not permeabilize or kill the cells before fixation, we repeated the helical experiment with pure 4% formaldehyde excluding any Tween 20 (Video S7). Therefore, we chose this helical arrangement for the final programmable design. An example of fluidic perfusion with color dye at 20X experimental speed (0.5 ml/min) can be found in Video S8. Then, all six microchannels were flipped upside down (to prevent bubbles) and arranged (from top to bottom) in 0–45 min. 4% formaldehyde was perfused through the system with a programmable Harvard pump from top to bottom at 25 μl/min (Fig. 7). Once the formaldehyde had passed through the entire system, the microchannels were ready for spatial RNA and protein profiling.
A complete description of experimental parameters for both static and programmable experiments can be found in Table S1.
Spatial RNA profiling in single cells
To detect single molecule RNA targets, we utilized the Molecular Instruments RNA FISH on mammalian cells on slides protocol for HCR RNA FISH v3.0. p65 protein and H2B-protein reporters were imaged in the first cycle after only formaldehyde fixation. p65 protein was imaged in the ds-RED/TRITC channel and H2B-protein was imaged in the A488/GFP channel. Then, samples were permeabilized in 70% ethanol at −20° C for 30 min. HCR probes were detected in HCR probe hybridization buffer, amplified 75 min (150 min for p65 RNA) in HCR amplification buffer, and washed with 5X SSC. The complete multiplex panel can be found in Fig. 2.
Sequential multiplexed imaging of protein reporters and RNA transcripts
Cells on coverslips were superglued (Gorilla Glue) by the edges to acrylic, laser cut coverslips for rigid mounting during multiplex imaging. Otherwise, glass bottom and plates or Ibidi microchannels were imaged directly as they were. Cycles were imaged according to Fig. 2 using a custom-modified Nikon widefield microscope (TE2000). Due to the unavailability of GFP and TRITC channels due to existing and H2B and p65 protein reporter signals, respectively, we only used Alexa Fluor 647 fluorescent channel for HCR detection using B1 amplifiers. To multiplex p65 RNA and GAPDH RNA, we performed the first detection of p65 RNA in Alexa Fluor 647 channel, and then digestion of HCR assembly using DNase I enzyme for 4 h at room temperature. In the subsequent cycle, we performed GAPDH RNA detection using the same Alexa Fluor 647 channel as a positive control (Fig. 2). GAPDH RNA signal was removed with DNAse again for 4 h at RT. Later, A20 Ab was labeled overnight, imaged, and quenched with DNAse and LiBH4 (1 mg/ml) bleach for 15 min under white light. Finally, segmentation Abs (Vim, Hsp47, S100A4, and αSMA) were labeled in the final cycle for later 3D cell segmentation. We performed image registration across cycles using nuclear signatures from reporter protein imaging, p65 RNA, and GAPDH RNA imaging cycles.
Benchmarking pSigOmics with synthetic cells and batch mixing
Our pSigOmics workflow was validated with synthetic cell data. For this analysis, we only used the six batches of coverslips with 0–50 min 10 ng/ml TNFα stimulations. First, synthetic cells were generated in silico with random circles of nucleus radius 25 pixels and expanded by another 25 pixels for the cytosol. The nuclear membrane region was created by blurring the nucleus edge with a 10 × 10 pixel Gaussian blur. p65 RNA and protein values were assigned to each cell region based on Fig. S6(a) with added Gaussian noise (randomly selected mean between 0 and 20, standard deviation between 0 and 5). Each representative timepoint contained 40 field of views with 30 cells each. Synthetic timepoints were created for 0, 10, 20, 30, 40, and 50 min.
Furthermore, we mixed timepoints from various experimental batches to test the replicability across trials and passages. Specifically, we chose Batch 6 Time 0, Batch 3 Time 10, Batch 1 Time 20, Batch 5 Time 30, Batch 2 Time 40, and Batch 4 Time 50 [Fig. S6(c)]. We again visualized the fraction of nonresponding, partially responding, and fully responding cells for each timepoint as well as the median p65 protein behavior.
Pseudotime ordering model
To examine the intracellular dynamic fluctuation of the p65 protein and p65 RNA after stimulation, we derived the pseudotime ordering of the cells collected at 0, 15, 30, 45, 60, 75, 90, and 105 min of stimulation (Fig. S11). Recognizing that the overall protein and RNA levels at these discrete time points vary, we assumed that the overall stimulation levels of cells collected at various points were different. Therefore, we ranked the cells from each time point separately and concatenated them afterward. For cells with a stimulation time of 0 min, we assume that the cell containing the least amount of protein is the least stimulated and placed it earliest on the pseudotime axis. Cell ordering after the initial cell is determined by recursively seeking the closest neighbor based on the Euclidean distance of curated subcellular RNA and protein substance (p65 RNA cytosol, p65 RNA nuclear membrane, p65 RNA nucleus, p65 protein cytosol, p65 protein nuclear membrane, p65 protein nucleus, GAPDH RNA cytosol, GAPDH RNA nuclear membrane, and GAPDH nucleus). Since RNA content is significantly lower than protein, we conducted min-max normalization to each feature separately. The reason for using these nine dimensions is that RNA and protein levels should be connected as a continuous curve for all three subcellular regions for a properly ranked cell list to reflect substance transport between cellular regions. For cells stimulated for 15–105 min, we started with the cell closest to the last ranked cell of the previous timepoint.
Segregating and quantifying cell APR subpopulations
To investigate cell subpopulation behavior, we segregated cells into APR subpopulations and computed the APR levels. Each cell's ratio of total p65 RNA and protein was computed. Then one was added to every value to avoid division by zero errors. The histogram of whole cell p65 protein to RNA ratio was visualized and divided into quartiles [Fig. 6(a)]. The bottom quartile was designated as cells with low protein and high RNA and the top quartile was cells with high protein and low RNA. Finally, cells in the interquartile range were considered medium RNA and medium protein.
When visualizing median cell behavior over time [Fig. 6(b)], each subpopulation behavior was plotted separately by cell region. To quantify APR levels, each peak and trough of the p65 RNA and protein curves were recorded. For each peak and trough, the nearest trough or peak, respectively, of the other biomolecule was computed and the difference was recorded. Finally, all differences were concatenated together, and the root mean squared difference of this vector was reported in bold italics on the bottom left of each graph. Thus, the lower scores indicate closer peak-trough distances that signify higher APR levels.
Graph neural network modeling
-
Graph construction [Fig. S8(a)]: We constructed each graph by one hot-encoding of each GAPDH RNA dot as a node and RNA-located region (Cytosol, Nuclear Membrane, or Nucleus). The edge is built through 3D Delaunay Triangulation and the edge weight is represented by 1.0/(Spatial Euclidean Distance). A20 protein intensity at this RNA location is also assigned to these nodes. The goal was to predict the APR subpopulation graph label: either p65 low RNA and high protein or high RNA and low protein.
-
Graph architecture [Fig. S8(b)]: We utilized multi-layer graph neural network architecture including message passing to achieve communication between nodes of close proximity, a pooling layer to aggregate information from all the nodes, and a linear layer to achieve classification. To obtain the optimal GNN model for each treatment group, we compared different message-passing methods (GAT, GCN, GraphConv, and GATV2Conv), and pooling methods (mean, max, sum, global attention, and gated attention). In addition, we benchmarked hidden layers in 2, 3, and 4 for each GNN model.
-
GNN training: We applied cross-entropy loss during training for 50 epochs with optimizer Adam with a learning rate of 5 × 10−4. We conducted a 5-fold cross-validation of each dataset by separating the datasets into 20% of the testing dataset and 80% of the training dataset. Accuracy (ACC), area under the curve of the receiver operating characteristic (AUC ROC), and F1 score are calculated to evaluate the model performance.
-
Baseline model comparison [Figs. S8(c) and 8(d)]: We compared the performance of the GNN model with logistic regression, support vector machine (SVM), Gaussian Naive Bayes (GaussianNB), Decision Tree, and Histogram-based Gradient Boosting Classification Tree (HistGradientBoostingClassifer) by fivefold cross-validation by separating the datasets into 20% of testing dataset and 80% of training dataset.
Single-cell PHATE modeling of subpopulations
We utilized the Potential of Heat-diffusion for Affinity-based Trajectory Embedding (PHATE) to investigate the local and global similarity of the cells with p65 protein and RNA six-dimensional data.26 We only considered the cells on coverslips stimulated for 0–50 min with 10 ng/ml TNFα. To encode local distance, PHATE calculates the Euclidean distance for cells in a pairwise fashion and applies cell-specific α-decay kernel for each cell to derive affinity from spatial distance. α-decay kernel utilizes the parameter nearest neighbor k to control the local bandwidth and decay rate α to control the decay rate for the kernel tail. In other words, a good choice of α and k allows the local information to be well stored, maintain the full connectivity for sparsely located cells, and avoid uniform affinity for farther cells. The connectivity matrix is then normalized by rows to derive the transition matrix (M). A random walk of t-steps is conducted after deriving M by raising M to the t power to obtain diffusion operator (P), which encodes the global structure of the cells. PHATE next converts P into log space, calculates informational distance between cells, and utilizes metric multidimensional embedding (MDS) to embed cells into the 2D or 3D space. After examinations, we chose k = 5, α = 15, and t = 150 in our case to ensure subtle and global trajectories are both well maintained. Furthermore, we colored PHATE embedding by both cell timestamps and APR labels in both 2D and 3D space (Fig. S9).
To quantify the separation of three APR cell populations in PHATE-based 3D embeddings beyond visual examination, we utilized a decision tree, with weight adjustments for three classes due to highly unbalanced class labels. We reported the performance with a confusion matrix and a summary of accuracy, F1 score, and recall [Fig. S9(c)].
SUPPLEMENTARY MATERIAL
See the supplementary material for details on the full multiplexing panel, in silico benchmarking, more latent space embeddings, GNN training and testing, single-cell pseudotime orderings, and p65 RNA and protein correlations.
ACKNOWLEDGMENTS
A.F.C. holds a career award at the Scientific Interface from the Burroughs Welcome Fund and a Bernie-Marcus Early-Career Professorship. A.F.C. was supported by startup funds from the Georgia Institute of Technology and Emory University. Research reported in this study was supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Nos. T32GM142616 and R35GM151028.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Ethics Approval
Ethics approval is not required.
Author Contributions
Nicholas Zhang: Data curation (equal); Formal analysis (equal); Methodology (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Mingshuang Wang: Formal analysis (equal); Visualization (equal); Writing – review & editing (equal). Dhruv Nambiar: Data curation (equal); Methodology (equal). Samyukta Iyer: Data curation (equal). Priyam Kadakia: Data curation (equal). Qianqi Luo: Formal analysis (equal). Sicheng Pang: Data curation (equal). Aaron Qu: Formal analysis (equal). Nivik Sanjay Bharadwaj: Data curation (equal). Peng Qiu: Conceptualization (equal); Formal analysis (equal); Investigation (equal); Project administration (equal); Writing – review & editing (equal). Ahmet F. Coskun: Conceptualization (equal); Funding acquisition (equal); Investigation (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are openly available in Zenodo; GitHub; and figshare at https://zenodo.org/records/11321594; https://github.com/coskunlab/pSigOmics; and https://figshare.com/articles/figure/NFkB_Video_S8/25898560?file=46535068, Refs. 28–30.