Antibody fragments without the Fc region are attracting attention in the pharmaceutical industry due to their high ability to penetrate solid tissues, cost-effective expression using microbial expression systems, and distinctive modes of action compared to those of full-size antibodies. Based on these characteristics, several antibody fragment agents have been approved. However, developing platform engineering methodologies to accelerate their development is important. In this review, we summarize and discuss protein engineering strategies for preparing therapeutic antibody fragments composed of antibody variable domains. Three (introduction of high-solubility tag systems, complementarity-determining region grafting, and domain arrangements) and two (introduction of purification tag systems and mutagenesis studies for protein L- or protein A-binding) protein engineering strategies have been reported for the cultivation and purification processes, respectively. Fusion tags might negatively impact molecular folding, function, immunogenicity, and final yield. If the production behavior of antibody fragments is not improved through complementarity-determining region grafting, domain arrangements, or human sequence-based mutagenesis, using additional fusion tag systems should be considered, with careful attention to the points described above. This summarized knowledge regarding protein engineering strategies for effectively producing antibody fragments will further accelerate therapeutic antibody fragment development.
I. INTRODUCTION
Monoclonal antibodies (mAbs) are widely used as therapeutic agents for various diseases, such as cancer, immune-mediated disorders, and infectious diseases, owing to their high specificity and binding affinity to target antigens.1 Additionally, they are often used as analytical tools in bioimaging and biosensing.2 Over the last few decades, many pharmaceutical companies and academic researchers have paid increasing attention to antibody drug research, and the therapeutic antibody market has been expanding rapidly. This trend is expected to continue, with the market size projected at 237 billion USD in 2023 and 834 billion USD in 2033. However, there are some hurdles in the pharmaceutical application of mAbs. For example, although therapeutic mAbs can elicit an effect by penetrating target tissues and the extracellular matrix, the penetration efficiency of full-size mAbs is low because of their high molecular weight.3,4 Another concern is the limited number of potential target antigens available for recently developed therapeutic antibodies.
To overcome these difficulties, antibody fragments without the antibody Fc region have been developed.5 Although there are various formats of antibody fragments, such as single-chain Fv (scFv)6 and tandem scFvs (taFvs),7 their lower molecular weights compared to those of full-size antibodies enable better tissue penetration and easier protein engineering for additional functions.8,9 Nine antibody fragment products have been approved by the United States Food and Drug Administration (FDA) and European Medicines Agency (EMA). Fab, a major antibody fragment, has been studied and developed for a long time, and four Fab-based therapeutic agents have been approved by the FDA and EMA. The anti-glycoprotein IIb/IIIa receptor abciximab (first US approval: 1994 and first EU approval: 1995),10 anti-vascular endothelial growth factor (VEGF) ranibizumab (first US approval: 2006 and first EU approval: 2007),11 and anti-dabigatran idarucizumab (first US approval: 2015 and first EU approval: 2015)12 have pure Fab structures, whereas certolizumab pegol has a more complex structure in which the anti-tumor necrosis factor Fab is linked with a 40 kDa polyethylene glycol moiety.13 In recent years, some scFv-based therapeutic agents have been approved, including anti-VEGF brolucizumab (first US approval: 2019 and first EU approval: 2020)14 and the complex of anti-cluster of differentiation-22 (CD22) moxetumomab scFv with pasudotox (first US approval: 2018 and first EU approval: 2021).15 The variable domain of the heavy chain of heavy-chain antibody (VHH)-based anti-von Willebrand factor antibody, caplacizumab, was approved by the US in 2019 and the EU in 2018.16,17 Moreover, two therapeutic agents that bind to two different targets have been approved: blinatumomab (first US approval: 2015 and first EU approval: 2015)18 and tebentafusp (first US approval: 2022 and first EU approval: 2022).19
In this review, we summarize and discuss the protein engineering strategies that contribute to efficiently producing antibody fragments in pharmaceutical manufacturing. In particular, we focus on antibody fragments composed only of antibody variable regions. Because these antibody fragments are smaller than Fabs, they can attain additional functions while maintaining high-yield expression, which has led to a recent trend in antibody fragment development. However, there are multiple barriers to overcome for the pilot-scale production of these antibody fragments. During the cultivation process, a combination of protein engineering, cultivation process parameters, and host cell selection should be carefully considered for actual production. Additionally, protein A affinity chromatography, the gold standard purification method for full-size mAbs, cannot be applied for antibody fragment purification because of the lack of an Fc region.20 Thus, affinity tags on the N- or C-terminus of the protein of interest (POI) are commonly used. However, their negative impacts on immunogenicity and productivity remain a concern for this strategy. As the design of therapeutic antibody fragments has become more complex and diverse in the last few years, production process development and actual manufacturing are expected to begin in earnest within a few years in preparation for clinical trials. To avoid detours in future therapeutic agent development and to determine processes efficiently, researchers should design new molecular structures with their manufacturability in mind. This review will contribute to the design of strategies and efficient drug development using antibody fragments.
II. DESIGN STRATEGY FOR THERAPEUTIC ANTIBODY FRAGMENTS
Full-size mAbs provide therapeutic benefits due to their high specificity and binding affinity for target antigens. Antibody-dependent cellular cytotoxicity and complement-dependent cytotoxicity induced by the Fc region are important modes of action.21 However, full-size mAbs have some disadvantages for therapeutic use, such as high production costs, poor tissue penetration, and side effects, due to the presence of the Fc region.22,23 Over the last few decades, various engineered antibody fragment formats that do not include the Fc region have been developed as attractive strategies to overcome these problems (Fig. 1).
The antibody fragment formats composed of only variable domains. Bi- or multi-specific forms can be designed by combining two or more antibody clones. Variable heavy and variable light domains are colored green and light green, respectively. The VHH is colored yellow. Artificially introduced polypeptide linkers for connecting each domain are depicted by a black line. VHH, variable domain of heavy chain of heavy-chain antibody; VNAR, variable domain of the new antigen receptor; DART, dual-affinity retargeting; scDb, single-chain diabody; TandAb, tandem diabody.
A. Single-chain variable fragment
Some antibody fragment formats consisting only of the variable heavy (VH) and variable light (VL) domains have been developed as therapeutic agents for various diseases.1,24,25 An scFv, in which VH and VL are connected by a short peptide linker (at least 12 residues), has the simplest structure among the antibody fragment formats.26 The molecular weight of an scFv is less than 30 000, which overcomes the aforementioned problems associated with full-size antibodies.27 Brolucizumab (Novartis), which targets and inhibits vascular endothelial growth factor A (VEGF-A), is the most successful scFv therapeutic agent and was approved by the FDA in 2019.14 Compared to other anti-VEGF biopharmaceutical agents against neovascular age-related macular degeneration, such as aflibercept or ranibizumab, brolucizumab has a higher affinity for all VEGF-A isoforms. However, the low molecular weight of an scFv also results in a short half-life, and not all scFvs can be produced at high yield. Additionally, some scFvs show low homogeneity and aggregation tendencies; thus, various studies have been conducted to avoid these problems.27–30 To overcome the low homogeneity of scFv, Yamauchi et al. cyclized scFv using the sortase A-mediated ligation method, in which the LPETG motif on the C-terminus, recognized by sortase A, was cleaved between threonine and glycine and connected with the N-terminal glycine residues.31 As expected, the cyclized scFv showed improved homogeneity, with less aggregation, while maintaining the antigen-binding ability of the original scFv. However, preparing cyclized scFv requires additional cyclization steps in future pharmaceutical manufacturing, which could decrease the final yield or increase production costs. As many of these problems involve trade-offs between full-size mAbs and scFv, the format should be selected according to the purpose and strategy of developing therapeutic agents.
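The sortase A-mediated cyclization described above can be sketched as a simple string operation. This is only an illustration under stated assumptions: the function name and the toy sequence are hypothetical and are not taken from Yamauchi et al.; only the LPETG motif, the cleavage position between threonine and glycine, and the requirement for an N-terminal glycine come from the text.

```python
# Illustrative sketch only: function name and toy sequence are hypothetical.
SORTASE_MOTIF = "LPETG"  # C-terminal recognition motif named in the text


def cyclize_with_sortase(linear: str, motif: str = SORTASE_MOTIF) -> str:
    """Return the residues retained in the ring after head-to-tail ligation.

    The motif is cleaved between Thr and Gly, and the new C-terminal Thr is
    joined to the N-terminal Gly, so everything after the Thr is released.
    """
    if not linear.startswith("G"):
        raise ValueError("an N-terminal glycine is required for ligation")
    site = linear.rfind(motif)
    if site == -1:
        raise ValueError("no sortase recognition motif found")
    # Keep everything up to and including the Thr of the motif.
    return linear[: site + len(motif) - 1]


# Toy construct: N-terminal glycines, a placeholder body, the motif, a His tag.
toy_construct = "GGG" + "EVQLVESGG" + "LPETG" + "HHHHHH"
ring_residues = cyclize_with_sortase(toy_construct)
print(ring_residues)
```

In this toy case, the glycine of the motif and the trailing His tag are released, which is one reason the extra enzymatic step can affect final yield.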
B. Single-domain antibody fragment
Recently, VHHs derived from camelid antibodies have attracted attention due to their simple structure. A VHH binds to target antigens using only a single immunoglobulin domain with adequately high stability, which enables easy recombinant production with an affinity comparable to that of full-size antibodies.32 In particular, the resistance of VHHs to high temperatures contributes to their pharmaceutical use, which requires high drug stability.33 Additionally, most VHHs share high sequence homology with the human VH3 subfamily (IGHV3); thus, low immunogenicity is expected in pharmaceutical usage.34,35 The variable domain of the new antigen receptor (VNAR) derived from shark antibodies is another known single-domain antibody. Similar to VHHs, VNARs show high stability and solubility, and their three-dimensional structure shares high homology with VL or T-cell receptor V domains.35,36 However, using VNAR domains potentially induces an undesirable immune response, which is the greatest hurdle for pharmaceutical applications. To avoid this problem, humanization of VHH and VNAR and preparation of human-derived single-domain antibody fragments have been considered in the last few decades.35–38
C. Bi- or multi-valent antibody fragment
More complex antibody fragments with two or more pairs of variable regions have also been designed, and their effectiveness has been demonstrated by many researchers. A taFv consists of two scFvs linked via a short peptide linker and has a molecular weight of approximately 55 000. In the case of taFvs, researchers can easily construct bispecific antibodies by simply combining two scFvs that bind to different target antigens. This antibody fragment has some advantages, such as increased avidity, ease of construction of the bispecific form, and high tissue penetration ability, similar to scFv. TaFvs that recruit T cells or natural killer cells are designated bispecific T-cell engagers or bispecific killer cell engagers,39 respectively. The bispecific T-cell engager blinatumomab (Amgen), which simultaneously binds to the T-cell surface antigen CD3 and B-cell differentiation antigen CD19, is the only bispecific antibody fragment approved by the FDA.18 Extremely high cytotoxicity (ED50 values of 10⁻¹²–10⁻¹³ M) can be induced by the recruitment of T-cells to cancer cells, although the affinity of blinatumomab is not extremely high against CD3 (equilibrium dissociation constant [KD] = 1.0 × 10⁻⁷ M) and CD19 (KD = 1.0 × 10⁻⁹ M).40,41 Diabody, single-chain diabody (scDb), and dual-affinity retargeting (DART) are bivalent antibody fragment formats that consist of two pairs of variable domains, similar to taFv. However, the manner in which the domains are connected is different. Diabodies are usually constructed by adjusting the peptide linker length between VH and VL to 3–12 amino acids.26 Additionally, it is possible to construct a bispecific diabody by using scFvs from two different clones.42 Although the diabody has advantages owing to its small size, similar to the taFv format, dissociation in the sample solution is one of the problems in the manufacturing process because the proper formation of the diabody is based on weak non-covalent bonding between VH and VL.
Moreover, it is difficult to produce homogeneous heterodimers, especially in the case of bispecific diabodies, because the expression levels of each chain cannot be equally regulated when using co-expression vectors. To overcome these problems, an scDb format was developed by linking each diabody chain.43 Some studies have reported that the scDb format was expressed as a soluble monomer form in bacteria.44,45 Furthermore, DART is a bivalent antibody fragment in which an additional C-terminal disulfide bridge is introduced into the scDb to improve molecular stabilization. Johnson et al. investigated the stability of an anti-CD16 × anti-CD32B DART in formulation buffer and showed that the molecular properties were unchanged after storage at 2–8 °C for nine months.41 Antibody fragments with trivalent or tetravalent antigen-binding sites have also been developed. Triabodies or tetrabodies consist of three or four pairs of variable domains with molecular weights of approximately 75 000 and 100 000, respectively.26 These antibody fragments usually consist of three or four scFv-based fragments, with peptide linkers of fewer than three amino acids between the domains to form proper multivalent structures. ScDb-scFv is another potential trivalent antibody format. NM21-1480 is a trispecific scDb-scFv with a structure in which an anti-4-1BB × anti-programmed cell death ligand 1 scDb on the N-terminus is linked to an anti-human serum albumin scFv on the C-terminus by a (G4S)2 linker; a phase I/II study of this agent is currently ongoing.46 Tandem diabodies, also known as TandAbs, are tetravalent bispecific antibody fragments. In this format, bivalent binding to individual targets reduces the dissociation rate constant (koff) and enables longer retention on the target cell surface.47 Furthermore, its long half-life, owing to its high molecular weight, results in excellent pharmacokinetics.
Additionally, single-domain antibody fragments are more suitable for creating multivalent formats by simply connecting them to each other. Caplacizumab (Sanofi), a therapeutic agent for acquired thrombotic thrombocytopenic purpura, contains two identical VHH domains connected by a short peptide linker.16,17 Single-domain antibody fragments have also been used to prepare bispecific formats. Makabe et al. reported bispecific VHH constructs using a protein trans-splicing reaction.48 The two VHH components were fused to the N- or C-terminus of the split intein, and the intein-removed bispecific VHH maintained the function of each VHH component. Cyclization technology has also been applied to bispecific VHH via the split-intein-mediated circular ligation of peptides and proteins to protect against exopeptidase digestion.49 As expected, the cyclized VHH bispecific antibodies showed enhanced resistance to peptidase digestion while maintaining their function. Cyclization is a commonly adopted strategy to increase the target protein stability,50 and its versatility has been demonstrated in the case of VHH.
These antibody fragments overcome the problem of full-size mAbs in target antigen depletion owing to their enhanced penetration ability and easy construction of multivalent antibodies. Moreover, they can be expressed using a cost-effective microbial expression system. As mentioned above, various antibody fragment formats have been reported and advanced in clinical trials, and this trend is expected to continue. Researchers must select an appropriate antibody format from various perspectives, ranging from efficacy to manufacturing, by considering the characteristics of each format.
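The linker-length rules of thumb quoted in this section (at least 12 residues for scFv, 3–12 residues for diabodies, and fewer than three residues for triabodies or tetrabodies) can be collected into a small lookup. This is a minimal sketch rather than a predictive model: the function name is hypothetical, the quoted ranges overlap at 12 residues and this sketch arbitrarily assigns that boundary to scFv, and real assembly behavior also depends on the clone and domain order.

```python
def expected_format(linker_length: int) -> str:
    """Map a VH-VL linker length (in residues) to the format the text links it to.

    Assumption: the boundary at 12 residues, where the quoted ranges overlap,
    is treated here as scFv.
    """
    if linker_length < 0:
        raise ValueError("linker length cannot be negative")
    if linker_length < 3:
        return "triabody/tetrabody"
    if linker_length < 12:
        return "diabody"
    return "scFv"


# Example linker lengths; 15 would correspond to a (G4S)3-type linker.
for n in (0, 5, 15):
    print(n, expected_format(n))
```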
III. UPSTREAM PROCESS
In the upstream process of antibody manufacturing, researchers typically focus on how the cultivation process can offer high POI productivity while suppressing the generation of impurities or mutants. This criterion is similar for antibody fragments, and some researchers have attempted to develop engineering strategies to enhance the upstream process productivity or stability. In the last few decades, the technology for synthesizing recombinant proteins has been rapidly established because of the development of genetic engineering methods. Various types of recombinant proteins have been heterologously expressed in host cells, such as bacteria,51 filamentous fungi,52 yeast,53 microalgae,54 insect cells,55 plants,56 and mammalian cells,57 as well as by cell-free synthesis.58 Among these, the E. coli expression system has some advantages, such as low expression cost, short doubling time, and ease of manipulation. Hence, many researchers have selected it as the first choice for host cells, along with CHO cells.51 However, not all recombinant proteins are expressed in a soluble form in E. coli; some are expressed as inclusion bodies. Although the refolding of inclusion bodies to recover POI activity has been discussed and some efficient methodologies have been reported, the loss of yield remains a persistent problem.59 Therefore, various strategies for expressing soluble recombinant proteins have been investigated. While some are non-protein engineering methods, such as host engineering, selection of expression vectors, and cultivation conditions,51 others are protein engineering strategies. In this section, we discuss the strategies for enhancing the expression of soluble recombinant proteins by focusing on protein engineering.
A. Introduction of fusion tags for high production levels
Over the last few decades, many studies on introducing peptide- or protein-based tags for highly soluble POI expression have been conducted, and efficient heterologous expression for various target recombinant proteins has been achieved. Maltose-binding protein (MBP) is a popular protein-based tag that achieves high heterologous expression levels in bacterial expression systems. Fusion to MBP tags has been found to improve the soluble expression of some recombinant proteins.60,61 The mechanisms that enhance POI expression upon MBP tag fusion in the E. coli expression system possibly include intrinsic chaperone activity, which is especially pronounced when the tag is fused to the N-terminus of the POI (reviewed in detail by Costa et al.62). Because of this excellent characteristic, MBP is a promising fusion tag for the expression of recombinant proteins in a soluble form. However, for the efficient soluble expression of target recombinant proteins, selecting appropriate combinations of host cell strains, tags, and culture conditions is important. In this section, we describe recently developed fusion tag systems that are already applied or applicable in the future for the purification of antibody fragments (Table I). Loss of the N-terminal region of the carbonic anhydrase of Hydrogenovibrio marinus decreased its soluble expression, indicating that this region is a useful solubility tag (NEXT tag, 53 amino acids long).63 The NEXT tag improved the soluble expression of recombinant proteins, including epidermal growth factor, green fluorescent protein (GFP), and two enzymes. Upon fusion of the NEXT tag, the soluble expression level of the POI in E. coli BL21 (DE3) increased by approximately 5.6–8.3-fold, with a minimal decrease in POI activity, compared to that upon fusion of the MBP or Fh8 tag derived from the low-molecular-weight Fasciola hepatica 8-kDa antigen. Li et al. reported that the 77-amino acid acyl carrier protein tag, called the ACP tag, from E. coli fatty acid synthase could enhance the soluble expression of the tobacco etch virus (TEV) protein using the E. coli BL21 (DE3) strain as a host cell.64 The strong electrostatic repulsion evoked by the acidic acyl carrier protein sequence reduces the interactions between fusion proteins and their oligomerization. CBM66 (164 amino acids long) from the exo-levanase of Bacillus subtilis (BsSacC), a carbohydrate-binding module, has been used as a solubility enhancement tag for signaling polypeptides and microbial enzymes.65 The mechanism underlying the solubilization of POI using this tag is considered chaperone-like; soluble expression levels of POI were notably increased when protein expression was induced in E. coli BL21 (DE3) at low temperatures. The Fh8-ΔI-CM system developed by Zhang et al. consists of an 8-kDa Fh8 tag to enhance POI solubility and a self-cleavable intein ΔI-CM for effective purification.66 The production rate of the soluble form of the tumor necrosis factor-related apoptosis-inducing ligand in E. coli BL21 (DE3) increased 4.5-fold with this system compared to that without it. The 384-amino acid Grl1p tag from Tetrahymena thermophila was developed by Agrawal et al., and its performance was evaluated by fusion with Plasmodium falciparum proteins Pfs25 and Pfs48/45.67 In particular, the fusion of the Grl1p tag to Pfs25 dramatically enhanced the soluble expression (180-fold) in E. coli SHuffle, possibly due to the net negative charge of Grl1p (pI = 4.28). Although these protein-based tags can enhance POI expression in soluble form, large tags may negatively impact the POI function. This problem can be avoided by inserting additional steps for tag digestion, similar to those in the Fh8-ΔI-CM system; however, this additional step is usually unfavorable in pharmaceutical production because it leads to loss of the POI.
To overcome these problems, researchers have focused on developing smaller peptide-based solubility tags (< 50 amino acids) to enhance the soluble expression of recombinant proteins. The NT11 tag from Dunaliella species consists of an 11-amino acid sequence (VSEPHDYNYEK). Nguyen et al. demonstrated its effect by fusing it to two carbonic anhydrases and yellow fluorescent protein (YFP) expressed in E. coli BL21 (DE3).68 Fusion of the NT11 tag enhanced the soluble expression of POI by up to 5.0-fold without inhibiting the protein function. Zou et al. reported that the 15-residue PCDS tag (DIAGAAAPARVRNAS) from Pseudomonas putida NCIMB 9866 enhanced the expression levels of PcaHG98 and YFP in E. coli BL21 (DE3).69 The pI of the PCDS tag was 9.849; this alkaline characteristic could improve POI solubility by influencing the electrostatic interactions between POI molecules. Moreover, S1v1 and NCTR25 tags can be used for metal affinity chromatography and are thus considered promising functional tags for efficiently preparing recombinant proteins. The S1v1 tag (AEAEAHAH)2 from the zuotin protein derived from Saccharomyces cerevisiae enhanced the expression of three types of POIs (polygalacturonate lyase, lipoxygenase, and GFP) by up to 3.8-fold in E. coli BL21 (DE3) and E. coli Rosetta (DE3).70 The 25-amino acid sequence of the NCTR25 tag (MDHSHHMGMSYMDSNSTMQPSHHHP) derived from human copper transporter 1 increased the expression level of enhanced green fluorescent protein (EGFP) in E. coli BL21 (DE3) pLysS by 63% compared to that of EGFP without NCTR25.71 In addition, because the tag sequence is of human origin, its immunogenicity is expected to be low.
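Because several of these tags are discussed in terms of their charge (for example, the alkaline pI of the PCDS tag), a rough residue count can make the comparison concrete. The sketch below is a simplification under stated assumptions: it counts Lys/Arg as +1 and Asp/Glu as −1, and ignores histidine, the termini, and pH-dependent ionization, so it is not a pI calculation. The function name is hypothetical; the sequences are those quoted in the text, with (AEAEAHAH)2 read as two repeats.

```python
# Crude charge estimate: Lys/Arg count as +1, Asp/Glu as -1.
# Histidine, the termini, and pH effects are deliberately ignored.
BASIC = set("KR")
ACIDIC = set("DE")


def approx_net_charge(sequence: str) -> int:
    """Return (number of K/R) minus (number of D/E) for a peptide sequence."""
    seq = sequence.upper()
    return sum(res in BASIC for res in seq) - sum(res in ACIDIC for res in seq)


# Tag sequences as quoted in the text.
tags = {
    "NT11": "VSEPHDYNYEK",
    "PCDS": "DIAGAAAPARVRNAS",
    "S1v1": "AEAEAHAH" * 2,
    "NCTR25": "MDHSHHMGMSYMDSNSTMQPSHHHP",
}

for name, seq in tags.items():
    print(f"{name}: length {len(seq)}, approximate net charge {approx_net_charge(seq):+d}")
```

Under this simplified count, PCDS is the only one of the four tags with a positive value, in line with its reported alkaline pI, whereas the other three come out negative.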
Newly developed tag systems for protein expression since 2017. EGFR, epidermal growth factor receptor; VHH, variable domain of heavy chain of heavy-chain antibody; YFP, yellow fluorescent protein; PGL, polygalacturonate lyase; LOX, lipoxygenase; GFP, green fluorescent protein; TEV, tobacco etch virus; hEGF, human epidermal growth factor; TNF-α, tumor necrosis factor α; FDH, formate dehydrogenase; GDH, glucose 1-dehydrogenase; MBP, maltose binding protein; MLL, mixed-lineage leukemia 3 protein; fhSP, fragment human surfactant protein; Aβ, amyloid β.
Tag | Tag sequence | Target protein | Host cell | Derivation | Reference | Year |
---|---|---|---|---|---|---|
NEXT | 53 amino acids | hEGF; GFP; Thermovibrio ammonificans carbonic anhydrase; Polyethylene terephthalate-hydrolyzing enzyme | E. coli BL21 (DE3) | Hydrogenovibrio marinus carbonic anhydrase | 63 | 2022 |
ACP | 77 amino acids | TEV | E. coli BL21 (DE3) | E. coli fatty acid synthase | 64 | 2023 |
CBM66 | 164 amino acids | 4 human-derived signaling polypeptides; 3 microbial enzymes | E. coli BL21 (DE3) | exo-levanase of Bacillus subtilis (BsSacC) | 65 | 2021 |
Fh8-ΔI-CM | 8 kDa | TNF-related apoptosis-inducing ligand | E. coli BL21 (DE3) | Fasciola hepatica | 66 | 2018 |
Grl1p | 384 amino acids | Pfs25; Pfs48/45 | E. coli SHuffle | Tetrahymena thermophila | 67 | 2019 |
mCherry | 28 kDa | Pyrobaculum yellowstonensis WP-30 trehalose transferase | E. coli BL21 (DE3) | Discosoma sp | 74 | 2019 |
SPY | 31 kDa | Im7 L53A I54A; Human muscle acylphosphatase; FDH; GDH; MBP G32D I33P; MLL | E. coli BL21 (DE3) | Bacterial periplasmic chaperone | 75 | 2020 |
Ffu | 209 or 217 or 312 amino acids | Community-acquired respiratory distress syndrome toxin; Vascular endothelial growth factor receptors-2; Rubella virus structural polyproteins (RVs); Chlamydia major outer membrane protein (Omp85); Hirudin variant III | E. coli BL21 (DE3) | Arthrobacter arilaitensis NJEM01 | 76 | 2017 |
NT | 133 amino acids | rSP-C33Leu; rKL4; rfhSP-D; rCCK-58; rAβ1-42; rhCAP-18; c-Myc-rβ17; rSP-Css | E. coli BL21 (DE3) | Spider silk protein | 77 | 2017 |
CASPON | ERNKERKEAELEAETAEQ | Fibroblast growth factor 2; Mature TNF-α; BIWA4; Human growth hormone; Granulocyte-colony stimulating factor; Parathyroid hormone; SARS-CoV-2 nucleocapsid protein; Interferon-gamma | E. coli BL21 (DE3) | T7 bacteriophage | 78 | 2022 |
intein+kitin | | carboxypeptidase G2 | E. coli BL21 (DE3) | Mycobacterium xenopi (intein); Bacillus circulans (CBD) | 79 | 2021 |
NCTR25 | MDHSHHMGMSYMDSNSTMQPSHHHP | eGFP | E. coli BL21 (DE3) pLysS | Human copper transporter 1 | 71 | 2019 |
NT11 | 11 amino acids | Carbonic anhydrase; YFP | E. coli BL21 (DE3) | Dunaliella species | 68 | 2019 |
PCDS | 15 amino acids | PcaHG98; YFP | E. coli BL21 (DE3) | Pseudomonas putida NCIMB 9866 | 69 | 2021 |
S1v1 | (AEAEAHAH)2 | PGL; LOX; GFP | E. coli BL21 (DE3) (PGL, LOX); E. coli Rosetta (DE3) (GFP) | Saccharomyces cerevisiae | 70 | 2018 |
C9R | (GR3)3 | TEV protease | E. coli BL21 (DE3) pLysS | … | 80 | 2018 |
Iasp | 45 amino acids | eGFP; mCherry; Matrix metalloprotease-13; Myostatin (growth differentiating factor-8) | E. coli BL21-star (DE3) | Bacillus thuringiensis | 81 | 2020 |
S1nv | S1nv10: (ANANARAR)2; S1nv11: (ANANARAR)3; S1nv17: ANANARARANANAR | PGL; LOX; L-asparaginase (ASN); transglutaminase (MTG) | E. coli BL21 (DE3) | Saccharomyces cerevisiae | 82 | 2019 |
hexa-lysine | KKKKKK | Myosin light chain 6 (gMYL6) | E. coli BL21 (DE3) | … | 83 | 2018 |
C5K | KKKKK | Anti-EGFR VHH-7D12 | E. coli BL21 (DE3) pLysS | … | 72 | 2021 |
SF | GAGAGS | Anti-human epididymis protein 4 VHH 4G8, 1G8, and 3A3 | E. coli BL21 (DE3) | Silkworm Bombyx mori | 73 | 2022 |
In some cases, the effectiveness of solubility enhancement has been investigated using antibody fragments. Kibria et al. developed a C5K tag comprising five sequential lysine residues.72 In this study, the effects of tags comprising five or nine sequential lysine or arginine residues on the soluble expression of the anti-epidermal growth factor receptor VHH in E. coli BL21 (DE3) pLysS were compared. It was concluded that the C5K tag increased POI expression by over 80% while maintaining structural and functional properties by inhibiting protein aggregation. Yu et al. reported that a repetitive amino acid sequence motif (SF tag, GAGAGS) from the silk fibroin protein facilitated the solubilization of three VHH clones targeting human epididymis protein 4.73 In this study, the authors evaluated the fusion of one to five SF tag cassettes to the three VHH clones. They showed that the soluble expression level in E. coli BL21 (DE3) increased as the SF cassettes were extended, without any loss of antigen-binding ability. Moreover, the authors evaluated the mechanism of solubilization enhancement using RT-PCR and mRNA structure analysis. They showed that fusion of the SF tag increased the transcription level and the structural stability of the mRNA, both of which had a positive impact.
We summarized the recent developments in fusion tags for the enhancement of soluble expression. In the cultivation phase of the pharmaceutical industry, developing processes that maintain the POI function without inducing side effects during future dosing is important. Although some newly developed fusion tags have been evaluated for their solubility enhancement effects using non-antibody fragments, these tags have sufficient potential to be applied to antibody fragments. However, because most of these tags are derived from non-human sequences, we must consider their immunogenicity and rapid clearance when dosing in humans. Because many other fusion tags have also been developed to enhance recombinant protein expression in the soluble form, researchers must carefully evaluate the introduction of solubilization tags by considering the above-mentioned characteristics.
B. CDR grafting
Each variable region (VH and VL) of the human antibody has three complementarity-determining regions (CDRs). These regions are three-dimensionally localized at the vertex of the variable domain and recognize the target antigens. Mouse antibodies acquired from hybridomas are known to exhibit serious immunogenicity when administered to humans. CDR grafting is a frequently adopted methodology for humanizing non-human antibodies, in which the CDR cassettes of non-human antibodies are grafted onto human-antibody frameworks to suppress immunogenicity and prevent serious side effects.84,85 Some researchers have applied this methodology to enhance the solubility of antibody fragments by grafting CDRs onto the framework regions (FRs) of other human antibodies that originally showed favorable folding properties in the periplasmic space when using E. coli expression systems (Fig. 2). Jung et al. evaluated the enhancement of solubility by grafting the CDRs of antibody clone 4-4-20 (whose scFv was originally expressed as an inclusion body) onto the FRs of humanized antibody clone 4D5, which showed favorable folding properties.86–88 In this study, the authors superimposed the x-ray structures of both antibody-variable regions and determined that four framework residues (VL46, VL66, VH71, and VH78) were required to conserve proper CDR orientation against the target antigen. The soluble expression rate increased after CDR grafting when the E. coli SB536 strain was used as a host cell without any loss of target-binding ability. Similarly, Öncü et al.
grafted the CDRs of the anti-kinase insert domain receptor 1.3 scFv onto Lig7 scFv, which originally showed efficient periplasmic expression with moderate sequence similarity to the anti-kinase insert domain receptor 1.3 scFv.89 Before the actual expression, the authors calculated solubility using the machine learning-based solubility predictor SoluProt.90 They showed that the predicted solubility of the CDR-grafted Lig7 scFv sequence was comparable to that of the original Lig7 scFv. Recombinant expression using the E. coli BL21 Star (DE3) strain showed that CDR-grafted Lig7 scFv was expressed in the soluble fraction. These reports indicate that soluble expression of any antibody fragment can be achieved using CDR (or FR) grafting, but only if adjustments based on sequence or three-dimensional structure comparison are implemented. Safdari et al. investigated the residues important for scFv solubility.91 The authors first searched for 100 human germline sequences that had high sequence identities with the cetuximab variable domains, which were originally expressed in an insoluble form in a bacterial expression system, using the IMGT/DomainGapAlign tool (https://www.imgt.org/IMGTindex/IMGTDomainGapAlign.php). From these sequences, the authors selected 40 VHs and 26 VLs (each belonging to the IGHV3 or VL3 kappa subfamily [IGKV3]) for the second round of selection because they originally had proper folding properties. In the second round, the authors selected four VH and three VL candidate sequences based on the similarity of the canonical classes. In the third round of selection, 3D models of the CDR-grafted sequences were superimposed onto the cetuximab variable domains, and root mean square deviation calculations and additional mutations were performed to select the final sequences while minimizing the adverse effects on the structures. 
CDRs of cetuximab scFv were grafted onto the selected scFv FRs, and the final scFv sequence (66*01–11*01) was successfully expressed as a soluble protein using the E. coli BL21 strain. Further investigation to identify the amino acids most closely related to solubility was conducted by focusing on the hydrophobic cores of scFv. Although no clues were obtained for VL, phenylalanine-29 of VH is an essential residue because it stabilizes the structure by interacting with the triplet core (Ala24/Arg71/Leu78, in the case of cetuximab in this study). This knowledge will contribute to the pharmaceutical use of antibody fragments by efficiently enhancing their productivity in the soluble form. Taken together, CDR grafting is an effective strategy for enhancing the solubility of antibody fragments. However, this technology sometimes distorts the entire structure of antibody fragments and induces loss of function, such as antigen-binding ability or stability. In this case, researchers should replace some of the vernier zone residues, which are located in the FR and support the CDR structure,92,93 with their parental amino acids to retain the functions of the antibody fragments.
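At the sequence level, the grafting and back-mutation steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of any cited study: the sequences are toy strings rather than antibody sequences, the CDR boundaries are hypothetical index ranges (in practice they would be taken from a numbering scheme such as Kabat or IMGT), and `graft_cdrs` and `back_mutate` are hypothetical helper names.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # half-open [start, end) indices into a sequence


def graft_cdrs(acceptor: str, acceptor_cdrs: List[Span],
               donor: str, donor_cdrs: List[Span]) -> str:
    """Replace each acceptor CDR with the corresponding donor CDR.

    The acceptor supplies the framework regions (FRs); the donor supplies
    the CDRs. Spans must be listed in N- to C-terminal order.
    """
    if len(acceptor_cdrs) != len(donor_cdrs):
        raise ValueError("donor and acceptor must have the same number of CDRs")
    parts = []
    cursor = 0
    for (a_start, a_end), (d_start, d_end) in zip(acceptor_cdrs, donor_cdrs):
        parts.append(acceptor[cursor:a_start])  # acceptor framework segment
        parts.append(donor[d_start:d_end])      # donor CDR
        cursor = a_end
    parts.append(acceptor[cursor:])             # trailing framework segment
    return "".join(parts)


def back_mutate(grafted: str, reversions: Dict[int, str]) -> str:
    """Revert selected framework positions (0-based) to the parental residue,
    mimicking the restoration of vernier zone residues."""
    seq = list(grafted)
    for pos, residue in reversions.items():
        seq[pos] = residue
    return "".join(seq)


# Toy example: lowercase = framework, uppercase = CDR (hypothetical strings).
acceptor = "aaaaXXXbbbbYYYYccccZZdddd"
acceptor_cdrs = [(4, 7), (11, 15), (19, 21)]
donor = "eeeeQQQQffffRRRggggSSSShhhh"
donor_cdrs = [(4, 8), (12, 15), (19, 23)]

grafted = graft_cdrs(acceptor, acceptor_cdrs, donor, donor_cdrs)
restored = back_mutate(grafted, {9: "f"})  # hypothetical vernier position
print(grafted)
print(restored)
```

The sketch only manipulates strings; whether a grafted construct folds and binds still has to be checked by structural comparison and expression experiments, as in the studies above.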
A scheme of the CDR grafting strategy for high-level soluble expression.86,89 The CDRs from an antibody with the propensity for insoluble expression are grafted onto an antibody with potentially soluble expression. The grafted antibody fragment often improved its folding property and expression level in soluble fractions. CDR, complementarity-determining region; FR, framework region.
C. Domain arrangement
As described in Sec. II, some antibody fragments consist of one or more pairs of variable domains. In the design phase of antibody fragments, researchers can choose how the components are oriented: two domain arrangements in scFv or scFv dimer (diabody) (VH is located at the N- or C-terminus), four options in the bispecific diabody (each chimeric component has two possible domain orders: VH domain-linker-VL domain or VL domain-linker-VH domain), and eight options in the bispecific scDb, bispecific taFv, and DART (Fig. 3). In this review, to unify the different expressions used in the articles, we refer to the arrangement in which VL is located N-terminal to VH as the LH type and to the arrangement in which VH is located N-terminal to VL as the HL type. In the case of scFv, Koçer et al. demonstrated the influence of domain arrangement on the expression levels of scFv using an E. coli expression system.94 Here, the authors selected the SHuffle T7 Express cell system because it improves disulfide bond formation efficiency and protein folding.95 The results showed that the LH type (VL located at the N-terminus) was expressed more efficiently than the HL type (VH located at the N-terminus). However, the HL type was superior in monomer formation and target antigen-binding ability. Ayyars et al. also evaluated the influence of domain arrangement and the length of connecting peptide linkers on soluble expression levels using the E. coli Top 10 strain.96 In this study, the authors showed that the LH type, especially when connected with a long peptide linker, enabled efficient production in the soluble fraction. Similarly, the influence of domain order on the expression levels was evaluated for diabody, scDb, and taFv. We previously reported the construction of a bispecific diabody designated as hEx3, which simultaneously binds to the epidermal growth factor receptor on cancer cells and CD3 on T cells and showed the potential for high cancer cytotoxicity. 
Four types of domain orders were expressed in the E. coli BL21 Star (DE3) strain, and their expression levels differed (2.5, 0.6, 3.6, and 2.7 mg/l culture).97 In addition, we previously conducted a comprehensive study on the domain order of scDb and taFv formats that consist of the same antibody clones as hEx3 diabodies using E. coli BL21 Star (DE3), Pichia pastoris PPS-9010, and Bacillus choshinensis HPD31.45 Interestingly, the arrangement with high expression levels varied depending on the host cell, although the strength of the antibody function was unchanged among the selected host cells. In addition, although some studies have reported a relationship between domain arrangement and antibody expression levels, the appropriate domain order differs among selected antibody clones or host cells, even within the same fragment formats.98 These results indicate that the proper domain order for effective expression depends on the antibody clone and host cell used; therefore, researchers should carefully investigate the combination of the proper domain order and production host cell from the early design phase. Additionally, alterations in domain orientation often affect the function of antibody fragments, and an optimal solution should be sought that balances expression level and function.
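The numbers of arrangements quoted above (two, four, and eight) can be reproduced by simple enumeration. The sketch below is illustrative only: the function names are hypothetical, the clone labels EGFR and CD3 are borrowed from the hEx3 example, and the taFv enumeration (order of the two scFv units multiplied by the internal order of each unit) is one assumed way of arriving at eight candidates.

```python
from itertools import permutations, product


def scfv_orders(clone: str):
    """Two orders for one VH/VL pair: HL (VH first) and LH (VL first)."""
    vh, vl = f"VH_{clone}", f"VL_{clone}"
    return [(vh, vl), (vl, vh)]


def bispecific_diabody_orders(a: str, b: str):
    """Two chimeric chains, each pairing the VH of one clone with the VL of
    the other; each chain is either VH-linker-VL or VL-linker-VH."""
    chain1 = [(f"VH_{a}", f"VL_{b}"), (f"VL_{b}", f"VH_{a}")]
    chain2 = [(f"VH_{b}", f"VL_{a}"), (f"VL_{a}", f"VH_{b}")]
    return list(product(chain1, chain2))


def tafv_orders(a: str, b: str):
    """Tandem scFv on a single chain: vary which scFv unit comes first and
    the internal domain order of each unit."""
    orders = []
    for first, second in permutations((a, b)):
        for unit1, unit2 in product(scfv_orders(first), scfv_orders(second)):
            orders.append(unit1 + unit2)
    return orders


scfv = scfv_orders("EGFR")
diabody = bispecific_diabody_orders("EGFR", "CD3")
tafv = tafv_orders("EGFR", "CD3")

print(len(scfv), len(diabody), len(tafv))
for order in tafv:
    print("-".join(order))
```

Such a list is only the design space; which candidate expresses best has to be determined experimentally for each clone and host cell, as discussed above.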
Schematic diagrams of the constructs of (a) scFv, (b) diabody, (c) scDb, and (d) tandem scFv. The signal peptide is shown using a green-colored hexagon, whereas the short, middle, and long linkers are colored yellow, gray, and pink, respectively. VH, variable heavy domain; VL, variable light domain.
IV. DOWNSTREAM PROCESS
A. Development of affinity tags for purification
Although some antibody fragments can bind to Staphylococcus aureus protein A or Peptostreptococcus magnus protein L, almost all antibody fragments lacking the Fc region must be purified using alternative purification methods. Affinity tag systems are frequently used because users can select an appropriate tag system according to their experimental conditions. In this section, we describe affinity tag systems that are already applied or applicable for the purification of antibody fragments based on combinations of tag systems and ligand types: (i) peptide-based tags and protein ligands, (ii) peptide-based tags and non-protein ligands, and (iii) protein-based tags and any kind of ligand.
1. Utilization of interaction of peptide tags with protein ligands
Among the peptide-based tags, epitope peptides to which a particular antibody binds are the most frequently developed. Hemagglutinin, FLAG, and Myc tags are preferred by many researchers as purification affinity tags for recombinant proteins, and purification kits based on these principles have been developed.99–101 Antigen–antibody reactions are generally supported by hydrophobic interactions, hydrogen bonds, electrostatic interactions, and van der Waals forces, resulting in high specificity and strong binding. Since 2016, several research groups have developed epitope peptide tag systems using immunoglobulin G as a ligand on chromatographic support, including MAP,102 RAP,103 AGIA/E2D,104 CP5,105 2B8,106 RIEDL,107 RA,108 and T5 tag109 (Table II). There are various backgrounds for new tag development; some are strategically developed to pursue a smaller size or more efficient elution, whereas others are not. Functional evaluation of RAP, CP5, and RA tags was conducted using recombinant proteins from various expression systems, such as E. coli, insect cells, cell-free protein synthesis, mammalian cells, yeast, and plants, which would be advantageous for actual pharmaceutical applications. In the case of the CP5, RA, and RIEDL tags, because their molecular weights are sufficiently small, users can avoid negative impacts on the POI, such as changes in its conformation or function. However, most of these tags are derived from nonhuman sequences; thus, researchers must consider their immunogenicity and additional tag removal steps when considering their pharmaceutical usage. Moreover, the use of MAP, CP5, and RA is sometimes undesirable because phosphorylation can occur at serine, threonine, or tyrosine residues, and ubiquitination can occur at lysine residues. 
Although various affinity tag systems using epitope peptides have been developed, a wide selection of tag systems is essential because appropriate tag systems differ based on target proteins, expression hosts, and other parameters. These tags enable the tagged POI to be separated from various host cell proteins in the pre-purification fraction. However, a major drawback of using epitope tag purification systems is that the production of immunoglobulin G is costly because of the preparation of hybridomas and the ethical problems associated with using ascites or antibody expression by the mammalian expression system. To overcome this problem, some research groups have recently developed novel epitope peptide tag systems, such as ALFA110 and BC2T tag.111 VHH was selected as the ligand protein because it can be produced using cost-effective microbial expression systems, which solves the problem of using full-size mAbs as purification ligands.112 The sequence of the ALFA tag (PSRLEEELRRRLTEP) was based on the combination of an artificial helical peptide (SRLEEELRRRL), an additional Thr-Glu dipeptide to neutralize the positive net charge, and flanking prolines to avoid a negative impact on the secondary structure of the core sequence.110 The binding between the ALFA tag and VHH is dominated by hydrogen bonds, salt bridges, and cation–π interactions. As a result of these strong interactions, KD was determined to be ∼26 pM using a surface plasmon resonance assay. Moreover, the authors engineered this VHH to weaken its affinity for efficient elution of ALFA-tagged POI, and the final affinity was regulated to 11 nM, which enabled adequate binding and a high recovery rate. Likewise, the BC2T tag (PDRKAAVSHWQQ) can bind BC2-VHH with high affinity (KD = 1.4 × 10−9 M), and Ren et al. engineered BC2-VHH to weaken the affinity with the BC2T tag for efficient purification. 
Mutation of the 44th residue from Glu to Asp disrupts the electrostatic interaction with Arg106 because of the difference in the side chain length. The flexibility of CDR3 of BC2-VHH resulting from this mutation decreased the stability of BC2-VHH and its binding affinity to the BC2T-tagged POI (KD = 1.4 × 10−7 M). Finally, affinity chromatography using a mutant BC2-VHH-immobilized resin showed that the BC2T-tagged POI could be purified with high product purity (95%). Combining such epitope tags with VHH ligands could accelerate the research and development of antibody fragments in the pharmaceutical industry.
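The extent to which the two VHH ligands were weakened can be expressed as a fold change in KD and as a change in binding free energy, using the standard relation ΔΔG = RT ln(KD,after/KD,before). The following sketch applies this relation to the KD values quoted above. The temperature of 298.15 K is an assumption (it is not stated in the text), and `affinity_change` is a hypothetical helper name.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature (25 degC); not given in the text


def affinity_change(kd_before: float, kd_after: float):
    """Return (fold change in KD, ddG in kcal/mol) for a change in KD.

    ddG = RT * ln(kd_after / kd_before); a positive value means weaker binding.
    """
    fold = kd_after / kd_before
    ddg = R_KCAL * T_KELVIN * math.log(fold)
    return fold, ddg


# KD values in mol/L as quoted in the text (original ligand, engineered ligand).
pairs = {
    "ALFA tag / NbALFA": (26e-12, 11e-9),
    "BC2T tag / BC2-VHH": (1.4e-9, 1.4e-7),
}

results = {name: affinity_change(*kds) for name, kds in pairs.items()}
for name, (fold, ddg) in results.items():
    print(f"{name}: {fold:.0f}-fold weaker, ddG = {ddg:+.2f} kcal/mol")
```

Under these assumptions, the 100-fold weakening of BC2-VHH corresponds to roughly 2.7 kcal/mol, which gives a sense of how small an energetic change can be enough to permit mild elution while retaining adequate binding.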
Newly developed affinity tags for protein purification since 2016. IgG, immunoglobulin G; Nb, nanobody (equal to VHH); Phy, phytochrome.
Tag | Tag sequence | Ligand | Elution | Reference | Year |
---|---|---|---|---|---|
Peptide tags with protein ligands | | | | | |
MAP | GDGMVPPGIEDK | PMab-1 (IgG) | Epitope peptide (GDGMVPPGIEDKIT) | 102 | 2016 |
RAP | DMVNPGLEDRIE | PMab-2 (IgG) | Epitope peptide (GDMMVNPGLEDRIE) | 103 | 2016 |
AGIA/E2D | EDAAGIARP | Ra48 (IgG) | Epitope peptide (EEAAGIARP) | 104 | 2016 |
CP5 | GQHVT | Ra62 (IgG) | Epitope peptide (GQHPT) | 105 | 2017 |
2B8 | RDPLPAFPP | 2B8 monoclonal antibody (IgG) | Epitope peptide | 106 | 2017 |
RIEDL | RIEDL | LpMab-7 (IgG) | Epitope peptide (RIEDLRIEDL) | 107 | 2020 |
RA | DIDLSRI | 47RA (IgG) | Epitope peptide | 108 | 2020 |
C | EPEA | NbSyn2 (camelid single-chain antibody) | 20 mM Tris-HCl, 2 M MgCl2 | 113 | 2017 |
ALFA | SRLEEELRRRLTE | NbALFA (VHH) | Epitope peptide | 110 | 2019 |
BC2T | PDRKAAVSHWQQ | BC2-VHH | Acidic or neutral pH | 111 | 2022 |
T5 | QRVRELAV | 5G10 (IgG) | Epitope peptide (QRVRELAV) | 109 | 2018 |
THETA | TKTKAARMTEQT | C13 scFv | Thermal release | 114 | 2019 |
PIF6 | 22 amino acids | PhyB* beads | Light-control | 115 | 2019 |
Peptide tags with other ligands | | | | | |
SB7 | RQSSRGR | Silica or Shirasu | 0.5–0.3 M l-arginine | 116 | 2016 |
Glu6 | EEEEEE | Bare iron oxide nanoparticles | Citrate buffered saline, pH 7 | 117 | 2019 |
S1v1 | (AEAEAHAH)2 | HisTrap™ FF Ni-nitrilotriacetic acid affinity column | 50 mM phosphate buffer containing 300 mM NaCl and 500 mM imidazole, pH 7.4 | 70 | 2018 |
HB | 34 amino acids | Heparin-sepharose | >500 mM NaCl | 118 | 2016 |
NCTR25 | 25 amino acids | Ni2+ | 300 mM imidazole | 71 | 2019 |
R5 | 19 amino acids | Silica gel | 1 M l-lysine containing 2% glacial acetic acid | 119 | 2018 |
Tryptophan-based tags | 6 amino acids | A4C8 | 0.1 M glycine-NaOH pH 11 0.1% tween | 120 | 2016 |
Protein tags | | | | | |
Cysta | Molecular weight = 11 000 | HisTrap | Imidazole | 121 | 2016 |
LSL | 187 amino acids | Sepharose | 0.2 M lactose | 122 | 2016 |
CL7 | 132 amino acids | Immunity protein 7 | 6 M guanidine 5 hydrochloride | 123 | 2017 |
mbT4L | 121 amino acids | Ni2+ | 300 mM imidazole (pH 8.0) | 124 | 2017 |
AK | Molecular weight = 27 000 | Blue-sepharose beads | 20 mM Tris-HCl, 50 μM Ap5A, 1 mM MgCl2 (pH 7.4) | 125 | 2016 |
CRDSAT | 156 amino acids | Lactose | 200 mM lactose | 126 | 2019 |
ChBD-AD | | Chitin | Factor Xa cleavage | 127 | 2020 |
SmbP | Molecular weight = 10 000 | Ni2+ | 200 mM imidazole | 128 | 2020 |
CBM64 | 88 amino acids | Cellulose | 50 mM Tris-HCl, 1 mM EDTA, and 0.5 M NaCl (pH 6.5) | 129 | 2022 |
CBD | 157 amino acids | Magnetic cellulose | Magnet | 130,131 | 2022, 2021 |
ABD | Molecular weight = 14 500 | Sepharose 6B-CL biotin agarose | 20 mM lactose, 5 mM biotin | 132 | 2017 |
CBM56-Tag | 85 amino acids | Curdlan | 2% (w/v) chitosan in 50 mM sodium acetate buffer, pH 5.5 | 133 | 2019 |
MagR | Molecular weight = 14 600 | Fe3O4–SiO2 nanoparticles | Magnet | 134 | 2017 |
MhPA14 | Molecular weight = 20 000 | Dextran | 5 mM EDTA or 100 mM glucose | 135 | 2020 |
2. Utilization of interaction of peptide tags with non-protein ligands
Immobilized metal affinity chromatography (IMAC), which is based on the affinity between a peptide tag and an immobilized metal ion and does not use any protein ligands, is another frequently used purification system for peptide-based tags.136 In particular, the polyhistidine tag, which binds to metal ions through coordinate bonds, is the most representative affinity tag in IMAC owing to its excellent characteristics, such as ease of engineering and low cost.137 Since the establishment of IMAC, various peptide-based affinity tags based on its principles have continued to be developed. Chaga et al. reported the HAT tag (KDHLIHNVHKEEHAHAHNK), which is derived from chicken lactate dehydrogenase.138 The HAT tag can bind to Co2+ with high affinity, and HAT tag-fused recombinant proteins, such as chloramphenicol acetyltransferase, dihydrofolate reductase, and GFP-UV-enhanced variant, were expressed using the E. coli DH5α strain and purified under very mild (neutral pH, low salt) conditions. Since the number of amino acids constituting the HAT tag is greater than that of the polyhistidine tag, the tag portion exists outside the recombinant protein structure, and this orientation contributes to efficient binding to metal ions. Lee et al. developed an HN tag (HNHNHNHNHNHN), in which histidine and asparagine were alternately arranged.139 The authors assumed that the presence of asparagine residues improved tag exposure on the protein surface and purification efficiency compared to those obtained using a polyhistidine tag. This characteristic enabled the efficient adsorption of the HN-tagged recombinant protein (fibroblast growth factor-2) and mild elution using 250 mM imidazole elution buffer. Similarly, new tag systems (NCTR25 and S1v1 tags) were recently developed as IMAC-based affinity tags.70,71 Because a polyhistidine tag can induce immunogenicity in vivo owing to its non-native origin, Pan et al. 
developed an NCTR25 tag (MDHSHHMGMSYMDSNSTMQPSHHHP) derived from human copper transporter 1 based on the principle of IMAC.71 The 14 N-terminal residues (MDHSHHMGMSYMDS) bind to Cu(II) and Cu(I) with high affinity (10 and 0.2 pM, respectively).140 Purification was conducted using the NCTR25 tag-fused recombinant protein (transthyretin) and Ni(II)-IDA agarose, and efficient adsorption and elution were achieved with 300 mM imidazole-containing buffer with a purity of over 95%. Moreover, the metal affinity was similar to that of the polyhistidine tag. In contrast, the S1v1 tag, (AEAEAHAH)2, derived from a self-assembling amphipathic peptide in the Zuotin protein, was developed by Zhao et al.70 Purification of the recombinant proteins (PGL, LOX, and GFP) fused with the S1v1 tag through a PT linker (PTPPTTPTPPTTPTPTP) achieved better purities and acceptable recovery rates than those achieved with the polyhistidine tag-fused protein. In addition, fusing the S1v1 tag enhanced protein expression and stabilization. As another type of peptide-based affinity tag system, peptide-based tags that interact with silica gel as a purification matrix have come into the spotlight.116,119 The interaction between the silica gel matrices and the silica tags is dominated by the positive surface charge of the silica tag. The R5 tag-fused recombinant protein (GFP, Venus YFP, or mCherry red fluorescent protein) was efficiently separated from the cell lysate of the E. coli DH5α PRO strain using an elution buffer including a positively charged small molecule (lysine).119 The KD was calculated as 0.43–1.09 ± 0.4 × 10−6 M using a quartz crystal microbalance, and the target band purity was comparable with that of the polyhistidine tag-fused protein. In contrast, Abdelhamid et al. developed a positively charged SB7 tag (RQSSRGR) derived from the coat proteins of Bacillus cereus spores.116 Alanine scanning mutagenesis of each residue showed that Arg was important for silica binding. 
The SB7 tag-fused recombinant protein (GFP, mCherry, or mKate2) expressed in E. coli Rosetta 2 (DE3) was eluted using a 0.5 M Arg-containing elution buffer (pH 8.0) with a high recovery rate. The SB7 tag showed similar purity and higher yield than the polyhistidine tag, and its versatility was also demonstrated using other recombinant proteins. Moreover, the authors considered using natural silica-containing volcanic ash particles (Shirasu) as an adsorbent to minimize the purification cost. They demonstrated the usefulness of Shirasu beads to separate SB7 tag-fused proteins from cell lysates. Notably, using low-cost natural particles, such as volcanic ash, could contribute to reduced drug prices and eco-friendly manufacturing.
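Because the silica-binding tags are described as acting through their positive charge, a rough net-charge estimate can help compare the short peptide tags mentioned in this review. The sketch below uses a deliberately crude rule, stated here as an assumption: Lys and Arg count as +1, Asp and Glu as −1, and histidine and the peptide termini are ignored (approximately neutral pH). `net_charge` is a hypothetical helper name, and the result is not a substitute for a proper pKa-based calculation.

```python
POSITIVE = set("KR")   # Lys, Arg counted as +1 (assumed near-neutral pH)
NEGATIVE = set("DE")   # Asp, Glu counted as -1


def net_charge(seq: str) -> int:
    """Crude net charge of a peptide: (#K + #R) - (#D + #E).

    Histidine and the N-/C-termini are ignored in this simplification.
    """
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in seq.upper())


# Tag sequences as given in the text and tables of this review.
tags = {
    "SB7": "RQSSRGR",
    "C5K": "KKKKK",
    "Glu6": "EEEEEE",
    "HN": "HNHNHNHNHNHN",
}

charges = {name: net_charge(seq) for name, seq in tags.items()}
for name, charge in charges.items():
    print(f"{name}: {charge:+d}")
```

With this rule, the silica-binding SB7 tag carries a net positive charge owing to its three arginine residues, whereas the metal-binding HN tag is formally neutral, in line with the different binding principles of the two systems.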
3. Utilization of interaction of protein tags with various ligands
Affinity tags consisting of 51 or more amino acids are generally defined as protein-based. One of the most frequently used protein-based affinity tags is glutathione S-transferase, which has a molecular weight of 26 000 and an affinity for glutathione. This tag system enables a high yield of the protein of interest with high purity because of the enzyme (glutathione S-transferase)–substrate (glutathione) binding and the acceleration of expression in the soluble fraction.141 As with peptide-based tags, various protein-based tags have been developed over the past few years, and some of their purification strategies are based on magnetic separation.130,134 Jiang et al. developed a magnetic affinity tag, clMagR, with a molecular weight of 14 600.134 clMagR was fused to recombinant proteins (GFP, lipase, α-L-arabinofuranosidase, or pullulanase) through a thrombin cleavage site, and the fusion proteins were specifically adsorbed onto Fe3O4–SiO2 nanoparticles from E. coli lysate. The thrombin cleavage site enabled the separation of the POI from the nanoparticles for further use. A comparison with a purification method using a polyhistidine tag showed that the purity was much higher, and the negative impact on the protein function was lower when using the clMagR purification system. Similarly, Gennari et al. developed a new magnetic-based affinity tag system in which a cellulose-binding domain (CBD) and magnetic cellulose supports were used for recombinant protein purification.130 The authors constructed a galactosidase fused to CBD via the factor X cleavage sequence as a model protein for sequential evaluation and showed high immobilization efficiency (approximately 90%) on magnetic cellulose. The interactions between CBD and cellulose are generally dominated by proline and hydroxyproline on CBD, followed by the formation of hydrogen bonds or networks of cellulose hydrogen bonds. 
Although these systems enable the separation of magnet-binding tag-fused proteins with high purity, large tags, especially non-human-derived tags, can affect the function of the POI or pharmacokinetics. Although protease digestion sequences are introduced between the tag and POI to avoid this problem, the increase in purification steps in biopharmaceutical manufacturing negatively affects the final yield or product quality.
In summary, various tag systems for the purification of recombinant proteins have been developed in the last few years, and some have superior characteristics to those of classic tag systems, such as hemagglutinin, FLAG, Myc, polyhistidine, and glutathione S-transferase. Because affinity tags can affect protein expression, stability, pharmacokinetics, and heterogeneous protein expression by tag degradation, a proper tag system should be selected based on the characteristics of the POI or recombinant host cells to avoid these problems. In addition, considering the manufacturing processes and facilities of large-scale plants, small human-derived peptide tags that do not require additional tag digestion processes may be preferred if affinity tag systems are used.
B. Engineering of antibody fragments for efficient purification
Recently, researchers have attempted to purify antibody fragments using antibody-binding proteins (also called alphabet proteins) by engineering the antibody fragments. The most frequently used antibody-binding proteins are Streptococcus protein G, S. aureus protein A, and P. magnus protein L. Protein G binds to the Fc region and weakly binds to the CH1 region,142–144 whereas protein A can bind to the Fc region and the VH3 subfamily of the VH region.20,145 Although protein L does not bind to the Fc region, it interacts with the variable region of the kappa light chain.146 Based on these binding behaviors, proteins A and L have been used as ligands for antibody fragment purification without affinity tags.
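The binding rules summarized above can be written as a small lookup to check which ligand is a candidate for a given fragment. This is a simplified sketch of the rules as stated in this section only; the `Fragment` fields and the `candidate_ligands` function are hypothetical, and real binding also depends on the subfamily and sequence details discussed in the following subsections.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    """Minimal description of an antibody or antibody fragment (hypothetical)."""
    name: str
    has_fc: bool = False
    has_ch1: bool = False
    vh3_family: bool = False          # VH domain belongs to the VH3 subfamily
    kappa_light_chain: bool = False   # light-chain variable domain is kappa


def candidate_ligands(fragment: Fragment) -> List[str]:
    """Apply the binding rules stated in the text to list candidate ligands."""
    ligands = []
    if fragment.has_fc or fragment.vh3_family:
        ligands.append("protein A")
    if fragment.has_fc:
        ligands.append("protein G")
    elif fragment.has_ch1:
        ligands.append("protein G (weak, via CH1)")
    if fragment.kappa_light_chain:
        ligands.append("protein L")
    return ligands


examples = [
    Fragment("full-size IgG (kappa)", has_fc=True, has_ch1=True,
             kappa_light_chain=True),
    Fragment("scFv (kappa VL, non-VH3)", kappa_light_chain=True),
    Fragment("VHH (non-VH3)"),
]
for fragment in examples:
    print(fragment.name, "->", candidate_ligands(fragment) or "tag or engineering needed")
```

A fragment for which the list is empty is the case addressed in the remainder of this section, where binding ability is introduced by protein engineering instead of an affinity tag.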
1. Protein engineering for protein L-binding
Graille et al. analyzed the three-dimensional structure of a complex of protein L and an antibody light chain and revealed that, unexpectedly, there were two different interfaces between the proteins. As shown in Figs. 4(a) and 4(b), Thr5, Pro8, Ser9, Ser10, Leu11, Ser12, Ala13, Arg18, Thr20, Thr22, Arg24, Lys107, and Glu143 in the light chain were related to the first interaction interface of protein L, whereas Ser7, Pro8, Ser10, Leu11, Ser12, Ala13, Asp17, Arg18, Thr20, Thr22, Arg24, Ser65, Thr72, Thr74, and Lys107 in the light chain were related to the second interaction interface of protein L (residues were numbered according to the Kabat numbering method).146,147 Many of these residues are located in framework region 1 (FR1), which generally does not affect antigen binding. Based on the binding characteristics between protein L and the antibody fragment, Muzard et al. suggested that grafting a protein L-binding FR1 would enable the conversion of protein L-non-binding antibody fragments into protein L-binding ones [Fig. 4(c)].148 In that study, murine antibody 9C2 (AJ235956, IGKV12-46*01) was used as a protein L-binding scFv, and FR1 of 9C2 was grafted onto protein L-non-binding scFvs (3H1: anti-Aah I and 4C1: anti-Aah II) or diabody (PA04: anti-anthrax toxin). After FR1 grafting, protein L binding was confirmed using sodium dodecyl sulfate polyacrylamide gel electrophoresis or immunoblotting, and the affinity was in the nanomolar range. As the KD between 9C2 scFv and protein L is 2.48 × 10−9 M, this FR1 grafting methodology to impart protein L-binding ability is valid for purification without affinity tags. As an alternative to FR1 grafting, Lakhrif et al. achieved protein L-binding ability in antibody fragments via the amino acid mutagenesis of certain residues [Fig. 4(d)].149 The candidate sites for mutagenesis were determined by sequence alignment. 
In some mutagenesis studies, Pro8 and Arg24, which are related to two different protein L interfaces, were found to be particularly important for protein L-binding. The mutagenesis of proline at the 8th residue forms interactions with Tyr34 and Tyr32 in the protein L domain. The presence of arginine at the 24th residue enhanced the interaction with protein L by improving the dissociation rate (kd). After mutagenesis of protein L-non-binding scFv, protein L-binding ability was acquired with high affinity (KD = 7.2 × 10−11 M). In addition, Paloni et al. demonstrated the importance of Pro8 and variations in the binding free energy using alanine scanning.150 The co-crystal structure of the protein L mutant and Fab (PDB: 1MHH) reported in another study by Graille et al. was selected for alanine scanning.151 Only 50% of the residues were conserved in the light chain of this protein L-binding interface, as compared to that on the first interface with protein L. An increase in ΔGbind between 2 and 4 kcal/mol was observed for Ser7, Pro8, Leu11, Val13, and Glu17. This result is consistent with the discussion of Graille et al. that the 5th–12th residues are important for interaction with protein L and that binding is mainly dependent on main-chain interactions. In conclusion, FR1 grafting or mutagenesis is an attractive strategy for imparting protein L-binding ability to antibody fragments for purification through tag-free protein L affinity chromatography. Here, it is concerning that implementation of these strategies to parental antibody fragments could relate to structural distortion and loss of function, similar to the case of CDR grafting for enhancement of solubility expression described above. Thus, researchers should carefully apply these strategies to improve protein L-binding ability.
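To give a sense of scale for the affinities and free-energy changes quoted above, the following minimal Python sketch converts a KD into a standard binding free energy and a change in ΔGbind into an approximate fold-change in KD. It uses the textbook relation ΔG = RT ln KD with a 1 M standard state. The temperature of 298.15 K and the function names are assumptions made for illustration and are not taken from the cited studies.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_g_from_kd(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from KD in mol/L (1 M standard state)."""
    return R_KCAL * temp_k * math.log(kd_molar)


def kd_fold_change(ddg_kcal, temp_k=298.15):
    """Fold increase in KD implied by a loss of ddg_kcal (kcal/mol) in binding free energy."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


# KD values quoted in the text; the temperature is an assumption (see above).
for label, kd in (("9C2 scFv / protein L", 2.48e-9),
                  ("mutated scFv / protein L", 7.2e-11)):
    print(f"{label}: dG = {delta_g_from_kd(kd):.1f} kcal/mol")

# Range of alanine-scanning shifts quoted in the text.
for ddg in (2.0, 4.0):
    print(f"ddG = {ddg} kcal/mol -> KD weaker by about {kd_fold_change(ddg):.0f}-fold")
```

Under these assumptions, a 2 kcal/mol loss corresponds to a KD roughly 30-fold weaker and a 4 kcal/mol loss to one several hundred-fold weaker, which illustrates why single framework residues such as Pro8 can decide whether a fragment is retained on a protein L column.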
Protein engineering methods to impart protein L-binding ability to antibody fragments. (a) Interaction between protein L and the antibody light chain. Residues involved in the interaction on the antibody light chain and on protein L are colored yellow and cyan, respectively. (b) Amino acid sequence of the antibody light-chain domain in the co-crystal structure with protein L. Residues involved only in the first and second protein L interfaces are colored green and light blue, respectively. The purple residues are involved in both interactions. (c) and (d) Strategies to provide protein L-binding ability on variable light domains [(c) FR1 grafting and (d) mutagenesis].
2. Protein engineering for protein A-binding
Protein A is a well-known purification ligand for full-size antibodies due to its high affinity for the Fc region.145 This protein can also interact with VH domains classified in the VH3 subfamily, and 13 residues of VH interact with protein A: Gly15, Ser17, Arg19, Lys57, Tyr58, Tyr59, Gly65, Arg66, Thr68, Ser70, Gln81, Asn82a, and Ser82b. Utilizing this characteristic, some studies have been conducted to provide protein A-binding ability to antibody fragments. Henry et al. have shown that the interaction between VHH and protein A is dominated by the same set of residues as those in the co-crystal structure of VH3 and protein A.152 Individual substitutions based on the VH3 consensus sequence enabled model VHHs to bind to protein A with adequate affinity (KD = 0.6–9.0 × 10−6 M), while the expression yield and target antigen-binding ability were maintained compared to each parental VHH. Similarly, Crauwels et al. and Fridy et al. performed mutagenesis studies on VHH and showed that Arg19, Tyr58, Tyr59, Asn82a, and Ser82b are essential for protein A-binding.153,154 However, no studies have imparted protein A-binding ability to VH, rather than VHH, by mimicking the amino acid sequence of the VH3 subfamily. Thus, we previously verified the possibility of imparting protein A-binding ability to a VH that did not originally bind to protein A. The selected model antibody fragment was a humanized version of the anti-CD3 antibody derived from the Orthoclone OKT3 (muromonab-CD3), which belongs to VH3 but does not have protein A-binding ability. Mutagenesis based on sequence alignment and structural analysis altered OKT3 scFv and bispecific scDb, including the OKT3 component, to bind protein A with adequate affinity, as confirmed using protein A-immobilized affinity chromatography and plate assay (KD of diabody against protein A: 1.4 × 10−6 M) (under submission). These antibody-binding proteins, especially protein A, are the gold-standard ligands for antibody purification. 
The knowledge that FR grafting or mutagenesis enables antibody fragments to bind to each ligand will advance the research on antibody fragments in the pharmaceutical industry.
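As a rough illustration of how the 13 protein A contact positions listed above could be used to shortlist mutagenesis candidates, the following Python sketch compares a Kabat-numbered VH against that list. The contact table is transcribed from the text; the function name, the dictionary-based input format, and the example VH are hypothetical, and Kabat numbering of the input is assumed to have been done beforehand with a separate tool.

```python
# Kabat positions and residues (one-letter codes) listed in the text as
# contacting protein A in VH3-family domains.
PROTEIN_A_CONTACTS = {
    "15": "G", "17": "S", "19": "R", "57": "K", "58": "Y", "59": "Y",
    "65": "G", "66": "R", "68": "T", "70": "S", "81": "Q",
    "82a": "N", "82b": "S",
}


def protein_a_mismatches(vh_by_kabat):
    """Return {position: (found, expected)} for contact positions that differ.

    vh_by_kabat maps Kabat position labels (strings such as "82a") to
    one-letter residues. A position missing from the input is reported as None.
    """
    mismatches = {}
    for pos, expected in PROTEIN_A_CONTACTS.items():
        found = vh_by_kabat.get(pos)
        if found != expected:
            mismatches[pos] = (found, expected)
    return mismatches


# Hypothetical VH that differs from the listed residues at positions 19 and 82a.
example_vh = dict(PROTEIN_A_CONTACTS)
example_vh.update({"19": "K", "82a": "S"})
print(protein_a_mismatches(example_vh))
```

A list of mismatches produced this way is only a starting point. As noted above, candidate substitutions still need to be checked against the three-dimensional structure and verified experimentally for expression yield and antigen binding.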
V. CONCLUSIONS
In this review, we summarize and discuss protein engineering strategies for generating antibody fragments for therapeutic production applications. In recent years, various antibody fragment formats have been reported, ranging from simple structures represented by scFv or single-domain antibody fragments (VHH or VNAR) to complicated multivalent antibody fragments (Fig. 1). These antibody fragments have excellent characteristics, such as high penetration ability, cost-effectiveness, and a mode of action distinctive from those of full-size antibodies. The development of effective production methods has been investigated from multiple perspectives. From the perspective of protein engineering, various methodologies have been advocated for cultivation (solubility tags, CDR grafting, and domain arrangement) and purification (affinity tags and mutagenesis studies) processes.
To introduce a solubility tag for the high-level soluble expression of POI, various fusion tags ranging from small to large molecular weights have been developed (Table I). Most of the recently developed solubility tags are intended for E. coli expression systems and have drastically improved the soluble expression of POI. The E. coli expression system has received attention because it enables cost-effective antibody fragment expression, and the medium it requires is less expensive than that used in mammalian cell expression systems. In addition, the ease of genetic manipulation for basic research or scale-up favors the use of the E. coli expression system for antibody fragment production.155 These characteristics could help address the problem of excessively long drug development cycles. However, antibody fragment drugs produced in E. coli are not necessarily inexpensive, because drug prices are mainly determined by the mode of action. Researchers should also note that important regulatory quality issues remain to be resolved for drugs produced in E. coli, such as the complete removal of endotoxin.156 The development of solubility tag systems has accelerated, and the resulting variety of tags could help researchers choose an appropriate tag for the efficient expression of their POI. However, many of these tags are derived from nonhuman sources, and researchers must carefully construct POIs in view of their impact on immunogenicity or the additional digestion steps needed for future therapeutic use. CDR grafting is another useful strategy for enhancing the soluble expression of POI (Fig. 2). CDR grafting onto framework regions is expected to enhance POI expression levels with low immunogenicity if the framework regions are derived from human sequences. 
In some cases, soluble expression was not achieved by CDR grafting alone, and three-dimensional structural analysis or solubility prediction was conducted as a pre-investigation. Rearranging the domain order is also an immunogenicity-friendly protein engineering strategy for improving protein expression in microbial expression systems (Fig. 3). The most appropriate arrangement differs according to the antibody clone used, and it is recommended to evaluate all possible arrangements of the POI.
Affinity tags of various sizes have been developed for the purification process (Table II), similar to those used for the cultivation process. Recently, various tag systems differing in size, mechanism, and origin have been developed and reported. Purification tags using antibodies as ligands can generally separate the target POI with high purity due to the strong affinity between the antibodies and the epitope tag; however, the purification cost increases in this case. Other peptide-based tags have also been developed; some use metal ions or silica gels as ligands, whereas others are based on separation factors, such as salt gradient, reaction temperature, or light. Moreover, protein-based affinity tags and purification tag systems based on magnetic force have been developed for the efficient purification of POI. Although these systems offer high POI separation from impurities, tag digestion must be considered to decrease immunogenicity. Researchers should also keep actual pharmaceutical use in mind because an increase in the number of process steps leads to a decrease in the final POI yield.
Imparting protein L- or protein A-binding ability by FR grafting or amino acid mutagenesis is also being actively pursued for the efficient purification of antibody fragments (Fig. 4). The interaction sites for protein L and protein A lie mainly in FR1 of the kappa chain of VL and in the VH3 subfamily of VH, respectively, and grafting of these FRs might confer binding ability against each alphabet protein on any antibody fragment. When sequence grafting does not give the intended result, a more detailed investigation, such as three-dimensional structure prediction, might be required. Recently, some studies have identified the core residues for alphabet protein binding, and this knowledge might make it easier to impart binding ability against alphabet proteins. Because either VL or VH is an essential component of antibody fragments, these strategies can become versatile for the future design of antibody fragments without any negative impact on human administration.
Finally, we describe a design strategy for future antibody fragments. Strategies for improving the pharmaceutical production of antibody fragments are classified according to the presence or absence of tags in both the cultivation and purification processes. Because the presence of an additional tag might negatively impact antibody folding, function, and immunogenicity (in the case where tag digestion is not conducted) or decrease the final yield (in the case where tag digestion is conducted), it is desirable to refrain from using additional tags, regardless of whether they are peptide-based or protein-based. First, to simultaneously attain highly soluble expression and efficient purification, using a kappa-chain VL or a VH classified into the VH3 subfamily with high secretory expression ability is recommended. When there are problems with the soluble expression level or purification function, researchers should examine the three-dimensional structure to optimize the sequences. If production does not improve as intended, using additional tag systems should be considered, with careful attention to the points described above. Actual development will require a detailed investigation of each molecule; however, this review provides clues for the design of new therapeutic antibody fragments, thus potentially accelerating future development.
ACKNOWLEDGMENTS
This work was supported by Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science (JSPS) (Grant Nos. 21K18321, 22H02915, and 23H01770).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
Atsushi Kuwahara: Writing – original draft (equal). Kazunori Ikebukuro: Writing – review & editing (supporting). Ryutaro Asano: Supervision (lead); Writing – review & editing (lead).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.