Size-based Degradation of Therapeutic Proteins - Mechanisms, Modelling and Control.

Protein therapeutics are in great demand owing to their effectiveness against hard-to-treat diseases. Despite this demand, these bio-therapeutics are highly susceptible to degradation via aggregation, fragmentation, oxidation, and reduction, all of which can affect the quality and efficacy of the product. Understanding and modelling the size-based degradation pathways (aggregation and fragmentation) is critical for gaining deeper insight into the stability of these products. This review summarises the major developments towards unravelling the mechanisms of size-based protein degradation (particularly aggregation and fragmentation), modelling of these pathways, and their control. Major caveats that remain in our understanding and control of size-based protein degradation are also presented and discussed.


Introduction
Bio-therapeutics have become the fastest growing segment of the pharmaceutical industry, owing to their ability to treat debilitating and life-threatening diseases such as cancers, autoimmune disorders, and metabolic diseases [1]. The demand for and growth of this product class have been significantly higher than those of traditional small molecule pharmaceuticals [2]. However, these products are extremely complex and vulnerable to various degradation pathways, which threaten the safety and efficacy of the biotherapeutic [3]. As a result, controlling degradation is a major task for the biopharmaceutical industry, which spends significant resources on production, transport and storage of these products so as to contain, if not completely avoid, degradation [4,5].
It is accepted that it is not economically feasible and, in some cases, not even technically feasible to make a product that is completely devoid of impurities (both intrinsic and extrinsic) [4]. The manufacturing process for these bio-therapeutic drug products is complex, with multiple unit operations used for production, purification, characterization, storage, filling and shipping [6]. At various stages of production, the protein therapeutic may be exposed to different types of stress, such as chemical stress (pH), thermal stress (high temperature, low temperature, freeze thaw), mechanical stress, and stress arising from the interaction of several of these [7]. The structure and configuration of a protein determine its vulnerability to degradation pathways such as oxidation, deamidation, and hydrolysis [8]. Degradation is influenced by a myriad of factors, which can be categorised as intrinsic and extrinsic. Intrinsic factors represent properties of the protein under consideration. Extrinsic factors represent the environment of the protein (pH, temperature, buffer solutions, and changes in excipients) [9]. Hence, typical strategies for reducing protein degradation aim either to change the protein's structure (protein engineering) to make it resistant to aggregation or to control the environmental conditions [3].
To monitor size-based protein degradation, numerous analytical characterization techniques have been used. Commonly used quantitative tools include size exclusion-high performance liquid chromatography (SE-HPLC), dynamic light scattering (DLS), nanoparticle tracking analysis (NTA), micro flow imaging (MFI), analytical ultracentrifugation (AUC), Coulter counter, multi-angle laser light scattering (MALLS), and turbidimetry [10,11]. Commonly used qualitative tools that provide structural information about the protein include circular dichroism, fluorescence spectroscopy, infrared (IR) spectroscopy, Raman spectroscopy, ultraviolet (UV) visible spectroscopy, light and electron microscopy, fluorescence microscopy, and atomic force microscopy (AFM). Each of these analytical tools works well in a defined size range, and hence a combination of tools is required if one intends to monitor size-based degradation from small fragments and dimers to sub-visible particles and visible precipitates [12,13].
This review summarises the major developments towards unravelling the mechanisms of size-based protein degradation (particularly aggregation and fragmentation), modelling of these degradation pathways, and their control. Major caveats that remain in our understanding and control of size-based protein degradation are also presented and discussed.

Factors affecting size-based protein degradation
One of the major challenges in rationalizing size-based protein degradation is the variety of possible scenarios, which depend on the specific combination of protein, environmental conditions and applied stress [14]. Figure 1 illustrates the various intrinsic and extrinsic factors that affect size-based protein degradation, and Table 1 summarises them.
Intrinsic factors include protein structure and sequence. One of the key contributors to protein aggregation is the number of hydrophobic residues present in a protein sequence, which in turn affects the formation of aggregation-prone regions [15][16][17]. In IgG-based mAb molecules, along with the primary sequence, the presence of glycosylation and the types of glycans in the CH2 domain also affect the stability of the complete immunoglobulin (Table 1) [15][16][17]. Changes in the CH2 domain of mAb molecules affect the tertiary structure of the protein, thereby affecting both the Fc and Fab regions of the IgG. The secondary structure of a protein also plays a key role in protein stability. Proteins containing a high percentage of alpha helices are more stable than proteins primarily containing beta sheets (Table 1) [9]. Amino acids or blocks in a protein that are susceptible to aggregation can affect the overall aggregation propensity if they are exposed at the protein surface [13]. These aggregation-prone blocks are generally up to seven amino acids long, and 14 such blocks or motifs have been discovered in IgGs [18].
The effect of sequence on protein stability is becoming more important with the advent of biosimilars. In the case of mAbs, although an innovator and its biosimilars may differ in terms of post-translational modifications (PTMs), manufacturing methods, and product profiles, similar behaviour under forced degradation has been reported for the infliximab innovator product (Remicade®) and its biosimilar (Remsima®) [19]. The authors concluded that this similarity in degradation pattern is due to the common primary sequence. Other researchers have observed that mAbs belonging to different IgG subclasses, although having identical variable region sequences, can still differ in their aggregation susceptibility at low pH [20]. mAbs of the IgG1 subclass typically exhibit better stability than those of the IgG2 and IgG4 subclasses, likely because the enhanced flexibility of the hinge region in IgG1 mAbs shields the hydrophobic residues in the hinge region [20]. In the IgG2 and IgG4 subclasses, unfolding of the Fc region at low pH exposes hydrophobic patches to the buffer environment, thereby inducing aggregation. However, if the mAb is glycosylated at these positions, its aggregation propensity is significantly lowered [21].
Post-translational modifications (PTMs) are known to significantly impact the overall stability of a biotherapeutic [22]. PTMs generally determine the final structure and function of a bio-therapeutic. They are highly cell and species specific, and they pose a major challenge for the biopharmaceutical industry during production and characterization [23]. PTMs such as glycosylation help maintain the structural integrity and activity of protein therapeutics, as has been demonstrated by structural analysis techniques such as NMR and X-ray crystallography (Table 1) [22]. Glycans can affect FcγR binding and complement fixation for mAb-based therapeutics.
Aggregates generated due to variations in PTMs can sometimes trigger an immune response and the generation of anti-drug antibodies (ADAs), resulting in enhanced immunogenicity that is difficult to control in the patient [23]. mAbs have a natural capacity to form immune complexes (ICs) that may help them bind to their target and enhance the response. mAbs that aggregate due to modifications in PTMs can have compromised function, and their ability to recognise their specific targets may be affected (Table 1) [23].
In addition to intrinsic factors, aggregation can be induced by a variety of extrinsic factors which can affect both chemical and physical stability [10,24] (Figure 1 and Table 1). Chemical degradation includes oxidation, photoirradiation, deamidation, disulphide reduction, and glycation [25,26]. Oxidation from exposure to hydrogen peroxide and metal ions can affect surface hydrophobicity and protein-protein interactions, and decrease the conformational stability of proteins [27][28][29]. Glycation affects protein aggregation by affecting surface charge of the protein [30][31][32].
Thermal exposure is well known to affect protein stability, leading to changes in protein conformation which in turn can modify colloidal interactions (Table 1) [32]. This complex effect often results in non-Arrhenius kinetics. As a consequence, it is challenging to estimate the shelf-life of biologics at low temperature from accelerated studies at higher temperatures [32]. For instance, a recent study showed that aggregation rates measured at 40 °C did not correlate with those at 5 °C, whereas the correlation was better between 30 °C and 5 °C [33]. Moreover, for the same monomer loss, different types of aggregates were formed at different temperatures, indicating changes in the aggregation mechanism.
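To make the extrapolation problem concrete, the following minimal Python sketch fits an Arrhenius line through two accelerated-temperature rate constants and extrapolates it to 5 °C. The rate values and the function names (arrhenius_fit, arrhenius_rate) are hypothetical and are not taken from the cited studies; the sketch only shows what the Arrhenius assumption implies. If the aggregation mechanism changes with temperature, as described above, the measured low-temperature rate can deviate substantially from such an extrapolation.

```python
import math

R = 8.314  # gas constant, J/(mol*K)

def arrhenius_fit(t1_c, k1, t2_c, k2):
    """Return (Ea in J/mol, ln A) from two (temperature in deg C, rate) pairs."""
    inv_t1 = 1.0 / (t1_c + 273.15)
    inv_t2 = 1.0 / (t2_c + 273.15)
    # ln k = ln A - Ea / (R T), so the slope of ln k versus 1/T equals -Ea/R
    slope = (math.log(k2) - math.log(k1)) / (inv_t2 - inv_t1)
    ea = -slope * R
    ln_a = math.log(k1) + ea / (R * (t1_c + 273.15))
    return ea, ln_a

def arrhenius_rate(t_c, ea, ln_a):
    """Rate predicted at t_c (deg C) if the kinetics were strictly Arrhenius."""
    return math.exp(ln_a - ea / (R * (t_c + 273.15)))

# Hypothetical monomer-loss rates (% per month), for illustration only
ea, ln_a = arrhenius_fit(30.0, 0.50, 40.0, 2.00)
k5_predicted = arrhenius_rate(5.0, ea, ln_a)
print(f"Ea = {ea / 1000:.1f} kJ/mol, extrapolated rate at 5 C = {k5_predicted:.4f} %/month")
```

Comparing k5_predicted with a rate actually measured at 5 °C is one simple way to check whether accelerated data are predictive for a given molecule and formulation.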
In addition to high temperature, exposure to ultralow temperatures can also unfold proteins via so-called cold denaturation (Table 1) [34][35][36], which is important in the context of freeze thaw. The tertiary structure of a protein is greatly affected by freeze thawing, depending on the pH of the solution and the additives present. These changes, which can be monitored by fluorescence spectroscopy, may be reversible or irreversible [37] and can lead to aggregation of protein molecules through alterations in their tertiary structure.
Formulation and buffer conditions such as pH, ionic strength, protein concentration, and buffer type strongly affect protein stability in liquid formulations [38][39][40][41][42]. Exposure to extremely low pH [43][44][45] and changes in ionic strength [46][47][48] can significantly affect protein stability, often in a non-monotonic and complex way [10,43,49] (Table 1, Figure 1). For instance, the pH changes experienced during downstream processing (DSP) of protein therapeutics, as in protein A chromatography, viral inactivation, or neutralization, mostly lead to an increase in aggregation [4,6,50]. The level of aggregation also depends on the hold time of the protein in the various DSP operations [41]. Researchers have established and characterized an assay to monitor aggregation due to changes in pH, in which the protein is mostly held at pH 2 and then neutralized to pH 6.5 [42]. At low pH the protein unfolds, but as the pH is raised refolding occurs, which is attributed to intermolecular attraction. If this intermolecular attraction increases further beyond refolding, it may result in protein aggregation (Table 1) [42].
Protein concentration influences the stability of protein therapeutics by affecting both aggregation and viscosity through increased protein-protein self-interaction. This is a growing concern for formulation scientists developing these bio-therapeutics for subcutaneous administration [50]. Protein aggregation is a complex pathway consisting of multiple microscopic reactions of nucleation and growth. The effect of protein concentration on the rate of these individual steps is described by their corresponding reaction orders, which depend on the type of stress and the buffer conditions. Therefore, changes in protein concentration can have different effects on nucleation and growth reactions depending on the specific environmental conditions. This, in turn, has consequences for the size distribution of the aggregates formed. For instance, one study showed that an increase in protein concentration led to an increase in the concentration of smaller particles and a decrease in the concentration of larger particles when the protein sample was subjected to freeze thaw stress under varying protein concentrations and buffer formulations [51]. In contrast, another study showed that the concentration of larger particles increased relative to smaller particles with increasing protein concentration when the therapeutic was subjected to heat stress at 70 °C [52].

Table 1. Factors affecting size-based protein degradation.

| Factor | Type | Mechanism of size-based degradation | Reference(s) |
|---|---|---|---|
| Protein sequence | Intrinsic | The greater the number of hydrophobic residues in a protein, the more susceptible it is to aggregation. | [15][16][17] |
| Protein structure | Intrinsic | Proteins with a high proportion of beta sheets are more prone to aggregation than alpha helix rich proteins. | [9] |
| Post-translational modifications | Extrinsic | PTMs such as glycosylation affect the structure of proteins and hence their susceptibility to aggregation. | [22,23] |
| pH | Extrinsic | Low pH causes unfolding of the protein, exposing hydrophobic residues and increasing the chances of aggregation. | [4,6,50] |
| Temperature | Extrinsic | High temperature causes the protein to unfold and increases random Brownian motion, leading to more collisions between molecules and hence enhanced aggregation. Low temperature causes protein unfolding via cold denaturation. | |
| Protein concentration | Extrinsic | A high concentration of protein therapeutic can cause crowding and protein-protein self-interaction in solution, leading to enhanced aggregation. | [51] |
| Interfacial shear stress | Extrinsic | At interfaces, proteins experience stresses such as hydrophobic forces, charge and mechanical stress, leading to the formation of visible and sub-visible particles in biologics. | [53][54][55][56] |

Apart from the buffer conditions, surfaces and interfaces are critical in affecting the pace of degradation of protein therapeutics. One surface-related phenomenon is shear stress (Table 1) [53]. Shear stress at an interface is reported to cause more aggregation than shear stress in the bulk. Interfacial stresses are generally encountered during drug manufacturing, development and administration to patients. Interactions at interfaces may be solid-liquid, liquid-liquid or vapor-liquid, and they can lead to the formation of soluble aggregates as well as sub-visible and visible particles [53]. Surface adsorption of the drug product can also change the concentration of the therapeutic, and interfacial effects can change the shape and structure of the molecules, with conformational changes arising from mechanical or hydrophobic stresses. Major surface interactions occur during drug product manufacturing and storage, for example during filtration, contact with container surfaces, mixing and mechanical processes, and freeze-thawing during excipient addition. During storage, the main aggregation-prone exposures include vial filling operations.
Air-liquid interfacial stress can be introduced by the headspace of the vial and by agitation during transportation. Interfacial stress is also possible during drug product administration through exposure to surfaces such as IV bags and infusion sets (Table 1) [53]. Shear stress could also make proteins susceptible to aggregation, and it is often encountered with high concentration mAb formulations. One recent study examined the effect of shear (20,000 s⁻¹ to 250,000 s⁻¹) on mAb formulations with concentrations greater than 100 mg/ml [54]. In this study, Bee et al. proposed that shear is not the only cause of aggregation during production processes: adsorption to solid surfaces, particulate contamination, air-bubble entrainment and pump cavitation also play a significant role in promoting aggregation of mAb therapeutics. Similar behaviour has been reported for protein therapeutics other than mAbs, such as human serum albumin (HSA), by Duerkop et al. This group checked whether aggregates form because of hydroxyl radicals generated by cavitation bubble collapse, as reported in some earlier literature. The study concluded that aggregation occurs because of the increase in surface area caused by cavitation bubble growth, and not because of hydroxyl radical release or shear stress [55].
Another study, by Arosio et al., examined the role of hydrodynamic flow in protein aggregation [56]. The authors studied the effect of interfacial stress using an experimental design in which flow-induced and interface-induced aggregation of two IgGs was investigated. Shear rates of 10 s⁻¹ to 1000 s⁻¹ were applied, but negligible change in % aggregation was observed. In contrast, considerable monomer loss was observed when the material of the syringe barrels was changed. This led them to conclude that the solid-liquid interface plays a significant role in promoting aggregation during flow operations, and they suggested a synergistic effect of hydrodynamic flow and interfaces in causing protein aggregation [56].

Mechanisms of protein's size-based degradation
Two of the most common degradation pathways of protein therapeutics are aggregation and fragmentation.

Mechanism of protein aggregation
Proteins can form aggregates through a variety of microscopic mechanisms, which include multiple events of nucleation and growth. Nucleation is often promoted by conformational changes or chemical modifications. Heterogeneous nucleation triggered by interactions with surfaces is also commonly observed. All these steps can be reversible or irreversible, although non-native protein aggregation often leads to irreversible aggregates and thus differs from reversible protein phase transitions such as liquid-liquid phase separation and protein crystallization. The main steps are as follows (Figure 2):
a) Nucleation - Changes in the conformation of protein molecules can result in partial exposure of hydrophobic patches or residues [57], which can lead to the formation of oligomers or nuclei, the first step towards aggregation [17,57] (Figure 3). These oligomers are typically dimeric and non-native [58][59][60]. Understanding and predicting nucleation rates remains one of the most challenging aspects of protein aggregation. Nucleation is mediated by a variety of interactions, such as electrostatic interactions, hydrophobic interactions, hydrogen bonds, and van der Waals forces. Net and average parameters describing protein-protein interactions often do not correlate with protein aggregation [61][62][63][64], since the interactions relevant for aggregation can be confined to specific aggregation-prone regions. These regions can be located in various parts of the protein molecule, such as the complementarity-determining region 1 (CDR1), the Fab and Fc regions, and the CH2 or CH3 domains [17,65] (Figure 3). The nucleation rate typically increases with protein concentration [62,63], unless unimolecular conformational changes are the rate-limiting step in the process.
b) Formation of higher order aggregates - Formation of oligomers is often followed by formation of larger insoluble and visible particulates [66]. Aggregates can grow in size either by addition of monomers via chain polymerization or by cluster-cluster interactions [59,60,67,68] (Figures 2 and 3). The relative impact of these steps is controlled by conformational and colloidal stability [43,60,69,70].
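The concentration dependence of nucleation described above can be summarised by an apparent reaction order, estimated as the slope of ln(rate) against ln(concentration). The short Python sketch below illustrates this with a least-squares fit. The data are synthetic (generated from a hypothetical second-order law) and the function name apparent_order is an assumption made for illustration, not a method taken from the cited studies.

```python
import math

def apparent_order(concs, rates):
    """Least-squares slope of ln(rate) versus ln(concentration)."""
    xs = [math.log(c) for c in concs]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic initial rates generated from rate = k * c**2 (hypothetical values)
concs = [1.0, 2.0, 5.0, 10.0]           # protein concentration, mg/mL
rates = [0.01 * c ** 2 for c in concs]  # arbitrary rate units
order = apparent_order(concs, rates)
print(f"apparent reaction order = {order:.2f}")
```

Under these assumptions, an apparent order close to 1 would be consistent with a unimolecular rate-limiting step such as unfolding, whereas an order close to 2 would point to a bimolecular encounter as the rate-limiting event.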

Mechanism of protein fragmentation
Fragmentation in bio-therapeutics, especially monoclonal antibodies, is another common degradation pathway [71]. Fragmentation refers to the breakage of the covalent peptide bond between amino acid residues of a protein molecule. Fragmentation, whether spontaneous or enzymatic, leads to changes in the primary sequence and consequently in higher order structure [72]. Peptide bonds are generally considered strong unless they are exposed to harsh environments such as extreme pH or temperature [71,73]. Another key factor affecting protein fragmentation is the flexibility of the side chains of the molecule. Fragmentation is typically more frequent in regions of the molecule that are flexible or exposed to the solvent [74] and less frequent in regions that are buried inside the molecule, even if the primary sequence of the buried region is prone to cleavage. Fragmentation can occur by β-elimination or by peptide bond hydrolysis, depending primarily on the pH of the protein solution [71]. The presence of certain amino acids (Asp, Gly, Ser, Thr, Cys or Asn) makes a region more susceptible to fragmentation [71]. Each of these amino acids aids cleavage of peptide bonds in mAbs, accelerating fragmentation by acting through its side chain [75]. These amino acids generally act over a wide range of pH (pH 0.3 to 10) and temperature (25 °C to 70 °C) [76]. Apart from side chain mediation, fragmentation can also occur through peptide bond hydrolysis [74]. Many computational and experimental studies have shown the significant role of hydrolysis in peptide bond cleavage. Metal ions also play a crucial role in initiating or accelerating protein fragmentation. Copper has been reported to accelerate fragmentation of IgG1 at the hinge region. This copper-mediated cleavage is pH and mAb specific and has been shown to slow down upon addition of chelating agents such as EDTA [77]. Different IgG molecules behave differently in the presence of cupric ions, depending on the affinity between the protein and the metal ions as well as on the IgG isotype, since each isotype varies in the sequence and conformation of its hinge region [77]. Apart from copper, other metal ions such as Mg(II), Mn(II), Zn(II), Fe(III) and Ni(II) can also affect protein fragmentation, although to a lesser degree [74,77].
Hydrolysis of the peptide bond in the hinge region of a mAb, one of the mechanisms of fragmentation, yields Fc-Fab and Fab fragments [71]. Fragmentation can be observed during protein production in cell culture, purification, storage, and even while the protein circulates in blood. Preventing hydrolytic fragmentation requires a better understanding of its mechanism, including its origin, its dependence on solvent conditions (pH, ionic strength) and temperature, and its kinetic behaviour [78]. mAbs differ significantly in their vulnerability to fragmentation. Conformational stability and local structural dynamics are also known to influence fragmentation [77]. Fragmentation is usually detected by size exclusion-high performance liquid chromatography (SE-HPLC) [75,76]. Other analytical techniques that can be used for characterizing fragmentation include mass spectrometry (MS), sodium dodecyl sulphate-polyacrylamide gel electrophoresis (SDS-PAGE), and DLS [79].

Modelling of protein's size-based degradation
An important step towards controlling size-based protein degradation is understanding its mechanism, and this is typically achieved via kinetic and thermodynamic modelling [80]. A powerful strategy involves acquiring time-resolved data of the degradation process and comparing them with simulations based on different theoretical models. This comparison leads to identification of the aggregation mechanism and quantification of the corresponding rate constants of nucleation and growth [80].
Numerous models for aggregation have been proposed in the literature [81][82][83] in different contexts. Indeed, protein aggregation is observed in a variety of biological and biotechnological systems. In some cases, aggregation inside biological systems is undesirable and leads to pathological species associated with a variety of neurodegenerative diseases [10], while in other cases protein self-assembly is functional, as in the case of actin [10]. In the biotechnological context, protein aggregation is undesirable, leading to product loss and to the formation of potentially immunogenic species [10,84]. Although aggregates vary significantly in type and heterogeneity, the Lumry-Eyring model is the most commonly used model for predicting the kinetics of protein aggregation [85]. According to this model, the monomer first converts into an unfolded or misfolded species and then forms small oligomers. Small oligomers may undergo conformational changes to form nucleus species, which are highly reactive and contribute to the formation of larger oligomers. These large oligomers can interact with each other to form fibrils, amorphous aggregates, or protofilaments. Throughout this aggregation process, numerous intermediate species with ring or annular structures are formed, as has been confirmed using atomic force microscopy (AFM) and electron microscopy (EM) [86][87][88]. The Lumry-Eyring model, in combination with population balance equations, has been widely applied to describe the aggregation of a variety of biologics under several stresses.
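As an illustration of the first two steps of a Lumry-Eyring-type scheme, the Python sketch below integrates a deliberately reduced model in which a native monomer N reversibly unfolds to U, and two unfolded monomers combine irreversibly into a dimeric aggregate D. The reduction to three species, the rate constants, and the function name simulate_lumry_eyring are assumptions made for illustration only; published models include further oligomerisation and growth steps.

```python
def simulate_lumry_eyring(n0, k_u, k_f, k_agg, t_end, dt=1e-3):
    """Explicit Euler integration of N <-> U and U + U -> D.

    n0    : initial native monomer concentration (arbitrary units)
    k_u   : unfolding rate constant, N -> U
    k_f   : refolding rate constant, U -> N
    k_agg : rate constant for irreversible dimer formation, 2U -> D
    Returns the concentrations (N, U, D) at time t_end.
    """
    n, u, d = n0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        net_unfolding = k_u * n - k_f * u   # net flux from N to U
        dimerisation = k_agg * u * u        # dimers formed per unit time
        n += dt * (-net_unfolding)
        u += dt * (net_unfolding - 2.0 * dimerisation)
        d += dt * dimerisation
    return n, u, d

# Hypothetical, dimensionless parameters chosen only to give a visible effect
n, u, d = simulate_lumry_eyring(n0=1.0, k_u=0.1, k_f=1.0, k_agg=5.0, t_end=50.0)
print(f"native = {n:.3f}, unfolded = {u:.3f}, monomer units in dimer = {2 * d:.3f}")
```

In this sketch the total monomer content N + U + 2D is conserved at every step, which offers a quick sanity check when the scheme is extended with further species.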

Recent developments in the domain of protein's size-based degradation mechanisms and modelling
A host of analytical techniques is available for aggregate characterization, including SE-HPLC, DLS, static light scattering (SLS), fluorescence spectroscopy, circular dichroism, mass spectrometry, micro flow imaging (MFI), and nanoparticle tracking analysis (NTA) [87]. Each of these techniques gives different information about aggregation. SE-HPLC separates lower order aggregate species of various sizes, and with the help of a molecular weight ladder or an integrated MALLS detector, the sizes and molecular weights of the different aggregates can be computed [87]. DLS provides the average hydrodynamic radius of an aggregated sample. Spectroscopic tools such as circular dichroism, FTIR and fluorescence spectroscopy give information about the protein's secondary and tertiary structure. Flow imaging and tracking tools such as NTA and MFI measure particle sizes in a flowing medium [87].
On the basis of the data obtained from these characterization tools, the sizes of the various particles formed during aggregation are calculated, and these size data can be used to deduce plausible aggregation pathways taken by the protein molecules. Researchers have described one such mechanism in which reversible protein aggregation is a complex phenomenon where smaller aggregates such as dimers and oligomers form rigid and stable complexes that are in equilibrium with the monomer, with their stability depending on the dissociation constant (Kd) [89]. They compared two molecules, insulin and a mAb. For insulin, Kd is around 10 µM in the absence of zinc. Insulin is generally formulated and administered at concentrations higher than this Kd, so it exists as a dimer at physiological pH, whereas at low pH (pH 2.0) it exists as a monomer. At protein concentrations around Kd, monomers and dimers coexist in comparable amounts [89], while at concentrations above Kd dimers are favoured over monomers. For mAb therapeutics, Kd values are much higher than for insulin, and stable dimers and other oligomers are generally observed at high mAb concentrations (100 mg/ml). Larger aggregates are stable and observable by imaging techniques, in contrast to dimers and oligomers, which are quite unstable and in dynamic equilibrium with each other [89]. It should be noted that aggregates containing up to 100 monomeric units are considered molecular aggregates; macroscopic particles are considered only when the particles become very large and macroscopic phase separation occurs [89]. Besides reversible aggregation, irreversible aggregation, that is, the formation of non-native aggregates, is also important; such aggregates do not dissociate upon dilution or upon changing the protein's environmental conditions [89].
These non-native aggregates form as a result of complete loss of the protein's secondary and tertiary structure and the formation of strong non-covalent interactions between multiple molecular chains. Non-native aggregation includes the formation of a misfolded nucleus species [89]. This nucleus is generally very stable, and further growth of non-native aggregates occurs by addition of other monomeric units to it. Nucleation is generally the slow, rate-limiting step in non-native aggregation, which is kinetically controlled and depends on the concentrations of monomer and of the various aggregated species present in solution [89].
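For a simple reversible monomer-dimer equilibrium, the link between total concentration and Kd can be worked out in closed form. The Python sketch below does this for Kd = [M]²/[D] with the total concentration expressed in monomer units, C = [M] + 2[D]. The Kd of 10 µM is the insulin value quoted above; the three total concentrations and the function names are hypothetical, and higher oligomers are ignored.

```python
import math

def monomer_dimer(c_total, k_d):
    """Free monomer and dimer concentrations for M + M <-> D.

    Solves 2*[M]**2 / k_d + [M] - c_total = 0 for [M], where c_total is the
    total protein concentration in monomer units and k_d = [M]**2 / [D].
    """
    m = k_d * (math.sqrt(1.0 + 8.0 * c_total / k_d) - 1.0) / 4.0
    d = m * m / k_d
    return m, d

def dimer_mass_fraction(c_total, k_d):
    """Fraction of all monomer units that are bound in dimers."""
    _, d = monomer_dimer(c_total, k_d)
    return 2.0 * d / c_total

k_d_um = 10.0  # micromolar, value quoted in the text for zinc-free insulin
for c_um in (1.0, 30.0, 1000.0):  # hypothetical total concentrations
    frac = dimer_mass_fraction(c_um, k_d_um)
    print(f"C = {c_um:7.1f} uM -> fraction of protein in dimers = {frac:.2f}")
```

Within this two-state picture, the fraction of protein present as dimer rises steadily as the total concentration increases past Kd, and the free monomer and dimer concentrations are equal when the free monomer concentration equals Kd.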
Protein stability can be estimated and quantified in terms of a parameter known as the Fuchs stability ratio (W) [90,91]. W is calculated on the basis of the interaction potential between two particles and is highly sensitive to changes in surface potential or interactions. It can be determined by fitting experimental data to population balance equations. Researchers have described the relation between aggregate reactivity and aggregate size. For the ideal case, the aggregation kernel was determined using W, assuming that the aggregating spheres have uniform reactivity [91]. In reality, however, the reactivity of a protein depends strongly on its conformation in three-dimensional space and on the extent to which hydrophobic patches are exposed to the buffer environment. This can be accounted for by considering all monomeric and aggregated species, such as monomer, nucleus, oligomers and larger aggregates. Reactivity is inversely related to the coverage of hydrophobic patches: as aggregation proceeds, the number of hydrophobic patches available to contribute to further aggregation decreases, ultimately lowering the reactivity of the aggregates [80,91]. Researchers have used this approach to predict heat-induced aggregation of mAbs. Their study confirmed that two mAbs subjected to the same stress condition can behave differently, follow different aggregation pathways, and have high W values. This differential behaviour may be due to electrostatic repulsive forces, scattered hydrophobic patches on the protein surface, and hydration forces, which determine the stability and aggregation propensity of mAbs in solution [52,80,91].
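To show how W follows from a pair interaction potential, the Python sketch below evaluates the classical Fuchs expression for two equal spheres of radius a, W = 2a ∫ exp(V(r)/kT) / r² dr taken from r = 2a to infinity, neglecting hydrodynamic interactions. The screened repulsive potential, its parameters, and all function names are hypothetical choices for illustration; they are not the potentials used in the cited studies, where W was obtained by fitting experimental data.

```python
import math

def fuchs_stability_ratio(potential_kt, radius, n=20000):
    """Numerically evaluate W = 2a * int_{2a}^{inf} exp(V(r)/kT) / r**2 dr.

    potential_kt : function returning V(r) in units of kT
    radius       : sphere radius a (same length unit as r)
    The substitution u = 2a / r maps the integral to int_0^1 exp(V(2a/u)/kT) du,
    which is evaluated with the midpoint rule.
    """
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        r = 2.0 * radius / u
        total += math.exp(potential_kt(r))
    return total / n

def no_interaction(r):
    """Non-interacting spheres: V = 0 everywhere, so W should equal 1."""
    return 0.0

def make_screened_repulsion(strength_kt, radius, decay_length):
    """Hypothetical screened repulsion, in kT, decaying from contact at r = 2a."""
    def potential(r):
        contact = 2.0 * radius
        return strength_kt * (contact / r) * math.exp(-(r - contact) / decay_length)
    return potential

w_free = fuchs_stability_ratio(no_interaction, radius=5.0)
w_repulsive = fuchs_stability_ratio(make_screened_repulsion(5.0, 5.0, 1.0), radius=5.0)
print(f"W without interactions = {w_free:.3f}, W with repulsion = {w_repulsive:.2f}")
```

In this picture W = 1 corresponds to diffusion-limited aggregation, and a repulsive barrier gives W > 1, meaning that only a fraction of collisions leads to aggregation.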
These approaches can be used to model and understand the effect of excipients on individual steps of aggregation [80]. Researchers have used population balance for predicting aggregation kinetics for monoclonal antibody products in acidic conditions [90]. Changes in the radius of gyration and hydrodynamic radius of the mAb samples were predicted as a function of time and used to model aggregation kinetics using Smoluchowski's population balance modelling [90]. In situ light scattering was used to compute these parameters and to derive fractal dimensions, and colloidal stability of protein particles was described in terms of Fuchs stability ratio, a parameter that offers insight into the Arrhenius dependency of the different rate constants of aggregation and also into the mechanism of aggregation. This ratio is hence indicative of whether aggregation is due to monomer-monomer addition or cluster-cluster interaction. The researchers concluded that reactive monomer species are more prone to aggregation as compared to reactive aggregate species because of a decrease in number of hydrophobic patches that are available for conglomeration [90]. Other researchers have reviewed the various theories that have been applied for understanding protein aggregation [91]. They have used coarse grain modelling for computing the colloidal and conformational stability of therapeutic proteins and explained the use of fractal dimension for characterizing irregular aggregate morphologies.
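The structure of a Smoluchowski-type population balance can be sketched in a few lines of Python. The example below tracks the number concentration of k-mers under a size-independent kernel equal to a diffusion-limited rate constant divided by W. The constant kernel, the truncation at a maximum aggregate size, the dimensionless parameter values, and the function names are simplifying assumptions for illustration; the cited studies use fitted, size-dependent kernels and additional unfolding steps.

```python
def smoluchowski_step(conc, kernel, dt):
    """One explicit Euler step; conc[k] is the number concentration of (k+1)-mers."""
    n_max = len(conc)
    rate = [0.0] * n_max
    for i in range(n_max):
        for j in range(i, n_max):
            size = (i + 1) + (j + 1)
            if size > n_max:
                break  # truncation: aggregates larger than n_max are not formed
            flux = kernel(i + 1, j + 1) * conc[i] * conc[j]
            if i == j:
                flux *= 0.5  # identical partners: avoid double counting pairs
            rate[i] -= flux         # one (i+1)-mer consumed
            rate[j] -= flux         # one (j+1)-mer consumed
            rate[size - 1] += flux  # one larger aggregate formed
    return [c + dt * r for c, r in zip(conc, rate)]

def simulate(n_max=30, monomer0=1.0, k_diffusion=1.0, w=10.0, t_end=20.0, dt=0.01):
    """Integrate from a pure-monomer start; the kernel is k_diffusion / w for all pairs."""
    conc = [0.0] * n_max
    conc[0] = monomer0

    def kernel(i, j):
        return k_diffusion / w

    for _ in range(int(round(t_end / dt))):
        conc = smoluchowski_step(conc, kernel, dt)
    return conc

final = simulate()
mass = sum((k + 1) * c for k, c in enumerate(final))
print(f"monomer remaining = {final[0]:.3f}, total monomer units = {mass:.6f}")
```

Because each aggregation event removes two clusters and adds one of the combined size, the total number of monomer units is conserved, and a larger W slows the depletion of monomer. Replacing the kernel function with a size-dependent expression is the natural extension towards distinguishing monomer addition from cluster-cluster growth.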
The molecular mechanisms and kinetics of size-based protein degradation have been an active area of research, and many groups have investigated them using computational tools. One recent investigation examined the aggregation kinetics of different subclasses of mAbs (IgG1 and IgG2) subjected to thermal stress [80]. SE-HPLC and light scattering were used in combination to follow the growth over time of the various species (monomer, dimer and trimer), along with their molecular weight and hydrodynamic radius. Population balance modelling was then performed on the experimental data to evaluate the role of each elementary step in aggregation. The main outcome of the study was that the subclass plays a significant role in determining the aggregation route, even when both molecules are kept under similar operating conditions. In mAbs, while monomer depletion is rate limited by conformational changes occurring within the monomer, collision between two monomers is the rate-limiting step for further aggregation. The Fuchs stability ratio (W) was used to calculate the kinetic rate constants and thereby quantify the interaction potentials between protein molecules. Protein aggregation is generally considered a multi-step process that starts with the formation of a non-native, partially unfolded intermediate that is highly susceptible to aggregation. This unfolded protein then accelerates the formation of nucleus species and higher order oligomers. The study reported that protein aggregation depends on both the colloidal and the conformational stability of the protein. Colloidal stability depends on the interactions between aggregation-prone protein molecules, whereas conformational stability depends on both the rate of protein unfolding (kinetics) and the free energy change that accompanies unfolding (thermodynamics) [80].
In another work, researchers focussed on aggregation behaviour under thermal stress at high protein concentration [91]. A kinetic model based on population balance equations was described, and the contribution of colloidal and conformational stability to the overall aggregation rate was examined [91]. The model accounts for the impact of rate constants, protein-protein interactions, solution viscosity and aggregate compactness under both dilute and concentrated conditions.
The mechanisms of size-based protein degradation (both aggregation and fragmentation) depend on many factors, and correlating these mechanisms with those factors has been the focus of considerable research. Researchers have described the aggregation of IgG1-based mAb therapeutics under various buffer conditions, comparing factors such as temperature, pH, salt concentration and buffer species, and correlating them using protein aggregation models such as the Lumry-Eyring nucleated polymerization (LENP) model, the extended Lumry-Eyring (ELE) model and the Finke-Watzky model [6]. In their study, pH was found to be the most important factor in the aggregation of mAb-based therapeutics, followed by temperature, salt concentration and buffer species. The LENP model fitted the experimental data better than the ELE model, indicating that aggregation is primarily nucleation dominated and that nucleation controls the monomer half-life [6].
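Of the models named above, the Finke-Watzky two-step scheme is the simplest to write down, so it serves here as a small worked example. The sketch assumes the standard minimal form (slow nucleation A -> B with rate constant k1, autocatalytic growth A + B -> 2B with rate constant k2) and its closed-form solution for the remaining monomer; the function names and parameter values are hypothetical and are not fitted values from [6].

```python
import math


def fw_monomer(t, a0, k1, k2):
    """Remaining monomer concentration [A](t) for the Finke-Watzky two-step model."""
    growth = k1 + k2 * a0
    return (k1 / k2 + a0) / (1.0 + (k1 / (k2 * a0)) * math.exp(growth * t))


def fw_half_life(a0, k1, k2):
    """Time at which [A] = a0 / 2, obtained by inverting fw_monomer."""
    return math.log(2.0 + k2 * a0 / k1) / (k1 + k2 * a0)


if __name__ == "__main__":
    a0, k1, k2 = 1.0, 1e-3, 0.5          # illustrative values, arbitrary units
    t_half = fw_half_life(a0, k1, k2)
    print(f"monomer half-life: {t_half:.2f}")
    for t in (0.0, t_half, 2.0 * t_half):
        print(f"t = {t:6.2f}  [A] = {fw_monomer(t, a0, k1, k2):.4f}")
```

A small k1 relative to k2·a0 produces the long lag phase followed by rapid monomer loss that is characteristic of nucleation-dominated aggregation.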
Proteins in their native conformations can form a multitude of aggregates that vary in size and shape, depending strongly on the external buffer conditions of the sample. Researchers have reviewed the mechanisms and factors leading to the formation of larger aggregates [92]. They categorized the supramolecular structures of higher order aggregates as follows: amorphous aggregates, oligomers, fibrils, protofibrils, superstructures, amyloid-like spherulites, and protein particulates. In terms of mechanism, transitions between soluble precursors and insoluble aggregates occur through the formation of various inter- and intramolecular interactions [92]. These interactions are largely affected by the protein structure, concentration, and the physicochemical conditions of the buffer. Models for these higher order aggregates are largely based on the formation of a nucleus and the subsequent growth of the aggregates. Steps in the formation of these structures include micelle formation, protein conformational changes, and joining of filaments. Several tools have been listed for characterizing these higher order aggregate structures, including FTIR, thioflavin/ANS/tryptophan fluorescence, far-UV CD, DLS, AFM, TEM, and fluorescence microscopy [92,93]. As discussed in the previous section, protein concentration is an important factor affecting protein stability. Researchers have related protein-protein interactions and aggregation propensity to protein concentration [94]. Aggregation rates were measured across a pH range of 5-6.5 in the presence of excipients such as sucrose and NaCl using SE-HPLC, SLS and DLS, and it was observed that at low protein concentrations, small changes in protein conformation and protein-protein interactions greatly impact the aggregation kinetics.
Various hypotheses have been tested for correlating protein-protein interactions with changes in aggregation propensity, including thermodynamic changes during protein aggregation, statistical mechanical fluctuation theory, and surface contact probabilities. Of these, only the surface contact probability hypothesis was found to be consistent with experimental results, and a semiquantitative relationship was established between protein-protein interactions and protein aggregation rates [94].
Computational approaches for studying protein aggregation have been gaining interest. Researchers have reviewed the methods and applications of computational approaches to protein aggregation [95]. They compared atomistic and coarse-grained models for predicting, and gaining deeper insight into, the mechanism of protein aggregation. In the atomistic approach, monomeric and small oligomeric molecules were examined for their interactions with dyes, small molecules and peptide inhibitors. Atomistic models provide information on the early stages of aggregation. Unstable states such as partially folded and aggregation-prone structures are very difficult to study experimentally [95], and simulation-based analysis has proven useful for analysing them. In the coarse-grained approach, peptide aggregation has been studied extensively, and coarse-grained models have been used to study higher order aggregation. The coarse-grained models that have been used include lower resolution models, phenomenological models, the Martini force field model, the PRIME model, discontinuous molecular dynamics, and the optimized potential for efficient protein structure prediction (OPEP) protein model [95].
Researchers have described the use of a kinetic approach to reveal the mechanisms by which molecular chaperones inhibit aggregation [96]. Aggregates play a crucial role in a myriad of human diseases, and their formation is strongly affected by molecular chaperones. The authors used the kinetics of protein aggregation to shed light on the interactions between chaperones and protein molecules. This method helped to identify the protein species that are actively targeted and the steps that are inhibited by these chaperones [96].
The kinetics of protein fragmentation have also been studied in detail recently, using kinetic modelling to show the significance of various factors affecting mAb fragmentation [71]. The fragmentation studied was nonenzymatic and occurred under high temperature conditions. The fragmentation rate and the fragment species formed were characterized by analytical tools such as SE-HPLC, DLS, SDS-PAGE and LC-MS. Based on the results of the kinetic analysis, the authors concluded that at high temperature the mAb is cleaved at the hinge region, resulting in the formation of Fab-Fc and Fab/Fc fragments [71].
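As a rough illustration of how temperature enters such a kinetic analysis, the sketch below assumes pseudo-first-order loss of intact mAb with an Arrhenius temperature dependence. Both assumptions, as well as the pre-exponential factor and activation energy used, are hypothetical and are not taken from [71].

```python
import math

R_GAS = 8.314  # universal gas constant, J/(mol K)


def arrhenius_rate(pre_factor, activation_energy, temperature_k):
    """First-order rate constant k = A * exp(-Ea / (R T))."""
    return pre_factor * math.exp(-activation_energy / (R_GAS * temperature_k))


def intact_fraction(rate_constant, time):
    """Fraction of mAb still intact after `time`, assuming first-order cleavage."""
    return math.exp(-rate_constant * time)


if __name__ == "__main__":
    pre_factor = 1e15           # per day, hypothetical
    activation_energy = 100e3   # J/mol, hypothetical
    for temp_c in (25, 40, 60):
        k = arrhenius_rate(pre_factor, activation_energy, temp_c + 273.15)
        print(f"{temp_c} C: k = {k:.3e} per day, "
              f"intact after 30 days = {intact_fraction(k, 30.0):.3f}")
```

Fitting such a model to fragment levels measured at several temperatures would yield an apparent activation energy for hinge cleavage, subject to the first-order assumption holding.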

Control of protein's size-based degradation
Protein stabilization is critical for the commercial success of a bio-therapeutic. Excipients are known to stabilize protein therapeutic formulations via different mechanisms, such as reinforcement of stabilizing forces or weakening of the denatured protein state. Some excipients also act by binding directly to the protein molecule [97,98]. Protein excipients can be categorized into classes such as salts, antioxidants, surfactants, polyols, buffers, amino acids like arginine and lysine, sugars like sorbitol, sucrose and mannitol, polysaccharides, and small peptides [98][99][100]. These excipients are generally used at concentrations ranging from 0.1 M to 1 M. They act by decreasing surface adsorption and facilitating physiological osmolality, and they interact chiefly with the protein, the container surface, and water. Excipients stabilise proteins through either direct or indirect interactions. Direct interactions are generally encountered in the dried or lyophilized state [101]. Approaches to reduce aggregation and increase protein stability include lyophilisation or spray drying, conversion of therapeutics into drug conjugates such as antibody drug conjugates (ADCs), and use of inline filters prior to administration.
Excipients have been shown to stabilise protein solutions by weakening the denatured state, by reinforcing protein stabilizing forces, and sometimes by binding directly to the protein, at various stages of formulation development such as drying, lyophilisation and purification [102]. Liquid formulations are more susceptible to instability than dried or lyophilized formulations because of the higher mobility of the protein molecules, which increases the chance of molecular collisions [103,104]. Lyophilized formulations are generally more stable and offer greater shelf life. In liquid formulations, interactions at the liquid-air or solid-liquid interface cause instability through adsorption and protein unfolding. However, lyophilisation cycles can themselves sometimes cause instability in protein formulations. Proper optimization of the lyophilisation process can therefore help to minimize lyophilisation-induced aggregation [104][105][106].
Antimicrobial agents such as benzyl alcohol, which are present in some liquid formulations, are known to promote aggregation by enhancing the formation of partially unfolded states. Some solutes may also stabilize the protein by minimizing the surface area that is exposed to the excipient or co-solute [102]. This reduction in exposed surface area can be achieved either by self-association of the monomer or by adsorption of the monomer to a surface.
A key strategy to control aggregation is modification of the hot spot regions of proteins that are more susceptible to instability. Hot spot modifications can help to reduce aggregation by modifying the protein surface charge and improving protein solubility. However, major changes in the protein's primary sequence or structure can also significantly impact protein activity in an undesirable fashion [97][98][99]. Attachment of hydrophilic moieties such as polyethylene glycol (PEG), hydroxyethyl starch (HES), or glycan moieties to the protein sequence may improve protein solubility in aqueous solutions [97][98][99][101].
Since changes in the primary structure of the protein carry many risks with respect to immunogenicity and bioactivity, creating a stable formulation is preferred [102]. This entails the choice of environmental conditions (pH, conductivity) and the use of stabilizers. Monovalent cations are known to be more efficient than multivalent cations in stabilising protein formulations [103]. Stabilizers commonly act by increasing conformational and colloidal stability, shielding hydrophobic patches or enhancing protein disaggregation. High hydrostatic pressure, with or without the assistance of high temperature, can also be used to dissociate aggregates. Researchers have successfully demonstrated this for green fluorescent protein inclusion bodies and growth hormone aggregates [107]. Inline filters can also be used to filter out large aggregate particles that may form due to the presence of saline or dextrose as diluents in these formulations [108]. Commonly used surfactants that have been demonstrated to improve the stability of protein formulations include polysorbates (PS 80 or PS 20), alkyl saccharides, and amino acid derivatives [9,13,27,109,110]. These stabilizers generally act by minimizing self-interaction between protein molecules and reducing adsorption to interfaces.

Recent developments in control of protein's size-based degradation
The use of various types of excipients to prevent protein therapeutics from aggregating has been in demand recently. Here we focus on the important categories of salts, surfactants and sugars. Researchers have defined an excipient as an ingredient other than the main protein component in a protein formulation which improves the quality and performance of a drug product [111,112]. As discussed earlier, several classes of excipients are used in protein formulations. Excipients used in biologics include buffering agents, antioxidants, surfactants, salts, sugars, carbohydrates, and amino acids. These excipients help to maintain protein solubility, osmolarity and osmolality in the solution, thereby enhancing the therapeutic's shelf life. They also help proteins to maintain their structure and conformation in solution. Buffers have a critical role in maintaining the pH of the protein solution and providing stability to it. Commonly used buffers are citrate, acetate, phosphate, glycine and histidine. One study has shown that citrate at near neutral pH is favourable for protein formulation because of its lower crystallization tendency and higher collapse temperature. Another study has shown that glycine, when added as an excipient to phosphate buffer, reduces the buffer's crystallization tendency and hence enhances overall formulation stability [113]. Sugars and carbohydrates have also been shown to stabilise protein formulations and are used especially during freeze drying and freeze thawing operations. They operate by mechanisms such as reduced mobility, water replacement and preferential exclusion. Commonly used sugars are trehalose and sucrose. Of these two, trehalose is preferred to sucrose because of its higher glass transition temperature [113,114].
Salts have been used to stabilize protein therapeutic solutions at high concentration by reducing the viscosity of the protein formulations [115]. Salts such as NaCl, NH4Cl, NaAc and Na2SO4, or salt derivatives of common amino acids such as ArgHCl, HisHCl and LysHCl, could help in stabilising high concentration protein formulations. High concentration protein solutions have high viscosity due to increased charge-charge interactions. Salts and salt derivatives of amino acids stabilise high concentration therapeutic formulations by lowering these charge-charge interactions amongst protein molecules in solution [115].
Commonly used surfactants in protein therapeutic formulations, such as PS 80, PS 20, Triton X-100, Pluronic F-68, and caprylic acid, have been shown to be effective in stabilising protein formulations against various types of mechanical, thermal and photo degradation. Surfactants play a critical role in controlling the stability of protein therapeutics by affecting interfacial properties such as surface tension and by competing for surface adsorption. Examples include polyethylene glycol, poloxamer-188 and PS 20 [112,114]. These surfactants adsorb to the interface, thereby effectively reducing the concentration of soluble aggregates (<100 nm). Insoluble aggregates are controlled through the formation of a cohesive network between adsorbed mAb and surfactant [112]. In this study, various interfacial parameters and properties were used to elucidate the mechanism by which surfactants stabilise protein therapeutics, including surface tension, surface tension gradients for various surfactants, and interfacial rheology [112]. In a recent study, several ionic and non-ionic surfactants were evaluated for their ability to stabilise mAb therapeutics against degradation [114]. It was observed that caprylic acid, an anionic surfactant, has a slightly negative effect on the stability of protein therapeutics against mechanical and light related degradation as compared to non-ionic surfactants like PS 80 or PS 20. A concentration of at least 0.005% (w/v) is needed to offer stability to mAb based therapeutics [114].
A recent publication explored the stabilizing effects (against both aggregation and conformational perturbation) of different commercially available surfactants used in therapeutic protein formulations [116], including Tween 20, Tween 80, PEG dodecyl ether (Brij35), CHAPS, Triton X-100, Pluronic F68 and F127, and SDS. These surfactants were selected for their diverse physical and chemical properties, and the stability of two different mAbs was examined using various analytical techniques. The researchers found that non-ionic surfactants perform better than ionic surfactants in terms of safety and efficacy in the concentration range of 0.02-2 mg/ml. The concentration of a surfactant in a protein formulation should be below its critical micelle concentration (CMC). However, lower surfactant concentrations (<0.002 mg/ml) are generally found to be ineffective in stabilising the formulation. Above 2 mg/ml, structural perturbation becomes likely, to an extent that depends strongly on the surfactant type. This suggests that the stabilising effect of surfactants depends on their absolute mass concentration and not on their molecular weight. Hydrophobic surfactants such as Tween 80 and Triton X-100 give stable protein formulations at or above concentrations of 2 mg/ml. Ionic surfactants like CHAPS are also effective in stabilising protein formulations when used below 2 mg/ml, but they have to be used with caution as they can decrease thermal and long-term protein stability. SDS, which is also an ionic surfactant, should not be used for stabilising protein formulations because of its significant impact on protein structure. Poloxamers such as F68 and F127 are also effective, as they prevent aggregation and enhance protein stability.
The stabilising effect of surfactants is also highly dependent on the protein type and its concentration [116]. At a protein concentration of 50 mg/ml, one of the proteins displayed higher stability in the presence of Tween 20 and Tween 80 than the other protein, indicating that surfactant stabilization is highly protein and concentration dependent. The stabilising property of a surfactant is generally credited to its ability to reduce the structural perturbation caused by adsorption to various interfaces and to its capacity for inhibiting protein-protein interactions by binding to hydrophobic surfaces [116].
To conclude, the use of surfactants in IgG formulations depends strongly on the type and concentration of both surfactant and protein, and on the protein's mechanism of aggregation. A fine balance between stability enhancing and perturbing forces is needed to achieve the desired outcome of a stable protein formulation [116].
In addition to surfactants, sugars are another common class of excipients. The mechanism by which various sugars stabilise different protein therapeutics in the solid state and during drying has been reviewed recently [117]. Sugars interact with protein molecules during drying operations, which helps to preserve the protein in its native form and lowers the molecule's overall mobility during storage. It has been observed that the smaller the sugar molecule, the better its interaction with the protein, owing to lower steric interference. The overall stability provided by a sugar or saccharide depends on many factors, such as the protein under study, the conditions under which the formulation is prepared, and the storage conditions [117].
Excipient crystallization also has a strong impact on the unfolding of protein molecules, ultimately affecting protein aggregation. Researchers have demonstrated the consequences of crystallization of mannitol and trehalose on the stability of IgG1 at different pH conditions through repeated freeze-thaw cycles [118]. At low pH, the aggregation induced by recurring freeze-thaw stress could be lessened by the use of a surfactant. An increase in the number of freeze-thaw cycles, along with a decrease in the formulation pH, resulted in the formation of soluble and insoluble particles [118].
Researchers have combined modelling with experimental characterization to explore the effect of sucrose and polyols (sorbitol, mannitol, glycerol and threitol) on mAb aggregates formed by heating [119]. The authors concluded that the protein unfolding rate governs the overall monomer depletion kinetics [119].
Apart from conventional excipients, nonconventional excipients are also gaining interest for stabilising protein formulations. One recent publication demonstrated the use of gold nanoparticles (AuNPs) as anti-aggregating particles, which makes them promising candidates as neuro-medicines for the treatment of Alzheimer's disease [120]. Peptides and peptide-based dendrons have also been shown to stabilise mAb and other therapeutic formulations [100,121]. Researchers have shown that small peptide-based ligands complementary to the hydrophobic regions of insulin prevent aggregation and provide stability [90]. Similar work has shown that third-generation lysine-based dendrons effectively prevent both aggregation and fragmentation of mAb-based therapeutics [121].

Conclusions
Size-based protein degradation continues to be a significant problem for the biopharmaceutical industry. It is imperative to understand how protein stability, and the mechanisms and modelling of size-based degradation, impact the quality, safety and efficacy of a bio-therapeutic. The Lumry-Eyring model and its variants (LENP and ELE) are the most widely accepted models for describing aggregation kinetics in bio-therapeutic proteins. Knowledge of the mechanism, as well as of the various species that are formed in the process, can be useful in creating an appropriate strategy for monitoring product stability during production and storage. Strategies for mitigating size-based degradation include changes to the formulation through the use of excipients, which can stabilise protein formulations. It is hoped that the information presented here will be of interest to researchers working on formulation development and stability of bio-therapeutic products.

Acknowledgements:
The authors would like to express their gratitude to the anonymous referees for carefully reading the paper and giving valuable suggestions.
Funding information: Authors state no funding involved.

Conflict of interest: Authors state no conflict of interest.
Data availability statement: Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.