The Real-life Experience of Developing and Commercializing TruGraf, a Validated Non-Invasive Transplant Biomarker

Despite improvement in short-term outcomes, long-term results for kidney transplant recipients remain suboptimal. Immunological rejection is a leading cause of graft failure, and recent research points to undetected “silent” subclinical acute rejection as a key component of this problem. While biopsies remain the gold-standard method for detecting silent rejection, non-invasive methods offer significant advantages, especially for patient safety and for serial monitoring of stable patients. This manuscript details the real-life challenges involved in the ultimately successful development and commercialization of TruGraf, a clinically validated, blood-based gene expression assay that offers the potential to reduce the use of surveillance (protocol) biopsies in renal transplant recipients with stable renal function.


Introduction
Kidney transplantation is the optimal treatment for most patients with chronic renal failure (1). However, the long-term success of kidney transplantation is far from optimal (2). In 2017, 10-year all-cause graft failure was 49.7% for deceased donor kidney recipients and 34.1% for living donor kidney recipients (3). Immunological rejection, a major cause of graft failure, is driven by attack of the graft by T cells (T cell mediated rejection, or TCMR) or antibodies (antibody mediated rejection, or ABMR), or in some cases a combination of these two mechanisms (mixed rejection). A key early contributor to long-term graft loss is subclinical immune injury that leads to chronic damage of the renal allograft (4)(5)(6)(7)(8). Until recently there have been no commercially available, fully validated non-invasive tests to monitor patients with stable renal function for silent rejection (9). As a result, a significant number of centers rely on surveillance (protocol) biopsies to detect early silent rejection, whereas centers that choose not to perform them wait for clinical evidence of graft injury and damage (10,11).

Situational Analysis
Standard non-invasive monitoring to detect kidney injury secondary to rejection or other causes includes measuring serum creatinine levels and immunosuppressive drug levels, both of which are insensitive and nonspecific. Clinical manifestations of severe rejection, such as fever, pain over the graft, or decreased urine output, may be present but are infrequent findings with current immunosuppressive regimens. Thus, current non-invasive monitoring detects rejection only when it is advanced and only after significant, and potentially irreversible, damage to the graft has occurred. Indication ("for-cause") biopsies are typically performed to determine the cause of acute renal dysfunction.
Biopsies are expensive, invasive, and suffer from significant variability in interpretation (12). Moreover, biopsies put patients at risk for significant complications such as infection, bleeding, and even graft loss, in addition to being painful and inconvenient (13). However, indication biopsies remain essential in the management of patients with renal dysfunction and are used ubiquitously by transplant programs. In sharp contrast, while a number of transplant programs have adopted the routine use of surveillance biopsies to detect subclinical acute rejection (subAR) in patients with stable renal function, several factors have discouraged other programs from following suit. These include all the issues stated above and, in addition, the fact that biopsying stable patients indiscriminately results in negative (unnecessary) invasive procedures the vast majority of the time. Thus, a non-invasive monitoring strategy that replaces invasive protocol biopsies is sorely needed and has been the focus of several investigators in the past few years.
Previous investigators focused on developing non-invasive biomarkers in the urine and blood to diagnose rejection in patients with graft dysfunction (clinical acute rejection, cAR) in an attempt to replace indication biopsies. There are two major fallacies in this approach: first, while some patients with subAR develop cAR, others exhibit ongoing subAR that causes more chronic graft injury; second, in the absence of a paired biopsy for each sample, it is difficult to be certain that positive results produced from these samples by bio-informatics approaches are real. For this reason, we set out to develop a biomarker specific for subAR by using only blood samples paired with protocol biopsies in patients with stable renal function.

Development of a validated peripheral blood biomarker for subAR
Identifying the need for a non-invasive replacement for biopsies in stable patients, we set out to discover and validate a peripheral blood biomarker to detect subAR in these patients as a "rule in" test, similar to biopsies. While our clinical trials and sample collection regimens were well designed, the evidentiary data and biomarker performance that resulted caused us to rethink the context of use (COU) of the biomarker.
Subclinical acute rejection (subAR), also referred to as "silent" rejection, is histologically defined acute rejection characterized by tubulointerstitial mononuclear cell infiltration identified from a biopsy specimen in a patient with normal or stable renal function (4)(5)(6)(7)(8). In the NIH-sponsored CTOT-08 study of 307 kidney transplant recipients (7), the natural prevalence of subAR, based on surveillance biopsies, was 20% at 3-6 months and 25% at the 12- and 24-month surveillance biopsies, with an overall prevalence of 35% (7). Of note, 80% of the subAR was of the borderline variety when classified by central pathology using the Banff criteria (14), and importantly, the biopsy was normal in 75% of cases. At the two-year time point, patients with subAR on surveillance biopsies had worse outcomes than patients without subAR. This was based on a composite clinical endpoint (CCE) consisting of biopsy-proven acute rejection (BPAR) on any "for-cause biopsy" by central read, or a 24-month biopsy (central read) showing evidence of chronic injury measured by interstitial fibrosis and tubular atrophy (IFTA) of Banff grade ≥II IFTA (ci ≥2 or ct ≥2), or a decrease in estimated glomerular filtration rate (GFR) by >10 mL/min/1.73 m² between 4 and 24 months posttransplant (7). SubAR was also associated with a higher frequency of both class I and class II de novo donor specific antibody (dn DSA) development (7,15).
In addition to the CTOT-08 data described above, a number of clinical studies have also recently associated subAR with poor outcomes (4)(5)(6)(7)(8)(15)(16)(17)(18)(19). A study in recipients on a rapid steroid withdrawal protocol compared outcomes in patients with no inflammation and those with subclinical inflammation on a 3-month surveillance biopsy. In the patients with subclinical inflammation, the serum creatinine levels were significantly higher at 24 months, and the allograft chronicity index on biopsy, the rate of subsequent BPAR and development of dn DSA were all significantly increased at 12 months (16). A large Australian study compared outcomes in patients with normal biopsies, those with borderline rejection, and those with T cell mediated acute rejection. Compared to patients with normal biopsies, patients with borderline rejection had worse renal function and more IFTA, subsequent acute rejection, allograft failure and patient mortality (17). A recent study of subclinical inflammation phenotypes and long-term outcomes in 103 pediatric renal transplant recipients highlights the importance of subAR and of its treatment (18). In this study, surveillance biopsies were performed in the first 6 months and a composite endpoint (CEP) of acute rejection and graft failure was measured at 5 years. The CEP was reached by 41% of patients with treated borderline rejection vs. 67% of those with untreated borderline rejection (p<0.001) (18). Additionally, another recent publication has shown that borderline early acute rejection is associated with the development of late acute rejection and graft loss (19).
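The composite clinical endpoint described above is a logical "or" of three criteria, which can be made explicit in a short sketch. This is only an illustration under stated assumptions: the thresholds on ci and ct are read as "greater than or equal to 2", and the record type `FollowUp`, its field names, and the function `meets_cce` are hypothetical names introduced here, not part of the CTOT-08 study materials.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FollowUp:
    """Hypothetical per-patient follow-up record (field names are illustrative)."""
    bpar_on_for_cause_biopsy: bool       # BPAR on any for-cause biopsy (central read)
    ci_24m: Optional[int] = None         # Banff ci score on the 24-month biopsy, if done
    ct_24m: Optional[int] = None         # Banff ct score on the 24-month biopsy, if done
    egfr_4m: Optional[float] = None      # eGFR at 4 months, mL/min/1.73 m^2
    egfr_24m: Optional[float] = None     # eGFR at 24 months, mL/min/1.73 m^2


def meets_cce(f: FollowUp) -> bool:
    """Return True if any of the three CCE criteria is met."""
    # Chronic injury on the 24-month biopsy: ci >= 2 or ct >= 2 (assumed reading).
    chronic_injury = (
        (f.ci_24m is not None and f.ci_24m >= 2)
        or (f.ct_24m is not None and f.ct_24m >= 2)
    )
    # eGFR decrease of more than 10 mL/min/1.73 m^2 between 4 and 24 months.
    egfr_decline = (
        f.egfr_4m is not None
        and f.egfr_24m is not None
        and (f.egfr_4m - f.egfr_24m) > 10
    )
    return f.bpar_on_for_cause_biopsy or chronic_injury or egfr_decline
```

Missing measurements are treated here as "criterion not met", which is a simplifying choice of this sketch rather than a rule taken from the study.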

The Trials and Tribulations of a) developing and b) commercializing a non-invasive biomarker for subAR
a) Development
The TruGraf® Blood Gene Expression Test (Transplant Genomics, Inc, Mansfield, MA) is a microarray-based assay that analyzes gene expression profiles (GEP) in the peripheral blood. Our initial strategy was to develop a "rule in" test, whereby a positive test would be highly predictive of a positive biopsy (subAR). As the bio-informatics approach, we first used a locked support vector machine (SVM) based classifier, with a bootstrap for internal validation to prevent over-fitting of the discovery set (20). We made two observations. First, at different thresholds we traded positive predictive value (PPV) for negative predictive value (NPV) to the point that a "rule in" test was not possible using this approach. Second, when we switched to Random Forest (RF) as the bio-informatics approach (21) and used a different threshold, it was again evident that the intended use of the biomarker would need to change. Because the performance metrics were better with RF, we proceeded to use RF but picked thresholds more favorable for a "rule out" test (21). The product was a GEP classifier that associates with either a normal protocol kidney biopsy (Transplant eXcellence, TX) or the absence of a normal biopsy (not-TX) in patients with stable renal function. All aspects of discovery and external validation of the TruGraf test were performed on blood samples paired with biopsies from prevalent cohorts. For the purpose of validation, the model derived from pre-selected bio-informatics and the threshold used to test performance on the discovery cohort were locked. These data led us to use this approach for external validation in an early access program (EAP) for patients (22). The external clinical validation from seven EAP transplant centers defined the key clinical performance parameters for this assay, as summarized in Table 1 and Figure 1. In this study, the high NPV of TruGraf was demonstrated in clinical use, making it a strong rule-out test.
Over 90% of stable patients who received a TX result were confirmed to have an immune quiescent phenotype, meaning that a physician can have a high degree of confidence that a patient who tests as TX does not harbor silent subclinical rejection. Importantly, this study also found that up to 65% of surveillance biopsies could be avoided in the cohort tested. Unpublished data involving analysis of an additional 129 biopsy-confirmed blood samples provided by Northwestern University (originally used for the CTOT-08 study) revealed identical performance metrics for TruGraf (NPV of 90%). A fourth publication described the impact of TruGraf results on physician decision making (23). This study highlighted the high degree of confidence physicians place in the ability of TruGraf to provide valuable, added information that could lead to avoidance of unnecessary surveillance biopsies, as summarized in Table 2.
As a result of these experiences, we changed the proposed COU from replacing surveillance biopsies for detecting subAR to reducing the number of biopsies needed in stable patients, which should lead to far fewer invasive procedures (Table 1) as well as significantly fewer negative or unnecessary biopsies. The COU proposed in the recent approval from CMS states that "The TruGraf test is intended for use in kidney transplant recipients with stable renal function as an alternative to surveillance biopsies in facilities that utilize surveillance biopsies". Although the test is primarily used to rule out subAR, it is expected that centers that perform surveillance biopsies and centers that do not can both use it to assess the need for a surveillance biopsy in stable patients (24). Figure 2 illustrates a proposed approach for implementation of TruGraf into clinical care for kidney transplant recipients. For patients with stable renal function, a TruGraf result of "TX" identifies those who have a high likelihood of immune quiescence and a low likelihood of histologically defined rejection at the borderline level or higher. A result of "Not-TX" identifies those in whom silent rejection cannot be confidently ruled out, and who thus carry a higher risk of immune activation and borderline or higher rejection. Patients with a "Not-TX" result might benefit from further evaluation and possibly a change in therapy. Early identification of these patients potentially allows better allocation of physician resources, and potential reversal of the process before permanent damage to the donated kidney occurs.
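The trade-off between PPV and NPV at different classifier thresholds follows directly from Bayes' rule, and a small sketch can make it concrete. The two operating points below (pairs of sensitivity and specificity) are hypothetical values chosen only for illustration and are not TruGraf's measured performance; the 25% prevalence is taken from the CTOT-08 figure quoted earlier. The function name `predictive_values` is likewise an assumption of this sketch.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV, fraction of patients testing negative).

    All inputs are proportions between 0 and 1.
    """
    tp = sensitivity * prevalence                # true positives
    fn = (1.0 - sensitivity) * prevalence        # false negatives
    tn = specificity * (1.0 - prevalence)        # true negatives
    fp = (1.0 - specificity) * (1.0 - prevalence)  # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    negative_fraction = tn + fn  # share of patients who could skip a biopsy
    return ppv, npv, negative_fraction


if __name__ == "__main__":
    prevalence = 0.25  # subAR prevalence at 12 and 24 months in CTOT-08
    # Hypothetical operating points: a strict threshold (few positives called)
    # versus a lenient threshold (few cases missed).
    for label, sens, spec in [("strict", 0.40, 0.95), ("lenient", 0.80, 0.60)]:
        ppv, npv, neg = predictive_values(sens, spec, prevalence)
        print(f"{label}: PPV={ppv:.2f} NPV={npv:.2f} negative={neg:.2f}")
```

In this toy setting, moving from the strict to the lenient threshold raises NPV while lowering PPV, which is the pattern that favors a "rule out" context of use. The fraction of patients testing negative is the upper bound on surveillance biopsies that a rule-out strategy could avoid.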

b) Pathway to Commercialization of TruGraf
Developed in 2011, the Molecular Diagnostic Services (MolDX) program is run by Palmetto GBA, a Centers for Medicare and Medicaid Services (CMS) Medicare administrative contractor. It performs the following functions:
• Facilitates detailed and unique identification through registration of molecular diagnostic tests to facilitate claims processing and to track utilization.
• Establishes clinical utility expectations.
• Completes technical assessments of published test data to determine clinical utility and coverage.
CMS approved reimbursement for commercial TruGraf testing on November 25, 2019.

Conclusions
Silent subclinical rejection is frequent and contributes significantly to worse long-term outcomes for kidney transplant recipients. Until now, subAR could only be ruled in or out by invasive and risky surveillance (protocol) biopsies, resulting in a significant number of unnecessary biopsies and therefore unnecessary risk to patients. Thus, non-invasive tests are clearly needed to identify patients with stable renal function who are harboring subAR in their grafts. In response to this need, we first set out to develop a "rule in" test, with replacement of routine protocol biopsies as the context of use. However, based on the evidentiary performance data of our biomarker, we determined that it is best used as a "rule out" test and revised the proposed COU accordingly: in programs that currently use protocol biopsies, a reduction of a large proportion of them; in programs that do not, a monitoring strategy that subjects far fewer patients to the risks of biopsy and reduces the number of unnecessary (negative) biopsies may be attractive (24). To these ends, TruGraf is the first and only non-invasive test designed and validated for use in ruling out silent subclinical rejection in kidney transplant recipients with stable renal function.
Non-invasive blood testing can be done more frequently than surveillance kidney biopsies, is significantly less invasive, less painful and less risky for patients, and may result in considerable cost savings to the health delivery system.