Public policies promoting ‘screening’ for a cancer, while directly bearing on the cancer’s early diagnosis and treatment, are really aimed at reducing mortality from the cancer in the populations in question. In the adoption, or continuation, of such a policy, the policy-makers are informed by an estimate of the magnitude of the mortality reduction resulting from the policy. Even though this estimate has to do with mortality in a strictly epidemiological – community-medicine – meaning of the word, the relevant scientific input to it is about mortality in a very different, expressly clinical meaning of the word. But a third, seriously malformed concept of mortality from the cancer is the focus in the type of research that has been, and still is, commonly viewed as the only relevant, and also solely sufficient, source of the requisite empirical inputs into the mortality-reduction estimates used in those policy decisions. I here outline the essence of the fundamental misguidedness of this still-orthodox original research – and of its correspondingly misguided reviews – for scientific inputs into policy decisions, with special reference to breast cancer. I also sketch the type of research that really is needed for the estimations at issue – and comment on the larger implications of the prevailing orthodoxy in both original and derivative research for the advancement of the knowledge-base of those eminently important public-health policies.
Among the major concerns in modern epidemiology – modern community-medicine, that is – are the cared-for population’s rates of mortality from particular types of cancer, notably lung and breast cancers. In the control of these rates, the first-line programs are aimed at prevention of cases of the cancer in question, the modalities of prevention in these programs – of population-level preventive medicine – being education and regulation, not service. Such programs are of major importance in the control of mortality from lung cancer, while being only a relatively minor aspect of the control of breast-cancer mortality.
Important second-line, back-up types of program in the control of mortality from each of these two types of cancer in community-medicine are aimed at prevention of deaths from the cancer in unprevented cases of the cancer. To this end the community-level health-education needs to be directed to promotion of the population’s use of clinical services aimed at reducing the cancer’s case-fatality rate – the proportion of cases of the cancer resulting in death from it. In these services – diagnostic and therapeutic, rather than preventive (of the cancer) – the community-medicine focus generally is on the pursuit of the cancer’s early – latent-stage, preclinical – diagnoses and their consequent early, more commonly curative treatments.
Epidemiological promotion of the use of those clinical services involves, also, community doctors’ advocacy of the adoption of a public policy to publicly cover the cost of the initial testing in the (clinical) pursuit of the cancer’s early diagnosis – or, perhaps, to provide that testing as a population-level (epidemiological) program of service, with the test-positives referred to further diagnostic work-up and possible treatment in clinical medicine. (The cost of clinical care prompted by the test’s positive result gets to be covered independently of the policy in question.)
The initial testing toward early (latent-stage, preclinical) detection/diagnosis of an illness is commonly termed screening for that illness in the circles of community medicine and in those of public policies governing this branch of medicine. The basis for the use of this term, in this meaning in these circles, has been the idea that the initial testing can be a simple procedure, doable on a ‘mass’ basis on the community level to ‘screen’ people for possible referral to clinical diagnostics, etc. Such simplicity, however, hardly is a feature of tests that involve – as in the cases of lung cancer and breast cancer – not only heavy technology of imaging but also expert ‘reading’ of the images for the test result (on the binary scale of positive vs. negative, the former implying a need for further diagnostics).
From the clinical vantage, the pursuit of a cancer’s early diagnosis (in a person free of symptoms and clinical signs of the disease) is a continuum from the initial test through the diagnostic work-up following a positive result of that test. In this framework, the initial testing is not viewed as ‘screening,’ at least not in the meaning of this term in the circles of community medicine and public policy (above). And the clinical continuum actually does not end even at the potential (rule-in) diagnosis but involves its consequent early treatment as well.
This is to say that public policy about ‘screening’ for a cancer should be seen to actually be about an element in early clinical care directed to the cancer, and that ‘screening’ is a misleading misnomer for this element of care, obfuscating the clinical focus of the public policy at issue.
The decision to adopt, or maintain, a public policy to promote early clinical care for a cancer in a particular jurisdiction rests, principally, on the answer to this question: To what extent would, or does, the increase in this care, resulting from the policy, serve to reduce the mortality from the cancer in (particular segments of) the population at issue? The answer to this question should come from appropriately expert epidemiologists and epidemiological researchers together with clinicians expert on the cancer’s diagnosis and treatment, but it now commonly is provided by ‘task forces’ and ‘expert panels’ quite wanting in each of these three types of expertise.
Eminently illustrative of the prevailing practices in informing policy decisions about the mortality reduction resulting from ‘screening’ for a cancer is a recent case of such information-input in respect to ‘screening’ for breast cancer (Biller-Andorno and Jüni 2014). For this case, the background was that “In January 2013, the Swiss Medical Board, an independent health technology assessment initiative under the auspices of the Conference of Health Ministers of the Swiss Cantons, the Swiss Medical Association, and the Swiss Academy of Medical Sciences, was mandated to prepare a review of mammography screening.”
As is commonplace in these situations, that review was prepared by an “expert panel that appraised the evidence,” a panel that was not composed of suitably qualified epidemiologists and epidemiological researchers together with relevant clinical experts. The panel’s members were: an ethicist, a clinical epidemiologist, a clinical pharmacologist, an oncological surgeon, a nurse scientist, a lawyer, and a health economist. This panel noted that the evidence used for policy-decisions on “screening” for breast cancer has been derived from “a series of reanalyses of the same, predominately outdated trials.” Without any further critical appraisal of the evidence, the panel reported what those “reanalyses” (resyntheses) of the data from those trials had produced.
This panel made two quantitative statements about the results of those trials. One of them – a quotation from a report by a U.K. panel on this ‘screening’ (Independent U.K. Panel on Breast Cancer Screening 2013) – was about the results of prior reviews of this evidence: “The relative risk reduction of approximately 20% in breast cancer mortality … is currently described by most expert panels.” The other was this Swiss panel’s own, earlier-published “acknowledgment” that “systematic mammography screening might prevent about one death attributable to breast cancer for every 1,000 women screened.” No elaboration of the meanings was associated with these two statements.
“The report caused an uproar and was emphatically rejected by a number of cancer experts and organizations” (Biller-Andorno and Jüni 2014). The basis for this seems not to have been that the panel was deemed to have misrepresented the relevant literature by focusing on those trials, nor that the quantitative statements in it (above) were too obscure to be meaningful, nor that the trials themselves were fundamentally flawed. All of these features of the review were, after all, in line with the quality standards of the prevailing ‘normal science’ in this area. Against this normal-science character of the review it is astonishing that “one of the main arguments against [this panel’s report] was that it contradicted the global consensus of the leading experts in the field” (Biller-Andorno and Jüni 2014). The critics appear to have left unspecified whom they take those “leading experts” to be, and on what basis; what those experts’ beliefs are based on; and what their “global consensus” belief actually is.
I am in this context an equally emphatic contrarian, but in a very specific, fundamental way: I say that experts should have a consensus about the meaninglessness of the evidence that the panel (only superficially) reviewed and (only minimally) appraised.
The trials reviewed by that panel – a total of over 600,000 women have been involved in them – have followed the same pattern ever since the first of these, designed and conducted by an internist physician in collaboration with a statistician in the 1960s. In each of these trials, the subjects were (supposed to be) randomly assigned either to the ‘screening’ – the initial diagnostic testing (mammographic, mainly) – or to ‘usual care’ in this respect, and the comparison between these two cohorts was focused on mortality from the cancer over a period of follow-up after the entry into the trial. These two features – random allocation of the study subjects to such compared cohorts and the comparison of these cohorts in terms of mortality from the cancer – are now commonly held as cardinal requirements for policy-relevance of studies (original) on the effectiveness of ‘screening’ for a cancer.
Critical appraisal of this orthodoxy in those studies on ‘screening’ for breast cancer, on the most superficial level already, leads to very disturbing realizations about their quality, even if one accepts those two elements at the core of the prevailing methodologic doctrine. For one, the randomizations have repeatedly been disastrously flawed (Mukherjee 2012). One example of this is the “randomization” in the Canadian trial, in the 1980s, the nature of which “completely undid the trial” (Mukherjee 2012). The result of the process was not only preferential enrolment into the trial’s experimental arm of women with relatively high risk for the cancer (on account of a positive history about it, i.a.); enrolled into this arm of the trial were also women with clinical signs of the cancer’s presence. “Teams of epidemiologists, statisticians, radiologists, and at least one group of forensic experts have [tried to determine] what went wrong” (Mukherjee 2012).
And for another, the empirical measure of proportional reduction in mortality from the cancer, resulting from the ‘screening’ (really from early treatments replacing late ones) has not been a measure of any parameter of Nature. The empirical rate of death from the cancer for each of the two subcohorts in each of the trials has been taken to be that for the aggregate of population-time of follow-up as of the entries into the trial, with the measure of the proportional reduction in mortality derived from the ratio of these two rates. This ratio has thus been treated as though it were constant (apart from chance variation) over the duration of the follow-up, whatever this is, and also independent of the duration of the screening, whatever this is – treating this ratio as though it were a parameter of Nature and, as such, relevant to policies about the ‘screening.’ (These gross misconceptions have allowed those two design parameters to be set quite arbitrarily in the trial designs, and then to be ignored in syntheses [‘meta-analyses’] of the study results.)
While the Swiss reviewers characterized these trials as being, by now, mostly outdated, I add that each of these trials was fundamentally misguided and thereby seriously misleading, even for its time (cf. above). The cardinal flaws in these studies have been, first, the profound misconception of the measure of mortality of epidemiological concern in this context; and secondary to this, the lack of understanding what parameter of Nature needs to be studied, and how, for the scientific input to policy decisions about the ‘screening.’
As any policy about ‘screening’ for breast cancer is about clinical care as a determinant of the mortality from this disease in the population of a community (cf. above), understanding of the requisite knowledge inputs to these policies begins on the clinical level.
When a woman – one of the ‘worried well’ – consults a doctor about her risk of dying from breast cancer and, specifically, about the degree of reduction in this risk were she to be ‘screened’ for the disease, the doctor thinks about the reduction in the cancer’s case-fatality rate resulting from its early treatments following its early – latent-stage, preclinical – diagnoses in lieu of treatments of the cancer when it already is clinically manifest. For the doctor intuitively sees a certain quantitative equivalence between these two, with some subtleties in this.
A clinician thinks of the ‘screening’ in terms of successive rounds of pursuing the cancer’s early diagnosis, the algorithm of which, in each of the rounds, begins with the ‘screening’ test; and (s)he thinks of the intended consequence of this pursuit in reference to a single round of it – the ‘baseline’ round or one of the repeat rounds. In a given round of the pursuit, a genuine, life-threatening case of the cancer may get to be (rule-in) diagnosed in consequence of it, and the case may be curable by early treatment while (ultimately) fatal otherwise. The intention in any given round of diagnostic pursuit is to take advantage of this possibility to prevent the cancer’s fatal outcome, should it get to be diagnosed consequent to this pursuit.
And the quantitative aspect of all of this is the following: If some X% of the genuine (not overdiagnosed) cases of the cancer diagnosed under the ‘screening’ (as a result of the diagnostic process prompted by a positive result of the initial testing or by symptoms emerging between the rounds) are curable specifically on account of early diagnoses replacing late ones (in most of these cases), then early treatments of the early-diagnosed cases reduce the cancer’s case-fatality rate by that same X% relative to their treatments in the absence of ‘screening.’ Thus, if the doctor’s client submits to the ‘screening’ and the cancer is diagnosed under the ‘screening’ (and promptly treated), her risk of dying from this case of the cancer will be reduced by the same X%.
For the assessment of the magnitude of that proportional reduction in the cancer’s case-fatality rate resulting from its early treatments (provided for by diagnoses under the ‘screening’), one might first contemplate the theoretically ideal trial: Suitably-informed volunteers from the domain of the study (of freedom from the cancer’s clinical manifestations, etc.) would be enrolled to application of the diagnostic protocol, and those diagnosed with the cancer (preclinically) – with a genuine, not overdiagnosed case of it – would be randomly assigned to the defined early (undelayed) treatment or to its late (overt-stage) alternative. The members of these two cohorts would be followed long enough to determine whether the cancer was cured. (If the cure rates are R1 and R0, respectively, then the result for the proportional reduction in the case-fatality rate is 1–[1–R1]/[1–R0].) But the actual conduct of such a trial would be marred by (unavoidable) overdiagnoses, and the feasibility of its execution (with whatever flaws in the assurance of validity) would be negated by ethical considerations if not by unavailability of volunteers.
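The arithmetic of that ideal-trial measure can be sketched with hypothetical cure rates (a minimal illustration only, not data from any actual trial):

```python
def case_fatality_reduction(r1: float, r0: float) -> float:
    """Proportional reduction in the cancer's case-fatality rate,
    1 - (1 - R1)/(1 - R0), where R1 and R0 are the cure rates under
    early and late treatment, respectively."""
    return 1 - (1 - r1) / (1 - r0)

# Hypothetical cure rates: 80% under early treatment, 60% under late
reduction = case_fatality_reduction(0.8, 0.6)  # 1 - 0.2/0.4, i.e. about 50%
```

Note that the measure depends on the fatal complements of the cure rates, not on the cure rates themselves; equal cure rates give a reduction of zero.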
An ethically feasible trial would involve suitably-informed volunteers’ allocation (randomly) to a few rounds of the ‘screening’ – to the defined pursuit of early diagnosis and the defined treatment of the thus diagnosed cases – or alternatively to no ‘screening’ (and its associated, defined diagnosing and treatment of overt cases of the cancer), with further particulars in this so as to provide for actually obtaining a valid, and also reasonably precise, measure of the proportional reduction in the (genuine) cancer’s case-fatality rate attributable to the early treatments, and of the rate of overdiagnosis too. But implementation of the needed design – the necessary particulars in this (Miettinen 2014a) are not relevant here – would be very demanding if not downright impossible.
So, as a practical matter, the proportional reduction in the cancer’s case-fatality rate by its early diagnoses and treatments is not subject to proper quantification by clinical research involving only a few rounds of the ‘screening.’ However, a theoretically simple design involving long-term ‘screening’ has been implied (Miettinen et alii 2002): identification of eligible subjects from the designed domain; randomization of informed volunteers among these either to very long-term pursuit of early diagnosis and its associated early treatment (both protocol-defined) or to complete refraining from ‘screening’ for the cancer throughout the same period (with protocol definition of the respective diagnostic pursuits and treatments); and focus on the incidence of death from the cancer in a suitably defined period of follow-up of the two cohorts. But valid execution of such a design would, again, be challenging to the point of being practically prohibitive.
The concept of rate of mortality from a cancer in community medicine is profoundly different from the cancer’s case-fatality rate, which is of such great importance in clinical medicine: it is incidence density of death from the cancer. This is a dimensioned quantity (involving inverse time) and it has to do with a cared-for population which has turnover of membership and is in this sense dynamic (by being open to exit), in contrast to the dimensionless (purely numerical) case-fatality rate, having to do with a cohort-type (closed for exit) population (of study subjects). As for breast cancer, the former rate for women 60–69 years of age is of the order of 50 per 100,000 person-years and generally a strong function of age, while the latter rate is of the order of 30% at any age.
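The dimensional contrast between the two rates can be made concrete with a trivial computation (the counts here are illustrative, chosen only to match the orders of magnitude just cited):

```python
# Community-medicine rate: incidence density of death from the cancer,
# a dimensioned quantity (deaths per unit of population-time)
deaths = 25
person_years = 50_000
per_100k_py = (deaths / person_years) * 100_000  # 50 per 100,000 person-years

# Clinical-medicine rate: case-fatality rate, a dimensionless proportion
fatal_cases, all_cases = 30, 100
case_fatality = fatal_cases / all_cases  # 0.30, i.e. 30%
```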
Another major difference is related to this: Thinking about that (density-type) mortality rate prevailing in a given stratum of the community population at a given time in reference to the cancer’s early diagnosis and treatment has to do with the people’s histories about these matters of clinical care, about the pattern of these histories as an explanation of the level of that mortality in that (sub)population at that time. This contrasts with the clinical thinking about an individual’s potential submission, at present, to the pursuit of the cancer’s potential early diagnosis and treatment and about the bearing of this – probabilistic – on the person’s risk for future occurrence of death from an already existing latent but detectable case of the cancer, with change in the cancer’s case-fatality rate central to this thinking (cf. above).
Despite those major differences, there is a quantitative interrelation between the ‘screening’-related mortality concerns in community medicine and clinical medicine, respectively: If the ‘screening’ (in the meaning of a particular way of pursuing early diagnosis and effecting early treatment) provides for some X% reduction in the cancer’s case-fatality rate (among genuine cases of the cancer diagnosed under the ‘screening’), then the incidence density of death from the cancer prevailing at the time in question has been reduced by that same X% in that segment of the community’s population in which the ‘screening’ histories are survival-optimal in the sense of being maximally preventive of death from the cancer occurring at that time. This is the segment in which a possible case of the cancer that was going to be fatal at this time was going to be diagnosed under the ‘screening’ and treated early (given that the diagnostics actually were prompted by the ‘screening’ rather than by interim symptoms).
Thus, if the prevalence of that optimal history (regarding prior ‘screening’) in a given stratum of the cared-for population is P, then in this stratum of the population the prevailing proportional reduction in mortality from the cancer (incidence density of death from the cancer), attributable to the early treatments that were afforded by the early diagnoses, is PX%. And if a new community-level program of the ‘screening’ will, after many years of its existence, have led to a new stable state in which the earlier P in that stratum has risen to its new counterpart P′, then the proportional reduction in mortality, PRM, from the cancer in that stratum, attributable to the new program of ‘screening,’ will be

PRM = 1 – [1 – P′Q] / [1 – PQ],

where Q=X/100, the proportional reduction in the cancer’s case-fatality rate on a scale from 0 to 1.
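A numeric sketch of this interrelation may help. The formula used here, PRM = 1 – [1 – P′Q]/[1 – PQ], is the one consistent with the premise, invoked later in the text, that PRM reduces to Q when P = 0 and P′ = 1; the numeric values are hypothetical:

```python
def prm(p_old: float, p_new: float, q: float) -> float:
    """Proportional reduction in mortality (PRM) from the cancer in a
    population stratum when the prevalence of the survival-optimal
    'screening' history rises from p_old (P) to p_new (P'), with q (Q)
    the proportional reduction in the cancer's case-fatality rate."""
    return 1 - (1 - p_new * q) / (1 - p_old * q)

# Sanity check: with P = 0 and P' = 1, PRM reduces to Q itself
assert abs(prm(0.0, 1.0, 0.5) - 0.5) < 1e-12

# Hypothetical values: Q = 0.5, P rising from 0.2 to 0.8
print(round(prm(0.2, 0.8, 0.5), 3))  # 0.333
```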
So, the only mortality parameter (of Nature) relevant to policy decisions about population-level programs of screening for the cancer in question is that clinical parameter Q, which is not subject to quantification by clinical research (cf. above). But if epidemiological research – on etiology/etiogenesis of death from the cancer – addresses the causal incidence-density ratio, IDR, contrasting the index history of ideal diagnostics and treatment (above) with the reference history of no screening, these in a defined domain, then the parametric interrelation is, simply,

Q = 1 – IDR

(those P and P′ being ad-hoc rates, not parameters of Nature).
When the concern is to provide policy-makers with the mortality input relevant to decisions on ‘screening’ for a cancer – as in Switzerland just recently (Biller-Andorno and Jüni 2014) – needed is a review of the studies that have addressed that causal IDR. For these studies, if valid, allow estimation of that elusive clinical parameter Q, which in turn allows estimation of that which is the policy-makers’ concern – the PRM above – under whatever ad-hoc premises about P and P′. And as for these ad-hoc inputs, a reasonable treatment of them is to bypass this topic and to focus, simply, on that segment of the population stratum at issue in which the histories are optimal (in the sense specified above).
Study of the magnitude of that causal IDR, in whatever domain of people in a community, is a special case of etiologic/etiogenetic research, which generally is the source of the knowledge-base of community-level preventive medicine (of epidemiology, i.e.). The essentials of any such study are (Karp and Miettinen 2014): study base (of population-time), defined within the adopted source base; identification (complete) of the cases occurring in the source base and reduction of this series to the cases that occurred in the actual study base, and documentation of the relevant facts on this reduced series of cases; drawing a fair sample of instances (person-moments) from the source base (constituted by an infinite number of these), with reduction and documentation of this series analogously to the counterpart of this for the case series; and synthesizing the data on these two series into the study result on the causal (confounder-conditional) IDR in the study base.
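The final, data-synthesis step of such a study can be sketched numerically. All counts below are hypothetical, and the confounder-conditioning that a real study requires is omitted for brevity; the relation Q = 1 – IDR is the parametric interrelation stated in the text:

```python
# Case series (deaths from the cancer in the study base), classified by
# history: index = survival-optimal 'screening' history, reference = none
cases_index, cases_reference = 30, 120

# Fair sample of person-moments from the study base, classified likewise
base_index, base_reference = 400, 600

# Under such density (case-base) sampling, the ratio of the cases'
# exposure odds to the base sample's exposure odds estimates the IDR
idr = (cases_index / cases_reference) / (base_index / base_reference)

# The clinical parameter Q then follows from Q = 1 - IDR
q_estimate = 1 - idr
print(round(idr, 3), round(q_estimate, 3))  # 0.375 0.625
```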
The policy-relevant studies along these lines involve some rather subtle particulars, but these need not be addressed here. The need here is to emphasize the main point, namely this: Meaningful studies on mortality for policies about ‘screening’ for a cancer are very different from the now-orthodox trials for these purposes. And I add that the results of these still-heterodox studies – specifically in reference to population strata with survival-optimal histories of the screening (specified above) – can be expected to be very different from those of the now-orthodox trials, much more consistent with what true experts intuitively surmise.
The attempts thus far have been to estimate the epidemiological PRM (above) – on the (unrealistic) premises that P=0 and P′=1 (in which case PRM=Q) for the target population at large – on the basis of the very flawed trials, with the results for mammographic ‘screening’ quite disappointing (Biller-Andorno and Jüni 2014; Independent U.K. Panel on Breast Cancer Screening 2013). But the result for this PRM=Q from the only one of those trials with long-enough duration of the ‘screening’ (Andersson et alii 1988), derived from the relevant segment of follow-up time, is about 50% (Miettinen et alii 2002), even when substantially downward-biased by incomplete adherence to the assigned ‘screening,’ by lack of refraining from ‘screening’ in the control cohort, and by the now-outdated early care for the cancer. The nonexperimental etiogenetic study outlined above is not subject to unpreventable bias from the first two of those sources, as it can be focused on actual use of defined types of early diagnostics and treatments in the index histories and absence of even any semblance of these in the reference histories.
Regarding public policies about the promotion of early diagnoses and treatments of breast cancer – the modern varieties of these – the value of the causal PRM, on the usual premises of P=0 and P′=1, could well be as high as, say, 70%, in contrast to “The relative risk reduction of approximately 20% in breast cancer mortality” cited by the Swiss panel (Biller-Andorno and Jüni 2014) from the recent report of a U.K. panel (Independent U.K. Panel on Breast Cancer Screening 2013).
While research for the knowledge-base of policies on ‘screening’ for a cancer has been very disappointing to me (Miettinen 2008), the foregoing gives an indication of the principal reason for the misguidedness of this research, derivative as well as original. We epidemiological researchers and, especially, theoreticians thereof have mostly been missing in action on the ‘screening’ front of the ‘war on cancer,’ inexplicably and unjustifiably. We, therefore, are largely responsible for misguided research continually misinforming public policies on ‘screening’ for cancers; and we thus are, also, largely responsible for the casualties that have unnecessarily been sustained on this front of the ‘war on cancer.’ Untold numbers of people whose cancers could have been detected early and cured by early treatment have died, and still die, from them because of the misguidedness of the research we’ve chosen to stay out of and to implicitly condone. As epidemiologic academics, our pursuits are to be guided not by personal interests (scientific) but by our common obligation (professional) to advance the knowledge-base of community medicine – its practice and policy-making for it – with a keen sense of priorities in this.
There is, also, another important lesson for us in all of this. Those trials on ‘screening’ for breast cancer (and their sequels for other cancers) are exceptionally instructive about something I’ve been saying for a long time, with only modest success: In statistical science for the knowledge-base of medicine, whether clinical or epidemiological, any study’s methods design is to be predicated on and governed by its principled objects design. In those trials, of the clinical type as they’ve been, the generic nature of the object of study should have been understood to be the clinical one: the risk/probability of death from the cancer as a function (causal) of its treatment (early vs. late) in a defined domain (preclinical) of applying the compared diagnostic-therapeutic regimens. Thoughtful attempts at the design of the particulars of the form of this function presumably would have prevented such a seriously misguided conception of the requisite study design as has routinely marred these trials – and might well have led to the realization that the trial’s methods design is a very challenging task, notably in reference to short-term screening in the context of potential for appreciable tendency for overdiagnosis (Miettinen 2014a).
Beyond these core lessons per se, but having great bearing on learning them, I repeat also something I’ve been saying without any success at all: For there to be progress in epidemiology and in population-level (rather than laboratory-based, ‘basic’) epidemiological and clinical research, in the most stagnant aspects of these in particular, needed is public discourse on the issues (Miettinen 2008, 2014b). But I now am delighted to know that the Editors of this journal actually will arrange for a discussion of the fundamentals of research toward the scientific knowledge-base of public policies on ‘screening’ for a cancer, as a follow-up to this screed of mine.
The initial focus in this, I suggest, would best be the question of what is the parameter of Nature whose estimation really is relevant to decisions about public policies promoting a cancer’s early diagnosis and treatment; and in particular, is it, as I argue, the reduction in the cancer’s case-fatality rate resulting from its early clinical care replacing the late counterpart of this? And a related early focus in this discourse needs to be, I suggest, on the question of whether the relevant parameter has been, as I argue, seriously misrepresented by the results of such studies as now are viewed as the only source of policy-relevant information on ‘screening’ for a cancer.
It would be, I suggest, most instructive to have these questions initially addressed by some relevant Swiss expert(s) – with international commentaries on all of this to follow. If, as I hope, this public discourse converges to rejection of the prevailing normal-science paradigm on cancer-‘screening’ research and on the adoption of the ‘paradigm shift’ I here advocate, then a called-for sequel to the present discourse would be one on the particulars of the proposed ‘new paradigm.’
Andersson, I., Aspegren, K., Janzon, L., Landberg, T., Lindholm, K., Linell, F., Ljungberg, O., Ranstam, J., and Sigfusson, B. (1988). Mammographic screening and mortality from breast cancer: The Malmö mammographic screening trial. British Medical Journal 297:943–948.
Biller-Andorno, N., and Jüni, P. (2014). Abolishing mammography screening programs? A view from the Swiss Medical Board. The New England Journal of Medicine 370:1965–1967.
Independent U.K. Panel on Breast Cancer Screening (2013). The benefits and harms of breast cancer screening: An independent review. Lancet 380:1778–1786.
Karp, I., and Miettinen, O. S. (2014). On the essentials of etiological research for preventive medicine. European Journal of Epidemiology 29:455–457.
Miettinen, O. S. (2008). Screening for a cancer: A sad chapter in epidemiology. European Journal of Epidemiology 23:647–653.
Miettinen, O. S. (2014a). Toward Scientific Medicine. New York: Springer, 154–158.
Miettinen, O. S. (2014b). Screening for breast cancer: What truly is the benefit? Canadian Journal of Public Health 104:e435–e436 (Editorial).
Miettinen, O. S., Henschke, C. I., Pasmantier, M. W., Smith, J. P., Libby, D. M., and Yankelevitz, D. F. (2002). Mammographic screening: No reliable supporting evidence? Lancet 359:404–405.
Mukherjee, S. (2012). The Emperor of All Maladies: A Biography of Cancer. New York: Scribner, 294–302.
©2015 by De Gruyter