Department of Philosophy
Johns Hopkins University
Mill's Sins
In her chapter on Peirce, Deborah Mayo argues that Peirce is a saint and Mill a sinner with respect to the Commandments of the Error-Statistical Program. Mill's sins are these: (a) his characterization of induction completely omits the idea of severe testing; (b) his inductions conform to the puerile "straight rule"; (c) he supposes that an induction is an argument in which the conclusion is assigned a high probability; (d) he assumes that all inductions presuppose a principle of uniformity of nature, a presupposition that is vague, unwarranted, and unnecessary. In my paper I will attempt to offer at least enough of a defense of Mill to free him from the ranks of the sinners and to offer some challenge to the aforementioned Commandments.
Philosophy Department
The Flinders University of South Australia
Can Scientific Theories be Warranted?
Sir David Cox
Dept of Statistics, University of Oxford
Some Remarks on the Nature of Scientific Inference (coauthored with Deborah Mayo)
After a brief discussion of the role and importance of objectivity in science and statistical analysis, a short outline is given of a frequentist basis for statistical interpretation of data. The central role of a probability model of the data-generating process is stressed and the contribution of various kinds of conditioning explained. Some of the complications that can arise in applications are mentioned and a brief critique of alternative Bayesian approaches is set out.
Recommended Background Readings:
Cox, D. (1958) "Some problems connected with statistical inference"
Mayo, D. & Cox, D. "Frequentist Statistics as a Theory of Inductive Inference"
Dr. Clark Glymour
Dept of Philosophy, Carnegie Mellon University
Senior Research Scientist at the Institute for Human and Machine Cognition (West Florida)
Bayesian Ptolemaic Psychology
There is a meta-method with a long tradition that saves the phenomena by fitting a potential infinity of parameters by some procedure that is guaranteed to fit any data of a very general kind. Ptolemy's Almagest is the most ancient known example. The contrast is with Kepler's version of Copernican theory, in which many empirical patterns and regularities are closely constrained by the theoretical framework. The Ptolemaic approach has been popular in cognitive psychology in the form of universal programming systems with a psychological gloss. The Ptolemaic approach has recently taken another form in psychology, as "rational" Bayesian modeling of human judgement, for example of causal relations. Considering four recent psychological papers, I argue that Bayesian models for these experiments are computationally implausible, and I suggest that, quite generally, low complexity dynamic learning models are more plausible.
Dr. Henry Kyburg
Depts of Philosophy, Computer Science, University of Rochester
Recommended Background Readings:
Dr. Larry Laudan
The Defendant's Burden: the Onus Probandi and the Anomaly of Affirmative Defenses
Error minimization concerns will doubtless form the focus of many of the contributions to this conference. But the distribution of such errors as do occur is just as important as their avoidance. The criminal law offers fertile terrain for anyone interested in studying the mechanisms for distributing error, since a host of familiar doctrines in the law (e.g., reasonable doubt, the presumption of innocence, and the benefit of the doubt) are directed specifically at ensuring that false positives will be rarer than false negatives.
Conventional wisdom among legal scholars has it that the standard of proof plays the principal role in skewing such errors as do occur in the direction of false acquittals rather than false convictions. Without disputing that claim, I will suggest that another important determinant of the distribution of errors in a criminal trial is the location of the burden of proof. Contrary to folk mythology about the law, that burden is often made to fall on the defendant. When it does, the whole pattern of expected error distributions shifts dramatically and in ways that raise doubts about the commitment of the judicial system to the old saw that "it is better that 10 guilty men go free than that one innocent defendant is condemned." More generally, I will claim that epistemologists have largely ignored the subtle ways, both within the law and outside it, in which the location of the burden of proof on one party rather than another in an argument can impact not only the distribution but the likelihood of error.
Recommended Background Readings:
Dr. Deborah G. Mayo
Philosophy Department
Virginia Polytechnic University
Day 1
Severe Testing, Error Statistics, and the Growth of Theoretical Knowledge
I take up a challenge posed by a number of philosophers of science (contributing to this conference) to show how low-level experimental knowledge is relevant for probing high-level theories. The recommendation that high-level theories be accepted if they are comparatively "best-tested" so far is shown to be at odds with the goal of severe testing. In an experimentally grounded account of theory appraisal, the key goal is to measure, not comparative support, but how far off a given theory may be from what a "correct" theory would need to say about a specific aspect of a phenomenon by setting severe bounds on possible violations. We illustrate with experimental general relativity (GTR). Progress hinged on systematically considering why GTR has not passed a severe test as a whole, by identifying ways that discrepancies in key gravitational parameters could have failed to be detected with existing data.
Recommended Background Reading:
Day 2
Mayo, D. and Spanos, A. (2006) "Severe Testing as a Basic Concept in a Neyman-Pearson Philosophy of Induction"
Recommended Background Readings:
Mayo, D. (2005) "Evidence as Passing Severe Tests: Highly Probable versus Highly Probed Hypotheses"
Mayo, D. & Kruse, M. "Principles of Inference and Their Consequences"
Day 3
Recommended Background Reading:
Dr. Alan Musgrave
Dept of Philosophy, University of Otago, New Zealand
alan.musgrave@stonebow.otago.ac.nz
Critical Rationalism, Explanation, and Severe Tests
My paper has three parts. First, I explain the version of critical rationalism that I defend. Second, I discuss explanation, and defend critical rationalist versions of inference to the best explanation and its meta-instance, the Miracle Argument for Realism. Third, I ask whether critical rationalism is compatible with Deborah Mayo's account of severe testing. I answer that it is, contrary to Mayo's own view. I argue, further, that Mayo needs to become a critical rationalist, as do Chalmers and Laudan.
Dr. Aris Spanos
Dept of Economics, Virginia Tech
Statistical Induction, Severe Testing and Model Validation
Paper (updated 6/11/06)
A number of important methodological issues in statistical modeling and inference depend crucially on the notion of statistical induction adopted. An attempt is made to articulate the notion of statistical induction underlying modern frequentist inference going back to Fisher (1922). The paper brings out the differences in the nature of inductive reasoning underlying estimation, testing and prediction, emphasizing the distinction between factual and counterfactual reasoning.
Particular emphasis is placed on the role of type I and II error probabilities, pre-data, as measures of the ‘trustworthiness’ and optimality of test procedures. Post-data, error probabilities can be used to render the traditional coarse accept/reject decision more informative by evaluating the severity with which a hypothesis or a claim passes a particular test, with data x. The severity assessment provides a data-specific inferential interpretation of the accept/reject decision and can be used to address several methodological problems raised in the context of Neyman-Pearson testing. Particular emphasis is placed on the nature of the severity assessment and the associated post-data error probabilities, as they relate to the pre-data error probabilities.
The evaluation of error probabilities (pre-data or post-data) assumes the validity of the statistical premises, because any departures will render the inductive inference unreliable to a greater or lesser extent. The paper discusses the importance of ensuring statistical adequacy using thorough misspecification testing and respecification. It also demonstrates how statistical adequacy can be used to shed light on a number of methodological problems such as model validation vs. model selection and statistical vs. substantive adequacy.
Recommended Background Readings:
Spanos, A. (2000) "Revisiting data mining: 'hunting' with or without a license"
Spanos, A. "Revisiting the Omitted Variables Argument: Substantive vs. Statistical Adequacy"
Dr. John Worrall
London School of Economics
Error, Tests and Theory-Confirmation in Science
In earlier work (notably my (2002)), I defended the ‘heuristic’ account of confirmation against a number of (strongly supported) criticisms. I argued that these criticisms force the realisation that the heuristic account must distinguish two quite separate kinds of theory-confirmation by evidence:
(1) some evidence gives very strong (perhaps conclusive) confirmation of a certain specific theory, if a general theoretical framework is presupposed as given, but that evidence gives no confirmation for that general theoretical framework itself – this type of confirmation is ineliminably conditional on a given general framework (or research programme);
(2) some evidence, although - in line with Duhem’s analysis - only deductively entailed by a specific theory built within a general framework, nonetheless confirms not only that specific theory but also the general framework.
Once this distinction is made, then all of the intuitive judgments about confirmation in particular cases in science – both those that had been used to defend the heuristic account and those that had been used to criticise it – are captured by the account.
Deborah Mayo’s account of confirmation in her (1996) also claims to capture all these same intuitions about particular cases of confirmation. In this paper, I systematically compare Mayo’s account with my own. I argue that although it might initially seem that my account and hers are simply two different ways of saying much the same thing, in fact her account is importantly skewed. The distinction pointed to earlier is real and highlights two quite separate ways in which evidence is used in science. Mayo’s challenging attempt to produce a ‘one size fits all’ account that sees all confirmation as the failed result of attempts to identify error is itself an error.
References
Mayo, D. (1996), Error and the Growth of Experimental Knowledge, Chicago: The University of Chicago Press.
Worrall, J. (2002), ‘New Evidence for Old’ in P. Gardenfors et al. (eds.) In the Scope of Logic, Methodology and Philosophy of Science, Amsterdam: Kluwer.
Recommended Background Reading: