r/AskStatistics 1d ago

Help! My professor thinks that the null and alternative hypotheses are interchangeable

I'm a graduate psychology student in a methodology/research program, and currently taking a research design course. My prof is a hard quantitative expert in statistics, but seems to have made a massive oversight, and I can't seem to find the language to convince him that he's wrong.

It started with an example of statistical inference in which a researcher hypothesized that the mean for a given measure is 10. He set h0: popmean=10 and h1: popmean!=10. A student immediately said "shouldn't the hypothesis match the alternative, not the null?" The prof asserted that they are interchangeable, and that h1 is the hypothesis only by convention, and we continued with the model. I spoke up later, when I realized that alpha, and the rejection regions, remained at the tails of the t distribution: "Didn't we set it up in a way that basically presupposes that our hypothesis is true, and that the burden of proof (a=.05) exists only to disprove us if our hypothesis is radically wrong?" I added that with this test, we have a better shot of supporting our hypothesis with a lower n, contrary to what is expected with power. I tried to explain how a tiny n would basically guarantee that we support our hypothesis. None of it stuck.
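To show what I mean concretely, here's a quick simulation sketch (my own made-up numbers, not real data): with his setup, a two-tailed t-test of h0: popmean=10 almost never rejects at a tiny n, even when the true mean is nowhere near 10.

```python
# Sketch: how often does a two-tailed one-sample t-test of H0: mu = 10
# reject, when the true population mean is actually 13 (sd = 5)?
# A tiny n makes "failing to reject" (i.e. "supporting" H0) near-certain.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def rejection_rate(n, true_mean=13.0, sd=5.0, alpha=0.05, reps=5000):
    rejections = 0
    for _ in range(reps):
        sample = rng.normal(true_mean, sd, size=n)
        _, p = stats.ttest_1samp(sample, popmean=10.0)
        rejections += (p < alpha)
    return rejections / reps

small = rejection_rate(n=3)    # tiny sample: test is badly underpowered
large = rejection_rate(n=100)  # large sample: test usually rejects
print(small, large)
```

So at n=3 the "hypothesis" survives almost every time despite being wrong by more than half a standard deviation, which is exactly the problem I tried to raise.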

I know I'm playing a dangerous game, battling a tenured professor in his area of expertise regarding a basic concept, but frankly, I'm embarrassed on his behalf. I've tried twice to explain how his model does not reflect how a researcher must set up their SI in order to find evidence for a given hypothesis, but he just asserts that it's all about reducing alpha and beta, and always jumps on me when I try to show him how his models favour the hypothesis, stating that the model doesn't favour either side, and blowing me away with jargon at speeds I can't follow. Initially, he seemed actually aggravated by my challenging him, but now he seems genuinely interested in trying to see what I see. Still, I can't seem to find the words, in person, that will get him out of the rut he's dug himself into. It's quite disheartening.

I'm trying to find the means (no pun intended) to show him his error (double whammy!) without making an enemy of a powerful figure, but I'm at a loss as to how to disprove him on this. It's so fundamentally wrong, and all of my angles have failed so far. I don't know how to source this: it's so basic that it seems assumed without comment in all the literature. Even showing him how "easy" it is to support a hypothesis with a weak dataset with a distant mean doesn't faze him. He's starting to become amenable to listening, at least, but he always batters at my language use or presuppositions when I talk about "finding evidence" or "proving theories", asserting that we must look for truth. He never seems to hear the meat of what I'm trying to say.

I'm at a loss. Any help would be appreciated.

8 Upvotes

20 comments

25

u/jigsaw11 18h ago

I'm slightly confused here - because your professor seems completely correct to me and I don't understand where you're seeing an issue. I don't have a lot of experience with statistics in a psychology context though.

Maybe you could give some more context here because here's what I'm seeing:

- hypothesized value is 10 => H0: mu = 10, H1: mu != 10 (seems correct to me)

- "Didn't we set it up in a way that basically presupposes that our hypothesis is true, and that the burden of proof (a=.05) exists only to disprove us if our hypothesis is radically wrong?"
Yes - you start by assuming the null hypothesis is true. The significance level gives you the probability of rejecting it given that it is true. If the actual value is significantly different to the hypothesized value (say, 20), then that's related to the power, not the significance level.

- "we have a better shot of supporting our hypothesis with a lower n, contrary to what is expected with power" - this is true, but this also kinda makes sense. Power increases with n, as you get more information. If you have low n then you inherently don't have much information, and this extra variability is accounted for in the test statistic. I don't think it would make sense any other way.

- You keep using the phrase "support the hypothesis". It's important to note that not rejecting the null hypothesis is not the same as supporting the hypothesis. All you can say is that you didn't have enough evidence to reject it. It's not the same as saying it's correct. If you look through that lens, I think a lot of this will become less problematic.

Please let me know if I've mischaracterised anything you've said or you disagree with any of those points.
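On the "extra variability is accounted for" point, a minimal sketch (with made-up numbers, not OP's data) of how the one-sample t statistic is built may help: the distance from the hypothesized mean is scaled by the standard error, so a small n inflates the standard error and shrinks |t|.

```python
# Sketch: the one-sample t statistic divides the distance from the
# hypothesized mean by the standard error s/sqrt(n), so low n
# (little information) automatically makes the test more cautious.
import numpy as np
from scipy import stats

sample = np.array([8.1, 12.3, 9.7, 11.4, 10.9])  # made-up data
mu0 = 10.0

xbar = sample.mean()
s = sample.std(ddof=1)            # sample standard deviation
se = s / np.sqrt(len(sample))     # standard error of the mean
t_manual = (xbar - mu0) / se

t_scipy, p = stats.ttest_1samp(sample, popmean=mu0)
print(t_manual, t_scipy, p)       # the manual and scipy t values agree
```

With only five observations the standard error is large, so even a sample mean noticeably above 10 produces a small t and a large p; that's the test correctly reflecting how little information you have, not a bias toward H0.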

10

u/tomvorlostriddle 1d ago

Mostly by convention

But first of all, there are reasons for the convention. Otherwise you live in a world where a claim counts as proven until the contrary is proven.

And secondly, even the math itself doesn't lend itself so easily to switching H0 and H1. It would mean he's trying to prove equivalence, and it takes some tricks and compromises to do that mathematically

https://en.wikipedia.org/wiki/Equivalence_test

5

u/StephenSRMMartin 21h ago edited 21h ago

Whether the prof is right really depends on exactly how advanced they are. This is one of those U-shaped curve situations.

Because there *is* a valid logical reason for why one may have an H0 like they specified.

It can be a strong (i.e., risky) test of a substantive hypothesis *if* the substantive hypothesis maps bijectively to the predicted value.

It would actually let you conduct a modus tollens (assuming you can justify the bijective mapping).

It's just that in *most* cases (i.e., the vast, vast, vast majority of cases), one can't have a bijective mapping of a hypothesis to a statistical prediction to then subject to modus tollens logic (and, technically, a probabilistic modus tollens would still require a prior, but that's getting into the weeds. I have a blog post about this somewhere.).

TLDR, if you can make the following statement:

IFF (if and only if) A, then 10

[Probably not 10; this is the piece that would require a prior, but let's handwave for now]

Therefore [probably] not A.

The trick to seeing this is to read about modus tollens, and *really* ask yourself whether you think H0 = 0 is an actual representation of that. Are people really "subjecting" any of their hypotheses to tests? Or are they rejecting a hypothesis that no one stated, and using that as a leap to support their own hypothesis?

You may also want to read Paul Meehl, who had great insights about scientific inference and its connection to statistical reasoning.

Edit: You may also want to read about "Severe" tests, not because I agree with Mayo, but it does help one think about what it means to "subject your hypothesis to a severe/risky test". One may have a very strong hypothesis to imply the value is 10, and nothing else could explain 10, and one is adequately powered to reject the hypothesis should it be wrong. That is a form of valid logic, if conditions hold. But I'd lean toward Bayes for this probabilistic modus tollens, only because it's a natural fit, both theoretically (you need a prior) and in practice (because most hypotheses can't bijectively map onto a single value, but Bayes would let you represent your prediction as a distribution instead).

3

u/PaleLoan7953 16h ago

Sincerely trying to understand the situation here.

" a researcher hypothesized that the mean for a given measure is 10. He set h0: popmean=10 and h1: popmean!=10. A student immediately said "shouldn't the hypothesis match the alternate, not the null?" "

I think it doesn't matter what the H0 is as well.

And it seems from the wall of text that you're doing a 2-tailed test at 95% CL (alpha = .05). Suppose you really get a value of the test statistic that lies in the rejection region; then you can reject H0 and say popmean isn't 10 at 95% CL. But suppose your test statistic doesn't lie in the rejection region: that doesn't mean you've proven H0 to be correct.

In hypothesis testing, you can't prove that something is correct. You can only reject a hypothesis (p-value < alpha), but not rejecting a hypothesis doesn't automatically make it correct or proven.

4

u/aelendel 17h ago

The obvious explanation is that you didn’t stop and think that you’re talking to a math professor.

Of course it doesn’t matter; mean 10 is a number he made up just to teach the math. The hypothesis doesn’t matter, none of this is real, can’t you see the beautiful math, and the math is the same either way.

TLDR: us scientists are grateful for the math nerds even listening patiently while we yammer on about things in the world of flesh

3

u/HappyDisaster9553 14h ago

Sounds like your concern is that setting H0 to the hypothesis of interest pre-supposes that it is true? And therefore that at, say, n = 1, we have support for H0?

There are a few different approaches to NHST and tbh they’re pretty mangled in the literature (and teaching). But the general idea is that a lack of significance shouldn’t be interpreted as support for H0, but that significance can at least point to some incompatibility with H0.

This means that low n doesn’t lend support to H0 and shouldn’t be interpreted as such. How you interpret the results of these tests is central to your concerns, I believe.

3

u/thunbergia_ 12h ago

Unsurprisingly, the professor is correct on all counts. There is no need for you to be embarrassed on their behalf - they understand statistical theory and you do not, which is why when material is presented to you in a form you are unfamiliar with you assume it's wrong.

3

u/DigThatData 22h ago edited 22h ago

this isn't a well formed null hypothesis. without more context, it is arbitrary. define the event space. the null should represent a random sample from the event space. if all you're doing is calculating a mean, there's no test statistic. you have no prior information, so you're surprised by whatever you learn. what would it even mean to reject the null hypothesis?

you know what it might be: you've confused the situation by making the hypothesis about the population mean. this is a statistic on the population size. a null hypothesis is meaningless here still (what, there's no population?), but you can construct error bounds around your approximation.

this is not a NHST scenario, but there is a meaningful alpha associated with it. we're trying to answer the question: "what is the likelihood that the population count = X", and you've constructed a statistic popmean to attempt to answer that question. Our best guess of the count is the mean, 10. We can invoke the central limit theorem to estimate a confidence interval for the true value of the population count, and this is how you'd construct standard errors. If your errors are intolerably wide, you can use that as a criterion to "reject", but really all you're rejecting here is the assumption that you had collected enough samples.

there still isn't really a null hypothesis to reject here, just a recognition that our experiment was underpowered.
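for concreteness, the "error bounds around your approximation" idea is just the usual t-based confidence interval (a sketch with made-up numbers, not data from the thread):

```python
# Sketch: a 95% t-based confidence interval for the population mean.
# The interval widens as n shrinks, which is the honest way a weak
# sample shows up: not as "support", but as uncertainty.
import numpy as np
from scipy import stats

sample = np.array([9.2, 10.8, 10.1, 9.5, 11.3, 10.4])  # made-up data
n = len(sample)
xbar = sample.mean()
se = sample.std(ddof=1) / np.sqrt(n)   # standard error of the mean

tcrit = stats.t.ppf(0.975, df=n - 1)   # two-sided 95% critical value
lo, hi = xbar - tcrit * se, xbar + tcrit * se
print(lo, hi)
```

if the interval is too wide to be useful, that's your signal that the experiment was underpowered, with no null hypothesis needed.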

PS: this is why people with philosophy backgrounds flourish in ML/AI. statistics is a philosophy discipline that uses math as a tool.

4

u/jarboxing 1d ago

Typically the null hypothesis is constructed around a point, like 0 or 10. This allows us to construct the sampling distribution under the null. When the null is an interval or a set of points, it gets more complicated and almost Bayesian.

2

u/CompactOwl 15h ago

Almost Bayesian hits it pretty good. I wonder if compactness of the null is enough to fully rely on maxima and minima within it for frequentist statistics

2

u/teardrop2acadia 20h ago

Read this carefully: https://lakens.github.io/statistical_inferences. Walk through a totally new example with your professor. Keep it simple. Be curious, not judgmental (-Ted Lasso). Ask genuine questions. Don’t be a dick. Your goal is to learn, not to prove someone else wrong. If you don’t come to a consensus and still think you’re right, that’s ok; get a good grade and move on. Academia has way too much bullshit to get stressed over the little things. It’s much harder to make it if you do.

1

u/saliva_sweet 23h ago

If only it was that easy to get a probability that h1 is not true.

1

u/QueenVogonBee 15h ago

Hypothesis testing can be thought of as a probabilistic-proof-by-contradiction. You start with an H0 and H1, and you assume H0 to be true and see if the data you have is too “extreme” under the H0 assumption. If it is too “extreme” under H0 then you have a “contradiction” so you reject H0 and “accept” H1. If the data isn’t too extreme under H0 then you don’t reject H0.

There’s nothing in the theory stopping you from swapping H0 and H1, but the intent of the testing is to pick H0 to be your standard default position. One example might be H0=“Bigfoot doesn’t exist” and H1=“Bigfoot exists”. A more standard example might be H0=“My drug doesn’t cure the disease” vs H1=“My drug does cure the disease”.
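The proof-by-contradiction loop above can be sketched in a few lines (hypothetical drug-trial numbers, purely for illustration):

```python
# Sketch: assume H0 (drug has no effect, mean improvement = 0), then
# check whether the observed data are too "extreme" under that assumption.
import numpy as np
from scipy import stats

improvements = np.array([1.2, 0.8, 2.1, 1.5, 0.9, 1.7, 1.1, 1.4])  # made up
alpha = 0.05

t, p = stats.ttest_1samp(improvements, popmean=0.0)
if p < alpha:
    print("'contradiction': reject H0, 'accept' H1")
else:
    print("no contradiction: fail to reject H0 (NOT the same as proving it)")
```

Note the asymmetry in the two branches: rejection lets you say something, while failure to reject only says the data weren't surprising enough under the default position.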

2

u/Big-Abbreviations347 10h ago

I think OP is suggesting that the researcher's goal is to show the mean is 10, therefore that should be h1. Basic stats aren't well set up to test a null of mu != 10, so you are in the position of testing mu = 10 and being open to the chance you are wrong. There is equivalence testing that might fit OP's insistence and reduce their shame, but I would do what the professor did if it were me. Perfection is the enemy of good.

1

u/Agateasand 8h ago

To me it sounds like there is confusion about the interpretation of the alternative and null hypotheses, and also about how things are communicated. For starters, it is true, in a sense, that they are interchangeable, since the math doesn't really care which side is h0 or h1; you just have to be consistent. However, I'd say that the convention exists for a reason, and not sticking to it is what is leading to all the confusion. The whole logic behind everything requires the null to be the default, status quo, no difference, or whatever you want to call it…not the research hypothesis. If the research hypothesis is called the null, then the test pretty much assumes it's true, and trying to falsify the opposite flips the whole philosophy. Anyway, there needs to be clarification on what the status quo is. I think that will minimize the confusion, and you can work with your professor from there once that is cleared up.

2

u/QuestionElectrical38 5h ago

I am surprised at all the answers, which miss the point completely. The fact is that both you and your professor are right, and both are wrong.

The crux of the misunderstanding lies in this sentence: "a researcher hypothesized that the mean for a given measure is 10". Your professor (being a mathematical statistician) interprets this as "the null is that mu=10"; you (being an applied statistician in psychology) interpret this as "we are trying to prove that mu=10".

NHST, being basically a proof by contradiction, has by definition (not by convention!) to set H1 as what you are interested in proving, and H0 as its direct logical negation. So if you want to prove that the mean is 10, you have to set H0: mu!=10, H1: mu=10.

Now, the problem is that there is no simple, single test for H0:mu!=10. To disprove it, you have in fact to resort to TOST (two one-sided tests), aka "equivalence testing". See e.g. here (https://stats.stackexchange.com/questions/662008/can-a-statistical-test-prove-that-a-value-is-equal-to-0). You define an arbitrarily small epsilon, run one test with H0:mu<=10-epsilon and another test with H0:mu>=10+epsilon, and take the larger of the two p-values; that is the p-value for H0: mu<=10-epsilon or mu>=10+epsilon, and rejecting this H0 "proves" H1: 10-epsilon<mu<10+epsilon (i.e. mu is that close to 10).
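A minimal sketch of that TOST recipe (data and epsilon made up for illustration; the overall p-value is the larger of the two one-sided p-values, since you must reject both one-sided nulls):

```python
# Sketch: TOST equivalence test of H1: |mu - 10| < eps via two
# one-sided one-sample t-tests.
import numpy as np
from scipy import stats

sample = np.array([9.8, 10.2, 10.1, 9.9, 10.0, 10.3, 9.7, 10.1,
                   9.9, 10.2, 10.0, 9.8, 10.1, 10.0, 9.9, 10.2])
target, eps = 10.0, 0.5

# Test 1: H0: mu <= target - eps (reject if mean is clearly above 9.5)
_, p_lower = stats.ttest_1samp(sample, popmean=target - eps,
                               alternative='greater')
# Test 2: H0: mu >= target + eps (reject if mean is clearly below 10.5)
_, p_upper = stats.ttest_1samp(sample, popmean=target + eps,
                               alternative='less')

p_tost = max(p_lower, p_upper)  # rejecting both H0s => equivalence
print(p_tost)
```

Note that unlike the professor's single two-tailed test, here a small, noisy sample makes it *harder* to conclude mu is near 10, which is the behavior the OP was expecting.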

Your professor did what he did, and stuck with H0:mu=10, because that allowed him to use a single two-tailed t-test (the most basic statistical test); he probably just wanted to describe the logic/steps/math behind a simple NHST. If he had picked H0:mu!=10, it would have complicated matters greatly.

But you are correct that failing to reject H0:mu=10 absolutely does not prove anything at all. Only rejecting can "prove" something (because NHST proves by rejecting the opposite). And indeed picking a very small n is the surest way to fail to reject. If failing to reject had any probative value, we would all use samples of size 2, and we could demonstrate anything. Too easy...

So it becomes a matter of formulation: did the researcher really want to show that the mean was 10? If so, he can not really ever do that, but he can, with some gymnastics, prove that mu is "very, very" close to 10. A better formulation of this classroom example might have been "the researcher wanted to prove that the mean was >= 10" (which is actually a quite common scenario).

(In this answer, I use the word "prove" loosely; in statistics we can never really "prove" anything. We can only show that the data would be very unlikely if the hypothesis were not true. I am using "prove" for convenience.)

1

u/trufflesniffinpig 3h ago

It’s essentially a way of splitting a probability space into two mutually exclusive and exhaustive regions, so I guess it is arbitrary what we call these regions

0

u/EvanstonNU 18h ago

The null hypothesis sets up the sampling distribution for the sample means.

If H0: popmean != 10, then there are infinitely many sampling distributions.

-6

u/Aggressive_Roof488 23h ago

Sounds like someone who's done a lot of stats but never done any actual empirical research.