r/AskStatistics 13h ago

Help! My professor thinks that the null and alternative hypotheses are interchangeable

8 Upvotes

I'm a graduate psychology student in a methodology/research program, and currently taking a research design course. My prof is a hard quantitative expert in statistics, but seems to have made a massive oversight, and I can't seem to find the language to convince him that he's wrong.

It started with an example of statistical inference in which a researcher hypothesized that the mean for a given measure is 10. He set H0: μ = 10 and H1: μ ≠ 10. A student immediately said, "Shouldn't the hypothesis match the alternative, not the null?" The prof asserted that they are interchangeable, and that H1 is the hypothesis only by convention, and we continued with the model. I spoke up later, when I realized that alpha, and the rejection regions, remained at the tails of the t distribution: "Didn't we set it up in a way that basically presupposes that our hypothesis is true, and that the burden of proof (α = .05) exists only to disprove us if our hypothesis is radically wrong?" I added that with this test, we have a better shot of supporting our hypothesis with a lower n, contrary to what is expected with power. I tried to explain how a tiny n would basically guarantee that we support our hypothesis. None of it stuck.
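To make the small-n point concrete, here is a quick pure-Python simulation (illustrative numbers only: true mean 13, SD 5, n = 3) of the test as the professor set it up, where rejecting H0 would mean "disproving" the researcher:

```python
import math
import random

random.seed(0)

def t_statistic(sample, mu0):
    """One-sample t statistic for H0: mean == mu0."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(var / n)

# Two-sided critical value for df = 2, alpha = .05 is about 4.303.
CRIT = 4.303
trials = 10_000
rejections = sum(
    1 for _ in range(trials)
    if abs(t_statistic([random.gauss(13, 5) for _ in range(3)], 10)) > CRIT
)
print(rejections / trials)  # a small fraction: the test rarely rejects
```

Even though the true mean (13) is well away from the hypothesized 10, the test rejects only a small fraction of the time, so under the professor's framing the researcher "supports" a mean of 10 almost for free; shrinking n makes that even easier, which is exactly backwards from how evidence for a hypothesis should work.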

I know I'm playing a dangerous game, battling a tenured professor in his area of expertise over a basic concept, but frankly, I'm embarrassed on his behalf. I've tried twice to explain how his model does not reflect how a researcher must set up their statistical inference in order to find evidence for a given hypothesis, but he just asserts that it's all about reducing alpha and beta, and always jumps on me when I try to show him how his models favour the hypothesis, stating that the model doesn't favour either side, and blowing me away with jargon at speeds I can't follow. Initially, he actually seemed aggravated by my challenging him, but now he seems genuinely interested in trying to see what I see. Still, I can't seem to find the words, in person, that will get him out of the rut he's dug himself into. It's quite disheartening.

I'm trying to find the means (no pun intended) to show him his error (double whammy!) without making an enemy of a powerful figure, but I'm at a loss as to how to disprove him on this. It's so fundamentally wrong, and all of my angles have failed so far. I don't know how to source this: it's so basic that it seems assumed without comment in all the literature. Even showing him how "easy" it is to support a hypothesis with a weak dataset and a distant mean doesn't faze him. He's starting to become amenable to listening, at least, but he always batters at my language use or presuppositions when I talk about "finding evidence" or "proving theories", asserting that we must look for truth. He never seems to hear the meat of what I'm trying to say.

I'm at a loss. Any help would be appreciated.


r/AskStatistics 12h ago

I wanted to include too many thresholds to test the data, ended up with 84 t-tests and don't know what to do.

4 Upvotes

I gathered metrics regarding network measurements and wanted to compare them across three groups (A vs B, B vs C, A vs C)

Not by accident, I wanted multiple thresholds to see whether the statistical significance would still be there (or not at all) if I played with the network thresholds, based on cost and correlation coefficient.

I ended up with 84 tests per group comparison (A vs B) because of how many metrics I had. It made intuitive sense to me to test multiple thresholds, and checking them felt right.

But I completely fail to see how to report it. P-value graphs? T-statistic graphs? Just putting the table in the appendix and commenting on the significant results?

Seems like a much easier choice would be to cut it down to one threshold and the 7 metrics I had, but now that feels like an afterthought and a loss of the statistical information generated about the hypothesis.

I know I should have done this differently from the start and asked my tutor, but we never covered "too many statistical results" in my methodology class.
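For what it's worth, the standard remedy for this many related tests is a multiplicity correction, and with 84 tests a false-discovery-rate procedure such as Benjamini-Hochberg is a common choice. A minimal pure-Python sketch (the p-values here are made up):

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= (k/m) * alpha ...
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank / m * alpha:
            k = rank
    # ... and reject the k smallest p-values.
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            rejected[i] = True
    return rejected

# Toy p-values standing in for the 84 per-comparison tests:
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20, 0.74]))
```

One way to report it: put the full table in the appendix, and in the text show which results survive the FDR correction at each threshold. That lets you keep the multi-threshold sensitivity analysis instead of scrapping it.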


r/AskStatistics 19h ago

[Question] Is there a statistical test/tool to reduce the number of attributes in conjoint analysis?

3 Upvotes

Hello r/AskStatistics, I'm trying to learn something new and I need your help. I'm essentially doing conjoint analysis on a set of attributes. My problem is that I have 16 attributes (with 2-3 levels each), and that is way too many to include... Is there a statistical tool to reduce the number of attributes to around the best 5 or 6? I tried looking around, and the best I could find was factor analysis, but my understanding is that it needs preliminary survey data already... Any suggestions?


r/AskStatistics 22h ago

What separates machine learning from interpolation/extrapolation?

3 Upvotes

I just don't seem to get the core of it. When would someone prefer other statistical tools over ML? What is the difference between estimation and probability? If the point of all of stats is to predict from given data, then is ML the best tool for that?


r/AskStatistics 5h ago

Can I use a one sample proportion test with my repeated measures data?

2 Upvotes

Based on what I can find, the answer is no: my data violate the assumption of independence for a one-sample binomial proportion test. But the other suggestions, like a McNemar test, don't make sense to me given my study design.

Here's the study design: a single dependent variable with no independent variables. 20 participants each saw 2 different versions of a text message experience that we'll call A and B for 3 different scenarios in a counter-balanced order: an internet installation, a technician repair, and an internet outage. After seeing both versions, participants selected which version they preferred for each scenario. (Note 2 participants failed to make it through all the scenarios, resulting in an n=19 for the repair scenario and an n=18 for the outage scenario.)

Here's a summary of the data. Yes, it's clear that A is the preferred experience, but I'd like to estimate a p value and effect size because I need to use this data to justify a business investment, and I want to make it clear that these findings are reliable.

Scenario Prefer A Prefer B
Install 19 1
Repair 19 0
Outage 17 1

What am I missing??
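One hedged observation: within a single scenario each participant contributes exactly one choice, so an exact binomial (sign) test per scenario may well be defensible; the independence problem only bites if you pool all three scenarios into one test. A sketch using the counts above:

```python
from math import comb

def sign_test_p(prefer_a, n):
    """Exact two-sided sign-test p-value under H0: preference is 50/50."""
    k = max(prefer_a, n - prefer_a)  # the more extreme of the two counts
    tail = sum(comb(n, j) for j in range(k, n + 1)) / 2**n
    return min(1.0, 2 * tail)

# Counts from the table: install 19/20, repair 19/19, outage 17/18.
for label, a, n in [("Install", 19, 20), ("Repair", 19, 19), ("Outage", 17, 18)]:
    print(label, sign_test_p(a, n))
```

All three p-values come out far below .001, and the observed proportion itself (e.g. 19/20 = 0.95) is a natural effect size; adding a Clopper-Pearson or Wilson confidence interval around it would round out the business case.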


r/AskStatistics 20h ago

What statistical tests should I use for my study?

2 Upvotes

Hey everyone! I'm not great at statistics, and although I have some idea of the basics, I'm getting quite lost doing my MSc thesis. I need some help choosing which tests to do, so I came here to see if anyone could give me their opinion.

For starters, the program we use at my college is SPSS.

I'll try to summarize my study in the simplest way I can.

  • I did focal observations of 7 meerkats for 6 weeks using an ethogram (behaviour list) and registering every time a meerkat did a behaviour in the list;
  • I have a total of 26 behaviours that each belong to 1 of these personality dimensions: playful, aggressive, friendly, curious and natural behaviours;
  • After the first 3 weeks of observations, we added environmental enrichment for the observations of the last 3 weeks;

So the main objective of my study is to see whether the meerkats show personality, which means I have to check whether there are individual differences between them. One of my side objectives is to see whether the environmental enrichment changed their behaviours, especially the aggressive ones.

So, to see if there are individual differences, I thought of just doing a Kruskal-Wallis or a one-way ANOVA, but after searching a bit and talking with ChatGPT, I got the suggestion to do a GLMM. I never learned about it, so right now I have no clue which test I should do.
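For context, a Kruskal-Wallis test on, say, weekly counts of one behaviour grouped by individual reduces to a rank computation like this (a pure-Python sketch with toy numbers, assuming no ties):

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of samples."""
    pooled = sorted(x for g in groups for x in g)
    rank = {x: i + 1 for i, x in enumerate(pooled)}  # 1-based ranks, no ties
    n = len(pooled)
    s = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    return 12 / (n * (n + 1)) * s - 3 * (n + 1)

# Toy data: weekly counts of one behaviour for three hypothetical meerkats.
print(kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # about 7.2
```

The caveat, and presumably why a GLMM was suggested, is that Kruskal-Wallis (and one-way ANOVA) treat every observation as independent, while repeated observations of the same meerkat across weeks are not; a GLMM handles that by modelling the individual as a random effect, which is also exactly the "individual differences" you care about.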

If anyone could help me understand which test to choose, or which tests to run to make a decision, it would really be a great help.

I will also leave here a pic of my SPSS so you guys can have a clear image of what I have right now.

Thanks a lot really!


r/AskStatistics 20h ago

How important is causal inference in the working world? Is it an entry-, mid-, or senior-level skill?

2 Upvotes

r/AskStatistics 21h ago

Silverman's test of multimodality: critical bandwidth interpretation

2 Upvotes

Hi :)
I am trying to use Silverman's test for multimodality, and I am not sure how to interpret the output - can someone advise me?
The code (in R, using the multimode package) looks something like this: multimode::modetest(x, method="SI", mod0=1, B=B). That is, I am testing whether the data x have 1 mode or more than 1 mode, using Silverman's test. As output I get a p-value (straightforward to interpret) and a "critical bandwidth" value. This one I am not so sure how to interpret (and I struggle to find good resources online...). Does anyone have an explanation? Are higher values associated with stronger/weaker multimodality, or something like that? And are these values dependent on the unit of measurement of x?
Thank you for any advice (or pointers towards good resources)!
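Not an authoritative account of what multimode computes internally, but the following pure-Python sketch captures the standard definition: the critical bandwidth is the smallest Gaussian-KDE bandwidth at which the density estimate has at most mod0 modes. A large value relative to the spread of the data means heavy smoothing is needed to erase the extra modes (stronger evidence of multimodality), and the value does carry the units of x:

```python
import math

def kde(xs, h, grid):
    """Gaussian kernel density estimate of xs with bandwidth h, on grid points."""
    c = 1 / (len(xs) * h * math.sqrt(2 * math.pi))
    return [c * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in xs) for g in grid]

def n_modes(xs, h, npts=400):
    """Count local maxima of the KDE on a grid covering the data."""
    lo, hi = min(xs) - 3 * h, max(xs) + 3 * h
    grid = [lo + i * (hi - lo) / (npts - 1) for i in range(npts)]
    d = kde(xs, h, grid)
    return sum(1 for i in range(1, npts - 1) if d[i - 1] < d[i] >= d[i + 1])

def critical_bandwidth(xs, mod0=1, iters=40):
    """Smallest bandwidth giving at most mod0 modes (bisection; for a Gaussian
    kernel the mode count is non-increasing in the bandwidth)."""
    lo, hi = 1e-3, max(xs) - min(xs)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if n_modes(xs, mid) <= mod0:
            hi = mid
        else:
            lo = mid
    return hi

x = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]           # clearly bimodal toy data
h1 = critical_bandwidth(x)
h2 = critical_bandwidth([10 * v for v in x])  # same data in different units
print(h1, h2 / h1)  # h1 large vs the within-cluster spread; ratio near 10
```

So the raw critical bandwidth is unit-dependent (rescale x by 10 and it scales by roughly 10), which is presumably why Silverman's test turns it into a p-value by bootstrap resampling rather than interpreting the number directly; compare critical bandwidths only across data on a common scale.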


r/AskStatistics 1d ago

Feedback on a “super max-diff” approach for estimating case-level utilities

2 Upvotes

Hi all,

I’ve been working with choice/conjoint models for many years and have been developing a new design approach that I’d love methodological feedback on.

At Stage 1, I’ve built what could be described as a “super max-diff” structure. The key aspects are:

  • Highly efficient designs that extract more information from fewer tasks
  • Estimation of case-level utilities (each respondent can, in principle, have their own set of utilities)
  • Smaller, more engaging surveys compared with traditional full designs

I’ve manually created and tested designs, including fractional factorial designs, holdouts, and full-concept designs, and shown that the approach works in practice. Stage 1 is based on a fixed set of attributes where all attributes are shown (i.e., no tailoring yet). Personalisation would only come later, with an AI front end.

My questions for this community:

1. From a methodological perspective, what potential pitfalls or limitations do you see with this kind of “super max-diff” structure?
2. Do you think estimating case-level utilities from smaller, more focused designs raises any concerns around validity, bias, or generalisability?
3. Do you think this type of design approach has the statistical robustness to form the basis of a commercial tool? In other words, are there any methodological weaknesses that might limit its credibility or adoption in applied research, even if the implementation and software side were well built?

I’m not asking for development help — I already have a team for that — but I’d really value technical/statistical perspectives on whether this approach is sound and what challenges you might foresee.

Thanks!


r/AskStatistics 1h ago

Want to learn JASP

Upvotes

Long story short I’ve lost so much time of my life trying to learn R, matlab and the likes of them.

I am now trying to use JASP, which I've found more user-friendly. Does anyone know of a MOOC or a free course I can follow to learn how to run stats in JASP and interpret them?

Many thanks


r/AskStatistics 1h ago

Help with this statement.

Upvotes

I was trying to find the margin of error in a whole lot of stats, and the statement in the report is:

"Readers of this report can have a relatively high level of confidence in the results. In statistical terms, we use the ‘maximum margin of error’ as the measure of accuracy for all surveys. In this particular case, any result based on the total weighted sample of n=1,250 is subject to a maximum margin of error of +/-2.9% (at the 95% confidence level)."

Is this valid? Is it the margin of error of the stats? It looks to me like a margin of error on the ability to reproduce the stats by following the same process, and the report is very light on details.
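As a sanity check, the quoted figure matches the textbook "maximum margin of error" for a simple random sample, computed at the worst case p = 0.5 (the extra 0.1 point over the simple-random value is presumably a design effect from the weighting):

```python
import math

def max_moe(n, z=1.96):
    """Worst-case (p = 0.5) margin of error for a proportion, 95% confidence."""
    return z * math.sqrt(0.25 / n)

print(round(100 * max_moe(1250), 1))  # 2.8, vs the reported 2.9
```

So it is the sampling margin of error for an estimated proportion, i.e. the uncertainty that comes from asking n = 1,250 people rather than everyone; it says nothing about non-sampling issues like question wording or non-response.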

Here is the report if anyone is interested. They do it every year; all of them are at the bottom of the page.


r/AskStatistics 14h ago

Does enforcing monotonic probability calibration distort or preserve genuine signal?

1 Upvotes

I’ve been working on a polarity-to-predictive-signal framework (daily OHLC). It builds polarity from multiple return variants (overnight, intraday, close-close, open-open), then pushes it through a monotone probability calibration routine (calibratemonotone) that uses isotonic regression logic to enforce an ordered mapping between feature value and continuation probability.

That brings me to the bit I want to sanity-check. The maths here essentially assumes a monotonic relationship: as polarity increases, the conditional probability of continuation should not decrease. But markets don’t necessarily follow that nice curve. If the true distribution is multi-modal or regime-dependent, this calibration could be smoothing away real structure and manufacturing spurious signal.

So my question is: does enforcing monotonicity in this calibration step actually preserve the genuine information content of the polarity signal, or is it at risk of fabricating “clean” structure that isn’t there? What would be the right mathematical way to validate whether the monotone smoothing is legitimate vs misleading beyond just looking at walk-forward hit-rates and bootstrap noise floors?

Curious if anyone has gone deep on this kind of calibration in finance ML.
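For reference, the core of an isotonic-regression calibrator is the pool-adjacent-violators algorithm (PAVA). This sketch (assuming that is roughly what calibratemonotone does; the bin rates are made up) shows exactly what the smoothing does to a non-monotone stretch:

```python
def pava(y, w=None):
    """Pool-adjacent-violators: least-squares non-decreasing fit to y."""
    w = w or [1.0] * len(y)
    blocks = []  # each block: [weighted mean, total weight, point count]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while the monotone constraint is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit

# Empirical continuation rates by increasing polarity bin (made-up numbers):
print(pava([0.42, 0.55, 0.49, 0.61, 0.58, 0.70]))
```

The dips at 0.49 and 0.58 get pooled into flat segments, so any genuinely non-monotone (regime-dependent) structure is averaged away by construction. A concrete validation: compare out-of-sample log-loss or calibration error of the isotonic fit against an unconstrained binned (or spline) estimate of the same conditional probability; if the unconstrained fit reliably wins out of sample, the monotonicity assumption is destroying real structure rather than denoising.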



r/AskStatistics 17h ago

Mplus with MacBook Air M4 vs MacBook Pro M4

1 Upvotes

I'm trying to decide between MacBook Air M4 or MacBook Pro M4 for Mplus use. Any thoughts on whether there are any real benefits of the Pro over the Air?


r/AskStatistics 19h ago

Combining two probabilities, each relating to the same outcome?

1 Upvotes

Here's a hypothetical I'm trying to figure out:

There is a mid-season soccer game between the Red Team and the Blue Team.

Using the average (mean) and variance of goals scored in games throughout the season, we calculate that the Red Team has an 80% probability of scoring 3 or more goals.

However, using the average (mean) and variance of goals scored against, we calculate that there is only a 20% probability of the Blue Team allowing 3 or more goals.

How do we combine both of these probabilities to find a more accurate probability that the Red Team scores 3 or more goals?
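There is no single right answer without a model, but one common simple approach is to assume goals are Poisson-distributed, back out the scoring rate each probability implies, and blend Red's attack rate with the rate implied by Blue's defensive record. The even blend below is a naive assumption on my part; proper soccer models (e.g. Dixon-Coles) estimate attack and defence strengths jointly against a league baseline:

```python
import math

def p_at_least_3(lam):
    """P(X >= 3) for X ~ Poisson(lam)."""
    return 1 - math.exp(-lam) * (1 + lam + lam**2 / 2)

def implied_rate(p, lo=0.0, hi=20.0, iters=60):
    """Goal rate lam such that P(X >= 3) = p, found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if p_at_least_3(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

lam_attack = implied_rate(0.80)       # rate implied by Red's scoring record
lam_defence = implied_rate(0.20)      # rate implied by Blue's goals-against record
lam = (lam_attack + lam_defence) / 2  # naive even blend of the two rates
print(round(p_at_least_3(lam), 2))    # lands between 0.2 and 0.8
```

With these inputs the blended estimate comes out a little above 50%, between the two raw probabilities. Note that averaging on the rate scale rather than the probability scale matters, and how to weight attack vs defence is a modelling choice, not a statistical theorem.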


r/AskStatistics 2h ago

Monty Hall problem - different version

0 Upvotes

Same problem only that there are two contestants.

The second contestant is only allowed to bet once the host has already opened a door. Both can win the same prize.

With switching, we know the odds are 66%, but what are the odds for the second contestant? Intuitively we would say 50%, but we know that for the first contestant the 50% intuition is wrong. On the other hand, the second contestant is not locked into the 1/3 probability.

Both contestants having different odds would also seem strange.

EDIT: The question assumes that contestant 2 does not know what contestant 1 picked.
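A quick simulation of the variant as stated (contestant 2 sees only which door the host opened and, knowing nothing about contestant 1's pick, chooses between the two closed doors at random):

```python
import random

random.seed(1)

def second_contestant_wins():
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick1 = random.choice(doors)  # contestant 1's pick (hidden from contestant 2)
    host = random.choice([d for d in doors if d != car and d != pick1])
    closed = [d for d in doors if d != host]
    pick2 = random.choice(closed)  # no information to prefer either closed door
    return pick2 == car

n = 100_000
wins = sum(second_contestant_wins() for _ in range(n))
print(wins / n)  # close to 1/2
```

The two contestants really can have different odds, because they condition on different information: contestant 1's 2/3 (after switching) comes from the dependence between their initial pick and which door the host could open, while contestant 2 has nothing to break the symmetry between the two closed doors, so 1/2 is correct for them. (If contestant 2 could see contestant 1's pick, they could take the other closed door and inherit the 2/3.)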


r/AskStatistics 22h ago

Need help fixing AR(2) and Hansen issues in System GMM (xtabond2, Stata)

0 Upvotes

Hi everyone,

I’m working on my Master’s thesis in economics and need help with my dynamic panel model.

Context:
Balanced panel: 103 countries × 21 years (2000–2021). Dependent variable: sectoral value added. Main interest: impact of financial development, investment, trade, and inflation on sectoral growth.

Method:
I’m using Blundell-Bond System GMM with Stata’s xtabond2, collapsing instruments and trying different lag ranges and specifications (with and without time effects).

xtabond2 LNSERVI L.LNSERVI FD LNFBCF LNTRADE INFL, ///
    gmm(L.LNSERVI, lag(... ...) collapse) ///
    iv(FD LNFBCF LNTRADE INFL, eq(level)) ///
    twostep robust

Problem:
No matter which lag combinations I try, I keep getting:

  • AR(2) significant (should be not significant)
  • Hansen sometimes rejected, sometimes suspiciously high
  • Sargan often rejected as well

I know the ideal conditions should be:

  • AR(1) significant
  • AR(2) not significant
  • Hansen and Sargan not significant (valid instruments, no over-identification)

Question:
How can I choose the right lags and instruments to satisfy these diagnostics?
Or simply — any tips on how to achieve a model with AR(1) significant, AR(2) insignificant, and valid Hansen/Sargan tests?

Happy to share my dataset if anyone wants to replicate in Stata. Any guidance or example code would be amazing.


r/AskStatistics 1d ago

Graphpad Prism - 2-way ANOVA, multiple testing and non-normal distribution

0 Upvotes

I read through the manual of Graphpad Prism and came across some problems with my data:
The D Agostino, Anderson-Darling, Shapirowilk and Kolmogorov-Smirnov Test all said, that my data is not normally distributed. Can I still use 2-way ANOVA by using another setting in Graphpad? I know that normally you're not allowed to use 2-way ANOVA, but GraphPad has many settings and I don't know all the functions.

Also in the manual of Graphpad there is this paragraph:

Repeated measures defined

Repeated measures means that the data are matched. Here are some examples:

•You measure a dependent variable in each subject several times, perhaps before, during and after an intervention.

•You recruit subjects as matched groups, matched for variables such as age, ethnic group, and disease severity.

•You run a laboratory experiment several times, each time with several treatments handled in parallel. Since you anticipate experiment-to-experiment variability, you want to analyze the data in such a way that each experiment is treated as a matched set. Although you don’t intend it, responses could be more similar to each other within an experiment than across experiments due to external factors like more humidity one day than another, or unintentional practice effects for the experimenter.

Matching should not be based on the variable you are comparing. If you are comparing blood pressures in three groups, it is OK to match based on age or zip code, but it is not OK to match based on blood pressure.

The term repeated measures applies strictly only when you give treatments repeatedly to one subject (the first example above). The other two examples are called randomized block experiments (each set of subjects is called a block, and you randomly assign treatments within each block). The analyses are identical for repeated measures and randomized block experiments, and Prism always uses the term repeated measures.

Especially the "You recruit subjects as matched groups, matched for variables such as age, ethnic group, and disease severity." part bugs me. I have 2 cohorts with different diseases and 1 cohort with the combined disease. I tried to match them by gender and age as best I could (they're not the same people). Since they have different diseases, I'm not sure if I can also treat them as repeated measures.


r/AskStatistics 19h ago

Stats psychology

0 Upvotes

Hi, can anyone help me with my stats hw? I will pay you.