## A Statistics Question

Since my amateur discursion into stochasticity appears to have flushed out all of the mathematically savvy, I’m going to pose a real-life statistics question for you. I have some data that are not normally distributed (in fact, they don’t seem to fit any distribution well, and, yes, I’ve tried various transformations and the data still don’t fit anything). If the data were normally distributed, I would perform an ANOVA to partition the sources of variance (using percent sums of squares).
Since the data aren’t normally distributed, ANOVA is not the right test to use. Because I’m using a full-factorial model with four factors, using Friedman’s test or any of the one-way non-parametric tests doesn’t seem to apply either. I should add that, while my data are continuous, I can transform the data into ordinal and/or nominal categories if that would help. While logistic regression could circumvent the non-normality problem, I’m unaware of how one partitions variance with that test (if there’s a way to do so, please let me know).
Any ideas? If there are any statistics programs or R packages out there that can handle this, please let me know. Don’t be shy…
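
For readers unfamiliar with the "percent sums of squares" approach mentioned in the post, here is a minimal sketch in R. The data are simulated, and the factor names `A` to `D`, the effect sizes, and the balanced 2x2x2x2 layout are all assumptions made for illustration. The arithmetic of the partition does not depend on normality; only the F tests and p-values in the ANOVA table do.

```r
# Hypothetical balanced 2x2x2x2 full-factorial design with simulated data.
set.seed(1)
d <- expand.grid(A = factor(1:2), B = factor(1:2),
                 C = factor(1:2), D = factor(1:2),
                 rep = 1:5)

# Skewed (non-normal) response; the effect sizes are made up for illustration.
d$y <- rexp(nrow(d)) + 0.8 * (d$A == "2") + 0.4 * (d$B == "2") * (d$C == "2")

# Full-factorial model: all main effects and interactions.
fit <- aov(y ~ A * B * C * D, data = d)
tab <- summary(fit)[[1]]

# Percent of the total sum of squares attributed to each term and to residuals.
pct_ss <- 100 * tab[["Sum Sq"]] / sum(tab[["Sum Sq"]])
names(pct_ss) <- trimws(rownames(tab))
round(pct_ss, 1)
```

In a balanced design like this one the sequential sums of squares do not depend on the order of the terms. With unbalanced data they do, so the percentages would need more careful interpretation.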


### 5 Responses to A Statistics Question

1. igor eduardo kupfer says:

Warning: there is an excellent chance that I don’t know what I’m talking about.
The Kruskal-Wallis test is the nonparametric analogue of one-way ANOVA.
Or you could force your data into a gaussian distribution using ranks, a la the following R code:
x = (rank(x)-.5)/length(x)
x = qnorm(x)
Here’s a paper that might be of service, “Rank-Based Analyses of Linear Models Using R”:
http://www.jstatsoft.org/v14/i07/v14i07.pdf
The R library MASS has some other robust stuff as well, from what I recall.
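
As a rough sketch of the two suggestions in this comment, the snippet below wraps the rank transform in a helper and runs a Kruskal-Wallis test. The helper name `rank_to_normal`, the simulated data, and the single grouping factor `g` are assumptions for illustration only. Note that the transform normalizes the pooled response rather than the model residuals, so it is not guaranteed to satisfy ANOVA's assumptions.

```r
# Rank-based inverse normal transform, wrapping the two lines above.
rank_to_normal <- function(x) qnorm((rank(x) - 0.5) / length(x))

set.seed(2)
y <- rexp(200)                     # simulated, strongly skewed data
g <- factor(rep(1:4, each = 50))   # hypothetical single grouping factor
z <- rank_to_normal(y)

# The transformed values are approximately standard normal by construction.
c(mean = mean(z), sd = sd(z))

# Kruskal-Wallis handles only one grouping factor at a time.
kruskal.test(y ~ g)
```

The transformed `z` could then be passed to `aov()` in place of `y`, though the resulting sums of squares describe variation in ranks rather than in the original units.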

2. eric bloodaxe says:

Perhaps your results are purely random, so there would be no way to get them to mean anything.

3. Josh says:

Do you get significant results from an ANOVA? Violating the assumptions will usually make your test less powerful, so if you still get significance, that’s a good sign. ANOVA is generally considered robust to non-normality. Are your variances fairly constant? ANOVA is less resistant to heteroskedasticity than to non-normality.
This posting gives some hints about using proportional odds models as a path to robust ANOVA. For logistic regressions, look especially at the conditional logit functions.
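
A hedged sketch of both ideas in this comment, using simulated data: `fligner.test()` from base R gives a rank-based check of whether the spread is similar across cells, and `polr()` from the MASS package fits a proportional odds model to a binned version of the response. The two-factor layout, the quartile binning, and the effect size are assumptions chosen to keep the example small. A proportional odds fit reports coefficients and deviance, not a sums-of-squares partition.

```r
library(MASS)  # provides polr()

set.seed(3)
# Hypothetical 2x2 design with 30 replicates per cell.
d <- expand.grid(A = factor(1:2), B = factor(1:2), rep = 1:30)
d$y <- rexp(nrow(d)) + 0.7 * (d$A == "2")

# Rank-based test of homogeneity of variance across the four cells.
fligner.test(y ~ interaction(A, B), data = d)

# Bin the continuous response into ordered quartiles.
d$ycat <- cut(d$y, breaks = quantile(d$y, 0:4 / 4),
              include.lowest = TRUE, ordered_result = TRUE)

# Proportional odds (ordinal logistic) model with both factors and their interaction.
po_fit <- polr(ycat ~ A * B, data = d, Hess = TRUE)
summary(po_fit)
```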

4. Steff Z says:

Mike, you’ve got two copies of the original post.