I am new to R. I am trying to run a Kruskal-Wallis test followed by a Dunn test (with Bonferroni correction) on data from 6 fish species sampled over 15 months. The number of samples differs between months (between 25 and 45 replicates per month), and the data are counts: the number of animals present in a given space. My samples are neither normal nor homoscedastic (I have tried log and square-root transformations without success). I want to know which months differ from which other months for each species. I have 3 problems related to the Dunn test:

  • The Kruskal-Wallis test (function kruskal.test) for rockfish gave a p-value of 0.01, which is significant (there are significant differences between months). Surprisingly, the Dunn test (package dunn.test) did not give a significant p-value for any pair of groups (in this case, months).
  • For the fish Anoplopoma fimbria the problem was different. Kruskal-Wallis gave a significant p-value, but in the Dunn test certain pairs of months behaved inconsistently. For a pair of months that I will generically call A and B, the comparison AB gave a non-significant p-value while BA (the same pair in the opposite order) gave a significant one. The same thing happened with a given month paired with itself (combination A with A, and combination A with A again). This was repeated for the other species.
  • I ran the Dunn test with alpha = 0.05 and then again with alpha = 0.01 to see whether the results would be less chaotic (there are many possible combinations of months). The result was exactly the same, with the same p-values.

In case it helps, here is the script I used and the output it gave for rockfish, followed by the same for Anoplopoma fimbria:

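    > # Read the rockfish data (the columns used below are counts and months)
    > rockfish.krustal.wallis <- read.table(file.choose(), header=TRUE)
    > names(rockfish.krustal.wallis)
    > library("dunn.test", lib.loc="~/R/win-library/3.2")
    > # Kruskal-Wallis test (kw=TRUE) followed by Dunn's pairwise comparisons
    > dunn.test(rockfish.krustal.wallis$counts, g=rockfish.krustal.wallis$months, kw=TRUE, method = "Bonferroni", alpha = 0.01)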

    The result of Kruskal Wallis for rockfish:

    Kruskal-Wallis rank sum test
    data: x and group
    Kruskal-Wallis chi-squared = 29.2316, df = 13, p-value = 0.01

    The result of the Dunn test for rockfish:

    The script for Anoplopoma fimbria :

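    > # Read the Anoplopoma fimbria data (the columns used below are counts and months)
    > Anoplopoma_fimbria.krustal.wallis <- read.table(file.choose(), header=TRUE)
    > names(Anoplopoma_fimbria.krustal.wallis)
    > library("dunn.test", lib.loc="~/R/win-library/3.2")
    > # Kruskal-Wallis test (kw=TRUE) followed by Dunn's pairwise comparisons
    > dunn.test(Anoplopoma_fimbria.krustal.wallis$counts, g=Anoplopoma_fimbria.krustal.wallis$months, kw=TRUE, method = "Bonferroni", alpha = 0.01)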

    The result of the Kruskal Wallis test:

    Kruskal-Wallis rank sum test
    data: x and group
    Kruskal-Wallis chi-squared = 346.7977, df = 13, p-value = 0

    The result of the dunn test for Anoplopoma fimbria:

    As the tables produced by your Dunn test show, a Bonferroni correction is applied to the results. Your Kruskal-Wallis output has df = 13, which corresponds to 14 groups, so comparing every month with every other month means 14 x 13 / 2 = 91 pairwise comparisons. You can imagine that the chance of at least one difference looking significant purely by chance is much greater when you compare 14 groups than when you compare only 2 (a single comparison instead of 91).
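As a rough illustration of how quickly the number of comparisons grows, the following base R sketch counts the distinct pairs among the groups implied by df = 13 and estimates the chance of at least one false positive if no correction were applied. It assumes the tests are independent, which is a simplification and not a property of Dunn's test.

```r
# Number of groups implied by the Kruskal-Wallis output (df = k - 1 = 13)
k <- 13 + 1

# Number of distinct pairwise comparisons among k groups: k * (k - 1) / 2
m <- choose(k, 2)
m

# Probability of at least one false positive across m tests at level alpha
# with no correction (assumes independent tests, a simplification)
alpha <- 0.05
fwer_uncorrected <- 1 - (1 - alpha)^m
fwer_uncorrected

# For comparison: with only 2 groups there is a single comparison
choose(2, 2)
```

With 14 groups, `m` is 91, so an uncorrected analysis would be very likely to flag at least one pair by chance alone.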

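    In technical terms, each additional comparison increases the family-wise error rate, that is, the probability of at least one false positive across the whole set of comparisons. In practice the most common remedy is a Bonferroni correction, as you applied above; its price is reduced power to detect each individual difference.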
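To make the effect of the correction concrete, here is a minimal sketch using base R's `p.adjust`. The three raw p-values are hypothetical and are not taken from the question's output; the number of comparisons assumes 14 groups, as implied by df = 13.

```r
# Hypothetical unadjusted p-values for three of the pairwise comparisons
p_raw <- c(0.001, 0.004, 0.03)

# Total number of pairwise comparisons among 14 groups
m <- choose(14, 2)

# Bonferroni: multiply each p-value by m and cap the result at 1
p_bonf <- p.adjust(p_raw, method = "bonferroni", n = m)
p_bonf

# Equivalent decision rule: compare the raw p-values against alpha / m
alpha <- 0.05
alpha / m
p_raw < alpha / m
```

Even a raw p-value of 0.001 becomes 0.091 after adjustment for 91 comparisons, so none of these hypothetical pairs would be declared significant at alpha = 0.05.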

    For the same reason, it should not surprise you that a significant Kruskal-Wallis result does not lead to any significant pairwise result with Dunn when you have 14 groups. The result suggests that you need more data to detect differences between individual months.

    The next question is obviously how to proceed with these data. My advice would be to look again at your sampling design and the reasons behind it. It seems that the design in this case was meant to test whether there is an effect of month, not to determine which months differ from each other.

    The Kruskal-Wallis result supports the conclusion that there is a significant effect of month. A monthly trend may be visible on a graph. Although you cannot attach statistical significance to it, you can describe the trend in the data through visual interpretation.
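One possible way to look at the monthly trend is a boxplot of counts by month together with the monthly medians. The sketch below uses simulated Poisson counts as a stand-in for one species; the data frame `fish`, the equal sample size of 30 per month, and the rising trend are all assumptions made for illustration, with only the column names `counts` and `months` borrowed from the question.

```r
# Hypothetical data standing in for one species: 14 months, 30 counts each
set.seed(1)
fish <- data.frame(
  months = factor(rep(1:14, each = 30)),
  counts = rpois(14 * 30, lambda = rep(seq(2, 8, length.out = 14), each = 30))
)

# Median count per month, a robust summary for skewed count data
monthly_median <- tapply(fish$counts, fish$months, median)
monthly_median

# Boxplot of counts by month to inspect the trend visually
boxplot(counts ~ months, data = fish,
        xlab = "Month", ylab = "Count per sample",
        main = "Counts by month (simulated data)")
```

With real data, the simulated `fish` data frame would be replaced by the table read in for each species.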

    It usually does not make much sense to keep searching for a statistical method until you find one that gives the results you want to see. It is better to use these results as the basis for a better sampling design next time, or to recommend a better design when you publish them.

    answered 21.03.2016 at 14:23