Skip to content

some more thoughts for the future (dplR v2.x) #22

@sklesse

Description

@sklesse

Hi Andy & everyone contributing and maintaining dplR /dplPy.

I repeatedly am confronted with the unnecessary behavior of rwi.stats.running to not correctly calculate rbar when you detrend with difference=T.

That the detrending via division throws you an error when the estimated line goes below zero and reverts to detrending via the mean is great. But when people say difference=T they probably have already converted their values (log- or power-transformed) or they work with stable isotopes. In the case of log-transformed RW or d13C we automatically work with negative values, and it would be great if there was an option in detrend() to say: "I know what I am doing, fit "spline" curve anyways, even when fit goes negative". I am aware that adding a simple line in your code transposing the data by +100 is a very effective workaround, but sometimes I am asking myself if there couldn't be a fix for that. :-) The workaround is probably easier than changing detrend.series() with the multiple detrending options.

Anyways, I digress, the "dirty dog" part is actually not the reason why I opened this issue. Why is the rbar and hence eps and snr calculation affected by close-to-zero mean values? That does not make sense.

tmp <- normalize1(rwi, n, prewhiten)
if(!all(tmp$idx.good)) {
warning("after prewhitening, 'rwi' contains column(s) without at least four observations",
call.=FALSE)
cat(gettext("note that there is no error checking on column lengths if filtering is not performed\n",
domain="R-dplR"))
}

After some step-by-step testing I found normalize1() to be the "culprit". This function needs to be changed. At least, I do not understand why it does what it does. Why do anything, if I do not want to prewhiten the data? Why divide the data by their mean at all? Even for the pre-whitening this steps seems unnecessary. You can do the ar-modeling at any mean. And for the rbar calculation the mean doesn't matter. Am I missing something here, or why was this implemented? I would suggest deleting the lines that say "divide by time series mean" in normalize1(). Seems like an easy fix. What do you think?

BTW: This "set mean to 1" functionality in the detrend.series "Ar" method created also the funny constant offset of Anderegg's 2015 Figure S10, where their legacy baseline is weirdly at ~-0.01 and not 0. because each normally detrended time series actually has a usually a mean of something around 0.99-something (there is variation around the means, but by and large they are slightly <1), but the mean of the prewhitened time series is always exactly 1.

In the detrend.series() function the following lines

if (difference) {
     resids$Ar <- Ar - mean(Ar, na.rm = TRUE)
   }
   else {
     resids$Ar <- Ar/mean(Ar, na.rm = TRUE)
   }

could be changed to

if (difference) {
     resids$Ar <- Ar - mean(y2, na.rm = TRUE) 
   }
   else {
     resids$Ar <- Ar/mean(y2, na.rm = TRUE)
   }

Just an idea.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions