some more thoughts for the future (dplR v2.x)

Hi Andy & everyone contributing and maintaining dplR /dplPy.

I repeatedly am confronted with the unnecessary behavior of rwi.stats.running to not correctly calculate rbar when you detrend with `difference=T`. 

That the detrending via division throws you an error when the estimated line goes below zero and reverts to detrending via the mean is great. But when people say `difference=T` they probably have already converted their values (log- or power-transformed) or they work with stable isotopes. In the case of log-transformed RW or d13C we automatically work with negative values, and it would be great if there was an option in `detrend()` to say: "I know what I am doing, fit "spline" curve anyways, even when fit goes negative". I am aware that adding a simple line in your code transposing the data by +100 is a very effective workaround, but sometimes I am asking myself if there couldn't be a fix for that. :-) The workaround is probably easier than changing `detrend.series() ` with the multiple detrending options.

Anyways, I digress, the "dirty dog" part is actually not the reason why I opened this issue. Why is the rbar and hence eps and snr calculation affected by close-to-zero mean values? That does not make sense.

tmp <- normalize1(rwi, n, prewhiten)
  if(!all(tmp$idx.good)) {
    warning("after prewhitening, 'rwi' contains column(s) without at least four observations",
            call.=FALSE)
    cat(gettext("note that there is no error checking on column lengths if filtering is not performed\n",
                domain="R-dplR"))
  }

After some step-by-step testing I found `normalize1()` to be the "culprit". This function needs to be changed. At least, I do not understand why it does what it does. Why do anything, if I do not want to prewhiten the data? Why divide the data by their mean at all? Even for the pre-whitening this steps seems unnecessary. You can do the ar-modeling at any mean. And for the rbar calculation the mean doesn't matter. Am I missing something here, or why was this implemented? I would suggest deleting the lines that say "divide by time series mean" in `normalize1()`. Seems like an easy fix. What do you think? 

BTW: This "set mean to 1" functionality in the detrend.series "Ar" method created also the funny constant offset of Anderegg's 2015 Figure S10, where their legacy baseline is weirdly at ~-0.01 and not 0. because each normally detrended time series actually has a usually a mean of something around 0.99-something (there is variation around the means, but by and large they are slightly <1), but the mean of the prewhitened time series is always exactly 1. 

In the `detrend.series()` function the following lines
 ```
 if (difference) {
      resids$Ar <- Ar - mean(Ar, na.rm = TRUE)
    }
    else {
      resids$Ar <- Ar/mean(Ar, na.rm = TRUE)
    }
```
could be changed to
 ```
 if (difference) {
      resids$Ar <- Ar - mean(y2, na.rm = TRUE) 
    }
    else {
      resids$Ar <- Ar/mean(y2, na.rm = TRUE)
    }

```
Just an idea. 


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

some more thoughts for the future (dplR v2.x) #22

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

some more thoughts for the future (dplR v2.x) #22

Description

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions