extract_input_target_forcings add option for left-justification of train/eval#56
Closed
tvonich wants to merge 1 commit intogoogle-deepmind:mainfrom
Closed
extract_input_target_forcings add option for left-justification of train/eval#56tvonich wants to merge 1 commit intogoogle-deepmind:mainfrom
tvonich wants to merge 1 commit intogoogle-deepmind:mainfrom
Conversation
Provides optionality for a left-justified evaluation and training with kwarg justify = 'left' or 'right'
Author
|
Tried to accomplish the CLA...may be doing it wrong. If any clarification is available that'd be great. |
tewalds
reviewed
Feb 22, 2024
Contributor
tewalds
left a comment
There was a problem hiding this comment.
This would need to be rebased.
| dataset: xarray.Dataset, | ||
| input_duration: TimedeltaLike, | ||
| target_lead_times: TargetLeadTimes, | ||
| justify: str |
Contributor
There was a problem hiding this comment.
This is probably better as an enum instead of string.
| if justify == 'left': | ||
| # Inputs correspond to the first time elements within the input duration | ||
| # Targets follow immediatly after per the target lead times | ||
| target_start_time = int(input_duration.total_seconds()/3600/6) |
Contributor
There was a problem hiding this comment.
This shouldn't be 6h specific.
| inputs = dataset.isel(time=slice(int(target_start_time))) | ||
| inputs['time'] = inputs['time'] - input_duration + time[1] | ||
|
|
||
| targets = dataset.isel(time=slice(target_start_time,target_end_time)) |
Contributor
There was a problem hiding this comment.
Please take a look at the google python style guide. Some general comments:
- There should be spaces between function arguments, no trailing spaces, etc.
- Comments are in english and should use punctuation (eg missing periods above).
- Remove debugging statements (commented out code, print statements below).
| (inclusive) lead times, or a sequence of lead times. Lead times should be | ||
| Timedeltas (or something convertible to). They are given relative to the | ||
| final input timestep, and should be positive. | ||
| justify: Defines whether inputs and targets are extracted from the beginning |
Contributor
There was a problem hiding this comment.
Any chance you can add a test that this does what you'd expect?
Author
|
I'll work through these in the next week or two when able. And yes, I can show that it's working as expected. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Provides optionality for a left-justified evaluation and training with kwarg justify = 'left' or 'right'. Changes are from lines 281 - 322, line 361, and line 381.
Per issue discussion with Timo Ewalds #32
justify = 'left' is now the default and I believe that makes sense based on normal user expectation of left to right progression of time when doing a forecast and selecting data.
To explain what is different now, I'll use the following example.
We load dataset_source-era5_date-2022-01-01_res-1.0_levels-13_steps-12.nc in the Jupyter Notebook.
There are 14 time periods along the data dimension for this netcdf: 2 initial states + 12 steps
Let's visualize them as: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
We set Eval Steps and Train Steps to 4 on the accompanying slider and extract the data with extract_input_target_forcings.
With the previous extract_input_target_forcings function, the inputs would've been 9, 10 and the targets would've been 11, 12, 13, 14 for the designated Eval and Train Steps. This is a bit misleading because if I want to do a 24-hour forecast starting at 0Z 01-01-2022, I expect my forecast end time to be 0Z 01-02-2022. In the right justified method, the forecast always ends on the last time element within the example_data.
With the new left-justified default, the inputs are always the 1 and 2, with the targets following immediately after. So in the above example, that would be 1, 2 for inputs and 3, 4, 5, 6 for targets.