Skip to content

Cohort Sample PR Updates#4

Merged
blootsvoets merged 4 commits into
thehyve:cohort-samplingfrom
chrisknoll:cohort-sampling-cknoll1-update1
Nov 9, 2020
Merged

Cohort Sample PR Updates#4
blootsvoets merged 4 commits into
thehyve:cohort-samplingfrom
chrisknoll:cohort-sampling-cknoll1-update1

Conversation

@chrisknoll
Copy link
Copy Markdown

Making the following changes to avoid potential issues with sql rendering:

  • Rename rank to rank_value.
  • Remove default to include event counts.

Remove default to include event counts.
Copy link
Copy Markdown

@MaximMoinat MaximMoinat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I want Joris to at least be aware before we merge.

@chrisknoll
Copy link
Copy Markdown
Author

Could you hold on merging this for a day or two: I'm testing the front end, and while I was sure I saw somewhere that when you save a cohort definition or re-generate a cohort definition, the samples get deleted. However, that behavior is not happening on the atlas branch for the enhanced profile. But, this is fine, I'd like to see if we can put a 'refresh' fucntion into the cohort sample service so that given a current sample, it can re-generate a new sample of patients (without having to create a new sample).

The back-end change is just to add a 'reresh' function to the service.

In addition, I think i'd like to tag the sample with the 'design hash' of the cohort definition. We do this elsewhere, but we can use this infot o know if a sample came from the current cohort definition.

@chrisknoll
Copy link
Copy Markdown
Author

I've added an additional commit to this PR to add a refreshSample function. This can be invoked if you call /WebAPI/cohortsample/{cohortId}/{source_key}/{sample_id}?refresh=true.

I plan on using this on the front end to allow users to take an existing sample and 'refresh it' with a new sample (if a cohort has changed, for example).

Copy link
Copy Markdown

@blootsvoets blootsvoets left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the addition of a refresh functionality. I wan’t aware of this design hash, that seems like a good addition!

@PathParam("sampleId") Integer sampleId,
@DefaultValue("recordCount") @QueryParam("fields") String fields
@DefaultValue("") @QueryParam("fields") String fields,
@QueryParam("refresh") boolean refresh
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To semantically distinguish reading and updating, could the refresh be put in a separate POST method? I.e. POST {sampleId}/refresh. This may require an update to the permission paths.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thank you for your review.

If we want to consider semantics of the call: If we're saying POST, that means some resource was created, but it's not like a new resource (the cohortSample with a new resource ID) is created. On the other hand, POSTs are not idempotent (each call has a side effect) which there is a side-effect, it's true. You could say that this call results in a 'new sample' but when you do a POST, you send a request to an endpoint ie: /someResource and the new resource is found at /someResource/{newId}. At least, that's how all the other POST ops work in WebAPI: POST to create, PUT to update.

But the thing is with PUT is that multiple calls to the same url (ie /cohortsample/{sampleId}/refresh should not have side effects, but it does (PUTs must be idempotent). Ie: you can't have the HTTP layer cache this call because it's supposed to go to the server every time.

Niether of these cases is actually sending data up to the server, so that's why I wasn't thinking a POST/PUT made sense. And, the result of the call to the GET is the same in both the 'fetch' and 'refresh' case...while the POST to /cohortsample retruns one datastructure while the POST to /cohortsample/{id}/refresh would return something else.

For these reasons, I think what is put forward here is the correct approach, although I do understand that this type of 'remote procdure call' doesn't meld well with a pure REST-ful methodology...but we do this elsewhere in the app: to generate a cohort: you do a GET on cohortdefinition/{id}/generate. So, I suggest we keep it consistent with the other elemetns of the application.

Copy link
Copy Markdown

@blootsvoets blootsvoets Nov 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, alternatively, the seed could be specified by the user via a PUT request to {sampleId}/seed? It can be any string, and its hash will be used to actually seed the RNG. In practice, Atlas could send a string containing the current date-time to refresh the state. The DB scheme would need to be updated.

Or use the more generic /algorithm which could include a seed but also the sampling algorithm being used (right now, only uniform).

Copy link
Copy Markdown
Author

@chrisknoll chrisknoll Nov 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update to my last response: I was looking through other examples of refhreshing datasets via REST, and seen several examples of post, so I'll go through that approach.

@chrisknoll
Copy link
Copy Markdown
Author

I've made the update requested, and finalized the UI for Atlas in this PR: OHDSI/Atlas#2373. If we can get these approved, we can get this into 2.8. Time is sensitive, so please let me know about any issues at your earliest convienence.

Copy link
Copy Markdown

@blootsvoets blootsvoets left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the update!

@blootsvoets blootsvoets merged commit b1e15fc into thehyve:cohort-sampling Nov 9, 2020
@chrisknoll chrisknoll deleted the cohort-sampling-cknoll1-update1 branch November 9, 2020 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants