Discussion Request: Lookup-Data - Repo and Warehouse Storage

### Is your feature request related to a problem? Please describe

### Scope: Ensure a logical pipeline and warehouse structure for current and future pipeline data needs, **for use in the short term.**

Document any results of this discussion into their own GH Issue or ADR, as necessary.

Primary Questions:
1. When to use `dbt-seeds` vs `source dbt-references`
2. Define the types of data that we **may** want to represent within the dbt-pipeline.  
3. How will the data organized within the repo?  File structure confirmation for each type.
4. How will the data be stored in the warehouse? Schema confirmation for each type.

### Scope Creep:
1. Seed data-file conventions
2. Placement specific data files into a Type. Examples are added to the proposal for clarity, but may not be accurate. 
3. Non-Tabular data - This data can be stored in various yaml files and be available in the dbt-docs as `Markdown`. i.e. Release notes, harmonization notes, etc



### Describe the solution you'd like

### Proposal 

Possible types of lookup data:
- **Type1:** Study-specific lookup data - Not for use in `dbt-docs`.
-- Data used for harmonization and not necessary for display within the `dbt-docs`.
-- i.e. Annotations
- **Type2:** Study-specific lookup data - For use in `dbt-docs`.
-- i.e. Site information, value-sets
- **Type3: ** Study-specific lookup data  - For use in `dbt-docs` ONLY.
-- i.e. Dataset metadata. Harmonization metadata. Links to current documentation or sources, skimmed down version of value-set data, etc.
-  **Type4:** Cross-project / Pipeline-specific data - Not for use in `dbt-docs`.
-- Site information, study identifiers, AccessDM vocabulary tables
-  **Type5:** Cross-project / Pipeline-specific data - For use in `dbt-docs` ONLY.
-- Site information, study identifiers



#### When to use source `dbt-references`.
- This is the process used to bring `Raw Study Data` into the pipeline, as a dbt-reference, using a sources.yml file.
- **Use dbt-references for** Type1, and Type4 data, 
-- Storage - Type1: All study-specific data that is referenced via a `sources.yml` should be imported into the studies raw schema. i.e. inc_brainpower_raw
-- Storage - Type4: Cross-project data can be stored together.
--- Repo: models/cross_project/filename.sql
--- Schema: cross_project_raw


#### When to use `dbt-seeds`
- **Use seeds when** a lookup is important to outside users who want to see the csv reference in GH, dbt-docs, GH-Pages.   Also, when version control is necessary, and appropriate. **Requirement:**  Data size must fall within the dbt-seed size limitations. 
- Storage and schema naming.
-- seeds/inc/inc_filename_1     > {environment}_seeds.inc_filename_1         > Type2
-- seeds/kf/kf_filename_1        > {environment}_seeds.kf_filename_1           > Type2
-- seeds/docs/filename_1        > {environment}_seeds.docs_filename_1       > Type3
-- seeds/filename_1                 > {environment}_seeds.filename_1                > Type5


### Describe alternatives you've considered

_No response_

### Additional Context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Discussion Request: Lookup-Data - Repo and Warehouse Storage #34

Is your feature request related to a problem? Please describe

Scope: Ensure a logical pipeline and warehouse structure for current and future pipeline data needs, for use in the short term.

Scope Creep:

Describe the solution you'd like

Proposal

When to use source `dbt-references`.

When to use `dbt-seeds`

Describe alternatives you've considered

Additional Context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Discussion Request: Lookup-Data - Repo and Warehouse Storage #34

Description

Is your feature request related to a problem? Please describe

Scope: Ensure a logical pipeline and warehouse structure for current and future pipeline data needs, for use in the short term.

Scope Creep:

Describe the solution you'd like

Proposal

When to use source dbt-references.

When to use dbt-seeds

Describe alternatives you've considered

Additional Context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions

When to use source `dbt-references`.

When to use `dbt-seeds`