Learn data engineering fundamentals by constructing a modern data stack for analytics and machine learning applications. We'll also learn how to orchestrate our data workflows and programmatically execute tasks to prepare high-quality data for downstream consumers (analytics, ML, etc.).
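To make "orchestrating a workflow" concrete, here is a minimal sketch in plain Python. It is an illustration only, not how any of the tools listed in this repo work internally. The task names (`extract`, `clean`, `aggregate`), the sample rows, and the `run_pipeline` helper are all hypothetical. The sketch uses the standard-library `graphlib` module (Python 3.9+) to run tasks in dependency order.

```python
from graphlib import TopologicalSorter


def extract():
    # Hypothetical raw records; a real pipeline would read from a source system.
    return [
        {"user": "a", "amount": "10"},
        {"user": "b", "amount": None},
        {"user": "a", "amount": "5"},
    ]


def clean(rows):
    # Drop rows with a missing amount and cast the rest to float.
    return [
        {"user": r["user"], "amount": float(r["amount"])}
        for r in rows
        if r["amount"] is not None
    ]


def aggregate(rows):
    # Sum amounts per user for downstream consumers.
    totals = {}
    for r in rows:
        totals[r["user"]] = totals.get(r["user"], 0.0) + r["amount"]
    return totals


# Each task name maps to the set of tasks it depends on.
dag = {"extract": set(), "clean": {"extract"}, "aggregate": {"clean"}}
tasks = {"extract": extract, "clean": clean, "aggregate": aggregate}


def run_pipeline(dag, tasks):
    # Run every task after its dependencies, passing upstream results as arguments.
    results = {}
    for name in TopologicalSorter(dag).static_order():
        upstream = [results[dep] for dep in sorted(dag[name])]
        results[name] = tasks[name](*upstream)
    return results


results = run_pipeline(dag, tasks)
print(results["aggregate"])
```

Dedicated orchestrators add scheduling, retries, and monitoring on top of this basic idea of running tasks in dependency order.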
👉 This repository contains a list of modern tools for data engineering:
- Apache Spark - Unified engine for large-scale data analytics
- Snowflake - Cloud data warehousing platform
- Apache Flink - Framework and distributed processing engine for stateful computations over unbounded and bounded data streams
- Apache Hadoop - Framework for the distributed processing of large data sets across clusters of computers using simple programming models
- MySQL - Open-source relational database management system
- Add more tools...

To contribute:

- Fork the repo
- Create a separate branch, for example `data-feature`
- Update `Readme.md` and commit your changes
- Open a PR against the main branch for approval
This repo is part of an open-source project for Hacktoberfest 2023. Please make pull requests to participate.