Understanding ETL
Data Pipelines for Modern Data Architectures
Book Details
| Author | Matt Palmer |
| Publisher | O'Reilly Media |
| Published | 2024 |
| Edition | 1st |
| Paperback | 107 pages |
| Language | English |
| ISBN-13 | 9781098159238, 9781098159252 |
| ISBN-10 | 1098159233, 109815925X |
| License | Compliments of Databricks |
Book Description
Extract, transform, load (ETL) is at the center of every application of data, from business intelligence to AI. Constant shifts in the data landscape - including the implementations of lakehouse architectures and the importance of high-scale real-time data - mean that today's data practitioners must approach ETL a bit differently.
This updated technical guide offers data engineers, engineering managers, and architects an overview of the modern ETL process, along with the challenges you're likely to face and the strategic patterns that will help you overcome them. You'll come away equipped to make informed decisions when implementing ETL and confident about choosing the technology stack that will help you succeed.
- Discover what ETL looks like in the new world of data lakehouses
- Learn how to deal with real-time data
- Explore low-code ETL tools
- Understand how to best achieve scale, performance, and observability
This book is published as open access, which means it is freely available to read, download, and share without restrictions.
If you enjoyed the book and would like to support the author, you can purchase a printed copy (hardcover or paperback) from official retailers.
Similar Books
Graph Databases For Beginners
So someone has heard about graph databases and wants to understand what all the buzz is about. Are they just a passing trend - here today and gone tomorrow - or are they a rising tide that businesses and development teams can't afford to ignore? Whether they're a business executive or a seasoned developer, something - perhaps a pressing business ch
Accelerating Data Pipeline Development
Today's data engineering teams are overwhelmed - juggling fire drills and endless requests while relying on manual, repetitive processes for building data pipelines. This much-needed tech guide from author Josh Hall introduces a practical approach to streamlining pipeline development, empowering teams to work smarter, not harder. Using Coalesce, a
Modern C
Modern C focuses on the new and unique features of modern C programming. The book is based on the latest C standards and offers an up-to-date perspective on this tried-and-true language. C is extraordinarily modern for a 50-year-old programming language. Whether you're writing embedded code, low-level system routines, or high-performance applicatio
Modern Data Visualization with R
Modern Data Visualization with R describes the many ways that raw and summary data can be turned into visualizations that convey meaningful insights. It starts with basic graphs such as bar charts, scatter plots, and line charts, but progresses to less well-known visualizations such as tree maps, alluvial plots, radar charts, mosaic plots, effects
Elements of Data Science
Elements of Data Science is an introduction to the practical skills of working with data, written for people with no programming experience. Concepts are explained clearly and concisely, and exercises in each chapter demonstrate the real-world use of each feature. - Step-by-Step Approach: Learn how to execute a data science project from start to fi
Data Mesh For Dummies
Data Mesh is a relatively new approach to data management. It combines several important trends in data management, including domain-driven design and data as a product, to decentralize the ownership of ingestion, processing, and serving of data. Zhamak Dehghani defined the term in 2019 as "a decentralized sociotechnical approach to share, access,