Data Science & AI Books


Understanding ETL

Extract, transform, load (ETL) is at the center of every application of data, from business intelligence to AI. Constant shifts in the data landscape, including the implementation of lakehouse architectures and the importance of high-scale real-time data, mean that today's data practitioners must approach ETL a bit differently. This updated tech

Intermediate Statistics with R

R, Statistics

Introductory statistics courses prepare students to think statistically but cover relatively few statistical methods. Building on the basic statistical thinking emphasized in an introductory course, a second course in statistics at the undergraduate level can explore a large number of statistical methods. This text covers more advanced graphical su

Learning Analytics Methods and Tutorials

R

This open access comprehensive methodological book offers a much-needed answer to the lack of resources and methodological guidance in learning analytics, which has been a problem ever since the field started. The book covers all important quantitative topics in education at large as well as the latest in learning analytics and education data minin

Essential GraphRAG

A Retrieval Augmented Generation (RAG) system automatically selects and supplies domain-specific context to an LLM, radically improving its ability to generate accurate, hallucination-free responses. The GraphRAG pattern employs a knowledge graph to structure the RAG's input, taking advantage of existing relationships in the data to generate rich,

Delta Lake: The Definitive Guide

Ready to simplify the process of building data lakehouses and data pipelines at scale? In this practical guide, learn how Delta Lake is helping data engineers, data scientists, and data analysts overcome key data reliability challenges with modern data engineering and management techniques. Authors Denny Lee, Tristen Wentling, Scott Haines, and Pra

The AI Ladder

AI may be the greatest opportunity of our time, with the potential to add nearly $16 trillion to the global economy over the next decade. But so far adoption has been much slower than anticipated. With this practical report, business leaders will discover where they are in their AI journey and learn the steps they still need to take to implement an

Mastering Shiny

R, Shiny

Master the Shiny web framework and take your R skills to a whole new level. By letting you move beyond static reports, Shiny helps you create fully interactive web apps for data analyses. Users will be able to jump between datasets, explore different subsets or facets of the data, run models with parameter values of their choosing, customize visu

Tidy Modeling with R

R

Get going with tidymodels, a collection of R packages for modeling and machine learning. Whether you're just starting out or have years of experience with modeling, this practical introduction shows data analysts, business analysts, and data scientists how the tidymodels framework offers a consistent, flexible approach for your work. RStudio engine

Data Science at the Command Line, 2nd Edition

Unix

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed wit

Deep Learning for Coders with Fastai and PyTorch

Python

Deep learning is often viewed as the exclusive domain of math PhDs and big tech companies. But as this hands-on guide demonstrates, programmers comfortable with Python can achieve impressive results in deep learning with little math background, small amounts of data, and minimal code. How? With fastai, the first library to provide a consistent inte