Principles of Data Science


CC BY-NC-SA

Book Details

Authors Shaun V. Ault, Soohyun Nam Liao, Larry Musolino
Publisher OpenStax
Published 2025
Edition 1
Paperback 573 pages
Language English
ISBN-13 9798385161850, 9798385161867, 9781961584600
ISBN-10 1961584603
License Creative Commons Attribution-NonCommercial-ShareAlike

Book Description

Principles of Data Science is intended to support one- or two-semester courses in data science. It is appropriate for data science majors and minors as well as students concentrating in business, finance, health care, engineering, the sciences, and a number of other fields where data science has become critically important.

The authors have included a diverse mix of scenarios, examples, and data types for analysis and discussion. These include both fictional contexts and real-world sources, such as the Federal Reserve Economic Database and Nasdaq. Data sets cover a range of topics: business, science, and the social sciences. Applications include healthcare, physical sciences, demographics, policy, and finance. Data ethics and the emergence of artificial intelligence are covered in depth, both in their own chapters and as consistent threads throughout the course material.

The authors and contributors have developed rich in-chapter example problems and extensive practice exercises that encourage students to apply concepts in a variety of situations. Technical illustrations and Python code support and supplement the principles and theory. The text also includes direct links to downloadable data sets and Python code, as well as guidance on how to use them.


This book is available under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA), which means that you are free to copy, distribute, and modify it, as long as you credit the original authors, don't use it for commercial purposes, and share any adaptations under the same license.

If you enjoyed the book and would like to support the authors, you can purchase a printed copy (hardcover or paperback) from official retailers.

Similar Books


Data Science at the Command Line, 2nd Edition

Unix

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed wit

Introduction to Data Science

R

Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data vi

R for Data Science

R Analysis

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data

Python Data Science Handbook

Python Pandas

For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all - IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other relate

Data Science with Microsoft SQL Server 2016

SQL R

R is one of the most popular, powerful data analytics languages and environments in use by data scientists. Actionable business data is often stored in Relational Database Management Systems (RDBMS), and one of the most widely used RDBMS is Microsoft SQL Server. Much more than a database server, it's a rich ecostructure with advanced analytic capab

OpenIntro Statistics, 4th Edition

Statistics

OpenIntro Statistics provides a traditional college-level introduction to the field of statistics. This widely adopted textbook offers an exceptional and accessible foundation for a diverse range of students, from those at community colleges to attendees of Ivy League institutions. It is estimated that approximately 20,000 students use this thoroug