Principles of Data Science
Book Details
| Authors | Shaun V. Ault, Soohyun Nam Liao, Larry Musolino |
| Publisher | OpenStax |
| Published | 2025 |
| Edition | 1st |
| Paperback | 573 pages |
| Language | English |
| ISBN-13 | 9798385161850, 9798385161867, 9781961584600 |
| ISBN-10 | 1961584603 |
| License | Creative Commons Attribution-NonCommercial-ShareAlike |
Book Description
Principles of Data Science is intended to support one- or two-semester courses in data science. It is appropriate for data science majors and minors as well as students concentrating in business, finance, health care, engineering, the sciences, and a number of other fields where data science has become critically important.
The authors have included a diverse mix of scenarios, examples, and data types for analysis and discussion. These include both fictional contexts and real-world sources, such as the Federal Reserve Economic Database and Nasdaq. Data sets span business, science, and the social sciences. Applications include health care, physical sciences, demographics, policy, and finance. Data ethics and the emergence of artificial intelligence are covered in depth, both in their own chapters and as consistent threads throughout the course material.
The authors and contributors have developed rich in-chapter example problems and extensive practice exercises that encourage students to apply concepts in a variety of situations. Technical illustrations and Python code support and supplement the principles and theory. The text also includes direct links to downloadable data sets and Python code, as well as guidance on how to use them.
This book is available under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA), which means that you are free to copy, distribute, and modify it, as long as you credit the original authors, don't use it for commercial purposes, and share any adaptations under the same license.
If you enjoyed the book and would like to support the authors, you can purchase a printed copy (hardcover or paperback) from official retailers.
Similar Books
Data Science at the Command Line, 2nd Edition
This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed wit
Introduction to Data Science
Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data vi
R for Data Science
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data
Python Data Science Handbook
For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data Science Handbook do you get them all - IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other relate
Data Science with Microsoft SQL Server 2016
R is one of the most popular, powerful data analytics languages and environments in use by data scientists. Actionable business data is often stored in Relational Database Management Systems (RDBMS), and one of the most widely used RDBMS is Microsoft SQL Server. Much more than a database server, it's a rich ecostructure with advanced analytic capab
Operating Systems and Infrastructure in Data Science
In data science, mastering a system environment with its tools and processes is essential to achieve even minimal productivity. Feeling alien to an environment, using the wrong tools, or combining the right tools in the wrong order can not only limit effectiveness but also yield wrong results. Hence, in this book, besides basic computer knowl