Introduction to Data Science

Data Analysis and Prediction Algorithms with R


Introduction to Data Science
Introduction to Data Science
CC BY-NC-SA

Book Details

Author Rafael A. Irizarry
Publisher CRC Press
Published 2019
Edition 1
Paperback 722 pages
Language English
ISBN-13 9780367357986
ISBN-10 0367357984
License Creative Commons Attribution-NonCommercial-ShareAlike

Book Description

Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data visualization, predictive algorithm building, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation.

This book is a textbook for a first course in data science. No previous knowledge of R is necessary, although some experience with programming may be helpful. The book is divided into six parts: R, data visualization, statistics with R, data wrangling, machine learning, and productivity tools. Each part has several chapters meant to be presented as one lecture.

The author uses motivating case studies that realistically mimic a data scientist's experience. He starts by asking specific questions and answers these through data analysis so concepts are learned as a means to answering the questions. Examples of the case studies included are: US murder rates by state, self-reported student heights, trends in world health and economics, the impact of vaccines on infectious disease rates, the financial crisis of 2007-2008, election forecasting, building a baseball team, image processing of hand-written digits, and movie recommendation systems.

The statistical concepts used to answer the case study questions are only briefly introduced, so complementing with a probability and statistics textbook is highly recommended for in-depth understanding of these concepts. If you read and understand the chapters and complete the exercises, you will be prepared to learn the more advanced concepts and skills needed to become an expert.


This book is available under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA), which means that you are free to copy, distribute, and modify it, as long as you credit the original author, don't use it for commercial purposes, and share any adaptations under the same license.

If you enjoyed the book and would like to support the author, you can purchase a printed copy (hardcover or paperback) from official retailers.

Download and Read Links

PDF

Share this Book

[localhost]# find . -name "*Similar_Books*"


Algorithms

Algorithms

Algorithms are the lifeblood of computer science. They are the machines that proofs build and the music that programs play. Their history is as old as mathematics itself. This book is a wide-ranging, idiosyncratic treatise on the design and analysis of algorithms, covering several fundamental techniques, with an emphasis on intuition and the proble

Data Structures and Algorithms

Algorithms

This book provides implementations of common and uncommon algorithms in pseudocode which is language independent and provides for easy porting to most imperative programming languages. It is not a definitive book on the theory of data structures and algorithms. For the most part this book presents implementations devised by the authors themselves b

Open Data Structures

Algorithms

Offered as an introduction to the field of data structures and algorithms, Open Data Structures covers the implementation and analysis of data structures for sequences (lists), queues, priority queues, unordered dictionaries, ordered dictionaries, and graphs. Focusing on a mathematically rigorous approach that is fast, practical, and efficient, Mor

Intel Galileo and Intel Galileo Gen 2

Arduino Linux Assembler C / C++ Java

Intel Galileo and Intel Galileo Gen 2: API Features and Arduino Projects for Linux Programmers provides detailed information about Intel Galileo and Intel Galileo Gen 2 boards for all software developers interested in Arduino and the Linux platform. The book covers the new Arduino APIs and is an introduction for developers on natively using Linux.

Data Science at the Command Line, 2nd Edition

Unix

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed wit

Computer Graphics from Scratch

JavaScript Scratch

Computer graphics programming books are often math-heavy and intimidating for newcomers. Not this one. Computer Graphics from Scratch takes a simpler approach by keeping the math to a minimum and focusing on only one aspect of computer graphics, 3D rendering. You'll build two complete, fully functional renderers: a raytracer, which simulates rays o