Operating Systems and Infrastructure in Data Science



Book Details

Author: Josef Spillner
Publisher: vdf Hochschulverlag AG
Published: 2023
Edition: 1st
Paperback: 172 pages
Language: English
ISBN-13: 9783728141675, 9783728141682
ISBN-10: 3728141674, 3728141682
License: Creative Commons Attribution-NonCommercial-ShareAlike

Book Description

In data science, mastering a system environment with its tools and processes is essential to achieve even a minimum level of productivity. Feeling alien to an environment, using the wrong tools, or combining the right tools in the wrong order can not only limit effectiveness but also yield wrong results. Hence, in addition to basic computer knowledge and programming skills, this book empowers students of data science to assemble a battery of useful tools to employ in the right situation, ranging from small, versatile command-line tools to powerful online platforms. Compared to mastering a single programming language and thus controlling application logic in the small, something that fits into a few functions on the screen, this book advances the reader's skills towards programming in the large, beyond the boundaries of individual processes or machines. Programming in the large means defining and orchestrating complex data-centric processes involving multiple tools, platforms and resources.

The eventual goal for the reader is thus to be able to define data, models and code that should be provisioned and monitored as services in appropriate distributed infrastructures, from hosting data and models to running software in the cloud. As studying is only the first step towards the practical application of skills in a professional setting, this book should be a good starting point for students of data science and computer science, digital life sciences, digital mobility and similar curricula.


This book is available under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA), which means that you are free to copy, distribute, and modify it, as long as you credit the original author, don't use it for commercial purposes, and share any adaptations under the same license.

If you enjoyed the book and would like to support the author, you can purchase a printed copy (hardcover or paperback) from official retailers.


Similar Books


Data Science at the Command Line, 2nd Edition

Unix

This thoroughly revised guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You'll learn how to combine small yet powerful command-line tools to quickly obtain, scrub, explore, and model your data. To get you started, author Jeroen Janssens provides a Docker image packed wit...

Operating Systems and Middleware

C / C++

The scenario describes a user sitting down at a computer to check email. One of the messages includes an attached document to be edited. The user clicks the attachment, and it opens in another window. After starting to edit the document, the user realizes they need to leave for a trip. They save the document in its partially edited state and shut d...

How To Code in Go

Go

This book is designed to introduce you to writing programs with the Go programming language. You'll learn how to write useful tools and applications that can run on remote servers, or local Windows, macOS, and Linux systems for development. The topics that it covers include how to: install and set up a local Go development environment on Windows, ...

Introduction to Data Science

R

Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data vi...

R for Data Science

R Analysis

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data...

Managing Cloud Native Data on Kubernetes

Kubernetes Cloud

Is Kubernetes ready for stateful workloads? This open source system has become the primary platform for deploying and managing cloud native applications. But because it was originally designed for stateless workloads, working with data on Kubernetes has been challenging. If you want to avoid the inefficiencies and duplicative costs of having separa...