Data Science & AI Books


Artificial Intelligence and Librarianship, 3rd Edition

This open book provides a comprehensive and technically grounded exploration of Artificial Intelligence (AI) and Machine Learning (ML) within the context of modern librarianship and information science. Authored by Martin Frické, it moves beyond theoretical discussion to focus on practical applications and the underlying architecture of contempora

Evidence-based Software Engineering

R Analysis

This book discusses what is currently known about software engineering based on an analysis of all publicly available software engineering data. This aim is not as ambitious as it sounds because there is not a lot of data publicly available. The analysis is like a join-the-dots puzzle, except that the 600+ dots are not numbered, some of them are ac

Creating a Data-Driven Enterprise with DataOps

Many companies are busy collecting massive amounts of data, but few are taking advantage of this treasure hoard to build a truly data insights-driven organization. To do so, the data team must democratize both data and the insights in a way that provides real-time access to all employees in the organization. This report explores DataOps, the proces

Making Sense of Stream Processing

Kafka

How can event streams help make your application more scalable, reliable, and maintainable? In this report, O'Reilly author Martin Kleppmann shows you how stream processing can make your data storage and processing systems more flexible and less complex. Structuring data as a stream of events isn't new, but with the advent of open source projects s

Building Knowledge Graphs

Graph

Incredibly useful, knowledge graphs help organizations keep track of medical research, cybersecurity threat intelligence, GDPR compliance, web user engagement, and much more. They do so by storing interlinked descriptions of entities (objects, events, situations, or abstract concepts) and encoding the underlying information. How do you create a k

Statistics Done Wrong

Statistics

Scientific progress depends on good research, and good research needs good statistics. But statistical analysis is tricky to get right, even for the best and brightest of us. You'd be surprised how many scientists are doing it wrong. Statistics Done Wrong is a pithy, essential guide to statistical blunders in modern science that will show you how t

Financial Machine Learning

Financial Machine Learning surveys the nascent literature on machine learning in the study of financial markets. The authors highlight the best examples of what this line of research has to offer and recommend promising directions for future research. This survey is designed for both financial economists interested in grasping machine learning tool

Natural Language Processing with Transformers

Python

Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book, now revised in full color, shows you how to train and scale these large models using Hugging Face Transformer

Statistical Foundations of Actuarial Learning and its Applications

This open access book discusses the statistical modeling of insurance problems, a process which comprises data collection, data analysis and statistical model building to forecast insured events that may happen in the future. It presents the mathematical foundations behind these fundamental statistical concepts and how they can be applied in daily

Data Visualization with Category Theory and Geometry

This open access book provides a robust exposition of the mathematical foundations of data representation, focusing on two essential pillars of dimensionality reduction methods, namely geometry in general and Riemannian geometry in particular, and category theory. Presenting a list of examples consisting of both geometric objects and empirical data