Text Mining with R
A Tidy Approach
Book Details
| Authors | Julia Silge, David Robinson |
| Publisher | O'Reilly Media |
| Published | 2017 |
| Edition | 1st |
| Paperback | 194 pages |
| Language | English |
| ISBN-13 | 9781491981658 |
| ISBN-10 | 1491981652 |
| License | Creative Commons Attribution-NonCommercial-ShareAlike |
Book Description
Much of the data available today is unstructured and text-heavy, making it challenging for analysts to apply their usual data wrangling and visualization tools. With this practical book, you'll explore text-mining techniques with tidytext, a package that authors Julia Silge and David Robinson developed using the tidy principles behind R packages like ggraph and dplyr. You'll learn how tidytext and other tidy tools in R can make text analysis easier and more effective.
The authors demonstrate how treating text as data frames enables you to manipulate, summarize, and visualize characteristics of text. You'll also learn how to integrate natural language processing (NLP) into effective workflows. Practical code examples and data explorations will help you generate real insights from literature, news, and social media.
- Learn how to apply the tidy text format to NLP;
- Use sentiment analysis to mine the emotional content of text;
- Identify a document's most important terms with frequency measurements;
- Explore relationships and connections between words with the ggraph and widyr packages;
- Convert back and forth between R's tidy and non-tidy text formats;
- Use topic modeling to classify document collections into natural groups;
- Examine case studies that compare Twitter archives, dig into NASA metadata, and analyze thousands of Usenet messages.
This book is available under a Creative Commons Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA), which means that you are free to copy, distribute, and modify it, as long as you credit the original authors, don't use it for commercial purposes, and share any adaptations under the same license.
If you enjoyed the book and would like to support the authors, you can purchase a printed copy (hardcover or paperback) from official retailers.
Similar Books
Clinical Text Mining
This open access book describes the results of natural language processing and machine learning methods applied to clinical text from electronic patient records. It is divided into twelve chapters. Chapters 1-4 discuss the history and background of the original paper-based patient records, their purpose, and how they are written and structured. The
Introduction to Data Science
Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data vi
High-Performance Caching with Nginx and Nginx Plus
One of NGINX's most important capabilities is content caching, which is a highly effective method for improving a website's performance. In this ebook, the authors describe how NGINX caches content, how to implement caching and cache clustering, and some of the ways to improve performance. The text provides a deep dive into how content caching truly wo
R for Data Science
Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data
Natural Language Processing with Transformers
Since their introduction in 2017, transformers have quickly become the dominant architecture for achieving state-of-the-art results on a variety of natural language processing tasks. If you're a data scientist or coder, this practical book, now revised in full color, shows you how to train and scale these large models using Hugging Face Transformer
Ruby Regexp
Scripting and automation tasks often need to extract particular portions of text from input data or modify them from one format to another. This book will help you learn Regular Expressions, a mini-programming language for all sorts of text processing needs. The book heavily leans on examples to present features of regular expressions one by one. I