FreeComputerBooks.com
Links to Free Computer, Mathematics, Technical Books all over the World
- Title: The Data Engineer's Guide to Apache Spark
- Author(s): Databricks
- Publisher: Databricks
- Paperback: N/A
- eBook: HTML
- Language: English
- ISBN-10: N/A
- ISBN-13: N/A
This book is for data engineers who want to take advantage of Apache Spark's rapid growth to build faster and more reliable data pipelines.
Apache Spark is a fast, scalable, and flexible open source distributed processing engine for big data systems, and one of the most active open source big data projects to date. This book helps you build practical big data solutions that draw on Spark's speed, scalability, simplicity, and versatility.
This book's straightforward, step-by-step approach shows you how to deploy, program, optimize, manage, integrate, and extend Spark, both now and for years to come. You'll discover how to create powerful solutions encompassing cloud computing, real-time stream processing, machine learning, and more. Every lesson builds on what you've already learned, giving you a solid foundation for real-world work.
About the Authors
- N/A
- Big Data
- Data Engineering and Data Science
- Data Analysis and Data Mining
- Non-relational/NoSQL Databases
- The Data Engineer's Guide to Apache Spark (Databricks)
- The Mirror Site (1) - PDF
- The Mirror Site (2) - PDF
- Book Homepage (Book Download, Resources, etc.)
- Learning Spark: Lightning-Fast Data Analytics (Jules Damji, et al.)
  This book shows data engineers and data scientists why structure and unification in Apache Spark matter. Specifically, it explains how to perform simple and complex data analytics and employ machine learning algorithms.
- Learning Apache Spark with Python (Wenqiang Feng)
  This book offers an introduction to the Apache Spark ecosystem. You will learn a wide array of PySpark concepts in Data Mining, Text Mining, Machine Learning, and Deep Learning.
- Mastering Spark with R: Large-Scale Analysis and Modeling
  With this practical book, data scientists and professionals working with large-scale data applications will learn how to use Spark from R to tackle big data and big compute problems.
- Big Data Processing with Apache Spark (Srini Penchikala)
  Learn about the Apache Spark framework and develop Spark programs for use cases in big-data analysis. It covers all the libraries that are part of the Spark ecosystem, including Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX.
- The Internals of Apache Spark (Jacek Laskowski)
  This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala.
- Graph Algorithms: Practical Examples in Apache Spark and Neo4j
  This book is a practical guide to getting started with graph algorithms for developers and data scientists who have experience using Apache Spark or Neo4j. You'll walk through hands-on examples that show you how to use graph algorithms in Apache Spark and Neo4j.
- Data Engineering Cookbook: The Plumbing of Data Science
  This is a practical and comprehensive guide. You will learn the basics of data engineering, then the technologies and frameworks required to build data pipelines that work with large datasets.
- Knowledge Graphs and Big Data Processing (Valentina Janev, et al.)
  Each chapter in this book addresses some pertinent aspect of the data processing chain, with a specific focus on understanding Enterprise Knowledge Graphs, Semantic Big Data Architectures, and Smart Data Analytics solutions.
- Kafka: The Definitive Guide: Real-Time Data and Stream Processing
  Through detailed examples, you'll learn Kafka's design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
- Artificial Intelligence for Big Data (Anand Deshpande, et al.)
  You will learn to use machine learning algorithms such as k-means, SVM, RBF, and regression to perform advanced data analysis. You will understand the current status of machine and deep learning techniques to work on genetic and neuro-fuzzy algorithms.
- Hadoop with Python (Zachary Radtka, et al.)
  This book takes you through the basic concepts behind Hadoop, MapReduce, Pig, and Spark. Then, through multiple examples and use cases, you'll learn how to work with these technologies by applying various Python tools.