FreeComputerBooks.com
Links to Free Computer, Mathematics, Technical Books all over the World
- Title: The Internals of Apache Spark
- Author(s): Jacek Laskowski
- Publisher: japila-books
- Paperback: N/A
- eBook: HTML
- Language: English
- ISBN-10: N/A
- ISBN-13: N/A
This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala. This edition includes new information on Spark SQL, Spark Streaming, setup, and Maven coordinates.
Updated to cover Spark 3.x, this book shows data engineers and data scientists why structure and unification in Spark matter. Specifically, it explains how to perform simple and complex data analytics and employ machine learning algorithms.
- How Spark SQL's new interfaces improve performance over Spark's RDD data structure
- The choice between data joins in Core Spark and Spark SQL
- Techniques for getting the most out of standard RDD transformations
- How to work around performance issues in Spark's key/value pair paradigm
- Writing high-performance Spark code without Scala or the JVM
- How to test for functionality and performance when applying suggested improvements
- Using Spark MLlib and Spark ML machine learning libraries
- Spark's Streaming components and external community packages
- Jacek Laskowski is an independent consultant who is passionate about software development and about teaching people to use Apache Spark, Scala, sbt, and Apache Kafka effectively (with a bit of Hadoop YARN, Apache Mesos, and Docker).
- The Internals of Apache Spark (Jacek Laskowski)
- The Internals of Spark Structured Streaming (Jacek Laskowski)
- Mastering Spark with R: Large-Scale Analysis and Modeling
  With this practical book, data scientists and professionals working with large-scale data applications will learn how to use Spark from R to tackle big data and big compute problems.
- Big Data Processing with Apache Spark (Srini Penchikala)
  Learn about the Apache Spark framework and develop Spark programs for use cases in big-data analysis. It covers all the libraries that are part of Spark ecosystem, which includes Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX.
- The Data Engineer’s Guide to Apache Spark (Databricks)
  This book is for data engineers looking to leverage the immense growth of Apache Spark to build faster and more reliable data pipelines. It leverages Spark's amazing speed, scalability, simplicity, and versatility to build practical Big Data solutions.
- Graph Algorithms: Practical Examples in Apache Spark and Neo4j
  This book is a practical guide to getting started with graph algorithms for developers and data scientists who have experience using Apache Spark or Neo4j. You'll walk through hands-on examples that show you how to use graph algorithms in Apache Spark/Neo4j.
- Knowledge Graphs and Big Data Processing (Valentina Janev, et al)
  Each chapter in this book addresses some pertinent aspect of the data processing chain, with a specific focus on understanding Enterprise Knowledge Graphs, Semantic Big Data Architectures, and Smart Data Analytics solutions.
- Kafka: The Definitive Guide: Real-Time Data and Stream Processing
  Through detailed examples, you'll learn Kafka's design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.
- Artificial Intelligence for Big Data (Anand Deshpande, et al)
  You will learn to use machine learning algorithms such as k-means, SVM, RBF, and regression to perform advanced data analysis. You will understand the current status of machine and deep learning techniques to work on genetic and neuro-fuzzy algorithms.
- Hadoop with Python (Zachary Radtka, et al)
  This book takes you through the basic concepts behind Hadoop, MapReduce, Pig, and Spark. Then, through multiple examples and use cases, you'll learn how to work with these technologies by applying various Python tools.