FreeComputerBooks.com
Links to Free Computer, Mathematics, Technical Books all over the World
- Title: Programming Pig
- Author(s): Alan F. Gates
- Publisher: O'Reilly Media (October 22, 2011)
- Hardcover/Paperback: 224 pages
- Language: English
- ISBN-10: 1449302645
- ISBN-13: 978-1449302641
This book is an ideal learning tool and reference for Apache Pig, the programming language that helps you describe and run large data projects on Hadoop. With Pig, you can analyze data without having to create a full-fledged application, making it easy to experiment with new data sets.
It shows newcomers how to get started, and teaches intermediate users the benefits of using Pig Latin, the data flow language for building and maintaining pipelines for processing data. Advanced users learn how to build complex data processing pipelines with Pig's macros and modularity features, and discover how to build systems for complex data processing needs by embedding Pig Latin into scripting languages.
- Learn the advantages and disadvantages of using Pig instead of MapReduce
- Understand how Pig fits in with other Hadoop components, such as HDFS, Hive, MapReduce, and HBase
- Follow examples that explain built-in Pig Latin functions, and data operators such as join and group
- Use grunt, the shell that Pig provides for exploring and working with HDFS
- Get performance tuning tips for running Pig Latin scripts on Hadoop clusters in less time
- Extend Pig with powerful user-defined functions written in Java or Python
- Alan Gates, a member of Yahoo's Pig development team, is responsible for the company's implementation of the language, including programming interfaces and the overall design. He has presented Pig at numerous conferences, user groups, and universities, and at companies using Pig. Alan oversaw the rewriting of nearly the entire code base when Pig moved from a research project to a production project.
- Big Data Processing with Apache Spark (Srini Penchikala)
Learn about the Apache Spark framework and develop Spark programs for use cases in big-data analysis. It covers all the libraries that are part of the Spark ecosystem, including Spark Core, Spark SQL, Spark Streaming, Spark MLlib, and Spark GraphX.
- The Internals of Apache Spark (Jacek Laskowski)
This book introduces Apache Spark, the open source cluster computing system that makes data analytics fast to write and fast to run. With Spark, you can tackle big datasets quickly through simple APIs in Python, Java, and Scala.
- The Data Engineer's Guide to Apache Spark (Databricks)
This book is for data engineers looking to leverage the immense growth of Apache Spark to build faster and more reliable data pipelines. It leverages Spark's amazing speed, scalability, simplicity, and versatility to build practical Big Data solutions.
- Redis for Dummies, Limited Edition (Steve Suehring)
Discover Redis data structures and modules. Create applications with Redis. Learn about Redis use cases. Go beyond the basics to understand how Redis is powering the real world with use cases such as caching, session stores, geospatial indexing, full-text search, ...
- Graph Databases: New Opportunities for Connected Data
This book provides a practical foundation for those who want to apply graph databases to real-world business solutions. You'll learn why graph databases are useful, where they're applicable, and how to design and implement solutions that use them.
- O'Reilly® CouchDB: The Definitive Guide (J. Chris Anderson, et al.)
This book teaches the fundamentals of one of the most powerful database engines ever created for the price of a good lunch.
- O'Reilly® HBase: The Definitive Guide (Lars George)
This book will show you how Apache HBase can fulfill your needs. It provides the details you require, whether you simply want to evaluate this high-performance, non-relational database, or put it into practice right away.
- Apache Cassandra Succinctly (Marko Svaljek)
Step outside the relational world and learn how to store data with Apache Cassandra, the massively popular NoSQL distributed database system. You will be able to store and model data using the Cassandra Query Language, and use Cassandra within your own apps.
- Firebase Essentials (Neil Smyth)
This book provides everything you need to successfully integrate Firebase cloud features into your Android apps. The book is organized into chapter groups that focus on specific Firebase features, with each topic area consisting of a detailed overview.