A new book from Manning, Hadoop in Practice, is definitely the most modern book. Hadoop in Action, Second Edition, provides a comprehensive introduction to Hadoop and shows you how to write programs in the MapReduce style. It starts with a few easy examples and then moves quickly to show Hadoop use in more complex data analysis tasks.
Why “Hadoop in Action”? What is Hadoop? Understanding distributed systems and Hadoop. Comparing SQL databases and Hadoop. In the four years after the publication of Hadoop in Action, interest in and ... In Hadoop in Action, 2nd Edition, we have deeply revised the original book to cover ...
Chuck Lam. Application Level Cryptography Tokenization, field-level encryption. Constructing the basic template of a MapReduce program. Understanding distributed systems and Hadoop. Appendix D: Accelerating SQL analytics with Tez.
The building blocks of Hadoop. Setting up SSH for a Hadoop cluster. Components of Hadoop. Working with files in HDFS. Anatomy of a MapReduce program. Writing basic MapReduce programs.
Getting the patent data set. Constructing the basic template of a MapReduce program. Improving performance with combiners. Advanced MapReduce.
Chaining MapReduce jobs. Joining data from different sources. Programming Practices. Developing MapReduce programs. Monitoring and debugging on a production cluster. Cookbook.
Passing job-specific parameters to your tasks. Probing for task-specific information.
Partitioning into multiple output files. Inputting from and outputting to a database. Keeping all output in sorted order. Managing Hadoop. Setting up parameter values for practical use. Recovering from a failed NameNode. Designing network layout and rack awareness. Scheduling jobs from multiple users.
Running Hadoop in the cloud. Introducing Amazon Web Services. Running MapReduce programs on EC2. Cleaning up and shutting down your EC2 instances. Programming with Pig. Thinking like a Pig. Learning Pig Latin through Grunt. Working with user-defined functions. Seeing Pig in action: example of computing similar patents. Hive and the Hadoop herd. Case studies. Converting 11 million image documents from the New York Times archive.
Recommending the best websites at StumbleUpon. Appendix A: HDFS file commands. About the Technology: Big data can be difficult to handle using traditional databases.
About the reader: This book requires basic Java skills.
We regret that Manning Publications will not be publishing this title. Table of Contents. Part 1: Introducing Hadoop. Why "Hadoop in Action"? Understanding distributed systems and Hadoop. Scale-out instead of scale-up.
Offline batch processing instead of online transactions. Understanding MapReduce. Scaling a simple program manually.
Scaling the same program in MapReduce.
The Hadoop Ecosystem. Apache ZooKeeper. Yet Another Resource Negotiator. Starting Hadoop. The building blocks of Hadoop. Setting up SSH for a Hadoop cluster. Define a common account. Distribute public key and validate logins.
Running Hadoop. Local standalone mode. Running Hadoop in the cloud. Introducing Amazon Web Services. Securing the Hadoop Platform.
Hadoop Security Weaknesses. Top 10 Security and Privacy Challenges in Hadoop. Additional Security Weaknesses. Hadoop Threat Model. Challenges and Threats in Hadoop Security.
Hadoop Security Framework. Data Management. Threat Modeling. Getting and Installing Kerberos. Application Level Cryptography Tokenization, field-level encryption. Network Security. Threat Model. Threat Model Development.
Components of Hadoop. Working with files in HDFS. Basic file commands. Reading and writing to HDFS programmatically. Anatomy of a MapReduce program. Hadoop data types. Word counting with predefined mapper and reducer classes.
Reading and writing. Writing basic MapReduce programs. Getting the patent data set.
The patent citation data. Constructing the basic template of a MapReduce program. MapReduce v1 and v2. Streaming in Hadoop.