The Resource Apache Mahout Cookbook

Label
Apache Mahout Cookbook
Title
Apache Mahout Cookbook
Creator
Subject
Language
eng
Summary
In Detail: The rise of the Internet and social networks has created a new demand for software that can analyze large datasets, which can scale up to 10 billion rows. Apache Hadoop was created to handle such heavy computational tasks. Mahout gained recognition for providing data mining classification algorithms that can be used with this kind of dataset. "Apache Mahout Cookbook" provides a fresh, scope-oriented approach to the Mahout world for both beginners and advanced users. The book gives insight into how to write different data mining algorithms for use in the Hadoop environment and how to choose the one best suited to the task at hand. "Apache Mahout Cookbook" looks at the various Mahout algorithms available and gives the reader a fresh, solution-centered approach to solving different data mining tasks. The recipes start easy but get progressively more complicated. A step-by-step approach guides the developer through the different tasks involved in mining a huge dataset. You will also learn how to code your Mahout data mining algorithm to determine the best one for a particular task. In addition, a whole chapter is dedicated to loading data into Mahout from an external RDBMS. A lot of attention has also been given to using your data mining algorithm inside your own code so that it can be used in a Hadoop environment. Theoretical aspects of the algorithms are covered for information purposes, but every chapter is written to allow the developer to get into the code as quickly and smoothly as possible. This means that with every recipe, the book provides code that can be reused through Maven, as well as the Maven Mahout source code.
By the end of this book you will be able to code your own procedures for various data mining tasks with different algorithms, and to evaluate and choose the best ones for your tasks. Approach: "Apache Mahout Cookbook" uses over 35 recipes packed with illustrations and real-world examples to help beginners as well as advanced programmers get acquainted with the features of Mahout. Who this book is for: "Apache Mahout Cookbook" is great for developers who want a fresh and fast introduction to Mahout coding. No previous knowledge of Mahout is required, and even skilled developers or system administrators will benefit from the various recipes presented.
http://library.link/vocab/creatorName
Giacomelli, Piero
Dewey number
006.31
Index
no index present
Language note
English
Literary form
non fiction
Nature of contents
dictionaries
http://library.link/vocab/subjectName
  • Java (Computer program language)
  • Machine learning
  • Web site development
Label
Apache Mahout Cookbook
Instantiates
Publication
Contents
  • Cover; Copyright; Credits; About the Author; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Mahout is Not So Difficult!; Introduction; Installing Java and Hadoop; Setting up a Maven and NetBeans development environment; Coding a basic recommender; Chapter 2: Using Sequence Files -- When and Why?; Introduction; Creating sequence files from the command line; Generating sequence files from code; Reading sequence files from code; Chapter 3: Integrating Mahout with an External Datasource; Introduction; Importing an external datasource into HDFS
  • Using adaptive logistic regression in Java code; Using logistic regression on large-scale datasets; Using Random Forest to forecast market movements; Chapter 6: Canopy Clustering in Mahout; Introduction; Command-line-based Canopy clustering; Command-line-based Canopy clustering with parameters; Using Canopy clustering from the Java code; Coding your own cluster distance evaluation; Chapter 7: Spectral Clustering in Mahout; Introduction; Using EigenCuts from the command line; Using EigenCuts from Java code; Creating a similarity matrix from raw data
  • Using spectral clustering with image segmentation; Chapter 8: K-means Clustering; Introduction; Using K-means clustering from Java code; Clustering traffic accidents using K-means; K-means clustering using MapReduce; Using K-means clustering from the command line; Chapter 9: Soft Computing with Mahout; Introduction; Frequent Pattern Mining with Mahout; Creating metrics for Frequent Pattern Mining; Using Frequent Pattern Mining from Java code; Using LDA for creating topics; Chapter 10: Implementing the Genetic Algorithm in Mahout; Introduction; Setting up Mahout for using GA
  • Using the genetic algorithm over graphs; Using the genetic algorithm from Java code; Index
Control code
ocn867317377
Dimensions
unknown
Extent
1 online resource (250 pages)
Form of item
online
Isbn
9781849518031
Specific material designation
remote
System control number
(OCoLC)867317377

Library Locations

    • Internet
      Albany, Auckland, 0632, NZ