Mahout and Machine Learning Training Course is Here


Author: Koji Sekiguchi

  • This training course is currently offered in Tokyo, in Japanese only. We are, however, considering offering the course overseas, as we believe its contents will interest a wider audience.

    Please contact me if you are interested and would like to help me provide the course overseas.

  • Many enterprises are tackling the task of analyzing big data to gain new insights. In the course of this challenge, machine learning is drawing their attention as an essential technique.

    RONDHUIT would now like to announce a machine learning training course that uses Apache Mahout.

    Click to go to the “Machine Learning Using Apache Mahout” page.

    Machine Learning Using Apache Mahout is a training course that consists mainly of hands-on sessions. The course systematically organizes the basics of machine learning and uses Mahout as needed. Each unit provides practical exercises that you can solve to improve your understanding and gain a foothold in applying the knowledge to real work.

    Training Course Features

    • Our textbooks use plenty of graphics and charts so that you can study the material effectively in a limited amount of time. In addition, the detailed notes provided on each page will be very useful for self-study outside the class.
    • By working through the easy-to-tackle exercises in each unit together with the instructor, you will acquire both an understanding of the background theory and practical knowledge.

    Here is one of the exercises:

    Use the following two methods to show that, among rectangles with a fixed perimeter, the square has the largest area.

    • A method that eliminates one variable by substitution using the constraint equation.
    • A method that uses a Lagrange multiplier.
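
    As an illustration of what the two approaches look like, here is a brief sketch of the derivation. The symbols are assumptions introduced for this sketch and are not taken from the course textbook: x and y are the side lengths, P is the fixed perimeter, A is the area, and λ is the Lagrange multiplier.

    ```latex
    % Setup (notation assumed for this sketch):
    % maximize A = xy subject to 2x + 2y = P, with x, y > 0.

    % Method 1: eliminate y using the constraint.
    \begin{align*}
      y &= \frac{P}{2} - x, \\
      A(x) &= x\left(\frac{P}{2} - x\right)
            = -\left(x - \frac{P}{4}\right)^{2} + \frac{P^{2}}{16}.
    \end{align*}
    % A(x) is largest when x = P/4, which gives y = P/2 - P/4 = P/4.

    % Method 2: Lagrange multiplier.
    \begin{align*}
      \Lambda(x, y, \lambda) &= xy - \lambda\,(2x + 2y - P), \\
      \frac{\partial \Lambda}{\partial x} &= y - 2\lambda = 0, \\
      \frac{\partial \Lambda}{\partial y} &= x - 2\lambda = 0, \\
      \frac{\partial \Lambda}{\partial \lambda} &= -(2x + 2y - P) = 0.
    \end{align*}
    % The first two equations give x = y = 2\lambda; substituting into the
    % constraint gives 4x = P, so x = y = P/4 and \lambda = P/8.
    ```

    Both routes arrive at x = y = P/4, that is, a square with area P²/16.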

    This training course is recommended if you:

    • Want to systematically study the hot topic of machine learning and put your experience to good use in your development work.
    • Work in the information systems department of an enterprise that has big data, and need a minimum level of knowledge to commission a big data development project from an integrator.
    • Want to study machine learning from books but have trouble reading them because of the mathematical expressions.
    • Purchased “Mahout in Action” and are using Mahout but are not comfortable using the technology.

    Contents of Training Course

    This training course is a 2-day course.

    Day 1

    Day 1 of this 2-day course first looks at “What is Machine Learning?”, goes on to study pattern recognition and classification algorithms for supervised learning, and finishes with writing a handwritten character recognition program, for which all students take part in creating the handwritten character data. You will be amazed to find out how many handwritten characters the Mahout classifier recognizes!

    • Machine Learning and Apache Mahout
      • What is Machine Learning?
      • Model [Exercises]
      • What is Apache Mahout?
      • Installing Mahout [Exercises]
    • Pattern Recognition
      • What is Pattern Recognition?
      • Feature (or Attribute) Vector
      • Various Distance Measures [Exercises]
      • Prototypes and Learning Data
    • Classification
      • Nearest Neighbor Rule
      • k-NN Rule [Exercises]
      • Prototyping by Learning
      • Derivation of Discriminant Function
      • Perceptron Learning Rule [Exercises]
      • Averaged Perceptron
      • Problems of Perceptron Learning Rule
      • Widrow-Hoff Learning Rule [Exercises]
      • Neural Network [Exercises]
      • Support Vector Machine
      • Lagrange Multiplier [Exercises]
      • Decision Tree [Exercises]
      • Learning Decision Tree [Exercises]
      • Naive Bayes Classifier [Exercises]
      • Multivariate Bernoulli Model [Exercises]
      • Extension to Multiclass Classification
    • Programming Handwritten Character Recognition
      • Structure of Handwritten Character Recognition Program
      • Creating Learning Data [Exercises]
      • Facts of Handwritten Character Recognition [Exercises]

    Day 2

    Day 2 of this 2-day course first looks at the functions other than classification that Mahout provides (recommendation and clustering), and goes on to study principal component analysis for reducing the dimensionality of feature vectors, the evaluation of machine learning, and machine learning for natural language processing, all through exercises using Mahout.

    • Recommendation
      • What is Recommendation?
      • Information Retrieval and Recommendation
      • Types of Recommendation Architecture
      • User Profiles and Their Collection
      • Predicting Rating Values [Exercises]
      • Pearson Correlation Coefficient
      • Explanation in Recommendation
    • PageRank
      • Importance of Ranking
      • Rating Scale of Information Retrieval Systems – Theory and Practice
      • Vector Space Model [Exercises]
      • Score Calculation in Apache Lucene
      • PageRank [Exercises]
      • HITS
    • Clustering
      • What is Clustering?
      • Clustering Methods
      • K-Means [Exercises]
      • Nearest Neighbor Method [Exercises]
      • Evaluating and Analyzing Clustering Results
      • Similar Image Search – Apache alike
      • Information Retrieval and Clustering
    • Principal Component Analysis
      • Relationship between the Number of Learning Patterns and Dimensions [Exercises]
      • What is Principal Component Analysis?
      • Average and Variance [Exercises]
      • Covariance Matrix [Exercises]
      • Eigenvalue and Eigenvector [Exercises]
    • Evaluating Machine Learning
      • Evaluating and Analyzing Results
      • Partitioning Training Data
      • Precision and Recall [Exercises]
      • False Positive and False Negative [Exercises]
      • Evaluating Features
      • Within-Class Variance, Between-Class Variance
      • Bayes Error Rate
      • Feature Selection
    • Machine Learning in Natural Language Processing
      • What is Natural Language Processing?
      • Natural Language Processing for Lalognosis
      • Corpus
      • bag-of-words
      • N-gram Model [Exercises]
      • Sequential Labeling
      • Hidden Markov Model [Exercises]
      • Viterbi Algorithm [Exercises]
      • Introduction to NLP4L


    Familiarity with an editor such as vi or Emacs and knowledge of Linux commands are helpful, as the exercises require the use of an Ubuntu machine.


    • Please bring a notebook PC with a working ssh client installed. We can provide you with one if you don’t have access to a notebook PC.
    • Please bring a (mechanical) pencil and an eraser as some exercises involve hand calculations.

    We are looking forward to working with you!
