OOM in mappers
In some settings, the default input split size is too large for Hivemall, so an OutOfMemoryError can occur on mappers in the middle of training.
First, revise your Hadoop settings (mapred.child.java.opts / mapred.map.child.java.opts) to give each mapper as large a heap as possible.
If the OOM error still occurs after that, set a smaller mapred.max.split.size value before training.
This increases the number of mappers, so fewer training examples are fed to each trainer and the trained model is more likely to fit in memory.
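For example, the steps above can be applied from the Hive session before running the training query. The heap and split sizes below are illustrative; tune them for your cluster:

```sql
-- Give each mapper a larger JVM heap (illustrative value)
set mapred.child.java.opts=-Xmx2048m;
set mapred.map.child.java.opts=-Xmx2048m;

-- If OOM still occurs, shrink the input split size (e.g., 64 MB = 67108864 bytes)
-- so more mappers run and each trainer sees fewer examples
set mapred.max.split.size=67108864;
```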
OOM in shuffle/merge
If the OOM occurs during the merge step, try setting a larger mapred.reduce.tasks value before training and revise the shuffle/reduce parameters.
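A minimal sketch of the reduce-side remedy; the reducer count below is illustrative and should be sized to your cluster and data volume:

```sql
-- Spread the shuffle/merge load over more reducers (illustrative value)
set mapred.reduce.tasks=64;
```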
If the OOM happened when using amplify(), try rand_amplify() instead; it amplifies and shuffles rows within a map-local buffer, avoiding a large reduce-side shuffle.
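A sketch of the swap, assuming a hypothetical table train with columns (rowid, label, features), an amplification factor of 3, and a map-local shuffle buffer of 1000 rows (all values illustrative):

```sql
-- amplify() sends the amplified rows through a reduce-side shuffle, which can OOM:
--   select amplify(3, *) as (rowid, label, features) from train;

-- rand_amplify() amplifies and shuffles within each mapper's local buffer instead:
select rand_amplify(3, 1000, *) as (rowid, label, features) from train;
```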