Deep Learning with DeepLearning4J
This example shows how to run a DeepLearning4J example on a Spark cluster. The example trains a multi-layer neural network to recognize handwritten digits from the MNIST dataset.
1. SSH to the master node. Log in as user “centos” with your key pair.
2. Download the MNIST dataset
mkdir MNIST
wget http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
wget http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
mv train-images-idx3-ubyte.gz images-idx3-ubyte.gz
mv train-labels-idx1-ubyte.gz labels-idx1-ubyte.gz
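Interrupted downloads leave truncated archives that only fail later, during training. The following is a minimal sketch of an integrity check, run from the directory that holds the downloaded files. The function name check_gz_files is hypothetical and not part of the original instructions. It relies only on gzip -t, which tests an archive without extracting it.

```shell
# Verify that each given .gz archive exists and passes gzip's integrity test.
# Returns 0 only if every file is present and intact.
# check_gz_files is a hypothetical helper name used for illustration.
check_gz_files() {
    status=0
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            status=1
        elif gzip -t "$f" 2>/dev/null; then
            echo "ok: $f"
        else
            echo "corrupt: $f"
            status=1
        fi
    done
    return "$status"
}

# Example call, using the file names left in place by the commands above:
# check_gz_files images-idx3-ubyte.gz labels-idx1-ubyte.gz \
#     t10k-images-idx3-ubyte.gz t10k-labels-idx1-ubyte.gz
```

If any file is reported as missing or corrupt, download it again before continuing.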
3. Download Maven
cd $HOME
wget http://www-eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar -zxvf apache-maven-3.3.9-bin.tar.gz
4. Download and compile the DeepLearning4J example programs
cd $HOME
git clone https://github.com/deeplearning4j/dl4j-examples
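Open dl4j-examples/pom.xml and check that java.version is set to 1.8 and scala.binary.version is set to 2.11.
In the same file, change the suffix of the dl4j.spark.version and datavec.spark.version values from _1 to _2 so that the build uses the Spark 2 artifacts.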
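The pom.xml check and edit can also be done from the shell. The sketch below assumes that the two Spark properties have values ending in _spark_1 (for example 0.9.1_spark_1); look at your copy of the file first, because the exact values are not shown here. The function names show_build_properties and switch_spark_suffix are hypothetical.

```shell
# Print the property lines that this step asks you to check or change.
# show_build_properties is a hypothetical helper name.
show_build_properties() {
    grep -E '<(java\.version|scala\.binary\.version|dl4j\.spark\.version|datavec\.spark\.version)>' "$1"
}

# Change the suffix _spark_1 to _spark_2, but only on the lines that define
# dl4j.spark.version and datavec.spark.version.
# Assumption: the values end in "_spark_1"; other lines are left untouched.
switch_spark_suffix() {
    pom="$1"
    # -i.bak edits the file in place and keeps the original as <file>.bak
    sed -i.bak -E \
        -e '/<(dl4j|datavec)\.spark\.version>/ s/_spark_1</_spark_2</' \
        "$pom"
}

# Example calls:
# show_build_properties "$HOME/dl4j-examples/pom.xml"
# switch_spark_suffix "$HOME/dl4j-examples/pom.xml"
```

Run show_build_properties again after the edit to confirm that both values now end in _2.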
Then, compile the dl4j example programs.
cd $HOME/dl4j-examples/dl4j-spark-examples/dl4j-spark
$HOME/apache-maven-3.3.9/bin/mvn package
If the build finishes without errors, the jar files are created under the target directory. List the directory and note the version in the name of dl4j-spark-<version>-bin.jar.
ls target
classes generated-sources maven-status
dl4j-spark-0.9.1-bin.jar maven-archiver
dl4j-spark-0.9.1.jar maven-lint-result.xml
5. Submit the MNIST training job
Pass the dl4j-spark-<version>-bin.jar file found in the target directory to spark-submit.
rm -rf $HOME/MNIST
spark-submit --class org.deeplearning4j.mlp.MnistMLPExample \
target/dl4j-spark-0.9.1-bin.jar \
-useSparkLocal false
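Because the jar name changes with the DL4J version, typing it by hand is easy to get wrong. As a sketch, the jar can be located with a shell glob instead. The function name find_bin_jar is hypothetical; the spark-submit arguments in the example call are the same ones used above.

```shell
# Print the path of the first dl4j-spark-<version>-bin.jar in a directory
# (default: target). Returns non-zero if no such jar exists.
# find_bin_jar is a hypothetical helper name used for illustration.
find_bin_jar() {
    dir="${1:-target}"
    for jar in "$dir"/dl4j-spark-*-bin.jar; do
        if [ -f "$jar" ]; then
            echo "$jar"
            return 0
        fi
    done
    echo "no dl4j-spark-*-bin.jar found in $dir" >&2
    return 1
}

# Example call, submitting only if a jar was found:
# JAR=$(find_bin_jar target) && spark-submit \
#     --class org.deeplearning4j.mlp.MnistMLPExample \
#     "$JAR" \
#     -useSparkLocal false
```

If the build produced no bin jar, the function prints a message to standard error and the job is not submitted.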
The example program may take 10-20 minutes to finish, depending on the cluster size and configuration. An example result is shown below.
... skip ...
Examples labeled as 9 classified by model as 6: 3 times
Examples labeled as 9 classified by model as 7: 57 times
Examples labeled as 9 classified by model as 8: 15 times
Examples labeled as 9 classified by model as 9: 5766 times
==========================Scores========================================
# of classes: 10
Accuracy: 0.9818
Precision: 0.9818
Recall: 0.9816
F1 Score: 0.9817
Precision, recall & F1: macro-averaged (equally weighted avg. of 10 classes)
========================================================================
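The summary scores are easy to lose in a long job log. The sketch below pulls the four score lines out of a saved log, assuming the job output was captured to a file (for example by piping spark-submit through tee; the file name mnist-train.log is hypothetical) and that the score lines have the "Name: value" layout shown above. The function name extract_scores is also hypothetical.

```shell
# Print "name=value" for each score line found in a saved job log.
# Only lines that start with one of the four score names followed by a colon
# are matched, so the "Precision, recall & F1: macro-averaged" note is skipped.
# extract_scores is a hypothetical helper name used for illustration.
extract_scores() {
    awk -F': *' '
        /^[ \t]*(Accuracy|Precision|Recall|F1 Score):/ {
            name = $1
            gsub(/^[ \t]+/, "", name)   # drop leading indentation
            print name "=" $2
        }
    ' "$1"
}

# Example call:
# extract_scores mnist-train.log
```

Each score is printed on its own line as name=value, which makes runs with different cluster settings easy to compare.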