How to develop a deep belief multilayer neural network in deeplearning4j

In my previous tutorial I discussed how to set up the dependency libraries for your deeplearning4j project with Maven. In this post I will discuss how to set up a deep belief multilayer neural network in dl4j to recognize fraud patterns.

First, create a new Java class with any name you wish. In this case I will name it FraudDetectorNeuralNet. Add the following code inside the class:

package org.neuralnetwork;

import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;
import org.apache.spark.api.java.JavaRDD;
import org.canova.api.records.reader.RecordReader;
import org.canova.api.records.reader.impl.CSVRecordReader;
import org.deeplearning4j.eval.Evaluation;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.Updater;
import org.deeplearning4j.nn.conf.layers.*;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.api.IterationListener;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.lossfunctions.LossFunctions;

import java.util.Collections;

/**
 * Deep multilayer neural network to detect fraud.
 */
public class FraudDetectorNeuralNet {

    private static final Log log = LogFactory.getLog(FraudDetectorNeuralNet.class);
    private int outputNum = 2;
    private int iterations = 1;
    private MultiLayerNetwork model = null;

    public FraudDetectorNeuralNet() {
        // Nothing to set up here; the network itself is created in buildModel().
    }

    public void buildModel() throws NeuralException {

        log.info("load model....");
        try {
            log.info("build model...");
            MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
                    .seed(12345)
                    .iterations(iterations)
                    .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                    .learningRate(0.001)
                    .l1(0.01).regularization(true).l2(1e-3)
                    .list()
                    .layer(0, new DenseLayer.Builder().nIn(3).nOut(2)
                            .activation("tanh")
                            .weightInit(WeightInit.XAVIER)
                            .build())
                    .layer(1, new DenseLayer.Builder().nIn(2).nOut(2)
                            .activation("tanh")
                            .weightInit(WeightInit.XAVIER)
                            .build())
                    .layer(2, new OutputLayer.Builder(LossFunctions.LossFunction.MSE)
                            .weightInit(WeightInit.XAVIER)
                            .activation("softmax")
                            .nIn(2).nOut(outputNum).build())
                    .backprop(true).pretrain(false)
                    .build();

            model = new MultiLayerNetwork(conf);
            model.init();
            //model.setUpdater(Updater.ADAGRAD);
        } catch (Exception e) {
            log.error("Error occurred while building neural network :"+e.getMessage());
            throw new NeuralException(e.getLocalizedMessage(),e);
        }
    }

    public void trainModel() throws NeuralException {

        try {
            RecordReader recordReader = new CSVRecordReader(0,",");
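            // Number of local Spark worker threads; using every available core is an assumption.
            int nCores = Runtime.getRuntime().availableProcessors();
            SparkDataSet sparkDataSet = SparkDataSet.getInstance("fraud_data","local["+nCores+"]",3,2);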
            JavaRDD<DataSet> rddDataSet = sparkDataSet.generateTrainingDataset("hdfs://localhost:9000/user/asantha/fraud_data/data.txt");
            log.info("Train model...");
            if(model== null){

                buildModel();
            }
            // Print the score after every iteration.
            model.setListeners(Collections.singletonList((IterationListener) new ScoreIterationListener(1)));
            SparkDl4jMultiLayer network = new SparkDl4jMultiLayer(sparkDataSet.getSc(),model);
            int nEpochs = 5;
            for(int i=0;i<nEpochs;i++){

                model = network.fitDataSet(rddDataSet);
            }
            Evaluation evaluation = network.evaluate(rddDataSet);
            System.out.println(evaluation.stats());
        } catch (Exception e) {

            log.error("Error occurred while training neural network :"+e.getMessage());
            throw new NeuralException(e.getLocalizedMessage(),e);
        }
    }

    public void detectFraud(DataSet input) throws NeuralException {

        if(model == null){

            buildModel();
        }
        // Feed the input features (not the labels) through the network.
        log.info("output :"+model.output(input.getFeatures()));
    }

    public static void main(String[] args) {

        FraudDetectorNeuralNet model = new FraudDetectorNeuralNet();
        try {
            model.buildModel();
            model.trainModel();
        } catch (NeuralException e) {
            e.printStackTrace();
        }
    }


}

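Note :- At this point you will get an error for the undefined class SparkDataSet. Leave it as it is; we will create SparkDataSet in the next post. The code also references a NeuralException class that is not shown here; it is assumed to be a simple custom exception in the same package.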

I’m not going to describe every aspect of the code, because you can find detailed documentation on the official dl4j documentation page. Instead, I will cover some important features you need to concentrate on when developing a dl4j neural network.

There are a few key aspects to consider in any neural network.

Learning rate

The learning rate is a very important setting for any neural network. To train well and increase the accuracy of the output, you have to choose a good learning rate. Basically, a higher learning rate makes learning faster, but prediction accuracy tends to be much lower. A smaller learning rate makes training more time consuming, but accuracy is usually much better than in the former case. Therefore we have to define the learning rate carefully, neither too big nor too small, to get an efficient neural network. Most of the time a suitable learning rate is in the 0.001 to 0.01 range.
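To get a feel for this trade-off, here is a minimal sketch in plain Java, without dl4j. It runs gradient descent on a toy function, f(w) = (w - 3)^2, which is my own illustrative choice and not part of the fraud detector. The class name, the function and the step count are all assumptions made only for this example.

```java
public class LearningRateDemo {

    // Runs plain gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3,
    // and returns the value of w after the given number of steps.
    static double descend(double learningRate, int steps) {
        double w = 0.0; // arbitrary starting point
        for (int i = 0; i < steps; i++) {
            double gradient = 2.0 * (w - 3.0); // derivative of (w - 3)^2
            w -= learningRate * gradient;
        }
        return w;
    }

    public static void main(String[] args) {
        // Illustrative rates only: two small, one moderate, one deliberately too large.
        double[] rates = {0.001, 0.01, 0.1, 1.1};
        for (double rate : rates) {
            System.out.println("learningRate=" + rate + " -> w=" + descend(rate, 100));
        }
    }
}
```

With 100 steps, the small rates are still far from the minimum, the moderate rate lands very close to 3, and the rate above 1 overshoots more on every step and moves away from the minimum. A real network behaves in a more complicated way, so treat this only as intuition.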

Optimization Algorithm

The optimization algorithm is also a big factor in training the neural network accurately and predicting correct results. There are four optimization algorithms available in dl4j, listed below. It’s hard to say which one is best, because it always depends on the requirement and the type of neural network you use. For more information about dl4j optimization algorithms, you can refer to their documentation.

Stochastic gradient descent
Stochastic gradient descent with line search
Conjugate gradient line search
L-BFGS

Activation Function

The activation function is one of the major components of a neural network, and the accuracy of the network depends directly on it. There are many different activation functions available in dl4j. Based on the type of neural network and the dataset, we can decide which activation function suits the network best, and it is possible to use different activation functions in different layers. For the output layer, the softmax activation function is recommended.
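The two activation functions used in the configuration above, tanh for the hidden layers and softmax for the output layer, can be sketched in plain Java as follows. This is not how dl4j implements them internally; the class name and the raw scores in main are made up for illustration.

```java
public class ActivationDemo {

    // tanh squashes any real input into the range (-1, 1).
    static double tanh(double x) {
        return Math.tanh(x);
    }

    // softmax turns raw scores into positive values that sum to 1,
    // so they can be read as class probabilities.
    static double[] softmax(double[] scores) {
        double max = Double.NEGATIVE_INFINITY;
        for (double s : scores) {
            max = Math.max(max, s);
        }
        double sum = 0.0;
        double[] out = new double[scores.length];
        for (int i = 0; i < scores.length; i++) {
            out[i] = Math.exp(scores[i] - max); // subtracting max avoids overflow
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical raw scores for the two classes: index 0 = normal, index 1 = fraud.
        double[] probs = softmax(new double[]{1.2, -0.4});
        System.out.println("P(normal)=" + probs[0] + ", P(fraud)=" + probs[1]);
        System.out.println("tanh(0.5)=" + tanh(0.5));
    }
}
```

Because the softmax outputs always sum to 1, it is a natural fit for an output layer that has to choose between normal and fraud.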

Depth of the neural network

This is a very critical factor in any neural network. Since this is a multilayer neural network, there is no limit on the maximum number of layers; developers can add as many as they need. But when you construct the layers you have to consider the number of inputs and outputs, which mainly depend on the structure of the dataset. In my case I will be using a dataset which contains only three inputs and two outputs. A sample of my dataset looks as follows:

output  sourceIp    destinationIp  timeStampDifference
normal  3232239403  1459758271     1468013838596
normal  3232239403  1459774333     1468013837731
fraud   3232508417  1464094333     1468013834161

Note :- Here the IP addresses are converted to number format, since this dataset has to be vectorized before it is sent to the neural network, and the dotted IP address format cannot be vectorized directly.
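The exact conversion is not shown in this post, so the following is only a sketch under one assumption: the four octets of an IPv4 address are packed into a single number in base 256. Under that assumption the address 192.168.15.43 maps to 3232239403, the first sourceIp value in the sample. The class and method names are hypothetical.

```java
public class IpToNumber {

    // Packs a dotted IPv4 address into one number, treating each octet as a base-256 digit.
    static long ipToLong(String ip) {
        String[] octets = ip.split("\\.");
        if (octets.length != 4) {
            throw new IllegalArgumentException("Not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String octet : octets) {
            int part = Integer.parseInt(octet);
            if (part < 0 || part > 255) {
                throw new IllegalArgumentException("Octet out of range in: " + ip);
            }
            value = value * 256 + part;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(ipToLong("192.168.15.43")); // prints 3232239403
    }
}
```

A long is used instead of an int because values above 2147483647 do not fit into a signed 32-bit integer in Java.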

As you can see in the dataset above, there are three inputs (sourceIp, destinationIp and timeStampDifference) and two possible outputs (normal or fraud). Therefore, in the first layer of our network we use 3 as the number of inputs (nIn) and 2 as the number of outputs (nOut).

Since the difference between the number of inputs and outputs in the initial layer is small, we don’t need to add many hidden layers to increase the depth. The dataset structure and the neural network layer structure have to be compatible with each other. Otherwise, the neural network will throw a runtime error.
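The compatibility rule can be written down as a small check: the first layer must accept as many inputs as the dataset has features, each layer must accept exactly what the previous layer outputs, and the last layer must output one value per class. The sketch below is plain Java and not part of dl4j; the class name and the {nIn, nOut} array layout are assumptions for illustration.

```java
public class LayerShapeCheck {

    // Each row of layers is {nIn, nOut} for one layer, in network order.
    static boolean isCompatible(int featureCount, int classCount, int[][] layers) {
        if (layers.length == 0 || layers[0][0] != featureCount) {
            return false; // first layer must match the number of input columns
        }
        for (int i = 1; i < layers.length; i++) {
            if (layers[i][0] != layers[i - 1][1]) {
                return false; // nIn of a layer must equal nOut of the previous one
            }
        }
        return layers[layers.length - 1][1] == classCount; // last layer must match the classes
    }

    public static void main(String[] args) {
        // The three layers configured in buildModel(): 3 -> 2, 2 -> 2, 2 -> 2.
        int[][] layers = {{3, 2}, {2, 2}, {2, 2}};
        System.out.println("compatible: " + isCompatible(3, 2, layers));
    }
}
```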

To train on a dataset in a deeplearning4j neural network, we cannot pass the dataset above as it is, since the network can only work with numbers. We have to vectorize the dataset first, and for that you can use the Canova API in dl4j. How to convert a dataset to a vectorized dataset is explained in this tutorial.

After vectorizing, the sample above will look like this:

0,0.0833333333,0.0037523452,0.1122922169
0,0.0833333333,0.0037523452,1
1,0.3333333333,1,0.6940098172
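This post does not say how the values above were scaled, so the following is only an illustration of one common approach: encode the label as 0 (normal) or 1 (fraud) and rescale each numeric column to the range 0 to 1 with min-max scaling. The class and method names are hypothetical, and because the real values were presumably scaled over the full dataset, the numbers this sketch prints will not match the sample above.

```java
public class MinMaxScaler {

    // Maps the label text to the class index used in the vectorized file.
    static int encodeLabel(String label) {
        return "fraud".equals(label) ? 1 : 0;
    }

    // Rescales one column so that its smallest value becomes 0 and its largest becomes 1.
    static double[] scale(double[] column) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // A constant column has no spread, so map it to 0 instead of dividing by zero.
            out[i] = range == 0.0 ? 0.0 : (column[i] - min) / range;
        }
        return out;
    }

    public static void main(String[] args) {
        // timeStampDifference values from the three sample rows.
        double[] scaled = scale(new double[]{1468013838596.0, 1468013837731.0, 1468013834161.0});
        for (double v : scaled) {
            System.out.println(v);
        }
        System.out.println(encodeLabel("normal") + "," + encodeLabel("fraud"));
    }
}
```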

Now we have successfully added the neural network configuration to the project. In the next post I will describe how you can train on a dataset in a dl4j neural network by using Apache Spark.
