There are two main programs: 'dib_train' for training a neural network, and 'dib_predict' for making predictions from a pre-trained neural network. To get the command syntax and a list of options, type the command name with the '-H' option. Each data file is a simple binary dump of the data in row-major order, preceded by a four-byte header giving the number of columns. There are several file conversion routines supplied with the library, but for a more complete set of routines, please try the 'libmsci' library on GitHub.
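To make the file layout concrete, here is a minimal Python sketch that writes and reads a file of this shape. It is only an illustration, not part of the library: the names 'write_vec' and 'read_vec' are hypothetical, and it assumes the header is a native-endian 32-bit integer and that each value is a four-byte float, neither of which is stated above. Check against the supplied conversion routines before relying on it.

```python
import struct


def write_vec(path, rows):
    """Write rows (a list of equal-length lists of floats) as a binary dump.

    Assumed layout: one native-endian 32-bit integer (the column count),
    followed by the values in row-major order as 32-bit floats.
    """
    if not rows:
        raise ValueError("need at least one row to know the column count")
    ncol = len(rows[0])
    with open(path, "wb") as f:
        f.write(struct.pack("=i", ncol))
        for row in rows:
            if len(row) != ncol:
                raise ValueError("all rows must have the same length")
            f.write(struct.pack(f"={ncol}f", *row))


def read_vec(path):
    """Read a file written by write_vec and return a list of rows."""
    with open(path, "rb") as f:
        (ncol,) = struct.unpack("=i", f.read(4))
        payload = f.read()
    # The row count is not stored; it is inferred from the file size.
    if ncol <= 0 or len(payload) % (4 * ncol) != 0:
        raise ValueError("file size does not match the column count")
    nrow = len(payload) // (4 * ncol)
    flat = struct.unpack(f"={nrow * ncol}f", payload)
    return [list(flat[i * ncol:(i + 1) * ncol]) for i in range(nrow)]
```

Because only the column count is stored, the number of rows follows from the file size, which is why the reader checks that the payload divides evenly.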
Using the library to perform statistical classification is a bit more complicated than simply training the NN with the training routine, then making predictions with the prediction routine. Some pre- and post-processing is necessary. Assume the classification data is in a simple binary dump of four-byte integers, 'classes.dat', and that there are eight (8) classes. First we need to decide on a multiclass coding, then convert the integer data to floating point. To output a coding matrix chosen from five different ones, use the 'dib_print_ecc' command:

    dib_print_ecc -Q 2 8 > coding_matrix.txt

(If five are not enough to choose from, I would once again suggest heading to the 'libmsci' project.) Next, we use the result to code the multiclass classes into binary classes:

    cls2vec -M coding_matrix.txt classes.dat classes.vec

Now, assuming we have some matching coordinate data, call it 'coord.vec', we can train the model:

    dib_train -C 1 -m gsl:trs_lm -n ff_gen:1hidden coord.vec classes.vec model.txt

If we have some unrelated coordinate data, call it 'test.vec', we can start making predictions:

    dib_predict test.vec model.txt output.vec

Before we can use this, we need to reverse the coding:

    multiclass_solver output.vec coding_matrix.txt prob.vec

But we still aren't done. The result, in 'prob.vec', contains only probabilities, not classes. To convert these probabilities to classes:

    dib_classify prob.vec result

where 'result.cls' contains the winning classes while 'result.con' contains the winning probabilities normalized to lie between 0 and 1.
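As a rough picture of what this last step does, the Python sketch below picks the winning class from each row of probabilities and rescales the winning probability to the range 0 to 1. It is not the code behind 'dib_classify'. The function name 'classify' is hypothetical, classes are assumed to be numbered from 0, and the rescaling formula (n*p - 1)/(n - 1) for n classes is an assumption chosen so that a uniform guess maps to 0 and complete certainty maps to 1; the actual normalization may differ.

```python
def classify(prob_rows):
    """Return a (winning_class, confidence) pair for each row of probabilities.

    Assumptions (not taken from the library): classes are numbered from 0,
    and confidence is the winning probability p rescaled as
    (n*p - 1)/(n - 1), where n is the number of classes.
    """
    results = []
    for probs in prob_rows:
        n = len(probs)
        if n < 2:
            raise ValueError("need at least two classes")
        # Index of the largest probability; ties go to the lowest index.
        winner = max(range(n), key=lambda i: probs[i])
        p = probs[winner]
        confidence = (n * p - 1) / (n - 1)
        results.append((winner, confidence))
    return results
```

With eight equally likely classes, each probability is 0.125 and the confidence under this assumed rescaling is 0; a row that puts all of its weight on one class gives a confidence of 1.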
While there are few tools specifically for designing custom topologies (for the beginnings of such an endeavour, see the header file 'dib_ff_tools.h'), the implementation of the neural networks is fully general and can handle any topology desired. The networks are stored in simple ASCII files, and 'dib_train' can take these files as input using the '-a' option.
There are two implementations: node-by-node (see 'dib_ff.h') and term-by-term (see 'dib_ff_sl.h'). The node-by-node implementation can handle any type of feed-forward NN, including convolutional neural networks. The files have the following format: