### Abstract

Neural networks, in particular feedforward architectures such as multilayer perceptrons and radial basis function networks, have been used successfully in many chemical engineering applications. A number of techniques exist with which such neural networks can be trained, including backpropagation, k-means clustering and evolutionary algorithms. The latter method is particularly useful, as it is able to avoid local optima in the search space and can optimise parameters for which no gradient information exists. Unfortunately, only moderately sized networks can be trained by this method, because evolutionary optimisation is extremely computationally intensive. In this research, a novel algorithm called combinatorial evolution of regression nodes (CERN) is proposed for training non-linear regression models, such as neural networks. This evolutionary algorithm uses a branch-and-bound combinatorial search in its selection scheme to optimise groups of neural nodes. The use of a combinatorial search over a set of basis nodes in the optimisation of neural networks is a concept introduced for the first time in this thesis; it automatically resolves the permutational redundancy associated with training the hidden layer of a neural network. CERN was further enhanced by clustering, which actively supports niches in the population. This also enabled the optimisation of the node types used in the hidden layer, which need not be the same for every node (i.e. a mixed layer of different node types can be found). One restriction applies: to make the combinatorial search sufficiently efficient, the output layer of the neural network must be linear. CERN was found to be significantly more efficient than a conventional evolutionary algorithm that does not use a combinatorial search, and it also trained faster than backpropagation with momentum and an adaptive learning rate.
Although the Levenberg-Marquardt algorithm is significantly faster than CERN, it struggled to train in the presence of many non-local minima. Furthermore, the Levenberg-Marquardt learning rule tends to overtrain and requires gradient information. CERN was analysed on seven real-world and six synthetic data sets. Oriented ellipsoidal basis nodes optimised with CERN achieved significantly better accuracy with fewer nodes than spherical basis nodes optimised by means of k-means clustering. On the test data, multilayer perceptrons optimised by CERN were found to be more accurate than those trained by the gradient descent techniques, namely backpropagation with momentum and the Levenberg-Marquardt update rule. The networks produced by CERN were also compared to the splines of MARS and were found to generalise as well as, or significantly better than, MARS; for some data sets, however, MARS was used to select the input variables for the neural networks. Networks of ellipsoidal basis functions built by CERN were more compact and more accurate than radial basis function networks trained using k-means clustering. Moreover, the ellipsoidal nodes can be translated into fuzzy systems. The generalisation and complexity of the resulting fuzzy rules were comparable to those of fuzzy systems optimised by ANFIS, but without the exponential growth in the number of rules caused by the grid partitioning employed by ANFIS. For data sets with a relatively high dimensionality in comparison with the number of data points, the generalisation of ANFIS was consequently much poorer than that of the CERN models. In summary, the proposed combinatorial selection scheme made an existing evolutionary algorithm significantly faster for neural network optimisation, rendering it computationally competitive with traditional gradient-descent-based techniques.
Being an evolutionary algorithm, the proposed technique does not require gradient information and can therefore optimise a larger set of parameters than traditional techniques.