Robust Machine Learning


Machine Learning and AI are used extensively across many application domains. If we are to trust algorithms to work for us and with us in society, a degree of robustness must be built into the approaches. Research in Oxford covers many important areas related to this theme - below are just a few.

 

Bayesian Optimisation

Bayesian optimisation is the use of probabilistic modelling techniques to perform global optimisation of black-box functions. Such optimisation problems arise in any application requiring the automation of design choices. In particular, global optimisation forms the chief challenge in much of Machine Learning, where the state of the art is often arduous hand-tuning of parameter choices and heuristic selection of models and architectures. More broadly, Bayesian optimisation has found use in interactive user interfaces, robotics, environmental monitoring and sensor networks. These applications are united by their complex, expensive, multi-modal objective functions. Here, the Bayesian modelling approach offers flexible probabilistic surrogates that both adapt to the observed structure of the objective and maintain an honest reflection of uncertainty. The former ensures that the best use is made of expensive evaluations; the latter is crucial in ensuring that algorithms continue to explore the objective so as to locate hidden pockets of objective mass. Our research agenda focuses on expanding the range of Bayesian optimisation to tackle the most challenging real-world problems, including those with high-dimensional input spaces and those where very large amounts of data must be gathered. This enables the automation of algorithm choice, in a principled manner, as well as automating the right choices of parameters and tunings - part of the focus of 'autoML' research.
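As a minimal sketch of this loop - a probabilistic surrogate fitted to expensive evaluations, plus an acquisition rule that trades off exploitation against exploration - the following illustrative Python pairs a Gaussian-process surrogate with a lower-confidence-bound criterion on a 1-D toy problem. The RBF kernel, its length-scale, the grid resolution and the acquisition constant are all illustrative assumptions, not the specific models used in our research.

```python
import numpy as np

def rbf_kernel(a, b, length=0.3):
    # Squared-exponential kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    # Gaussian-process posterior mean and standard deviation at x_test.
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train)
    mu = Ks @ np.linalg.solve(K, y_train)
    v = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.sum(Ks * v.T, axis=1)   # RBF prior variance is 1
    return mu, np.sqrt(np.maximum(var, 1e-12))

def bayes_opt(objective, bounds=(0.0, 1.0), n_init=3, n_iter=10, seed=0):
    # Minimise an expensive black-box function: fit the surrogate,
    # query where the lower confidence bound (GP-LCB) is smallest, repeat.
    rng = np.random.default_rng(seed)
    x = rng.uniform(bounds[0], bounds[1], size=n_init)
    y = np.array([objective(v) for v in x])
    grid = np.linspace(bounds[0], bounds[1], 200)
    for _ in range(n_iter):
        mu, sigma = gp_posterior(x, y, grid)
        x_next = grid[np.argmin(mu - 2.0 * sigma)]  # optimistic under uncertainty
        x = np.append(x, x_next)
        y = np.append(y, objective(x_next))
    return x[np.argmin(y)], y.min()
```

The uncertainty term sigma is what keeps the search honest: without it, the loop would only re-sample near the current best point and could miss better optima elsewhere.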

 

Active Inference for Sensor Networks

Sensor networks are an important source of complex data, offering insight into many environmental and human phenomena, but demanding novel forms of information processing. We have introduced methods that leverage sensor networks to provide real-time estimates of dynamic environmental phenomena. Our flexible non-parametric algorithms allow effective inference even with minimal domain knowledge, and our probabilistic approach provides uncertainty estimates to support the decision-making of human operators. We have also introduced algorithms that parsimoniously select only the most valuable observations from sensor networks. Real observations are usually associated with some cost, such as the battery energy required to power a sensor or transmit a reading, so it is desirable to reduce the number of such observations. We have demonstrated such active data selection on real sensor network data. Our algorithms intelligently concentrate sparse observations in the most informative spatio-temporal locations, even in the presence of dynamic, missing and faulty data.
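A simple way to illustrate active data selection is greedy uncertainty sampling: repeatedly query the candidate sensor location where a probabilistic model is most uncertain. The toy sketch below assumes a unit-variance RBF kernel and 1-D candidate locations; it illustrates the idea rather than reproducing our deployed algorithms.

```python
import numpy as np

def rbf(a, b, length=1.0):
    # Squared-exponential kernel between 1-D point sets a and b.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def posterior_variance(x_obs, x_cand, noise=1e-3):
    # GP predictive variance at candidate sensor locations,
    # given the locations already observed.
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_cand, x_obs)
    v = np.linalg.solve(K, Ks.T)
    return 1.0 - np.sum(Ks * v.T, axis=1)   # unit prior variance

def select_observations(candidates, n_pick, start):
    # Greedy active selection: always query the most uncertain location.
    chosen = [start]
    for _ in range(n_pick - 1):
        var = posterior_variance(np.array(chosen), candidates)
        chosen.append(candidates[np.argmax(var)])
    return chosen
```

Because predictive variance grows with distance from existing observations, the selected locations spread themselves over the domain, concentrating measurements where the model knows least.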

 

Fault, Changepoint and Anomaly Tolerant Inference

Fast and reliable fault and changepoint detection is vital in achieving near-zero-breakdown performance for many real-world systems. We aim to design algorithms that can identify changes in the characteristics of an instrumented system that may be indicative of actual or imminent failure. Our probabilistic systems are also capable of distinguishing between misleading changepoints in the mechanism of observation (i.e. sensor faults) and changepoints in the actual underlying characteristics of the data. The flexibility of Bayesian non-parametric models permits tolerance to a wide range of possible faults and changepoints. Further, these approaches provide operators and/or the supervisory control system with robust uncertainty estimates for decision support and inference.
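The Bayesian non-parametric machinery described above is beyond a short snippet, but the underlying detection problem can be illustrated with a classical two-sided CUSUM detector, which flags a changepoint once cumulative deviation from a reference level exceeds a threshold. The drift and threshold values below are illustrative assumptions, not tuned settings.

```python
import numpy as np

def cusum_changepoints(x, drift=0.5, threshold=5.0):
    # Two-sided CUSUM: accumulate positive and negative deviations from
    # a reference level; raise an alarm when either sum exceeds the
    # threshold, then re-anchor the reference at the current reading.
    mean = x[0]
    g_pos = g_neg = 0.0
    alarms = []
    for i, xi in enumerate(x):
        g_pos = max(0.0, g_pos + (xi - mean) - drift)
        g_neg = max(0.0, g_neg - (xi - mean) - drift)
        if g_pos > threshold or g_neg > threshold:
            alarms.append(i)
            g_pos = g_neg = 0.0
            mean = xi
    return alarms
```

The drift term suppresses alarms from small fluctuations, while the threshold sets the trade-off between detection delay and false-alarm rate - the same trade-off a probabilistic detector resolves via its posterior over changepoints.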

 

Adversarial Robustness & Verification

It is well known that many algorithms produce erroneous outputs when their inputs are perturbed by small changes that are imperceptible from a human perspective; this manipulation is referred to as an 'adversarial attack'. Making algorithms that can mitigate such manipulations is vital: it provides security against spoofing and resilience to small changes in data due, perhaps, to noise and artefacts. Other research considers using formal logic and proof to demonstrate the correctness of methods.
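A minimal illustration of such an attack is the fast gradient sign method (FGSM) applied to a logistic-regression classifier: each input feature is nudged a small step in the direction that most increases the loss, often enough to flip the prediction. The weights, input and step size below are toy assumptions for the sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y, eps=0.2):
    # Fast Gradient Sign Method: move the input a step of size eps in
    # the direction that most increases the cross-entropy loss.
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w          # d(loss)/dx for logistic regression
    return x + eps * np.sign(grad_x)
```

Robust training aims to make such perturbations ineffective, while verification research seeks formal guarantees that no perturbation within a given radius can change the output at all.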

 

Safe Machine Learning

In this research domain we look at ways in which algorithms learn only solutions that comply with given safety criteria. Examples include work on 'Safe PILCO', learning constrained reinforcement learning models for safety-critical systems, and work on safety guarantees using adversarial robustness in probabilistic systems.
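As a toy illustration of enforcing a safety criterion at run time, the hypothetical 'safety layer' below clips a proposed control so that a 1-D point mass can never leave a prescribed region. The dynamics, limits and function names are illustrative assumptions for this sketch - not the Safe PILCO method itself, which learns constrained policies rather than filtering actions.

```python
import numpy as np

def safe_action(proposed, state, v_max=2.0, x_limit=10.0, dt=0.1):
    # Hypothetical safety layer for a 1-D point mass: clip the commanded
    # velocity so the next state cannot leave the region |x| <= x_limit.
    a = np.clip(proposed, -v_max, v_max)          # respect actuator limits
    next_x = state + a * dt
    if abs(next_x) > x_limit:
        a = (np.sign(next_x) * x_limit - state) / dt  # largest safe step
    return a
```

Filters like this give a hard guarantee by construction; learned approaches instead aim to make the policy itself respect the constraint, so no override is ever needed.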