The Neural Adaptive Computing Laboratory (NAC Lab)
Computer Science, RIT
Dr. Alexander G. Ororbia II (Assistant Professor, Computer Science)
- William Gebhardt, Ph.D. student
- Lifelong machine learning, neuro-evolution
- Zhizhuo Yang, Ph.D. (co-advised w/ Reynold Bailey at Rochester Institute of Technology)
- Active inference, reinforcement learning, predictive processing
- Ankur Mali, Ph.D. student (co-advised w/ Dr. C. Lee Giles at Penn State University)
- Neural memory systems, learning algorithms, lifelong machine learning
- Timothy Zee, Ph.D. student (co-advised w/ Ifeoma Nwogu at Rochester Institute of Technology)
- Learning algorithms, interpretable neural systems, convolutional networks
- Hitesh Ulhas Vaidya, Ph.D. student (co-advised w/ Travis Desell at Rochester Institute of Technology)
- AbdElRahman ElSaid, Ph.D. student (co-advised w/ Travis Desell at Rochester Institute of Technology)
- James Le, MSc student
- Michael Peechat, MSc student
- Xu Sun, MSc student
- Recurrent networks, time series
- Passed capstone project defense in May 2020
Neurocognitively-Inspired Lifelong Machine Learning
Neural architectures trained with back-propagation of errors are susceptible to catastrophic forgetting. In other words, old information acquired by these models is lost when new information for new tasks is acquired. This makes building models that continually learn extremely difficult, if not nearly impossible. The focus of the NAC group's research is to draw from models of cognition and biological neurocircuitry, as well as theories of mind and brain functionality, to construct new learning procedures and architectures that generalize across tasks and continually adapt to novel situations, combining input from multiple modalities/sensory channels.
The NAC team is focused on developing novel, neurocognitively-inspired learning algorithms and memory architectures for artificial neural systems (for both non-spiking and spiking neurons). Furthermore, we explore and develop nature-inspired metaheuristic optimization algorithms, ranging from (neuro-)evolution to ant colony optimization to hybrid procedures. We are primarily concerned with the various sub-problems associated with lifelong machine learning, which subsumes online/stream learning, transfer learning, multi-task learning, multi-modal/input learning, and semi-supervised learning.
Lifelong Machine Learning Publications
- Continual Competitive Memory: A Neural System for Online Task-Free Lifelong Learning (2021) -- In this paper, we propose continual competitive memory (CCM), a neural model that learns by competitive Hebbian learning and is inspired by adaptive resonance theory (ART). CCM is designed to learn from streams of data in an unsupervised fashion, particularly acquiring semi-distributed representations that are robust to catastrophic interference. Notably, CCM does not require knowledge of the task in the form of task descriptors, providing the foundation for a more powerful basal ganglia task selection model that would impact the work below.
- Lifelong Neural Predictive Coding: Learning Cumulatively Online without Forgetting (2019) -- In this work, we propose an interactive generative model, the Sequential Neural Coding Network, focused on the problem of learning cumulatively across a sequence of tasks in the streaming data continuum setting. Our proposed model (and its learning algorithm) exhibits less forgetting compared to standard models and recent methods aimed at improving memory retention in artificial neural networks. We notably propose a "task selection model" inspired by the executive control functionality of the basal ganglia in the brain -- this model drives selecting portions of knowledge in our generative model when learning across tasks, reducing neural cross-talk, and adjusts its synapses through a simple competitive Hebbian learning rule we design for online stream-based learning. (Accepted to NeurIPS 2022)
- Continual Learning of Recurrent Neural Networks by Locally Aligning Distributed Representations (2018) -- In this work, we propose a recurrent neural architecture, the Parallel Temporal Neural Coding Network (P-TNCN), and its learning algorithm, a variation of Local Representation Alignment, motivated by the theory of predictive coding. The temporal model's inference and learning can exploit layerwise parallelization, can use non-differentiable activation functions, and, more importantly, do not require unfolding/unrolling over the length of the sequences that the model is trained to generatively model. Furthermore, due to its continual error-correcting nature, we examine the P-TNCN's ability to zero-shot generatively model sequences of previously unseen objects and show that it is able to perform lifelong sequence modeling. (Accepted as IEEE TNNLS journal article in 2019.)
- Supplementary Information Appendix for this paper can be found here.
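Several of the models above adjust their synapses with competitive Hebbian learning. As a rough illustration only, a winner-take-all update of that general kind can be sketched as below. This is a generic textbook rule, not the rule used in CCM or in the task selection model; the function name `competitive_hebbian_step`, the Euclidean matching criterion, and the learning rate are assumptions made for this example.

```python
def competitive_hebbian_step(weights, x, lr=0.1):
    """Apply one winner-take-all update and return the winner's index.

    weights: list of weight vectors (one per unit), modified in place.
    x:       input vector with the same length as each weight vector.
    lr:      learning rate (hypothetical default chosen for illustration).
    """
    # Squared Euclidean distance between the input and each unit's weights.
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    winner = dists.index(min(dists))
    # Only the winning unit learns: w <- w + lr * (x - w).
    weights[winner] = [wi + lr * (xi - wi) for wi, xi in zip(weights[winner], x)]
    return winner


if __name__ == "__main__":
    units = [[0.0, 0.0], [1.0, 1.0]]
    stream = [[0.9, 0.9], [0.1, 0.0], [1.0, 0.8], [0.0, 0.2]]
    for sample in stream:  # one pass over a tiny data stream
        competitive_hebbian_step(units, sample, lr=0.5)
    print(units)
```

Because only the best-matching unit moves toward each input, the other units' weights are left untouched, which is the intuition behind using competition to limit interference between patterns.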
Spiking Neural Network Publications
- Spiking Neural Predictive Coding for Continual Learning from Data Streams (2019) -- In this paper, we proposed a generalization of the neural predictive coding framework developed throughout the efforts of our prior work to the realm of binary spike trains in continuous time learning and inference. The concrete model proposed was a Spiking Neural Coding Network composed of leaky integrate-and-fire neurons. We show that the online model is competitive with various modern-day spiking neural nets and is notably capable of conducting effective semi-supervised learning.
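For readers unfamiliar with leaky integrate-and-fire (LIF) neurons, the following is a minimal sketch of a single LIF unit simulated with Euler steps. It is a generic textbook formulation, not the Spiking Neural Coding Network itself; the function name `simulate_lif` and all constants (time step, membrane time constant, threshold, reset value) are assumptions chosen for illustration.

```python
def simulate_lif(currents, dt=1.0, tau=10.0, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Simulate one leaky integrate-and-fire neuron.

    currents: sequence of input currents, one per time step.
    Returns a list of 0/1 values marking the steps at which the neuron spiked.
    """
    v = v_rest
    spikes = []
    for i_t in currents:
        # Euler step of dv/dt = (-(v - v_rest) + i_t) / tau:
        # the voltage leaks toward rest while integrating the input.
        v += dt * (-(v - v_rest) + i_t) / tau
        if v >= v_thresh:
            spikes.append(1)  # emit a binary spike
            v = v_reset       # and reset the membrane voltage
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    train = simulate_lif([2.0] * 30)
    print(train)
```

A constant input that is strong enough produces a regular spike train, while a weak input only raises the voltage to a sub-threshold plateau and produces no spikes.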
Neurocognitive Learning Algorithm and Architecture Publications
- Reducing Catastrophic Forgetting in Self Organizing Maps with Internally-Induced Generative Replay (2021) -- In this study, we develop and evaluate a novel form of the Kohonen self-organizing map (SOM) that we call the continual SOM (c-SOM), an artificial neural system that operates and learns via the principle of competitive Hebbian learning and is capable of mitigating the amount of catastrophic forgetting it experiences through a form of generative replay induced by our model's generalized memory units.
- Towards a Predictive Processing Implementation of the Common Model of Cognition (2021) -- In this article, we lay down the design of a cognitive architecture based on the predictive processing variant known as neural generative coding (NGC) as well as distributed holographic memory. This system is to embody the template of brain organization and modularity prescribed by the Common Model of Cognition, a unifying cognitive functionality framework based on the similarities across many important cognitive architectures like SOAR, ACT-R, and LEABRA (as well as Nengo and many others).
- Backprop-Free Reinforcement Learning with Active Neural Generative Coding (2021) -- In this article, we propose the generalization of neural generative coding (NGC) to dynamic control. Specifically, our active NGC (ANGC) computational framework for building biologically-motivated intelligent agents is shown to work well on several popular reinforcement learning (RL) benchmarks, outperforming baselines such as the deep Q-network on a variety of tasks including the challenging mountain car problem (which is an extremely sparse reward problem). ANGC is, as the paper discusses, an interpretation of the cognitive neuroscience theoretical view of planning-as-inference, specifically offering a scalable NGC (see paper below for details of the original NGC) instantiation of active inference.
(Note: The public release of this article is on hold on arxiv, even though it was scheduled to appear on Tuesday, April 6, 2021. We are releasing it here in the meantime so that the public can read it; it is not clear when arxiv will release it.)
- The Neural Coding Framework for Learning Generative Models (2022; 2019 arxiv version) -- In this paper, we proposed a novel computational framework for learning unsupervised (neural) generative models without backpropagation-of-errors, inspired by the theory of predictive processing. Specifically, we design a model within our framework called the generative neural coding network (GNCN) which we show is competitive with backprop-based generative models such as the variational autoencoder and the adversarial autoencoder with respect to data log likelihood and outperforms these on tasks that the original models are not trained on, e.g., downstream classification and pattern completion. (Nature Communications Editor's Choice for Applied Physics and Mathematics)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment (2020) -- The recursive local representation alignment (rec-LRA) algorithm was presented for scalable, gradient-free training of neural architectures. We demonstrated that we can efficiently train very large models, specifically residual convolutional neural networks on several image benchmarks including the massive-scale benchmark, ImageNet. rec-LRA offers improved model convergence, faster training update time, and a natural ability to handle non-differentiable activation functions in a flexible, architecture-agnostic framework.
- Biologically Motivated Algorithms for Propagating Local Target Representations (2019) -- In this paper, we proposed error-driven local representation alignment (LRA-E) and noise-controlled difference target propagation (DTP-sigma), a more stable, robust version of DTP.
- Like a Baby: Visually Situated Neural Language Acquisition (2019) -- This paper extended the Delta-RNN and other recurrent models to the task of multimodal language modeling and uncovered an interesting result: models that learn with visual stimuli still retain some generalization even when blinded and forced to do classical language modeling.
- Deep Credit Assignment by Aligning Local Distributed Representations (2018) -- This paper proposed the first variants of local representation alignment, LRA-diff and LRA-fdbk, two algorithms that operate with the fundamental unit we call the "computational subgraph".
- Learning to Adapt by Minimizing Discrepancy (2017/2018) -- This work introduced the discrepancy reduction framework and one of the earliest versions of the neural error-correction framework that underpins LRA and the neural coding networks of later work, the Temporal Neural Coding Network (TNCN), and showed that it worked quite well as a generative model of bouncing balls without backpropagation through time.
- Learning Simpler Language Models with the Differential State Framework (2017) -- This paper proposed the Differential State Framework (and the slow/fast mixture of states for surprisal-driven memory and prediction) and derived the original Delta-RNN. We showed that this simple model (with hardly any more parameters than an Elman-RNN) could compete with or even outperform complex recurrent models such as the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) in the task of language modeling.
Metaheuristic Optimization Publications
- The Ant Swarm Neuro-Evolution Procedure for Optimizing Recurrent Networks (2019) -- In this paper, we propose a novel metaheuristic optimization algorithm called the Ant Swarm Neuro-evolution (ASNE) procedure for conducting neural architecture search (NAS) specifically for recurrent neural networks (RNNs).
- An Empirical Exploration of Deep Recurrent Connections and Memory Cells Using Neuro-Evolution (2019) -- This work showed how to use the state-of-the-art EXAMM neuro-evolutionary metaheuristic as an analysis tool for exploring various key properties of recurrent networks, primarily the use of skip connections and time delays as opposed to or in tandem with complex memory cells (such as the LSTM and Delta-RNN cells).
- Investigating Recurrent Neural Network Memory Structures using Neuro-Evolution (2019) -- This work proposed a general algorithm for neuro-evolutionary exploration/optimization (in the memetic style, using backprop for the local component of the search), called EXAMM. The exploration included using "modules", or RNN cell/structure types, including the Delta-RNN, GRU, and LSTM among a pool of others.
- A Hybrid Algorithm for Metaheuristic Optimization (2019) -- This paper proposed a flexible, hybrid communication-based algorithm treating individual metaheuristic optimizers (such as particle swarm, the bat algorithm, flower pollination, and cuckoo search) as a team of agents that can communicate their global-best solutions in various ways to each other. We also proposed novel modifications to the bat algorithm and particle swarm by using a Levy flight distribution instead of a Gaussian distribution to generate noise for exploration of a problem search space.
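To give a feel for why Levy flight noise explores a search space differently from Gaussian noise, the sketch below draws Levy-distributed steps using Mantegna's algorithm, a common way to approximate such steps. The hybrid algorithm paper may generate its noise differently; the use of Mantegna's method, the names `levy_step` and `perturb`, the exponent `beta=1.5`, and the step scale are all assumptions made for this example.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step using Mantegna's algorithm (0 < beta <= 2)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def perturb(position, rng, scale=0.01, beta=1.5):
    """Add independent Levy noise to each coordinate of a candidate solution."""
    return [p + scale * levy_step(rng, beta) for p in position]


if __name__ == "__main__":
    rng = random.Random(0)
    levy = [abs(levy_step(rng)) for _ in range(10000)]
    gauss = [abs(rng.gauss(0.0, 1.0)) for _ in range(10000)]
    # Levy steps are mostly small but occasionally very large.
    print("largest Levy step:", max(levy))
    print("largest Gaussian step:", max(gauss))
```

The occasional very long jump is what lets a Levy-driven optimizer escape a local region that Gaussian noise of comparable typical size would rarely leave.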