This is a guest post in our #NPGsfn11 blog series and posted on behalf of Björn Brembs.
“Like riding a bike” is what we say when we want to convey how some skills are never forgotten. In this way, skills are like habits: they stick and are hard, if not impossible, to shake. Therefore, it may not be surprising that one of the models used in neuroscience research to mimic the process of skill learning is habit formation. Animals (mostly rats or mice) are trained in a specific task until it becomes so automated or stereotyped that the behavior becomes difficult to change. It is precisely because of this automation and lack of behavioral control that habit formation is also an important paradigm when modeling drug addiction. Drug addicts are thought to have developed a drug-taking habit that has become so automatic and rigid that they cannot help but execute it (especially when faced with drug-associated cues). Thus, skills and habits are motor memories that can last a lifetime.
Habit formation in animal models is usually induced by over-training. For instance, in one poster from the first session of this year’s Society for Neuroscience conference, Smith and Graybiel trained rats in a T-maze: following a given auditory cue, the animals had to go either left or right for a reward. Before a habit is formed, i.e., in the early phase of the experiment, the behavior is still flexible (termed ‘goal-directed’). This is tested by devaluing the reward the animals receive for choosing the correct arm of the T-maze. For instance, if turning right after tone A is rewarded with water and turning left after tone B is rewarded with food, animals make more mistakes when the ‘water cue’ is given if they were sated with water immediately before testing in the maze. Presumably, this is because the animals were tested when they were no longer thirsty and had less incentive to follow the devalued ‘water cue.’ This goal-directed behavior is abolished by habit formation: over-trained animals show no reduction in their response to the water cue, even if they were sated beforehand. The behavior has become stereotyped, automatic, and insensitive to devaluation.
The neurobiology of these processes is complicated. Previous lesion-based work has established two brain areas as being important for habit formation: the dorsolateral striatum (DLS) and the prefrontal infralimbic cortex (IL). To learn more about the role of these two brain regions, Smith and Graybiel recorded from neurons in them during habit formation. What they found was intriguing: both regions exhibited an emergence of slowly-stabilizing response patterns over the course of the training, but the pattern evolved much more quickly in the DLS than in the IL. Thus, the authors were able to follow the different stages of the training using recordings in these two regions, allowing them to assign earlier and later roles to each area. This is important because the timing of these stages is a critical aspect of the whole process: establish a habit too early and it can become maladaptive. Any athlete can tell you that it’s much harder to unlearn the wrong move than to learn it anew.
It is precisely this temporal control over habit formation that another poster in this session (actually just across the aisle) explored. Schreiweis et al. used a similar T-maze design to train wild-type and genetically-modified mice. The GM mice had their FoxP2 gene replaced by a humanized version. While the authors did not use a devaluation test, their experiments nevertheless suggested that the timing of habit formation was shifted. The experiments took advantage of so-called allocentric and egocentric learning strategies, which produce declarative and procedural memories, respectively. When making decisions in the maze, the animals either learned to use their own self-motion (egocentric) to pick the correct arm of the T-maze (e.g., ‘turn left’), or they learned that external cues (allocentric) could reveal where to go (e.g., ‘go for the star, not the square’). In this framework, the allocentric strategy leads to a memory that can be declared (star), whereas the egocentric strategy leads to a memory that is best described as a procedure (turning left). The experiments of Schreiweis et al. seem to suggest that FoxP2 manipulation leads to a shift in the balance between these two processes, in this case towards the egocentric strategy. Coincidentally (or not), language acquisition, the process that FOXP2 is famous for being involved in, is a very prominent case of egocentric/procedural/habit learning.
These two posters are highly relevant to our own research using invertebrate models. As I’ve discussed previously, similar circuit organization can be found in the fruit fly, Drosophila. In work we have recently submitted, we find that the Drosophila FoxP gene (there is only one) distinguishes between self and non-self behaviors (egocentric and allocentric, if relating to the above maze experiments), and we can shift the time point at which habit formation occurs by switching off a region in the fly brain called the mushroom bodies (Brembs, Curr. Biol. 2009). Thus, it seems there is an ancient organization of behavioral control, present in the last common ancestor of invertebrates and vertebrates, the Urbilaterian, in which learning about external cues integrates with learning about the behavior of the animal itself, and behavioral flexibility is only given up after sufficient training.
Before we had sufficient evidence to see how strong these parallels between our work and the vertebrate experiments really were, we started to call the two processes world- and self-learning, respectively. If these parallels keep following the same trajectory, then mechanisms of certain self-learning procedures could be as ancient as the ‘Kandelian’ world-learning mechanisms, having influenced flexible and stereotyped behavioral control for 500 million years.
But of course, at this point, it could all be confirmation bias on my part, so there are many more experiments to be done before this hypothesis can enter the textbooks as fact. Nevertheless, on this day, at the SfN 2011 meeting, at least these two poster presenters seemed to like the idea.
Heisenberg Fellow, Freie Universität Berlin
Björn received his PhD in genetics at the University of Würzburg with Martin Heisenberg, working on operant learning in the fruit fly Drosophila. He spent almost four postdoctoral years in John H. Byrne’s lab working on operant learning in Aplysia before setting up his own lab in Berlin. Since 2009, Björn has been a Heisenberg Fellow of the DFG with his own small research group.