Trying to figure out how to proceed with this adaptation paper, so I'm retreating here.
Minor problem is the rewrite: this will get done, not too worried about it. May be the last thing that gets done, since the major problem needs to be solved materially first.
Major problem is the modeling. The original version of our paper details a complexified version of the model proposed by the authors of another paper, which ours basically replicates, accidentally. We were scooped, and so I thought that to novelify our paper, I would take their model, push it a little further, and do some extra analysis of it.
What I didn't do, and should have done, was to also test the simple model and show that it is somehow inadequate, so that the complexification is justified or necessary. I am actually ambivalent about this. My main idea was that we should take a model with generalizable features and use it to explain the data; but it's true that the more sophisticated version can't really be credited with achieving anything unless the simple one can be shown to fail.
So the problem is that I have to do a lot of testing of the simple model. So, I decided that I would scrap the section that was already in the paper and replace it with an evaluation of the simple model, but make up for the lack of 'advance' by employing the simple model in a more realistic simulation of the actual experiments. This is what I've been trying to do, and basically failing at, for several weeks now.
The first idea was to use the simplest form of the model, but the most complete form of the stimuli: videos, played frame by frame and decomposed into the relevant stimulus bands, with adaptation developing according to a simple differential equation with the same dimensions as the stimulus. This didn't work. Or, it almost worked. The problem is that adaptation just won't build up in the high frequency channels unless it's way overpowered, which goes against every bit of evidence I can think of. If high frequency adaptation were so strong, everything would be blurry all the time. I think it should be the weakest, or the slipperiest.
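A minimal sketch of this kind of simulation, assuming the 'simple differential equation' is a first-order leaky integrator, da/dt = (s - a) / tau, stepped with forward Euler. The equation itself isn't written down here, so the form is an assumption, and the name `simulate_adaptation`, the per-band time constants, and the band contrasts in the usage lines are purely illustrative, not the values used for the paper.

```python
import numpy as np

def simulate_adaptation(stimulus, tau, dt=1.0):
    """Forward-Euler integration of da/dt = (s - a) / tau (assumed form).

    stimulus: array of shape (n_frames, ...); the adaptation state has the
              same shape as a single frame (a scalar, a vector of bands, or
              a stack of 2D band images).
    tau:      time constant(s) in frames, broadcastable against one frame.
    Returns the adaptation state after every frame.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    tau = np.asarray(tau, dtype=float)
    state = np.zeros(stimulus.shape[1:])
    history = np.empty_like(stimulus)
    for t, frame in enumerate(stimulus):
        state = state + dt * (frame - state) / tau
        history[t] = state
    return history

# Hypothetical example: four bands whose contrast falls off with frequency.
band_contrast = np.array([0.4, 0.2, 0.1, 0.05])
stimulus = np.tile(band_contrast, (500, 1))   # 500 identical frames
tau = np.array([5.0, 10.0, 20.0, 40.0])       # assumed values, in frames
history = simulate_adaptation(stimulus, tau)
```

With a linear integrator like this one, each band's state settles at that band's own input contrast, and tau only sets how quickly it gets there. So in this sketch a band that carries little contrast in the stimulus ends up with little adaptation no matter how tau is chosen.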
Soon after that, I gave up and retreated to the 'global sum' model, where instead of using 2d inputs, I use 0d inputs - i.e. the stimulus is treated as a scalar. I get the scalars from the real stimuli, and the same dynamic simulation is run. It's tons faster, of course, which makes it easier to play around with. I figured I would have found a solution by now.
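One way to get scalars like these, sketched under assumptions: reduce each frame to a single RMS contrast value per spatial-frequency band by summing Fourier power inside radial bands. The hard-edged octave bands, the band edges (in cycles per pixel), and the name `band_rms` are placeholders; the decomposition actually used for the paper is not described here.

```python
import numpy as np

def band_rms(frame, edges):
    """RMS contrast of one 2D frame within each radial frequency band.

    edges: increasing band edges in cycles/pixel; band i covers
           edges[i] <= f < edges[i + 1].
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()                  # drop the DC term
    h, w = frame.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.hypot(fx, fy)                     # radial frequency of each FFT bin
    # With this normalization, power summed over all bins equals the
    # mean squared pixel value (Parseval), so band sums are squared RMS.
    power = np.abs(np.fft.fft2(frame)) ** 2 / (h * w) ** 2
    return np.array([
        np.sqrt(power[(radius >= lo) & (radius < hi)].sum())
        for lo, hi in zip(edges[:-1], edges[1:])
    ])

# Sanity check on a unit-amplitude grating at 0.125 cycles/pixel.
x = np.arange(64)
grating = np.tile(np.sin(2 * np.pi * 0.125 * x), (64, 1))
edges = [0.03125, 0.0625, 0.125, 0.25, 0.5]       # assumed octave bands
scalars = band_rms(grating, edges)
```

Running this once per frame gives a short vector per frame, which is what makes the scalar version of the simulation so much cheaper than carrying full 2D band images through every time step.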
See, it's so close. It's easy to get a solution: adjust the time constants, how they vary with frequency, and the masking strength, and out comes a set of simulated matching functions that look a lot like the human data. But I figure this is uninteresting. I have a set of data for 10 subjects, and they seem to vary in particular ways - but I can't get the simulated data to vary in the same way. If I can't do that, what is the point of the variability data?
Also, last night I spent some time looking closely at the statistics of the original test videos. There's something suspicious about them. Not wrong - I don't doubt that the slope change that was imposed was imposed correctly. But the way contrast changes with frequency and slope is not linear - it flattens out, at different frequencies, at the extreme slope changes. In the middle range, around zero, all contrasts change. That looks suspiciously like the gain peak, which I'm now wondering might somehow be an artifact of this sort of image manipulation.
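A toy calculation of how band contrast could depend on an imposed slope change, under assumptions that nothing above confirms: that the manipulation multiplies each band's amplitude by f^(-delta), and that total RMS contrast is then held fixed. The function `band_contrast_vs_slope`, the four band frequencies, and the flat baseline are all hypothetical, and none of this uses the real videos.

```python
import numpy as np

def band_contrast_vs_slope(delta, freqs, base, total_rms=1.0):
    """Band contrasts after a slope change `delta`, renormalized to a fixed RMS.

    Assumes amplitude scales as f ** (-delta) and that the bands together
    account for all of the contrast.
    """
    freqs = np.asarray(freqs, dtype=float)
    base = np.asarray(base, dtype=float)
    scaled = base * freqs ** (-delta)             # delta > 0 steepens the spectrum
    return total_rms * scaled / np.sqrt(np.sum(scaled ** 2))

freqs = np.array([1.0, 2.0, 4.0, 8.0])            # assumed band centers
base = np.ones(4)                                 # assumed flat baseline contrasts
deltas = np.linspace(-5, 5, 11)
curves = np.array([band_contrast_vs_slope(d, freqs, base) for d in deltas])
```

Under these assumptions every band's contrast changes with slope near delta = 0, while at the extremes the lowest or highest band saturates near the total RMS and the others approach zero. Whether the real videos were normalized this way would have to be checked against how they were generated.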
I don't expect to figure that last bit out before the revision is done. But I'm thinking it might be a good idea to play down the gain peak business, since I might wind up figuring out that e.g. adaptation is much more linear than it appears, and that the apparent flattening out is really an artifact of the procedure. I don't think I'll find that, but - did I mention I'm going to write a model-only paper after this one? - it seems a good idea not to go too far out on a limb when there are doubts.
I have a nagging feeling that I gave up too soon on the image-based model...