Monday, March 26, 2012

standing at fenway station, thinking, as usual, "what's my problem?", i came up with a nice little self-referential, iterative statement of it. it's pablum, but i'm not usually this verbally clever, so let's write it down:

don't do what you don't believe
can't believe what you can't understand
won't understand what you won't do

so that's the problem; it's not exactly how i would normally say these things. if you had asked me before i formulated this, i would probably have said, "i don't like to do what i don't understand", and that's what i started out thinking. but then i asked, "why is that?", and decided that if i don't understand it, i can't really attach to it - then i saw the loop.

interestingly enough, the solution is the negation of the problem, literally:

do what you believe
believe what you understand
understand what you do

both of these statements have a sort of inertia; once you have one of the predicates, it starts rolling and keeps going. since they aren't specific, both statements are generative or productive - the referents don't need to be the same on each loop, but of course they should be logically linked.

(really, the middle line isn't necessary in either one, if 'believe' is replaced with 'understand' in the first line. i feel like the middle line adds some depth, though, so there it is.)

Wednesday, March 21, 2012

vision includes the world

I need to procrastinate slightly more productively, so here is a short essay relating some of my thoughts on visual consciousness.

For years now, I've understood visual experience or consciousness (experience is easier to say and write, and has less near-meaning baggage, so let's continue with that term) as having two components:

1. The image. A part of vision is direct, which means that when you see an object, it is true to say that what you see is the thing itself, or at least the light reflected/emitted by that object (this is similar to the idea of the 'optic array'). This is a difficult position to hold, but I think it is a necessary default. The alternative, which is definitely more popular these days, is to say that what you see is entirely a representation of the thing itself, instantiated in the brain. This sort of idealism is attractive because the brain is obviously a self-contained system, because experience also seems to be self-contained, and because every aspect of experience seems to have a neural correlate. If I say that vision involves processes or structure outside the brain, I have to explain why we don't see what we don't see; why don't I see what you see, for example?

It seems to me that in placing the contents of consciousness somewhere in the physical world, there are two possible null hypotheses: either everything is absolutely centralized, completely contained within the brain, or everything is absolutely external, completely outside the brain. The second account is rare these days (see Gibson), as the only job it leaves for the brain is sorting out responses to visual experiences. It seems clear that much of vision actually does occur within the brain, and I'll get to that in part 2, below. Now, these null hypotheses: that everything is internal is an objective hypothesis, based on e.g. a scientist's observations that the brain is correlated with experience; that everything is external is a subjective hypothesis, based on e.g. my observations that what seems to be in the world is actually there, i.e. that my sensations are always accurate.

Since visual experience is a subjective process which cannot be observed, I like to stick to the subjective null hypothesis: everything is external unless shown otherwise. Immediately on stating this hypothesis, we can start to make a list of the components of visual experience which are surely neural.

2. The brain. Let's start with the subjective null hypothesis: everything you see is there, in the world. Just a little thought proves that this can't be true: faces are a great example. Look at two faces, one of a person you know well - your sister or brother, maybe - and one of a stranger you've never seen before. There, in the faces, you see a difference that you can't deny, because one seems to have an identity and the other does not. This difference isn't purely cognitive or emotional, either, because one will readily admit that the face of his sister is his sister. Seeing her face, he will say, "That is her!" Clearly, however, the identity is not in the face - it is in the observer.

If this isn't a satisfying example, color perception must be. Color is not a property of images; it is a construct of the brain. This is not difficult to show, either with the proof that identical wavelength distributions can yield different color percepts in different conditions ('color constancy'), or with the inverse proof that different wavelength distributions can yield identical color percepts ('metamers'). We understand color as a brain's capacity to discriminate consistently between different (simultaneous or asynchronous) distributions of visible radiation. It is something that exists only in the observer.
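The metamer argument can be made concrete in a few lines of Python. This is a toy sketch: the three Gaussian "cone" sensitivities and all the numbers are invented, not real cone fundamentals. The point is only structural: any three-sensor system projects a whole spectrum down to three numbers, so any two spectra that differ only within the null space of that projection are physically different but yield identical responses.

    import numpy as np

    # Toy metamer construction. The three Gaussian "cone" sensitivities
    # are invented for illustration; they are not real cone fundamentals.
    wavelengths = np.linspace(400, 700, 31)  # nm

    def sensitivity(peak_nm, width_nm):
        return np.exp(-0.5 * ((wavelengths - peak_nm) / width_nm) ** 2)

    S = np.stack([sensitivity(565, 50),    # pseudo-L
                  sensitivity(540, 45),    # pseudo-M
                  sensitivity(445, 30)])   # pseudo-S; shape (3, 31)

    rng = np.random.default_rng(0)
    spectrum_a = rng.uniform(0.2, 1.0, wavelengths.size)

    # Rows of Vh beyond the rank of S span its null space: moving along
    # them changes the physical spectrum without changing any response.
    _, _, Vh = np.linalg.svd(S)
    spectrum_b = spectrum_a + 0.1 * Vh[3]  # still nonnegative, physically different

    print(np.allclose(S @ spectrum_a, S @ spectrum_b))  # True: a metamer pair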

These are easy, but it does get harder. Consider depth perception. In a scene, some things are nearer to or further from you, but there is nothing in the images you sense that labels a given point in the scene as being at a particular depth. There is information in the scene that can be used by the observer to infer depth. So, depth is another part of the brain's capacity to interpret the image, but it is not a part of the scene. This is a more difficult step than with faces or colors, and here's why: whereas a face's identity, or a light's color, is plainly not a property of the world itself, we know that the world is three-dimensional, and that objects have spatial relationships; and we know that what we see as depth in a scene informs us as to these spatial relationships. We then make the mistake of believing that visual depth is the same as space; on reflection, however, we can begin to understand that they are not the same. Depth is a neural estimate of space, based on image information.
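As a minimal illustration of depth being inferred rather than given, take binocular disparity, one of the standard depth cues: nothing in either image says "this point is five meters away", but given the viewing geometry, an estimate can be computed. The triangulation formula for a rectified pair is standard; all the numbers below are invented for illustration.

    # Depth is inferred, not sensed: stereo triangulation for a rectified
    # pair, depth = focal_length * baseline / disparity.
    # The numbers are illustrative, not measurements.
    focal_length_px = 800.0   # focal length expressed in pixels
    baseline_m = 0.065        # interocular separation, about 6.5 cm
    disparity_px = 10.0       # shift of one scene point between the two images

    depth_m = focal_length_px * baseline_m / disparity_px
    print(depth_m)  # 5.2 (meters): an estimate derived from image information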

Let's keep going. Spatial orientation is another good one: 'up' and 'down' and 'left' and 'right' are, in fact, not part of space. I've already made my complaint about this one: spatial orientation is created by the brain.

If we keep going like this, what do we have left? What is there about visual experience that is not in some way created by the brain? How can I state that there is an 'external' component to vision?

The only feature of vision, it seems, that is not generated by the brain is the internal spatial organization of the image, the positional relationships between points in the image - what in visual neuroscience is recognized as retinotopy. Spatial relationships between points in the visual field do not need to be recovered, only preserved. A person's ability to use this information can be lost, certainly, through damage to the dorsal stream (simultanagnosia, optic ataxia, neglect, etc.). This does not mean that the visual experience of these relationships is lost, only that it is unable to contribute to behavioral outputs. I think it is a mistake - commonly made - to assume that a patient with one of these disorders is unable to see the spatial relationships that they are unable to respond to. Assigning to the brain the generation of positional relationships needs evidence, and I know of none. A digital, raster-image-based system would be different, of course: a video camera detects images by reading them into a long, one-dimensional string of symbols. Positional relationships are lost, and can only be recovered by using internal information about how the image was encoded to recreate those positions. The visual system never needs to do this: it's all there, in the very structure of the system, starting at the pupil of the eye.
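The camera analogy fits in a few lines (a toy sketch, not a claim about any particular camera): once a 2-D image is serialized into a 1-D stream, vertical adjacency is gone, and it can only be recovered with internal knowledge of the encoding - here, the row width.

    import numpy as np

    # A raster readout serializes the image, so two vertically adjacent
    # pixels end up `width` symbols apart in the stream. Recovering the
    # positional relationships requires knowing the encoding.
    image = np.arange(12).reshape(3, 4)    # a tiny 3 x 4 "retina"
    stream = image.ravel()                 # raster scan: one long string of symbols

    width = 4                              # internal knowledge of the encoding
    recovered = stream.reshape(-1, width)  # without `width`, adjacency is lost
    print(np.array_equal(image, recovered))  # True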

So, here is my understanding of vision: it is a stack of transformations, simultaneously experienced. The bottom of the stack is, at the very least, the retinal image (and if the image, why not the logically prior optic array?). Successive levels of the stack analyze the structure of the lower levels, discriminating colors, brightnesses, depths, and identities; this entire stack is experienced simultaneously, and is identical with visual consciousness. But, the entire thing is anchored in the reality of that bottom layer; take it away, and everything above disappears. Activity in the upper levels can be experienced independently - we can use visual imagination, or have visual dreams, but these are never substantial, and I mean this not in a figurative sense - the substance of vision is the retinal image.

This view has consequences. It means that it is impossible to completely reproduce visual experience with any brain-only simulation, i.e. a 'brain in a vat' could never have complete visual experience. Hallucinations must be mistakes in the upper levels of the stack, and cannot involve substantial features of visual experience - a hallucination is a mistaking of the spatial organization in the lowest levels for something that it is not. Having had very few hallucinations in my life, I find no conflict here with my own experience. I can imagine that a hallucination of a pink elephant could actually involve seeing a pink elephant in exactly the same experiential terms as if one were there, in physical space, to be seen, but I don't believe it, and I don't think there's any evidence for vision working that way. Similarly, dreams are insubstantial, I claim, because there is nothing in that bottom layer to pin the stack to a particular state; memory, or even immediate experience, of a dream may seem like visual experience, but this is a mistake of association: we are so accustomed to experiencing activity in the upper levels as immediately consequent to the image that when there is activity with no image, we fail to notice that it isn't there! I think, though, that on careful inspection (which is difficult in dreams!), we find that dream vision has indeterminate spatial organization.

Anyways, that's my thinking. This has gone on long enough, I need to work on this proposal...

Sunday, March 18, 2012

oscillate, explode, or stabilize

must learn about runge-kutta methods,
must learn about runge-kutta methods,
must learn about runge-kutta methods.

clearly this too-complicated model is suffering because of the temporal resolution. i've spent nights now trying to figure out why the thing wasn't working right - and did find a few errors along the way, which i don't think would have made or broken the thing anyways - and finally i conclude that the response time constant was too small. this is strange, because the same model works great with a 2d network, and perfectly with a single unit; apparently there's something about this network, which is essentially 3d, that effectively makes the time constants faster... it must be that compounding the differential during the convolution, over multiple filter layers, effectively speeds everything up.
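for the record, here's a toy version of the trichotomy in the title (a sketch with made-up numbers, not the actual model): forward euler on a leaky integrator dx/dt = -x/tau stabilizes when dt < tau, oscillates (while decaying) when tau < dt < 2*tau, and explodes when dt > 2*tau; rk4's stability region extends to roughly dt < 2.8*tau. if compounding layers effectively shrinks tau, that pushes the system rightward through exactly those regimes.

    import numpy as np

    # toy leaky integrator dx/dt = -x/tau, stepped with forward euler vs rk4.
    # with dt > 2*tau, euler's update factor (1 - dt/tau) has magnitude > 1:
    # sign-flipping, growing oscillations. rk4's stability region is wider.
    def f(x, tau):
        return -x / tau

    def euler_step(x, dt, tau):
        return x + dt * f(x, tau)

    def rk4_step(x, dt, tau):
        k1 = f(x, tau)
        k2 = f(x + 0.5 * dt * k1, tau)
        k3 = f(x + 0.5 * dt * k2, tau)
        k4 = f(x + dt * k3, tau)
        return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

    tau, dt = 1.0, 2.5          # dt > 2*tau: past euler's stability limit
    x_e = x_r = 1.0
    for _ in range(20):
        x_e = euler_step(x_e, dt, tau)
        x_r = rk4_step(x_r, dt, tau)
    print(x_e, x_r)             # euler has exploded; rk4 has decayed toward 0

(rk4 buys some headroom, but the real fix for a stiff system is a smaller step or an implicit method.)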

it's not like i wasn't aware of this problem at first. i thought i had solved that by doing the global normalization, where the convolution stage would basically be treated as a single layer. last night, i decided that collapsing that stage to one layer was a mistake, because it resulted in the pools everywhere being overwhelmed by the finer-grain channels, since those filters are more numerous. that may actually be correct, with some sort of leveling factor, but at any rate i took out the collapse. it didn't change performance much, but that's when i was using a too-complex test case (two faces), instead of the current test case of two gratings. now i realize that the pooling was accelerating the responses, resulting in useless behavior by the network - turning up the interocular inhibition to any level that did anything tended to result in ms-to-ms oscillations.

so, the compounding of responses was doing it, i guess, and would be doing it even if i still had the pooling collapse worked in. but now i can't understand why i didn't get the same problem, apparently ever, with the fft-based version of the model. now i'm suspicious that maybe i *did* get it, and just never perceived it because i wasn't doing the same sorts of tests with that thing.

not quite back to the drawing board. i wish i could get away from the drawing board, for just a few nights, so i could work on this goddam proposal like i should have been doing for the past 2 months.

Tuesday, March 13, 2012

An Instantiation of a General Problem

(I wrote this, but never finished it, in China back around Christmastime. Randomly remembered it today, and thought this would be as good a place as any for it.)

The key was to be found across the city, in the old commercial district. We had tried simulations, implanted demos, viewed stereoscopic images through a haploscope we found in storage in the medical school. After all of these, we had tried hallucinogens to modulate the imagined presence of the key, but it was all to no avail. At least, we said to ourselves, when we finally approach the key we will be familiar with it. The front end of the process will not be a surprise.

The approach, however, to that front end, would be horrendous. First, our camp was protected from the feed. This kept the peace from finding us, but it also meant that our emergence into the feed would stand out like a tree in the desert. We had monitored the security cycles for days. Most would say that such monitoring was futile, since the cycle paths were random, generated with new seeds every minute give or take another random cycle. Any attempt, most would say, to predict gaps in the cycle would result in no better chance of unnoticed entry than no attempt at all, with the added hazard of false confidence to mask the creeping signs of detection.

It was possible, though, to closely estimate the number of cycles. We could detect the passes themselves, which gave us data for the estimation. The different cycles were unique, originating from different security servers, each assigned its own identification during its current generation. Given all these data, we had a method for estimating, at any given moment, the likelihood of a pass. The optimal estimate could be made using the previous twenty seconds of data. You could have pointed out that a likelihood is the opposite of a certainty, at least along a certain conceptual dimension. You could also have pointed out that the optimal estimate was lousy if those twenty seconds contained a generation update. We would have ignored you.

Once inside, we would have to obtain city ids from an admin, which was not trivial, but not a problem as long as we could quickly make contact with Tsai, our woman on the inside. We knew she was still online and that her admin was current, so as long as she wasn't in some unshakable stupor, she would tie us on and we'd be set for the rest of the trip. Anyways, persisting for a few minutes with unregistered cids wasn't as dangerous as suddenly emerging out of the void. An impulse is like that tree in the desert and the primary means of detecting aliens, while trouble finding a cid registration is a basic function of the feed servers, which would be checked in serial, assuming corruption or damage first and alien somewhere further down the line. Tsai could just tie us onto the oldest and most remote server, plot a false geographic history of intermittent reception and an outstanding service request, and there would be nothing in the feed to mark us out. The tree would dissolve into a puff of dust.

The next problem would be the actual emergence into the city. Feed presence can be smoothed over, anyone can appear to be anyone, fit into any group, assume any identity. The body, however, is much less convenient to modify. Their hair is long, but ours is short. Their skin is yellow, but ours is brown. We stand head and shoulders above them on the street, and we have no choice but to travel on the street for the most part, by foot, in the open, making stark and clear the comparison between foreigner and local. But, there are other foreigners in Haisheng. They are few and far between, but there are others, and though we draw attention it is natural, because who can ignore a brown spot among yellow? The noticing is in itself not a threat. But when others are looking for you, being easily noticed is a step away from being easily found. We did not want to be found, but there was no choice but to be noticed.

The final hazard was beyond any interaction with the first two. At the time I could not imagine how, but I was still cognizant that there was a possibility that the locked id had already been accessed by my competitors before I had retrieved it. If so, they may even have already decrypted it, outformed the important information inside, and restored the encryption. This was beyond any vital worry on my part, since the main danger was that knowing the key, and that I was looking to open the id, they might be waiting for me at the site. This meant I would have to move slowly through the streets, below them when possible, work quickly when it was time to get the key, and maintain vigilance on all channels at all times. There was nothing else we could do but be vigilant.

I can tell you more about the key without compromising the truth of the mission. Someday down the line, you may be able to put two and two together, but by that time whether or not you know such an obscure truth won't matter much, and you'll be occupied with obscuring your own. Anyways, it is an interesting detail, and may spark one or another interest in you.

The id I had retrieved was that of a neural engineer from a century or so earlier. We needed to query it regarding some interactions it had had at one time with our main objective, whose id at the time was missing and presumed destroyed. As it turns out, this engineer had dabbled in id encryption, which was a new field in those days, specifically in encryption through perceptual experience. Though the field was active at the time, it was - and remains - completely unknown to science that this particular engineer had worked on the problem. It was a private pastime, perhaps a paranoid fear that a great advance might be stolen, or maybe it was just a fear of inadequacy in an outsider bringing to the field such an idiosyncratic development. At any rate, this engineer had come up with something exquisite, which was probably unmatched by anything else produced by her generation. She may have meant it entirely for herself. Today, it's a work of art, but the tech is fundamentally outdated.

This is a digression, I'm sorry. Outdated or not, it was a good lock, and on site we still needed the key to open it. The encryption was applied to the id by taking the online state of some suite of perceptual systems, definitely including visual, possibly other - and by the way, don't take my ambiguity as indicating anything other than an intention to be ambiguous - and using this neural state as the key for the encrypted id. The entire state couldn't be recorded, of course, since the subject would have to be standing out in the open at the location, i.e. a true state scan would be impractical, especially in those days. Instead, something was probably worn, perhaps obvious or perhaps hidden, instantaneously recording a blocked brain state amounting to just a few terabytes. It was a functional state, meaning that it could be reproduced in other human brains, but our initial estimate that a good visual simulation would suffice proved wrong. We needed to be there, unless someone could explain exactly what composed the key, and the only person who could tell us that, it appeared, was the one locked in that id.

Back to the problem. Being noticed, maybe being scooped, these were mostly outside our control. But skipping as an alien into a secure feed using random-cycle maintenance, that's something we can deal with. Look at the figure field. We used standard methods to monitor the cycles and establish their regeneration characteristics, how many there were, durations of the cycles, amplitude of the duration modulation - everything here is something you've seen before. You all have four minutes to generate the optimal estimate from these data, starting - now.

Monday, March 12, 2012

multi-channel M-scaled discrete filter convolution

Okay, so, I built this really neat discrete filter-based visual field model, planning to use it to measure binocular image statistics and to generate more realistic rivalry simulations. I hoped that doing the simulations would actually be quicker using the filters, since there would be far fewer filters than image pixels (I was using image-filter convolution to do the simulations I showed 2 posts ago), and the filters only needed to be represented by their scalar responses. Hoped, but did not believe...

So now, I just spent the weekend (wrote that first paragraph a week ago) staring at the code, trying to figure out how to do, essentially, convolution of a function with an irregular array. It is complicated! I wrote a function to get local neighborhood vectors for each filter within its own channel, and then stared at that for a couple of days, and then realized that I should have written it to get the neighborhood without regard to channel. It's a pretty gangly operation, but it does have a good structural resemblance to stuff I've been thinking about for years. Ed and Bruce's relatively abstract idea about the broadband gain control pools, well, I've built it. Not for the intended purposes, since there's not going to be any gain control here - the only suppression that will be involved is like an 'exit gate', the permission for information in the channel array to be moved out to the later stages ("consciousness", we'll call it).

And, I say again, it's complicated. It's definitely not going to be faster than the rectangular filter convolution; in fact, it's likely to be 3 or 4 times slower, and it's going to produce rougher looking images on top of that. All this just to incorporate stupid M-scaling into these stupid rivalry waves. I swear, I can't think of a better way to do it. And the thing still isn't going to know anything about surfaces or faces or houses or any of that stuff, and it's going to take forever to debug and proof since it's going to be so slow...

But it's going to be cool.

Monday, March 05, 2012

retrograde inversion

Several times in your life you may hear it noted that the retinal image is reversed and upside-down. Fewer times than that, hopefully, you may then hear it noted with curiosity that the brain somehow undoes this retrograde inversion. When you do hear this, please interject with the following:

"The brain does not reverse the coordinates of the retinal image. The brain does not know or care about about the retinal image's orientation relative to the world; as far as the brain is concerned, the image is not upside-down, or upside-up, or flipped or double-flipped. It is not delivered to the brain with reversed coordinates, but with no coordinates at all. The brain assigns spatial coordinates to the visual information it obtains from the eyes. It does this by integrating information about body position, gravity, and other consistent sensory cues about the state of the world. There is no reversal or correction of coordinates, there is only assignment of coordinates."

You will promptly be thanked for clearing up the misunderstanding, and hopefully your interjection will serve to end one strain of a particularly irritating bit of pernicious nonsense.

Thank you.