
Wednesday, August 08, 2012

20/100+

Yes, so my glasses broke about 2 weeks ago. I explained it already in the July 24 post.

I still don't have my new pair. Should be here any day now. I'm wearing a 10-year-old pair, over-negative in both eyes, but only at night, to watch the Olympics and "work" on my laptop. Otherwise I need to be within 12 inches or so to see clearly.

At distance, my acuity is no better than 20/200. I can tell just by looking into the distance and sticking a finger out: no detail that I can see is much smaller than maybe a quarter of the thickness of a fingernail (which subtends a little more than a degree). So, my acuity limit is probably not much more than 4cpd.

If this were the best I could do, I would be on the bad end of low vision. But it gets better the closer you get to my face (come on, get closer to my face) - like I said, everything is sharp and clear within a foot or so. But at distance - which is how I spend a lot of my commuting time, at least, and a lot of time at taekwondo - I'm 20/200. How bad is that?

For one thing, 4cpd is about the acuity limit of a cat (or of some jumping spiders). So it's not bad on a basic vision standard, because cats and jumping spiders are very visual creatures. 4cpd is good enough to get by on vision. But for a human, in the world of humans, it's not too good. At distance I can't recognize faces, at all. Ten days' practice, and I just can't do it if I don't have some other information, and then I don't think it counts. I can't read signage - I can't tell what trains are coming into the station at Government Center. None of that is debilitating, but it makes sense to call it a handicap. High frequencies aren't just details, they're content.

20/200 is resolvable detail of 10 minutes of arc. At 35cm, about where my laptop screen is right now, 10ma is about 1mm, which sounds small. But the dot pitch (vertical/horizontal pixel separation) on my screen is about .227mm. With my corrected acuity being around 20/15, I should be able to see details at least as small as 1ma, about .1mm apart, so I can discriminate individual pixels with my normal acuity. At 20/200, I wouldn't be able to discriminate details smaller than 4 or 5 pixels across; I would definitely not be able to read this text (whose lines are 1 pixel thick, and which tend to about 10x5 pixels HxW).
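That arithmetic is easy to redo for other distances or screens. A minimal sketch (the viewing distance and dot pitch are the numbers from this paragraph; the function name is just mine):

    import math

    def arcmin_to_mm(arcmin, viewing_distance_mm):
        """Size on the screen subtended by a visual angle given in minutes of arc."""
        return 2 * viewing_distance_mm * math.tan(math.radians(arcmin / 60.0) / 2)

    dist_mm = 350.0        # laptop at about 35 cm
    dot_pitch_mm = 0.227   # pixel spacing on this screen

    for label, arcmin in [("20/200 (10 arcmin)", 10.0), ("corrected (~1 arcmin)", 1.0)]:
        detail_mm = arcmin_to_mm(arcmin, dist_mm)
        print(f"{label}: {detail_mm:.2f} mm = {detail_mm / dot_pitch_mm:.1f} pixels")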

Printed text, which I like to hold pretty close to my face, at least 20cm, would still be unreadable. I'd have to hold it much closer, where the shadow of my head would start to get in the way. I'd need large print. I wouldn't be able to read music. I'm also wondering how much acuity you need for the classic threading of a needle, or to slice meat and vegetables without slicing yourself - those aren't quite the same as reading, since you're mostly looking for centroids, but I wouldn't really want to try. (I haven't even started to wonder about depth perception - I noticed it was off for the first few days, but I seem to have adapted pretty well, and I'm not afraid to cross the street the way I was at the start.)

Not that I'm going to try, but I could: visual acuity at about 20 degrees eccentricity is close to 20/200. There you have to contend with crowding, though, so you're effectively worse off.

So 20/200 isn't disabling, but it does prevent you from accessing all sorts of primate-relevant stuff. Faces, reading, music, fine finger-based activities. It's been interesting, but I'm just about ready for my new pair of ($10!!) glasses to arrive.

Tuesday, July 24, 2012

height and a black belt

two posts in a day, this is bad.

just been meaning to write this down: saturday afternoon i leave the apt to go to tkd, walking down the sidewalk, take off my glasses to blow off some dust, then go to put them back on, and they snap in half. so my glasses broke.

i go to tkd, not wearing glasses, and not really able to precisely recognize faces beyond a couple of meters. in the distance i see a woman practicing something, a black belt, with mid-length black hair pulled back in a ponytail. at first i think, that's j*, just i guess because her style and general characteristics seemed right. but then i thought, no, that's not her, she's too small. i got a little closer, and yes, this woman was too small to be j* - a couple of inches shorter than me. a little closer and, then, yes - it was indeed j*.

i felt certain that j* was taller than me - if you had asked me before how tall i thought j* was, i would have guessed, oh, maybe 6' or so, she's a real amazon. apparently she's closer to 5'8". taller than the average woman, and still an amazon, okay, but not what i thought. in my mind, i guess, her high status (at tkd) had me convinced that she was actually larger than in reality. i've noticed this effect before (it's been studied for a long time, and probably known forever), but never in a context like this: being certain that a person had a relatively unusual trait (a woman taller than me), and then being unable to recognize her because i didn't see the trait - which was never there in the first place.

"she's actual size, but she seems much bigger to me"

Friday, July 20, 2012

update

that last entry was kind of embarrassing. guess it's worthwhile to keep a record of peaks in frustration.

anyways, kind of better now. with the data from the new rivalry experiment, i was 1) making an error in the processing, and 2) even with the error corrected, using a dumb analysis. i did the 'better' analysis, which i had had in mind but had thought would be more complicated than it turned out to be (and which did require that the error be fixed), and got basically what i was looking for: before a target is reported as seen, there is an increase in its strength.
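a minimal sketch of the kind of report-triggered average i mean, assuming the data are just two vectors sampled at the same rate (the variable names and the fake coupling at the bottom are made up, purely for illustration):

    import numpy as np

    def report_triggered_average(strength, reports, pre_samples):
        """Average stimulus strength over the window leading up to each report onset.

        strength: 1-d array of target strength over time
        reports:  1-d boolean array, True while the target is reported as seen
        pre_samples: number of samples before report onset to average over
        """
        onsets = np.flatnonzero(np.diff(reports.astype(int)) == 1) + 1
        segments = [strength[t - pre_samples:t] for t in onsets if t >= pre_samples]
        return np.mean(segments, axis=0) if segments else None

    # toy usage: in the real analysis, strength and reports come from the recording
    rng = np.random.default_rng(0)
    strength = rng.normal(size=10000)
    reports = strength > 1.0
    print(report_triggered_average(strength, reports, pre_samples=50))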

i then tried to expand it out, looking for effects in non-target locations. this also seems to work; i'll have to figure out how to separate the effects of spatial correlation in target strength, i.e. a part (maybe the major part) of these peripheral effects will be non-interesting because they will be firmly tied to the central target effect.

i also will need to make the analysis more specific, since each time a target is reported, it matters whether the transition is from a different report or from an absence of report. this makes a difference in how the data are interpreted: 1) the increase in stimulus strength caused a dominance change (if there was an immediately previous report of a different target), 2) the increase in stimulus strength firmed up an indeterminate state (if the previous report was 'mixed'), or 3) the increase in stimulus strength made the current dominance state noticeable, i.e. made the target color visible.

so, i expect that i will need much more data to make these sorts of different relationships clear. i will collect another half-hour's worth of data today, then i'll have more than an hour total. may be able to get something interesting out of that...

Wednesday, July 04, 2012

scintillating scotoma 3.b




I just can't let this go. Maybe this is as far as I can get with it for the time being. I wasn't satisfied with the straight-line model of the CSD wave that I had been playing with, because there was just no straightforward way of figuring out direction, origin, speed, etc. But I had a great idea: say the wave is like a surface wave, with a point origin, spreading out in every direction at the same speed. Maybe this isn't true; it may be that the phenomenon is limited to V1; it may be that the origin is, e.g., the interface between cortical areas; it may be that the speed or direction varies. The first isn't a problem, but if we assume the other two conditions (a point origin and constant velocity) are true, or close to true, then we can come up with the simple model you see in the gif above: according to my June 24 data (fit with a lazy grid search), the CSD is a surface wave, traveling at (exactly!) 3mm/min, arising within about 3mm of the V1/V2 border, near the foveal confluence.
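Here's roughly what I mean by a lazy grid search, assuming the traced scotoma has already been converted to cortical (x, y, t) samples; the candidate grids at the bottom are placeholders, not my actual ranges:

    import numpy as np

    def fit_point_source_wave(x, y, t, origin_candidates, speed_candidates):
        """Grid-search a point origin and constant speed for a radially spreading wave.

        For each candidate origin and speed, predict arrival time from distance / speed,
        allow a free onset offset, and score against the observed times.
        """
        best_err, best_fit = np.inf, None
        for ox, oy in origin_candidates:
            dist = np.hypot(x - ox, y - oy)
            for v in speed_candidates:
                pred = dist / v                    # arrival time relative to onset
                offset = np.mean(t - pred)         # best-fitting onset time
                err = np.mean((t - (pred + offset)) ** 2)
                if err < best_err:
                    best_err, best_fit = err, (ox, oy, v, offset)
        return best_fit

    # x, y in mm of cortex, t in minutes, all from the traced maps; for example:
    # origins = [(ox, oy) for ox in np.arange(-5, 5.5, 0.5) for oy in np.arange(-5, 5.5, 0.5)]
    # speeds = np.arange(1.0, 6.0, 0.1)   # mm/min
    # print(fit_point_source_wave(x, y, t, origins, speeds))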

I'm not going to go into detail here about the spatial properties of the wave. It's simple, that's enough for you.

Anyways, nothing crazy here; this is all consistent with what I understand of the published research so far. It's a lot of fun figuring out how to create and convert all these maps, and it's amazing just how solid those parameter estimates are given the data. Sometime soon I'll look deeper into it; hopefully I can put it off for a while.

Tuesday, June 26, 2012

scintillating scotoma 3.a

I went ahead and figured out how to do the cortex transform. There are many papers describing the equations, including some good recent ones (like this one) that are basically reviews while also tweaking one or another aspect of the basic model. It's not really that complicated; the log-polar transform is very similar, except that the angles are calculated outside the logarithm. The space-V1 transform is the logarithm of a complex number representing the spatial coordinates, offset by a constant that sets the extent of the foveal confluence. The paper I linked above describes what further steps can be taken to make the transform more precise, accounting for meridional anisotropies. They go further, but I stopped there. The basic model was proposed by E.L. Schwartz in 1977, and it hasn't changed much since then; I'm using Schira et al's version with their shear equation, and some parameters they cite in another paper.
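For concreteness, here is the basic monopole version of the transform (Schwartz's original form, without the Schira shear refinement), as I understand it; the parameter values below are generic ballpark numbers, not the ones from the papers:

    import numpy as np

    def visual_field_to_v1(ecc_deg, angle_deg, k=15.0, a=0.7):
        """Monopole log map, w = k * log(z + a).

        ecc_deg, angle_deg: eccentricity and polar angle in the visual field
        k: cortical scale factor (mm); a: foveal offset (deg) -- ballpark values only
        Returns cortical (x, y) in mm, with the foveal confluence near the origin.
        """
        z = ecc_deg * np.exp(1j * np.deg2rad(angle_deg))
        w = k * (np.log(z + a) - np.log(a))   # subtract so that ecc = 0 maps to (0, 0)
        return w.real, w.imag

    # e.g. a point at 10 degrees eccentricity on the horizontal meridian:
    print(visual_field_to_v1(10.0, 0.0))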



This is similar to the second plot from the last post, but you will notice the geometry is different, as it gets narrower towards the fovea (lower part). Colors indicate time in minutes as shown by the colorbar. The grid drawn in the background isn't labeled, but it's easy to understand if you've seen these before. The lines going up and down are, from left to right, the superior vertical meridian, the left superior 45 degree meridian, the left horizontal meridian, and so on. From bottom to top, the left-right lines are spaced 5 degrees of visual angle apart. You don't see the first one until about 30mm up. The origin in this plot, (0,0), is where the foveal representation converges with V2 and V3, the foveal confluence.

This is interesting, the foveal confluence. I probably had heard of this before and forgotten it. I actually stated to E* yesterday that I didn't know what was on the other side of the foveal edge of V1, though I knew that the edges are flanked all the way around by V2. In fact, the V1, V2, and V3 foveae all meet in the same place. This is apparently a relatively poorly understood region of visual cortex; imaging and physiology studies have focused on the more peripheral regions. The reason is that it's hard to be certain of what is being studied if one looks closely at the confluence, since the three areas are mixed together in a fashion that is still not well understood. I'm going to read more about this (the main writing on it is by the same group as the paper I cited at top; this one explains things up front).

Okay, so that map. What can I do with it, now that I have the coordinates right (or as close to right as I can get them)? Yes: I can measure the rate of progression of the wave in cortical distance over time. Awesome. I don't have the best method worked out just yet, but here's my approximation, summarized in the last figure below:

On the left, we have the same coordinates as in the figure above. The plotted line is the mean, over time, of the recorded scotoma regions. This is not a great measure of the position of the wave: as the scotoma got further out and larger, I couldn't trace it completely, and because tracing takes time, at a given epoch a trace might be in one place or another, which shows up here as a back-and-forth wobble, on top of whatever limiting bias is imposed by the screen size, etc. Still, it's okay. We know this because of the next plot: on the right, we have the distance of that waggly trace (from its starting point near the bottom of the left plot) as a function of time. A straight line. Actually, the straightness isn't why we know it's okay; it's the slope of the line: 2.76 mm/min. This is extremely slow, but exactly in the realm of cortical spreading depression. Not going to give references on that (need to save some work for an actual paper on this business), but they're there. Pretty sure I'm doing this right.
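The slope estimate is just a straight-line fit of cortical distance against time; a minimal version of that step (array names are mine, not from my actual script):

    import numpy as np

    def wave_speed(cortical_xy, t_min):
        """Slope of cortical distance-from-start vs. time, in mm/min.

        cortical_xy: (n, 2) array of mean traced positions in mm, one row per epoch
        t_min: corresponding times in minutes
        """
        dist = np.hypot(cortical_xy[:, 0] - cortical_xy[0, 0],
                        cortical_xy[:, 1] - cortical_xy[0, 1])
        slope, intercept = np.polyfit(t_min, dist, 1)
        return slope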




Sunday, June 03, 2012

sum.. sum.. summm... summation?



ok, so i'm working on this classification image paper, and it's going really well, and i'm pretty happy about it. i feel like i've got a good handle on it, i'm writing it in one big shot, the analyses are all good, the data are fine, it's all under control. i'm pretty happy about this one. i keep telling myself that, and then noticing that i keep telling myself that. i guess it's in contrast to the blur adaptation paper, which was such an ordeal (took 2 years, basically), and then the magnification paper, which just isn't much fun. i feel myself moving down that priority list - hey, i should do a post on the priority spreadsheet! i made some nice plots in there!

anyways, the CI paper, it's going well, but i'm constantly on the lookout for problems. so tonight, i finally thought of one. not a crucial, deep problem, but a problem with how i've calculated some of the modeling stuff, a serious enough problem that i'll probably have to redesign a bit of it before doing the final runthroughs. i'm writing this entry so i can just sort of kick off thinking of how to solve the problem. here it is, right plain as day in this little cluster of plots from last year's poster, which has become this fine little paper:

the problem is spatial summation - or, the problem is that you don't see anything about spatial summation in those plots. for the main models, i have a CSF that was measured using test-field-sized images. the thresholds measured must reflect a sort of spatial summation, then. the problem is, i've been using those thresholds to set the baseline thresholds for the models, and then summing over the spatial responses. i had kind of had an inkling that i was being lazy there, but had overlooked how obviously stupid that is. i haven't tested the models on the threshold tasks, but i think that they would necessarily get much lower thresholds than the humans; spatial summation should give you a lower overall threshold than you would get for any single location. i need to think of a quick way to solve this, because i don't want to wind up estimating the model CSF through simulations...

and the simulations then raise the problem of noise, and how many samples should there actually be, etc etc... i guess there are benefits to doing things the simple way first, but i think i've run myself into a weird little corner here. gonna need to talk to somebody about this, probably..
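a toy version of the summation problem above, assuming probability summation over independent detectors (Minkowski pooling with a typical exponent - all the numbers here are arbitrary): pooling over locations pushes the model's threshold below the single-location threshold it was calibrated with.

    import numpy as np

    def pooled_sensitivity(single_location_sensitivity, n_locations, beta=3.5):
        """Minkowski / probability summation: pooled sensitivity grows as n**(1/beta)."""
        return single_location_sensitivity * n_locations ** (1.0 / beta)

    # sensitivity calibrated from a full-field CSF measurement, then (wrongly) treated
    # as a per-location sensitivity and pooled again over space:
    for n in [1, 16, 256]:
        s = pooled_sensitivity(1.0, n)
        print(f"{n} locations: pooled sensitivity x{s:.2f}, i.e. threshold / {s:.2f}")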

Monday, May 28, 2012

scintillating scotoma 2

another aura! (no post in a month, and now it is forced!)

right now it's flying out to the far left field, but there's also a weird 'rough patch' a bit nearer in, which appeared late on. i got a good map. i didn't notice this one coming on at all, except for what i think is just sort of standard low-level paranoia which i've developed since this all started. it's hard to tell what's prodrome and what's just coincidental suspicion.

anyways, i was reading through a section of the classification image manuscript, and, yep, can't see the letters. i think that 8/10 of these things so far have begun while i was reading. not sure if that's because i spend the majority of my time working with text in one way or another, or if that actually pushes things over the edge.

***

it's been a few hours now. here's the map i drew of this last event:

similar to last time, except that it's in the left field this time. it started out below fixation, and did the slow arc outward, following a really similar path to the last one. below, i'll show you just how similar they are.

as for the rough patch: in the plot above, notice that in the superior field there's some green (and hard-to-see gray) scribbles over the neat arcs; that was a region that i noticed late which wasn't blind, and wasn't flickering, but was clearly..  unclear. it's at about 10 degrees eccentricity, so it's hard to see well out there anyways, but it was obvious that something was wrong. i could see the scribbles that had already been drawn, but it was all very indistinct and jumbled, and i couldn't see the motion of the cursor even though i could tell that it was laying down green/gray scribbles of its own. without any other explanation, i'm going to guess that this was the fabled extrastriate scotoma - my V2 was getting some CSD!

ok, now some analysis. first, log-polar maps. i haven't gone and gotten/worked out a cortical remapping scheme, but putting things in log-polar coordinates is almost as good. actually, what you do is put them in log(ecc+1)-polar space, so you can see where the fovea is (zero, in case you're dull). here are log(ecc+1)-polar-time plots for the last two events (let's say L = log(ecc+1), for easier reference):

time is measured here from start of recording. these data are smoothed versions of the drawn maps in 10 degree radial steps. the scales are the same except that these are for opposite hemifields. the z-axis is color coded.
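the coordinate change itself is tiny; for the record, here's how i'd convert a drawn point (in degrees from fixation) into these L-polar coordinates (the function name is just for this sketch):

    import numpy as np

    def to_logecc_polar(x_deg, y_deg):
        """Point (x, y) in degrees from fixation -> (L, angle), with L = log(ecc + 1)."""
        ecc = np.hypot(x_deg, y_deg)
        angle = np.degrees(np.arctan2(y_deg, x_deg))
        return np.log(ecc + 1.0), angle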

i mean, you can just see that those two maps are almost identical. i got more data the second time because i wasn't occupied with working out a system (btw i was caught off-guard; i wrote a matlab script to record in real time, but it didn't work in matlab-64, and i had put off doing the -32 install because... ah.. it's done now). the origin is similar - look at these plots:

sorry about the colors. these are the same data as in the scatter plots above, except collapsed over the x (angle) axis. actually they aren't quite the same: here i've used linear regression to estimate the earliest time (before recording began) that the visual event could have begun, and subtracted it out of the time axis, so these plots start when recording began with respect to when i estimate the event began. so, basically, this aligns the data. 2 things: one, the rate of advance (remember that it's radial advance, so these technically are distorted plots; i assume that's why the slope changes with angle) is basically identical in the two cases, ~.16 L/min. and two, the origin, i.e. where it all begins, is ~170-180 degrees in both cases (that's directly below fixation).
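the onset estimate is just the L-vs-time regression line extrapolated back to L = 0; roughly this (again, variable names are made up):

    import numpy as np

    def estimated_onset_time(t_min, L):
        """Extrapolate the L-vs-time fit back to L = 0 to estimate when the event began."""
        slope, intercept = np.polyfit(t_min, L, 1)   # L = slope * t + intercept
        return -intercept / slope                    # time at which L would have been 0

    # subtracting this value from each event's time axis puts the two events on a common clock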

i've got other analyses, but the above sums up the interesting stuff. i know i've seen these things do weirder things, following more difficult-to-understand courses, and i hope i see one of those next time i'm able to record this business.

Tuesday, April 24, 2012

scintillating scotoma

a few minutes ago (~22:35), reading, and i notice that letters are hard to see. that sensation of having a bright afterimage at fixation; it's moving rightward, when usually it's been leftward (i haven't taken notes on the past 2 occurrences, sadly..); this is at least the first of the last 4 to go through the right field, if not more.

it begins with just a weird sense of scotoma-ness, very near fixation, but the blind areas are hard to pin down - they seem to change very rapidly, or else it's more a sensation of blindness rather than actual blindness. it's strange how it sticks at fixation even as it arcs out into the periphery; it seems it always arcs into the lower field, after arcing just a bit above fixation. i've not yet noticed one passing across to the other side of the field - maybe it's restricted to one hemisphere?

it's almost gone at this point (almost 30 minutes after the first signs), and all that's left is a flickering at the very top of my visual field, as if there's a light flashing on my eyebrows; interestingly, if i look up, it disappears, which is strange because it should be attached to the field location. i can look up, it disappears, look down again and it reappears. maybe it was an interaction with the reflection of room/computer lights off my eyeglass frames? can't test, it's all gone now..

and i have a headache (actually it started a few minutes in; the light show was so slow to start, i thought we were skipping straight to the headache for a few minutes, a bit of disappointment, but it worked out!)

also, some hints: today and yesterday, i wondered several times if i wasn't going to get a headache soon, without understanding why. not sure what sets off those feelings.. this afternoon, i thought i saw some flashes at some point when i was walking down the hallway, and that really made me suspicious; and, all day, really tight, painful muscle spasms throughout my upper back, in the trapezius on both sides.

23:05

map below: i have a few of these now, should get to processing them this summer..



Edit: look at what this guy has done: http://www.pvanvalkenburgh.com/MigraineAura/MigraineAuraMaps.html. pretty amazing.

also, i did wind up writing a script to analyze these plots; once I get some stuff settled, i'll post those in a new entry.

Friday, April 20, 2012

lazy friday dark adapt


Spent afternoon of Friday, 4-20-12, with a ~3 log unit (~.2%) ND filter over my right eye. Made the following observations:
(Took the filter off after about 4 hours. It wasn’t bothering me much anymore, but I think the plastic and rubber stuff in the goggles was irritating my eyes, which were starting to feel kind of dry and red. Light adaptation is really fast, it’s just been a minute and the (formerly dark adapted) right eye’s image only seems slightly brighter than the left’s.)
  1. Noise: the dark adapted eye’s view is noisy, and the noise intrudes into the dominant view. It’s irritating. The dark adapted view isn’t being suppressed, though, it’s there like a ghost. Double images, from depth, are strikingly noticeable, not sure why.
  2. Pulfrich effect: first time I’ve really seen this work. I put my index fingers tip-to-tip and move one from side to side, fixating on the still one, and the moving finger looks like it’s rotating. My hand even feels like it’s rotating.
  3. Pulfrich effect 2: Fusion isn’t always working, but I seem to be ortho a lot of the time. I just noticed, though, that if I make quick motions, e.g. a flick of a finger, there’s a delay in the motion between the two eyes; the dark adapted image is delayed by several hundred milliseconds! Especially obvious if I focus at distance so I have a double image of the finger. Explains the strong Pulfrich effect.
  4. Noise 2: Just looked at some high frequency gratings. With the dark-adapted eye, the noise was very interesting, looked like waves moving along the grating orientation, i.e. along the bright and dark bars there’s a sort of undulating, grainy fluctuation.
  5. I still have foveal vision and color vision, but both are very weak. Dim foveal details are invisible. High contrast details (text, the high frequency gratings) are low contrast, smudgy..
  6. Motion is kind of irritating, I think because it brings about lots of uncomfortable Pulfrich-type effects. Even eye movements over a page of text can be bothersome, because there is always an accompanying, delayed motion. I’m guessing that the saccade cancellation is being dominated by the light-adapted eye, and so I’m seeing the dark-adapted saccades. I don’t notice a depth effect, but walking around in the hallways I do feel kind of unsteady, maybe because of motion interfering with stereopsis. If objects are still, stereopsis seems to be okay.
  7. If I take vertical and horizontal gratings (64c/512px), add them together, then look at them at 25% contrast from about 30cm (here at my desk), I don’t see a compound grating – I see patches of vertical and patches of horizontal. I’ve never noticed this before; I wonder what differences there are with scotopic vision and cross-orientation suppression.. (a sketch of how I’d generate this stimulus is below, after the list)
  8. I tried to watch my light-adapted eye move in a mirror, but the dark-adapted eye just couldn't see well enough. I think a weaker filter would make it possible.
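For reference, the compound grating in item 7 is just two sinusoids summed; a minimal way to generate it with numpy (the exact contrast scaling is a guess at what I'd do, not a spec):

    import numpy as np

    size, cycles, contrast = 512, 64, 0.25
    xx, yy = np.meshgrid(np.arange(size), np.arange(size))
    vertical = np.sin(2 * np.pi * cycles * xx / size)
    horizontal = np.sin(2 * np.pi * cycles * yy / size)
    # sum the two gratings and scale so the compound spans +/- 25% around mid-gray
    plaid = 0.5 + 0.5 * contrast * (vertical + horizontal) / 2.0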

Tuesday, April 03, 2012

dream post!

recurring dream:

jingping and i are trying to get to the train station. the city is like a cross between boston and chicago - it's boston but with lots of overhead walkways and more of that chicagoesque feeling of sharp-edged criss-crossedness.

lots of things happen as we're on our way, it's like we're being chased, but the recurring part is where we get into the station and have to start climbing a stairwell, up and up. i know what's going to happen as the dream progresses. there's a fear of falling down the stairwell, but what happens is that it gets narrower and narrower, less and less place to put your feet, and you're crawling finally up a spiral tunnel, until you can't go further because there's just not enough space - around this point i know it's a dream, because i'm thinking that it can't really be this way, and i'm trying to change it because it's so damn uncomfortable. even in the dream, i'm thinking, why does this happen, why can't i fix it?

once it got to that point, i realized that my eyes were closed, but i couldn't open them, and yet i could still kind of see the twisting stairwell tunnel ahead - and there was a confusing sensation of being able to see but not being able to see, at the same time (interesting relevance to the visual consciousness stuff i was wondering about earlier, which is really why i'm writing it down). i was feeling around for the gap ahead, to see if i would fit, and i knew jingping was behind me and i couldn't back up, but i also felt like i could see it all...

i think i woke up soon after. i figure that noticing my eyes were closed and not being able to open them, and yet still having a sense of vision, must have been REM atonia - sleep paralysis, the sort of thing that gives you the feeling of being trapped and immobile in a bad dream.

anyways, i'm pretty sure i've had this dream a few times, the "shrinking stairwell dream".

dream post, yeah!

Monday, April 02, 2012

model update

I'm working on other things lately, but I did finally get that multi-channel rivalry model working - main problem was that I had written the convolution equations out wrong. I had to do the convolution there in the code because the filter array is irregular - there's no function to call for 2d irregular-array convolution, much less for switching the convolution between different layers.

Here's what I had done:

Z´(x) = Z´(x) + F(x)·Z(x), where x is a vector of spatial indices, Z is the differential describing the change in excitation or adaptation over time, F is basically just a 2-d Gaussian representing the spatial spread of activation for the inhibitory or excitatory unit, and Z´ is (supposed to be) the differential convolved with the spread function.

Now, that doesn't make any sense at all. I don't know what that is. In the actual code, that equation ran to 3 lines, with lots and lots of indices, because the system has something like five dimensions to it; so I couldn't see what nonsense it was.

This is how it is now:

Z´(x) = Z´(x) + sum(F(x)·Z(x))*F(x)

THAT is convolution. I discovered what was going on by looking at the filter values as images rather than as time plots; Z and Z´ didn't look different at all! Z´ should look like a blurred version of Z. Such a waste of time...
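For what it's worth, here's a generic way to do the spreading step on an irregular array (not my actual code, just the shape of the operation): each unit's differential is replaced by a Gaussian-weighted sum of its neighbors' differentials.

    import numpy as np

    def spread_on_irregular_array(positions, dZ, sigma):
        """Gaussian-weighted spreading of dZ over an irregular array of unit positions.

        positions: (n, 2) array of filter coordinates
        dZ: (n,) array of per-unit differentials
        sigma: spatial spread of the excitatory/inhibitory pooling
        Returns an (n,) array: each unit's dZ blurred over its neighborhood.
        """
        diff = positions[:, None, :] - positions[None, :, :]
        w = np.exp(-np.sum(diff ** 2, axis=-1) / (2 * sigma ** 2))
        w /= w.sum(axis=1, keepdims=True)   # normalize so a uniform dZ stays uniform
        return w @ dZ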

Anyways, it kind of works now. Different problems. Not working on it until later in April. The 'simple' single resolution model was used to generate some images for my NRSA application. Here's a sample simulation of strabismus (with eye movements):


Wednesday, March 21, 2012

vision includes the world

I need to procrastinate slightly more productively, so here is a short essay relating some of my thoughts on visual consciousness.

For years now, I've understood visual experience or consciousness (experience is easier to say and write, and has less near-meaning baggage, so let's continue with that term) as having two components:

1. The image. A part of vision is direct, which means that when you see an object, it is true to say that what you see is the thing itself, or at least the light reflected/emitted by that object (this is similar to the idea of the 'optic array'). This is a difficult position to hold, but I think it is a necessary default. The alternative, which is definitely more popular these days, is to say that what you see is entirely a representation of the thing itself, instantiated in the brain. This sort of idealism is attractive because the brain is obviously a self-contained system, and because experience also seems to be self-contained, and because every aspect of experience seems to have a neural correlate. If I say that vision involves processes or structure outside the brain, I have to explain why we don't see what we don't see; why don't I see what you see, for example?

It seems to me that in placing the contents of consciousness somewhere in the physical world, there are two possible null hypotheses: either everything is absolutely centralized, completely contained within the brain, or everything is absolutely external, completely outside the brain. The second account is rare these days (see Gibson), as the only job it leaves for the brain is sorting out responses to visual experiences. It seems clear that much of vision actually does occur within the brain, and I'll get to that in part 2, below. Now, these null hypotheses: that everything is internal is an objective hypothesis, based on e.g. a scientist's observations that the brain is correlated with experience; that everything is external is a subjective hypothesis, based on e.g. my observations that what seems to be in the world is actually there, i.e. that my sensations are always accurate.

Since visual experience is a subjective process which cannot be observed, I like to stick to the subjective null hypothesis: everything is external unless shown otherwise. Immediately on stating this hypothesis, we can start to make a list of the components of visual experience which are surely neural.

2. The brain. Let's start with the subjective null hypothesis: everything you see is there, in the world. Just a little thought proves that this can't be true: faces are a great example. Look at two faces, one of a person you know well - your sister or brother, maybe - and one of a stranger you've never seen before. There, in the faces, you see a difference that you can't deny, because one seems to have an identity and the other does not. This difference isn't purely cognitive or emotional, either, because one will easily make the admission that the face of his sister is his sister. Seeing her face, he will say, "That is her!" Clearly, however, the identity is not in the face - it is in the observer.

If this isn't a satisfying example, color perception must be. Color is not a property of images, it is a construct of the brain - this is not difficult to show, either with the proof that identical wavelength distributions can yield different color percepts in different conditions ('color constancy'), or with the inverse proof that different wavelength distributions can yield identical color percepts ('metamers'). We understand color as a brain's capacity to discriminate consistently between different (simultaneous or asynchronous) distributions of visible radiation. It is something that exists only in the observer.

These are easy, but it does get harder. Consider depth perception. In a scene, some things are nearer or further from you, but there is nothing in the images you sense that labels a given point in the scene as being at a particular depth. There is information in the scene that can be used by the observer to infer depth. So, depth is another part of the brain's capacity to interpret the image, but it is not a part of the scene. This is a more difficult step than with faces or colors, and here's why: whereas a face's identity, or a light's color, is plainly not a property of the world itself, we know that the world is three dimensional, and that objects have spatial relationships; and, we know that what we see as depth in a scene informs us as to these spatial relationships. However, we then make the mistake of believing that visual depth is the same as space; on reflection, we can begin to understand that they are not the same. Depth is a neural estimate of space based on image information.

Let's keep going. Spatial orientation is another good one: 'up' and 'down' and 'left' and 'right' are, in fact, not part of space. I've already made my complaint about this one: spatial orientation is created by the brain.

If we keep going like this, what do we have left? What is there about visual experience that is not in some way created by the brain? How can I state that there is an 'external' component to vision?

The only feature of vision, it seems, that is not generated by the brain is the internal spatial organization of the image, the positional relationships between points in the image - what in visual neuroscience is recognized as retinotopy. Spatial relationships between points in the visual field do not need to be recovered, only preserved. A person's ability to use this information can be lost, certainly, through damage to the dorsal stream (simultanagnosia, optic ataxia, neglect, etc). This does not mean that the visual experience of these relationships is lost, only that it is unable to contribute to behavioral outputs. I think it is a mistake - commonly made - to assume that a patient with one of these disorders is unable to see the spatial relationships that they are unable to respond to. Assigning to the brain the generation of positional relationships needs evidence, and I know of none. A digital, raster image based system would be different, of course: a video camera detects images by reading them into a long, one-dimensional string of symbols. Positional relationships are lost, and can only be recovered by using internal information about how the image was encoded to recreate those positions. The visual system never needs to do this: it's all there, in the very structure of the system, starting at the pupil of the eye.

So, here is my understanding of vision: it is a stack of transformations, simultaneously experienced. The bottom of the stack is, at the very least, the retinal image (and if the image, why not the logically prior optic array?). Successive levels of the stack analyze the structure of the lower levels, discriminating colors, brightnesses, depths, and identities; this entire stack is experienced simultaneously, and is identical with visual consciousness. But, the entire thing is anchored in the reality of that bottom layer; take it away, and everything above disappears. Activity in the upper levels can be experienced independently - we can use visual imagination, or have visual dreams, but these are never substantial, and I mean this not in a figurative sense - the substance of vision is the retinal image.

This view has consequences. It means that it is impossible to completely reproduce visual experience with any brain-only simulation, i.e. a 'brain in a vat' could never have complete visual experience. Hallucinations must be mistakes in the upper levels of the stack, and cannot involve substantial features of visual experience - a hallucination is a mistaking of the spatial organization in the lowest levels for something that it is not. Having had very few hallucinations in my life, I can say this does not conflict with my experience. I can imagine that a hallucination of a pink elephant could actually involve seeing a pink elephant in exactly the same experiential terms as if one were there, in physical space, to be seen, but I don't believe it, and I don't think there's any evidence for vision working that way. Similarly, dreams are insubstantial, I claim, because there is nothing in that bottom layer to pin the stack to a particular state; memory, or even immediate experience, of a dream may seem like visual experience, but this is a mistake of association: we are so accustomed to experiencing activity in the upper stacks as immediately consequent to the image, that when there is activity with no image, we fail to notice that it isn't there! I think, though, that on careful inspection (which is difficult in dreams!), we find that dream vision has indeterminate spatial organization.

Anyways, that's my thinking. This has gone on long enough, I need to work on this proposal...

Sunday, March 18, 2012

oscillate, explode, or stabilize

must learn about runge-kutta methods,
must learn about runge-kutta methods,
must learn about runge-kutta methods.

clearly this too-complicated model is suffering because of the temporal resolution. i've spent nights now trying to figure out why the thing wasn't working right - and did find a few errors along the way, which i don't think would have made or broken the thing anyways - and finally i conclude that the response time constant was too small. this is strange, because the same model works great with a 2d network, and perfectly with a single unit; apparently there's something about this network, which is essentially 3d, that effectively makes the time constants faster... it must be that compounding the differential during the convolution, over multiple filter layers, effectively speeds everything up.
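for reference, the difference i'm up against, in a toy form: forward euler with a step that's big relative to the time constant oscillates or blows up, where a runge-kutta step doesn't. this isn't the rivalry model, just a single leaky unit with made-up numbers:

    def simulate(tau=1.0, dt=2.1, steps=30):
        """Leaky unit dy/dt = (1 - y)/tau, stepped with forward Euler vs. classic RK4."""
        f = lambda y: (1.0 - y) / tau
        y_euler, y_rk4 = 0.0, 0.0
        for _ in range(steps):
            y_euler = y_euler + dt * f(y_euler)
            k1 = f(y_rk4)
            k2 = f(y_rk4 + dt * k1 / 2)
            k3 = f(y_rk4 + dt * k2 / 2)
            k4 = f(y_rk4 + dt * k3)
            y_rk4 = y_rk4 + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        return y_euler, y_rk4

    # with dt = 2.1 * tau, euler's error alternates sign and grows (explodes);
    # rk4 still settles near the fixed point at y = 1
    print(simulate())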

it's not like i wasn't aware of this problem at first. i thought i had solved that by doing the global normalization, where the convolution stage would basically be treated as a single layer. last night, i decided that collapsing that stage to one layer was a mistake, because it resulted in the pools everywhere being overwhelmed by the finer-grain channels, since those filters are more numerous. that may actually be correct, with some sort of leveling factor, but at any rate i took out the collapse. it didn't change performance much, but that's when i was using a too-complex test case (two faces), instead of the current test case of two gratings. now i realize that the pooling was accelerating the responses, resulting in useless behavior by the network - turning up the interocular inhibition to any level that did anything tended to result in ms-to-ms oscillations.

so, the compounding of responses was doing it, i guess, and would be doing it even if i had the pooling collapse still worked in. but now i can't understand why i didn't get the same problem, apparently ever, with the fft-based version of the model. now i'm suspicious that maybe i *did* get it, and just never perceived it because i wasn't doing the same sorts of tests with that thing.

not quite back to the drawing board. i wish i could get away from the drawing board, for just a few nights, so i could work on this goddam proposal like i should have been doing for the past 2 months.

Monday, March 12, 2012

multi-channel M-scaled discrete filter convolution

Okay, so, I built this really neat discrete filter-based visual field model, planning to use it to measure binocular image statistics and to generate more realistic rivalry simulations. I hoped that doing the simulations would actually be quicker using the filters, since there would be far fewer filters than image pixels (I was using image-filter convolution to do the simulations I showed 2 posts ago), and the filters only need to be represented by their scalar responses. Hoped, but did not believe..

So now, I just spent the weekend (wrote that first paragraph a week ago) staring at the code, trying to figure out how to do, essentially, convolution of a function with an irregular array. It is complicated! I wrote a function to get local neighborhood vectors for each filter within its own channel, and then stared at that for a couple of days, and then realized that I should have written it to get the neighborhood without regard to channel. It's a pretty gangly operation, but it does have a good structural resemblance to stuff I've been thinking about for years. Ed and Bruce's relatively abstract idea about the broadband gain control pools, well, I've built it. Not for the intended purposes, since there's not going to be any gain control here - the only suppression that will be involved is like an 'exit gate', the permission for information in the channel array to be moved out to the later stages ("consciousness", we'll call it).
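The neighborhood step, done without regard to channel, is basically a radius query over all of the filter positions pooled together; something like this, using scipy's KD-tree (the array names and the idea of a per-filter pooling radius are mine, just to make the sketch concrete):

    from scipy.spatial import cKDTree

    def cross_channel_neighborhoods(positions, radii):
        """For each filter, indices of all filters (any channel) within its pooling radius.

        positions: (n, 2) array of filter centers, with all channels stacked together
        radii: (n,) pooling radius per filter (e.g. scaled with channel size / eccentricity)
        """
        tree = cKDTree(positions)
        return [tree.query_ball_point(positions[i], r=radii[i])
                for i in range(len(positions))]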

And, I say again, it's complicated. It's definitely not going to be faster than the rectangular filter convolution; in fact, it's likely to be 3 or 4 times slower, and it's going to produce rougher looking images on top of that. All this just to incorporate stupid M-scaling into these stupid rivalry waves. I swear, I can't think of a better way to do it. And the thing still isn't going to know anything about surfaces or faces or houses or any of that stuff, and it's going to take forever to debug and proof since it's going to be so slow...

But it's going to be cool.

Monday, March 05, 2012

retrograde inversion

Several times in your life you may hear it noted that the retinal image is reversed and upside-down. Fewer times than that, hopefully, you may then hear it noted with curiosity that the brain somehow undoes this retrograde inversion. When you do hear this, please interject with the following:

"The brain does not reverse the coordinates of the retinal image. The brain does not know or care about about the retinal image's orientation relative to the world; as far as the brain is concerned, the image is not upside-down, or upside-up, or flipped or double-flipped. It is not delivered to the brain with reversed coordinates, but with no coordinates at all. The brain assigns spatial coordinates to the visual information it obtains from the eyes. It does this by integrating information about body position, gravity, and other consistent sensory cues about the state of the world. There is no reversal or correction of coordinates, there is only assignment of coordinates."

You will promptly be thanked for clearing up the misunderstanding, and hopefully your interjection will serve to end one strain of a particularly irritating bit of pernicious nonsense.

Thank you.

Wednesday, February 22, 2012

Rivalry and Diplopia

A simulation of binocular rivalry and fusion with eye movements:

First, the input:

If you can cross-fuse, you want to fuse that white rectangle (and the matched noise background). It's hard to do, especially since there will be a strong urge to fuse the face, not the background. If you succeed, the girl's face will be diplopic (seen double). The video below is a simulation of what is happening in the parts of the visual field where the face is seen.

The photo of the girl is represented at two different ('disparate') locations for the two 'eyes' (just different filter streams in the simulation), while both eyes see the same background (noise with a little white block below the photos). At locations where the two eyes get different inputs (i.e. wherever the photo is seen), the two streams suppress one another and 'binocular rivalry' is induced. This rivalry is unstable, and results in periodic fluctuations where either one or the other eye's image is seen, but not both.

On the other hand, when both eyes get the same input, there is no suppression between streams (this isn't physiologically accurate, just convenient in this simulation). This results in 'fusion' of the two eyes images.

Every second, the filter streams - the eyes - shift to new, random coordinates (they are yoked together, of course). You can see this in the shifts in position of the little black dot, which starts out near the white block.
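A very stripped-down version of the rule the simulation applies at each location (not the actual filter-based model, just the logic): fuse where the two eyes' inputs match, and let one eye win, alternating over time, where they don't.

    import numpy as np

    def render_frame(left, right, t, period=1.0):
        """Toy per-pixel rule: average where the eyes agree, alternate eyes where they differ.

        left, right: same-size grayscale float images (the two 'eyes')
        t: time in seconds; period: how often dominance flips at rivalrous locations
        """
        rivalrous = left != right
        dominant = left if int(t / period) % 2 == 0 else right
        frame = (left + right) / 2.0            # fusion wherever the inputs agree
        frame[rivalrous] = dominant[rivalrous]  # rivalry: winner-take-all, flipping over time
        return frame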

(Both of these videos look a lot better if magnified, i.e. hit that little box in the lower-right corner and look at them full-screen.)

 

To make a little clearer what's happening, here's a color-coded version:

 
Here, locations where one eye's image gets through to be 'seen' are colored red or green (depending on which eye - geometrically it only makes sense that green-is-left and red-is-right, which would mean that the photo is between the viewer and the gray background), while regions where there is fusion are colored yellow or brown. The stream marker is now a blue dot (not really; the googlevideo encoder seems to favor dumping small blue dots against red/green backgrounds, go figure)!

Look at that mess of imbalanced fusion that builds up all over the scene. What a mess!

Tuesday, January 31, 2012

Psychophysics and Consciousness

I reread this paper by David Chalmers yesterday morning for the first time in several years. I had been reminded of it because of this commentary by Kaspar Meyer in Science last week. The commentary was mildly interesting, and pointed out some of the current neuroscience perspectives as to just what consciousness is: e.g. is it the sensory experience with a background of knowledge and cognitive processes ("bottom up"), or is it a sort of best-estimate of what reality is given current and recent circumstances, using sensory input as a sort of reality check ("top down")? He finishes off in what seemed to me to be pretty fuzzy territory, but it was at least evocative of interesting ideas.

I'm vaguely familiar with some of this stuff, but I've never gotten too deep into it because it doesn't satisfy me the way the philosophers do. The neuroscientists are looking for the "neural correlates of consciousness", which I guess is all that one really can look for. What this science reveals is the structure of consciousness, i.e. what is and is not included, what the boundaries are and how they are determined by the nature of the brain, and, as indicated above, what exactly is the seeming 'core', or experiential reference point, of conscious experience, in neurobiological terms.

It is good stuff, but it always seems to me that the proposed theories far outstrip the basic science that is supposed to underpin them (e.g., in the commentary, Meyer cites experiments that demonstrate internally generated excitation of sensory cortex, and more generally recurrent activation, as evidence for the interesting idea that perceptual experience "would result from signals that descend through the sensory systems, just as behavior results from signals that descend along the motor pathways"). I don't know, that seems a bit of cart before horse, but like I said, I've only ever really skimmed the surface of this research. Meyer, Damasio, Dehaene, these guys are all basically frontal cortex cognitive neuroscientists, not perception scientists, and I've never really had cause to sink into that part of the science.

Now, the Chalmers paper. That's what I was going to go on about, not the Meyer commentary...

Anyways, in that paper Chalmers isn't really describing new ideas or new ways of thinking about consciousness (there is a subsection on some sort of "Kripkean" analysis of some philosophical point which I think actually subtracted from my comprehension of other parts of the paper, but it doesn't seem crucial). What he does is lay out a taxonomy of theories of consciousness - and the consciousness he's talking about isn't the "easy" kind, as he calls it, i.e. the NCC business that Dehaene is always going on about, but the "hard" kind, i.e. the fact-of-phenomenal-experience. I was thinking about that taxonomy yesterday evening, and wondering how psychophysics as a science fits into it, whether or not it biases one towards one or another way of thinking about phenomenal consciousness and just what it could be, or where it might come from.

As far as I know, the only visual psychophysicist who has written extensively (in English) on the philosophy of perception is Stanley Klein. I'm sure there are others, probably some I have heard of, but for now I'm guessing that if they exist they are writing in German or Italian. Klein is a proponent of the idea that phenomenal consciousness has something to do with quantum physics. Chalmers categorizes this sort of idea as dualist, since it supposes that consciousness is a quantum epiphenomenon of the activity of the physical brain. In other words, there is the brain and its physical structure, then there is a corresponding, consequent pattern or structure of quantum effects, and it is those effects that correspond to subjective, phenomenal consciousness.

I never liked this idea, at all. It usually relies on the Copenhagen interpretation of quantum mechanics to make the connection between observation and collapse of a wave function, which is the same thing that leads to that horrible Schrodinger's cat story. Not that I'm qualified to really have an opinion on this stuff, but that interpretation - that multiple possibilities exist simultaneously until selected by "observation" - is obviously nonsense, and just exists to show that something is not properly understood about the whole situation.

Okay, so I've shown myself to be a quantum mechanics ignoramus. Anyways, the QM-as-consciousness stuff is a type of dualism according to Chalmers, and I think it's quasi-mysticism, but does it have any currency among psychophysicists? I doubt it. I think Klein carries it because he was a student of Feynman who went into psychology, and he couldn't help but make the connection. He's an order of magnitude smarter than I am, maybe, but I think he's wrong.

As scientists, we might expect that psychophysicists should be materialists according to Chalmers' taxonomy. I first got interested in perception and psychophysics back when I was reading every bit of Daniel Dennett that I could find, and he is really the popular standard bearer of materialist theories of phenomenology (or was back in the 90's; this was the same time that I read Blackmore's "Meme Machine", and became completely obsessed with those ideas for a good couple of years). The idea here is that consciousness, in a way, doesn't actually exist; all that exists is the interconnected and multilayered and recurrent set of mechanisms for relating sensation to action over many timescales; in other words, "the mind is what the brain does". The fact that we have the impression of "looking out", or of being somehow spatially immersed in our thoughts and percepts, is a sort of necessary fiction that helps all those mechanisms bind together and work correctly.

I'm not sure, but I think that J.J. Gibson might have been the closest thing (in the previous academic age) to a philosophical materialist in vision science. I suppose that most vision scientists adhere to a much more nuanced form of materialism, since Gibsonian materialism, or direct realism, is not really in good repute these days. I really like the idea in general, and consider it a good null hypothesis for study of perception - i.e. the perceptual world is the physical world that we tend naively to identify it with, and not a "representation" of the physical, and a given brain is a locus of limitations on what is known or remembered or simply accessible about this world.

Cognitive and perceptual neuroscience in general usually makes claims about consciousness that are consistent with the materialist position, i.e. that consciousness is the set of processes and functions of the brain. Chalmers says this (about neuroscience) explicitly. I always feel, though, (and I think that somewhere I've seen a talk by Dehaene where he says as much) that this is a terminological confusion, and that the neuroscientists must generally know, but forget sometimes, that the hard problem of consciousness, of phenomenology, is not addressed by their studies. Again, you know, I just have superficial acquaintance with this research, and maybe it's a common complaint amongst the Dehaenists that outsiders are always complaining that they (Dehaenists) are claiming that they're studying something that they aren't, when of course, duh, the Dehaenists know the difference. Oh well.

Finally, we wrap things up by mentioning what Chalmers calls monism, which is ultimately pan-psychism or pan-subjectivism. Reality has its relational, "objective" properties, and also its intrinsic, "subjective" properties. Phenomenal consciousness is simply the intrinsic nature of a functioning brain. This is an old idea, thousands of years old maybe, but it's not scientific. It's anti-scientific, even, since it's a claim that science, being the study of the objective nature of reality, can by definition not touch phenomenal consciousness. I think this is probably the truth of things, too, and it's kind of irritating. Anyways, is this a common feeling amongst psychophysicists, that the ultimate object of their study (whether or not they admit it; behaviorist materialism is a necessary stance for formulating good scientific theories of perception) is by definition un-attainable? That might be the answer right there; there's an operational stance (materialism), and a functional stance (monism), and only one of them - the wrong one - will ever get you anywhere.

I guess I'm going to have to start questioning psychophysicists. It will require a certain amount of drunkenness, I'm sure...

Friday, August 26, 2011

hypothetical question

Okay, so let's say you run the following experiment:

You want to compare different states of adaptation. The yardstick you're going to use to compare them is a matching function. You have two stimuli, x and y, and you're going to assume that the associated matching function - your matching function model - is simple, like y = mx + b. You want to know how those function parameters, m and b, vary when the adaptation state changes.

To do the experiment, you keep one adaptation state constant in all conditions. You can do this because you have two stimuli, and you can adapt them separately. So, you have two adaptors, X and Y. You keep adaptor X the same in all conditions, but you vary adaptor Y. Since X doesn't change, you can then compare the effects of Y across conditions. Adaptor X is your baseline.

Within a subject, this design is fine. You can take your xy data from different X conditions and plot them on the same axes. You look at how the data for X1 differs from X2, for example. You fit your model to the X1 and X2 data, and find that mX1 is higher than mX2. You repeat the experiment with another subject and find the same pattern - the m values are different across subjects, but you see the same relative difference between mX1 and mX2 for every subject you test. You average the results together to show that mX1 is higher than mX2. This constitutes a result of your study.

But then...

You start to look at the individual data, at how the m values vary so much across individual subjects, but that within-subject difference is always there. You think, something is covarying between these two things, what could it be? Why is it that whatever value mX2 takes for a particular subject, mX1 is always higher?

Then you realize: Y. mX1 and mX2 might not vary at all, at least not to the extent that they appear to. Maybe it's mY that's varying.

Look at that model from the point of view of Y. Then you have x = (mY)y + (bY). Turn it around, and you get y = (1/(mY))x - (bY)/(mY). This means that mX is inversely proportional to mY, so that measured values of mX1 and mX2 will be similarly affected by differences, across individual subjects, in the value of mY.
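A quick way to convince yourself of this is to simulate it: hold the X-side slopes fixed across 'subjects', vary only mY, and the fitted mX1 and mX2 still covary. (All the numbers below are arbitrary; this is just an illustration of the algebra above.)

    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0.5, 3.0, 20)
    true_mX1, true_mX2 = 1.5, 1.0            # identical for every simulated subject

    for subject in range(5):
        mY = rng.uniform(0.5, 2.0)           # the only thing that differs across subjects
        # the measured y is the X-side relation seen through the subject's Y-side slope
        y1 = (true_mX1 / mY) * x + rng.normal(0, 0.05, x.size)
        y2 = (true_mX2 / mY) * x + rng.normal(0, 0.05, x.size)
        fit_m1 = np.polyfit(x, y1, 1)[0]
        fit_m2 = np.polyfit(x, y2, 1)[0]
        print(f"subject {subject}: mY={mY:.2f}  fitted mX1={fit_m1:.2f}  mX2={fit_m2:.2f}")
    # the fitted slopes move up and down together, driven entirely by mY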

Well, this led somewhere, anyways.

Tuesday, December 21, 2010

Publication Report 2010

My internet research has dwindled to nothing!

Meanwhile, this year's publication history:

Published manuscripts: 2
Submitted manuscripts: 0
In-preparation manuscripts: 1
Abstracts submitted: 2
Conference papers written: 2
Conference presentations: 1
Invited lectures: 1

SUBMITTED MANUSCRIPTS = 0.0, this isn't so good. I have a waiting list of whatever comes right before "in-preparation", though.

Okay, what did I learn today:
Well, I built a model of adapted image quality (blur/normal/sharp) matching yesterday, and fixed it up today. It does just what it should: it "normalizes" when adapted to one or another type of input, though for now its starting point is "blank adapted" which isn't quite right. It also displays the loss of blur/sharp gain that I found in the matching experiment (which accounts for 4 of the above objects: paper in preparation, abstract accepted, presentation and lecture given).

The model is your basic contrast transducer array, a set of Foley functions (Stromeyer-Foley, Naka-Rushton, etc.) with thresholds set by a standard function. I've built it several times before, but this is the first time I came up with a good way of implementing the adaptation part. This is the transducer function, with w in the denominator standing in for some added (only added, yes) gain control function:

The idea is that the system wants R to be kept relatively constant, at a particular level above threshold but not terribly near saturation - but C keeps changing, so how to keep R in that ideal range? Yes, we adapt, and here adaptation basically means setting the value of w. That's easy to do, just solve for w. This introduces probably the most important free parameter in the model, R, because I don't know what it should be, though I have a good idea of the range, and luckily the thing only really behaves if I put it in that range. So okay, it works!
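For concreteness, here is a minimal sketch of what I mean, assuming a generic Foley/Naka-Rushton-style form R = C^p / (C^q + z + w); the exponents and constants below are placeholders, not my fitted values. Adaptation just picks w so that the response to the prevailing input sits at the target level R.

    def response(C, w, p=2.4, q=2.0, z=0.1):
        """Generic Foley-style contrast transducer with an additive gain-control term w."""
        return C ** p / (C ** q + z + w)

    def adapt_w(C, R_target, p=2.4, q=2.0, z=0.1):
        """Solve R_target = C^p / (C^q + z + w) for w, given the prevailing contrast C."""
        return C ** p / R_target - C ** q - z

    C = 0.3                        # prevailing contrast in the adapting input
    w = adapt_w(C, R_target=0.2)   # R_target: the desired steady response level
    print(w, response(C, w))       # the adapted response comes back at R_target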

So what I learned is that the third time you build something, it might actually work. From now on I need to make sure to build everything at least three times.

Thursday, April 08, 2010

well..

oh man. i haven't learned anything today, except that there's only so far you can take a visual simulation before it breaks. so, i've been measuring thresholds for a simulated observer at different spatial frequencies, for content within photographs which has been thresholded depending on a trial-to-trial staircase. it works pretty well for the images themselves. the original image gets compared with the image containing a thresholded band, and the observer is able to converge on a measure of the threshold over several hundred trials, similar to a human observer.

what i do is this: the original image gets filtered at the frequency in question, and the filtered image (the output of the filter) is thresholded and added back into the original image minus the filtered image. so, we actually have the original image minus the subthreshold content within the filter. if the threshold is zero, these two images are identical, i.e. they are whole, unfiltered photographs. this is the experiment as i originally ran it on myself, trying to find the just-detectable threshold (the threshold-threshold). to do the experiment simulation, the thresholded image then gets filtered again, meaning that the filter picks up the thresholded content along with residual off-frequency content. this is the only reasonable way to get the test content, since 1) that off-frequency content is there in the image and would be seen by the filter, and thus can't be ignored, and 2) the thresholded band contains harmonics which wouldn't be seen by the filter.

naturally, i eventually decided to do the same experiment without the complete image; i.e., just measure threshold-thresholds for the content within the filter. i thought this would be straightforward - i just use the filtered image as the 'original', and the thresholded filtered image as the 'test'. but then, i thought, ah, almost screwed up there: the thresholded filtered image should be filtered again, just like in the original experiment. so, you can see the problem. the original content is lifted straight out of the source image, while the thresholded content gets lifted out of the source image and again out of the thresholded image, which means it will be multiplied twice by the filter. so, even if the threshold is zero, the test and original images will be different.
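here's the issue in its barest form: even with zero thresholding, filtering the band a second time changes it, so the 'test' and 'original' paths can't match. (the gaussian band-pass below is just a stand-in for whatever filter is actually in the simulation.)

    import numpy as np

    def bandpass(img, center, bw):
        """Toy frequency-domain band-pass (gaussian annulus) -- a stand-in filter."""
        f = np.fft.fft2(img)
        fy = np.fft.fftfreq(img.shape[0])[:, None]
        fx = np.fft.fftfreq(img.shape[1])[None, :]
        gain = np.exp(-0.5 * ((np.hypot(fx, fy) - center) / bw) ** 2)
        return np.real(np.fft.ifft2(f * gain))

    rng = np.random.default_rng(0)
    img = rng.normal(size=(128, 128))
    once = bandpass(img, center=0.1, bw=0.02)
    twice = bandpass(once, center=0.1, bw=0.02)   # the 'test' path filters the band again
    print(np.abs(once - twice).max())             # nonzero: the paths differ at zero threshold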

this is a problem. in fact, it must also be a problem in the original experiment. but, the test and original images in the original experiment are the same when the threshold is zero - i assume this is because the off-frequency content amounts to the difference between the filtered and double-filtered content, and adding the filtered content back into the image basically restores the lost content.

i need to think about this.