Tactile Qualia

I'm working on two long essays: one about existential meaning and one about the function of the basal ganglia in time perception, attention, task engagement, and impulsivity. Hopefully those will be complete some time mid-May. Until then, here's some crap about an indecorous quale.
 
The internet does a good job chronicling idiosyncratic human pleasures. One cluster of these I will call "visceral fit", for lack of a better term. (Edit: What a bad term!) The stereotypical elicitor is popping bubble wrap. Visceral fit is felt in response to some haptic stimulation, and other-modality cues that co-occur with that haptic stimulation. Also, I don't know quite how to say it, but the stimuli which produce the specific quale I'm pinpointing are more textural, and do not extend as clearly to muscular stimulation such as in massage or relieving one's bladder. (Edit: They're cutaneous sensations!)

One evolutionary domain giving rise to visceral fit qualia is close interpersonal contact, especially through grooming and sex. Elicitors from this domain include the sounds of head scratching, whispering, and various fluid sounds - produced not only by the movement of saliva in the mouth. Connoisseurs of abrasive and fluid sounds can be found at /r/asmr and /r/gonewildaudio, the latter of which is of course porn. Probably the prickly sound and feeling of hair clippers on one's neck falls here too.

Another domain for which the elusive textural stimulation quale pings is that of food. John Allen writes an opinion piece about the evolution of human preference for juicy, crunchy food, concluding that the preference was made adaptive by the prehistoric advent of cooking techniques, rather than finding fixation through the benefits of eating arthropods with chitinous exoskeletons. Some food consumption sounds will provoke disgust in the ordinary human audience, but certainly there are also sounds, perhaps relating to the consumption of crunchy and juicy food, that are pleasing, and which perhaps evoke tactile mental simulation. Maybe the pleasant crackling of fire falls here too.

Disgust does not fully inhibit tactile sensory pleasure, and this is nowhere more apparent than in the reddit community which delights in viewing ruptured cutaneous pathologies. This is the dark and forbidden side of grooming pleasure, and I will not provide a link.

Those are the more easily interpreted elicitors. /r/oddlysatisfying chronicles many other things which produce the quale: Things fitting tightly into receptacles or coming into synchronous alignment. Materials being cut and abraded from surfaces of objects. Changes throughout entire object volumes, such as by crushing, shattering, shredding, melting, and deformations to elastic solids, viscous fluids, and semi-solid mixtures.

There is a great diversity of things that produce tactile simulation pleasures, and the whole thing feels a little bit naughty. Someone with too much time or the right comparative advantage should write about it with a better informed and more cutting analysis than I have provided here, because brains are interesting, and shared aesthetics are kind of important, and naughty things are fun.

Edit: Q-tips in ears! Can't believe I didn't think to bring that up. When stimuli in other modalities trigger a tactile response, the cross-modal response is so very much like cotton swabs in the ear. As for other-modality stimuli that I failed to mention: Some people get gooseflesh when they listen to music and call it "frisson" (though most of the stuff on /r/frisson doesn't do it for me, so maybe there's wide variance in which music triggers which people?). I should look at what else produces gooseflesh. And let's not forget sparkles! Some people get ASMR from looking at sparkles. Like a visual-to-haptic synaesthesia. So now I'm looking into tactile hallucinations, goosebumps, and music-frisson generally (which has a much larger literature than ASMR for some reason). Did you know that there are different cutaneous receptors for different kinds of mechanical stimulation? Maybe I can distinguish elicitors of good feelings associated with each kind of cutaneous mechanoreceptor. If that categorization seems compelling, it would make for a pleasing correspondence between aesthetic concepts and aesthetic hardware. To the literature!

The English Texture Lexicon

Categories excerpted from (Bhushan, Rao, Lohse, 1997), with small modifications where I feel like it.

I. Structured?
  • Well ordered, repetitive: faceted, crystalline, lattice, regular, repetitive, periodic, rhythmic, harmonious, well-ordered, cyclical, simple, uniform, fine, smooth,
  • Random, disordered: complex, messy, random, disordered, jumbled, scrambled, discontinuous, indefinite, asymmetrical, nonuniform, irregular,
  • Semi-ordered, irregular segmentation: marbled, veined, scaly,

II. Lines and edges:
  • Linear orientation: pleated, corrugated, ribbed, grooved, ridged, furrowed, lined, striated, stratified,
  • Local linear orientation: wizened, crows-feet, rumpled, wrinkled, crinkled, cracked, fractured,
  • Global contours: flowing, whirly, swirly, winding, corkscrew, spiralled, coiled, twisted,
  • Two linear orientations: matted, fibrous, knitted, woven, meshed, net-like, cross-hatched, chequered, grid, honeycombed, waffled, zigzag,
  • Multiple orientations, non-orthogonal, indistinct edges: gauzy, cobweb, webbed, interlaced, entwined, intertwined, braided, frilly, lace-like,

III. Random placement of small convex regions:
  • Two dimensional, indistinct edges: mottled, blemished, blotchy, smeared, smudged, stained,
  • Two dimensional, sharp edges: spattered, sprinkled, freckled, speckled, flecked, spotted, polka-dot, dotted,
  • Three dimensional (raised, depressed): bubbly, bumpy, studded, porous, potholed, pitted, holey, perforated, gouged,

Cue Integration and Sensory Fusion

One paradigm in psychology for modelling perception is that of cue integration, where a few sensory cues are presented to a subject, and researchers elicit judgements of some parameter value of the subject's model of the data generating process. This method is a bit primitive, making no use of modern neural imaging technology or statistical learning theory, so we expect that the cue integration paradigm must fail to shed light on many aspects of perception and reasoning, such as the formation of structured conceptual categories and intuitive theories of, say, classical dynamics, natural language grammar, or social interactions. Still, we have learned some things from cue integration studies, and this post will engage with some of the ideas of the sub-field.

Some signals (or our distributions for them) are mutually informative: learning testable information about one signal reduces the entropy of our distribution on the other signal. We know intuitively to look for relations and correspondences between cues which are coincident (co-occurring or source co-located). When percepts are higher dimensional, they can often be summarized with spatiotemporal coordinates, and correspondences between the coordinates often take the form of alignments, which might be produced by transforming signals to a common frame of reference. For example, neuronal columns originating in our retinas follow their initial retinal organization (their eccentricity (central vs. peripheral location) and their polar angle) surprisingly far into the brain, but their data are eventually transformed into a common coordinate frame through depth estimation, which is largely done by inverting the parallax disparity of the two visual streams. More abstract percepts with coordinates, like scene maps, are also fit objects for alignment.
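To make "mutually informative" a bit more concrete, here is a minimal sketch that computes the mutual information between two discrete cues from a joint probability table. The two tables (a coupled and an independent pair of left/right cues) are made-up numbers for illustration, not data from any study.

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for a joint table given as a list of rows."""
    px = [sum(row) for row in joint]            # marginal over rows
    py = [sum(col) for col in zip(*joint)]      # marginal over columns
    pxy = [p for row in joint for p in row]     # flattened joint
    return entropy(px) + entropy(py) - entropy(pxy)

# Hypothetical joint distributions: rows are a visual cue (left/right),
# columns are an auditory cue (left/right).
coupled = [[0.4, 0.1], [0.1, 0.4]]
independent = [[0.25, 0.25], [0.25, 0.25]]

print(mutual_information(coupled))
print(mutual_information(independent))
```

When the cues are independent, learning one tells you nothing about the other and the mutual information is zero; when they tend to agree, it is positive, which is the entropy reduction described above.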

The correspondences mentioned so far (signal co-occurrence supporting an inference to a common signal source, and alignment of spatiotemporal coordinates when high dimensional percepts are gained through different perspectives of a common referent) are sort of, like, workable inferences from immediately present data. If we leverage prior information, then we might get more "semantic" data correspondences, like the inference that small people have higher voices (or that deeper heard voices probably originate in larger vocal tracts). That inference is sort of supported by an intuitive theory of acoustic resonance, and other semantic correspondences often have interpretations as relying on intuitive modelling of domain structure. For example, cultures intuitively cluster animals by a hierarchy of types, with degrees of similarities described through familial relations ("chimps are cousins to humans"), which is like an intuitive theory of phylogenetic origin of species through natural selection on individuals with mutated phenotypes in a breeding population.

Whatever the source of structure, perceptual cues are often mutually informative, and brains leverage this in estimation of world states. One simple way to combine cues is to take a weighted average of measured values (for example, estimates of a thing's position based on visual sense vs. haptic sense). If all we know about each signal is its first two moments (say, the mean and variance of our received data over past observations, which we expect to hold for future observations), then we are licensed by our information to model the data sources with normal distributions. And if we weight each signal by its reliability (the inverse of its variance, or of its covariance matrix in more than one dimension), then we're doing maximum likelihood estimation, much like the measurement update of a Kalman filter, and our end estimate will be a sharper Gaussian over the thing's position, because the reliabilities of independent measurements sum.
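The reliability-weighted average is easy to sketch for the one-dimensional case. This assumes independent Gaussian cues with known variances; the function name and the visual and haptic numbers are hypothetical, chosen only to show the arithmetic.

```python
def fuse_cues(means, variances):
    """Combine independent Gaussian cues by reliability (inverse-variance) weighting.

    Returns the fused mean and the fused variance.
    """
    reliabilities = [1.0 / v for v in variances]
    total = sum(reliabilities)
    fused_mean = sum(r * m for r, m in zip(reliabilities, means)) / total
    fused_variance = 1.0 / total  # reliabilities add, so variance shrinks
    return fused_mean, fused_variance

# Hypothetical position estimates (arbitrary units):
# vision says 10.0 with variance 1.0, touch says 14.0 with variance 4.0.
mean, variance = fuse_cues([10.0, 14.0], [1.0, 4.0])
print(mean, variance)
```

The fused estimate lands closer to the more reliable cue (vision here), and its variance is smaller than that of either cue alone.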

That kind of combination of signal evidence in estimation is the thing that's most often meant by "cue integration". Another thing you could kind of call cue integration, or better yet "sensory fusion", concerns how things get efficiently represented in the brain, and is more related to the domain structure learning (like of concepts and theories), which then support those simpler cue integration parameter estimations.

Short of Bayesian nonparametric structure learning or whatever the hell you're doing when you throw a neural net at a problem, we could just model the relation between two recurrent sensory cues in isolation. One way to examine how the brain relates two mutually informative sensory cues is to experimentally introduce systematic bias in one of the cues, and observe the learning times before subjects recalibrate their estimates. For example, psychologists going back to Helmholtz have enjoyed placing prismatic goggles on subjects and watching them stumble around. In addition to manipulating mean values of sensory cues, you could probably introduce changes in signal covariance, with interesting results (I guess it would be like gain modulation, if averaging weights are based on reliabilities).
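A toy version of the prism experiment can be simulated with a simple error-correcting rule. This is only a sketch: the delta-rule learner, the learning rate, and the shift size are assumptions made for illustration, not a model taken from the adaptation literature.

```python
import random

def simulate_recalibration(shift=10.0, learning_rate=0.2, noise_sd=1.0,
                           trials=40, seed=0):
    """Simulate reaching errors while adapting to a constant visual shift.

    Each trial, the learner observes its reaching error and moves its
    internal correction a fraction of the way toward cancelling it.
    """
    rng = random.Random(seed)
    correction = 0.0  # learned compensation for the prism shift
    errors = []
    for _ in range(trials):
        # Residual error: uncompensated part of the shift plus motor noise.
        error = shift - correction + rng.gauss(0.0, noise_sd)
        errors.append(error)
        correction += learning_rate * error
    return errors, correction

errors, correction = simulate_recalibration()
print(errors[0], errors[-1], correction)
```

Under this rule the error decays roughly exponentially, so the "learning time" is set by the learning rate; a larger rate recalibrates faster but chases noise more.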

I feel like I should have more to say confidently about sensory fusion, but I'm mostly drawing a blank for anything motivated by even a hint of mathematical understanding. How about some wild speculation?

Speculation on reliable coupling vs. efficient coding: the more reliable is the relation between two recurrent sensory cues, the more likely the brain is to produce a sensory fusion, i.e. to consider or remember only the final combined estimate. This reduction of redundant information to sparse signals is why we think intuitively of vision and taste-smell as being roughly two and a half sense modalities, rather than because of the differential presence of receptive organelles at the periphery of those senses. Also, something something short codes for things of common importance, degrees of neural representation as a determinant of estimate dominance in addition to distribution reliability.

Speculation on reliable coupling vs. learning plasticity: if the relation between two recurrent sensory cues is important but unreliable (where "importance" as an experimental construct might come from, like, introducing a loss function in a decision problem), then recalibration occurs faster than if the relation were reliable. If a relation is expected to change only slowly, then beliefs will be revised only slowly (or by small revisions).
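One way to make this speculation precise, assuming the relation between the cues is modelled as a scalar random walk observed with noise, is to look at the steady-state gain of a Kalman filter. The function name and the variances below are hypothetical; the point is only that the gain (how much each new observation revises the belief) grows with how fast the relation is expected to drift.

```python
def steady_state_gain(process_var, measurement_var, iterations=1000):
    """Steady-state Kalman gain for a scalar random walk observed with noise.

    process_var: how much the hidden relation is expected to drift per step.
    measurement_var: noise in each observation of the relation.
    """
    p = 1.0      # posterior variance of the estimate (arbitrary start)
    gain = 0.0
    for _ in range(iterations):
        p_pred = p + process_var                       # predict: drift adds uncertainty
        gain = p_pred / (p_pred + measurement_var)     # weight given to new data
        p = (1.0 - gain) * p_pred                      # update: data removes uncertainty
    return gain

# Hypothetical settings: a slowly drifting relation vs. a quickly drifting one.
print(steady_state_gain(0.01, 1.0))
print(steady_state_gain(1.0, 1.0))
```

With little expected drift the gain is small and beliefs are revised slowly; with more expected drift each observation counts for more and recalibration is faster.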

Further, if cue coupling reliably occurs in one of a few parameter regimes, then learning will be fast in changing between those regimes as representations, but slow in calibrating to new ones. For example, your eyes are basically a fixed width apart, but there's a tiny bit of variation because they can swivel in their orbits; thus wild speculation predicts that the brain will have dedicated machinery for quickly adapting its interpretation of binocular signals over a small range of monocular swivels, but that perceptions will be screwed up if eyes are swivelled a lot, or if parallax disparity between monocular signals doesn't match what you would see for eye sockets situated a fixed distance apart within your skull. Or, instead of continuous variation in the parameter regime, maybe you're just used to having psychologists put prismatic goggles on you, and so you can switch your motor coordination pretty quickly with altered context.

The end?