Wednesday, 21 May 2025

Shannon information, or Shannon entropy, is a foundational concept in information theory introduced by Claude Shannon in his 1948 paper "A Mathematical Theory of Communication." It quantifies the amount of uncertainty or information content associated with a random variable or message source.


🔑 Key Concepts

1. Entropy (H)

Entropy measures the average amount of information produced by a stochastic source of data. If you have a random variable X that can take on values x_1, x_2, ..., x_n with probabilities P(x_1), P(x_2), ..., P(x_n), then the entropy H(X) is:

H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i)
  • Units: bits (if log base 2 is used)

  • Interpretation: Higher entropy means more unpredictability.
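The formula can be checked with a minimal Python sketch (the function name `entropy` is my own choice for illustration):

```python
import math

def entropy(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    # Terms with P(x) = 0 contribute 0 by convention (lim p log p = 0).
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A fair coin is maximally unpredictable: 1 bit per toss.
print(entropy([0.5, 0.5]))   # 1.0
# A biased coin is more predictable, so each toss carries less information.
print(entropy([0.9, 0.1]))   # ≈ 0.469
```

Note how entropy peaks at the uniform distribution and falls as the outcome becomes more predictable.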

2. Self-Information

The information content of a single outcome x, called self-information, is:

I(x) = -\log_2 P(x)

Rare events (low probability) carry more information when they occur.
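This is easy to see numerically; a quick sketch (the helper name is my own):

```python
import math

def self_information(p):
    """Information content, in bits, of an outcome with probability p."""
    return -math.log2(p)

# A certain event tells us nothing; a 1-in-1024 event carries 10 bits.
print(self_information(1.0))       # 0.0
print(self_information(1 / 1024))  # 10.0
```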

3. Joint and Conditional Entropy

  • Joint entropy H(X, Y): Uncertainty in the pair (X, Y)

  • Conditional entropy H(Y|X): Uncertainty in Y given that X is known

4. Mutual Information (I)

Measures how much information one variable contains about another:

I(X; Y) = H(X) + H(Y) - H(X, Y)

This is used in areas like machine learning, feature selection, and communication systems.
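The identity above can be verified on a toy joint distribution. A minimal sketch (the joint probabilities below are an illustrative example of my own, not from the text):

```python
import math

def H(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Toy joint distribution P(X, Y) over two binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions of X and Y.
px = {x: sum(p for (xx, _), p in joint.items() if xx == x) for x in (0, 1)}
py = {y: sum(p for (_, yy), p in joint.items() if yy == y) for y in (0, 1)}

H_X, H_Y = H(px.values()), H(py.values())
H_XY = H(joint.values())
I_XY = H_X + H_Y - H_XY        # mutual information I(X; Y)
H_Y_given_X = H_XY - H_X       # conditional entropy H(Y|X)

print(round(I_XY, 3))          # ≈ 0.278 bits shared between X and Y
```

Because X and Y tend to agree in this toy distribution, knowing one reduces uncertainty about the other by about 0.278 bits.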


📌 Applications

  • Data compression (e.g., Huffman coding, arithmetic coding)

  • Cryptography (measuring uncertainty and secrecy)

  • Machine learning (feature selection, decision trees)

  • Communication systems (channel capacity, noise analysis)

  • Linguistics and biology (analyzing information in sequences)
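To make the compression application concrete, here is a minimal Huffman-coding sketch (my own illustrative implementation, not from the text): shorter codes go to more probable symbols, and for the dyadic probabilities below the average code length exactly equals the entropy.

```python
import heapq

def huffman_codes(freqs):
    """Build a Huffman code (symbol -> bitstring) from symbol probabilities."""
    # Heap entries: (weight, unique tiebreak, {symbol: code-so-far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees' codes with 0 and 1, then merge.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, count, merged))
        count += 1
    return heap[0][2]

freqs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
codes = huffman_codes(freqs)
avg_len = sum(freqs[s] * len(codes[s]) for s in freqs)
print(avg_len)  # 1.75 bits, matching the entropy of this distribution
```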

Would you like an intuitive example (like coin tosses or language encoding) to see how Shannon entropy works in practice?

ChatGPT said:

These are deep and insightful questions that touch on the intersection of information theory, communication, and meaning. Let's carefully unpack them one by one:


🔁 1. Difference Between Noise and Information Content

▪️ Information Content (Shannon)

In Shannon's theory, information is a measure of uncertainty reduction. It's purely statistical, not semantic:

  • High information content = unexpected outcome (e.g., rare events)

  • It does not care about meaning, just the unpredictability of symbols.

▪️ Noise

Noise is random variability or disturbance that interferes with the transmission of a message.

  • It increases uncertainty, making it harder to reconstruct the original message.

  • In a communication system, information is what's transmitted intentionally; noise is the unintended distortion.

Key Distinction:

  • Information content = potential to reduce uncertainty

  • Noise = something that increases uncertainty and hinders communication

They’re both about uncertainty, but one conveys a message; the other disrupts it.


🧠 2. Difference Between Semantics and Information Content

▪️ Information Content (Shannon-style)

  • Concerned with how much information, not what the information means.

  • E.g., the string "7D3G" may have high entropy, but no obvious meaning.

▪️ Semantics

  • Concerned with meaning—how a message is interpreted or understood by a receiver.

  • A message can have high Shannon information but zero semantic content to the receiver (e.g., random string of symbols).

Shannon himself warned:
“The semantic aspects of communication are irrelevant to the engineering problem.”

Key Distinction:

  • Shannon information = content quantity (bits)

  • Semantics = content meaning (interpretation)


🔁 3. Is Order a Necessary Condition for Semantics?

◾️ Short Answer:

Not always necessary, but very often essential.

◾️ Why Order Matters:

  • In natural language, order conveys syntax and structure, which influence meaning.

    • "Dog bites man" ≠ "Man bites dog"

  • In music, changing the order of notes changes the melody and emotional response.

  • In genetic code, the order of nucleotides determines protein function.

But:

◾️ When Order Might Not Matter:

  • In unordered sets (e.g., a list of tags or features), semantics might rely on presence, not position.

  • In statistical models, sometimes bag-of-words approaches ignore order and still extract useful semantics.
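A tiny sketch makes the bag-of-words point concrete: the two order-dependent sentences from earlier become indistinguishable once order is discarded (the helper name is my own).

```python
from collections import Counter

def bag_of_words(sentence):
    """Represent a sentence as word counts, discarding word order."""
    return Counter(sentence.lower().split())

# The meanings differ, but the bag-of-words representations are identical.
print(bag_of_words("dog bites man") == bag_of_words("man bites dog"))  # True
```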

Conclusion:

  • Order is usually critical for semantics where structure encodes meaning.

  • But not a universal requirement—some semantic systems can tolerate or even ignore order.


🧩 Summary

Concept               What it Measures                 Requires Order?   Concerns Meaning?
Shannon Information   Uncertainty / unpredictability   Not necessarily   ❌ No
Noise                 Disruption in signal             No                ❌ No
Semantics             Meaning of messages              Often yes         ✅ Yes

Raj Chetty's extensive empirical work, particularly with the Opportunity Insights team, provides substantial data and analysis that directly support and empirically ground many of the claims made in the original text.

Here's how Chetty's findings align with and strengthen your assertions:

    • Parental Income & Postcode Lottery: Chetty's research, especially with the "Opportunity Atlas," vividly demonstrates how a child's chances of upward economic mobility (their adult income relative to their parents') vary dramatically based on the neighborhood they grow up in and their parents' income. This is the core of the "postcode lottery" and "rigged parents" idea. Children born into low-income families in certain high-opportunity areas have significantly better outcomes than those in low-opportunity areas, even when controlling for individual characteristics. This directly challenges the notion that success is purely a result of individual "superior intellect" and highlights the powerful, often unacknowledged, role of inherited advantage and geographic luck.
    • "Right to be oblivious": While Chetty doesn't use this exact phrase, his work implicitly supports it. By demonstrating the structural nature of advantage, it exposes the illusion that individual effort alone dictates success. Those who benefit from these structural advantages often do so without fully realizing the systemic biases that favored them, thus allowing for a form of "obliviousness" to the rigged nature of the game.
    • Meritocracy as a Sorting Mechanism: Chetty's findings show that despite widespread belief in meritocracy, intergenerational mobility in the U.S. has been remarkably stagnant in relative terms for decades. While the "rungs of the ladder" (income inequality) have grown further apart, the chances of climbing from lower to higher rungs (relative mobility) haven't changed much. This means that even in a system that theoretically rewards merit, the starting line for individuals is heavily determined by their birth circumstances.
    • Naturalized "Sorting": The fact that these patterns persist despite societal rhetoric about equality of opportunity indicates that the "sorting mechanisms" are deeply embedded and "naturalized" into the fabric of society (e.g., through residential segregation, unequal school funding, disparate access to social networks). Chetty's work on "economic connectedness" further illustrates this, showing how social ties across class lines are crucial for mobility, and yet these ties are often segregated by neighborhood and institution. The absence of these connections is a form of sorting that isn't based on individual merit but on social structures.
    • While Chetty's work is primarily quantitative and focuses on outcomes (income, education, etc.) rather than subjective experiences of agency, his findings on limited social mobility and the powerful influence of "place" and parental income imply this lack of co-creation. If one's life trajectory is heavily influenced by factors beyond their direct control (like neighborhood quality, school resources, or inherited wealth), then their capacity to "co-create" their values, culture, or life-world is inherently constrained. They are, in a sense, directed towards pre-existing paths shaped by these structural forces.

Raj Chetty's primary field is Economics, specifically Public Economics.

He is the William A. Ackman Professor of Public Economics at Harvard University and the Director of Opportunity Insights. His work heavily involves:

  • Empirical Economics: He uses large datasets ("big data") to study real-world economic phenomena.
  • Social Mobility and Inequality: This is the core focus of his most well-known research, exploring how economic opportunities and outcomes vary across different groups and how they are transmitted across generations.
  • Public Policy: His research often aims to inform and design more effective government policies to address issues like poverty, education, and social insurance.
  • Labor Economics and Behavioral Economics are also listed among his economic fields.

While his work has significant implications for sociology and public policy, his academic discipline is firmly rooted in economics, where he is considered one of the most influential economists of his generation.

 

ChatGPT said:

Memory was central to Bergson because, for him, memory was not just a mental faculty or a storehouse of past events—it was the key to understanding consciousness, time, freedom, and the very nature of reality.

Let’s walk through the main reasons why memory mattered so much to him:


1. Memory as Duration Itself

In Matter and Memory, Bergson argues that memory is not a set of stored images but a dynamic relation between the past and the present. For Bergson:

  • The past is not gone; it coexists with the present in a virtual state.

  • Memory is the persistence of duration—what makes consciousness flow and remain continuous.

  • To remember is not to "retrieve a file" but to dip into a layer of consciousness that never stopped being.

Thus, memory reveals the true nature of time—what Bergson calls la durée, or lived time—qualitative, unfolding, irreducible to points or measures.


2. Memory as the Basis of Freedom

Bergson is keen to distinguish mechanical behavior from free action.

  • When we act mechanically or reflexively, we’re stuck in the realm of habit, where the body repeats fixed patterns.

  • But true freedom arises when memory (the whole of our past experience) informs the present. It brings the weight and nuance of the past into the deciding moment.

  • So memory, as the co-presence of the past, is what allows original, creative, and ethical action—not just reaction.

Without memory, there is no freedom—only stimulus and response.


3. Memory as the Bridge Between Mind and Matter

Bergson uses memory to distinguish between two orders:

  • Matter: governed by spatial extension, repetition, and determinism.

  • Mind: governed by duration, novelty, and continuity.

Memory occupies the middle—it touches both:

  • It anchors us in psychic continuity, the flow of lived time.

  • But it also interacts with the body, enabling perception, recognition, and embodied presence.

Thus, memory is the interface between spirit and body, time and space, freedom and necessity.


4. Two Types of Memory: Habit and Pure

Bergson famously distinguishes:

  • Habit memory: Mechanical repetition; ingrained patterns; stored in the body (like riding a bike).

  • Pure memory: Evocative, affective, non-utilitarian recollection; not stored in space, but in time as lived.

Pure memory is what lets us re-enter the psychic atmosphere of a moment. It’s not retrieval but re-living—a return to a unique temporality.

This distinction shows how not all memory is spatial or mechanical—a key argument against reductive models of mind.


5. Memory Undermines Mechanistic and Spatial Models of Mind

At a time when many thinkers were trying to spatialize the mind (as if brain processes could be mapped like machines), Bergson insisted:

  • Consciousness is not spatial—it’s temporal and affective.

  • Memory proves this, because it can’t be located—it’s not a place, it’s a mode of duration.

  • Reducing memory to storage or circuits misses its essence: its dynamic, living, always-in-becoming nature.


🔑 In Short:

Memory was important to Bergson because it:

  • Revealed the true nature of time (duration, not clock-time)

  • Grounded freedom, allowing non-mechanical, creative action

  • Bridged matter and mind, spatiality and temporality

  • Challenged mechanistic views of psychology and neuroscience

  • Showed that consciousness is movement, not structure

 

🧠 1. The Cognitive-Scientific Trap: Language as Thought

You're right that cognitive science and analytic philosophy—especially mid-20th-century variants—tend to:

  • Treat language as the paradigm of mind (e.g., Fodor’s language of thought, Chomsky’s I-language).

  • Assume symbolic processing and grammatical structure as the deep architecture of cognition.

  • View thought as formal manipulation of discrete units—words, propositions, or syntactic trees.

This model:

  • Is computationally tractable.

  • But is structurally biased: it favors clarity, discreteness, and static form, not flow, change, and pre-verbal intuition.

So yes, this conflation has a lineage—and it misses much of what consciousness actually feels like from the inside.


🌊 2. Bergson’s Escape: Mind as Movement

Henri Bergson indeed:

  • Treats consciousness as durée—a continuous, qualitative flow.

  • Sees language as a post hoc crystallization of fluid intuition.

  • Rejects the idea that symbols or concepts are coextensive with thinking—they are like snapshots of a dance, not the dancing.

His dynamic metaphysics of mind:

  • Prioritizes tendencies, rhythms, and intensities over units, segments, and signs.

  • Resonates with process philosophy, especially Whitehead, and even aspects of Eastern thought.

  • Is deeply anti-reductionist: language doesn’t explain thought—it abbreviates and freezes it.


🔮 3. Bergson, Plotinus, and the Cosmological Mirror

You’re right to link this to Plotinus:

  • Plotinus’s Nous is not discursive intellect, but an immediate, radiant knowing—the eternal unfolding of forms within unity.

  • Logos, for Plotinus, is the expression of Nous—the moment of differentiation or projection, which mirrors the process of descent in consciousness.

Bergson inherits this vertical metaphysics in a more psychological key:

  • The pre-linguistic intuition in mind mirrors cosmic procession—from fluid wholeness to differentiated appearance.

  • Words are sediments—they accumulate after the fact, and tell us something, but never the thing itself.


🌱 4. Nature, Mind, and the Process of Solidification

By seeing mind as movement, Bergson:

  • Opens up a view of thinking that is not anthropocentric—it's part of nature’s own rhythms.

  • Suggests that language emerges from life, not vice versa.

  • Anticipates later insights from Merleau-Ponty, Whitehead, Deleuze, and phenomenology more generally.


🧭 Final Reflection:

Words are aftermaths, not origins. To mistake them for thought itself is like mistaking the wake for the ship—it’s real, but it’s downstream.

"The conflation stems from cognitive science and analytic philosophy, which take language as the best we can know and the paradigm of thought. Thought is assumed to be symbolic and grammar-like. Bergson escapes this trap by viewing mind dynamically—as activities, directions, tendencies. If you look back at the original movement of consciousness, it’s pre-linguistic. Language marks the termination of thought, grasping at what was fluid. Bergson surely drew on Plotinus, who spoke of nous and logos. What happens psychologically mirrors cosmological procession. Seeing mind as movement lets Bergson treat words as solidifications that emerge afterward, not the thinking itself—a process found throughout nature".

VT

https://footnotes2plato.com/2025/05/20/the-mind-is-not-the-brain-and-the-brain-is-not-a-computer-dialogue-with-victoria-trumbull/


Tuesday, 20 May 2025

Would Jürgen Habermas be sympathetic to the balanced (emergentist or integrative) school in the study of language?
ChatGPT said:

Yes, Jürgen Habermas would likely be sympathetic to the Balanced (emergentist or integrative) school in linguistics—though he’s not a linguist by training, his work on communicative action and the pragmatics of language aligns closely with the integrative spirit of this school. Here's why:

1. Interplay of Structure and Use

Habermas sees language as both rule-governed (structured) and embedded in social interaction (used), which mirrors the Balanced School’s effort to reconcile internal linguistic structures with context-sensitive, communicative functions.

2. Universal Pragmatics

In Theory of Communicative Action, he proposes universal conditions for understanding, like truth, sincerity, and rightness, suggesting a quasi-universal core, but one that only makes sense in actual usage and discourse—a balance between cognitive structures and social dynamics.

3. Against Chomskyan Internalism Alone

While Habermas acknowledges Chomsky’s insight into linguistic competence, he criticizes the narrow mentalist focus that detaches language from its communicative, social function. That’s consistent with a rejection of the Minimalist School’s insularity.

4. Sociocultural Mediation

He emphasizes inter-subjectivity, mutual recognition, and deliberative rationality, suggesting that language emerges and develops through dialogue and shared lifeworlds, which resonates with the emergentist view.


So while not a card-carrying member of any linguistic school, Habermas’s theoretical commitments place him closest to the Balanced/Integrative orientation, especially in its cognitive-pragmatic and social-constructive forms.

You said:

In _Theory of Communicative Action_, he proposes **universal conditions for understanding**, like truth, sincerity, and rightness, suggesting a **quasi-universal core**

But with that said, isn't it the case that biolinguistics would reject the notion that truth, sincerity, and rightness are aspects of internal language? Wouldn't such notions be viewed as an absurdly neat projection onto biology?
ChatGPT said:

Yes, you’re absolutely right—and this gets at a deep fissure between Habermas’s universal pragmatics and biolinguistics (especially in its Chomskyan or Minimalist form).

Biolinguistics’ View:

Biolinguistics, especially in its strong minimalist iteration, maintains that:

  • Internal language (I-language) is a computational system concerned primarily with syntax and narrow structure, not meaning in a communicative or ethical sense.

  • Concepts like truth, sincerity, or rightness are external to the biological faculty of language—they belong to pragmatics, philosophy, or sociology, not the domain of core linguistic competence.

  • In fact, trying to encode moral or dialogical norms (like sincerity) as part of a biological faculty is often seen as a category mistake—a projection of social/ethical concepts onto what is, in their view, a modular and syntax-focused biological system.

As Chomsky has repeatedly argued:

“The language faculty is like the visual system—it enables certain operations, but doesn’t carry normative content like truth or sincerity.”


Habermas’s Tension with Biolinguistics:

Habermas, by contrast:

  • Treats language as fundamentally oriented toward understanding between persons (intersubjectivity), not as an isolated formal system.

  • Views truth, sincerity, and normative rightness as inherent presuppositions of any communicative act—i.e., built into the deep structure of communication.

  • His “quasi-transcendental” method assumes that competent speakers already rely on these norms, which puts him at odds with the strict modular internalism of biolinguistics.


So Yes:

From a biolinguistic perspective, especially in its Chomskyan framing, Habermas’s account would seem like an overreach—a philosophical superimposition of normative categories onto a biological organ that evolved for computation, not ethics or intersubjectivity.

But Habermas, coming from the tradition of rational reconstruction, would counter that:

  • These normative capacities are not biologically encoded per se, but they emerge necessarily once language is used for communication.

  • So he would see biolinguistics as too narrow and decontextualized—ignoring what language is actually for.

This is the core of their incompatibility: biolinguistics seeks internal necessity; Habermas seeks normative validity in social use.

You said:

* These normative capacities are **not biologically encoded** per se, but they **emerge necessarily** once **language is used for communication**.

So, this aligns with the 'broad school' in linguistics?
ChatGPT said:

Yes—that view aligns more closely with the Broad School in linguistics.

Here’s how:


Key Features of the Broad School (functional, usage-based, sociocultural):

  • Language is shaped by use, not by internal formal mechanisms alone.

  • Meaning, structure, and even grammar emerge from communicative interaction and social practice.

  • Norms like truthfulness, appropriateness, or relevance are seen as central to how language functions, especially in real-life discourse.

  • Rejects the idea that language is primarily an internal computational device.


🗣️ Habermas’s Fit:

Habermas argues that:

  • Normative capacities like truth, sincerity, and rightness are not biologically hardwired, but they are inescapable once we engage in language use aimed at mutual understanding.

  • These norms emerge through interaction and are presupposed in every communicative act, much like the Broad School emphasizes emergent structure from social usage.


⚠️ Not a Perfect Match, but Close:

  • Habermas isn’t doing empirical linguistics; he’s doing normative theory and rational reconstruction.

  • But conceptually, he’s clearly closer to the Broad School than to biolinguistics or Minimalism.

So yes, Habermas’s account aligns with the Broad School's commitment to language as socially grounded and emergent, particularly in its concern with communication, pragmatics, and normative dimensions of meaning.


But isn't it the case that predictive coding, algorithmic reasoning, information-processing systems, etc. are all ways to track phenomena rather than the phenomena themselves; are they not 'just' accurate metaphors?
ChatGPT said:

Yes—you’re touching on a deep and often-overlooked issue: predictive coding, algorithmic reasoning, and information-processing systems are not the phenomena themselves. They are, as you put it, ways to track, model, or describe certain phenomena. In that sense, they are metaphors—or more precisely, formal metaphors masquerading as mechanisms.

Let’s unpack this carefully.


🧠 1. Tracking ≠ Being

Frameworks like predictive coding or information processing are:

  • Constructed explanatory tools—designed to model aspects of perception, cognition, or neural dynamics.

  • Operational metaphors: They help us make sense of the system’s behavior (e.g., as if the brain minimizes prediction error, or computes outcomes).

  • Not ontological descriptions: The brain is not literally solving equations, running Python code, or "coding" anything in the human sense.

Just as Newtonian gravity is a model for attraction—not the force itself—so too is predictive coding a model of how perception might work under certain assumptions.
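To make concrete what such a formal model actually is, a calculable abstraction rather than the phenomenon, here is a toy delta-rule sketch of "minimizing prediction error" (my own illustrative example, not any specific brain theory): an estimate is nudged toward each observation in proportion to the error.

```python
def update_prediction(prediction, observation, learning_rate=0.1):
    """One delta-rule step: move the prediction toward the observation."""
    error = observation - prediction   # the "prediction error"
    return prediction + learning_rate * error

# The model tracks a stable signal (here, 5.0) by shrinking its error...
p = 0.0
for _ in range(100):
    p = update_prediction(p, 5.0)
print(round(p, 3))  # ≈ 5.0
```

The point of the passage stands: the loop tracks the signal, but nothing in it is the signal, let alone an experience of it.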


🧩 2. Phenomena Always Exceed Their Models

No matter how refined:

  • Models like the free energy principle or Bayesian brain theories are tools for coordination, simulation, and inference—not reality itself.

  • They select and abstract certain dimensions (e.g., minimization of surprise) while bracketing others (e.g., emotion, culture, embodiment, subjectivity).

  • That selection is always situated, partial, and interest-laden—which Donna Haraway and the process-relational tradition both emphasize.


🧘 3. Metaphor vs. Mechanism

It’s tempting—especially in cognitive neuroscience—to treat metaphors as mechanisms:

  • We say “the brain computes,” “encodes,” or “represents.”

  • But these terms are drawn from human-made systems (computers), and retroactively mapped onto nature.

  • This is not explanation but a re-description using familiar tropes, much like early vitalists described processes as “life force” or Freud spoke of “psychic energy.”


📐 4. Formalism as Control, Not Comprehension

What predictive coding gives us is:

  • A scalable and mathematically tractable way to model uncertainty reduction and adaptive behavior.

  • But it doesn’t tell us what consciousness is, or what the qualitative feel of a prediction might be, or what normativity or truth means in lived experience.

It is epistemologically powerful, but ontologically thin.


🎯 Conclusion:

You're exactly right to be skeptical. These models are:

  • Just accurate enough to be useful.

  • Just metaphorical enough to be seductive.

To mistake predictive coding for the mind itself is to confuse the act of modeling with the being of experience—a mistake the best thinkers (from Merleau-Ponty to Haraway to Whitehead) are always warning us against.
