Semantic Structures
formal/mathematical approaches, AI interpretability
How meaning works, and fails to work, in technical systems. This theme engages with vector pragmatics, the autoencoder’s dilemma, and proper likeness: formal approaches to understanding how technical systems represent meaning, and the tensions between local semantic coherence and global representational constraints.
Works
Introduces "indexical sufficiency" to describe the threshold at which a generated image becomes recognisable as a specific person, formalising the relationship between training data and likeness fidelity. Traces how "likeness" has functioned in legal discourse and argues that AI-generated representations challenge existing frameworks by decoupling likeness from any originating act of depiction.
DOI | PDF | Also in: platforms and power
Examines how self-disclosure of age and gender in Reddit's r/Advice shapes the replies received, using hurdle negative binomial regression and discourse analysis. The same advice request acquires different meaning depending on disclosed demographics: "30F" and "30M" elicit structurally different responses. Platform-encoded identity markers thus function as semantic context that transforms how text is interpreted.
Also in: platforms and power
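The hurdle negative binomial model named in the entry above can be illustrated with a minimal sketch of its probability mass function. This is not the study's specification: the covariates and link functions are not given here, and the parameter names p_zero, mu, and alpha, along with the NB2 parameterisation (variance = mu + alpha * mu^2), are assumptions made for illustration. The idea is that one process decides whether a count is zero at all, and a zero-truncated negative binomial governs the positive counts.

```python
import math

def nb_log_pmf(k, mu, alpha):
    """Log-pmf of a negative binomial (NB2 form, assumed here)
    with mean mu and dispersion alpha."""
    r = 1.0 / alpha          # size parameter
    p = r / (r + mu)         # success probability
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log(1.0 - p))

def hurdle_nb_pmf(k, p_zero, mu, alpha):
    """Hurdle pmf: zeros occur with probability p_zero; positive counts
    follow a zero-truncated negative binomial, scaled by 1 - p_zero."""
    if k == 0:
        return p_zero
    nb_zero = math.exp(nb_log_pmf(0, mu, alpha))
    return (1.0 - p_zero) * math.exp(nb_log_pmf(k, mu, alpha)) / (1.0 - nb_zero)

# Illustrative values only: probability of exactly three events
prob_three = hurdle_nb_pmf(3, p_zero=0.4, mu=3.0, alpha=0.5)
```

Separating the zero process from the positive counts lets a fitted model give different answers to "does this produce any count at all?" and "how large is the count, given that it is nonzero?".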
Develops the concept of "proper likeness" by analogy with Kripke's theory of proper names, arguing that a likeness functions as a rigid designator that fixes reference to a historically understood being rather than a bundle of visual features. Uses encoding and decoding from information theory and semiotics to formalise how likenesses are produced and interpreted, establishing an analytical foundation that extends beyond facial recognition to synthetic imagery generally.
DOI | PDF | Also in: platforms and power
Despite comprehensive feature spaces and optimized machine learning, life outcomes remain stubbornly unpredictable, and prediction error was strongly associated with which family was being predicted rather than which technique was used. This reveals a fundamental limit of representation: even rich encodings cannot capture the complexity that determines individual trajectories, challenging the assumption that more data yields proportionally more predictive power.
DOI | Also in: methods and tools
Integrates platform metadata ("platform signals") with text classification to demonstrate how the same words ("men's rights," "discrimination," "equality") take on radically different meanings in r/MensRights versus r/MensLib. Semantic meaning cannot be recovered from text alone, because the context encoded in platform structures is essential for disambiguation.
Link | Also in: platforms and power
The relationship between lived experience and its platform encoding reveals a fundamental semantic gap: "it's complicated" is the richest category Facebook offers for the dissolution of a partnership. The forced mapping from continuous emotional states to discrete database categories exemplifies the broader problem of how technical ontologies flatten meaning, and this problem scales from relationship status to identity writ large.
Link | Also in: platforms and power
API closure demonstrates epistemic capture at the infrastructure level. Users' social networks exist as local, context-dependent perceptions, but platforms force these into single global representations. This is an early example of the tension between local coherence and forced globalization that later appears in LLM alignment debates.
DOI | Also in: platforms and power, networks as epistemology
Machine learning-driven curation replaces explicit ordering logics (alphabetical, chronological) with opaque personalized relevance, severing the relationship between structure and meaning. Proposes data-as-graphs as an alternative paradigm that makes the relationships between items visible and navigable rather than hiding them behind black-box recommendation systems.
Link | Also in: platforms and power
Develops a formal method for separating temporal user behavior into event sequences and inter-event time distributions, creating a new feature space that captures communicative patterns invisible to conventional analysis. The decoupling approach reveals that the meaningful structure in forum participation lies not in what users say but in the temporal signature of how they engage, offering a formal insight into the semantics of participation rhythms.
DOI | Also in: methods and tools
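The decoupling described in the entry above can be sketched in a few lines. This is a minimal illustration, not the published method: the function name decouple, the (timestamp, label) input format, and the sample activity log are all hypothetical. It splits one user's activity into an ordered event sequence (what happened) and a list of inter-event gaps in seconds (when it happened), so each can be analysed on its own.

```python
from datetime import datetime

def decouple(events):
    """Split (timestamp, label) pairs into an event sequence
    and the inter-event times between consecutive events."""
    ordered = sorted(events, key=lambda e: e[0])   # chronological order
    sequence = [label for _, label in ordered]
    gaps = [
        (later[0] - earlier[0]).total_seconds()
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return sequence, gaps

# Hypothetical activity log for a single forum user
log = [
    (datetime(2020, 1, 1, 9, 0), "post"),
    (datetime(2020, 1, 1, 9, 5), "reply"),
    (datetime(2020, 1, 1, 12, 5), "reply"),
]
sequence, gaps = decouple(log)
```

The list of gaps can then be summarised as a distribution per user, independently of which events occurred.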
The nymwars reveal a deep semantic problem: identity is contextually performed and locally coherent, but real-name policies force a single canonical referent. The pseudonym solved the technical problem of context-dependent meaning by allowing different names in different contexts, and its elimination collapses a person's many context-specific names into a single encoded identity.
Link | Also in: platforms and power