2026

Characterizing In-Context Learning: When Can Transformers Match Standard Learning Algorithms?

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026277, DOI: 10.65737/AIRMCS2026277

Transformers exhibit remarkable in-context learning (ICL) capabilities—the ability to learn new tasks from examples provided in the context window without weight updates. Despite extensive empirical investigation, a fundamental theoretical question remains unanswered: which function classes can be learned in-context, and which cannot? This gap in our understanding limits principled system design and creates uncertainty about when ICL will succeed or fail. We address this gap by developing a theoretical framework based on Sufficient Statistic Complexity (SSC)—the minimal information that must be extracted from context examples to enable accurate prediction. We prove that function classes with attention-computable sufficient statistics (those expressible as sums over examples) are efficiently ICL-learnable, matching the sample complexity of standard learning algorithms (Theorem 1). Conversely, we prove that function classes requiring combinatorial sufficient statistics—such as sparse parity—cannot be efficiently ICL-learned by any polynomial-size transformer (Theorem 2), establishing a fundamental computational barrier via novel connections to circuit complexity. These results yield a near-complete characterization: we prove a Master Theorem (Theorem 3) establishing necessary and sufficient conditions for ICL-learnability, and a Dichotomy Theorem (Theorem 6) showing that natural function classes are either ICL-Easy (learnable with optimal sample complexity) or ICL-Hard (requiring exponentially more resources). The boundary corresponds to whether learning is parallelizable or inherently sequential. Our framework explains empirical ICL phenomena, provides architectural guidance, and opens new research directions connecting learning theory, circuit complexity, and meta-learning.
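The "attention-computable sufficient statistic" condition of Theorem 1 can be sketched concretely (our illustration, not code from the paper) for the textbook case of linear regression, where the sufficient statistic is a sum of per-example terms and therefore exactly the kind of aggregation a single attention layer can perform:

```python
import numpy as np

# Sketch (our illustration): for linear regression y = w . x + noise, the
# minimal sufficient statistic is additive over the context -- a sum of
# per-example terms (x x^T, y x) -- so one aggregation pass over the
# in-context examples recovers it.
rng = np.random.default_rng(0)
d, n = 3, 500
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.01 * rng.normal(size=n)

# Additive sufficient statistic, accumulated example by example.
G = sum(np.outer(x, x) for x in X)        # Gram matrix, d x d
c = sum(yi * x for x, yi in zip(X, y))    # cross-correlation vector, length d

# Prediction from the statistic alone matches ordinary least squares.
w_hat = np.linalg.solve(G, c)
```

Because `(G, c)` is built purely from sums over examples, the same computation extends to any function class satisfying the theorem's additivity condition.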

2026

Quantum-Enhanced In-Context Learning for Geopotential Field Estimation: A Theoretical Framework

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026322, DOI: 10.65737/AIRMCS2026322

We establish a theoretical framework for in-context learning (ICL) of Earth's gravitational potential field using transformer architectures, with particular emphasis on the sample complexity advantages afforded by quantum gravimetry. The geopotential, expressed as a truncated spherical harmonic expansion of maximum degree N with K = (N+1)² coefficients, defines a function class for which we characterize ICL learnability. Building on the ICL characterization framework of Hawarey (2026) (the foundational paper of this series), we prove that the geopotential function class is ICL-Easy: it admits an additive sufficient statistic computable by a single attention layer, enabling transformers to match the sample complexity of optimal statistical estimators. Our main contributions are threefold. First, we prove that the minimal sufficient statistic for geopotential ICL is the pair (Gₙ, cₙ) consisting of the Gram matrix and cross-correlation vector, with dimension K² + K, and demonstrate its attention-computability. Second, we derive tight sample complexity bounds showing that ICL achieves n_ICL = Θ(Kσ²/ε · log(K/δ)) for noise variance σ², target accuracy ε, and failure probability δ, matching empirical risk minimization. Third, we quantify the quantum advantage: for quantum gravimeters operating at the Heisenberg limit with N_atoms entangled atoms, the sample complexity reduces by a factor of ρ_Q = N_atoms², yielding up to 10¹²-fold improvement for realistic sensor parameters. We also establish fundamental boundaries: while forward geopotential prediction is ICL-Easy, inverse problems such as density inversion and source localization are ICL-Hard. Source localization with J discrete masses requires identifying combinatorial structure from exponentially many candidates, which we prove satisfies the ICL-Hard structural condition H1 of Hawarey (2026) when J = ω(1), establishing that no polynomial-size transformer can solve it efficiently.
This dichotomy—ICL-Easy forward problems versus ICL-Hard inverse problems—reflects the fundamental mathematical structure of potential theory and provides guidance for applying transformer-based methods in geodesy.
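The sample-complexity and quantum-advantage claims above admit a quick back-of-envelope check. In the sketch below (ours; the degree, noise, and atom-count values are illustrative assumptions, not values from the paper), constants hidden in the Θ(·) bound are set to 1:

```python
from math import log

# Illustrative arithmetic for the abstract's bounds; all numeric values are
# assumptions chosen for demonstration.
N_max = 60                      # assumed maximum spherical-harmonic degree
K = (N_max + 1) ** 2            # number of coefficients, K = (N+1)^2
sigma2, eps, delta = 1e-4, 1e-6, 1e-3

# n_ICL = Theta(K * sigma^2 / eps * log(K / delta)), constants set to 1.
n_classical = K * sigma2 / eps * log(K / delta)

# Heisenberg-limited quantum gravimeter: variance reduction rho_Q = N_atoms^2.
N_atoms = 10 ** 6               # assumed entangled-atom count
rho_Q = N_atoms ** 2            # 10**12-fold reduction, as in the abstract
n_quantum = n_classical / rho_Q
```

With these assumed parameters the classical requirement runs into the millions of samples, while the Heisenberg-limited reduction collapses it by the quoted twelve orders of magnitude.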

2026

ICL Characterization of Geo-Foundation Models

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026360, DOI: 10.65737/AIRMCS2026360

Geo-foundation models (GeoFMs) have emerged as powerful tools for remote sensing, yet there exists no theoretical framework characterizing which downstream tasks admit efficient few-shot adaptation. We address this gap by applying the in-context learning (ICL) complexity framework to classify remote sensing tasks as either ICL-Easy or ICL-Hard. We formalize six core remote sensing function classes—spectral classification, semantic segmentation, object localization, dense change detection, sparse change localization, and spatial interpolation—with explicit sufficient statistic structure. For ICL-Easy tasks (classification, segmentation, dense change detection, spatial interpolation), we prove that additive sufficient statistics enable attention-based aggregation with optimal sample complexity n_ICL = Θ(CB²/ε), matching empirical risk minimization. For ICL-Hard tasks (object localization, sparse change localization, instance segmentation), we establish that combinatorial sufficient statistics require super-polynomial transformer size or super-constant depth, via reduction to sparse parity and Håstad's circuit lower bounds. Our main result, the Remote Sensing Dichotomy Theorem, proves that every natural remote sensing task falls into exactly one category with no intermediate cases: pixel-wise prediction tasks are ICL-Easy while instance-level localization tasks are ICL-Hard for J = ω(log log n) objects. This dichotomy explains empirical observations that GeoFMs excel at land cover classification but struggle with object detection, and yields testable predictions including threshold behavior at J* ≈ 2–4 objects for typical context sizes. We provide deployment guidelines distinguishing when few-shot ICL suffices versus when fine-tuning or hybrid approaches are required.
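The combinatorial blow-up behind the ICL-Hard classification is easy to make tangible. The sketch below (ours, with an assumed grid size) counts the C(M, J) candidate configurations a J-object localizer must distinguish among M candidate sites:

```python
from math import comb

# Illustration with assumed numbers: localizing J objects among M candidate
# sites means distinguishing C(M, J) configurations, which outgrows any fixed
# polynomial in M as J increases -- the source of the hardness results above.
M = 1024                        # assumed number of candidate locations
counts = {J: comb(M, J) for J in (1, 2, 4, 8)}
```

Already at J = 8 the count exceeds M⁴, which is why no fixed-size additive summary of the context can pin down the configuration.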

2026

ICL Characterization of Climate Foundation Models: When Can Transformers Learn Weather and Climate?

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026405, DOI: 10.65737/AIRMCS2026405

Climate foundation models (FMs) have achieved remarkable success in weather forecasting, yet exhibit puzzling performance gaps between tasks: deterministic field prediction rivals operational numerical weather prediction, while extreme event detection lags significantly behind. We provide a theoretical explanation through the lens of in-context learning (ICL). Extending the ICL characterization framework to spatiotemporal climate data, we prove the Climate Prediction Dichotomy Theorem: every natural climate task falls into exactly one of two complexity categories. Type A (ICL-Easy) tasks—including temperature, pressure, and wind field prediction—admit additive sufficient statistics enabling attention-based computation with sample complexity n_ICL = Θ(n_ERM). Type C (ICL-Hard) tasks—including extreme event detection, tipping point identification, and compound event localization—require combinatorial sufficient statistics that provably exceed the computational capacity of constant-depth polynomial-size transformers when the number of simultaneous events J exceeds a threshold J* = O(log log n) ≈ 3–5. We establish the predictability horizon constraint: weather forecasting is ICL-Easy for lead times τ < τ_L ≈ 14 days, while climate statistics remain ICL-accessible but individual trajectories are fundamentally unpredictable. Our analysis yields six testable predictions about climate FM behavior and five deployment guidelines distinguishing when ICL suffices versus when fine-tuning is required. The dichotomy provides a principled foundation for understanding why Pangu-Weather and GraphCast excel at field prediction while struggling with event detection—and guides the design of next-generation climate AI systems.
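Why a J* = O(log log n) threshold lands in the quoted 3–5 range can be seen numerically. The sketch below (ours) assumes base-2 logarithms and unit constants, neither of which the abstract specifies:

```python
from math import log2

# Assumed: base-2 logs, constant factor 1. For realistic context sizes n,
# log log n barely moves -- which is why the threshold sits near 3-5 events
# across a millionfold range of n.
thresholds = {n: log2(log2(n)) for n in (10 ** 3, 10 ** 6, 10 ** 9)}
```

Even as n grows from a thousand to a billion context tokens, the double logarithm stays inside the 3–5 band, matching the abstract's estimate.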

2026

ICL Characterization of Multi-Modal Geo-Foundation Models: When Can Vision-Language Transformers Learn Geospatial Tasks?

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026446, DOI: 10.65737/AIRMCS2026446

Multi-modal geo-foundation models combining vision and language have emerged as powerful tools for Earth observation, yet a fundamental question remains unanswered: which vision-language geospatial tasks can be learned in-context, and which cannot? Models such as GeoChat, EarthGPT, and RSGPT demonstrate impressive performance on scene description and attribute queries, but struggle with counting and multi-object localization—a pattern that lacks theoretical explanation. We address this gap by extending the in-context learning (ICL) characterization framework to vision-language geospatial tasks. We introduce the concept of language specificity s(ℓ) ∈ [0,1], measuring how uniquely a language query identifies target objects, and derive the effective object count J_eff(ℓ) = J·(1−s(ℓ)) capturing language-induced compression of the localization problem. We prove that language affects ICL complexity through three mechanisms—constraint specification, sufficient statistic compression, and task decomposition—but cannot overcome fundamental combinatorial hardness when the effective object count exceeds the threshold J* ≈ 2–4. Our main result is the Multi-Modal GeoAI Dichotomy Theorem: every natural vision-language geospatial function class falls into exactly one of three categories. Type A (unconditionally ICL-Easy) tasks—including scene captioning, existence VQA, and change description—admit additive sufficient statistics with sample complexity n_ICL = Θ(CB²/ε). Type C (unconditionally ICL-Hard) tasks—including counting VQA and universal object localization—require combinatorial statistics regardless of language specification. Type A|ℓ (conditionally Easy) tasks—including attribute VQA, referring expression comprehension, and text-guided detection—transition from Hard to Easy when language specificity satisfies s(ℓ) ≥ 1 − J*/J.
We provide a complete classification of vision-language geospatial tasks across all major categories (16 representative task types spanning scene-level, pixel-level, VQA, referring expression, and detection categories), derive seven testable predictions about model behavior (including threshold effects at J* ≈ 3–4 and specificity-accuracy correlations ρ > 0.8), and establish five prompt engineering guidelines for practitioners. The dichotomy explains observed performance patterns in existing models—strong on descriptive tasks, weak on quantitative localization—and provides principled guidance for when few-shot ICL suffices versus when fine-tuning is required. Our framework bridges vision-language AI and geospatial analysis, offering the first theoretical foundation for multi-modal GeoAI deployment.
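The two formulas driving the conditional classification transcribe directly into code. This sketch is ours; the default threshold J* = 3 is an assumed value inside the abstract's quoted 2–4 range:

```python
def effective_object_count(J: int, s: float) -> float:
    """J_eff(l) = J * (1 - s(l)): objects left after language compression."""
    return J * (1.0 - s)

def is_conditionally_easy(J: int, s: float, J_star: int = 3) -> bool:
    """Type A|l condition from the abstract: Easy when s(l) >= 1 - J*/J."""
    return s >= 1.0 - J_star / J
```

For example, a referring expression specific enough to rule out 8 of 10 candidate objects (s = 0.8) leaves J_eff = 2 effective objects, crossing below the threshold and moving the task into the conditionally Easy regime.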

2026

Chain-of-Thought ICL for GeoAI: Breaking the Hardness Barrier

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026482, DOI: 10.65737/AIRMCS2026482

Foundation models for geospatial artificial intelligence (GeoAI) exhibit a striking dichotomy: they excel at pixel-wise tasks such as classification and segmentation, yet struggle with instance-level tasks requiring identification of multiple discrete elements. This limitation, termed the ICL-Hard barrier, arises from fundamental computational constraints of constant-depth transformers, which cannot solve problems reducible to sparse parity for more than J* = O(log log n) ≈ 2–4 objects. The barrier manifests across geospatial domains: geodetic source localization, remote sensing object detection, climate extreme event identification, and multi-modal counting tasks all fail when the number of targets exceeds this threshold. This paper introduces a unified theoretical framework demonstrating that chain-of-thought (CoT) prompting provides a universal mechanism to overcome ICL-Hard barriers across all GeoAI domains. We prove that autoregressive token generation amplifies effective computational depth: generating T intermediate tokens increases effective depth from L to L + γT, enabling transformers to escape AC⁰/TC⁰ circuit limitations and solve previously intractable tasks. Our main contributions include: (1) the CoT Depth Amplification Theorem, proving that T tokens provide effective depth Θ(T); (2) tight bounds on token complexity, establishing that T = Ω(J) tokens are necessary and T = O(J log J) sufficient for J-element detection; (3) a smooth success probability scaling law P(success) = (1 − e^{−αT/J})^J; (4) the CoT Threshold Shift Theorem, showing the effective threshold increases to J*_CoT(T) ≈ J* + T; and (5) equivalence conditions under which CoT matches fine-tuned model performance without parameter updates. We validate the framework across four GeoAI domains—geodesy, remote sensing, climate science, and multi-modal applications—deriving domain-specific token multipliers (κ ∈ [2.5, 4.5]) and practical deployment guidelines.
The theory yields eight testable predictions for empirical validation. This work completes a theoretical arc: where previous work established what is hard for in-context learning in GeoAI, we now show how to overcome these barriers through principled application of chain-of-thought reasoning.
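The scaling law and threshold shift above transcribe directly into code. This is our sketch; α is an unspecified constant that we set to 1 for illustration:

```python
import math

def p_success(T: int, J: int, alpha: float = 1.0) -> float:
    """Abstract's scaling law: P(success) = (1 - exp(-alpha * T / J)) ** J."""
    return (1.0 - math.exp(-alpha * T / J)) ** J

def j_star_cot(J_star: int, T: int) -> int:
    """CoT Threshold Shift Theorem: effective threshold J*_CoT(T) ~ J* + T."""
    return J_star + T
```

The law captures the smooth transition the theorem describes: with no reasoning tokens the success probability is zero, and it rises toward one as the token budget T grows past the necessary Ω(J) bound.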

2026

In-Context Learning Characterization of Navigation and GNSS Foundation Models: A Theoretical Framework for Safety-Critical Positioning

Mosab Hawarey

AIR Journal of Mathematics & Computational Sciences, Vol. 2026, AIRMCS2026525, DOI: 10.65737/AIRMCS2026525

Foundation models for navigation and Global Navigation Satellite Systems (GNSS) exhibit a striking empirical dichotomy: transformer-based architectures achieve remarkable performance on trajectory prediction and state estimation, yet consistently struggle with fault detection, cycle slip identification, and integrity monitoring. Despite the proliferation of navigation transformers, no theoretical framework explains which navigation tasks admit efficient in-context learning (ICL) and which are fundamentally intractable. This gap has critical implications for safety-critical applications including aviation, autonomous vehicles, and maritime navigation, where theoretical guarantees are essential for certification. We address this gap by extending the ICL characterization framework to navigation and GNSS domains. Our main result is the Navigation Dichotomy Theorem: every natural navigation function class falls into exactly one of two categories. Type A (ICL-Easy) tasks—including position estimation, velocity estimation, atmospheric delay modeling, and trajectory prediction—possess additive sufficient statistics computable by attention mechanisms, achieving sample complexity matching the statistical optimum. Type C (ICL-Hard) tasks—including satellite fault detection, cycle slip detection, spoofing detection, and integrity monitoring—require combinatorial sufficient statistics over C(M,J) fault hypotheses, rendering them intractable for standard ICL when the number of faults J exceeds the threshold J* ≈ 2–3 (for navigation; J* ≈ 2–4 across GeoAI domains). Building on recent advances in chain-of-thought (CoT) reasoning, we prove that CoT prompting provides a mechanism to overcome ICL-Hard barriers in navigation. We establish that T = Ω(J) tokens are necessary and sufficient for J-fault detection, with a navigation-specific token multiplier κ_nav ≈ 3.5.
The effective hardness threshold shifts to J*_CoT(T) ≈ J* + T/κ_nav, enabling detection of up to 7 simultaneous faults with 20 reasoning tokens. We provide a complete classification of 32 navigation tasks across six categories (Table 6), derive seven testable predictions for navigation transformer behavior, and analyze implications for safety certification under aviation (DO-229D/DO-253D), automotive (ISO 26262), and maritime (IMO) standards. Our results establish that hybrid architectures—combining ICL for state estimation with explicit algorithms for integrity monitoring—are not merely practical conveniences but fundamental computational necessities. This work completes a seven-paper theoretical program characterizing ICL across all major geospatial AI domains, establishing a unified science of when and why foundation models succeed or fail on Earth observation and navigation tasks.
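Rearranging the threshold-shift formula gives a token budget per fault count. The sketch below is our rearrangement; the defaults J* = 2 and κ_nav = 3.5 are taken from the ranges quoted in the abstract:

```python
def tokens_for_fault_count(J: int, J_star: int = 2, kappa_nav: float = 3.5) -> float:
    """Minimum reasoning tokens T such that J*_CoT(T) = J* + T / kappa_nav >= J."""
    return max(0.0, kappa_nav * (J - J_star))
```

With these defaults, detecting 7 simultaneous faults needs 17.5 tokens, consistent with the abstract's claim that a 20-token budget suffices for up to 7 faults.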