TTAS · Topological Data Analysis of Housing Markets

Tulsa Topological Affordability Spacetime

A shape-first lens on housing affordability. Instead of reducing the Tulsa market to average prices and median rents, TTAS maps 9,600 property-month observations across 10 neighborhoods into a 12-dimensional feature space and studies the holes, clusters, and boundaries that emerge as affordability conditions change.

Data mode: Real public data. FRED mortgage rates, Census ACS ZIP metrics, and Realtor.com Tulsa metro listings calibrate the manifold.
property-month observations: 9,600
ZIP codes modeled: 10
time span: 2018–2025
current regime: Stable
feature dimensions: 12
Bayesian change points: 8

What This Is

A computational topology engine applied to the Tulsa housing market. It does not forecast prices. It describes the shape of affordability.

The Manifold

The manifold holds 9,600 data points (10 properties per ZIP per month, across 10 Tulsa neighborhoods and 96 months), each described by 12 coordinates: listing price, rent, inventory velocity, property tax rate, school quality, street centrality, amenity density, crime index, flood risk, walk/transit score, economic mobility, and household DTI ceiling. The result is a time-indexed point cloud in R^12.
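The observation count follows directly from the indexing scheme. A minimal sketch, assuming hypothetical feature identifiers and a plain (ZIP, property, month) index rather than the pipeline's actual data structures:

```python
from itertools import product

# The 12 coordinates named in the text; these identifiers are hypothetical.
FEATURES = [
    "listing_price", "rent", "inventory_velocity", "property_tax_rate",
    "school_quality", "street_centrality", "amenity_density", "crime_index",
    "flood_risk", "walk_transit_score", "economic_mobility", "dti_ceiling",
]

N_ZIPS = 10         # modeled ZIP codes
PROPS_PER_ZIP = 10  # properties per ZIP per month
MONTHS = [(year, month) for year in range(2018, 2026) for month in range(1, 13)]

# One index entry per property-month; each entry would carry a 12-vector.
index = list(product(range(N_ZIPS), range(PROPS_PER_ZIP), MONTHS))

n_observations = len(index)
n_dimensions = len(FEATURES)
print(n_observations, n_dimensions)
```

Each index entry corresponds to one point of the cloud, so the array of coordinates would have shape (9600, 12).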

The Filtration

Three parameters define the lens: affordability (λ1 = 1 − ownership cost / max affordable payment), spatial density (λ2 = inverse neighbor distance in R^12), and opportunity (λ3 = composite of schools, mobility, amenities, safety, and flood resilience). As each threshold rises, simplices enter the complex (vertices, then edges, then triangles), and we track what topology is born and when it dies.
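The three filtration parameters can be sketched per property as follows. Only the formula for the affordability parameter is given in the text; the k-nearest-neighbor mean used for density and the equal weighting of the opportunity composite are assumptions made for illustration, and all function names are hypothetical.

```python
import math

def affordability(ownership_cost, max_affordable_payment):
    """lambda_1 = 1 - ownership cost / max affordable payment (as in the text)."""
    return 1.0 - ownership_cost / max_affordable_payment

def spatial_density(point, cloud, k=3):
    """lambda_2: inverse of the mean distance to the k nearest neighbors.
    Using the k-NN mean (and k=3) is an assumption, not the pipeline's rule."""
    dists = sorted(math.dist(point, q) for q in cloud if q is not point)
    nearest = dists[:k]
    return 1.0 / (sum(nearest) / len(nearest))

def opportunity(schools, mobility, amenities, safety, flood_resilience):
    """lambda_3: equal-weight mean of five scores scaled to [0, 1] (assumed)."""
    return (schools + mobility + amenities + safety + flood_resilience) / 5.0

# Toy 2D cloud standing in for the 12-dimensional feature space.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
lam1 = affordability(1500.0, 2000.0)
lam2 = spatial_density(cloud[0], cloud, k=2)
lam3 = opportunity(0.8, 0.6, 0.4, 0.2, 1.0)
print(lam1, lam2, lam3)
```

A property whose ownership cost equals the household's maximum payment gets an affordability of 0, and anything costlier goes negative, which is why the parameter works as a threshold.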

The Decision

A household biography (income, DTI, family size) restricts the manifold to affordable homes. TTAS computes the persistence landscape of the restricted sublevel set, subtracts the full-market landscape, and integrates the difference: S(B) = ∫ (Λ_sub(t) − Λ_full(t)) dt. A positive signal means the affordable set retains topological structure not explained by the broader market, which points to genuine opportunity rather than noise.
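The integral can be illustrated with a small sketch. It assumes that only the first persistence landscape is used and that the integral is approximated with the trapezoid rule; neither choice is confirmed by the text, and the two diagrams below are toy inputs rather than TTAS output.

```python
def landscape(diagram, t, k=1):
    """k-th persistence landscape at t for a list of (birth, death) pairs.
    Each pair contributes a tent function max(0, min(t - b, d - t))."""
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in diagram), reverse=True)
    return tents[k - 1] if len(tents) >= k else 0.0

def topological_signal(diagram_sub, diagram_full, t_min, t_max, steps=1000):
    """Trapezoid-rule estimate of S(B) = integral of (L_sub(t) - L_full(t)) dt."""
    h = (t_max - t_min) / steps
    total = 0.0
    for i in range(steps + 1):
        t = t_min + i * h
        diff = landscape(diagram_sub, t) - landscape(diagram_full, t)
        weight = 0.5 if i in (0, steps) else 1.0  # half weight at the endpoints
        total += weight * diff * h
    return total

# Toy diagrams: one loop in each set, longer-lived in the affordable subset.
sub = [(0.0, 2.0)]
full = [(0.0, 1.0)]
s_b = topological_signal(sub, full, 0.0, 2.0)
print(round(s_b, 6))
```

In this toy case the restricted set has a tent of area 1.0 and the full market a tent of area 0.25, so the signal is positive (0.75). Swapping the two diagrams flips the sign.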

What the Topology Reveals About Tulsa

The pipeline does not output a single number. It outputs a shape — and that shape changes meaningfully across the four market regimes embedded in the data: Stable (pre-2020), Overheated (2020–2022), Rate Shock (2022–2023), and Opportunity (2023–present). Here is what the invariants tell us.

H0 Entropy: 0.979

Near-maximum connected-component entropy means the market is highly fragmented at the property level — neighborhoods do not collapse into a few homogeneous clusters. This is healthy: it means genuine differentiation across price points and locations rather than a monolithic "Tulsa market."
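For reference, persistent entropy is the Shannon entropy of the normalized bar lifetimes in a persistence diagram. The sketch below assumes the reported values are divided by log(n) so that they fall in [0, 1]; that normalization is an assumption, and the diagrams are toy inputs.

```python
import math

def persistent_entropy(diagram, normalize=True):
    """Shannon entropy of lifetime shares p_i = l_i / sum(l) over finite bars."""
    lifetimes = [d - b for b, d in diagram if math.isfinite(d) and d > b]
    if not lifetimes:
        return 0.0
    total = sum(lifetimes)
    probs = [life / total for life in lifetimes]
    entropy = -sum(p * math.log(p) for p in probs)
    if normalize and len(probs) > 1:
        entropy /= math.log(len(probs))  # 1.0 means all bars are equally long
    return entropy

# Equal lifetimes give the maximum; one dominant bar gives a low value.
even = persistent_entropy([(0.0, 1.0)] * 4)
skewed = persistent_entropy([(0.0, 100.0), (0.0, 1.0), (0.0, 1.0)])
print(even, skewed)
```

A value near 1 therefore says no small group of components dominates the diagram, which matches the "highly fragmented" reading above.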

H1 Entropy: 0.926

High loop entropy — increased from 0.866 (synthetic) to 0.926 with real data — indicates stronger persistent circular structure. Real Realtor.com inventory and DOM data reveal deeper "donut holes" in the market: ZIP codes or price bands where ownership costs outrun local incomes, creating regions that are geometrically interior but economically excluded. The real data makes these holes more visible.

Peak Euler Curvature: 58.2

The Euler characteristic surface has a sharp critical peak nearly 3× the detection threshold, concentrated at high affordability and moderate density. This is the mathematical fingerprint of the 2022 rate shock: a sudden, coordinated collapse in the number of viable property triples as mortgage rates doubled.

Bayesian Change Points: 8

Eight statistically significant structural breaks detected in the persistence vineyard — one more than the synthetic-only run, revealed by the real Realtor.com inventory data. These lag Fed announcements by 2–4 months, consistent with the time it takes for mortgage rate changes to propagate through closing pipelines and affect the topology of what households can actually afford.

The Tulsa Affordability Map

A snapshot of the ten modeled neighborhoods at the latest time slice (December 2025). Median prices range from $255K (Kendall Whittier) to $469K (Southern Hills). Median rents range from $1,271 (Kendall Whittier) to $2,237 (Midtown).

ZIP | Neighborhood | Median Price | Median Rent (monthly) | Median Income | Price/Rent Ratio
74137 | Southern Hills | $468,931 | $2,210 | $141,380 | 17.7
74114 | Midtown | $440,089 | $2,237 | $138,975 | 16.4
74105 | Brookside | $379,525 | $1,767 | $109,667 | 17.9
74132 | Tulsa Hills | $345,619 | $1,677 | $94,833 | 17.2
74133 | Union / South Tulsa | $306,051 | $1,526 | $98,455 | 16.7
74119 | Riverview | $301,649 | $1,534 | $93,818 | 16.4
74120 | Pearl District | $280,254 | $1,378 | $82,711 | 16.9
74103 | Downtown Tulsa | $274,773 | $1,398 | $87,762 | 16.4
74135 | Patrick Henry | $270,786 | $1,334 | $90,149 | 16.9
74104 | Kendall Whittier | $255,141 | $1,271 | $74,186 | 16.7

Price-to-rent ratios cluster tightly between 16.4 and 17.9 — typical for mid-sized Midwestern markets where neither extreme appreciation nor depressed rents have distorted the relationship. A ratio above 20 would signal speculation; below 12 would signal distress. Tulsa sits in a structurally balanced band.
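The ratios in the table are consistent with median price divided by twelve months of median rent. A short sketch that recomputes them and applies the 20 and 12 thresholds quoted above (the function and label names are hypothetical):

```python
# (median price, median monthly rent) per ZIP, copied from the table above.
ZIP_MEDIANS = {
    "74137": (468_931, 2_210), "74114": (440_089, 2_237),
    "74105": (379_525, 1_767), "74132": (345_619, 1_677),
    "74133": (306_051, 1_526), "74119": (301_649, 1_534),
    "74120": (280_254, 1_378), "74103": (274_773, 1_398),
    "74135": (270_786, 1_334), "74104": (255_141, 1_271),
}

def price_to_rent(price, monthly_rent):
    """Median price divided by annualized median rent."""
    return price / (12 * monthly_rent)

def band(ratio):
    """Thresholds from the text: above 20 speculation, below 12 distress."""
    if ratio > 20:
        return "speculation"
    if ratio < 12:
        return "distress"
    return "balanced"

ratios = {z: round(price_to_rent(p, r), 1) for z, (p, r) in ZIP_MEDIANS.items()}
bands = {z: band(x) for z, x in ratios.items()}
print(min(ratios.values()), max(ratios.values()))
```

All ten ZIPs land in the balanced band, with the extremes at 16.4 and 17.9.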

Interactive Figures

Fourteen Plotly artifacts generated by the local pipeline. Each figure is interactive — rotate, zoom, hover for values.

Spacetime Manifold

A 3D UMAP embedding of all 9,600 property-month observations, colored by persistent entropy. Each point is a specific property in a specific month. Points that cluster together share similar affordability profiles across all 12 dimensions. The color gradient — from teal (ordered, predictable topology) to gold (chaotic, high-entropy topology) — reveals when and where the market structure became disordered. The gold swaths correspond to the pandemic overheating phase; the teal regions are the pre-2020 baseline and the current post-correction environment.

Multiparameter Persistence Lab

Signed measure mass from the tri-parameter filtration across affordability (λ1), spatial density (λ2), and opportunity (λ3). Each panel is a 2D slice at a fixed third-parameter value. Dark regions carry negative signed measure — topological features that die before their expected lifetime, i.e. fragile, transient structure. Bright regions carry positive mass — robust, persistent features. The cleanest signal lives at moderate affordability and high density: the middle-market persistence plateau where Tulsa's housing topology is most resilient.

Topological Heart of Tulsa

The Euler characteristic surface χ(λ1, λ2, λ3) = |V| − |E| + |T| — counting vertices, edges, and triangles in the filtered simplicial complex. This is the single most informative summary of market topology. Peaks (positive χ) mean many more vertices than edges — the market is highly disconnected, with many isolated affordable pockets. Valleys (negative χ) mean edges and triangles dominate — the market is densely interconnected, with properties linked by similar affordability profiles. The sharp ridge at high λ1 (strong affordability) and moderate λ3 (opportunity) is the critical contour: it is where policy interventions — rate changes, down-payment assistance, zoning shifts — produce the largest topological response.
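To make the formula concrete, the sketch below evaluates χ = |V| − |E| + |T| for a Vietoris-Rips complex truncated at triangles. It uses a single distance threshold on a toy point set, which is a simplification of the three-parameter filtration described here, not the pipeline's construction.

```python
import math
from itertools import combinations

def euler_characteristic(points, eps):
    """chi = |V| - |E| + |T| for the Rips complex at scale eps (up to triangles)."""
    n = len(points)
    # An edge joins two points whose distance is at most eps.
    edges = {
        (i, j) for i, j in combinations(range(n), 2)
        if math.dist(points[i], points[j]) <= eps
    }
    # A triangle enters when all three of its edges are present.
    triangles = [
        tri for tri in combinations(range(n), 3)
        if all(pair in edges for pair in combinations(tri, 2))
    ]
    return n - len(edges) + len(triangles)

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
chi_isolated = euler_characteristic(square, 0.5)  # no edges yet
chi_loop = euler_characteristic(square, 1.0)      # four sides form one loop
chi_filled = euler_characteristic(square, 1.5)    # diagonals add 4 triangles
print(chi_isolated, chi_loop, chi_filled)
```

At the smallest scale χ equals the number of isolated points (4). It drops to 0 once the four sides close into a loop, then rises to 2 when the diagonals and triangles arrive.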

Silhouettes and Betti Curves

Persistence silhouettes and Betti curves for the current month (Dec 2025) overlaid on the 2018 baseline. Betti-0 counts connected components: how many distinct affordability clusters exist. Betti-1 counts loops: circular structure that signals substitutability cycles (property A is similar to B, B to C, but A is not directly comparable to C). When the Betti curves for the current month deviate significantly from the baseline, the market has undergone structural change beyond what a price index would capture. The silhouette width measures the significance of each feature: wide silhouettes indicate robust, persistent features; narrow silhouettes indicate noise.
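A Betti curve counts, at each filtration value, how many intervals of a persistence diagram are alive. The sketch below computes one and measures the gap between two curves as a mean absolute difference. The diagrams are toy inputs, and the gap metric is an assumption about how "deviation from baseline" might be quantified.

```python
def betti_curve(diagram, grid):
    """Number of (birth, death) intervals alive at each t, i.e. birth <= t < death."""
    return [sum(1 for b, d in diagram if b <= t < d) for t in grid]

def curve_gap(curve_a, curve_b):
    """Mean absolute difference between two curves on the same grid (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(curve_a, curve_b)) / len(curve_a)

grid = [0.1, 0.3, 0.45, 0.8, 2.0]
baseline = [(0.0, 1.0), (0.2, 0.5), (0.4, float("inf"))]  # toy baseline diagram
current = [(0.0, 1.0), (0.4, float("inf"))]                # toy current diagram

baseline_curve = betti_curve(baseline, grid)
current_curve = betti_curve(current, grid)
gap = curve_gap(baseline_curve, current_curve)
print(baseline_curve, current_curve, gap)
```

Here the current diagram is missing one short-lived interval, so the curves differ only on the part of the grid where that interval was alive.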

Bottleneck Fingerprint of a Bubble

The persistence vineyard tracks H1 intervals month by month across the 96-month window. Each colored track is a single persistent loop (a "hole" in the affordability manifold) followed through time. When tracks shift vertically, the corresponding feature is being born or dying at different filtration values, i.e. the market structure is drifting. The dashed reference lines mark the 2018 baseline; divergence from those lines measures how far the current market has wandered from its pre-pandemic topology. Eight Bayesian change points are detected (marked), corresponding to Fed rate decisions, stimulus phases, and the 2022–2023 rate hiking cycle. Each one is visible as a coordinated vertical displacement of multiple vineyard tracks at once.

Affordability Black Hole

A Kepler Mapper graph of the opportunity landscape. Each node is a cluster of properties with similar affordability-opportunity profiles; edges connect overlapping clusters. Node size reflects cluster population; node color reflects signed barcode intensity — blue for robust topological features, red for transient ones. The "black hole" of the title is the region where nodes are large (many properties) but colored red (fragile topology): these are neighborhoods where affordability appears to exist on paper but vanishes under small perturbations — exactly the trap that households without contingency savings fall into. The blue clusters are the genuine opportunity zones: neighborhoods where the topology of affordability is stable against income shocks, rate changes, and local price fluctuations.

Counterfactual Shock Lab

A counterfactual experiment: what happens to the Tulsa affordability topology if mortgage rates drop 100 basis points? TTAS recomputes the full point cloud under the shocked rate, re-runs persistence, and reports the topological average treatment effect (ATE), defined here as the bottleneck distance between the factual and counterfactual persistence diagrams. The observed ATE is ~1.7e-16, which is zero to within double-precision rounding error. A near-zero ATE means the market topology is rate-insensitive at this shock magnitude: prices and rents adjust in lockstep, preserving the relative geometry of what is affordable to whom. A large ATE would indicate a regime where rate changes restructure the market rather than simply reprice it.
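The first step of such an experiment is repricing ownership cost under the shocked rate. The sketch below uses the standard fixed-rate amortization formula. The loan size, the 7% starting rate, and the idea that ownership cost is principal and interest only are assumptions for illustration; the bottleneck-distance step is not shown.

```python
def monthly_payment(principal, annual_rate, years=30):
    """Fixed-rate payment: P * r / (1 - (1 + r)^-n), with monthly rate r."""
    r = annual_rate / 12
    n = years * 12
    if r == 0:
        return principal / n
    return principal * r / (1 - (1 + r) ** -n)

PRINCIPAL = 300_000          # hypothetical loan amount
FACTUAL_RATE = 0.07          # hypothetical starting rate
SHOCK_BP = -100              # the 100 basis point drop from the text

factual = monthly_payment(PRINCIPAL, FACTUAL_RATE)
counterfactual = monthly_payment(PRINCIPAL, FACTUAL_RATE + SHOCK_BP / 10_000)
saving = factual - counterfactual
print(round(factual, 2), round(counterfactual, 2), round(saving, 2))
```

Under these assumed inputs the payment falls by roughly a tenth. Every property's affordability coordinate shifts when the shocked cost is recomputed, and the topological comparison then asks whether the relative arrangement of properties changed as well.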

Gaussian Process Regime Classifier

Monthly Euler characteristic feature vectors are extracted from each time slice and used to train a Gaussian Process classifier. The GP learns a smooth decision surface between the four named regimes (Stable, Overheated, Rate Shock, Opportunity) in the space of Euler features. The current month (Dec 2025) is classified as Stable with high confidence. The GP's posterior variance is narrow near the current observation, meaning the classifier is not ambiguous about this assignment — the market topology has genuinely returned to a configuration that is statistically indistinguishable from the pre-pandemic baseline, even though prices and rates are at different absolute levels. The shape is back; the scale is not.

Topological Decision Boundary

A sweep through biography-space: income from $40K to $200K, DTI from 0.28 to 0.47, family size held at 3. For each (income, DTI) pair, TTAS restricts the market to affordable homes and computes the topological signal S(B). Cells are colored by decision label (Buy Opportunity / Neutral / Rent or Wait) and outlined where the signal sign, classification, or H1 persistence changes. These boundaries are the decision critical lines — small changes in income or credit that flip a household from one topological regime to another. The map shows that Tulsa's decision boundary is steepest between $75K–$100K of income: a $5,000 raise can move a household across three distinct topological regimes in this band.
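The restriction step of the sweep can be sketched as a grid over (income, DTI) pairs. The budget rule below (monthly income times DTI, with no other debts) and the list of monthly ownership costs are assumptions for illustration; computing S(B) for each cell is left out.

```python
def max_affordable_payment(annual_income, dti):
    """Monthly housing budget, assuming the full DTI allowance goes to housing."""
    return annual_income / 12 * dti

def affordable_subset(monthly_costs, annual_income, dti):
    """Properties whose monthly ownership cost fits within the budget."""
    budget = max_affordable_payment(annual_income, dti)
    return [cost for cost in monthly_costs if cost <= budget]

incomes = range(40_000, 200_001, 5_000)       # $40K to $200K, as in the text
dtis = [0.28, 0.33, 0.38, 0.43, 0.47]         # grid spacing is an assumption
monthly_costs = [1_400, 1_900, 2_400, 2_900, 3_400]  # hypothetical properties

# Number of affordable properties for every household biography on the grid.
grid = {
    (income, dti): len(affordable_subset(monthly_costs, income, dti))
    for income in incomes
    for dti in dtis
}
print(grid[(90_000, 0.38)], len(grid))
```

In the full analysis each cell would carry a topological signal rather than a count, but the count already shows how quickly the affordable set grows with income.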

Decision Boundary Navigator

For a specific household ($92K annual income, 38% DTI, family of 3: the Tulsa median profile), the signal S(B) is computed by restricting the manifold to affordable properties, computing the persistence landscape of the restricted set, subtracting the full-market landscape, and integrating the difference. The current signal is S(B) = −0.004 (the same value after normalization), which is essentially neutral with a faint tilt toward renting. This means that for the median Tulsa household, the affordable subset of the market has no topological structure that is not already present in the full market: buying at current prices and rates neither captures a unique opportunity nor exposes the household to a structural trap. The decision should be driven by non-market factors (job stability, school preferences, mobility needs) rather than market timing.

Trust Infrastructure

Data provenance, model validation, analyst notes, and exportable reports — everything needed to audit, defend, and verify the analysis.

Data Lineage

Every column in the 26-column TTAS manifold is classified as Observed (directly from FRED, Census, or Realtor.com), Calibrated (synthetic data scaled to match real aggregates), Derived (computed deterministically from other columns), Modeled (estimated via statistical rules), or Synthetic (generated from a parametric random process). Each classification includes its primary source, confidence level, API endpoint, date pulled, and full transformation chain. This page answers the question: where did this number come from?
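One way to picture a lineage entry is as a small record type. The sketch below is a hypothetical schema built from the fields listed above, not the pipeline's actual format; the example column and its transformation chain are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional

class Lineage(Enum):
    OBSERVED = "Observed"
    CALIBRATED = "Calibrated"
    DERIVED = "Derived"
    MODELED = "Modeled"
    SYNTHETIC = "Synthetic"

@dataclass
class ColumnLineage:
    column: str
    classification: Lineage
    primary_source: str
    confidence: str
    api_endpoint: Optional[str] = None   # left unset in this example
    date_pulled: Optional[date] = None   # left unset in this example
    transformations: List[str] = field(default_factory=list)

    def describe(self) -> str:
        chain = " -> ".join(self.transformations) or "none"
        return (f"{self.column}: {self.classification.value} "
                f"from {self.primary_source} (transformations: {chain})")

# Hypothetical entry; column name and transformation steps are illustrative.
mortgage_rate = ColumnLineage(
    column="mortgage_rate",
    classification=Lineage.OBSERVED,
    primary_source="FRED",
    confidence="high",
    transformations=["weekly series", "monthly mean"],
)
print(mortgage_rate.describe())
```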

Model Validation: Predicted vs Actual

When real Realtor.com data is available, calibrated synthetic prices are compared against observed monthly median listing prices. The identity line (dashed gold) represents perfect prediction. Points near the line indicate the calibration is working; systematic deviations indicate structural misalignment between the synthetic manifold and real market data. RMSE, R², and MAE are reported directly on the chart.
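The three metrics reported on the chart have standard definitions, sketched below. The price series are toy values, not TTAS output.

```python
import math

def regression_metrics(actual, predicted):
    """Return (rmse, mae, r2) for paired actual and predicted values."""
    n = len(actual)
    residuals = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in residuals) / n
    ss_res = sum(e * e for e in residuals)
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2

# Toy monthly median listing prices (in $K) and calibrated predictions.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
rmse, mae, r2 = regression_metrics(actual, predicted)
print(round(rmse, 3), round(mae, 3), round(r2, 3))
```

Points on the identity line contribute zero residual, so a perfect calibration gives RMSE and MAE of 0 and an R² of 1.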

Model Validation: Residual Error Over Time

Monthly residuals (actual minus predicted) for the calibration period. Residuals should be centered near zero with no visible trend. A systematic drift indicates that the synthetic manifold's parametric assumptions are diverging from real market behavior — a signal that the model needs recalibration. Large isolated spikes may correspond to data anomalies or regime transition months.

Model Validation: ZIP-Level Deviations

Each ZIP code's median listing price deviation from the metro-wide median. Bars are color-coded: teal for typical deviation (within ±8%), gold for moderate (±8–15%), and red for large outliers (>±15%). Large deviations don't necessarily indicate errors — they may reflect genuine neighborhood-level affordability gaps — but they warrant investigation.
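The banding rule can be sketched with the December 2025 median prices from the affordability table. The metro-wide median is not stated in the text, so the median of the ten ZIP medians is used here as a stand-in, which is an assumption.

```python
from statistics import median

# December 2025 median prices per ZIP, from the affordability table.
ZIP_PRICES = {
    "74137": 468_931, "74114": 440_089, "74105": 379_525, "74132": 345_619,
    "74133": 306_051, "74119": 301_649, "74120": 280_254, "74103": 274_773,
    "74135": 270_786, "74104": 255_141,
}

def deviation_band(deviation):
    """Bands from the text: within 8% typical, 8-15% moderate, above 15% large."""
    size = abs(deviation)
    if size <= 0.08:
        return "typical"
    if size <= 0.15:
        return "moderate"
    return "large"

# Stand-in for the metro-wide median (assumption, see above).
metro_median = median(ZIP_PRICES.values())
deviations = {z: (p - metro_median) / metro_median for z, p in ZIP_PRICES.items()}
bands = {z: deviation_band(d) for z, d in deviations.items()}
print(metro_median, bands)
```

With this stand-in, Riverview sits within 1% of the reference, while Southern Hills and Kendall Whittier both fall in the large band on opposite sides.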

Model Validation: Regime Classification

Confusion matrix comparing the Gaussian Process classifier's regime predictions against ground-truth labels from the synthetic data generator. Diagonal entries (teal) are correct classifications; off-diagonal entries are misclassifications. High accuracy on the diagonal confirms that the Euler-surface features carry sufficient information to distinguish market regimes. This validates the topological approach itself: if the GP couldn't recover the known regime labels from Euler features, the entire pipeline would be suspect.
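A confusion matrix over the four regimes can be built as follows. The label sequences are toy data for illustration and say nothing about the classifier's actual accuracy.

```python
REGIMES = ["Stable", "Overheated", "Rate Shock", "Opportunity"]

def confusion_matrix(truth, predicted, labels):
    """matrix[i][j] counts samples with true label i predicted as label j."""
    position = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        matrix[position[t]][position[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Toy ground-truth and predicted regime labels (hypothetical).
truth = ["Stable", "Stable", "Overheated", "Rate Shock", "Opportunity", "Opportunity"]
predicted = ["Stable", "Opportunity", "Overheated", "Rate Shock", "Opportunity", "Opportunity"]

matrix = confusion_matrix(truth, predicted, REGIMES)
acc = accuracy(matrix)
print(matrix, round(acc, 3))
```

The single off-diagonal entry in the first row records one Stable month misread as Opportunity.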

Full Housing Regime Report

A self-contained HTML report (214 KB) with all nine sections: Executive Summary, Market State, Data Provenance, Model Validation, ZIP Rankings, Regime Analysis, Risk Flags, Methodology, and Source Manifest. All Plotly figures are embedded and interactive. Generated locally by the TTAS pipeline — no external services, no API keys required for generation.

Key Takeaways

What eight years of Tulsa housing topology actually tells us.

1. The market shape has normalized, not just the prices. The GP classifier assigns December 2025 to the same regime as the 2018–2019 baseline. After the pandemic overheating (2020–2022) and rate shock (2022–2023), the topology has returned to a stable configuration — but at a higher absolute price level. The manifold is homeomorphic to its pre-pandemic self; it is not isometric.
2. Tulsa is structurally balanced, not speculative. Price-to-rent ratios of 16.4–17.9 across all ten ZIPs indicate a market where ownership costs and rental costs are in reasonable equilibrium. No ZIP shows a ratio above 20 (the speculation threshold) or below 12 (the distress threshold). This balance is visible in the topology as well: the Euler surface has a single dominant ridge, not the fractured multi-peak structure characteristic of bubbly markets.
3. The rate shock restructured the market; the recovery smoothed it. The Bayesian change-point detector found 8 structural breaks, all clustered around monetary policy events. But the vineyard tracks show that each shock's topological signature faded within 6–8 months, so the manifold re-stabilizes faster than prices do. This has a practical implication: waiting for "the right rate" is less important than waiting for the topology to settle after a rate change.
4. Affordability is a shape, not a number. 92% of individual property-months are classified as "buy" (ownership cost < affordable budget), yet the topological buy signal for the median household is neutral. This is the central insight of the topological approach: pointwise affordability does not imply structural opportunity. The geometry matters more than the average.
5. The $75K–$100K income band is the decision cliff. The topological decision boundary is steepest in this range. Below it, households are constrained to a small, fragile set of affordable properties — the topology is sparse and sensitive to shocks. Above it, the affordable set expands rapidly and stabilizes. Programs that push a household from $75K to $85K in effective income (via down-payment assistance, rate buydowns, or tax credits) produce a disproportionate topological benefit — they do not just increase the count of affordable homes; they move the household into a qualitatively different market regime.