Case Study

ClearValueRE

High-performance data and modeling system for real estate valuation

Ingestion (13K records): 90 s → < 2 s
Report generation: < 2 s
Statistical models: 6
E2E test scenarios: 125+

The Problem

  • Valuation models produce unstable results on noisy or incomplete data
  • Analytical workflows cannot scale — slow ingestion, slow computation
  • Outputs are dashboards, not decision-ready reports
  • No framework for model comparison or forecast validation

The Solution

  • Frontend-heavy computation with parallel worker pools
  • Lightweight backend — no cron, minimal orchestration
  • 6-model registry with uniform interface and comparison framework
  • Under-2-second analytical report generation (~50 pages)

Architecture

Frontend-driven computation keeps infrastructure lean while delivering real-time analytics.

FRONTEND — Browser (React / TypeScript)

  • UI Layer: MapComponent · AddressSearch · BacktestPanel · DataUpload · ReportPreview
  • Valuation Engine — 6-Model Registry: primary · fullRegional · CMA · linearPPSF · logLog · geomMean (regional multivariate log model)
  • Report Generator — client-side HTML → PDF: 6-model comparison tables · coefficient / significance analysis · forecast accuracy lookback
  • Web Worker Pools (parallel execution): backtest workers · aggregation workers · report worker
  • Client Data Layer: client-side property ID hashing · fingerprint-based smart re-init · CSV consolidation & chunking · GZip compression · spatial indexing

BACKEND — Python / FastAPI

  • REST API: sales · grid · geocode · auth
  • Ingestion Pipeline: COPY protocol · bulk upsert
  • Auth & User Isolation: JWT · OAuth · multi-tenant
  • Storage: PostgreSQL · Parquet (archival)

Performance

Optimized at every layer from ingestion to output.

Ingestion speedup: 45×
50-page report: < 2 s
Bulk insert vs row-by-row: 22×
API compression ratio: 6.5×
Optimization                  | Before      | After         | Method
Data ingestion (13K records)  | 90 seconds  | < 2 seconds   | PostgreSQL COPY + client-side property ID hashing
Report generation (~50 pages) | N/A         | < 2 seconds   | Client-side HTML rendering with worker offload
Bulk database insert          | Row-by-row  | COPY protocol | Temp table → INSERT ON CONFLICT with DISTINCT ON
API response size             | 2.6 MB      | ~400 KB       | GZip middleware on JSON responses
Model execution               | Main thread | Worker pools  | Orchestrator / worker pattern with round-robin dispatch
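The GZip middleware win comes from how well repetitive tabular JSON compresses. A minimal Python sketch of the idea; the payload shape and field names here are illustrative, not the actual API schema:

```python
import gzip
import json

def compress_response(payload: dict) -> tuple[bytes, float]:
    """Gzip a JSON payload, as a gzip middleware would, and return
    the compressed bytes plus the achieved compression ratio."""
    raw = json.dumps(payload).encode("utf-8")
    packed = gzip.compress(raw, compresslevel=6)
    return packed, len(raw) / len(packed)

# Sales records repeat the same keys and similar values on every row,
# which is why tabular JSON responses shrink by several multiples.
sales = {"records": [{"id": i, "price": 450_000 + i, "sqft": 1800} for i in range(1000)]}
packed, ratio = compress_response(sales)
```

The exact ratio depends on the data, but repetitive record lists routinely compress several-fold, consistent with the 2.6 MB → ~400 KB figure above.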

Data Pipeline

End-to-end flow from raw CSV upload to decision-ready report.

Client-side
  • CSV Consolidation — multi-file merge
  • Property ID Hash — SHA-256 in the browser

Server-side
  • Parse & Filter — 0.63 s, with deduplication
  • COPY Insert — 0.33 s, bulk upsert

Report output
  • HTML Report — < 2 s, all 6 models
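The SHA-256 hashing step runs in the browser in the real system; the same idea is sketched below in Python, with hypothetical field choices and normalization:

```python
import hashlib

def property_id(address: str, city: str, zip_code: str) -> str:
    """Derive a stable property ID by hashing normalized address fields.
    The field choice is illustrative; normalizing case and whitespace
    keeps the hash insensitive to formatting differences across CSVs."""
    key = "|".join(part.strip().lower() for part in (address, city, zip_code))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
```

Because the ID is derived deterministically on the client, the server never has to look up or mint identifiers during ingestion, which is what removes the bottleneck.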

Capabilities

What the system actually does.

Performance Engineering

  • Orchestrator / worker pattern for parallel model execution
  • PostgreSQL COPY protocol — 22× faster bulk inserts
  • Client-side property ID hashing eliminates server bottleneck
  • GZip middleware: 2.6 MB → ~400 KB API responses
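The orchestrator / worker pattern with round-robin dispatch can be sketched as a plain assignment loop; the queues below stand in for the actual Web Worker pools the system uses:

```python
from itertools import cycle

def round_robin_dispatch(tasks: list, n_workers: int) -> list[list]:
    """Assign tasks to workers in strict rotation, as an orchestrator
    would when fanning model runs out to a fixed pool."""
    queues: list[list] = [[] for _ in range(n_workers)]
    for task, queue in zip(tasks, cycle(queues)):
        queue.append(task)
    return queues
```

Round-robin keeps per-worker load within one task of even without any coordination between workers, which suits uniform workloads like repeated model runs.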

Modeling & Analytics

  • 6-model unified registry with consistent interface
  • OLS regression with quadratic time trend and Fourier deseasoning
  • EMA-smoothed local adjustment with sign test
  • Model comparison: stability vs. significance evaluation
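The EMA-smoothed adjustment above reduces to a standard exponential moving average; a minimal sketch, where the smoothing factor is illustrative rather than the system's actual parameter:

```python
def ema(values: list[float], alpha: float) -> list[float]:
    """Exponential moving average: each point blends the new observation
    with the running smoothed estimate, damping noise in sparse local data."""
    smoothed = values[0]
    out = [smoothed]
    for v in values[1:]:
        smoothed = alpha * v + (1 - alpha) * smoothed
        out.append(smoothed)
    return out
```

A higher alpha tracks recent sales more aggressively; a lower one favors stability, which is the trade-off the sign test then validates.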

Spatial Intelligence

  • Heat grid computation for per-cell price distribution
  • Two-level valuation: regional (5 mi) + local (0.75 mi)
  • Time-radius dispersion analysis for trend detection
  • Validation filters for outlier suppression
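The heat grid computation amounts to bucketing sales into lat/lon cells and summarizing each cell's price distribution; a simplified sketch, where the cell size and the choice of median as the per-cell statistic are assumptions:

```python
from collections import defaultdict
from statistics import median

def heat_grid(sales: list[tuple[float, float, float]], cell_deg: float = 0.01) -> dict:
    """Bucket (lat, lon, price) sales into fixed-size grid cells and
    summarize each cell with its median price."""
    cells: dict[tuple[int, int], list[float]] = defaultdict(list)
    for lat, lon, price in sales:
        key = (int(lat // cell_deg), int(lon // cell_deg))
        cells[key].append(price)
    return {key: median(prices) for key, prices in cells.items()}

# Two sales fall in one cell, the third in another.
sales = [
    (39.0012, -105.0014, 100),
    (39.0018, -105.0017, 200),
    (39.5054, -105.5051, 300),
]
grid = heat_grid(sales)
```

Integer cell keys double as a cheap spatial index: neighbor lookups become key arithmetic instead of distance scans.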

Report System

  • ~50-page analytical report in under 2 seconds
  • 6-model comparison with coefficient tables
  • Comparable sales with distance and PPSF ranking
  • Browser-native PDF export — no server rendering
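The report path assembles HTML directly in the browser; the core table-building idea is sketched below in Python, though the real system does this in TypeScript with far richer templates:

```python
def render_report(models: dict[str, float]) -> str:
    """Render a minimal model-comparison table as an HTML fragment.
    Building the markup client-side means PDF export needs no server
    round-trip: the browser prints the rendered document directly."""
    rows = "".join(
        f"<tr><td>{name}</td><td>${value:,.0f}</td></tr>"
        for name, value in models.items()
    )
    return f"<table><tr><th>Model</th><th>Estimate</th></tr>{rows}</table>"

report = render_report({"CMA": 450_000})
```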

Statistical Models

Six models, one uniform interface.

Model          | Method                                              | Scope
Primary        | Regional + local two-level multivariate log model   | Regional (5 mi) + local (0.75 mi)
Full Regional  | Regional OLS with rich diagnostics, quadratic trend | Regional (5+ mi, 24–36 mo)
CMA            | Comparable Market Analysis — nearest 6 comps        | Local (2 mi, 6 mo)
Linear PPSF    | Simple price-per-sqft linear regression             | Local
Log-Log Local  | Log-log model on local comparables                  | Local
Geometric Mean | Geometric mean price-per-sqft fallback              | Local
Regional Multivariate Log Model

ln(Price) = β₀ + β_size · ln(Sqft) + b · t + c · t² + β_beds · Beds + β_baths · Baths + β_attached · IsAttached + ε

  • Quadratic time trend captures market reversals · Fourier seasonal deseasoning (p < 0.05) · property type coefficient
  • EMA-smoothed local adjustment: sign test for micro-location effects · α = 2.0 · weekly snapshots
  • Confidence rating: High / Medium / Low based on sales count + R² · α = 0.05 threshold
  • Final estimate: reduced model refit · significance α = 0.05
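The geometric-mean fallback is the simplest of the six: it averages price per square foot in log space, which damps multiplicative outliers that would skew an arithmetic mean. A sketch with illustrative inputs:

```python
from math import exp, log

def geom_mean_ppsf(comps: list[tuple[float, float]]) -> float:
    """Geometric-mean price-per-sqft across (price, sqft) comparables.
    Averaging in log space makes a single 2x-overpriced comp pull the
    estimate far less than an arithmetic mean would."""
    ppsf = [price / sqft for price, sqft in comps]
    return exp(sum(log(p) for p in ppsf) / len(ppsf))

def estimate(subject_sqft: float, comps: list[tuple[float, float]]) -> float:
    """Scale the subject's square footage by the geometric-mean PPSF."""
    return subject_sqft * geom_mean_ppsf(comps)
```

With comps at 200 and 800 per square foot, the geometric mean lands at 400 rather than the arithmetic 500, illustrating the outlier damping.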

Technology

Production stack, end to end.

Frontend

React 18 TypeScript Vite Leaflet Chakra UI Web Workers Vitest Playwright

Backend

Python 3.11 FastAPI PostgreSQL SQLAlchemy JWT / OAuth pytest ruff structlog

Key Insight

Most valuation systems fail not because the algorithms are weak, but because they cannot handle real-world data variability and performance constraints at the same time. This system was built to solve both.

Next Step

If you have a complex system that needs this kind of engineering, let's talk.

Send a short note with the system, the bottleneck, and the outcome you need.