Toolkit · empirical economics

Delphi

Reproducible econometrics, paper-ready

An AI-augmented economics research toolkit for reproducible econometrics, data pipelines, and paper-oriented workflows. The estimation suite covers OLS, IV, DiD, staggered DiD, event study, RDD, panel FE, matching, DML, causal forest, synthetic DiD, quantile regression, decomposition, bounds, shift-share, selection, and randomization inference. Data pipelines cover World Bank, FRED, and UN Comtrade. CLI-first.

17 estimators
77 modules
6+ paper projects
CLI interface

What this toolkit promises

Estimation, not estimation theater

Each estimator wraps a proven Python library (statsmodels, linearmodels, pyfixest, doubleml, econml, rdrobust) with a consistent interface: fit, diagnostics, export. No custom solvers competing with established code.

Paper-oriented workflows

Projects are directories with config, data, estimation scripts, and output folders. The CLI generates project scaffolds, runs estimation pipelines, validates outputs, and produces audit logs for replication.
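
As a rough illustration of the project-as-directory idea, the sketch below creates a scaffold with a starter config. The folder names (data, scripts, output), the config.json file, and the scaffold_project helper are all hypothetical and are not taken from Delphi's source; the real layout is documented in the repo README.

```python
import json
from pathlib import Path

# Hypothetical folder names; Delphi's real scaffold may differ.
SUBDIRS = ("data", "scripts", "output")


def scaffold_project(root: Path, name: str) -> Path:
    """Create a self-contained project directory with a starter config."""
    project = root / name
    for sub in SUBDIRS:
        (project / sub).mkdir(parents=True, exist_ok=True)
    config_path = project / "config.json"
    if not config_path.exists():  # never overwrite an existing config
        config = {"project": name, "estimators": [], "sources": []}
        config_path.write_text(json.dumps(config, indent=2))
    return project
```

Keeping the function idempotent (existing folders and configs are left alone) means re-running a scaffold step on a live project cannot destroy work.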

Data pipelines built in

Fetchers for World Bank WDI, FRED, and UN Comtrade. Each pipeline handles pagination, caching, and schema validation. Data lands in clean CSV or Parquet, ready for estimation.
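
The three responsibilities named above (pagination, caching, schema validation) can be sketched in a few lines. This is an assumption-laden illustration, not Delphi's fetcher: fetch_page stands in for whatever HTTP call a real pipeline makes, and the required fields and JSON cache format are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List, Tuple

# Hypothetical schema: the fields every cleaned row must carry.
REQUIRED_FIELDS = {"country", "year", "value"}

# A page fetcher takes a 1-based page number and returns (rows, total_pages).
PageFetcher = Callable[[int], Tuple[List[Dict], int]]


def fetch_all(fetch_page: PageFetcher, cache_file: Path) -> List[Dict]:
    """Return all rows, reading from cache_file when it already exists."""
    if cache_file.exists():
        return json.loads(cache_file.read_text())

    rows, page, total_pages = [], 1, 1
    while page <= total_pages:
        # Each response reports how many pages exist in total.
        page_rows, total_pages = fetch_page(page)
        rows.extend(page_rows)
        page += 1

    # Validate before caching so a malformed response never lands on disk.
    for i, row in enumerate(rows):
        missing = REQUIRED_FIELDS - row.keys()
        if missing:
            raise ValueError(f"row {i} is missing fields: {sorted(missing)}")

    cache_file.write_text(json.dumps(rows))
    return rows
```

Injecting the page fetcher keeps the loop testable without network access; a real fetcher would wrap the provider's HTTP API behind the same signature.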

Why the name

The Oracle at Delphi delivered answers from structured inquiry. This toolkit provides the same contract: ask a well-formed empirical question, get a reproducible answer with diagnostics and provenance.

17 methods, consistent interface

Each estimator accepts a dataframe and a specification dict, returns fitted results with coefficient tables, diagnostics, and export helpers. Underlying libraries handle the numerics.

OLS: linear regression with robust/clustered standard errors
IV / 2SLS: instrumental variables estimation
Panel FE: entity and time fixed effects (linearmodels)
DiD: canonical difference-in-differences
Staggered DiD: Callaway-Sant'Anna, Sun-Abraham (pyfixest)
RDD: sharp and fuzzy regression discontinuity (rdrobust)
Matching: propensity score and coarsened exact matching
DML: double/debiased machine learning (doubleml)
Causal Forest: heterogeneous treatment effects (econml)
Synthetic DiD: synthetic control with DiD adjustment
Quantile: conditional quantile regression
Decomposition: Oaxaca-Blinder, Kitagawa
Bounds: Lee bounds, Manski bounds for partial identification
Shift-Share: Bartik instruments with inference corrections
Selection: Heckman two-step and MLE selection models
Randomization Inference: Fisher exact p-values under the sharp null, computed by permuting treatment assignment
Event Study: dynamic treatment effect plots and pre-trend tests
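
To make the "dataframe plus specification dict" contract concrete, here is a minimal sketch of an OLS wrapper over statsmodels. The function name fit_ols and the spec keys ("formula", "se", "cluster") are assumptions chosen for illustration; Delphi's actual signatures are in the repo reference.

```python
import pandas as pd
import statsmodels.formula.api as smf


def fit_ols(df: pd.DataFrame, spec: dict) -> dict:
    """Fit OLS from a spec dict; return a coefficient table and diagnostics.

    Hypothetical spec keys: "formula" (patsy formula), "se" ("classical",
    "robust", or "cluster"), and "cluster" (column name when se == "cluster").
    Assumes no rows are dropped for missing values, so cluster labels stay
    aligned with the estimation sample.
    """
    model = smf.ols(spec["formula"], data=df)
    se = spec.get("se", "classical")
    if se == "cluster":
        # One integer group label per observation.
        groups = pd.factorize(df[spec["cluster"]])[0]
        res = model.fit(cov_type="cluster", cov_kwds={"groups": groups})
    elif se == "robust":
        res = model.fit(cov_type="HC1")
    else:
        res = model.fit()

    table = pd.DataFrame({"coef": res.params, "se": res.bse, "p": res.pvalues})
    diagnostics = {"nobs": int(res.nobs), "r2": float(res.rsquared), "se": se}
    return {"table": table, "diagnostics": diagnostics}
```

A call such as fit_ols(df, {"formula": "y ~ x", "se": "cluster", "cluster": "firm"}) returns a table that can be exported with table.to_csv(...). The numerics stay entirely inside statsmodels; the wrapper only standardizes inputs and outputs.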

Technology

Pure Python library. No web server, no database. The estimators wrap well-tested packages; Delphi provides the glue, project structure, CLI, and audit trail.

statsmodels: OLS, quantile
linearmodels: panel FE, IV
pyfixest: staggered DiD
doubleml: DML
econml: causal forest
rdrobust: RDD
scikit-learn: matching, ML
click: CLI framework
pandas: data wrangling

Command-line interface

All commands use the econai entry point. Projects are self-contained directories with config, data, and output folders.

econai init: scaffold a new paper project with config and folder structure
econai fetch: run data pipelines (WB, FRED, Comtrade) and cache results
econai estimate: run estimation pipeline from project config
econai validate: check outputs against specification (coefficients, diagnostics)
econai pipeline: run full fetch, estimate, validate sequence
econai audit: generate replication audit log for a project
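
One plausible shape for a replication audit log is a manifest of file hashes, sketched below. This is an assumption about what such a log might contain, not the format econai audit actually writes; build_audit_log and its field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def build_audit_log(project: Path) -> dict:
    """Hash every file under the project so a replicator can verify it."""
    entries = []
    for path in sorted(p for p in project.rglob("*") if p.is_file()):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        entries.append(
            {"path": path.relative_to(project).as_posix(), "sha256": digest}
        )
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "files": entries,
    }
```

The result can be serialized with json.dumps(log, indent=2). Writing it outside the hashed directory, or excluding it from the walk, avoids the log changing its own checksum on the next run.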

Get started

Install from the repository:

pip install -e delphi/

Scaffold a new paper project:

econai init my-growth-paper

Fetch data and run the estimation pipeline:

econai fetch --source wb --indicators NY.GDP.PCAP.KD && econai estimate

The full reference (every estimator, CLI flag, project config schema) lives in the repo:

delphi/README.md