Toolkit · empirical economics

Delphi

Reproducible econometrics, paper-ready

AI-augmented economics research toolkit: reproducible econometrics, data pipelines, and paper-oriented workflows. Estimation suite covering OLS, IV, DiD, staggered DiD, RDD, panel FE, matching, DML, causal forest, synthetic DiD, quantile, decomposition, bounds, shift-share, selection, randomization inference, and event study. Data pipelines for World Bank, FRED, UN Comtrade. CLI-first.

Design

What this toolkit promises

Estimation, not estimation theater

Each estimator wraps a proven Python library (statsmodels, linearmodels, pyfixest, doubleml, econml, rdrobust) with a consistent interface: fit, diagnostics, export. No custom solvers competing with established code.

Paper-oriented workflows

Projects are directories with config, data, estimation scripts, and output folders. The CLI generates project scaffolds, runs estimation pipelines, validates outputs, and produces audit logs for replication.
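As a rough illustration of what a project scaffold could look like, the sketch below creates a directory with a config file, data folders, a scripts folder, and output folders. The folder names, the `config.json` file, and the `scaffold_project` helper are assumptions made for this example, not the layout the CLI actually generates.

```python
import json
from pathlib import Path

# Hypothetical layout: these folder names are illustrative assumptions.
SUBDIRS = ("data/raw", "data/clean", "scripts", "output/tables", "output/logs")


def scaffold_project(root: Path, name: str) -> Path:
    """Create a project directory with a config file and standard subfolders."""
    project = root / name
    for sub in SUBDIRS:
        (project / sub).mkdir(parents=True, exist_ok=True)

    config_path = project / "config.json"
    if not config_path.exists():  # never overwrite an existing config
        config = {"project": name, "sources": [], "estimators": []}
        config_path.write_text(json.dumps(config, indent=2))
    return project
```

Calling `scaffold_project(Path("."), "my-growth-paper")` twice is safe: existing folders are kept and an existing config is left untouched.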

Data pipelines built in

Fetchers for World Bank WDI, FRED, and UN Comtrade. Each pipeline handles pagination, caching, and schema validation. Data lands in clean CSV or Parquet, ready for estimation.
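The three pipeline duties named above (pagination, caching, schema validation) can be sketched without touching the network. Everything here is an assumption for illustration: the `fetch_page` callable stands in for a real API client, the three-field schema is invented, and the JSON cache file is a simplification of the CSV or Parquet output.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Iterator

# A page fetcher takes a 1-based page number and returns (rows, total_pages).
PageFetcher = Callable[[int], tuple[list[dict], int]]

# Assumed schema for this sketch: field name -> required Python type.
REQUIRED_FIELDS = {"country": str, "year": int, "value": float}


def iter_pages(fetch_page: PageFetcher) -> Iterator[dict]:
    """Walk every page until the reported page count is exhausted."""
    page, total = 1, 1
    while page <= total:
        rows, total = fetch_page(page)
        yield from rows
        page += 1


def validate(rows: list[dict]) -> list[dict]:
    """Raise ValueError on the first row that violates the schema."""
    for i, row in enumerate(rows):
        for field, typ in REQUIRED_FIELDS.items():
            if not isinstance(row.get(field), typ):
                raise ValueError(f"row {i}: field {field!r} is not {typ.__name__}")
    return rows


def fetch_cached(key: str, fetch_page: PageFetcher, cache_dir: Path) -> list[dict]:
    """Return cached rows for `key`, or fetch, validate, and cache them."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / (hashlib.sha256(key.encode()).hexdigest()[:16] + ".json")
    if path.exists():
        return json.loads(path.read_text())
    rows = validate(list(iter_pages(fetch_page)))
    path.write_text(json.dumps(rows))
    return rows
```

Validation runs before the cache write, so a malformed response never lands on disk and a second call with the same key never hits the source again.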

Why the name

The Oracle at Delphi delivered answers from structured inquiry. This toolkit provides the same contract: ask a well-formed empirical question, get a reproducible answer with diagnostics and provenance.

Estimators

17 methods, consistent interface

Each estimator accepts a dataframe and a specification dict, and returns fitted results with coefficient tables, diagnostics, and export helpers. The underlying libraries handle the numerics.

OLS: linear regression with robust/clustered standard errors

IV / 2SLS: instrumental variables estimation

Panel FE: entity and time fixed effects (linearmodels)

DiD: canonical difference-in-differences

Staggered DiD: Callaway-Sant'Anna, Sun-Abraham (pyfixest)

RDD: sharp and fuzzy regression discontinuity (rdrobust)

Matching: propensity score and coarsened exact matching

DML: double/debiased machine learning (doubleml)

Causal Forest: heterogeneous treatment effects (econml)

Synthetic DiD: synthetic control with DiD adjustment

Quantile: conditional quantile regression

Decomposition: Oaxaca-Blinder, Kitagawa

Bounds: Lee bounds, Manski bounds for partial identification

Shift-Share: Bartik instruments with inference corrections

Selection: Heckman two-step and MLE selection models

Randomization Inference: Fisher exact p-values via permutation

Event Study: dynamic treatment effect plots and pre-trend tests
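Of the methods listed, randomization inference is the easiest to show end to end. The sketch below computes an exact Fisher p-value by enumerating every assignment with the same number of treated units and comparing each difference in means to the observed one. The function name is hypothetical and this is not taken from Delphi's code; full enumeration is only feasible for small samples.

```python
from itertools import combinations
from statistics import mean


def fisher_exact_pvalue(outcomes: list[float], treated: list[int]) -> float:
    """Two-sided exact p-value under the sharp null of no effect for any unit.

    Under the sharp null the outcomes are fixed, so the test statistic can be
    recomputed for every possible assignment with the same number treated.
    """
    n, k = len(outcomes), sum(treated)

    def diff_in_means(treated_idx: set[int]) -> float:
        t = [outcomes[i] for i in treated_idx]
        c = [outcomes[i] for i in range(n) if i not in treated_idx]
        return mean(t) - mean(c)

    observed = diff_in_means({i for i, d in enumerate(treated) if d})
    stats = [diff_in_means(set(idx)) for idx in combinations(range(n), k)]
    # Small tolerance so ties with the observed statistic count as extreme.
    extreme = sum(abs(s) >= abs(observed) - 1e-12 for s in stats)
    return extreme / len(stats)


# Six units, three treated: 20 possible assignments, 2 at least as extreme.
print(fisher_exact_pvalue([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1]))  # 0.1
```

For larger samples the number of assignments grows too quickly to enumerate, and a random sample of permutations is the usual substitute.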

Stack

Technology

Pure Python library. No web server, no database. The estimators wrap well-tested packages; Delphi provides the glue, project structure, CLI, and audit trail.

statsmodels: OLS, quantile

linearmodels: panel FE, IV

pyfixest: staggered DiD

doubleml: DML

econml: causal forest

rdrobust: RDD

scikit-learn: matching, ML

click: CLI framework

pandas: data wrangling

CLI

Command-line interface

All commands use the econai entry point. Projects are self-contained directories with config, data, and output folders.

econai init: scaffold a new paper project with config and folder structure

econai fetch: run data pipelines (World Bank, FRED, UN Comtrade) and cache results

econai estimate: run estimation pipeline from project config

econai validate: check outputs against specification (coefficients, diagnostics)

econai pipeline: run full fetch, estimate, validate sequence

econai audit: generate replication audit log for a project
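One plausible shape for a replication audit log is a manifest of file hashes, so that a later run can show exactly which inputs or outputs changed. The sketch below is an assumption about how such a log could be built; it does not describe what `econai audit` actually writes, and the function names and log keys are hypothetical.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large datasets do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_audit_log(project: Path) -> dict:
    """Record a timestamp and a content hash for every file in the project."""
    files = sorted(p for p in project.rglob("*") if p.is_file())
    return {
        "project": project.name,
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "files": {p.relative_to(project).as_posix(): sha256_file(p) for p in files},
    }


def changed_files(old: dict, new: dict) -> list[str]:
    """List paths that were added, removed, or modified between two logs."""
    keys = set(old["files"]) | set(new["files"])
    return sorted(k for k in keys if old["files"].get(k) != new["files"].get(k))
```

Comparing the log from the original run with one built on a replicator's machine narrows any discrepancy to specific files before anyone re-runs an estimator.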

Run

Get started

Install from the repository:

pip install -e delphi/

Scaffold a new paper project:

econai init my-growth-paper

Fetch data and run the estimation pipeline:

econai fetch --source wb --indicators NY.GDP.PCAP.KD && econai estimate

The full reference (every estimator, CLI flag, project config schema) lives in the repo:

delphi/README.md