Research toolkit and methodology

Delphi

Economics research toolkit for estimation, ingestion, and paper generation.

Summary

Delphi is a CLI-based economics research toolkit that runs estimation, ingests data from canonical sources, and drives papers from planning to submission. It enforces a six-phase workflow with mandatory gate documents and a first-principles discipline that blocks progression until each phase passes. The toolkit is built for reproducible research across development, trade, and policy economics.

What it is

Delphi is a structured Python environment for conducting reproducible economics research. The toolkit exposes a single Click CLI with four primary commands: ingest, estimate, paper, and manuscript. Estimation covers panel regression, instrumental variables, difference-in-differences, regression discontinuity, double machine learning, and spatial econometrics, drawing on statsmodels, linearmodels, pyfixest, DoubleML, EconML, and rdrobust. Ingestion modules pull series from World Bank WDI, FRED, UN Comtrade, and the IMF, and read local Excel, Stata, or SPSS files. Every module follows the same collect, validate, and store contract, which records provenance for each dataset.
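The collect, validate, and store contract can be pictured with a minimal sketch. Everything named here (Source, Dataset, LocalRowsSource, the registry dict, and the provenance keys) is a hypothetical illustration, not Delphi's actual module layout, and the rows used with it are placeholders rather than real data.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Dataset:
    """Rows plus the provenance recorded alongside them (hypothetical shape)."""
    rows: list
    provenance: dict = field(default_factory=dict)


class Source(ABC):
    """Hypothetical base class: every source implements collect(),
    then shares the same validate() and store() steps."""

    name = "unnamed"

    @abstractmethod
    def collect(self):
        """Fetch raw rows from the upstream source."""

    def validate(self, rows):
        # Refuse empty pulls and rows without a value before anything is stored.
        if not rows:
            raise ValueError(f"{self.name}: no rows collected")
        for row in rows:
            if row.get("value") is None:
                raise ValueError(f"{self.name}: row without a value: {row}")
        return rows

    def store(self, rows, registry):
        # Provenance is written at the same moment the data is stored.
        dataset = Dataset(
            rows=rows,
            provenance={
                "source": self.name,
                "retrieved_at": datetime.now(timezone.utc).isoformat(),
                "n_rows": len(rows),
            },
        )
        registry[self.name] = dataset
        return dataset

    def run(self, registry):
        # The contract: collect, then validate, then store.
        return self.store(self.validate(self.collect()), registry)


class LocalRowsSource(Source):
    """Toy source standing in for a local file reader (assumption)."""

    name = "local_rows"

    def __init__(self, rows):
        self._rows = rows

    def collect(self):
        return list(self._rows)
```

Under these assumptions, a new source only has to implement collect(); validation and provenance recording cannot be skipped because run() is the single entry point.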

The defining feature is a phase-based research workflow. The toolkit detects the current phase from project context and file state, and each phase runs its own set of skills:

  • Phase 1, Planning: first-principles, contribution-audit, and pre-analysis-plan.
  • Phase 2, Data: a data-quality audit before any regression.
  • Phase 3, Estimation: econometric-reasoning, economic-magnitude, mechanism-analysis, and structural-breaks audits.
  • Phase 4, Writing: introduction-writing and the paper generator.
  • Phase 5, Audit: paper-coverage, table-figure-necessity, hostile-referee, visual-check, and text-audit.
  • Phase 6, Submission: journal-targeting and referee-response.

The workflow refuses to declare a paper ready until every phase checklist is closed, and the first-principles skill governs every decision: derive the question from economic logic, sit with results that do not make sense until they do, and never claim more than the evidence supports.
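One way to picture phase detection from file state is a lookup over gate documents. The sketch below is an assumption throughout: the gates/ directory, the .md file names (derived from the skill names above), and the functions current_phase and is_ready are hypothetical and do not describe Delphi's real layout.

```python
from pathlib import Path

# Hypothetical gate documents per phase, in workflow order.
PHASE_GATES = [
    ("planning", ["first-principles.md", "contribution-audit.md",
                  "pre-analysis-plan.md"]),
    ("data", ["data-quality.md"]),
    ("estimation", ["econometric-reasoning.md", "economic-magnitude.md",
                    "mechanism-analysis.md", "structural-breaks.md"]),
    ("writing", ["introduction-writing.md"]),
    ("audit", ["paper-coverage.md", "table-figure-necessity.md",
               "hostile-referee.md", "visual-check.md", "text-audit.md"]),
    ("submission", ["journal-targeting.md", "referee-response.md"]),
]


def current_phase(project_dir):
    """Return (phase, missing_docs) for the first phase whose gate is
    still open, or None when every gate document exists."""
    gates = Path(project_dir) / "gates"
    for phase, docs in PHASE_GATES:
        missing = [d for d in docs if not (gates / d).is_file()]
        if missing:
            return phase, missing
    return None


def is_ready(project_dir):
    """A paper counts as ready only when no phase has an open gate."""
    return current_phase(project_dir) is None
```

Because the phases are checked in order, a project with a finished estimation audit but no pre-analysis plan is still reported as being in planning, which matches the rule that later phases cannot start before earlier gates close.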

Methodology

  • Six-phase research workflow covering planning, data, estimation, writing, audit, and submission, each with mandatory gate documents.
  • First-principles discipline that forces derivation from economic logic rather than convention and blocks unverified claims.
  • Estimation library spanning OLS, IV, panel fixed effects, DiD, RD, double ML, causal forest, and spatial models.
  • Pre-analysis plan, contribution audit, and identification audit run before any estimation begins.
  • Econometric reasoning audit that checks identification strategy, clustering, magnitudes, robustness, and causal claims.
  • Mechanism analysis and hostile-referee simulation that build preemptive defenses before submission.
  • Automated paper generation pipeline with visual-check and text-audit on the compiled PDF.
  • Provenance enforcement on every dataset: mock data is not allowed, and the source URL, vintage, and license are recorded.
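The provenance rule in the last item can be illustrated with a small record check. This is a sketch only: the Provenance class, its is_mock flag, and check_provenance are hypothetical names, and the field set simply mirrors the three attributes listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provenance:
    """Hypothetical provenance record for one dataset."""
    source_url: str
    vintage: str
    license: str
    is_mock: bool = False  # assumed flag; mock data is rejected outright


def check_provenance(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if record.is_mock:
        problems.append("mock data is not allowed")
    for name in ("source_url", "vintage", "license"):
        if not getattr(record, name).strip():
            problems.append(f"missing {name}")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a gate document list all provenance gaps for a dataset in a single pass.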

Data sources

  • World Bank WDI via wbgapi
  • FRED via fredapi
  • UN Comtrade via comtradeapicall
  • IMF WEO, IFS, and DOTS via imfp
  • Penn World Table
  • CEPII BACI and gravity datasets
  • Bangladesh Bureau of Statistics
  • IPUMS census and survey microdata

Deliverables when used in engagements

  • Reproducible estimation outputs from panel, IV, DiD, RD, and double ML methods.
  • Pre-analysis plans, data quality reports, and results audits as gate documents per phase.
  • Automatically generated working papers and final manuscripts with publication-ready tables and figures.
  • Provenance records for every dataset with source, vintage, transformations, and license.
  • Audit reports covering econometric reasoning, paper coverage, hostile referee, and visual and text checks.