Calculates 103 firm characteristics from CRSP + Compustat directly in Python – no WRDS SAS cloud
Data matching for corporate governance research
This GitHub repository shows the data collection and analysis for the “Regulatory Fragmentation” paper by Kalmenovitz, Lowry, and Volkova, The Journal of Finance (forthcoming)
Fuzzy match entity names (primarily persons and companies) across databases
Pipeline for WRDS (Wharton Research Data Services) datasets, including crsp, master, etc., to build a mega-database for scaling market microstructure research
Replication code for "The Shape of Beta: Industry Factor Structure and Crisis Risk Premium" (Woo & Kim, 2026)
Repository for CQA: How much new information is there in earnings? Reproducible Empirical Accounting Research. Ass III
End-to-end Python implementation of Mo et al.'s (2025) ACT-Tensor methodology, a tensor completion framework for financial dataset imputation. Implements cluster-based CP decomposition, HOSVD factor extraction, temporal smoothing (CMA/EMA/Kalman), and downstream asset pricing evaluation. Transforms sparse data into dense, machine-readable data.
Classify CRSP-style delisting reasons from SEC EDGAR for quant pipelines
Minimal PEAD (post-earnings announcement drift) backtest using Wharton Research Data Services (IBES + CRSP) — Python pipeline for research & plots.
Academically rigorous implementation of the Fama-French (2015) five-factor model using WRDS (CRSP + Compustat) data.
Point-in-time insider filing de-noising, ML scoring, and cross-sectional return tests.
Intelligent Data Preparation Agent: an intelligent data preparation system that matches the mode of agent intervention to decision difficulty
Empirical study of idiosyncratic volatility and abnormal returns during VIX spike events, using a survivorship-bias-free CRSP universe (2005–2024)
Empirical analysis of ESG performance and financial returns using CRSP, Compustat, and Refinitiv data. Panel of 18K+ U.S. firm-years (2013–2023). Covers multi-database merging, OLS/panel regressions with fixed effects, and industry-level double materiality classification. Python · pandas · statsmodels · WRDS
Point-in-time asset-pricing pipeline linking Compustat geographic segments to macro states, CRSP returns, and leakage-aware ML diagnostics.
Rolling-window XGBoost cross-sectional return prediction for US equities (1995-2024). Out-of-sample annualized Sharpe 1.03, monthly CAPM alpha +2.19% (t=6.08), market beta -0.43 over 300 months (2000-2024).
Academically rigorous implementation of the Fama-French (1993) three-factor model using WRDS (CRSP + Compustat) data.