Causal Inference and Causal Machine Learning with Practical Applications

Presenters

Dhiraj Gangaraju

Ex-Walmart Global Tech India (Senior Data Scientist). 

Bio

Dhiraj is an experienced data science practitioner working as a data scientist with Walmart Global Tech India for the last 4 years. He holds a master’s degree in analytics from the Indian Institute of Science (IISc), Bangalore. Dhiraj has worked on a range of techniques, such as propensity modeling, forecasting, and ranking, across domains such as Customer, Pricing, and Real Estate. He is passionate about creating and deploying end-to-end data science products for business use cases. He is particularly keen on topics such as causality, MLOps, statistical analysis, machine learning, and neural networks.

Email – dhiraj.gangaraju@walmart.com


Abstract: Building prescriptive models is one of the most important research areas in machine learning. It requires understanding and measuring the causal impact of a proposed treatment, then designing an optimal strategy based on that causal estimate. Traditional impact measurement frameworks such as A/B testing and randomized controlled trials have limitations in terms of feasibility, time constraints, and unknown confounders. Observational causal inference techniques can achieve comparable, and in some cases better, results when measuring the impact of proposed changes to a system. Our models serve a wide variety of users who may respond very differently to a given change. Causal machine learning models capture this heterogeneous behavior well, which helps in developing better prescriptive recommendations and implementation strategies. The tutorial will cover observational causal inference techniques such as propensity and covariate matching, and causal ML techniques for conditional average treatment effect estimation using a wide variety of algorithms, including meta-learners, direct uplift estimation, and tree-based algorithms. It will also cover model validation and visualization for causal ML models, and the implementation of such models in industry case studies.
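
The covariate matching mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the tutorial material: the synthetic data, the true effect of 2.0, and the helper name `match_att` are all assumptions made for illustration. The covariate x drives both treatment assignment and the outcome, so a naive comparison of treated and control means is biased, while matching each treated unit to its nearest control on x should recover an estimate close to the true effect.

```python
import random

random.seed(0)

# Synthetic observational data (all values here are illustrative assumptions).
# x is a confounder: it raises both the chance of treatment and the outcome.
TRUE_EFFECT = 2.0
units = []
for _ in range(2000):
    x = random.random()
    t = 1 if random.random() < 0.2 + 0.6 * x else 0
    y = 3.0 * x + TRUE_EFFECT * t + random.gauss(0.0, 0.1)
    units.append((x, t, y))

treated = [(x, y) for x, t, y in units if t == 1]
control = [(x, y) for x, t, y in units if t == 0]


def mean(values):
    return sum(values) / len(values)


def match_att(treated, control):
    """Average treatment effect on the treated via 1-nearest-neighbor
    matching on x, with replacement (hypothetical helper)."""
    diffs = []
    for x_t, y_t in treated:
        # The closest control unit in covariate space stands in for the
        # treated unit's unobserved counterfactual outcome.
        _, y_c = min(control, key=lambda c: abs(c[0] - x_t))
        diffs.append(y_t - y_c)
    return mean(diffs)


naive = mean([y for _, y in treated]) - mean([y for _, y in control])
att = match_att(treated, control)
print(f"naive difference in means: {naive:.2f}")
print(f"matched ATT estimate:      {att:.2f}")
```

With a single covariate, matching on x directly is enough. With many covariates, one common choice is to match on an estimated propensity score instead, which is the other matching variant the abstract names.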

Duration: Half day

Target Audience: Statisticians and data scientists from industry or academia who are interested in causal inference and machine learning and want to learn how to implement them in large-scale applications. Attendees will benefit from prior knowledge of statistical hypothesis testing, distance measures, basic supervised and unsupervised machine learning models, and the Python programming language.

Description & Outline:

Topic: Introduction to Causal Inference
Details:
  1. Explanation of basic concepts and terminology of Causal Inference

Topic: Causal Inference with Hands-On Case Study
Details:
  1. Traditional Impact Measurement Systems
  2. Counterfactuals
  3. Confounders and Instrumental Variables
  4. Matching Techniques
  5. Regression Discontinuity Design, Difference-in-Differences
  6. Comparison of Observational Causal Inference with Traditional Methods
  7. Hands-On Case Study on Observational Inference

Topic: Causal Machine Learning Models with Hands-On Case Study
Details:
  1. Conditional Average Treatment Effect
  2. Uplift Models
  3. Meta-Learners: Transformed Outcome, Two-Model Approach
  4. Double/Debiased ML Models
  5. Uplift Trees, Uplift Random Forest, Causal Forests
  6. Deep Learning Based Causal ML
  7. Evaluation of Uplift Models
  8. Experiment Design Based on Uplift Models
  9. Hands-On Case Study on Uplift Models
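
As a rough illustration of the two-model approach listed in the outline, the sketch below fits one outcome model on treated units and another on control units, then takes the difference of their predictions as the conditional average treatment effect. It is a sketch under stated assumptions, not the tutorial's own code: the synthetic randomized data, the true effect 1 + 2x, the simple least-squares line used as the outcome model, and the names `fit_line` and `cate` are all hypothetical.

```python
import random

random.seed(1)

# Synthetic randomized experiment (illustrative assumption, not tutorial data).
# The true treatment effect grows with x: tau(x) = 1 + 2x.
rows = []
for _ in range(4000):
    x = random.random()
    t = random.randint(0, 1)
    y = 1.0 + x + t * (1.0 + 2.0 * x) + random.gauss(0.0, 0.5)
    rows.append((x, t, y))


def fit_line(points):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


# One outcome model per arm.
a1, b1 = fit_line([(x, y) for x, t, y in rows if t == 1])
a0, b0 = fit_line([(x, y) for x, t, y in rows if t == 0])


def cate(x):
    """Estimated uplift at x: treated-model minus control-model prediction."""
    return (a1 + b1 * x) - (a0 + b0 * x)


for x in (0.0, 0.5, 1.0):
    print(f"x={x:.1f}  estimated CATE={cate(x):.2f}  true={1 + 2 * x:.2f}")
```

In practice the two lines would typically be replaced by any supervised learner, and on observational data the estimate would additionally need confounding adjustment.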

References:

  1. Jaskowski, M. and Jaroszewicz, S., 2012, June. Uplift modeling for clinical trial data. In ICML Workshop on Clinical Data Analysis (Vol. 46).
  2. Athey, S. and Imbens, G.W., 2015. Machine learning methods for estimating heterogeneous causal effects. stat, 1050(5), pp.1-26.
  3. Rzepakowski, P. and Jaroszewicz, S., 2012. Decision trees for uplift modeling with single and multiple treatments. Knowledge and Information Systems, 32(2), pp.303-327.
  4. Athey, S. and Imbens, G., 2016. Recursive partitioning for heterogeneous causal effects. Proceedings of the National Academy of Sciences, 113(27), pp.7353-7360.
  5. Wager, S. and Athey, S., 2018. Estimation and inference of heterogeneous treatment effects using random forests. Journal of the American Statistical Association, 113(523), pp.1228-1242.
  6. Athey, S., Tibshirani, J. and Wager, S., 2019. Generalized random forests. Annals of Statistics, 47(2), pp.1148-1178.
  7. Guelman, L., Guillén, M. and Pérez-Marín, A.M., 2015. Uplift random forests. Cybernetics and Systems, 46(3-4), pp.230-248.
  8. Nie, X. and Wager, S., 2017. Quasi-oracle estimation of heterogeneous treatment effects. arXiv preprint arXiv:1712.04912.
  9. Devriendt, F., Moldovan, D. and Verbeke, W., 2018. A literature survey and experimental evaluation of the state-of-the-art in uplift modeling: A stepping stone toward the development of prescriptive analytics. Big Data, 6(1), pp.13-41.
  10. Zhang, Weijia, Jiuyong Li, and Lin Liu. “A unified survey on treatment effect heterogeneity modeling and uplift modeling.” arXiv preprint arXiv:2007.12769 (2020).
  11. Künzel, Sören R., et al. “Metalearners for estimating heterogeneous treatment effects using machine learning.” Proceedings of the National Academy of Sciences 116.10 (2019): 4156-4165.
  12. Chen, Huigang, et al. “Causalml: Python package for causal machine learning.” arXiv preprint arXiv:2002.11631 (2020).
  13. V. Syrgkanis, V. Lei, M. Oprescu, M. Hei, K. Battocchi, G. Lewis. Machine Learning Estimation of Heterogeneous Treatment Effects with Instruments. Proceedings of the 33rd Conference on Neural Information Processing Systems (NeurIPS), 2019.
  14. Chernozhukov, Victor, et al. “Double/debiased/Neyman machine learning of treatment effects.” American Economic Review 107.5 (2017): 261-65.
  15. Shi, Claudia, David Blei, and Victor Veitch. “Adapting neural networks for the estimation of treatment effects.” 33rd Conference on Neural Information Processing Systems (NeurIPS 2019), 2019.
  16. Louizos, Christos, et al. “Causal effect inference with deep latent-variable models.” arXiv preprint arXiv:1705.08821 (2017).

Need help? Get in Touch

bda2025@iiitb.ac.in

 

Sponsorship / Local Arrangements Queries

bda2025@iiitb.ac.in, sushree.behera@iiitb.ac.in

HOST CITY

Join us in Bengaluru this July to engage with thought leaders, share your work, and drive the future of big data and AI!