IDENTIFICATION AND ANALYSIS OF CONFOUNDING VARIABLES AND SIMPSON’S PARADOX
Chattopadhyay, Ishani (2022) IDENTIFICATION AND ANALYSIS OF CONFOUNDING VARIABLES AND SIMPSON’S PARADOX. [Dissertation (University of Nottingham only)]
Abstract
This dissertation uses machine learning algorithms and model-agnostic tools to explore the counterintuitive relationship between protein intake from legumes and pass rate in Malawi. It is an exploratory analysis of approaches to creating sub-groups with the K-means clustering algorithm in order to identify Simpson’s Paradox. The negative relationship between protein intake from legumes and pass rate in Malawi is addressed by identifying confounders using logistic regression and chi-square tests. A Random Forest model and Partial Dependence Plots are then used to study the relationship between protein intake from legumes and pass rates within sub-groups of the confounders, in order to isolate the effect of these confounders.
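
To make the idea of Simpson’s Paradox concrete, the following is a minimal sketch in Python. It is not the dissertation’s method or data: the values are synthetic, the names `protein`, `pass_rate`, `group` and `slope` are hypothetical, and the sub-group labels are assigned by hand rather than obtained from K-means clustering. It only shows how a pooled trend can be negative while the trend inside every sub-group is positive.

```python
import numpy as np


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    return np.polyfit(x, y, 1)[0]


# Synthetic, illustrative values only (NOT the dissertation's data).
protein = np.array([1, 2, 3, 4, 6, 7, 8, 9], dtype=float)
pass_rate = np.array([0.70, 0.72, 0.74, 0.76, 0.40, 0.42, 0.44, 0.46])
# Hypothetical confounder level; hand-assigned here, whereas the
# dissertation derives sub-groups with K-means clustering.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Trend when all observations are pooled together.
pooled_slope = slope(protein, pass_rate)

# Trend within each sub-group of the confounder.
subgroup_slopes = {
    int(g): slope(protein[group == g], pass_rate[group == g])
    for g in np.unique(group)
}

# Simpson's Paradox (in this sketch): the pooled trend has the
# opposite sign to the trend in every sub-group.
paradox = pooled_slope < 0 and all(s > 0 for s in subgroup_slopes.values())

print("pooled slope:", pooled_slope)
print("sub-group slopes:", subgroup_slopes)
print("sign reversal:", paradox)
```

In this toy setting the sub-group with higher protein intake also has a lower baseline pass rate, which is what drives the pooled trend negative even though pass rate rises with protein intake inside each sub-group.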