The Cloudera Data Science Challenge is a rigorous competition in which candidates must provide a solution to a real-world big data problem that surpasses a benchmark specified by some of the world's elite data scientists.
In the U.S., Medicare reimburses private providers for medical procedures performed for covered individuals. As such, it needs to verify that the types of procedures performed and the costs of those procedures are consistent and reasonable. It also needs to detect possible errors or fraud in providers' claims for reimbursement.
Our aim is to analyse the data and detect abnormal records, uncovering anomalous patients, procedures, providers, and regions in the United States government's Medicare health insurance system.
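As a rough illustration of what "anomalous" could mean for providers, the sketch below flags providers whose average charge for a single procedure sits far from the median, using a modified z-score based on the median absolute deviation. This is only one possible approach and is not necessarily the method used in the notebook; the function name `flag_outliers`, the 3.5 threshold, and the sample charges are all hypothetical.

```python
from statistics import median


def flag_outliers(charges, threshold=3.5):
    """Return the ids whose charge has a modified z-score above `threshold`.

    `charges` maps a (hypothetical) provider id to its average charge
    for one procedure. The 0.6745 constant scales the median absolute
    deviation (MAD) so it is comparable to a standard deviation for
    roughly normal data.
    """
    values = list(charges.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values are identical; the score is undefined.
        return []
    return sorted(
        provider
        for provider, value in charges.items()
        if 0.6745 * abs(value - med) / mad > threshold
    )


# Made-up charges for illustration only, not taken from the Medicare data.
sample = {"A": 100, "B": 105, "C": 98, "D": 102, "E": 400}
print(flag_outliers(sample))  # -> ['E']
```

A median-based score is used here instead of the mean and standard deviation because a single extreme charge would otherwise inflate the spread and could hide itself.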
Please find the following files:
- Required .ipynb file
- Data available