
Machine Learning Challenges with Imbalanced Data


Machine learning algorithms often perform worse when applied to real-world problems in which the classes are not approximately equally represented, such as fraud and intrusion detection, medical diagnosis and monitoring, bioinformatics, and text categorization. Imbalance in the class distribution often causes these algorithms to perform poorly on the minority class. The cost of misclassifying the minority class is often unknown at learning time and can be very high. A number of data-sampling techniques, predominantly over-sampling and under-sampling, have been proposed to address imbalanced data, usually without discussing exactly how or why they work or which underlying issues they address. This paper highlights some of the key challenges that arise when standard classification techniques are applied to imbalanced data. It also discusses some of the prevalent methods for balancing imbalanced data sets, and their shortcomings, in the search for better ways to handle such data.
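
To make the two sampling families mentioned above concrete, here is a minimal sketch of random over-sampling and random under-sampling using only the Python standard library. The function names, the toy data set (95 majority samples, 5 minority samples) and the fixed seed are assumptions made for illustration; they are not taken from the paper, and practical work would more likely rely on a dedicated library.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect the feature rows of each class label into a dict."""
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)
    return by_class


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw with replacement to fill the gap up to the target size.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for features in rows + extra:
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw without replacement, discarding the surplus rows.
        for features in rng.sample(rows, target):
            X_out.append(features)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: label 0 is the majority class, label 1 the minority.
X = [[float(i)] for i in range(100)]
y = [0] * 95 + [1] * 5

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)

print("original:     ", dict(Counter(y)))
print("over-sampled: ", dict(Counter(y_over)))
print("under-sampled:", dict(Counter(y_under)))
```

On this toy set, over-sampling leaves both classes with 95 rows and under-sampling leaves both with 5. The sketch also hints at the shortcomings the abstract alludes to: over-sampling only repeats existing minority rows and adds no new information, while under-sampling throws away most of the majority class.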

Awaiting session recording. Will post it soon.

