Machine Learning Challenges with Imbalanced Data

Abstract:

Applying machine learning algorithms to real-world problems in areas such as fraud and intrusion detection, medical diagnosis and monitoring, bioinformatics, and text categorization, where the classes in the data set are not approximately equally represented, often results in reduced performance. Imbalance in the class distribution tends to make machine learning algorithms perform poorly on the minority class. The cost of misclassifying the minority class is often unknown at learning time and can be very high. A number of data sampling techniques, predominantly over-sampling and under-sampling, have been proposed to address issues related to imbalanced data, often without discussing exactly how or why such methods work or what underlying issues they address. This paper highlights some of the key challenges in classifying imbalanced data with standard classification techniques. It also discusses some of the prevalent methods for balancing imbalanced data sets and their shortcomings, in the search for better methods of handling imbalanced data.
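To make the two sampling ideas mentioned above concrete, here is a minimal sketch of plain random over-sampling and random under-sampling in Python. It is only an illustration under simple assumptions, not the method from the talk: the function names `random_oversample` and `random_undersample`, the `seed` parameter, and the 95/5 "legit"/"fraud" toy data are all made up for this example.

```python
import random
from collections import Counter


def _group_by_class(samples, labels):
    """Collect the samples belonging to each class label."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    return by_class


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of the smaller classes until
    every class is as large as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = max(len(xs) for xs in by_class.values())
    pairs = []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        pairs.extend((x, y) for x in xs + extra)
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples of the larger classes until every class
    is as small as the smallest one (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = _group_by_class(samples, labels)
    target = min(len(xs) for xs in by_class.values())
    pairs = []
    for y, xs in by_class.items():
        pairs.extend((x, y) for x in rng.sample(xs, target))
    rng.shuffle(pairs)
    return [x for x, _ in pairs], [y for _, y in pairs]


if __name__ == "__main__":
    # Toy imbalanced data set: 95 majority samples, 5 minority samples.
    samples = list(range(100))
    labels = ["legit"] * 95 + ["fraud"] * 5

    _, over_labels = random_oversample(samples, labels)
    _, under_labels = random_undersample(samples, labels)

    print(Counter(over_labels))   # 95 samples per class
    print(Counter(under_labels))  # 5 samples per class
```

Even this small sketch shows the trade-off: over-sampling only repeats the same few minority examples, so it adds no new information, while under-sampling throws away most of the majority class.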

Awaiting session recording. Will post it soon.
