Machine Learning Challenges with Imbalanced Data

Abstract:

Applying machine learning algorithms to real-world problems in areas such as fraud and intrusion detection, medical diagnosis and monitoring, bioinformatics, and text categorization, where the classes in the data set are not approximately equally distributed, often results in reduced performance. Imbalance in the class distribution often causes machine learning algorithms to perform poorly on the minority class. The cost of misclassifying the minority class is often unknown at learning time and can be very high. A number of data sampling techniques, predominantly over-sampling and under-sampling, have been proposed to address issues related to imbalanced data, often without discussing exactly how or why such methods work or what underlying issues they address. This paper highlights some of the key challenges in classifying imbalanced data with standard classification techniques. It discusses some of the prevalent methods for balancing imbalanced data sets and their shortcomings, in search of better methods for handling imbalanced data.
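
The two sampling techniques named in the abstract can be illustrated with a minimal sketch. It assumes the simplest variants (purely random duplication and purely random removal); the function names are hypothetical and are not taken from the paper, which may discuss more elaborate methods.

```python
import random
from collections import Counter, defaultdict


def _group_by_label(samples, labels):
    """Collect samples into a dict keyed by class label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples until every class is as large as the largest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = max(len(g) for g in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        # Draw with replacement to fill the gap up to the target size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for x in group + extra:
            out_x.append(x)
            out_y.append(label)
    return out_x, out_y


def random_undersample(samples, labels, seed=0):
    """Randomly drop samples until every class is as small as the smallest one."""
    rng = random.Random(seed)
    groups = _group_by_label(samples, labels)
    target = min(len(g) for g in groups.values())
    out_x, out_y = [], []
    for label, group in groups.items():
        # Draw without replacement so no sample is repeated.
        for x in rng.sample(group, target):
            out_x.append(x)
            out_y.append(label)
    return out_x, out_y


if __name__ == "__main__":
    # Toy data (assumed for illustration): 5 minority samples, 95 majority samples.
    X = [[i] for i in range(100)]
    y = [1 if i < 5 else 0 for i in range(100)]

    _, over_y = random_oversample(X, y)
    _, under_y = random_undersample(X, y)
    print(sorted(Counter(over_y).items()))   # [(0, 95), (1, 95)]
    print(sorted(Counter(under_y).items()))  # [(0, 5), (1, 5)]
```

Over-sampling keeps all the original data but repeats minority samples exactly, while under-sampling discards most of the majority class. Both are normally applied to the training split only, so that the evaluation data keeps its original class distribution.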

Awaiting session recording. Will post it soon.


Just Buzz... Where is AI?

Speaking to Recode’s Kara Swisher and MSNBC’s Ari Melber, Pichai said AI is “one of the most important things that humanity is working on. It’s more profound than, I don’t know, electricity or fire,” adding that people learned to harness fire for the benefit of humanity, but also needed to overcome its downsides. Pichai also said that AI could be used to help solve climate change issues, or to cure cancer. We are seeing some exciting things in the industry: Samsung’s massive 8K TVs apparently use AI to upscale lower-resolution images for the big screen. Sony has created a new version of the Aibo robot dog, which this time promises more artificial intelligence. Travelmate’s robot suitcase will use AI to drive around and follow its owner wherever they go. Kohler has invented Numi, a toilet that has Amazon’s Alexa voice assistant built in, and so on. But despite all this, it does leave me wondering: is artificial intelligence really what we should be calling this revolution?...

Effective Pattern Identification Model for DDoS Attack Detection

Abstract: Distributed Denial of Service (DDoS) attacks are one of the major challenges to the Internet community. Attackers send legitimate packets with frequently changing information from various compromised systems at random and at a very high frequency, rendering the target unresponsive to normal traffic. DDoS attacks are difficult to detect with traditional detection methods and standard Intrusion Detection Systems (IDS). A standard IDS analyzes network traffic or system logs to identify emerging patterns in the traffic. But because the packet origins are random, it is difficult to separate true positives, false positives, and normal traffic. This paper proposes a model based on Artificial Neural Networks (ANN) to identify anomalies and detect DDoS patterns. In the proposed system, sets of known characteristic features, which can separate attacks from normal traffic, are fed to the system to train the ANN. This self-learning system improves with each n...

Evolving App Paradigm

Over the last two decades we have seen the enterprise application landscape change very rapidly, driven by fast-evolving technology stacks and changing industry dynamics. It is time for another paradigm shift in how we conceptualize, design, develop, test and maintain our applications to meet volatile business requirements. Some of the prime features of this evolving paradigm:

- Flexible, business-user-centric infrastructure (rather than IT-centric)
- IT-as-a-service model with emphasis on user self-service
- Flexible development and deployment infrastructure which is readily available over the cloud (no setup time, no high budgets and initial spending)
- Develop applications using modern development languages which help improve the developer
- Usage of more dynamic meta programming languages
- Reusable assets, code generators, ...
- Build once and run anywhere across platforms and devices
- End-to-end application lifecycle integration through ALM and DevOps
- Improve communication and collaborat...

Sankar Vema - Blog Space
