Best Practices for Imbalanced Data and Partitioning

Session Date & Time:
On Demand

Presented by DataRobot

DataRobot is the leader in enterprise AI, delivering trusted AI technology and ROI enablement services to global enterprises. DataRobot’s enterprise AI platform democratizes data science with end-to-end automation for building, deploying, and managing machine learning models.

In this two-part learning session, we discuss best practices around data partitioning and working with imbalanced datasets.

Five-fold cross-validation is often treated as the silver bullet for partitioning your data for validation, but it comes with some dangerous caveats that you have to be aware of to make sure you're building robust models. In part 1 of this learning session, we talk about those pitfalls and outline strategies for handling them.
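The session description does not list the specific pitfalls, so the following is only an illustrative sketch of one commonly cited caveat, not necessarily one covered in the session: with a rare positive class, purely random folds can end up with very uneven numbers of positives. The helper `stratified_folds` and the toy labels are hypothetical and use only the Python standard library.

```python
from collections import defaultdict
import random


def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each row index to a fold so that every fold keeps
    roughly the same class mix (hypothetical helper, for illustration)."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal the rows out round-robin.
    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_folds
    return fold_of


# Toy data (assumed): 10 positives among 100 rows.
labels = [1] * 10 + [0] * 90
folds = stratified_folds(labels, n_folds=5)

# Count how many positives land in each fold.
positives_per_fold = [
    sum(1 for i, f in enumerate(folds) if f == k and labels[i] == 1)
    for k in range(5)
]
print(positives_per_fold)
```

Dealing rows out per class guarantees that each fold's share of positives differs by at most one row, which a plain random split does not.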

Binary target variables are very common in data science use cases, and many of them are severely imbalanced. When you're building models for infrequent events, such as predicting fraud or identifying product failures, it's important to watch out for imbalance in your data. In part 2 of this learning session, we discuss strategies for working with imbalanced datasets and provide some rules of thumb for these types of use cases.
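As a rough illustration of what "watching out for imbalance" can look like in practice, the sketch below measures the positive rate of a binary target and derives inverse-frequency class weights. This weighting heuristic is an assumption chosen for the example, not a rule of thumb taken from the session; the function name and toy data are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).
    Rare classes receive proportionally larger weights."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Toy data (assumed): 10 fraud cases among 1,000 transactions.
labels = [1] * 10 + [0] * 990

positive_rate = sum(labels) / len(labels)
weights = balanced_class_weights(labels)

print(f"positive rate: {positive_rate:.2%}")
print(weights)
```

Weights like these can be passed to learners that accept per-class or per-sample weights, so that errors on the rare class count for more during training.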

Jack Jablonski

As an AI Success Manager working with a portfolio of clients, Jack guides the entire customer journey, from onboarding to mastery and then expansion. DataRobot wants
