Does PL automatically ensure/maintain class proportions (i.e. balance) when doing its own data splits?
I believe scikit-learn (and no doubt other libraries too) has an option to stratify. I hand-coded it in my own external development, since the basics are relatively straightforward with pandas.
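For reference, this is roughly what I mean by the scikit-learn option: a minimal sketch using the `stratify` argument of `train_test_split`. The toy DataFrame and its column names (`feature`, `label`) are made up purely for illustration.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy data (hypothetical): 20% "y" and 80% "n".
df = pd.DataFrame({
    "feature": range(100),
    "label": ["y"] * 20 + ["n"] * 80,
})

# stratify= asks scikit-learn to keep the label proportions
# (approximately) the same in both subsets.
train_df, test_df = train_test_split(
    df, test_size=0.25, stratify=df["label"], random_state=0
)

print(train_df["label"].value_counts(normalize=True))
print(test_df["label"].value_counts(normalize=True))
```

Both subsets should come out close to the original 80/20 split; without `stratify` the proportions are left to chance.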
I think stratification is pretty easy with a single category (y/n classifications etc.) but it can get awkward with multiple categories since proportionality can be hard to maintain as the subsets become smaller.
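One workaround I have seen for the multi-category case is to stratify on a composite key built from all the category columns, lumping very rare combinations into a single bucket. The sketch below is only that, a sketch: `stratify_key`, `min_count`, the `__rare__` label and the toy columns (`region`, `label`) are my own hypothetical names, not part of any library.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

def stratify_key(df, columns, min_count=2):
    """Join several categorical columns into one key per row.

    Combinations seen fewer than min_count times are merged into a
    single "__rare__" bucket (hypothetical helper, not a library call).
    """
    key = df[columns].astype(str).agg("|".join, axis=1)
    counts = key.map(key.value_counts())
    return key.where(counts >= min_count, "__rare__")

# Toy data (hypothetical): two categorical columns, two combinations
# ("east|y" and "east|n") that occur only once each.
rows = (
    [("north", "y")] * 16 + [("north", "n")] * 12
    + [("south", "y")] * 10 + [("south", "n")] * 10
    + [("east", "y")] + [("east", "n")]
)
df = pd.DataFrame(rows, columns=["region", "label"])

key = stratify_key(df, ["region", "label"], min_count=4)

# Stratify on the composite key rather than on a single column.
train_df, test_df = train_test_split(
    df, test_size=0.5, stratify=key, random_state=0
)

print(key.loc[test_df.index].value_counts())
```

This only softens the problem: scikit-learn still needs at least two rows per stratum, so the rare bucket itself can end up too small to split, and the more columns go into the key, the more rows get lumped together.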
Apart from PL’s own capabilities in this area (current or planned), does anyone have any tips-n-tricks or other best-practice guidance?