@fetzert Or use data processing techniques for dirty categories https://t.co/lUpPJLVFXE https://t.co/RMSs9nKEWE (Shameless self-plug, but I still think that such directions are really useful)
22,304 followers
@francoisfleuret @david_picard For high-cardinality categories, use TargetEncoder (strong baseline in https://t.co/lUpPJLVFXE ) or string-based methods (see https://t.co/RMSs9nKEWE ). Neural-net embeddings require much more data. 2/3
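To make the target-encoding idea concrete, here is a minimal hand-rolled sketch in plain Python. It is not the TargetEncoder class the tweet refers to: the function names `fit_target_encoding` and `apply_target_encoding`, the `smoothing` parameter, and the toy data are all assumptions made for illustration. Each category is replaced by the mean of the target within that category, shrunk toward the global mean so that rare categories do not get extreme values.

```python
from collections import defaultdict


def fit_target_encoding(categories, targets, smoothing=10.0):
    """Learn a smoothed per-category mean of the target.

    Hypothetical helper for illustration only. `smoothing` acts like a
    number of pseudo-observations placed at the global mean.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if len(categories) == 0:
        raise ValueError("need at least one observation")

    global_mean = sum(targets) / len(targets)

    # Accumulate the target sum and the count for every category.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, y in zip(categories, targets):
        sums[category] += y
        counts[category] += 1

    # Shrink each category mean toward the global mean.
    mapping = {
        category: (sums[category] + smoothing * global_mean)
        / (counts[category] + smoothing)
        for category in counts
    }
    return mapping, global_mean


def apply_target_encoding(categories, mapping, global_mean):
    """Encode categories; unseen ones fall back to the global mean."""
    return [mapping.get(category, global_mean) for category in categories]


# Toy data (made up): a categorical column and a binary target.
cities = ["paris", "paris", "paris", "lyon", "lyon", "nice"]
clicked = [1, 1, 0, 0, 0, 1]

mapping, global_mean = fit_target_encoding(cities, clicked, smoothing=2.0)
encoded = apply_target_encoding(["paris", "lyon", "nice", "lille"], mapping, global_mean)
print(encoded)
```

In practice, encoding the training rows with statistics computed from those same rows leaks the target, so the encoding is usually fit on held-out folds. Library implementations such as scikit-learn's `sklearn.preprocessing.TargetEncoder` handle this with cross-fitting in `fit_transform`. The tweet does not say which implementation it has in mind.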
@ImadPhd For categorical encoding, I know of our own work, and citations within: https://t.co/UtTE6j4d6Z and https://t.co/5iz7rCUVNe but these are not specific to healthcare.
96 followers
RT @GaelVaroquaux: Learning on dirty categories? @patricio_cerda presented our work with @balazskegl at @ECMLPKDD His slides: https://t.c…