Kaggle Toxic Challenge
Big thanks to all the kernel authors, from whom I took some of the models (such as Capsule GRU). Thanks also to the authors of the attention layers: AttentionWithContext, DeepMoji, and a third one, which I cannot find at the moment.
Single models score around 0.9852 to 0.9853. There is a notebook for stacking with both GBMs and MLPs, plus a pipeline for K-fold or bagging runs.
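
For readers unfamiliar with K-fold stacking, here is a minimal sketch of the general idea. It is not the code from the notebook. Everything in it is an assumption made for illustration: the synthetic data, the helper names (`kfold_indices`, `ridge_fit_predict`, `out_of_fold`), and the use of a small NumPy ridge regression as a stand-in for both the base models and the second-level model, where the repository uses neural networks, GBMs and MLPs.

```python
import numpy as np


def kfold_indices(n_samples, n_splits, seed=0):
    """Yield (train_idx, valid_idx) pairs for a shuffled K-fold split."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_splits)
    for i in range(n_splits):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, folds[i]


def _add_bias(X):
    """Append a constant column so the linear model has an intercept."""
    return np.hstack([X, np.ones((len(X), 1))])


def ridge_fit_predict(X_train, y_train, X_pred_list, alpha=1.0):
    """Hypothetical stand-in model: fit ridge regression, score each array in X_pred_list."""
    Xb = _add_bias(X_train)
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    w = np.linalg.solve(A, Xb.T @ y_train)
    return [_add_bias(X) @ w for X in X_pred_list]


def out_of_fold(X, y, X_test, n_splits=5, seed=0):
    """Return out-of-fold train predictions and fold-averaged test predictions."""
    oof = np.zeros(len(X))
    test_pred = np.zeros(len(X_test))
    for train_idx, valid_idx in kfold_indices(len(X), n_splits, seed):
        valid_p, test_p = ridge_fit_predict(
            X[train_idx], y[train_idx], [X[valid_idx], X_test]
        )
        oof[valid_idx] = valid_p        # each row is scored by a model that never saw it
        test_pred += test_p / n_splits  # average the K fold models on the test set
    return oof, test_pred


# Synthetic binary task (assumption: stands in for one toxicity label).
rng = np.random.default_rng(42)
X_all = rng.normal(size=(600, 8))
noise = rng.normal(scale=0.5, size=600)
y_all = (X_all[:, 0] + 0.5 * X_all[:, 1] + noise > 0).astype(float)
X, y, X_test = X_all[:400], y_all[:400], X_all[400:]

# Two "base models": the same learner on different feature subsets.
base_columns = [[0, 2, 3, 4], [1, 5, 6, 7]]
oof_cols, test_cols = [], []
for cols in base_columns:
    oof, test_pred = out_of_fold(X[:, cols], y, X_test[:, cols])
    oof_cols.append(oof)
    test_cols.append(test_pred)

# Second level: train on out-of-fold predictions, predict from averaged test predictions.
meta_train = np.column_stack(oof_cols)
meta_test = np.column_stack(test_cols)
(stacked_test,) = ridge_fit_predict(meta_train, y, [meta_test])
```

The point of the out-of-fold step is that the second-level model only ever sees predictions made on rows the base model was not trained on, which limits leakage of the training labels into the stack. Bagging differs in that the same model is retrained several times (for example with different seeds or resamples) and the predictions are simply averaged.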