Our (Team PRMAS) competition-winning solution for the Datathon @ IndoML 2024, sponsored by NielsenIQ.
Datathon@IndoML 2024 is sponsored by NielsenIQ. As in previous years, the Datathon will be held in conjunction with IndoML 2024. We invite participation from students as well as early-career professionals. Selected teams will also be invited to IndoML 2024 to present their solutions to leading machine learning researchers from around the world, from both industry and academia.
In recent years, e-commerce has grown tremendously, with major online retailers offering billions of products and shipping millions of packages daily. However, given the sheer volume of offerings, sellers find it extremely difficult to fill in extensive sets of product attributes, resulting in incomplete product profiles. E-commerce platforms, on the other hand, depend on such structured metadata, typically in the form of attribute-value pairs, for a deeper understanding of the products and for facilitating critical downstream applications such as search, product recommendation, and question answering, as well as for providing an enhanced customer experience.
Predicting attribute-value pairs from unstructured product descriptions is, therefore, a fundamental challenge for worldwide e-commerce catalogs such as Amazon, Walmart, and Alibaba. In this Datathon, your task is to develop a model that automatically predicts attribute-value pairs for a given product description.
Along with a short product description obtained from receipts/invoices, you are provided with the retailer and price information. Note, however, that the "retailer" information is anonymized: the value may not be an actual retailer, or the retailer may not even be associated with the product. You are then required to predict the values of four attributes related to the product: Super Group, Group, Module, and Brand. The values of these attributes may or may not appear in the product description.
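To make the input and output of the task concrete, here is a minimal Python sketch of one record and one prediction. The class names, field names, and the example values are illustrative assumptions, not the official dataset schema or real data, and the helper `values_found_in_description` is hypothetical. It only shows that some target values can be read off the description while others must be inferred.

```python
from dataclasses import asdict, dataclass
from typing import Dict


# NOTE: field names are assumptions for illustration, not the official schema.
@dataclass(frozen=True)
class ProductRecord:
    description: str  # short text taken from a receipt/invoice
    retailer: str     # anonymized; may not be a real or related retailer
    price: float


@dataclass(frozen=True)
class AttributePrediction:
    supergroup: str
    group: str
    module: str
    brand: str


def values_found_in_description(
    record: ProductRecord, prediction: AttributePrediction
) -> Dict[str, bool]:
    """For each attribute, report whether its value literally occurs
    (case-insensitively) in the product description."""
    text = record.description.casefold()
    return {
        name: bool(value) and value.casefold() in text
        for name, value in asdict(prediction).items()
    }


# Invented example values, used only to exercise the helper.
record = ProductRecord(
    description="acme choc chip cookies 200g",
    retailer="retailer_17",
    price=2.49,
)
prediction = AttributePrediction(
    supergroup="Food", group="Biscuits", module="Cookies", brand="Acme"
)
found = values_found_in_description(record, prediction)
print(found)
```

In this made-up example the brand and module strings occur in the description, while the super group and group do not, so a pure string-matching approach could not recover all four attributes.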
See our solution report: here