DataHack Challenges - Challenges offered during our hackathon by top data companies.

DataHackIL/DataChallenges

DataChallenges

This is a list of sponsor challenges at DataHack events.

You can find us on our website, Facebook, Meetup, YouTube and Twitter, and also join our monthly newsletter.



Description: Are you passionate about making the world a better place? Are you excited to use AI for the benefit of mankind? Intel, DataHack 2016's co-host, is posing the AI for Social Good Challenge. Intel will award a cool prize to each member of the team whose project most effectively utilizes AI to address a social issue.

Potential Datasets: https://github.com/shreyashankar/datasets-for-good

Description: A taxi goes from Chinatown to Times Square. How long will it take to arrive? In this challenge, you are given data on taxi rides in New York, with information on each ride such as the start and end points, date, time of day, distance, etc. The data is available here. The goal is to predict the travel time of a ride (on a logarithmic scale). The data is split into train and test sets, and we can combine general data about the ride with local data on similar rides from the train set.

Repository: https://github.com/RocketDataScientist/DataHack-2017
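Predicting travel time on a logarithmic scale usually means comparing predictions and ground truth in log space. The sketch below shows one way this could be scored, assuming a root mean squared error on log-transformed durations. The challenge's actual metric is not stated here, so `log_target` and `rmse_log` are hypothetical helpers and the sample durations are made up.

```python
import math

def log_target(seconds):
    """Map a travel time in seconds to log scale; log1p keeps 0 finite."""
    return math.log1p(seconds)

def rmse_log(actual_seconds, predicted_seconds):
    """RMSE between actual and predicted travel times, measured in log space."""
    squared = [
        (log_target(a) - log_target(p)) ** 2
        for a, p in zip(actual_seconds, predicted_seconds)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Made-up travel times in seconds, for illustration only.
actual = [600.0, 1200.0, 300.0]
predicted = [660.0, 1000.0, 300.0]
score = rmse_log(actual, predicted)
```

Working in log space penalizes relative rather than absolute errors, so a one-minute miss on a five-minute ride counts for more than the same miss on an hour-long ride.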

Description: WIX collects logs of user actions within its platform. One of the main tasks of our Data Science team is understanding and predicting user behavior in order to optimize user experience and company revenue. Our team focuses on building models that are compact & efficient without compromising on accuracy. Using historical user event data we want to predict if a user performs a specific action ("the target action") within 14 days from the last available activity data.
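The prediction target described above can be sketched as a labeling function over event logs. This is only an illustration: the event layout (timestamp, action name), the action name `"target_action"`, and the choice to treat the 14-day boundary as inclusive are all assumptions, not details taken from the WIX dataset.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=14)  # prediction horizon from the description

def label_user(cutoff, later_events, target_action="target_action"):
    """Return 1 if the target action occurs within 14 days after `cutoff`.

    cutoff: timestamp of the user's last available activity.
    later_events: iterable of (timestamp, action_name) pairs (assumed layout).
    """
    return int(any(
        action == target_action and cutoff < ts <= cutoff + WINDOW
        for ts, action in later_events
    ))

# Made-up events for illustration.
cutoff = datetime(2017, 1, 1)
inside = [(datetime(2017, 1, 10), "target_action")]
outside = [(datetime(2017, 1, 20), "target_action")]
labels = (label_user(cutoff, inside), label_user(cutoff, outside))
```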

Description: Windward is a data and analytics company making sense of ship and cargo movements around the world. Our Data Platform takes raw, unstandardized big data from multiple sources, which is often partial and unreliable, and uses ML to fuse the data and analyze each ship's actual behavior to determine ship identities and what they are doing. This helps to create actionable, insightful knowledge about what’s happening at sea from otherwise hard-to-interpret, noisy data.

One of the most important data features is ship type. A ship type describes what class of ship it is and could be anything from a small fishing vessel to a massive oil tanker. Most ships report their true type but some don’t, which means their designation labels are either incorrect or missing. In this case, we have to infer it ourselves.

In our data challenge you will help us predict ship type according to ship behavior. We will provide information about ship activities (meetings with other vessels, port visits, etc.). Some ships will be labeled with their type and other labels will be missing. The challenge is to infer the type of unlabeled ships based on labeled ships exhibiting similar behavior. The underlying assumption is that ships engaged in similar activities (e.g. frequenting the same ports, meeting with the same ships) are more likely to be of the same type.

This is, in a way, the ship version of “people similar to you” used on social websites. So, are you up to the challenge?
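The underlying assumption, that ships visiting the same ports tend to share a type, can be sketched as a similarity-weighted vote. The code below is a naive baseline under stated assumptions, not Windward's method: each ship is reduced to a set of visited ports, similarity is Jaccard overlap, and the port names and ship types are made up.

```python
from collections import defaultdict

def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def infer_type(ports, labeled):
    """Guess a ship type from labeled ships with overlapping port visits.

    ports: set of ports visited by the unlabeled ship.
    labeled: dict of ship_id -> (set of ports, ship type).
    Returns the type with the highest total similarity, or None if no
    labeled ship shares a port with the unlabeled one.
    """
    votes = defaultdict(float)
    for other_ports, ship_type in labeled.values():
        votes[ship_type] += jaccard(ports, other_ports)
    if not votes or max(votes.values()) == 0:
        return None
    return max(votes, key=votes.get)

# Made-up ships for illustration.
labeled = {
    "A": ({"Haifa", "Ashdod"}, "cargo"),
    "B": ({"Jaffa"}, "fishing"),
    "C": ({"Haifa", "Piraeus"}, "cargo"),
}
guess = infer_type({"Haifa", "Ashdod", "Piraeus"}, labeled)
```

The same voting scheme extends to other behavioral signals, such as meetings with other vessels, by swapping the port sets for sets of ships met.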


Description: Are you passionate about making the world a better place? Are you excited to use AI for the benefit of mankind? Intel, DataHack 2017's co-host, is posing the AI for Social Good Challenge. Intel will award a cool prize to each member of the team whose project most effectively utilizes AI to address a social issue.

Potential Datasets: https://github.com/shreyashankar/datasets-for-good

Description: Ever wondered how it feels to press the red button and take down missiles? Well, this challenge will get you fairly close to that goal! You will be provided with short trajectories (5-15s) and will need to decide what type of threat you are facing. This challenge, provided by Rafael, combines both supervised and unsupervised learning.

Repository: https://github.com/DataHackIL/RocketDataScientist

Description: You think finding a needle in a haystack is easy-peasy-lemon-squeezy? Well, you’re in for a treat! In the Instagram challenge you will receive ~1M photos taken from 10K albums. Your task is to find the images that belong to the album’s owner. But not to worry, OrCam is here to help (a bit): for each image you will be given some metadata and a descriptor for the face in it.

Description: Did you always dream about being a detective? In that case we’ve got a great mystery for you to solve! In the word disambiguation challenge you will receive a sentence and a single token. You will then need to use all of your detective skills to find the right Wikipedia page defining this token.


Description: Are you passionate about making the world a better place? Are you excited to use AI for the benefit of mankind? Intel, DataHack 2018's co-host, is posing the AI for Social Good Challenge. Intel will award a cool prize to each member of the team whose project most effectively utilizes AI to address a social issue.

Presentation: https://github.com/DataHackIL/DataChallenges/blob/master/2018/Intel_challenge_datahack_2018.pdf

Potential Datasets: https://github.com/shreyashankar/datasets-for-good

Description: Are you passionate about making widespread, impactful global changes? Autonomous vehicles represent one of the biggest revolutions mankind has ever seen and they will affect every aspect of our daily lives. In this challenge you will help to enable the autonomous car revolution. Teams undertaking Innoviz’s Rigid Motion Segmentation Challenge will solve the problem of decomposing LIDAR data (point cloud) into background and moving objects.

Presentation: https://github.com/DataHackIL/DataChallenges/blob/master/2018/innoviz_challenge_datahack_2018.pdf

Repository: https://github.com/InnovizTech/DataHack2018
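To make the task of splitting a point cloud into background and moving objects concrete, here is a deliberately naive baseline. It assumes two consecutive frames already aligned to the same coordinate system (ego-motion compensated) and marks a point as moving when the previous frame has no point nearby. The function name, the 0.5 threshold, and the sample points are all hypothetical, and this is not Innoviz's method.

```python
import math

def segment_moving(prev_frame, curr_frame, threshold=0.5):
    """Label each point of curr_frame as moving (True) or background (False).

    A point counts as background if some point in prev_frame lies within
    `threshold` of it; brute-force nearest neighbor, O(n * m).
    """
    labels = []
    for p in curr_frame:
        nearest = min((math.dist(p, q) for q in prev_frame), default=math.inf)
        labels.append(nearest > threshold)
    return labels

# Made-up (x, y, z) points: two static points and one that shifted by 2 units.
prev_frame = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
curr_frame = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
moving = segment_moving(prev_frame, curr_frame)
```

A real solution would need a spatial index (for example a k-d tree) to scale to full LIDAR frames, and would have to cope with sensor noise and occlusion, which this sketch ignores.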

Description: Want to help a top Jerusalem startup pilot churn prediction on an actual project for its flagship app - a product already used by millions all over the world? Sift through noisy data to discover patterns predicting who will churn and even when these ‘suspects’ are likely to unsubscribe, to earn yourself a lucrative reward at DataHack 2018!

Presentation: https://github.com/DataHackIL/DataChallenges/blob/master/2018/Lightricks_challenge_2018.pdf

Repository: https://github.com/DataHackIL/datahack-2018-challenge

Description: The Microsoft Open Source team is proud to host the first "The Math Teacher" challenge in Israel, where you can leverage your NLP skills and the Azure Open Cloud to understand and solve complex math problems. Microsoft's "The Math Teacher" Challenge is an NLP challenge for building a personal math teacher that uses natural language for understanding and reasoning about math. The goal is to build an NLP model that can automatically solve problems (especially math word problems) written in natural language. Your mission, if you choose to accept it, is to build a model that returns the highest number of correct answers above a given baseline on the number_word_std test set.

Presentation: https://github.com/DataHackIL/DataChallenges/blob/master/2018/Microsoft_challenge_datahack_2018.pdf

Repository: https://github.com/aribornstein/MathTeacherChallenge/



Description: Our brains use faces as the main classifier for a person’s identity. We even have a specific “face area” in the brain dedicated to this task. Computer vision tools are based on the same idea and use facial features for identifying people. However, as humans, we are able to recognize close friends and others from afar and even from behind. This is achieved using body features such as hairstyle, body structure, gait and other characteristics. Can we achieve the same using AI?

OrCam’s challenge invites you to recognize movie stars without using their faces. In our challenge, you will receive low-resolution and occluded images of famous actors and will be asked to identify them. In addition to the unique dataset we have created, you will receive a set of features we prepared for each image so that you can focus on the algorithms and let us worry about computation. Can you do it?

Presentation: ???

Repository: https://github.com/DataHackIL/orcam_challenge_datahack2019

Leaderboard: https://leaderboard.datahack.org.il/orcam


Description: When you get thousands of new customers every day, and only have a few dozen consultants to work with, you have to carefully pick which accounts get those special VIP consulting services. In our lead-scoring challenge you'll get 6 months of anonymous user events (over 4B records) and help us find the crème de la crème, the pick of the litter, the best of the best customers that could use our extra attention and get the most out of monday.com!

Presentation: https://docs.google.com/presentation/d/1BEQdjo7tP_gGEBqbXOE2jiOGIR_5UXGU_gAabTCa0fQ/edit?usp=sharing

Leaderboard: https://leaderboard.datahack.org.il/monday


Description: Ever wondered what would happen if you just plugged in that seemingly innocent USB you found lying around? You’re about to find out! In this devices-gone-rogue challenge - should you choose to accept it - you will gain access to traffic data of ~1M devices, and will be tasked with finding the devices that, well, misbehave. This challenge, provided by Armis, is fully unsupervised - so put your anomaly belt on and get to it!

Presentation: ???

Repository: https://github.com/DataHackIL/Armis_Challenge_DataHack2019
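Because the challenge is fully unsupervised, a simple starting point is to flag devices whose traffic deviates strongly from the typical device. The sketch below uses a robust z-score based on the median and the median absolute deviation (MAD). It assumes a single numeric traffic feature per device; the device names, the values, and the 3.5 cutoff are illustrative assumptions, not part of the Armis dataset.

```python
import statistics

def robust_scores(values):
    """Robust z-scores: 0.6745 * (x - median) / MAD; zeros if MAD is 0."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]

def flag_anomalies(traffic, cutoff=3.5):
    """Return device ids whose absolute robust z-score exceeds `cutoff`."""
    ids = list(traffic)
    scores = robust_scores([traffic[i] for i in ids])
    return [i for i, s in zip(ids, scores) if abs(s) > cutoff]

# Made-up per-device traffic volumes for illustration.
traffic = {"dev-a": 100, "dev-b": 110, "dev-c": 95, "dev-d": 105, "dev-e": 5000}
suspects = flag_anomalies(traffic)
```

The median and MAD are used instead of the mean and standard deviation so that the outliers being hunted do not distort the baseline they are compared against.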


Description: Lightricks encourages its users to express their inner artist using the apps it develops. Whether you're just starting out or an editing pro, all you need is a phone and their apps to create some incredible content. Spread the message that art & creation are everywhere. In this free-form vertical challenge you can use data and models for the creation, analysis and manipulation of art, design and infographics. Use machine learning tools, supervised or unsupervised models, vision algorithms, or any method you think up. Surprise us! Unleash your inner artist and use your creativity to create something amazing. You can use tabular data, images, videos, audio or any other type of data.

Examples: The Lightricks logo is plotted using our users' usage data. Each dot represents a user, and each shape and color represents a different cluster of user behavior. The clusters are based on the users' favorite tools, duration in app, and other usage data. The variance within each cluster determines its width.

Presentation: ???

Links for inspiration:
