DataConf

DataConf 2017

Held on Thursday, October 26th, between 09:00 and 18:00, DataConf 2017 drew a crowd of over 100 data science and machine learning experts from the top companies in Israel for a day of knowledge sharing.

Event website: http://dataconf.org/

Meetup event page: https://www.meetup.com/DataHack/events/244004618/

Facebook event page: https://www.facebook.com/events/1623405514382356/

Yakov Shambik, Mobileye - Eye of the Beholder: Object detection in Mobileye using Deep Neural Networks and other techniques

Speaker: Yakov Shambik, Vehicles Detection Technology Manager @ Mobileye

Title: Eye of the Beholder: Object detection in Mobileye using Deep Neural Networks and other techniques

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Mobileye_Yakov_Shambik.pdf

Ofer Ron, LivePerson - Concepts before machinery: Harnessing the power of domain expertise for machine-learning-based solutions

Speaker: Ofer Ron, Head of Data Science @ LivePerson

Title: Concepts before machinery: Harnessing the power of domain expertise for machine-learning-based solutions

Video: https://www.youtube.com/watch?v=wR2u7V8D5Y8&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=3

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_LivePerson_Ofer_Ron.pdf

Alex Ran, Intuit - Using Data Science for Automated Accounting

Speaker: Alex Ran, Distinguished Engineer @ Intuit

Title: Using Data Science for Automated Accounting

Video: https://www.youtube.com/watch?v=_ZBos8T35D0&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=2

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Intuit_Alex_Ran.pdf

Meir Maor, SparkBeyond - Developing Simple and Stable Machine Learning Models

Speaker: Meir Maor, Chief Architect @ SparkBeyond

Title: Developing Simple and Stable Machine Learning Models

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_SparkBeyond_Meir_Maor.pdf

Roii Spoliansky, PayPal - Active learning optimization as a function of label cost and mistake cost

Speaker: Roii Spoliansky, Lead Data Scientist @ PayPal

Title: Active learning optimization as a function of label cost and mistake cost

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_PayPal_Roii_Spoliansky.pdf

Gil Chamiel, Taboola - Don’t believe everything your network tells you: Uncertainty in deep learning for recommender systems

Speaker: Gil Chamiel, Director of Data Science and Algorithms @ Taboola

Title: Don’t believe everything your network tells you: Uncertainty in deep learning for recommender systems

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Taboola_Gil_Chamiel.pdf

Adina Lederhendler, Neura - General vs. subpopulation-specific modeling: When and why you need to get specific

Speaker: Adina Lederhendler, Senior Data Scientist @ Neura

Title: General vs. subpopulation-specific modeling: When and why you need to get specific

Video: https://www.youtube.com/watch?v=ft36Tq5FUz0&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=4

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Neura_Adina_Lederhandler.pdf

Yonatan Wexler, Orcam - Fast and Furious Face Recognition: Efficient metric learning for video stream data

Speaker: Yonatan Wexler, VP R&D @ Orcam

Title: Fast and Furious Face Recognition: Efficient metric learning for video stream data

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Orcam_Yonatan_Wexler.pdf

Itamar Ben-Ari, Intel - Differentiable Memory Allocation Mechanism For Neural Computing

Speaker: Itamar Ben-Ari, Research Scientist @ Intel

Title: Differentiable Memory Allocation Mechanism For Neural Computing

Video: https://www.youtube.com/watch?v=DAHTNElXXgk&list=PLZYkt7161wELbPfqY92vAEmKVhsyxg5Nk&index=4

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Intel_Itamar_Ben_Ari.pdf

Dr. Oshri, Rafael - Multi-agent deep reinforcement learning in communication networks

Speaker: Dr. Oshri, Senior Research Scientist @ Rafael

Title: Multi-agent deep reinforcement learning in communication networks

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2017/DataConf_2017_Rafael.pdf

DataConf 2018

Held on Thursday, October 4th, between 09:00 and 18:00, DataConf 2018 drew a crowd of over 100 data science and machine learning experts from the top companies in Israel for a day of knowledge sharing.

YouTube playlist: https://www.youtube.com/playlist?list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Event website: http://dataconf.org/

Meetup event page: https://www.meetup.com/DataHack/events/255082526/

Facebook event page: https://www.facebook.com/events/1967922793269453/

Dana Kaner, Perimeter X - Bootstrap, Random Forest and All Sorts of Magic

Speaker: Dana Kaner, Data Scientist @ Perimeter X

Title: Bootstrap, Random Forest and All Sorts of Magic

Video: https://www.youtube.com/watch?v=ynkJVd6B13U

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_PerimeterX_Dana_Kaner.pdf

Abstract: The Bootstrap resampling method is often used for statistical inference. We demonstrate its power and simplicity through the well known Random Forest algorithm. We present both the theoretical background on the above topics and an implementation in R.
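
As a rough illustration of the bootstrap resampling idea this abstract refers to, here is a minimal Python sketch of a percentile bootstrap confidence interval. It is not taken from the talk (which used R). The function name `bootstrap_ci`, the sample values, and the parameter defaults are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data) (illustrative)."""
    rng = random.Random(seed)
    n = len(data)
    # Each resample draws n points with replacement from the observed data,
    # then the statistic is recomputed on that resample.
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the estimates.
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Hypothetical sample, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.0, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
low, high = bootstrap_ci(sample)
print(f"95% bootstrap CI for the mean: [{low:.2f}, {high:.2f}]")
```

Random Forest applies the same resampling step to the training rows before fitting each tree, which is why the two topics pair naturally.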

Pavel Levin, Booking.com - Where should I travel next? Modeling multi-destination trips with Recurrent Neural Networks.

Speaker: Pavel Levin, Senior Data Scientist @ Booking.com

Title: Where should I travel next? Modeling multi-destination trips with Recurrent Neural Networks.

Video: https://www.youtube.com/watch?v=pwfwUA4ZShI&t=0s&index=5&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Booking_Pavel_Levin.pdf

Abstract: Many real-world problems naturally give rise to sequential data. Language models are already widely used to tackle computational problems related to natural language. We would like to present a non-NLP example by walking through a solution to the problem of recommending next destinations to customers who are taking a single trip to multiple cities using RNN-based sequence modeling.

Ari Bornstein, Microsoft - Beyond Word Embeddings

Speaker: Ari Bornstein, Senior Cloud Developer Advocate @ Microsoft

Title: Beyond Word Embeddings

Video: https://www.youtube.com/watch?v=zeYwMIDo05w&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=6

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Microsoft_Ari_Bornstein.pdf

Abstract: Since the advent of word2vec, word embeddings have become a go-to method for encapsulating distributional semantics in NLP applications. This presentation will review the strengths and weaknesses of using pre-trained word embeddings, and demonstrate how to incorporate more complex semantic representation schemes such as Semantic Role Labeling, Abstract Meaning Representation and Semantic Dependency Parsing into your applications.

Dr. Michal Shmueli-Scheuer, IBM Research - Conversational bots for customer support

Speaker: Dr. Michal Shmueli-Scheuer, Researcher @ IBM Research

Title: Conversational Bots for Customer Support

Video: https://www.youtube.com/watch?v=i567nLfEGYs&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=9

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_IBM_Michal_Shmueli_Scheuer.pdf

Abstract: In this talk, I'll cover various aspects of conversational bots, focusing on the domain of customer support. Often, human conversations with bots mimic the way humans interact with each other. Moreover, even when customers know that they are interacting with virtual agents (bots), they still expect them to behave like humans. One way to improve interactions with bots is by giving them some human characteristics, such as emotion and personality. I'll show how a model of neural response generation can be used to generate bot responses according to a target personality. I'll then cover a methodology for detecting egregious conversations in a setting using conversational bots by examining behavioral cues from the customer, patterns in the agents’ responses, and customer-agent interactions.

Nofar Betzalel, PayPal - Semi-Supervised Learning for Tagging Coverage Extension

Speaker: Nofar Betzalel, Data Scientist @ PayPal

Title: Semi-Supervised Learning – to extend our Tagging Coverage

Video: https://www.youtube.com/watch?v=c4-3697xwys&index=7&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&t=0s

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_PayPal_Nofar_Betzalel.pdf

Abstract: When PayPal's risk decision-making processes approve a transaction, we soon know whether it was the right decision. However, for declined transactions this is not the case, as our tagging coverage is not complete. This makes it more challenging for analysts and data scientists to understand our false positives when performing research and when measuring our decision-making processes. In this talk I will discuss how we use semi-supervised learning to tag declined transactions as ones that would or would not have been fraudulent had they been approved. This approach enables us to utilize both tagged and non-tagged transactions to train a model for this task.

Dr. Lev Faivishevsky, Intel Advanced Analytics - Using Deep-Learning to Detect Video distortions

Speaker: Dr. Lev Faivishevsky, Researcher @ Intel Advanced Analytics

Title: Using Deep Learning to Detect Video Distortions

Video: https://www.youtube.com/watch?v=FhMWZgs0kJ8&t=0s&index=8&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Intel_Lev_Faivishevsky.pdf

Abstract: Since the acquisition of Mobileye, it has become common knowledge that Intel is interested in building AI-based products and producing hardware for AI applications. A less widely known role of AI at Intel is an internal one: using the huge and diverse data related to Intel's own operations to transform the way the company works and create significant value. Processor design, manufacturing and sales are leveraging machine-learning methods, including computer-vision, natural language processing and reinforcement learning techniques. The talk will start with a little background about these applications, and focus on one deep-learning based video analytics solution, used in the context of processor validation. We will describe this non-standard use-case and the challenges in resolving it, most of which are also relevant for other use-cases in the domain, including handling scarcity of labeled data and coping with tight requirements in terms of both accuracy and run-time.

Prof. Danny Pfeffermann, Central Bureau for Statistics - Can Big Data Really Replace Traditional Surveys for the Production of Official Statistics?

Speaker: Prof. Danny Pfeffermann, National Statistician of Israel @ Central Bureau for Statistics

Title: Can Big Data Really Replace Traditional Surveys for the Production of Official Statistics?

Video: https://www.youtube.com/watch?v=OcD20PkNj-w&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=10

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Lamas_Danny_Pfeffermann.pdf

Abstract: The big advancements in technology, which make it possible to access and analyse 'big data', coupled with increased demand for more accurate, more detailed and more timely official data, but with tightened available budgets, put inevitable pressure on producers of official statistics to replace traditional sample surveys with big data sources. In the first part of my presentation I shall discuss some of the major challenges in the use of big data for official statistics, pointing out their advantages and limitations. In the second part I shall consider a general class of statistical models, which can possibly link the big data under consideration to the corresponding target, finite population data. The use of a model in the class may allow estimating finite population parameters, without the need for reference samples or administrative files.

Avi Hendler-Bloom, Mobileye - Overcoming the Electronic Traffic Sign Problem

Speaker: Avi Hendler-Bloom, Algorithms Developer @ Mobileye

Title: Overcoming the Electronic Traffic Sign Problem

Video: https://www.youtube.com/watch?v=QN9gfUZUqDU

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Mobileye_Avi_Hendler_Bloom.pdf

Abstract: Electronic traffic signs are commonly made with LEDs. Due to the differences in frequency and phase between each LED light, classifying this type of sign is challenging. This talk will address the issues faced, and introduce a solution.

Gil Chamiel, Taboola - Deep And Shallow Learning in Recommendation Systems

Speaker: Gil Chamiel, Director of Algorithms and Data Science @ Taboola

Title: Deep And Shallow Learning in Recommendation Systems

Video: https://www.youtube.com/watch?v=nghXG5OiUno&index=12&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Taboola_Gil_Chamiel.pdf

Abstract: Deep learning has been gaining increasing attention in the recommendation systems community, replacing some of the traditional methods. In this talk, we will share some lessons we learned from using deep learning at huge scale in Taboola's recommendation system. Specifically, we will talk about the motivation for using deep learning and the tradeoffs between deep models and simpler models. We will discuss our approach to building neural networks with multiple input types (numerical, categorical, text, and images); capturing non-trivial interactions between features using both deep dense architectures and Factorization Machine models; tradeoffs between memorization and generalization; and other tips regarding network architectures.

Daniel Benzaquen, Lightricks - AB testing at Scale

Speaker: Daniel Benzaquen, Data Scientist @ Lightricks

Title: AB testing at Scale

Video: https://www.youtube.com/watch?v=-k1X2MRgGlY

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Lightricks_Daniel_Benzaquen.pdf

Abstract: A/B testing is a central statistical procedure used frequently by data scientists. Unfortunately, the standard A/B testing framework was originally designed to cope with a handful of tests, while these days conducting tens or even hundreds of tests simultaneously is a common scenario.

Directly applying the standard procedure, however, is highly problematic, as many tests imply many false discoveries, which can lead to sub-optimal performance. With the goal of controlling the false-discovery rate, several procedures were designed: probably the most naive one is the Bonferroni correction; more advanced schemes are Fisher's least-significant-difference, Benjamini-Hochberg, etc. Yet utilizing these schemes comes at the price of a high false-negative rate that scales with the number of tests being conducted.

In this talk we discuss our attempt to bypass these challenges by utilizing a Bayesian Multi-Armed-Bandit approach, namely Thompson Sampling (TS), which operates in an online-learning manner. We share our experience and insights based on simulations and real-life experiments.

Finally, we discuss some generalizations we made to the standard TS scheme that allow us to optimize over non-trivial statistical quantities (i.e., not necessarily the conversion rate or click-through rate, which are of obvious interest, but also users' Life-Time-Value (LTV), etc.).
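
For readers unfamiliar with the Thompson Sampling approach named in the A/B testing abstract, the following Python sketch shows the textbook Beta-Bernoulli version on simulated data. It is not the speakers' implementation and does not cover the generalizations the abstract mentions. The function name, the conversion rates, and the number of rounds are hypothetical.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Simulate Beta-Bernoulli Thompson Sampling; return pulls per variant."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(n_rounds):
        # Draw one plausible conversion rate per variant from its
        # Beta(successes + 1, failures + 1) posterior (uniform prior).
        draws = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        # Show the variant whose sampled rate is highest.
        arm = max(range(k), key=draws.__getitem__)
        pulls[arm] += 1
        # Simulated user response; in production this is the observed outcome.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


# Hypothetical conversion rates for three variants.
pulls = thompson_sampling([0.05, 0.10, 0.30])
print(pulls)
```

Because traffic shifts toward the better-performing variants as evidence accumulates, the weaker variants receive fewer impressions than they would in a fixed-split test.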

Oren Shamir, Innoviz Technologies - Neural networks for point clouds: adding the 3rd dimension

Speaker: Oren Shamir, Head of CV Algorithm Development @ Innoviz Technologies

Title: Neural networks for point clouds: Adding the 3rd Dimension

Video: https://www.youtube.com/watch?v=aE3mfLm5dMA&t=0s&list=PLZYkt7161wEIjQOuWA93Tt4JS8DgCyz53&index=11

Slides: https://github.com/DataHackIL/DataConf/blob/master/DataConf_2018/DataConf_2018_Booking_Innoviz_Shamir.pdf

Abstract: Since AlexNet, DNNs have been used with rapidly increasing success to perform a wide variety of tasks on 2D images. This is the result of increased data availability, increased effective processing power, as well as incremental algorithmic improvements. Today, DNNs achieve super-human results on multiple tasks in the 2D data domain.

Processing of 3D data using DNNs has been studied less during that time. 3D sensors are less abundant, and are more variable in their capabilities and properties. In the past few years various methods for processing of 3D data have emerged, driven mainly by the medical imaging industry and, more recently, the autonomous car industry. 3D data may be unstructured, sparse and irregular, yielding unique challenges relative to 2D image data.

In this talk I will discuss the challenges of working with 3D data, and present an overview of approaches towards 3D data processing in DNNs.


DataConf 2016

Prof. Shai Shalev Shwartz, The Hebrew University of Jerusalem - Reinforcement Learning

Shahar Weinstock, Intel - Customers discovery using machine learning

Adi Nesher, PayPal - Defining the right label: How to create a valuable population tag in situations of uncertainty

Assaf Feldman, CTO, Riskified - Feature engineering at scale

Olha Shainoha, Wix - Site topic classification at Wix

Amalia Bryl & Shahar Wilner, EDvantage - Can machine learning empower education?

Dr. Yoram Gdalyahu, VP algorithms, Mobileye - Autonomous Driving and Crowd Mapping

Guy Adini, Istra - Toy Models for Algorithmic Trading
