I'm a Lead Data Scientist, Deep Learning Engineer and NLP Engineer.
As a Lead Data Scientist, I apply my expertise in deep learning, natural language processing (NLP), and large language models (LLMs) to build and deploy state-of-the-art solutions. I have over 10 years of experience developing and delivering machine learning and deep learning products with global reach that use structured and unstructured data at petabyte scale. I am a published author in Nature and Springer journals, the holder of multiple patents, and a conference speaker.
I am passionate about working with the latest advancements in NLP and LLMs, such as transformer networks, pre-trained models like Llama 2 and Mistral, fine-tuning techniques, RAG, and more. My model development toolkit is Python, PyTorch, TensorFlow, HuggingFace Transformers, PEFT, and TRL, and I use GCP Vertex AI and BigQuery as my cloud platforms.
I enjoy conducting end-to-end research projects, solving open-ended problems with a scientific approach, and communicating the results. I am also a product-oriented technical problem solver who always keeps the business value perspective in mind. I am experienced in project management and capable of handling multiple projects, priorities, or products simultaneously. I am a technical leader who helps others deliver and learn, a team player who cares for others, and a believer that only good teams can do great things.
- Mistral 7B parameter-efficient fine-tuning for dialogue summarization with LoRA - I implemented parameter-efficient fine-tuning (PEFT) of the Mistral-7B-Instruct-v0.2 base model for a dialogue summarization task on the samsum dataset using the LoRA technique. My goal in this experiment was to find out how well a decoder-only model can learn to perform a seq2seq-style task like summarization. The fine-tuned model turned out to perform very well (a minimal sketch of the LoRA setup appears after this project list).
- Mathematical Transformer - building and pre-training a mathematical LLM from scratch - My goal in this project is to build a mathematical transformer architecture from scratch and train it to learn basic integer mathematics such as addition, subtraction, and multiplication. I implemented the original "Attention Is All You Need" architecture in PyTorch and expanded from there. Can tokens be used for mathematical calculation and inference? How well can embeddings represent numerical values and their relations? I want to explore how to extend these ideas toward more symbolic math representations with a flavor of Wolfram Mathematica or Theano (the core attention operation is sketched after the list).
- RAG-based text document Q&A chat using Mistral 7B, ChatGPT and LangChain - I built a simple retrieval-augmented generation (RAG) demo in which the LLM answers questions about a set of external text files. RAG works by retrieving external documents and injecting them into the queries sent to the LLM. Text files served as the input data, and two LLMs were compared: the commercial ChatGPT API and the open-source Mistral 7B. LangChain connects it all into a RAG application (a condensed pipeline sketch appears after the list).
- Classics with LLMs - fine-tuning BERT for the Iris dataset classification task - Ever wondered how the new generation of LLMs handles classical ML tasks? The goal of this series of experiments is to verify how LLMs perform on classical datasets. This time we'll see how a fine-tuned BERT handles the Iris classification task (a sketch of the setup is included after the list).
- An open-source ECG signal QRS complex pattern recognition Python module - I am the creator and maintainer of an open-source Python module for recognizing the QRS complex (the ECG marker of a heart contraction) in ECG signals. I implemented the detection algorithm, adapted it to work with both real-time data streams and offline datasets, wrote the usage documentation, and support users with technical issues. It is used by dozens of research teams globally for various scientific projects (a simplified QRS detection illustration closes the sketches after the list).
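Below is a minimal sketch of the LoRA setup from the Mistral 7B fine-tuning project, using HuggingFace Transformers and PEFT. The rank, alpha, target modules, and prompt template shown here are illustrative assumptions, not the exact values used in the experiment.

```python
# Minimal LoRA setup sketch (illustrative hyperparameters, not the exact ones used in the experiment).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-7B-Instruct-v0.2"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")

# LoRA adapters on the attention projections; rank and alpha are assumptions for illustration.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable

# Dialogue-summary pairs from the samsum dataset, formatted into instruction prompts.
dataset = load_dataset("samsum", split="train")

def to_prompt(example):
    return {"text": f"Summarize the dialogue:\n{example['dialogue']}\n\nSummary: {example['summary']}"}

dataset = dataset.map(to_prompt)
# ...training then proceeds with e.g. TRL's SFTTrainer or a standard Trainer loop.
```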
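For the Mathematical Transformer project, the central building block is the scaled dot-product attention from "Attention Is All You Need". The sketch below shows the standard formulation in PyTorch; the full project wraps it in multi-head attention, embeddings, and encoder/decoder stacks.

```python
# Scaled dot-product attention, the core operation of the "Attention Is All You Need" architecture.
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (batch, heads, seq_len, d_k)
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)      # (batch, heads, seq_len, seq_len)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)                 # attention distribution over positions
    return weights @ v                                      # weighted sum of value vectors

# Toy check with random tensors standing in for embeddings of digit tokens.
q = k = v = torch.randn(1, 2, 5, 8)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 2, 5, 8])
```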
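The RAG demo follows the usual load-split-embed-retrieve pipeline. The sketch below uses the classic LangChain package layout (module paths move between LangChain versions) with an illustrative embedding model and file name; it is a simplification of the actual demo, not its exact code.

```python
# RAG sketch over local text files (classic LangChain API; module paths differ across versions).
from langchain.document_loaders import TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

# 1. Load and chunk the external text files ("my_documents.txt" is a hypothetical input file).
docs = TextLoader("my_documents.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks and index them in a vector store for retrieval.
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
index = FAISS.from_documents(chunks, embeddings)

# 3. At query time, retrieved chunks are injected into the LLM prompt.
qa = RetrievalQA.from_chain_type(llm=ChatOpenAI(), retriever=index.as_retriever())
print(qa.run("What does the document say about ...?"))
# Swapping ChatOpenAI for a locally served Mistral 7B endpoint allows comparing the two models.
```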
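For the Classics with LLMs experiment, one way to hand the Iris dataset to BERT is to serialize each row of measurements into a short sentence and fine-tune a three-class sequence classifier. The text template below is an illustrative assumption, not necessarily the one used in the experiment.

```python
# Sketch: casting Iris rows as text so BERT can be fine-tuned as a 3-class sequence classifier.
from sklearn.datasets import load_iris
from transformers import AutoTokenizer, AutoModelForSequenceClassification

iris = load_iris()
# Serialize the four numeric features into a sentence per sample (illustrative template).
texts = [
    f"sepal length {x[0]}, sepal width {x[1]}, petal length {x[2]}, petal width {x[3]}"
    for x in iris.data
]
labels = iris.target  # 0: setosa, 1: versicolor, 2: virginica

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=3)

encodings = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
# From here, a standard Trainer / PyTorch loop fine-tunes on (encodings, labels).
```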
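The ECG module itself is not reproduced here; the following is only a simplified, self-contained illustration of QRS peak detection (band-pass filtering followed by peak picking with SciPy), not the module's actual algorithm.

```python
# Simplified illustration of QRS peak detection on an ECG trace (not the module's actual algorithm).
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def detect_qrs(ecg, fs=250):
    """Return sample indices of QRS complexes in a 1-D ECG signal sampled at fs Hz."""
    # Band-pass filter around the QRS energy band to suppress baseline wander and noise.
    b, a = butter(3, [5 / (fs / 2), 15 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, ecg)
    # Square the signal to emphasise the steep QRS slopes, then pick prominent peaks
    # at least 200 ms apart (a physiological refractory period).
    energy = filtered ** 2
    peaks, _ = find_peaks(energy, distance=int(0.2 * fs),
                          height=np.mean(energy) + 2 * np.std(energy))
    return peaks

# Example on a synthetic trace: a flat signal with spikes standing in for heartbeats.
ecg = np.zeros(2500)
ecg[::250] = 1.0
print(detect_qrs(ecg))
```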
🤖 Machine learning and deep learning
- Model research, development, and production deployment
- Model architectures, data preprocessing, feature engineering, hyperparameter tuning, loss functions, and performance metrics
✍️ Natural Language Processing (NLP) and Large Language Models (LLMs)
- Architectures: RNN, LSTM, seq2seq, Transformers, etc.
- Pre-trained models: Llama 2, Mistral, GPT, BERT, T5, etc.
- Techniques: fine-tuning, PEFT (LoRA, QLoRA, Prompt Tuning), RAG, prompt engineering
- Tasks: sequence and token classification, translation, summarization, question answering, language modeling, dialogue, etc.
🛠️ Model development toolkit
- Python, NumPy, pandas, matplotlib and scikit-learn
- PyTorch, TensorFlow
- HuggingFace Transformers, PEFT, TRL, LangChain
☁️ Cloud model development and data acquisition
- GCP Vertex AI model development and production deployment
- GCP BigQuery and SQL with petabyte-scale datasets
- MLOps model deployment technologies and pipelines
🧠 Problem solving and research
- Problem-solving with a creative, innovative and logical approach
- Scientific research conducted individually and collaboratively
- Written and verbal communication and reporting skills, with the ability to explain complex technical concepts to non-experts
👥 Team and project management
- Technical leadership of ML and DL projects
- Project management and organizational skills for handling multiple projects
- Team player with natural team-building skills