ImtiazKhanDS/README.md

Hi, I am Imtiaz Khan. Nice to meet you 👋

Professional Summary

Staff Data Scientist with 11+ years of experience delivering data-driven solutions that increase the efficiency, accuracy, and utility of data processing. Proficient in building Natural Language Understanding and Natural Language Generation models (NLP = NLU + NLG). I am looking for open source collaborations and for work on productive remote teams.

Technical Skills

Languages: Python, C++, SQL

Developer Tools: Jupyter, VS Code, Git, Confluence, Jira, Azure/AWS/GCP

ML Tech: NLP, NLU, NLG, RNN, LSTM, BERT, GPT, CNN, Transformers, Hugging Face, fastai

ML Tools: MLflow, Grafana, Prometheus, Gradio, WandB

Experience

Cisco April 2024 – Present

Netacad Copilot - Engineering Technical Leader, Generative AI. Enhanced the Netacad platform with generative AI capabilities so that course learners can ask questions on course topics and get answers drawn from course sections, using Retrieval-Augmented Generation (RAG) with LLMs such as Anthropic Claude Sonnet 3.4, Llama 3.1, and Mistral 7B.

  1. Employed prompt tuning and prompt engineering techniques, using Pydantic, to retrieve information from an OpenSearch database.

  2. Created a monitoring stack for LLM responses using Apache Superset and a Postgres database.

  3. Supported translation capabilities by creating an LLM translation wrapper around the Mixtral 45B model.

  4. Developed a data pipeline to move course data from GitLab repositories to OpenSearch, using the PGSync connector with a Postgres database as the intermediate store.

  5. Added security-level access control over data resources for course learners and course instructors.

H1 Life Sciences July 2023 – April 2024

GenosAI - Senior Staff, Generative AI. Enabled Retrieval-Augmented Generation (RAG) search over clinical trials, healthcare professionals (HCPs), and institutions data using LLMs such as OpenAI GPT-4, Llama 2, and Mistral.

  1. Employed prompt tuning and prompt engineering techniques, using Pydantic, to generate Elasticsearch queries.

  2. Created embeddings for documents and stored them in Elasticsearch to enable k-nearest neighbors (kNN) search over the embeddings.

  3. Developed FastAPI APIs and exposed them on Amazon Elastic Kubernetes Service (EKS).

  4. Conducted code reviews and built CI/CD pipelines using CircleCI.

  5. Set up the infrastructure and deployed the app on Amazon Elastic Kubernetes Service.

Palo Alto Networks April 2022 – April 2023

PII Information Classification Senior Staff Machine Learning

  1. Enabled a PII/PHI/PCI detection strategy across multiple communication channels and file types using NER techniques.
  2. Reduced the false positive rate from 15 percent to 7 percent using advanced semi-supervised learning techniques and architectures such as Longformer.
  3. Devised the structured PII identification approach using character-level CNN models and integrated it with the existing regex-based system.

Novartis Nov 2020 – March 2022

DocZ Document AI Assistant Product Engineering AI/NLP Expert

  1. Enhanced the in-house DocZ product to condense clinical study report information with NLP actions, using techniques such as named entity recognition (NER) with scispaCy and Microsoft Text Analytics for Health.
  2. Condensed clinical study report documents by 75 percent using one-shot summarization with Universal Sentence Encoder embeddings.
  3. Improved extraction of measurement tables from irregular RTF files to Excel by 95 percent using the tabula module in Python.

Deloitte Dec 2018 – Nov 2020

Fraud Detection Machine Learning/NLP Consultant

  1. Reduced fraud by 8 percent using gradient-boosted trees.
  2. Brought the client metric (ratio of false positives to true positives) down from 6.5 to under 4.

Complaint Categorization

  1. Automated the previously manual complaint categorization using TF-IDF, text analytics, and logistic regression, achieving an F1 score of 0.8 at each level.
  2. Reduced the time to categorize 1000 complaints from 20 business hours to 2 minutes.

Accenture May 2012 – Dec 2018

Question Generation Wizard Software NLP Engineer

  1. Automated the generation of FAQ questions from a given answer and context using LSTM/RNN encoder-decoder deep learning models.
  2. Reduced FAQ question creation time by 80 percent compared with a subject matter expert.

Ticket Classification

  1. Leveraged Azure Machine Learning to classify incoming software and hardware tickets into issue categories from their email descriptions, using an ensemble of logistic regression, boosted decision trees, and naive Bayes models.
  2. Reduced the time to classify a ticket into the correct category to 10 ms.

Forecasting Consumer Goods

  1. Converted the Alteryx forecasting workflows for the top 8 products to R.
  2. Reduced forecast time by 63 percent and increased the client's revenue by 29 percent.

Education

Indian School of Business Business Analytics Graduate Certificate

V.R. Siddhartha Engineering College Bachelor's in Electronics and Communication Engineering

Pinned

  1. blog_md (Public, Python)

  2. dsa (Public, Python)

  3. fasthtml (Public, Jupyter Notebook). Forked from AnswerDotAI/fasthtml: the fastest way to create an HTML app.

  4. PDFAgent (Public, Python)