rhuanbarros/README.md

Hi there 👋

I'm a Master's graduate from the Federal University of Rio Grande do Sul, specializing in Data Science, Artificial Intelligence, and Natural Language Processing. My expertise lies in applying these technologies to legal documents and the justice system.

📖 Master's degree project

  • Title: "Case Law Analysis with Machine Learning in Brazilian Court"
  • This research involved the automatic extraction of judicial decisions from web pages and their processing through Natural Language Processing techniques.
  • Subsequently, this data was used in the development of Machine Learning models to classify documents based on the judge's ruling.
  • The necessary statistical tests were also developed to validate the findings.
  • Finally, graphs and dashboards were created to enable analysis by stakeholders.
  • 🚀 It has received over 20 academic citations and was presented (in English) at a conference in Canada. Link

Skills

  • Machine Learning: EDA, ETL, scikit-learn, TensorFlow, NLP, LLMs, vision models.
  • Full stack development: Full stack C# Blazor developer.
  • Code: Python, C#, HTML, CSS, JavaScript, PowerShell, ...

📖 Publications

  • Case Law Analysis with Machine Learning in Brazilian Court Link
  • Programming the Nationality Identity in the Federal Constitution of Brazil Link

Blog posts

Projects portfolio

  • 🧠 Machine Learning projects

    • Analysis of Court Decisions using Machine Learning with Weak Supervision
      • Description:
        • Automatic extraction of documents from the internet and pre-processing using NLP techniques
        • Development of Machine Learning models to classify documents based on the judge's ruling.
        • 📊 Statistical tests to validate the findings.
        • Graphs, plots, and dashboards
      • Technologies: Python, scikit-learn, Scrapy, Google BigQuery, Snorkel
      • Project link
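
To illustrate the weak-supervision idea behind this project, here is a minimal sketch in plain Python. It is not the project's code: the labeling functions, their English keywords, and the label names are hypothetical, and a simple majority vote stands in for the learned label model that a framework such as Snorkel would provide.

```python
from collections import Counter

# Hypothetical label values for a ruling classifier.
ABSTAIN, DENIED, GRANTED = -1, 0, 1


# Each labeling function encodes one noisy heuristic and may abstain.
def lf_granted(text):
    return GRANTED if "appeal granted" in text.lower() else ABSTAIN


def lf_denied(text):
    return DENIED if "appeal denied" in text.lower() else ABSTAIN


def lf_dismissed(text):
    return DENIED if "dismissed" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_granted, lf_denied, lf_dismissed]


def weak_label(text):
    """Combine the labeling functions by majority vote; ties and silence abstain."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # conflicting heuristics with equal support
    return counts[0][0]
```

Documents that receive a non-abstain label can then serve as training data for a conventional classifier, which is the usual reason for labeling this way.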
  • πŸ’» Fullstack projects

    • Materiale
      • A system for managing construction material budgets, aimed at stores that serve multiple clients. It is deployed on Supabase's serverless platform and runs entirely on the front end, using C# Blazor WebAssembly with a Supabase Realtime Postgres database.
      • Technologies: C#, Blazor, Supabase, HTML, CSS
      • Project link
  • 🤖 LLM AI Agents projects

    • LLM RAG Agent Knowledgebase

      • Full-stack AI project to talk with personal documents. 🚀
      • AI agent for chatting about ingested document files.
      • 📝 Handles file ingestion, vector stores, user chat, and advanced search queries.
      • 🛠️ Key technologies:
        • LangGraph: Agent orchestration.
        • FastAPI: Backend framework.
        • Unstructured Package: File ingestion and OCR.
        • Weaviate: Vector store.
        • 🐳 Docker Compose: Weaviate containerization.
        • C# Blazor: Frontend framework.
        • Backend (FastAPI):
          • Processes diverse files and performs OCR in Portuguese.
          • Enhances semantic similarity with chunk splitting.
          • Engages in conversation, supports tool use, and saves history via SQLite.
          • Implements keyword, semantic, and hybrid search.
        • Frontend (C# Blazor):
          • 🚀 Runs in the browser with WebAssembly.
          • Fetches data from the backend API.
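
As a rough illustration of how keyword, semantic, and hybrid search relate, the sketch below blends a keyword-overlap score with cosine similarity using a weight `alpha`. Everything in it is an assumption made for illustration: the function names, the toy two-dimensional vectors, and the scoring formula. How the project itself implements these search modes on top of its vector store is not shown here.

```python
import math


def keyword_score(query, doc):
    """Fraction of query terms that also appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query, query_vec, docs, alpha=0.5):
    """Rank (text, vector) pairs; alpha=0 is keyword only, alpha=1 is semantic only."""
    scored = []
    for text, vec in docs:
        score = (1 - alpha) * keyword_score(query, text) + alpha * cosine(query_vec, vec)
        scored.append((score, text))
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for _, text in ranked]


# Toy corpus with hand-made embeddings (hypothetical values).
docs = [
    ("contract termination clause", [1.0, 0.0]),
    ("rental agreement ending", [0.0, 1.0]),
]
keyword_first = hybrid_search("contract clause", [0.0, 1.0], docs, alpha=0.0)
semantic_first = hybrid_search("contract clause", [0.0, 1.0], docs, alpha=1.0)
```

With `alpha=0.0` the document sharing literal terms with the query ranks first; with `alpha=1.0` the document whose vector is closest to the query vector ranks first.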
    • 📚 English sentence creator

      • Prototype Purpose: Assists students in learning English with tech industry sentences.
      • 📝 Transcribe Audio Files: Use Whisper model from OpenAI to transcribe audio to text.
      • Sentence Separation: Separate transcribed sentences for better language model understanding.
      • Generate Tech Vocabulary Sentences: Adapt existing course content with tech vocabulary.
      • 🎙️ Convert Text to Speech: Use Microsoft's speecht5_tts model for speech synthesis.
      • 🎡 Process Audio Files: Convert generated audio to MP3 format.
      • 🔑 Key Points:
        • Transcription Accuracy: Ensure high accuracy using Whisper model.
        • Sentence Separation: Develop methods to cleanly segment sentences.
        • Tech Vocabulary Adaptation: Adapt sentences to include tech terms.
        • 🔊 Speech Conversion: Ensure natural and clear text-to-speech using speecht5_tts.
        • 🎢 Audio Processing: Convert and optimize audio files to MP3.
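
The sentence-separation step can be sketched with a small regular expression. This is a minimal example under stated assumptions, not the project's method: the `split_sentences` name is hypothetical, and the rule (split after `.`, `!`, or `?` when followed by whitespace and a capital letter) will mis-split abbreviations such as "Dr. Smith".

```python
import re

# Split at whitespace that follows sentence-ending punctuation
# and precedes an uppercase letter.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def split_sentences(transcript):
    """Return the sentences of a transcript as a list of stripped strings."""
    text = " ".join(transcript.split())  # collapse newlines and repeated spaces
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


sentences = split_sentences(
    "We deployed the API yesterday. Did the build pass?\nYes, it did!"
)
```

Each resulting sentence can then be passed to the language model on its own, which is the purpose the list above gives for this step.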
    • 🤖 Machine Learning Interview Preparation Trainer

      • Problem Addressed: Lack of specific machine learning quizzes and progress tracking.
      • 💡 Solution:
        • 📝 Customized Prompts: Create better questions by prompting ChatGPT with subject texts.
        • Integrated Explanations: Use Gemini model to explain topics within the app.
        • 📊 Result Tracking: Track study progress and quiz results using Supabase backend.
        • Cloud Accessibility: Streamlit UI hosted in the cloud for mobile access.
      • 🔑 Key Points:
        • Enhanced Learning: Improved question creation with customized prompts.
        • Seamless Knowledge Access: Direct topic explanations from Gemini model.
        • Progress Tracking: Monitor study progress and quiz outcomes.
        • Anywhere Access: Use the app on mobile devices via cloud hosting.
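
To make the "customized prompts" idea concrete, here is a hedged sketch of a prompt builder that grounds quiz questions in a supplied subject text. The function name, the prompt wording, and the `<subject>` delimiters are all hypothetical; the project's actual prompts are not reproduced here.

```python
def build_quiz_prompt(subject_text, n_questions=3):
    """Build a prompt asking an LLM for questions grounded in subject_text."""
    if n_questions < 1:
        raise ValueError("n_questions must be at least 1")
    return (
        "You are preparing a candidate for a machine learning interview.\n"
        f"Using only the text between the <subject> tags, write {n_questions} "
        "multiple-choice questions with four options each and mark the correct answer.\n"
        f"<subject>\n{subject_text.strip()}\n</subject>"
    )


prompt = build_quiz_prompt("  Gradient descent updates weights iteratively.  ", 2)
```

The returned string would be sent to whichever chat model the app uses; restricting the model to the supplied text is one common way to keep questions on topic.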
    • 📧🤖 LLM Agent Gmail Parser Better RAG

      • Problem Addressed: Challenges in indexing emails for RAG applications or knowledge extraction.
        • Issues include noise and garbage in text and dealing with email threads.
      • 💡 Solution:
        • Utilized various prompt techniques to extract crucial information from emails.
        • Found that report-style summaries are more effective than generic summaries.
      • 🌟 Key Points:
        • Noise Reduction: Implemented techniques to filter out irrelevant information.
        • Thread Handling: Developed methods to accurately parse and summarize email threads.
        • 📝 Report-Style Summaries: Retain more essential information than generic summaries.
        • 🔧 Prompt Engineering: Experimented with different prompt structures to improve extraction of valuable insights.
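
The noise and thread problems described above can be illustrated with a rule-based pre-cleaning step. Note that this is a swapped-in technique chosen for illustration: the project describes prompt-based extraction, whereas this sketch only strips quoted replies and signatures before any prompt would be applied. The function name and the reply-header pattern are assumptions.

```python
import re

# Header that many mail clients insert above a quoted reply (assumed format).
_REPLY_HEADER = re.compile(r"^On .+ wrote:$")


def clean_email_body(body):
    """Keep only the newest message text from an email body."""
    kept = []
    for line in body.splitlines():
        stripped = line.rstrip()
        # Stop at the conventional "-- " signature delimiter or a reply header.
        if stripped == "--" or _REPLY_HEADER.match(stripped):
            break
        # Skip lines quoted from earlier messages in the thread.
        if stripped.lstrip().startswith(">"):
            continue
        kept.append(stripped)
    return "\n".join(kept).strip()


raw = (
    "Hi team,\n"
    "The report is attached.\n"
    "\n"
    "On Mon, Jan 1, 2024 at 9:00 AM Ana <ana@example.com> wrote:\n"
    "> Please send the report.\n"
    "-- \n"
    "Ana"
)
cleaned = clean_email_body(raw)
```

The cleaned text is shorter and free of repeated thread content, so a summarization prompt applied afterwards has less irrelevant material to work through.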

How to reach me 📫

Pinned

  1. court_decisions_jurimetric_analysis Public

    Analysis of Court Decisions using Machine Learning with Weak Supervision

    Jupyter Notebook 6 1

  2. llm-rag-agent-knowledgebase Public

    Full-stack artificial intelligence project with both backend (LangChain, LangGraph) and frontend (C# Blazor) components

    Jupyter Notebook

  3. llm-english-study-audio-sentece-creator Public

    Orchestration of various Artificial Intelligence technologies like audio-to-text, LLM prompt techniques, and text-to-speech.

    Jupyter Notebook

  4. llm-quiz-creator-streamlitapp-trainer Public

    Effective quiz creation using prompt techniques. UI with an embedded LLM that explains concepts, using Supabase as the backend

    Jupyter Notebook

  5. llm-agent-gmail_parser-better-rag Public

    Prompt techniques for parsing emails for RAG applications

    Jupyter Notebook

  6. BlazorWebAssemblySupabaseTemplate Public

    Template Blazor applications using WebAssembly, with a Supabase backend integration.

    C# 2