Rhuan Barros rhuanbarros

Hi there 👋

I'm a Master's graduate from the Federal University of Rio Grande do Sul, specializing in Data Science, Artificial Intelligence, and Natural Language Processing. My expertise lies in applying these technologies to legal documents and the justice system.

📖 Master's degree project

Title: "Case Law Analysis with Machine Learning in Brazilian Court"
This research involved the automatic extraction of judicial decisions from web pages and their processing through Natural Language Processing techniques.
Subsequently, this data was used in the development of Machine Learning models to classify documents based on the judge's ruling.
The necessary statistical tests were also developed to validate the findings.
Finally, graphs and dashboards were created to enable analysis by stakeholders.
🚀 It has received over 20 academic citations and was presented (in English) at a conference in Canada. Link

Skills

Machine Learning: PyTorch, TensorFlow, NLP, LLMs, Transformers, vision models, EDA, ETL, scikit-learn
Full-stack development: C# Blazor.
Code: Python, C#, HTML, CSS, JavaScript, PowerShell, ...

📖 Publications

Case Law Analysis with Machine Learning in Brazilian Court Link
Programming the Nationality Identity in the Federal Constitution of Brazil Link

Blog posts

Análise jurisprudencial com técnica de aprendizado de máquina (Case law analysis with a machine learning technique)

Projects portfolio

🧠 Machine Learning projects
- Analysis of Court Decisions using Machine Learning with Weak Supervision
  - Description:
    - Automatic extraction of documents from the internet and pre-processing using NLP techniques.
    - Development of Machine Learning models to classify documents based on the judge's ruling.
    - 📊 Statistical tests to validate the findings.
    - Graphs, plots, and dashboards.
  - Technologies: Python, scikit-learn, Scrapy, Google BigQuery, Snorkel Framework
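To illustrate the weak supervision idea behind this project, here is a minimal sketch in plain Python. It is not the project's code and does not use Snorkel's API: the labeling functions, the English keyword phrases, and the `GRANTED`/`DENIED` classes are all hypothetical, and a simple majority vote stands in for Snorkel's label model.

```python
from collections import Counter

# Hypothetical label values; the real ruling classes are not given in this README.
ABSTAIN, DENIED, GRANTED = -1, 0, 1


def lf_granted_keyword(text):
    """Vote GRANTED when the text contains an (assumed) granting phrase."""
    return GRANTED if "appeal granted" in text.lower() else ABSTAIN


def lf_denied_keyword(text):
    """Vote DENIED when the text contains an (assumed) denial phrase."""
    return DENIED if "appeal denied" in text.lower() else ABSTAIN


def lf_dismissed(text):
    """Vote DENIED when the text mentions a dismissal."""
    return DENIED if "dismissed" in text.lower() else ABSTAIN


LABELING_FUNCTIONS = [lf_granted_keyword, lf_denied_keyword, lf_dismissed]


def weak_label(text):
    """Combine labeling-function votes by majority; abstain on no votes or a tie."""
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


docs = [
    "The appeal granted in full.",
    "Appeal denied and the case dismissed.",
    "Hearing rescheduled.",
]
weak_labels = [weak_label(d) for d in docs]
```

Documents that receive a non-abstain weak label could then serve as training data for a downstream classifier, which is the general pattern weak supervision frameworks follow.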
💻 Fullstack projects
- Materiale
  - Solution for managing construction material budgets for stores serving various clients. Deployed using Supabase Serverless technologies. The system works entirely on the front end, leveraging C# Blazor WebAssembly, and Supabase Realtime Postgres database.
  - Technologies: C#, Blazor, Supabase, HTML, CSS
🤖 LLM AI Agents projects
- LLM RAG Agent Knowledgebase
  - Full-stack AI project to talk with personal documents. 🚀
  - AI agent for chatting about ingested document files.
  - 📁 Handles file ingestion, vector stores, user chat, and advanced search queries.
  - 🛠️ Key technologies:
    - LangGraph: Agent orchestration.
    - FastAPI: Backend framework.
    - Unstructured Package: File ingestion and OCR.
    - Weaviate: Vector store.
    - 🐳 Docker Compose: Weaviate containerization.
    - C# Blazor: Frontend framework.
    - Backend (FastAPI):
      - Processes diverse files and performs OCR in Portuguese.
      - Enhances semantic similarity with chunk splitting.
      - Engages in conversation, supports tool use, and saves history via SQLite.
      - Implements keyword, semantic, and hybrid search.
    - Frontend (C# Blazor):
      - 🚀 Runs in the browser with WebAssembly.
      - Fetches data from the backend API.
- 🤖 Machine Learning Interview Preparation Trainer - Streamlit version
  - Problem Addressed: Lack of specific machine learning quizzes and progress tracking.
  - 💡 Solution:
    - 📝 Customized Prompts: Create better questions by prompting ChatGPT with subject texts.
    - Integrated Explanations: Use Gemini model to explain topics within the app.
    - 📊 Result Tracking: Track study progress and quiz results using Supabase backend.
    - Cloud Accessibility: Streamlit UI hosted in the cloud for mobile access.
  - 🔑 Key Points:
    - Enhanced Learning: Improved question creation with customized prompts.
    - Seamless Knowledge Access: Direct topic explanations from Gemini model.
    - Progress Tracking: Monitor study progress and quiz outcomes.
    - Anywhere Access: Use the app on mobile devices via cloud hosting.
- 📚 English sentence creator
  - Prototype Purpose: Assists students in learning English with tech industry sentences.
  - 📝 Transcribe Audio Files: Use Whisper model from OpenAI to transcribe audio to text.
  - Sentence Separation: Separate transcribed sentences for better language model understanding.
  - Generate Tech Vocabulary Sentences: Adapt existing course content with tech vocabulary.
  - 🎙️ Convert Text to Speech: Use Microsoft's speecht5_tts model for speech synthesis.
  - 🎵 Process Audio Files: Convert generated audio to MP3 format.
  - 🔑 Key Points:
    - Transcription Accuracy: Ensure high accuracy using Whisper model.
    - Sentence Separation: Develop methods to cleanly segment sentences.
    - Tech Vocabulary Adaptation: Adapt sentences to include tech terms.
    - 🔊 Speech Conversion: Ensure natural and clear text-to-speech using speecht5_tts.
    - 🎶 Audio Processing: Convert and optimize audio files to MP3.
- 📧🤖 LLM Agent Gmail Parser Better RAG
  - Problem Addressed: Challenges in indexing emails for RAG applications or knowledge extraction.
    - Issues include noise and garbage in text and dealing with email threads.
  - 💡 Solution:
    - Utilized various prompt techniques to extract crucial information from emails.
    - Found that report-style summaries are more effective than generic summaries.
  - 🌟 Key Points:
    - Noise Reduction: Implemented techniques to filter out irrelevant information.
    - Thread Handling: Developed methods to accurately parse and summarize email threads.
    - 📝 Report-Style Summaries: Retain more essential information than generic summaries.
    - 🔧 Prompt Engineering: Experimented with different prompt structures to improve extraction of valuable insights.
- 🤖 Machine Learning Interview Preparation Quiz Trainer - Anvil framework version
  - Problem Addressed: Lack of specific machine learning quizzes, progress tracking, and memorization techniques.
  - 💡 Solution:
    - 🐍 Python: Completely coded in Python using the Anvil framework (frontend and backend).
    - 📝 Customized Prompts: Create better questions by prompting ChatGPT with subject texts.
    - Integrated Explanations: Use LLM models to explain topics within the app.
    - 📊 Result Tracking: Track study progress and quiz results using Supabase backend.
    - Cloud Accessibility: Hosted in the Anvil cloud.
  - 🔑 Key Points:
    - Enhanced Learning: Improved question creation with customized prompts.
    - Seamless Knowledge Access: Direct topic explanations from Gemini model.
    - Progress Tracking: Monitor study progress and quiz outcomes.
    - Anywhere Access: Use the app on mobile devices via cloud hosting.
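The chunk splitting mentioned for the LLM RAG Agent Knowledgebase backend can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `split_into_chunks`, the word-based splitting, and the default sizes are assumptions. Overlapping chunks keep some shared context across chunk boundaries before the text is embedded into a vector store.

```python
def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("require chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


sample = " ".join(str(i) for i in range(10))
sample_chunks = split_into_chunks(sample, chunk_size=4, overlap=1)
```

With `chunk_size=4` and `overlap=1`, each chunk starts with the last word of the previous one. In practice the sizes would be tuned to the embedding model and the documents being ingested.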
