2024 - Present
Machine Learning Engineer GenAI Intern | LifeSage Inc. | Austin, Texas
Driving innovation in healthcare with cutting-edge AI solutions that redefine patient interaction.
- Developed a Custom LLM System: Leading end-to-end development of a healthcare-specific large language model (LLM) system that provides accurate, reliable, and context-sensitive responses to medical queries.
- Data Collection and Curation: Gathered and cleaned public chat-format data, using T5 and BERT models in a masked prediction task to correct grammatical and other errors and produce high-quality datasets for LLM training.
- Fine-tuned Advanced LLMs: Fine-tuned and instruction-tuned multiple advanced language models, including Mistral MoE, Llama 3, Llama 3.1, and Gemma 2, optimizing them for specialized healthcare use cases.
- Implemented PEFT Techniques: Used Parameter-Efficient Fine-Tuning (PEFT) techniques such as LoRA and QLoRA for faster, more memory-efficient model training. Integrated acceleration libraries such as Unsloth and FlashAttention to improve the speed and memory efficiency of these fine-tuned models.
- Developed Inference Web Application: Designed and deployed an inference web application using Triton Inference Server, TensorRT, and vLLM. The frontend was built with Gradio, creating a user-friendly interface for testing and real-time interaction with the LLM.
- RAG-Based Applications: Building Retrieval-Augmented Generation (RAG) applications that retrieve and process information from proprietary pharmaceutical and insurance datasets to deliver accurate, contextually relevant responses.
- Exploring SOTA Techniques: Testing and integrating state-of-the-art techniques such as graph RAG and memory tuning to improve model performance and help the system handle complex queries with high accuracy and reliability.