Projects

CareerNavigator
- Developed CareerNavigator, a machine learning model that evaluates candidates’ employability by analyzing key attributes and predicting suitability for a job role.
- Cleaned, pre-processed, and performed feature engineering on the dataset containing 70k+ datapoints.
- Designed, trained, and evaluated multiple algorithms, comparing them on accuracy, confusion matrix, and F1 score.
- Selected a kernel support vector machine (SVM) as the best-fit model, with the highest accuracy (80%) and an F1 score of 0.82.
- Presented findings to a panel of faculty and industry experts, receiving recognition for the project's innovation and effectiveness.
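
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how accuracy and F1 score fall out of a binary confusion matrix. The function name `binary_metrics` and the toy labels are hypothetical and are not taken from the CareerNavigator code or dataset.

```python
def binary_metrics(y_true, y_pred):
    """Return confusion-matrix counts, accuracy, and F1 for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "accuracy": accuracy, "f1": f1}


# Hypothetical toy labels: 1 = suitable for the role, 0 = not suitable.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Accuracy counts all correct predictions, while F1 balances precision and recall on the positive class, which is why the two numbers reported for a model can differ.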
AudeX: Vision-to-Speech Model
- Developed a dynamic web application combining Optical Character Recognition (OCR) and Text-to-Speech (TTS) technologies to improve accessibility.
- Integrated Tesseract for multi-language OCR and the Web Speech API for TTS, enabling accurate text extraction from images and high-quality text-to-speech conversion.
- Designed a responsive interface with intuitive navigation, ensuring a seamless user experience.
- Enabled PDF export, word/character count, and keyword search for efficient document handling.
- Developed functionality for managing user profiles, such as signup, login, activity tracking, and editable data.
Advanced Text to Speech Optimization
- Fine-tuned Microsoft's SpeechT5 model to improve pronunciation of technical English terms, modifying the phonetic representation so that abbreviations and acronyms are pronounced precisely.
- Achieved a 25% improvement in speech quality over the baseline TTS model, as measured by Mean Opinion Score (MOS), with notable gains on technical terms.
- Adapted the baseline model to generate a native Italian voice by improving pronunciation, prosody, and stress patterns in line with the phonological rules of Italian, improving speech quality and naturalness compared to other existing models.
- Used Transformers, PyTorch, and Hugging Face Datasets to implement the machine learning and NLP pipeline.
- Applied 8-bit dynamic quantization to linear layers using PyTorch’s native API, reducing memory usage by 30% while maintaining inference accuracy.
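
As an illustration of the quantization step above, here is a minimal pure-Python sketch of the int8 quantize/dequantize arithmetic that weight quantization relies on. It is not the PyTorch API the project used. It assumes symmetric per-tensor scaling, and the helper names `quantize_int8` and `dequantize` and the sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Map float weights to int8 values with one symmetric per-tensor scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Assumption: symmetric range [-127, 127]; a scale of 1.0 avoids dividing by zero.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [v * scale for v in q]


# Hypothetical weights standing in for one row of a linear layer.
weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Each quantized weight occupies one byte instead of the four bytes of a float32, at the cost of a rounding error of at most half the scale. Only the quantized layers shrink, so the saving for a whole model is smaller than the per-weight ratio.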
Computer Vision Object Detection System
- Developed an object detection system utilizing YOLOv5 and PyTorch, designed to capture and process individual videos and photos for object detection.
- Optimized the computer vision pipeline to achieve 20+ FPS processing speed for 640x640 input resolution.
- Created a visualization system to render detection results with color-coded bounding boxes and confidence scores ranging from 0 to 1.
- Applied non-maximum suppression with an IoU threshold of 0.3 to remove duplicate detections.
- Added support for the 80 COCO dataset object classes, with confidence threshold filtering at 0.3.
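
The confidence filtering and non-maximum suppression described above can be sketched as follows. This is a simplified, class-agnostic version written for illustration, not the project's pipeline. The function names `iou` and `nms` and the sample boxes are hypothetical, and boxes are assumed to be `(x1, y1, x2, y2)` corner coordinates.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, conf_thres=0.3, iou_thres=0.3):
    """Drop low-confidence boxes, then greedily keep the highest-scoring
    box among any group that overlaps by more than iou_thres."""
    candidates = sorted(
        (d for d in detections if d[1] >= conf_thres),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        if all(iou(box, kept_box) <= iou_thres for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical detections as (box, confidence) pairs.
detections = [
    ((0, 0, 10, 10), 0.9),    # kept: highest score
    ((1, 1, 11, 11), 0.8),    # suppressed: overlaps the first box
    ((20, 20, 30, 30), 0.7),  # kept: no overlap
    ((50, 50, 60, 60), 0.2),  # dropped: below the confidence threshold
]
print(nms(detections))
```

A lower IoU threshold suppresses more aggressively, which reduces duplicate boxes but can also merge detections of objects that sit close together.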
Solar Panel Detection System
- Developed an object detection model using YOLOv8n to identify and locate solar panels in low-resolution satellite imagery.
- Processed and annotated a comprehensive dataset of 29,625 solar panel instances with high precision.
- Achieved 94.27% precision and 91.77% recall, significantly improving solar infrastructure mapping capabilities.
- Reached a mean Average Precision at an IoU threshold of 0.5 (mAP50) of 96.8%.
- Designed and deployed an optimized real-time inference pipeline, hosting it on Hugging Face for seamless accessibility and large-scale solar panel detection.

Aumkesh Chaudhary
