PROFESSIONAL SUMMARY
Python ML Engineer with 4+ years of hands-on experience designing and implementing machine learning pipelines, MLOps practices, and distributed computing solutions. Strong command of Python fundamentals, SOLID principles, and design patterns as applied to production-grade ML systems. Proven track record of building scalable ML/MLOps pipelines, processing big data with the Apache ecosystem (Spark, Kafka, Airflow), and implementing comprehensive testing frameworks. Experienced in deploying ML models at scale on distributed computing frameworks and cloud infrastructure.
CORE COMPETENCIES

- Python Programming: Advanced Syntax, Data Types, Control Structures, OOP, SOLID Principles, Design Patterns
- Machine Learning: ML Algorithms, Deep Learning, NLP, Computer Vision, Model Training & Optimization
- MLOps & Pipelines: CI/CD, Model Versioning, Deployment, Monitoring, ML Pipeline Design, Production Systems
- Big Data & Distributed Computing: Apache Spark, Kafka, Airflow, PySpark, Distributed Computing Frameworks
- Testing Frameworks: Pytest, Unit Testing, Integration Testing, Test Case Development
- Data Engineering: ETL Pipelines, Data Processing, Data Quality, Pandas, NumPy, SQL
PROFESSIONAL EXPERIENCE

Maharashtra Knowledge Corporation Limited | Aug 2021 - Present

- Designed and implemented end-to-end ML pipelines using Python with focus on scalability, maintainability, and SOLID design principles for production-grade ML systems
- Built MLOps infrastructure implementing CI/CD pipelines for automated model training, validation, deployment, and monitoring using Docker, Kubernetes, and cloud platforms
- Developed distributed data processing workflows using Apache Spark and PySpark for handling large-scale datasets (100,000+ records) with optimized performance
- Implemented streaming data pipelines using Apache Kafka for real-time data ingestion and processing, enabling low-latency ML inference
- Created comprehensive unit testing frameworks using Pytest with 85%+ code coverage, ensuring robustness and reliability of ML components
- Orchestrated ML workflow automation using Apache Airflow for scheduling, monitoring, and managing complex data and ML pipelines
- Applied design patterns (Factory, Strategy, Observer) and SOLID principles to build maintainable and extensible ML applications

- Developed production ML models using Python (TensorFlow, PyTorch, Scikit-learn) with focus on Python fundamentals including advanced data structures, control flow, and type systems
- Built data engineering pipelines processing 100,000+ documents using PySpark with distributed computing frameworks for optimal performance and scalability
- Implemented ML model versioning and deployment systems following MLOps best practices, including model registry, A/B testing, and gradual rollout strategies
- Created automated testing suites for ML models including unit tests, integration tests, and performance benchmarks using the Pytest framework
- Designed big data architectures using Apache ecosystem tools (Spark, Kafka, Airflow) for real-time and batch processing of large-scale datasets
- Optimized Python code performance using profiling tools, vectorization techniques, and distributed computing patterns, achieving a 40% latency reduction
- Developed RESTful APIs using FastAPI and Flask for ML model serving with proper error handling, logging, and monitoring

- Built scalable ML pipelines implementing feature engineering, model training, hyperparameter tuning, and deployment using Python-based frameworks
- Developed distributed computing solutions leveraging Apache Spark for large-scale data processing and model training on cloud infrastructure
- Implemented data quality frameworks ensuring data validation, cleaning, and transformation using Python (Pandas, NumPy) and SQL
- Created automated ML workflows using Apache Airflow for scheduling daily model retraining, evaluation, and deployment tasks
- Wrote comprehensive test cases using Pytest for data processing functions, ML model components, and API endpoints, ensuring code reliability
- Deployed ML models to production on AWS and Azure using Docker containers with CI/CD pipelines and monitoring systems
- Applied OOP principles and design patterns to build modular, reusable ML components following software engineering best practices

- Developed ML models using Python (Scikit-learn, TensorFlow) with strong foundation in Python syntax, data types, control structures, and functional programming
- Built data processing pipelines using Python libraries (Pandas, NumPy) for ETL operations, data cleaning, and feature engineering
- Implemented unit testing frameworks using Pytest to ensure code quality and maintain high test coverage for data processing modules
- Created data visualization dashboards for monitoring ML model performance and data quality metrics
- Developed automated data workflows for batch processing and analysis of large datasets with focus on scalability and performance
- Applied design principles including separation of concerns, DRY, and KISS to build maintainable data science applications
KEY PROJECTS

AI-Powered Document Processing System with MLOps
Production ML Pipeline with Distributed Computing
- Built end-to-end ML pipeline processing 100,000+ documents using Apache Spark for distributed computing and Python-based ML frameworks
- Implemented MLOps practices including automated training, testing (Pytest), deployment, and monitoring with CI/CD integration
- Technologies: Python, Apache Spark, Kafka, Airflow, TensorFlow, PyTorch, Docker, Pytest, FastAPI
- Impact: 94% accuracy, 90% reduction in processing time using distributed computing, 85%+ test coverage

Autonomous Log Analyzer with ML & Big Data
Real-time Analytics System using Apache Ecosystem
- Developed real-time log analysis system using Apache Kafka for streaming, Spark for processing, and ML for anomaly detection
- Built with SOLID principles and design patterns, a comprehensive Pytest suite, and production-grade error handling
- Technologies: Python, Apache Spark, Kafka, Airflow, ML Algorithms, Pytest, Distributed Computing
- Impact: Real-time anomaly detection; scalable architecture handling millions of log entries
TECHNICAL SKILLS

- Python Expertise: Advanced Syntax, Data Types, Control Structures, OOP, Decorators, Generators, Context Managers
- Design Principles: SOLID Principles, Design Patterns (Factory, Strategy, Observer, Singleton), Clean Code
- ML/DL Frameworks: TensorFlow, PyTorch, Scikit-learn, Keras, XGBoost, LightGBM
- MLOps Tools: MLflow, Kubeflow, Docker, Kubernetes, CI/CD, Model Versioning, Monitoring
- Apache Ecosystem: Spark, PySpark, Kafka, Airflow, Hadoop, Hive
- Testing Frameworks: Pytest, Unittest, Integration Testing, Mock, Coverage.py
- Big Data & ETL: Data Engineering, ETL Pipelines, Data Quality, Distributed Computing
- Cloud & Databases: AWS (EMR, S3, Lambda), Azure, SQL, NoSQL, Redis
- Data Processing: Pandas, NumPy, Dask, Data Cleaning, Feature Engineering
- APIs & Web: FastAPI, Flask, REST APIs, Microservices
EDUCATION

University of Mumbai | CGPI: 7.49

KEY ACHIEVEMENTS
- 4+ years of hands-on experience in Python ML engineering with strong fundamentals in syntax, data types, and control structures
- Expert knowledge of SOLID principles and design patterns applied to production ML systems
- Proven experience in ML and MLOps pipelines with distributed computing frameworks (Spark, Kafka, Airflow)
- Comprehensive testing expertise using Pytest with 85%+ code coverage across multiple projects
- Deep knowledge of big data and data engineering with Apache open-source tools
- Successfully deployed scalable ML solutions processing 100,000+ records using distributed computing
- Immediate joiner, willing to relocate to Chennai