PROFESSIONAL SUMMARY
Python ML Engineer with 4+ years of hands-on experience designing and implementing machine learning pipelines, MLOps practices, and distributed computing solutions. Strong command of Python fundamentals, SOLID principles, and design patterns as applied to production-grade ML systems. Proven track record of building scalable ML/MLOps pipelines, processing big data with the Apache ecosystem (Spark, Kafka, Airflow), and implementing comprehensive testing frameworks. Experienced in deploying ML models at scale on distributed computing frameworks and cloud infrastructure.
CORE COMPETENCIES

- Python Programming: Advanced Syntax, Data Types, Control Structures, OOP, SOLID Principles, Design Patterns
- Machine Learning: ML Algorithms, Deep Learning, NLP, Computer Vision, Model Training & Optimization
- MLOps & Pipelines: CI/CD, Model Versioning, Deployment, Monitoring, ML Pipeline Design, Production Systems
- Big Data & Distributed Computing: Apache Spark, Kafka, Airflow, PySpark, Distributed Computing Frameworks
- Testing Frameworks: Pytest, Unit Testing, Integration Testing, Test Case Development
- Data Engineering: ETL Pipelines, Data Processing, Data Quality, Pandas, NumPy, SQL
PROFESSIONAL EXPERIENCE

Maharashtra Knowledge Corporation Limited | Aug 2021 - Present

- Designed and implemented end-to-end ML pipelines using Python with focus on scalability, maintainability, and SOLID design principles for production-grade ML systems
- Built MLOps infrastructure implementing CI/CD pipelines for automated model training, validation, deployment, and monitoring using Docker, Kubernetes, and cloud platforms
- Developed distributed data processing workflows using Apache Spark and PySpark for handling large-scale datasets (100,000+ records) with optimized performance
- Implemented streaming data pipelines using Apache Kafka for real-time data ingestion and processing, enabling low-latency ML inference
- Created comprehensive unit testing frameworks using Pytest with 85%+ code coverage, ensuring robustness and reliability of ML components
- Orchestrated ML workflow automation using Apache Airflow for scheduling, monitoring, and managing complex data and ML pipelines
- Applied design patterns (Factory, Strategy, Observer) and SOLID principles to build maintainable and extensible ML applications

- Developed production ML models using Python (TensorFlow, PyTorch, Scikit-learn) with focus on Python fundamentals including advanced data structures, control flow, and type systems
- Built data engineering pipelines processing 100,000+ documents using PySpark with distributed computing frameworks for optimal performance and scalability
- Implemented ML model versioning and deployment systems following MLOps best practices, including model registry, A/B testing, and gradual rollout strategies
- Created automated testing suites for ML models including unit tests, integration tests, and performance benchmarks using the Pytest framework
- Designed big data architectures using Apache ecosystem tools (Spark, Kafka, Airflow) for real-time and batch processing of large-scale datasets
- Optimized Python code performance using profiling tools, vectorization techniques, and distributed computing patterns, achieving a 40% latency reduction
- Developed RESTful APIs using FastAPI and Flask for ML model serving with proper error handling, logging, and monitoring

- Built scalable ML pipelines implementing feature engineering, model training, hyperparameter tuning, and deployment using Python-based frameworks
- Developed distributed computing solutions leveraging Apache Spark for large-scale data processing and model training on cloud infrastructure
- Implemented data quality frameworks ensuring data validation, cleaning, and transformation using Python (Pandas, NumPy) and SQL
- Created automated ML workflows using Apache Airflow for scheduling daily model retraining, evaluation, and deployment tasks
- Wrote comprehensive test cases using Pytest for data processing functions, ML model components, and API endpoints, ensuring code reliability
- Deployed ML models to production on AWS and Azure using Docker containers with CI/CD pipelines and monitoring systems
- Applied OOP principles and design patterns to build modular, reusable ML components following software engineering best practices

- Developed ML models using Python (Scikit-learn, TensorFlow) with strong foundation in Python syntax, data types, control structures, and functional programming
- Built data processing pipelines using Python libraries (Pandas, NumPy) for ETL operations, data cleaning, and feature engineering
- Implemented unit testing frameworks using Pytest to ensure code quality and maintain high test coverage for data processing modules
- Created data visualization dashboards for monitoring ML model performance and data quality metrics
- Developed automated data workflows for batch processing and analysis of large datasets with focus on scalability and performance
- Applied design principles including separation of concerns, DRY, and KISS to build maintainable data science applications
KEY PROJECTS

AI-Powered Document Processing System with MLOps
Production ML Pipeline with Distributed Computing
- Built end-to-end ML pipeline processing 100,000+ documents using Apache Spark for distributed computing and Python-based ML frameworks
- Implemented MLOps practices including automated training, testing (Pytest), deployment, and monitoring with CI/CD integration
- Technologies: Python, Apache Spark, Kafka, Airflow, TensorFlow, PyTorch, Docker, Pytest, FastAPI
- Impact: 94% accuracy, 90% reduction in processing time using distributed computing, 85%+ test coverage

Autonomous Log Analyzer with ML & Big Data
Real-time Analytics System using Apache Ecosystem
- Developed real-time log analysis system using Apache Kafka for streaming, Spark for processing, and ML for anomaly detection
- Built with SOLID principles and design patterns, a comprehensive Pytest suite, and production-grade error handling
- Technologies: Python, Apache Spark, Kafka, Airflow, ML Algorithms, Pytest, Distributed Computing
- Impact: Real-time anomaly detection; scalable architecture handling millions of log entries
TECHNICAL SKILLS

- Python Expertise: Advanced Syntax, Data Types, Control Structures, OOP, Decorators, Generators, Context Managers
- Design Principles: SOLID Principles, Design Patterns (Factory, Strategy, Observer, Singleton), Clean Code
- ML/DL Frameworks: TensorFlow, PyTorch, Scikit-learn, Keras, XGBoost, LightGBM
- MLOps Tools: MLflow, Kubeflow, Docker, Kubernetes, CI/CD, Model Versioning, Monitoring
- Apache Ecosystem: Spark, PySpark, Kafka, Airflow, Hadoop, Hive
- Testing Frameworks: Pytest, Unittest, Integration Testing, Mock, Coverage.py
- Big Data & ETL: Data Engineering, ETL Pipelines, Data Quality, Distributed Computing
- Cloud & Databases: AWS (EMR, S3, Lambda), Azure, SQL, NoSQL, Redis
- Data Processing: Pandas, NumPy, Dask, Data Cleaning, Feature Engineering
- APIs & Web: FastAPI, Flask, REST APIs, Microservices
EDUCATION

University of Mumbai | CGPI: 7.49

KEY ACHIEVEMENTS
- 4+ years of hands-on experience in Python ML engineering with strong fundamentals in syntax, data types, and control structures
- Expert knowledge of SOLID principles and design patterns applied to production ML systems
- Proven experience in ML and MLOps pipelines with distributed computing frameworks (Spark, Kafka, Airflow)
- Comprehensive testing expertise using Pytest with 85%+ code coverage across multiple projects
- Deep knowledge of big data and data engineering with Apache open-source tools
- Successfully deployed scalable ML solutions processing 100,000+ records using distributed computing
- Immediate joiner, willing to relocate to Chennai