Led the development of a scalable e-procurement marketplace with ML-based fraud detection, achieving 89% accuracy. Implemented real-time data processing pipelines using Apache Kafka and Spark Streaming, reducing data latency by 70%.
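A minimal sketch of such a Kafka-to-Spark-Streaming pipeline (broker address, topic name, and the event schema are hypothetical; the streaming part assumes pyspark and a running Kafka cluster):

```python
# Sketch of a Kafka -> Spark Structured Streaming pipeline.
# Broker, topic, and field names below are illustrative assumptions.
import json


def parse_order(raw: bytes) -> dict:
    """Decode a raw Kafka message payload into an order event."""
    event = json.loads(raw.decode("utf-8"))
    return {"order_id": event["order_id"], "amount": float(event["amount"])}


def run_stream():
    # Imported lazily so parse_order stays usable without pyspark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, from_json
    from pyspark.sql.types import DoubleType, StringType, StructType

    spark = SparkSession.builder.appName("procurement-stream").getOrCreate()
    schema = (StructType()
              .add("order_id", StringType())
              .add("amount", DoubleType()))

    orders = (spark.readStream
              .format("kafka")
              .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical
              .option("subscribe", "orders")                     # hypothetical topic
              .load()
              .select(from_json(col("value").cast("string"), schema).alias("o"))
              .select("o.*"))

    # Console sink shown for illustration; a real job would write to a store.
    (orders.writeStream
     .format("console")
     .outputMode("append")
     .start()
     .awaitTermination())
```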
Built a comprehensive product data aggregator from multiple supplier APIs, enhancing data integration and accessibility for the e-procurement platform.
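The aggregation step can be sketched as normalizing supplier-specific records onto a common schema and merging by SKU (supplier names, field names, and the cheapest-offer policy are illustrative assumptions, stdlib only):

```python
# Toy sketch of aggregating product records from several supplier feeds.
def normalize(record: dict, supplier: str) -> dict:
    """Map a supplier-specific record onto a common schema."""
    return {
        "sku": record["sku"].upper(),
        "name": record.get("name") or record.get("title", ""),
        "price": float(record["price"]),
        "supplier": supplier,
    }


def aggregate(feeds: dict) -> dict:
    """Merge normalized records, keeping the cheapest offer per SKU."""
    catalog = {}
    for supplier, records in feeds.items():
        for raw in records:
            item = normalize(raw, supplier)
            best = catalog.get(item["sku"])
            if best is None or item["price"] < best["price"]:
                catalog[item["sku"]] = item
    return catalog
```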
Implemented MLOps best practices to streamline machine learning model deployment, reducing technical debt and cutting deployment time by 60%.
Created and deployed BI dashboards using SAP Analytics Cloud, increasing BI solution adoption by 30%. Performed detailed analysis of invoices, purchase orders, and suppliers.
Developed a machine learning model for product classification to optimize product repatriation in SAP S/4HANA, achieving 72% precision. Applied data augmentation and annotation techniques to handle imbalanced data.
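One common tactic for imbalanced classes is random oversampling of minority labels; a minimal stdlib-only sketch (toy data shapes, not the actual augmentation used):

```python
# Randomly duplicate minority-class samples until every class matches
# the majority class count.
import random
from collections import Counter


def oversample(samples: list, labels: list, seed: int = 0):
    """Return oversampled copies of (samples, labels) with balanced classes."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels
```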
Developed and deployed an advanced image classification system using OpenCV, TensorFlow, and Keras. Achieved high accuracy with various models including SVM (93%), LSTM (94%), and CNN (94%).
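The SVM arm of such a comparison can be sketched with scikit-learn, using its bundled digits dataset as a stand-in for the real images (dataset and hyperparameters are assumptions for illustration):

```python
# Hedged sketch: an RBF-kernel SVM baseline on flattened images.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def svm_baseline() -> float:
    """Train an SVM on flattened digit images and return test accuracy."""
    digits = load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=42)
    clf = SVC(kernel="rbf", gamma="scale")
    clf.fit(X_train, y_train)
    return clf.score(X_test, y_test)
```

The LSTM and CNN arms would follow the same train/score shape with Keras models in place of `SVC`.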
Redesigned the data warehouse schema, implementing a star schema that improved query performance by 200%. Introduced data partitioning and indexing strategies, reducing storage costs by 40%.
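A star schema in miniature, using SQLite (table and column names are illustrative): one fact table of purchases joined to supplier and date dimensions, with the typical aggregate-by-dimension query:

```python
# Miniature star schema: fact_purchase referencing two dimension tables.
import sqlite3


def build_and_query() -> list:
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dim_supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER);
        CREATE TABLE fact_purchase (
            supplier_id INTEGER REFERENCES dim_supplier(supplier_id),
            date_id INTEGER REFERENCES dim_date(date_id),
            amount REAL
        );
    """)
    con.executemany("INSERT INTO dim_supplier VALUES (?, ?)",
                    [(1, "Acme"), (2, "Globex")])
    con.executemany("INSERT INTO dim_date VALUES (?, ?)",
                    [(1, 2023), (2, 2024)])
    con.executemany("INSERT INTO fact_purchase VALUES (?, ?, ?)",
                    [(1, 1, 100.0), (1, 2, 50.0), (2, 2, 75.0)])
    # Typical star-schema query: aggregate facts grouped by dimension attributes.
    return con.execute("""
        SELECT s.name, d.year, SUM(f.amount)
        FROM fact_purchase f
        JOIN dim_supplier s USING (supplier_id)
        JOIN dim_date d USING (date_id)
        GROUP BY s.name, d.year
        ORDER BY s.name, d.year
    """).fetchall()
```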
Created a predictive analysis tool for e-commerce, focusing on customer behavior analysis. Integrated with existing systems to provide actionable insights for marketing and sales teams.
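A classic customer-behavior feature for such a tool is RFM (recency, frequency, monetary) scoring; a stdlib-only sketch with hypothetical field names:

```python
# Compute per-customer recency (days since last order), order frequency,
# and total spend from a simple order log.
from datetime import date


def rfm(orders: list, today: date) -> dict:
    """orders: [{"customer", "date", "amount"}, ...] -> per-customer scores."""
    scores = {}
    for o in orders:
        s = scores.setdefault(o["customer"],
                              {"recency": None, "frequency": 0, "monetary": 0.0})
        days = (today - o["date"]).days
        if s["recency"] is None or days < s["recency"]:
            s["recency"] = days
        s["frequency"] += 1
        s["monetary"] += o["amount"]
    return scores
```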
Developed a real-time analytics dashboard for monitoring and visualizing data streams from IoT devices. Implemented data processing logic using Apache Flink and created visualizations with Grafana.
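The core of such a Flink job is windowed aggregation; here is that logic in plain Python (a tumbling-window average per sensor, shown over a finite list where Flink would run the equivalent operator over an unbounded stream):

```python
# Tumbling-window average: bucket events by (sensor, window start) and
# average the values in each bucket.
from collections import defaultdict


def tumbling_avg(events: list, window_s: int = 60) -> dict:
    """events: (sensor_id, epoch_seconds, value) -> {(sensor, window_start): avg}."""
    sums = defaultdict(lambda: [0.0, 0])
    for sensor, ts, value in events:
        window_start = ts - ts % window_s
        bucket = sums[(sensor, window_start)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```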
Built a face mask detector app using OpenCV and deep learning. The project involved data collection, preparation, and training a deep learning classifier; OpenCV was used to run inference and display the results.
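The inference loop can be sketched as follows (the model path, input size, and class ordering are assumptions; the webcam part needs OpenCV and a trained Keras model on disk):

```python
# Sketch of a webcam mask-detection loop.
def label_from_score(p_mask: float, threshold: float = 0.5) -> str:
    """Map the model's mask probability to a display label."""
    return "Mask" if p_mask >= threshold else "No Mask"


def run_webcam(model_path: str = "mask_detector.h5"):  # hypothetical path
    import cv2  # imported lazily so label_from_score works without OpenCV
    from tensorflow.keras.models import load_model

    model = load_model(model_path)
    cap = cv2.VideoCapture(0)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        face = cv2.resize(frame, (224, 224)) / 255.0  # assumed input size
        p_mask = float(model.predict(face[None, ...])[0][0])
        cv2.putText(frame, label_from_score(p_mask), (10, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        cv2.imshow("mask-detector", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    cap.release()
    cv2.destroyAllWindows()
```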
Created a word count application using AWS EMR and Spark. Generated a 20 GB corpus using NLTK, set up an EMR cluster, loaded data to S3, and defined a Spark application for processing.
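A sketch of such a Spark word-count application (the S3 URIs are placeholders; the counting logic is also shown as a pure function):

```python
# Word count: pure-Python core plus the equivalent Spark RDD job.
import re
from collections import Counter


def count_words(lines) -> Counter:
    """Lower-case, split on non-letters, and tally words."""
    counts = Counter()
    for line in lines:
        counts.update(w for w in re.split(r"[^a-z]+", line.lower()) if w)
    return counts


def spark_job():
    # Requires pyspark; URIs below are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("wordcount").getOrCreate()
    lines = spark.sparkContext.textFile("s3://my-bucket/corpus/")  # placeholder
    counts = (lines.flatMap(lambda l: re.split(r"[^a-z]+", l.lower()))
                   .filter(bool)
                   .map(lambda w: (w, 1))
                   .reduceByKey(lambda a, b: a + b))
    counts.saveAsTextFile("s3://my-bucket/counts/")  # placeholder
```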
Created a translation system from a parallel corpus using OpenNMT. The process included sub-tokenization, training, translation, detokenization, and evaluation of the machine translation model.
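Sub-tokenization for NMT is typically byte-pair encoding; one merge step of a toy BPE learner, stdlib only (a simplification of what OpenNMT's tokenizers do):

```python
# One BPE merge: find the most frequent adjacent symbol pair across all
# words and fuse it into a single symbol.
from collections import Counter


def bpe_merge_once(words: list) -> list:
    """words: list of symbol lists, e.g. [['a','a','b']] -> merged copies."""
    pairs = Counter()
    for w in words:
        pairs.update(zip(w, w[1:]))
    if not pairs:
        return words
    (a, b), _ = pairs.most_common(1)[0]
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == (a, b):
                out.append(a + b)
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged
```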
Developed a recommendation system using the MovieLens dataset. Created a pipeline for data loading, preparation, model training, cross-validation, and evaluation. Compared SVD and KNN models, with KNN (item-based, using cosine similarity) performing best despite longer training time.
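The core of the item-based approach: cosine similarity between item rating vectors, then a similarity-weighted score for an unseen item (toy sparse ratings, stdlib math only):

```python
# Item-based collaborative filtering with cosine similarity.
import math


def cosine(u: dict, v: dict) -> float:
    """Cosine similarity of two sparse rating vectors {user: rating}."""
    common = set(u) & set(v)
    num = sum(u[k] * v[k] for k in common)
    den = (math.sqrt(sum(x * x for x in u.values()))
           * math.sqrt(sum(x * x for x in v.values())))
    return num / den if den else 0.0


def predict(user_ratings: dict, item_vectors: dict, target: str) -> float:
    """Score `target` from the user's rated items, weighted by similarity."""
    sims = [(cosine(item_vectors[target], item_vectors[i]), r)
            for i, r in user_ratings.items() if i != target]
    wsum = sum(abs(s) for s, _ in sims)
    return sum(s * r for s, r in sims) / wsum if wsum else 0.0
```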
Analyzed crime statistics from the European Union to identify the countries with the highest crime rates. Used datasets on assaults, intentional homicides, car thefts, and robberies. Created visualizations using choropleth maps, tree maps, and bar charts in Plotly.
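The choropleth step can be sketched with Plotly Express (column names and the per-capita normalization are illustrative assumptions; the map itself requires plotly and ISO-3 country codes):

```python
# Per-capita normalization plus the choropleth call.
def per_100k(count: int, population: int) -> float:
    """Normalize a raw incident count to a rate per 100,000 inhabitants."""
    return 100_000 * count / population


def make_choropleth(rows):
    # Requires plotly; `rows` is a DataFrame-like table with the assumed
    # columns "iso3" and "assaults_per_100k".
    import plotly.express as px

    return px.choropleth(
        rows,
        locations="iso3",           # assumed ISO-3 code column
        color="assaults_per_100k",  # assumed metric column
        scope="europe",
        color_continuous_scale="Reds",
    )
```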