I am an Applied Science manager at Microsoft, based at the Redmond HQ in the beautiful Pacific Northwest. I completed my PhD in Computer Engineering (2015) at Iowa State University. My research was on large-scale data analytics, especially mining large networks. Recently I have done some research on GNNs. My research advisor was Prof. Srikanta Tirthapura. I earned my Bachelor's degree from the National Institute of Technology, Durgapur, India and completed my schooling at South Point High School, Kolkata, India, which was listed in the Guinness Book of World Records as the largest educational institution in the world from 1984 to 1992. I come from Kolkata, the City of Joy!!
Resume
Build state-of-the-art AI-powered products and solutions to solve real business problems that enable organizations and people to do their best work. Bring technical leadership and lead teams that can work efficiently across different parts of an organization.
Experience
Principal Applied Science Manager
Since Sep 2021
Senior Applied Scientist
Feb 2020 to Aug 2021
Senior Software Engineer
Sep 2018 to Aug 2019
Software Engineer - II
Jun 2015 to Aug 2017
Application Developer, IBM
Jun 2007 to Jul 2010
Education
PhD. Computer Engineering
Iowa State University
Fall 2010 to Spring 2015
B.Tech. Information Technology
NIT Durgapur
Fall 2003 to Spring 2007
Skills & Languages
ML Tools:
• Spark ML, pandas, numpy, scikit-learn, pytorch, SHAP, pytorch-geometric
Big Data Tools:
• Hadoop, Pig, Hive, Giraph, Spark, Cassandra, MPI, OpenMP
Other Tools:
• SQL, Sockets, GNU make, LaTeX, gnuplot, R (Coursera Verified Certificate, License NRJVKJHDHG)
Programming Languages:
• C#, C++, C, Java, SQL, Python, Pig Latin, HiveQL, UNIX shell scripting
Service & Leadership
Reviewer - IEEE Transactions on Big Data, Journal of Parallel and Distributed Computing, Parallel Computing, IEEE Transactions on Knowledge and Data Engineering, IEEE International Symposium on Quality of Service (IWQoS)
CIO and later President, Graduate and Professional Student Senate (GPSS)
President of the ACM Student Chapter - Iowa State University 2013 - 2015
Founder / President, GNU/Linux Users Group, NIT Durgapur 2004 - 2007
Principal Data & Applied Science Manager, Microsoft (Since Sep 2021)
• Managing a team of Data and Applied Scientists to build state-of-the-art Copilot experiences in Microsoft Business Applications.
• Management Excellence at Microsoft: Model, Coach, Care
• Proactive Copilot - Shipped Next Best Action feature in Microsoft Business Applications so that Copilot experiences can make recommendations to the user on what action the user can take next, thus simplifying the experience.
• Started a new workstream in our org to develop fine-tuned models for Business Copilot scenarios, with the goal of reducing LLM latency and cost, by developing the idea and pitching it to our Leadership.
• Shipped a Demand Forecasting Feature for Dynamics 365 Supply Chain on Spark ML. Implementation includes Time Series Decomposition (STL), Outlier Detection (STL & IQR) and Demand Forecasting with legacy models like STS, ETS as well as more "modern" approaches like Prophet.
• Shipped solutions to various Supply Planning and Inventory Optimization (ML as well as OR) problems for various Business Applications, including Disruption Mitigation for Microsoft Supply Chain Platform, which was mentioned in the CEO's Annual Report 2023.
• Shipped Marketing Attribution Feature for Dynamics 365 Marketing, including AI as well as non-AI rule-based approaches.
• Rallied the entire team to build a new GPT-based Content Ideas Generation Model for Dynamics 365 Marketing in under a month. Feature was showcased by Microsoft CVP.
Senior Data & Applied Scientist, Microsoft
(Sept 2019 to Aug 2021)
• Shipped Out of the Box Models for Microsoft Dynamics 365 Customer Insights. These models can be utilized by companies to unlock their data without the need for large data and ML teams.
• Designed the ML architecture to support the end-to-end Spark-based pipelines for Out of the Box ML scenarios in Customer Insights, enabling a scale-out design.
• Shipped the Subscription Churn model, which was the first model that was shipped as part of Customer Insights.
• Shipped the Product Recommendation model for Customer Insights as a Tech Lead by coordinating work across a v-team. Experimented with various traditional (e.g., Collaborative Filtering) versus newer (e.g., GNNs) techniques.
Senior Software Engineer, Microsoft
(Sep 2018 to Aug 2019)
Software Engineer - II, Microsoft
(Jun 2015 to Aug 2017)
• Part of a small team that built and shipped a new product, Dynamics 365 Customer Insights, from scratch.
• With bringing ML to this product in mind, designed and developed the user activity modeling for the product.
• Built the Apache Spark based hydration tool for data ingestion into Azure Cosmos DB & Cognitive Search. The tool handles billions of data records.
• Engineered a highly available (99.999%), low latency, high throughput internet scale distributed platform for real time access to customer profile data.
Application Developer, IBM
(Jun 2007 to Jul 2010)
SAP NetWeaver and Business Intelligence (BW) Consultant,
IBM India (worked from Kolkata, India, London, UK and Houston, USA)
Project: SAP BW implementation including design, development and support
Client: One of the six "supermajors" in the field of oil and gas
Patents (pending)
1. Junxuan Li, Allie Giddings, Arko P. Mukherjee, Sahil Bhatnagar, Ryan Wickman, "NL2ORA: using Natural Language to solve complex optimization problems in Ops Research", 2023
2. Mayank Shrivastava, Sagar Goyal, Sahil Bhatnagar, Pin-Jung Chen, Pushpraj Shukla, Arko P. Mukherjee, "Self-supervised system generating embeddings representing sequenced activity", 2021
Conference Publications
1. Susheel Suresh, Mayank Srivastava, Arko Mukherjee, Jennifer Neville and Pan Li, "Expressive and Efficient Representation Learning for Ranking Links in Temporal Graphs". In Proceedings of the ACM International World Wide Web Conference (WWW), 2023 (Acceptance Rate - 19.2%)
2. Susheel Suresh, Danny Godbout, Arko Mukherjee, Mayank Srivastava, Jennifer Neville and Pan Li, "Federated Graph Representation Learning using Self-Supervision". In Proceedings of the 1st International Workshop on Federated Learning with Graph Data at ACM CIKM 2022 (12 out of 16 papers accepted)
3. Arko Provo Mukherjee, Pan Xu and Srikanta Tirthapura, "Mining Maximal Cliques from an Uncertain Graph". In Proceedings of the IEEE International Conference on Data Engineering, 2015 (Acceptance Rate - 13%)
Invited to a special issue of the journal IEEE TKDE for the best five papers from ICDE 2015.
4. Arko Provo Mukherjee and Srikanta Tirthapura, "Enumerating Maximal BiCliques from a Large Graph using MapReduce". In Proceedings of the IEEE International Congress on Big Data, 2014 (Acceptance Rate - 35%)
Journal Publications
1. Arko Provo Mukherjee and Srikanta Tirthapura, "Enumerating Maximal BiCliques from a Large Graph using MapReduce". In IEEE Transactions on Services Computing, 2017
2. Arko Provo Mukherjee, Pan Xu and Srikanta Tirthapura, "Enumeration of Maximal Cliques from an Uncertain Graph". In IEEE Transactions on Knowledge and Data Engineering, 2017
3. Michael Svendsen, Arko Provo Mukherjee and Srikanta Tirthapura, "Mining Maximal Cliques from a Large Graph using MapReduce: Tackling Highly Uneven Subproblem Sizes". In Journal of Parallel and Distributed Computing, 2015
4. Bibudh Lahiri, Arko Provo Mukherjee and Srikanta Tirthapura, "Identifying Correlated Heavy-Hitters in a Two-Dimensional Data Stream". In Data Mining and Knowledge Discovery, 2016