API World + CloudX + DataWeek 2025

12:00pm PDT

PRO WORKSHOP (DataWeek): Building a RAG System for Video Search and Analysis

Wednesday September 3, 2025 12:00pm - 12:50pm PDT

DataWeek -- Main Stage

Elizabeth Fuentes Leone, AWS, Developer Advocate

This talk addresses the challenge of making video content searchable and analyzable using modern AI techniques. While text and image RAG systems are common, video presents unique challenges due to its multimodal nature combining visual frames and audio content.

Speakers

Elizabeth Fuentes Leone

Developer Advocate, AWS

As a Data Analytics and Machine Learning/Artificial Intelligence (ML/AI) Specialist, my mission is to break down complex concepts into easily understandable terms. I strive to develop innovative solutions that tackle real-world challenges effectively. By sharing my knowledge and experience...

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Science and Machine Learning (DataWeek), Data Engineering / Architecture and Streaming (DataWeek)
In-Person/Virtual In Person

12:00pm PDT

PRO WORKSHOP (DataWeek): Democratizing AI: Ensuring Security, Governance, and Compliance in AI Development

Wednesday September 3, 2025 12:00pm - 12:50pm PDT

DataWeek -- Workshop Stage C (PRO)

David Adeleke, ZeeH Technologies, CEO

In this session, we will explore how the democratization of AI is not only driving innovation but also raising important questions around security, governance, and compliance. As organizations increasingly adopt AI technologies, it becomes imperative to establish robust frameworks for ensuring the security and ethical use of AI systems. From data privacy concerns to algorithmic bias, we will examine the key challenges facing AI practitioners and discuss strategies for mitigating risks and fostering responsible AI development. Through insightful discussions and practical examples, attendees will gain a deeper understanding of the intersection between AI democratization and security, governance, and compliance considerations. Join us as we navigate the complex landscape of AI ethics and governance and chart a path towards building trust in AI-powered solutions.

Speakers

David Adeleke

CEO, ZeeH Technologies

David Adeleke Olaoluwa is a serial entrepreneur, investor, pioneer and change agent driven by a mission to raise the standard of living of people through business and wealth creation as a catalyst for positive change globally. David is a leading investment professional and executive...

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Governance and Security (DataWeek), Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

1:00pm PDT

PRO WORKSHOP (DataWeek): Securing the Data Layer: Privacy, Compliance, and Trust in AI-Driven Pipelines

Wednesday September 3, 2025 1:00pm - 1:50pm PDT

DataWeek -- Main Stage

Advait Patel, Broadcom, Senior Site Reliability Engineer

As organizations scale AI-driven applications and cloud-native pipelines, data privacy and compliance have become top concerns, not just for legal teams, but for developers and data engineers too. With regulations like GDPR, HIPAA, and the AI Act taking shape, teams need to rethink how data flows through their ML lifecycle.

In this session, Advait Patel, a cloud security engineer and contributor to the Cloud Security Alliance’s AI Control Matrix, will explore how to architect secure and compliant data pipelines for AI and analytics workloads. From ingestion and transformation to model training and inference, every layer presents unique risks, and unique opportunities to build trust.

Speakers

Advait Patel

Senior Site Reliability Engineer, Broadcom

Advait Patel is a skilled Senior Site Reliability Engineer based in Chicago, with a passion for leveraging technology to drive impactful solutions. With extensive experience in Cloud Computing, Cloud Security, and Cybersecurity, he currently works at Broadcom, where he plays a key...

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Governance and Security (DataWeek), Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

2:00pm PDT

PRO WORKSHOP (DataWeek): Building New Cost-Effective Analytics Platform with Open Source tools for Fanatics

Wednesday September 3, 2025 2:00pm - 2:50pm PDT

DataWeek -- Main Stage

Bhanu Cherukumille, Fanatics, Director - Data Engineering

At Fanatics, we previously relied on a mix of in-house, open-source, and commercial tools for Analytics and BI. To streamline and modernize our data ecosystem, we built a unified platform — a "one to rule them all" solution — powered entirely by an open-source stack: Kafka, StarRocks, and Superset. The result? A cost-effective, high-performance, and feature-rich platform that unlocks powerful new capabilities for the business.

Speakers

Bhanu Cherukumille

Director - Data Engineering, Fanatics

Specialize in building real-time and self-serve analytics solutions in the cloud. Empowering teams with agile, data-driven insights and transforming complex data into actionable intelligence for innovative organizations

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Tools / Technology and Management (DataWeek), Data Engineering / Architecture and Streaming (DataWeek)
In-Person/Virtual In Person

3:00pm PDT

PRO WORKSHOP (DataWeek): AI-Driven Data Optimization: Smart Strategies for Scalability and Performance

Wednesday September 3, 2025 3:00pm - 3:50pm PDT

DataWeek -- Main Stage

Indu Chaube, Cisco Systems, Senior Software Engineer

In my session, 'AI-Driven Data Optimization: Smart Strategies for Scalability and Performance,' I will examine the transformative role of artificial intelligence in modern data management. This session will delve into how AI-powered techniques streamline data ingestion, automate preprocessing workflows, enhance storage efficiency, and enable real-time analytics for intelligent decision-making. Through advanced optimization strategies, AI-driven solutions significantly reduce latency, improve resource allocation, and ensure seamless scalability in high-volume data ecosystems. Attendees will gain a deep understanding of AI’s impact on data governance, predictive analytics, and automation, learning how to implement robust end-to-end pipelines that drive operational efficiency and business intelligence.

Speakers

Indu Chaube

Senior Software Engineer, Cisco Systems

Indu Chaube is a highly accomplished software architect and visionary leader with more than a decade-long track record in software product design and development, encompassing User Interface, User Experience, and web API domains. Collaborating with industry giants like Cisco and SAMSUNG...

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

4:00pm PDT

PRO WORKSHOP (DataWeek): Leveraging Open Source AI Securely and Privately

Wednesday September 3, 2025 4:00pm - 4:50pm PDT

DataWeek -- Main Stage

JJ Asghar, IBM, Developer Advocate

Honestly, you're probably jealous of those people saying "ChatGPT" is making their lives easier. You may think that leveraging AI is this generation's adoption of the calculator. I'm here to say yes, you are right, but let's be honest: we have no idea how the ChatGPTs of the world are trained or if they are secure.
If you run a technology company, your data is your secret sauce; if your boss found out you were leveraging ChatGPT to do your job, would they be happy? Probably not.
In this presentation, we will explore how LLMs work locally through hands-on activities, such as setting up some local open-source private and free LLMs to help you get closer to the promise of generative AI.
Overall, this presentation will start with some basic installation leveraging Ollama and VS Code or any of the JetBrains platforms (and you'll learn what that means and what the differences are!), then pull in AnythingLLM and/or OpenWebUI to help give you a straightforward chat-like interface to your LLM. We will provide you with the knowledge, tools, some generic prompts, and use cases to gain more confidence in your daily life by leveraging LLMs.
You'll walk out of the workshop with a working LLM that you can engage with no fees or usage, all locally and securely.

Speakers

JJ Asghar

Developer Advocate, IBM

JJ works as a Developer Advocate representing IBM worldwide. He mainly focuses on open-source AI and OpenShift, trying to help companies and users successfully onboard to the Cloud-Native ecosystem. He’s also known in the DevOps tooling ecosystem and generalized Linux communities...

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, AI Models and Management (DataWeek), Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

5:00pm PDT

PRO WORKSHOP (DataWeek): From Data to Insights in Minutes: Accelerating Business Growth with AutoML and LLM-Powered Data Prep

Wednesday September 3, 2025 5:00pm - 5:50pm PDT

DataWeek -- Main Stage

Shailaja Sampat, Fujitsu Research of America, Senior Researcher

- Are you tired of manually processing large datasets in spreadsheets?
- Have you considered using Machine Learning (ML) to automate your tasks but feel limited by your coding skills or time constraints?
- Looking for ways to reduce the time spent in making your data AI-ready, despite your ML background?

If these challenges sound familiar, this session is for you!

Traditionally, building predictive models has required specialized coding and statistical expertise. Moreover, data professionals often dedicate over 80% of their time to the labor-intensive data-wrangling process of preparing raw data for consumption by automated machine learning (AutoML) tools. To overcome these hurdles, we introduce AutoDW, an innovative data wrangler that utilizes the power of Large Language Models (LLMs) and sophisticated automation to facilitate the seamless preparation of AI-ready data. This session will demonstrate how the synergistic application of AutoDW and AutoML empowers users to rapidly develop predictive prototypes for their business use cases end-to-end without writing any code. Through step-by-step instruction and a live demonstration, attendees will witness AutoDW's intelligent data processing, observe AutoML's autonomous algorithm selection tailored to specific applications, and, most importantly, gain a comprehensive understanding of how to interpret the resulting predictive outputs.

#AutoML #DataWrangling #NoCodeML #FastMLPrototyping #DataDrivenDevelopment

Speakers

Shailaja Sampat

Senior Researcher, Fujitsu Research of America

Shailaja Sampat is a senior researcher in the AI lab at Fujitsu's research division in the USA. She earned her Ph.D. from Arizona State University, with a thesis focusing on the intersection of natural language processing and computer vision. Her current research spans autoML, data...

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Science and Machine Learning (DataWeek), Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

9:30am PDT

PRO Session (DataWeek): Intelligent Automation of Data Engineering Workflows with LLMs

Thursday September 4, 2025 9:30am - 9:55am PDT

DataWeek -- Main Stage

Manohar Sai Jasti, Workday, Analytics Engineer

I will share how I developed an AI-driven system to transform raw SQL into production-ready dbt models using Large Language Models (LLMs). By combining retrieval-augmented generation techniques with dbt’s semantic framework, I automated SQL refactoring, modularization, testing, and documentation. This approach accelerates data engineering workflows, reduces manual effort, and enables scalable, production-ready analytics pipelines. I will walk through the architecture, challenges faced during scaling, validation strategies for AI-generated SQL, and key lessons learned from deploying this solution in real-world environments. Attendees will gain practical insights into applying LLMs for data workflow automation, improving pipeline quality, and driving faster AI productionization across modern data stacks.

Speakers

Manohar Sai Jasti

Analytics Engineer, Workday

Manohar Sai Jasti is an experienced Analytics Engineer specializing in building efficient and scalable data pipelines. With expertise in tools like dbt, Trino, and cloud platforms, he helps organizations turn data into actionable insights. Manohar is passionate about simplifying data...

Talk Type PRO Session
Tracks DataWeek, Data Engineering / Architecture and Streaming (DataWeek)
In-Person/Virtual In Person

10:00am PDT

PRO Session (DataWeek): Generative AI Operation Evaluation Framework

Thursday September 4, 2025 10:00am - 10:25am PDT

DataWeek -- Main Stage

Cigil Achenkunju, LivePerson, Data and Product Management

How do we know if this gen AI investment is moving the needle? It is a question heard almost daily across finance, healthcare, and retail. And honestly, it is the right question to ask. On top of that, should we continue to invest in AI at the same rate or optimize? How can a stakeholder show that solution usage has a positive or maybe negative impact on their operation?
Survey results indicate that up to 85% of AI initiatives eventually fail to deliver their promises. Organizations using gen AI want to understand the impact of such solutions clearly. Can you blame them? So, let’s define a strategic decision-making framework that broadly answers these business questions in an operational setting that balances the benefits of business value and AI integration.

An analytical framework for operations measurement, 2S/2E: think of it as four pillars used together to tell a complete story of your gen AI-enabled operation's health. Each pillar reveals a different facet of your performance, and I'll show you exactly how to measure them. These pillars offer valuable insights to measure your operations. What makes this framework powerful is its systematic approach and adaptability.

Speakers

Cigil Achenkunju

Data and Product Management, LivePerson

A leader in advanced data analytics and a strategic advisor, Cigil has a robust background in data science and product management. With extensive experience across various organizations, Cigil has helped companies transform data into actionable insights that drive business success...

Talk Type PRO Session
Tracks DataWeek, Data Governance and Security (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek)
In-Person/Virtual In Person

10:00am PDT

Hackathon Kickoff Networking

Thursday September 4, 2025 10:00am - 11:00am PDT

Meet up with your team, connect with other challengers, and ask the sponsors questions directly.

Expo Hall

Talk Type OPEN Session
Tracks API World, DataWeek, CloudX
In-Person/Virtual In Person

10:30am PDT

OPEN Session (DataWeek): Hiring for AI Success: Why Your First Hire Should Be a Data Engineer

Thursday September 4, 2025 10:30am - 10:55am PDT

DataWeek -- Main Stage

Brenna Buuck, MinIO, Developer Evangelist

AI initiatives are at the top of every organization’s priority list, yet many fail before they even begin—not because of poor models, but because of poor data foundations. While hiring an AI/ML engineer may seem like the logical first step, success depends on a different approach: hiring a data engineer first.

In this session, I'll explore why data infrastructure is the true bottleneck in AI adoption and how the right data engineering expertise ensures AI models perform at scale. Drawing on real-world experience, I’ll walk through the hiring missteps organizations often make and how to avoid costly mistakes when building AI initiatives from the ground up.

Speakers

Brenna Buuck

Developer Evangelist, MinIO

Brenna Buuck is the subject matter expert at MinIO for databases and datalakes. A data engineer turned developer evangelist, she is passionate about coding, data, and learning. She endeavors to inspire and educate other developers about the latest tools and technologies with the goal...

Talk Type OPEN Session
Tracks DataWeek
In-Person/Virtual In Person

11:30am PDT

OPEN Session (DataWeek): Strategies for Image Dataset Curation from High-Volume Industrial IoT data

Thursday September 4, 2025 11:30am - 11:55am PDT

DataWeek -- Main Stage

Apurva Godghase, Brambles, Senior Computer Vision Engineer

In industrial IoT for supply chain and logistics, massive amounts of data are generated by edge devices that capture data continuously. For embedded vision systems, managing the sheer volume of images and metadata can be challenging. Selecting a diverse subset of high-quality data is crucial for effective modeling and analysis. This work outlines a comprehensive method for selecting relevant images from an extensive dataset to build a high-quality image database for building and monitoring computer vision and machine learning models. This systematic approach not only enhances the efficiency of data management in industrial IoT applications but also improves the generalizability and accuracy of computer vision models.

Speakers

Apurva Godghase

Senior Computer Vision Engineer, Brambles

Apurva is a Senior Computer Vision Engineer at Brambles, with over seven years of R&D experience across diverse industrial domains. At Brambles, she specializes in designing and deploying cutting-edge machine learning and computer vision IoT prototypes to enhance supply chain efficiencies...

Talk Type OPEN Session
Tracks OPEN Session, DataWeek, Data Science and Machine Learning (DataWeek)
In-Person/Virtual In Person

1:00pm PDT

PRO Session (DataWeek): AI-Driven Innovation: Scalable Data Architectures

Thursday September 4, 2025 1:00pm - 1:25pm PDT

DataWeek -- Main Stage

Pritam Roy, Capgemini, Sr. Manager

As enterprises embrace AI for scalable automation, predictive analytics, and real-time decision intelligence, the need for robust data architectures and machine learning frameworks has never been greater. This session, led by Pritam Roy, a seasoned AI and data engineering leader, will explore how to design and implement scalable AI-powered data solutions that optimize business operations, cloud efficiency, and enterprise intelligence.

Speakers

Pritam Roy

Sr. Manager, Capgemini

Pritam Roy is a seasoned AI and data engineering leader, specializing in enterprise-scale AI solutions, cloud computing, and machine learning-driven business transformation. With over 20 years of experience, he has played a pivotal role in AI innovation, predictive analytics, and...

Talk Type PRO Session
Tracks DataWeek
In-Person/Virtual In Person

1:30pm PDT

PRO Session (DataWeek): Integrating Data Governance into Cyber Risk Management

Thursday September 4, 2025 1:30pm - 1:55pm PDT

DataWeek -- Main Stage

Nandini Singh, Google, Sr. TPM

This session is designed for cybersecurity professionals, data governance leaders, and IT managers seeking to strengthen their organization's cybersecurity posture through effective data governance practices. Attendees will leave with actionable insights and strategies to enhance their organization's resilience against cyber threats.

Drawing upon my experience of working at the Office of Cybersecurity Resilience at Google, I will share lessons learned from integrating data governance into cyber risk management, with a focus on evaluating metric quality levels (introducing the concept of Metric Bill of Materials) and developing a continuous improvement and adaptation roadmap.

Speakers

Nandini Singh

Sr. TPM, Google

Nandini Singh is a seasoned professional in the fields of data modeling, analytics, and cybersecurity technologies, with a robust career that spans over a decade. She currently serves as a Senior Technical Program Manager at Google, where she leads initiatives on product, platform...

Talk Type PRO Session
Tracks DataWeek, Data Governance and Security (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek)
In-Person/Virtual In Person

2:30pm PDT

KEYNOTE (DataWeek): Informatica -- Mastering Enterprise AI Agents: Best Practices for Multi-Agent Frameworks

Thursday September 4, 2025 2:30pm - 2:55pm PDT

DataWeek -- Main Stage

Sumeet Kumar Agrawal, Informatica, Vice President Product Management

Explore the design and deployment of enterprise-grade AI agent systems. This session covers best practices for building scalable multi-agent frameworks, ensuring security and governance, and features real-world examples from industries like customer service and supply chain management. Learn how to integrate AI agents effectively while addressing key enterprise challenges.

Speakers

Sumeet Kumar Agrawal

Vice President Product Management, Informatica

Sumeet Agrawal is the Vice President of Products at Informatica, where he spearheads product management of innovative cloud-based technology products. With over 15 years of experience in data engineering and product management, Sumeet has a proven track record of driving innovative solutions...

5. KEYNOTES & FEATURED

Talk Type OPEN Session
Tracks OPEN Session, DataWeek, Data Science and Machine Learning (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek)
In-Person/Virtual In Person

3:00pm PDT

OPEN Session (DataWeek): AI Leadership in Data Strategy: Transforming Large-Scale Data Systems for Business Growth

Thursday September 4, 2025 3:00pm - 3:25pm PDT

DataWeek -- Main Stage

Vijay Panwar, Panasonic Avionics Corporation, Senior Software Engineer

As organizations progressively depend on data to foster innovation, the significance of leadership in shaping and executing AI-driven strategies becomes crucial. In this session, I will present insights gained from over 12 years of experience spearheading transformative initiatives incorporating AI into extensive data systems. The discussion will emphasize strategic frameworks for the adoption of AI, the alignment of technological advancements with business goals, and the development of scalable data ecosystems. By referencing real-world examples, including my involvement in managing and optimizing terabyte-scale data, I will demonstrate how AI can transform backend systems, enhance workflows, and provide tangible value.

Speakers

Vijay Panwar

Senior Software Engineer, Panasonic Avionics Corporation

I am an accomplished IT professional with a decade of experience and expertise in a wide array of technologies, including Python, SQL Server, MySQL, PHP, Web services, REST API, and more. I have a proven track record of contributing to the field, having published two research...

Talk Type OPEN Session
Tracks DataWeek, OPEN Session
In-Person/Virtual In Person

3:30pm PDT

OPEN Session (DataWeek): Balancing Velocity with Academic Rigor When Building with LLMs

Thursday September 4, 2025 3:30pm - 3:55pm PDT

DataWeek -- Main Stage

Lauren Peate, Multitudes, CEO & founder

We’re all building AI features now. But building with LLMs brings its own challenges, namely: how can we use cutting-edge practices, weave in AI ethics, and consider the cost of different models without blowing past delivery dates? Not to mention making sure that the features we build will be stable, reliable and maintainable in the future.

We recently built our first LLM feature, to show the quality of feedback given in code reviews. In 1 month, we did a literature review, consultation with academic experts, data labelling, model experimentation, a cost assessment, and finally, all the ML engineering to launch it into production. The outcome: <1% extreme misclassification and zero hallucinations. In this talk, we’ll share our approach to building LLM features – how we partnered with academia (without being delayed by their timelines), what tooling we used, and how we made the cost and money tradeoffs to keep business stakeholders happy. I’ll also speak to how we built this into our microservices architecture, including how we used tools to generate structured outputs from LLMs on top of AWS’s Bedrock API to have parseable responses from a range of models.

You'll walk away with practical strategies for leading your own teams through AI implementations, identifying ethical issues early, addressing them efficiently, and still delivering on time and on budget.

Speakers

Lauren Peate

CEO & founder, Multitudes

Lauren Peate is the CEO and founder of Multitudes, which helps engineering teams improve delivery sustainably. She’s focused her career on using data to support people, including as the founder of Ally Skills NZ, a consultancy helping global tech companies improve team performance...

Talk Type OPEN Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Science and Machine Learning (DataWeek)
In-Person/Virtual In Person

6:30pm PDT

DevAfter Hours

Thursday September 4, 2025 6:30pm - 8:00pm PDT

Santa Clara Convention Center - Lobby

Join and network with other attendees.

2. Experiences & Official Events

Talk Type OPEN Session
Tracks API World, DataWeek, CloudX, OPEN Session
In-Person/Virtual In Person

9:30am PDT

PRO Session (DataWeek): The Rise of Agentic Commerce: Where AI Intelligence Meets Infinite Retail Possibility

Friday September 5, 2025 9:30am - 9:55am PDT

DataWeek -- Main Stage

Aswini Atibudhi, Walmart, Driving Innovation with Generative AI

In this visionary session, we’ll explore how Agentic AI is not just enhancing retail — it’s fundamentally reinventing it. As commerce shifts from static transactions to dynamic, intelligent interactions, Agentic AI emerges as the architect of a new era: Agentic Commerce. Learn how AI-powered agents, composable architectures, and autonomous decision-making systems are creating infinitely adaptable, customer-first retail ecosystems. We’ll dive into real-world examples, transformative architectures, and the strategic shifts needed to thrive in an AI-first retail future.
Join us to uncover how businesses can unlock limitless innovation, personalized experiences, and operational agility by embracing the rise of Agentic Commerce.

Speakers

Aswini Atibudhi

Distinguished Architect for Customer Space, Walmart

Aswini is the Distinguished Architect for Customer Space at Walmart. He has more than 20 years of IT experience in the design and development of scalable microservice and microfrontend based Web/Cloud/AI & ML applications. His portfolio is jam-packed with multiple domains such as Finance...

Talk Type PRO Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

10:00am PDT

PRO Session (DataWeek): Data Sovereignty in the Age of AI

Friday September 5, 2025 10:00am - 10:25am PDT

DataWeek -- Main Stage

Michel Tricot, Airbyte, Co-founder and CEO

This session will explore the intersection of data sovereignty and artificial intelligence, addressing how organizations can maintain control of their valuable data assets while still leveraging the power of AI. Drawing from extensive experience building open-source data infrastructure solutions, Michel will illuminate the challenges companies face when integrating AI into their data ecosystems without compromising ownership, security, or compliance requirements.

The session targets data leaders, CDOs, and enterprise architects who are navigating the complex landscape of AI adoption while maintaining strict data governance standards. Michel will share practical frameworks for implementing a self-managed data integration strategy that enables AI innovation while preserving first-party data sovereignty—a crucial consideration as regulatory requirements around data protection continue to evolve globally. Attendees will gain actionable insights on building resilient data architectures that support AI initiatives without surrendering control of sensitive information.

This session aligns perfectly with DataWeek's focus on "Data Engineering & Governance" and "AI & ML" tracks, offering attendees a unique perspective on balancing innovation with control. Conference participants will benefit from Michel's vision of how open-source data integration infrastructure can serve as the foundation for responsible AI development, empowering organizations to build competitive advantages while maintaining complete sovereignty over their data. The presentation will include real-world examples of companies that have successfully implemented these principles.

Speakers

Michel Tricot

Co-founder and CEO, Airbyte

Michel Tricot is co-founder and CEO of Airbyte, the open data movement platform. The company was started in 2020 with a vision of commoditizing data integration pipelines across all industries and organizations and today has more than 170,000 deployments. Michel has been working in...

Talk Type PRO Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Engineering / Architecture and Streaming (DataWeek)
In-Person/Virtual In Person

10:30am PDT

OPEN Session (DataWeek): Ask Us Anything: Building Real-World AI Agents for Business

Friday September 5, 2025 10:30am - 10:55am PDT

DataWeek -- Main Stage

Lee Dickson, raia AI, Director of Sales and Operations
Rich Swier, raia AI, Founder

You’ve seen the hype. Now let’s get practical. In this live AMA-style session, Lee Dickson and Rich Swier — co-hosts of The AI Guys podcast and builders of raia AI — open the floor to the questions business leaders, operators, and tech pros are really asking about AI.

From deployment timelines to hallucination prevention, ethics to employee augmentation — nothing’s off limits. Whether you’re planning your first AI rollout or scaling from one use case to a dozen, this is your chance to get direct insights from AI practitioners who’ve helped companies from $5M startups to $500M enterprises make AI work.

Speakers

Lee Dickson

Director of Sales and Operations, raia AI

Lee Dickson brings 7+ years of experience in productizing AI solutions and predictive analytics, with a strong track record of streamlining operations and improving customer engagement for SMBs and enterprise clients. With an extensive background in technology and SaaS, he's an advocate...

Rich Swier

Founder, raia AI

Rich is a serial entrepreneur based in Sarasota, Florida. For the past 30 years, Rich has built and exited numerous successful tech businesses and continues to launch and incubate new ventures.

Friday September 5, 2025 10:30am - 10:55am PDT
DataWeek -- Main Stage

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Data Tools / Technology and Management (DataWeek), AI Models and Management (DataWeek)
In-Person/Virtual In Person

11:00am PDT

OPEN Session (DataWeek): Data Integrity in the Age of AI: SBOMs, Lineage, and Trust in the Pipeline

Friday September 5, 2025 11:00am - 11:25am PDT

DataWeek -- Main Stage

Saloni Garg, Wayfair, Senior Software Engineer

With AI models consuming more data than ever, ensuring the integrity and traceability of that data is critical. This talk focuses on how to build trust into your data pipelines -- using concepts like SBOMs (Software Bill of Materials) for datasets, audit trails, and metadata tagging to make data consumption safer and more transparent. I’ll also touch on how this ties into emerging compliance frameworks and how we’ve approached this in practice.

Speakers

Saloni Garg

Senior Software Engineer, Wayfair

International Red Hat Women in Open Source Awardee | Mozilla Open Leader 2019 | a strong open source diversity supporter | Google Venkat Scholarship winner | Speaker

Friday September 5, 2025 11:00am - 11:25am PDT
DataWeek -- Main Stage

Talk Type OPEN Session
Tracks OPEN Session, DataWeek, Data Governance and Security (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek)
In-Person/Virtual In Person

11:30am PDT

OPEN Session (DataWeek): Rearchitecting Data Processing for Today’s Demands

Friday September 5, 2025 11:30am - 11:55am PDT

DataWeek -- Main Stage

Rajan Goyal, DataPelago, CEO & Co-founder

IT leaders face mounting pressure to leverage their organization’s data for genAI and lakehouse analytics. Yet, with data volumes doubling every two years and nearly 90% of new data being unstructured, traditional data processing architectures can’t keep up. Existing systems were designed for structured data and CPU-based computing, and businesses are finding they struggle with processing latency, high costs, and siloed data. In fact, more than 80% of IT leaders say that data silos are hindering digital transformation.

To drive value in today’s data environment, IT leaders need to implement new data processing architectures that are designed to handle massive volumes of complex data and are capable of taking advantage of the accelerated hardware (GPUs, FPGAs, CPU/SIMD, etc.) available in today’s cloud environments.

In this session, DataPelago CEO Rajan Goyal will outline the shortcomings of current data processing architectures and introduce attendees to the Universal Data Processing Engine, a software solution that sits between a data lake and query engine. Overcoming the shortcomings of traditional processing models, the UDPE is designed to handle all types of data (structured, unstructured, semi-structured) and work on top of any hardware. Integrating seamlessly into any tech stack, it accelerates data processing speed by 2-3x while cutting processing costs by 30-60% — enabling organizations to utilize all of their data for lakehouse analytics and AI workloads.

Speakers

Rajan Goyal

CEO & Co-founder, DataPelago

Rajan is the co-founder and Chief Executive Officer of DataPelago, the company revolutionizing data processing for the accelerated computing era. His expertise and visionary approach have been instrumental in shaping the future of data infrastructure and processing. A seasoned innovator... Read More →

Friday September 5, 2025 11:30am - 11:55am PDT
DataWeek -- Main Stage

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Data Tools / Technology and Management (DataWeek)
In-Person/Virtual In Person

1:00pm PDT

PRO Session (DataWeek): Compute for Your AI Model: GPUs, LPUs, TPUs and Beyond

Friday September 5, 2025 1:00pm - 1:25pm PDT

DataWeek -- Main Stage

Kushaagra Goyal, Rubrik, Tech Lead

In the rapidly evolving landscape of computing, Graphics Processing Units (GPUs), Language Processing Units (LPUs), and Tensor Processing Units (TPUs) play pivotal roles in accelerating complex tasks, particularly in machine learning and artificial intelligence.

GPUs are renowned for their parallel processing capabilities, making them ideal for rendering graphics and handling large datasets. LPUs are specialized for optimizing natural language processing tasks, enhancing efficiency in understanding and generating human language. TPUs, developed by Google, are tailored specifically for training and inference of machine learning models, offering significant performance advantages for large-scale AI applications.

As we explore these technologies, we'll also look at emerging processing units designed for specific AI use-cases and the future of computational advancements.

Join me to dive into the intricacies of these processing units, their applications, and what lies ahead in the world of computing technology.

Speakers

Kushaagra Goyal

Tech Lead, Rubrik

Kushaagra Goyal is an accomplished technology leader with deep expertise in engineering and AI infrastructure. He holds a Bachelor’s degree from the Indian Institute of Technology, Delhi (2016), and a Master’s degree from Stanford University, where he developed a strong foundation... Read More →

Friday September 5, 2025 1:00pm - 1:25pm PDT
DataWeek -- Main Stage

Talk Type PRO Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Science and Machine Learning (DataWeek)
In-Person/Virtual In Person

1:30pm PDT

PRO Session (DataWeek): Securing Multi-Tenant Data Warehouses: How Federated Learning Revolutionizes Privacy

Friday September 5, 2025 1:30pm - 1:55pm PDT

DataWeek -- Main Stage

Jayant Tyagi, Salesforce, Lead Member of Technical Staff

Data privacy laws and security incidents are forcing enterprises to reconsider how they manage sensitive data in cloud data warehouses. In multi-tenant settings, traditional centralized analytics techniques are increasingly vulnerable, putting businesses at risk of data leaks and regulatory issues.
In this session, we'll look at how federated learning is transforming data warehouse security while preserving analytical capabilities. Drawing on current research in privacy-preserving technologies, attendees will learn how businesses can use secure federated approaches that protect sensitive data while still yielding useful insights.

Speakers

Jayant Tyagi

Lead Member of Technical Staff, Salesforce

Jayant Tyagi is a seasoned full-stack engineer with 13 years of experience in designing and developing high-scale applications. As a Lead Member of Technical Staff at Salesforce, he has played a pivotal role in building and optimizing enterprise applications, spearheading innovations... Read More →

Friday September 5, 2025 1:30pm - 1:55pm PDT
DataWeek -- Main Stage

Talk Type PRO Session
Tracks DataWeek, Data Warehousing and Storage (DataWeek)
In-Person/Virtual In Person

2:00pm PDT

OPEN Session (DataWeek): Transforming Seller Onboarding in Retail: Responsible AI, RAG, and Risk Management

Friday September 5, 2025 2:00pm - 2:25pm PDT

DataWeek -- Main Stage

Banani Mohapatra, Walmart, Senior Manager, Data Science
Bhavnish Walia, Amazon, Senior Risk Manager AI/ML

Onboarding new sellers onto retail platforms like Walmart and Amazon involves a complex, multi-step process designed to mitigate fraud and ensure compliance with global regulations. One of the most critical and cumbersome steps is Know Your Customer (KYC) verification, requiring sellers to upload documentation for identity verification, business registration, and compliance checks. This manual review process often leads to long approval times and delays, frustrating legitimate sellers and creating operational bottlenecks for compliance teams.
To address these challenges, we leveraged foundational models with custom prompting strategies, in-document summarization, and retrieval-augmented generation (RAG) to ground responses in trusted data sources, powered by open-source LLM APIs. By automating document analysis and augmenting human reviewers with AI outputs, we reduced overall onboarding time by more than 20 percent, improving seller experience and operational efficiency.
However, deploying AI into a regulated process like KYC required a robust responsible AI framework combining scalability with governance. We implemented guardrail models to flag edge cases and ensure human oversight, enforced strict data anonymization protocols to protect sensitive information, and applied privacy-preserving techniques for model training. We also established a rigorous validation pipeline to test outputs against regulatory standards, mitigating risks such as hallucinations and interpretability gaps.
This talk offers actionable insights for data scientists, compliance officers, regulators, and machine learning practitioners working at the intersection of AI, risk management, and regulatory compliance. Presented by Bhavnish Walia, Senior Risk Manager at Amazon, and Banani Mohapatra, Senior Data Science Manager at Walmart, attendees will walk away with a practical framework for deploying AI in sensitive domains—covering risk management strategies, scalable AI architectures aligned with compliance, and key lessons on balancing innovation with accountability.

Speakers

Banani Mohapatra

Senior Manager, Data Science, Walmart

Banani Mohapatra is a data science leader with 12+ years of experience in e-commerce, payments, and real estate, specializing in machine learning, generative AI, LLMs, and causal AI. She leads a global data science team at Walmart, driving subscription growth with multi-billion-dollar... Read More →

Bhavnish Walia

Senior Risk Manager AI/ML, Amazon

Bhavnish Walia is a Senior Risk Manager at Amazon, where he leads AI Risk Management efforts focused on developing large language model (LLM) frameworks for data governance and regulatory compliance. He ensures the safe and compliant deployment of AI systems at scale. With over 12... Read More →

Friday September 5, 2025 2:00pm - 2:25pm PDT
DataWeek -- Main Stage

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Data Governance and Security (DataWeek)
In-Person/Virtual In Person

2:30pm PDT

OPEN Session (API): Hackathon Top 5 Demo

Friday September 5, 2025 2:30pm - 2:55pm PDT

DataWeek Expo Stage (OPEN)

Friday September 5, 2025 2:30pm - 2:55pm PDT
DataWeek Expo Stage (OPEN)

Talk Type OPEN Session
Tracks OPEN Session, API World, DataWeek, CloudX
In-Person/Virtual In Person

9:00am PDT

[Virtual] PRO WORKSHOP (DataWeek): Building a RAG System for Video Search and Analysis

Wednesday September 10, 2025 9:00am - 9:50am PDT

VIRTUAL DataWeek -- Main Stage

Elizabeth Fuentes Leone, AWS, Developer Advocate

This talk addresses the challenge of making video content searchable and analyzable using modern AI techniques. While text and image RAG systems are common, video presents unique challenges due to its multimodal nature combining visual frames and audio content.

Speakers

Elizabeth Fuentes Leone

Developer Advocate, AWS

As a Data Analytics and Machine Learning/Artificial Intelligence (ML/AI) Specialist, my mission is to break down complex concepts into easily understandable terms. I strive to develop innovative solutions that tackle real-world challenges effectively. By sharing my knowledge and experience... Read More →

Wednesday September 10, 2025 9:00am - 9:50am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Science and Machine Learning (DataWeek), Data Engineering / Architecture and Streaming (DataWeek), Virtual
In-Person/Virtual Virtual

9:00am PDT

[Virtual] PRO WORKSHOP (DataWeek): Democratizing AI: Ensuring Security, Governance, and Compliance in AI Development

Wednesday September 10, 2025 9:00am - 9:50am PDT

VIRTUAL DataWeek -- Workshop Stage C (PRO)

David Adeleke, ZeeH Technologies, CEO

In this session, we will explore how the democratization of AI is not only driving innovation but also raising important questions around security, governance, and compliance. As organizations increasingly adopt AI technologies, it becomes imperative to establish robust frameworks for ensuring the security and ethical use of AI systems. From data privacy concerns to algorithmic bias, we will examine the key challenges facing AI practitioners and discuss strategies for mitigating risks and fostering responsible AI development. Through insightful discussions and practical examples, attendees will gain a deeper understanding of the intersection between AI democratization and security, governance, and compliance considerations. Join us as we navigate the complex landscape of AI ethics and governance and chart a path towards building trust in AI-powered solutions.

Speakers

David Adeleke

CEO, ZeeH Technologies

David Adeleke Olaoluwa is a serial entrepreneur, investor, pioneer and change agent driven by a mission to raise the standard of living of people through business and wealth creation as a catalyst for positive change globally. David is a leading investment professional and executive... Read More →

Wednesday September 10, 2025 9:00am - 9:50am PDT
VIRTUAL DataWeek -- Workshop Stage C (PRO)

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Governance and Security (DataWeek), Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

10:00am PDT

[Virtual] PRO WORKSHOP (DataWeek): Securing the Data Layer: Privacy, Compliance, and Trust in AI-Driven Pipelines

Wednesday September 10, 2025 10:00am - 10:50am PDT

VIRTUAL DataWeek -- Main Stage

Advait Patel, Broadcom, Senior Site Reliability Engineer

As organizations scale AI-driven applications and cloud-native pipelines, data privacy and compliance have become top concerns, not just for legal teams, but for developers and data engineers too. With regulations like GDPR, HIPAA, and the AI Act taking shape, teams need to rethink how data flows through their ML lifecycle.

In this session, Advait Patel, a cloud security engineer and contributor to the Cloud Security Alliance’s AI Control Matrix, will explore how to architect secure and compliant data pipelines for AI and analytics workloads. From ingestion and transformation to model training and inference, every layer presents unique risks, and unique opportunities to build trust.

Speakers

Advait Patel

Senior Site Reliability Engineer, Broadcom

Advait Patel is a skilled Senior Site Reliability Engineer based in Chicago, with a passion for leveraging technology to drive impactful solutions. With extensive experience in Cloud Computing, Cloud Security, and Cybersecurity, he currently works at Broadcom, where he plays a key... Read More →

Wednesday September 10, 2025 10:00am - 10:50am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Governance and Security (DataWeek), Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

11:00am PDT

[Virtual] PRO WORKSHOP (DataWeek): Building New Cost-Effective Analytics Platform with Open Source tools for Fanatics

Wednesday September 10, 2025 11:00am - 11:50am PDT

VIRTUAL DataWeek -- Main Stage

Bhanu Cherukumille, Fanatics, Director - Data Engineering

At Fanatics, we previously relied on a mix of in-house, open-source, and commercial tools for Analytics and BI. To streamline and modernize our data ecosystem, we built a unified platform — a "one to rule them all" solution — powered entirely by an open-source stack: Kafka, StarRocks, and Superset. The result? A cost-effective, high-performance, and feature-rich platform that unlocks powerful new capabilities for the business.

Speakers

Bhanu Cherukumille

Director - Data Engineering, Fanatics

Specializes in building real-time and self-serve analytics solutions in the cloud, empowering teams with agile, data-driven insights and transforming complex data into actionable intelligence for innovative organizations.

Wednesday September 10, 2025 11:00am - 11:50am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Tools / Technology and Management (DataWeek), Data Engineering / Architecture and Streaming (DataWeek), Virtual
In-Person/Virtual Virtual

12:00pm PDT

[Virtual] PRO WORKSHOP (DataWeek): AI-Driven Data Optimization: Smart Strategies for Scalability and Performance

Wednesday September 10, 2025 12:00pm - 12:50pm PDT

VIRTUAL DataWeek -- Main Stage

Indu Chaube, Cisco Systems, Senior Software Engineer

In my session, 'AI-Driven Data Optimization: Smart Strategies for Scalability and Performance,' I will examine the transformative role of artificial intelligence in modern data management. This session will delve into how AI-powered techniques streamline data ingestion, automate preprocessing workflows, enhance storage efficiency, and enable real-time analytics for intelligent decision-making. Through advanced optimization strategies, AI-driven solutions significantly reduce latency, improve resource allocation, and ensure seamless scalability in high-volume data ecosystems. Attendees will gain a deep understanding of AI’s impact on data governance, predictive analytics, and automation, learning how to implement robust end-to-end pipelines that drive operational efficiency and business intelligence.

Speakers

Indu Chaube

Senior Software Engineer, Cisco Systems

Indu Chaube is a highly accomplished software architect and visionary leader with more than a decade-long track record in software product design and development, encompassing User Interface, User Experience, and web API domains. Collaborating with industry giants like Cisco and SAMSUNG... Read More →

Wednesday September 10, 2025 12:00pm - 12:50pm PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

1:00pm PDT

[Virtual] PRO WORKSHOP (DataWeek): Leveraging Open Source AI Securely and Privately

Wednesday September 10, 2025 1:00pm - 1:50pm PDT

VIRTUAL DataWeek -- Main Stage

JJ Asghar, IBM, Developer Advocate

Honestly, you're probably jealous of those people saying "ChatGPT" is making their lives easier. You may think that leveraging AI is this generation's adoption of the calculator. I'm here to say yes, you are right, but let's be honest: we have no idea how the ChatGPTs of the world are trained or if they are secure.
If you run a technology company, your data is your secret sauce; if your boss found out you were leveraging ChatGPT to do your job, would they be happy? Probably not.
In this presentation, we will explore how LLMs work locally through hands-on activities, such as setting up some local open-source private and free LLMs to help you get closer to the promise of generative AI.
Overall, this presentation will start with some basic installation leveraging Ollama and VS Code or any of the JetBrains platforms (and you'll learn what that means and what the differences are!), then pull in AnythingLLM and/or OpenWebUI to help give you a straightforward chat-like interface to your LLM. We will provide you with the knowledge, tools, some generic prompts, and use cases to gain more confidence in your daily life by leveraging LLMs.
You'll walk out of the workshop with a working LLM that you can use without fees, all locally and securely.

Speakers

JJ Asghar

Developer Advocate, IBM

JJ works as a Developer Advocate representing IBM worldwide. He mainly focuses on open-source AI and OpenShift, trying to help companies and users successfully onboard to the Cloud-Native ecosystem. He’s also known in the DevOps tooling ecosystem and generalized Linux communities... Read More →

Wednesday September 10, 2025 1:00pm - 1:50pm PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, AI Models and Management (DataWeek), Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

2:00pm PDT

[Virtual] PRO WORKSHOP (DataWeek): From Data to Insights in Minutes: Accelerating Business Growth with AutoML and LLM-Powered Data Prep

Wednesday September 10, 2025 2:00pm - 2:50pm PDT

VIRTUAL DataWeek -- Main Stage

Shailaja Sampat, Fujitsu Research of America, Senior Researcher

- Are you tired of manually processing large datasets in spreadsheets?
- Have you considered using Machine Learning (ML) to automate your tasks but feel limited by your coding skills or time constraints?
- Looking for ways to reduce the time spent in making your data AI-ready, despite your ML background?

If these challenges sound familiar, this session is for you!

Traditionally, building predictive models has required specialized coding and statistical expertise. Moreover, data professionals often dedicate over 80% of their time to the labor-intensive data-wrangling process of preparing raw data for consumption by automated machine learning (AutoML) tools. To overcome these hurdles, we introduce AutoDW, an innovative data wrangler that utilizes the power of Large Language Models (LLMs) and sophisticated automation to facilitate the seamless preparation of AI-ready data. This session will demonstrate how the synergistic application of AutoDW and AutoML empowers users to rapidly develop predictive prototypes for their business use cases end-to-end without writing any code. Through step-by-step instruction and a live demonstration, attendees will witness AutoDW's intelligent data processing, observe AutoML's autonomous algorithm selection tailored to specific applications, and, most importantly, gain a comprehensive understanding of how to interpret the resulting predictive outputs.

#AutoML #DataWrangling #NoCodeML #FastMLPrototyping #DataDrivenDevelopment

Speakers

Shailaja Sampat

Senior Researcher, Fujitsu Research of America

Shailaja Sampat is a senior researcher in the AI lab at Fujitsu's research division in the USA. She earned her Ph.D. from Arizona State University, with a thesis focusing on the intersection of natural language processing and computer vision. Her current research spans autoML, data... Read More →

Wednesday September 10, 2025 2:00pm - 2:50pm PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Workshop Day (Wed)
Tracks DataWeek, Data Science and Machine Learning (DataWeek), Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

9:30am PDT

[Virtual] PRO Session (DataWeek): Intelligent Automation of Data Engineering Workflows with LLMs

Thursday September 11, 2025 9:30am - 9:55am PDT

VIRTUAL DataWeek -- Main Stage

Manohar Sai Jasti, Workday, Analytics Engineer

I will share how I developed an AI-driven system to transform raw SQL into production-ready dbt models using Large Language Models (LLMs). By combining retrieval-augmented generation techniques with dbt’s semantic framework, I automated SQL refactoring, modularization, testing, and documentation. This approach accelerates data engineering workflows, reduces manual effort, and enables scalable, production-ready analytics pipelines. I will walk through the architecture, challenges faced during scaling, validation strategies for AI-generated SQL, and key lessons learned from deploying this solution in real-world environments. Attendees will gain practical insights into applying LLMs for data workflow automation, improving pipeline quality, and driving faster AI productionization across modern data stacks.

Speakers

Manohar Sai Jasti

Analytics Engineer, Workday

Manohar Sai Jasti is an experienced Analytics Engineer specializing in building efficient and scalable data pipelines. With expertise in tools like dbt, Trino, and cloud platforms, he helps organizations turn data into actionable insights. Manohar is passionate about simplifying data... Read More →

Thursday September 11, 2025 9:30am - 9:55am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Session
Tracks DataWeek, Data Engineering / Architecture and Streaming (DataWeek), Virtual
In-Person/Virtual Virtual

10:00am PDT

[Virtual] PRO Session (DataWeek): Generative AI Operation Evaluation Framework

Thursday September 11, 2025 10:00am - 10:25am PDT

VIRTUAL DataWeek -- Main Stage

Cigil Achenkunju, LivePerson, Data and Product Management

How do we know if this gen AI investment is moving the needle? It is a question heard almost daily across finance, healthcare, and retail. And honestly, it is the right question to ask. On top of that, should we continue to invest in AI at the same rate or optimize? How can a stakeholder show that solution usage has a positive or maybe negative impact on their operation?
Survey results indicate that up to 85% of AI initiatives eventually fail to deliver on their promises. Organizations using gen AI want to understand the impact of such solutions clearly. Can you blame them? So, let’s define a strategic decision-making framework that broadly answers these business questions in an operational setting that balances the benefits of business value and AI integration.

An analytical framework for operations measurement, 2S/2E: think of it as four pillars used together to tell a complete story of your gen AI-enabled operation's health. Each pillar reveals a different facet of your performance, and I'll show you exactly how to measure each one. Together, these pillars offer valuable insight into your operations. What makes this framework powerful is its systematic approach and adaptability.

Speakers

Cigil Achenkunju

Data and Product Management, LivePerson

A leader in advanced data analytics and a strategic advisor, Cigil has a robust background in data science and product management. With extensive experience across various organizations, Cigil has helped companies transform data into actionable insights that drive business success... Read More →

Thursday September 11, 2025 10:00am - 10:25am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Session
Tracks DataWeek, Data Governance and Security (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek), Virtual
In-Person/Virtual Virtual

10:30am PDT

[Virtual] OPEN Session (DataWeek): Hiring for AI Success: Why Your First Hire Should Be a Data Engineer

Thursday September 11, 2025 10:30am - 10:55am PDT

VIRTUAL DataWeek -- Main Stage

Brenna Buuck, MinIO, Developer Evangelist

AI initiatives are at the top of every organization’s priority list, yet many fail before they even begin—not because of poor models, but because of poor data foundations. While hiring an AI/ML engineer may seem like the logical first step, success depends on a different approach: hiring a data engineer first.

In this session, I'll explore why data infrastructure is the true bottleneck in AI adoption and how the right data engineering expertise ensures AI models perform at scale. Drawing on real-world experience, I’ll walk through the hiring missteps organizations often make and how to avoid costly mistakes when building AI initiatives from the ground up.

Speakers

Brenna Buuck

Developer Evangelist, MinIO

Brenna Buuck is the subject matter expert at MinIO for databases and datalakes. A data engineer turned developer evangelist, she is passionate about coding, data, and learning. She endeavors to inspire and educate other developers about the latest tools and technologies with the goal... Read More →

Thursday September 11, 2025 10:30am - 10:55am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type OPEN Session
Tracks DataWeek, Virtual
In-Person/Virtual Virtual

11:30am PDT

[Virtual] OPEN Session (DataWeek): Strategies for Image Dataset Curation from High-Volume Industrial IoT data

Thursday September 11, 2025 11:30am - 11:55am PDT

VIRTUAL DataWeek -- Main Stage

Apurva Godghase, Brambles, Senior Computer Vision Engineer

In industrial IoT for supply chain and logistics, massive amounts of data are generated by edge devices that capture data continuously. For embedded vision systems, managing the sheer volume of images and metadata can be challenging. Selecting a diverse subset of high-quality data is crucial for effective modeling and analysis. This work outlines a comprehensive method for selecting relevant images from an extensive dataset to build a high-quality image database for building and monitoring computer vision and machine learning models. This systematic approach not only enhances the efficiency of data management in industrial IoT applications but also improves the generalizability and accuracy of computer vision models.

Speakers

Apurva Godghase

Senior Computer Vision Engineer, Brambles

Apurva is a Senior Computer Vision Engineer at Brambles, with over seven years of R&D experience across diverse industrial domains. At Brambles, she specializes in designing and deploying cutting-edge machine learning and computer vision IoT prototypes to enhance supply chain efficiencies... Read More →

Thursday September 11, 2025 11:30am - 11:55am PDT
VIRTUAL DataWeek -- Main Stage

Talk Type OPEN Session
Tracks OPEN Session, DataWeek, Data Science and Machine Learning (DataWeek), Virtual
In-Person/Virtual Virtual

1:00pm PDT

[Virtual] PRO Session (DataWeek): AI-Driven Innovation: Scalable Data Architectures

Thursday September 11, 2025 1:00pm - 1:25pm PDT

VIRTUAL DataWeek -- Main Stage

Pritam Roy, Capgemini, Sr. Manager

As enterprises embrace AI for scalable automation, predictive analytics, and real-time decision intelligence, the need for robust data architectures and machine learning frameworks has never been greater. This session, led by Pritam Roy, a seasoned AI and data engineering leader, will explore how to design and implement scalable AI-powered data solutions that optimize business operations, cloud efficiency, and enterprise intelligence.

Speakers

Pritam Roy

Sr. Manager, Capgemini

Pritam Roy is a seasoned AI and data engineering leader, specializing in enterprise-scale AI solutions, cloud computing, and machine learning-driven business transformation. With over 20 years of experience, he has played a pivotal role in AI innovation, predictive analytics, and... Read More →

Thursday September 11, 2025 1:00pm - 1:25pm PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Session
Tracks DataWeek, Virtual
In-Person/Virtual Virtual

1:30pm PDT

[Virtual] PRO Session (DataWeek): Integrating Data Governance into Cyber Risk Management

Thursday September 11, 2025 1:30pm - 1:55pm PDT

VIRTUAL DataWeek -- Main Stage

Nandini Singh, Google, Sr. TPM

This session is designed for cybersecurity professionals, data governance leaders, and IT managers seeking to strengthen their organization's cybersecurity posture through effective data governance practices. Attendees will leave with actionable insights and strategies to enhance their organization's resilience against cyber threats.

Drawing upon my experience of working at the Office of Cybersecurity Resilience at Google, I will share lessons learned from integrating data governance into cyber risk management, with a focus on evaluating metric quality levels (introducing the concept of Metric Bill of Materials) and developing a continuous improvement and adaptation roadmap.

Speakers

Nandini Singh

Sr. TPM, Google

Nandini Singh is a seasoned professional in the fields of data modeling, analytics, and cybersecurity technologies, with a robust career that spans over a decade. She currently serves as a Senior Technical Program Manager at Google, where she leads initiatives on product, platform... Read More →

Thursday September 11, 2025 1:30pm - 1:55pm PDT
VIRTUAL DataWeek -- Main Stage

Talk Type PRO Session
Tracks DataWeek, Data Governance and Security (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek), Virtual
In-Person/Virtual Virtual

2:30pm PDT

[Virtual] KEYNOTE (DataWeek): Informatica -- Mastering Enterprise AI Agents: Best Practices for Multi-Agent Frameworks

Thursday September 11, 2025 2:30pm - 2:55pm PDT

VIRTUAL DataWeek -- Main Stage

Sumeet Kumar Agrawal, Informatica, Vice President Product Management

Explore the design and deployment of enterprise-grade AI agent systems. This session covers best practices for building scalable multi-agent frameworks, ensuring security and governance, and features real-world examples from industries like customer service and supply chain management. Learn how to integrate AI agents effectively while addressing key enterprise challenges.

Speakers

Sumeet Kumar Agrawal

Vice President Product Management, Informatica

Sumeet Agrawal is the Vice President of Products at Informatica, where he spearheads product management of innovative cloud-based technology products. With over 15 years of experience in data engineering and product management, Sumeet has a proven track record of driving innovative solutions...

5. KEYNOTES & FEATURED

Talk Type OPEN Session
Tracks OPEN Session, DataWeek, Data Science and Machine Learning (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek), Virtual
In-Person/Virtual Virtual

3:00pm PDT

[Virtual] OPEN Session (DataWeek): AI Leadership in Data Strategy: Transforming Large-Scale Data Systems for Business Growth

Thursday September 11, 2025 3:00pm - 3:25pm PDT

VIRTUAL DataWeek -- Main Stage

Vijay Panwar, Panasonic Avionics Corporation, Senior Software Engineer

As organizations progressively depend on data to foster innovation, the significance of leadership in shaping and executing AI-driven strategies becomes crucial. In this session, I will present insights gained from over 12 years of experience spearheading transformative initiatives incorporating AI into extensive data systems. The discussion will emphasize strategic frameworks for the adoption of AI, the alignment of technological advancements with business goals, and the development of scalable data ecosystems. By referencing real-world examples, including my involvement in managing and optimizing terabyte-scale data, I will demonstrate how AI can transform backend systems, enhance workflows, and provide tangible value.

Speakers

Vijay Panwar

Senior Software Engineer, Panasonic Avionics Corporation

I am an accomplished IT professional with a decade of experience and expertise in a wide array of technologies, including Python, SQL Server, MySQL, PHP, Web services, REST API, and more. I have a proven track record of contributing to the field, having published two research...

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Virtual
In-Person/Virtual Virtual

3:30pm PDT

[Virtual] OPEN Session (DataWeek): Balancing Velocity with Academic Rigor When Building with LLMs

Thursday September 11, 2025 3:30pm - 3:55pm PDT

VIRTUAL DataWeek -- Main Stage

Lauren Peate, Multitudes, CEO & founder

We’re all building AI features now. But building with LLMs brings its own challenges – namely: how can we use cutting-edge practices, weave in AI ethics, and consider the cost of different models without blowing past delivery dates? Not to mention making sure that the features we build will be stable, reliable, and maintainable in the future.

We recently built our first LLM feature, to show the quality of feedback given in code reviews. In 1 month, we did a literature review, consultation with academic experts, data labelling, model experimentation, a cost assessment, and finally, all the ML engineering to launch it into production. The outcome: <1% extreme misclassification and zero hallucinations. In this talk, we’ll share our approach to building LLM features – how we partnered with academia (without being delayed by their timelines), what tooling we used, and how we made the cost and money tradeoffs to keep business stakeholders happy. I’ll also speak to how we built this into our microservices architecture, including how we used tools to generate structured outputs from LLMs on top of AWS’s Bedrock API to have parseable responses from a range of models.

You'll walk away with practical strategies for leading your own teams through AI implementations, identifying ethical issues early, addressing them efficiently, and still delivering on time and on budget.

Speakers

Lauren Peate

CEO & founder, Multitudes

Lauren Peate is the CEO and founder of Multitudes, which helps engineering teams improve delivery sustainably. She’s focused her career on using data to support people, including as the founder of Ally Skills NZ, a consultancy helping global tech companies improve team performance...

Talk Type OPEN Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Science and Machine Learning (DataWeek), Virtual
In-Person/Virtual Virtual

9:30am PDT

[Virtual] PRO Session (DataWeek): The Rise of Agentic Commerce: Where AI Intelligence Meets Infinite Retail Possibility

Friday September 12, 2025 9:30am - 9:55am PDT

VIRTUAL DataWeek -- Main Stage

Aswini Atibudhi, Walmart, Driving Innovation with Generative AI

In this visionary session, we’ll explore how Agentic AI is not just enhancing retail — it’s fundamentally reinventing it. As commerce shifts from static transactions to dynamic, intelligent interactions, Agentic AI emerges as the architect of a new era: Agentic Commerce. Learn how AI-powered agents, composable architectures, and autonomous decision-making systems are creating infinitely adaptable, customer-first retail ecosystems. We’ll dive into real-world examples, transformative architectures, and the strategic shifts needed to thrive in an AI-first retail future.
Join us to uncover how businesses can unlock limitless innovation, personalized experiences, and operational agility by embracing the rise of Agentic Commerce.

Speakers

Aswini Atibudhi

Distinguished Architect for Customer Space, Walmart

Aswini is the Distinguished Architect for Customer Space at Walmart. He has more than 20 years of IT experience in the design and development of scalable microservice- and microfrontend-based Web/Cloud/AI & ML applications. His portfolio spans multiple domains such as Finance...

Talk Type PRO Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

10:00am PDT

[Virtual] PRO Session (DataWeek): Data Sovereignty in the Age of AI

Friday September 12, 2025 10:00am - 10:25am PDT

VIRTUAL DataWeek -- Main Stage

Michel Tricot, Airbyte, Co-founder and CEO

This session will explore the intersection of data sovereignty and artificial intelligence, addressing how organizations can maintain control of their valuable data assets while still leveraging the power of AI. Drawing from extensive experience building open-source data infrastructure solutions, Michel will illuminate the challenges companies face when integrating AI into their data ecosystems without compromising ownership, security, or compliance requirements.

The session targets data leaders, CDOs, and enterprise architects who are navigating the complex landscape of AI adoption while maintaining strict data governance standards. Michel will share practical frameworks for implementing a self-managed data integration strategy that enables AI innovation while preserving first-party data sovereignty—a crucial consideration as regulatory requirements around data protection continue to evolve globally. Attendees will gain actionable insights on building resilient data architectures that support AI initiatives without surrendering control of sensitive information.

This session aligns with DataWeek's focus on "Data Engineering & Governance" and "AI & ML" tracks, offering attendees a unique perspective on balancing innovation with control. Conference participants will benefit from Michel's vision of how open-source data integration infrastructure can serve as the foundation for responsible AI development, empowering organizations to build competitive advantages while maintaining complete sovereignty over their data. The presentation will include real-world examples of companies that have successfully implemented these principles.

Speakers

Michel Tricot

Co-founder and CEO, Airbyte

Michel Tricot is co-founder and CEO of Airbyte, the open data movement platform. The company was started in 2020 with a vision of commoditizing data integration pipelines across all industries and organizations and today has more than 170,000 deployments. Michel has been working in...

Talk Type PRO Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Engineering / Architecture and Streaming (DataWeek), Virtual
In-Person/Virtual Virtual

10:30am PDT

[Virtual] OPEN Session (DataWeek): Ask Us Anything: Building Real-World AI Agents for Business

Friday September 12, 2025 10:30am - 10:55am PDT

VIRTUAL DataWeek -- Main Stage

Lee Dickson, raia AI, Director of Sales and Operations
Rich Swier, raia AI, Founder

You’ve seen the hype. Now let’s get practical. In this live AMA-style session, Lee Dickson and Rich Swier — co-hosts of The AI Guys podcast and builders of raia AI — open the floor to the questions business leaders, operators, and tech pros are really asking about AI.

From deployment timelines to hallucination prevention, ethics to employee augmentation — nothing’s off limits. Whether you’re planning your first AI rollout or scaling from one use case to a dozen, this is your chance to get direct insights from AI practitioners who’ve helped companies from $5M startups to $500M enterprises make AI work.

Speakers

Lee Dickson

Director of Sales and Operations, raia AI

Lee Dickson brings 7+ years of experience in productizing AI solutions and predictive analytics, with a strong track record of streamlining operations and improving customer engagement for SMBs and enterprise clients. With an extensive background in technology and SaaS, he's an advocate...

Rich Swier

Founder, raia AI

Rich is a serial entrepreneur based in Sarasota, Florida. For the past 30 years, Rich has built and exited numerous successful tech businesses and continues to launch and incubate new ventures.

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Data Tools / Technology and Management (DataWeek), AI Models and Management (DataWeek), Virtual
In-Person/Virtual Virtual

11:00am PDT

[Virtual] OPEN Session (DataWeek): Data Integrity in the Age of AI: SBOMs, Lineage, and Trust in the Pipeline

Friday September 12, 2025 11:00am - 11:25am PDT

VIRTUAL DataWeek -- Main Stage

Saloni Garg, Wayfair, Senior Software Engineer

With AI models consuming more data than ever, ensuring the integrity and traceability of that data is critical. This talk focuses on how to build trust into your data pipelines -- using concepts like SBOMs (Software Bills of Materials) for datasets, audit trails, and metadata tagging to make data consumption safer and more transparent. I’ll also touch on how this ties into emerging compliance frameworks and how we’ve approached this in practice.

Speakers

Saloni Garg

Senior Software Engineer, Wayfair

International Red Hat Women in Open Source Awardee | Mozilla Open Leader 2019 | a strong open source diversity supporter | Google Venkat Scholarship winner | Speaker

Talk Type OPEN Session
Tracks OPEN Session, DataWeek, Data Governance and Security (DataWeek), Data Strategy / Analytics & Business Intelligence (DataWeek), Virtual
In-Person/Virtual Virtual

11:30am PDT

[Virtual] OPEN Session (DataWeek): Rearchitecting Data Processing for Today’s Demands

Friday September 12, 2025 11:30am - 11:55am PDT

VIRTUAL DataWeek -- Main Stage

Rajan Goyal, DataPelago, CEO & Co-founder

IT leaders face mounting pressure to leverage their organization’s data for genAI and lakehouse analytics. Yet, with data volumes doubling every two years and nearly 90% of new data being unstructured, traditional data processing architectures can’t keep up. Existing systems were designed for structured data and CPU-based computing, and businesses are finding they struggle with processing latency, high costs, and siloed data. In fact, more than 80% of IT leaders say that data silos are hindering digital transformation.

To drive value in today’s data environment, IT leaders need to implement new data processing architectures that are designed to handle massive volumes of complex data and are capable of taking advantage of the accelerated hardware (GPUs, FPGAs, CPU/SIMD, etc.) available in today’s cloud environments.

In this session, DataPelago CEO Rajan Goyal will outline the shortcomings of current data processing architectures and introduce attendees to the Universal Data Processing Engine, a software solution that sits between a data lake and query engine. Overcoming the shortcomings of traditional processing models, the UDPE is designed to handle all types of data (structured, unstructured, semi-structured) and work on top of any hardware. Integrating seamlessly into any tech stack, it accelerates data processing speed by 2-3x while cutting processing costs by 30-60% — enabling organizations to utilize all of their data for lakehouse analytics and AI workloads.

Speakers

Rajan Goyal

CEO & Co-founder, DataPelago

Rajan is the co-founder and Chief Executive Officer of DataPelago, the company revolutionizing data processing for the accelerated computing era. His expertise and visionary approach have been instrumental in shaping the future of data infrastructure and processing. A seasoned innovator...

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Data Tools / Technology and Management (DataWeek), Virtual
In-Person/Virtual Virtual

1:00pm PDT

[Virtual] PRO Session (DataWeek): Compute for Your AI Model: GPUs, LPUs, TPUs and Beyond...

Friday September 12, 2025 1:00pm - 1:25pm PDT

VIRTUAL DataWeek -- Main Stage

Kushaagra Goyal, Rubrik, Tech Lead

In the rapidly evolving landscape of computing, Graphics Processing Units (GPUs), Language Processing Units (LPUs), and Tensor Processing Units (TPUs) play pivotal roles in accelerating complex tasks, particularly in machine learning and artificial intelligence.

GPUs are renowned for their parallel processing capabilities, making them ideal for rendering graphics and handling large datasets. LPUs are specialized for optimizing natural language processing tasks, enhancing efficiency in understanding and generating human language. TPUs, developed by Google, are tailored specifically for training and inference of machine learning models, offering significant performance advantages for large-scale AI applications.

As we explore these technologies, we'll also look at emerging processing units designed for specific AI use-cases and the future of computational advancements.

Join me to dive into the intricacies of these processing units, their applications, and what lies ahead in the world of computing technology.

Speakers

Kushaagra Goyal

Tech Lead, Rubrik

Kushaagra Goyal is an accomplished technology leader with deep expertise in engineering and AI infrastructure. He holds a Bachelor’s degree from the Indian Institute of Technology, Delhi (2016), and a Master’s degree from Stanford University, where he developed a strong foundation...

Talk Type PRO Session
Tracks DataWeek, AI Models and Management (DataWeek), Data Science and Machine Learning (DataWeek), Virtual
In-Person/Virtual Virtual

1:30pm PDT

[Virtual] PRO Session (DataWeek): Securing Multi-Tenant Data Warehouses: How Federated Learning Revolutionizes Privacy

Friday September 12, 2025 1:30pm - 1:55pm PDT

VIRTUAL DataWeek -- Main Stage

Jayant Tyagi, Salesforce, Lead Member of Technical Staff

Enterprises are being forced to reconsider how they manage sensitive data in cloud data warehouses due to data privacy laws and security incidents. In multi-tenant settings, traditional centralized analytics techniques are increasingly vulnerable, putting businesses at risk of data leaks and regulatory issues.
In this session, we'll look at how federated learning is transforming data warehouse security while preserving analytical capabilities. Drawing on current research in privacy-preserving technologies, attendees will learn how businesses can use secure federated approaches that protect sensitive data while still yielding useful insights.

Speakers

Jayant Tyagi

Lead Member of Technical Staff, Salesforce

Jayant Tyagi is a seasoned full-stack engineer with 13 years of experience in designing and developing high-scale applications. As a Lead Member of Technical Staff at Salesforce, he has played a pivotal role in building and optimizing enterprise applications, spearheading innovations...

Talk Type PRO Session
Tracks DataWeek, Data Warehousing and Storage (DataWeek), Virtual
In-Person/Virtual Virtual

2:00pm PDT

[Virtual] OPEN Session (DataWeek): Transforming Seller Onboarding in Retail: Responsible AI, RAG, and Risk Management

Friday September 12, 2025 2:00pm - 2:25pm PDT

VIRTUAL DataWeek -- Main Stage

Banani Mohapatra, Walmart, Senior Manager, Data Science
Bhavnish Walia, Amazon, Senior Risk Manager AI/ML

Onboarding new sellers onto retail platforms like Walmart and Amazon involves a complex, multi-step process designed to mitigate fraud and ensure compliance with global regulations. One of the most critical and cumbersome steps is Know Your Customer (KYC) verification, requiring sellers to upload documentation for identity verification, business registration, and compliance checks. This manual review process often leads to long approval times and delays, frustrating legitimate sellers and creating operational bottlenecks for compliance teams.
To address these challenges, we leveraged foundation models with custom prompting strategies, in-document summarization, and retrieval-augmented generation (RAG) to ground responses in trusted data sources, powered by open-source LLM APIs. By automating document analysis and augmenting human reviewers with AI outputs, we reduced overall onboarding time by more than 20 percent, improving seller experience and operational efficiency.
However, deploying AI into a regulated process like KYC required a robust responsible AI framework combining scalability with governance. We implemented guardrail models to flag edge cases and ensure human oversight, enforced strict data anonymization protocols to protect sensitive information, and applied privacy-preserving techniques for model training. We also established a rigorous validation pipeline to test outputs against regulatory standards, mitigating risks such as hallucinations and interpretability gaps.
This talk offers actionable insights for data scientists, compliance officers, regulators, and machine learning practitioners working at the intersection of AI, risk management, and regulatory compliance. Presented by Bhavnish Walia, Senior Risk Manager at Amazon, and Banani Mohapatra, Senior Data Science Manager at Walmart, attendees will walk away with a practical framework for deploying AI in sensitive domains—covering risk management strategies, scalable AI architectures aligned with compliance, and key lessons on balancing innovation with accountability.

Speakers

Bhavnish Walia

Senior Risk Manager AI/ML, Amazon

Bhavnish Walia is a Senior Risk Manager at Amazon, where he leads AI Risk Management efforts focused on developing large language model (LLM) frameworks for data governance and regulatory compliance. He ensures the safe and compliant deployment of AI systems at scale. With over 12...

Banani Mohapatra

Senior Manager, Data Science, Walmart

Banani Mohapatra is a data science leader with 12+ years of experience in e-commerce, payments, and real estate, specializing in machine learning, generative AI, LLMs, and causal AI. She leads a global data science team at Walmart, driving subscription growth with multi-billion-dollar...

Talk Type OPEN Session
Tracks DataWeek, OPEN Session, Data Governance and Security (DataWeek), Virtual
In-Person/Virtual Virtual