Subject: Data Science and Machine Learning (DataWeek)
Wednesday, September 3
 

12:00pm PDT

PRO WORKSHOP (DataWeek): Building a RAG System for Video Search and Analysis
Wednesday September 3, 2025 12:00pm - 12:50pm PDT
Elizabeth Fuentes Leone, AWS, Developer Advocate

This talk addresses the challenge of making video content searchable and analyzable using modern AI techniques. While text and image RAG systems are common, video presents unique challenges due to its multimodal nature combining visual frames and audio content. 
Speakers
Elizabeth Fuentes Leone

Developer Advocate, AWS
As a Data Analytics and Machine Learning/Artificial Intelligence (ML/AI) Specialist, my mission is to break down complex concepts into easily understandable terms. I strive to develop innovative solutions that tackle real-world challenges effectively. By sharing my knowledge and experience...
Wednesday September 3, 2025 12:00pm - 12:50pm PDT
DataWeek -- Main Stage

5:00pm PDT

PRO WORKSHOP (DataWeek): From Data to Insights in Minutes: Accelerating Business Growth with AutoML and LLM-Powered Data Prep
Wednesday September 3, 2025 5:00pm - 5:50pm PDT
Shailaja Sampat, Fujitsu Research of America, Senior Researcher

- Are you tired of manually processing large datasets in spreadsheets?

- Have you considered using Machine Learning (ML) to automate your tasks but feel limited by your coding skills or time constraints?

- Are you looking for ways to reduce the time spent making your data AI-ready, despite your ML background?

If these challenges sound familiar, this session is for you!

Traditionally, building predictive models has required specialized coding and statistical expertise. Moreover, data professionals often dedicate over 80% of their time to the labor-intensive data-wrangling process of preparing raw data for consumption by automated machine learning (AutoML) tools. To overcome these hurdles, we introduce AutoDW, an innovative data wrangler that utilizes the power of Large Language Models (LLMs) and sophisticated automation to facilitate the seamless preparation of AI-ready data. This session will demonstrate how the synergistic application of AutoDW and AutoML empowers users to rapidly develop predictive prototypes for their business use cases end-to-end without writing any code. Through step-by-step instruction and a live demonstration, attendees will witness AutoDW's intelligent data processing, observe AutoML's autonomous algorithm selection tailored to specific applications, and, most importantly, gain a comprehensive understanding of how to interpret the resulting predictive outputs.

#AutoML #DataWrangling #NoCodeML #FastMLPrototyping #DataDrivenDevelopment
Speakers
Shailaja Sampat

Senior Researcher, Fujitsu Research of America
Shailaja Sampat is a senior researcher in the AI lab at Fujitsu's research division in the USA. She earned her Ph.D. from Arizona State University, with a thesis focusing on the intersection of natural language processing and computer vision. Her current research spans autoML, data...
Wednesday September 3, 2025 5:00pm - 5:50pm PDT
DataWeek -- Main Stage
 
Thursday, September 4
 

11:30am PDT

OPEN Session (DataWeek): Strategies for Image Dataset Curation from High-Volume Industrial IoT data
Thursday September 4, 2025 11:30am - 11:55am PDT
Apurva Godghase, Brambles, Senior Computer Vision Engineer

In Industrial IoT for supply chain and logistics, massive amounts of data are generated by edge devices that capture data continuously. For embedded vision systems, managing the sheer volume of images and metadata can be challenging. Selecting a diverse subset of high-quality data is crucial for effective modeling and analysis. This work outlines a comprehensive method for selecting relevant images from an extensive dataset to build a high-quality image database for building and monitoring computer vision and machine learning models. This systematic approach not only enhances the efficiency of data management in industrial IoT applications but also improves the generalizability and accuracy of computer vision models.
Speakers
Apurva Godghase

Senior Computer Vision Engineer, Brambles
Apurva is a Senior Computer Vision Engineer at Brambles, with over seven years of R&D experience across diverse industrial domains. At Brambles, she specializes in designing and deploying cutting-edge machine learning and computer vision IoT prototypes to enhance supply chain efficiencies...
Thursday September 4, 2025 11:30am - 11:55am PDT
DataWeek -- Main Stage

2:30pm PDT

KEYNOTE (DataWeek): Informatica -- Mastering Enterprise AI Agents: Best Practices for Multi-Agent Frameworks
Thursday September 4, 2025 2:30pm - 2:55pm PDT
Sumeet Kumar Agrawal, Informatica, Vice President Product Management

Explore the design and deployment of enterprise-grade AI agent systems. This session covers best practices for building scalable multi-agent frameworks, ensuring security and governance, and features real-world examples from industries like customer service and supply chain management. Learn how to integrate AI agents effectively while addressing key enterprise challenges. 
Speakers
Sumeet Kumar Agrawal

Vice President Product Management, Informatica
Sumeet Agrawal is the Vice President of Products at Informatica, where he spearheads product management of innovative cloud-based technology products. With over 15 years of experience in data engineering and product management, Sumeet has a proven track record of driving innovative solutions...
Thursday September 4, 2025 2:30pm - 2:55pm PDT
DataWeek -- Main Stage

3:30pm PDT

OPEN Session (DataWeek): Balancing Velocity with Academic Rigor When Building with LLMs
Thursday September 4, 2025 3:30pm - 3:55pm PDT
Lauren Peate, Multitudes, CEO & founder

We’re all building AI features now. But building with LLMs brings its own challenges – namely: how can we use cutting-edge practices, weave in AI ethics, and consider the cost of different models without blowing past delivery dates? Not to mention making sure that the features we build will be stable, reliable and maintainable in the future.

We recently built our first LLM feature, to show the quality of feedback given in code reviews. In 1 month, we did a literature review, consultation with academic experts, data labelling, model experimentation, a cost assessment, and finally, all the ML engineering to launch it into production. The outcome: <1% extreme misclassification and zero hallucinations. In this talk, we’ll share our approach to building LLM features – how we partnered with academia (without being delayed by their timelines), what tooling we used, and how we made the cost and money tradeoffs to keep business stakeholders happy. I’ll also speak to how we built this into our microservices architecture, including how we used tools to generate structured outputs from LLMs on top of AWS’s Bedrock API to have parseable responses from a range of models.

You'll walk away with practical strategies for leading your own teams through AI implementations, identifying ethical issues early, addressing them efficiently, and still delivering on time and on budget.
Speakers
Lauren Peate

CEO & founder, Multitudes
Lauren Peate is the CEO and founder of Multitudes, which helps engineering teams improve delivery sustainably. She’s focused her career on using data to support people, including as the founder of Ally Skills NZ, a consultancy helping global tech companies improve team performance...
Thursday September 4, 2025 3:30pm - 3:55pm PDT
DataWeek -- Main Stage
 
Friday, September 5
 

1:00pm PDT

PRO Session (DataWeek): Compute for Your AI Model: GPUs, LPUs, TPUs and Beyond
Friday September 5, 2025 1:00pm - 1:25pm PDT
Kushaagra Goyal, Rubrik, Tech Lead

In the rapidly evolving landscape of computing, Graphics Processing Units (GPUs), Language Processing Units (LPUs), and Tensor Processing Units (TPUs) play pivotal roles in accelerating complex tasks, particularly in machine learning and artificial intelligence.

GPUs are renowned for their parallel processing capabilities, making them ideal for rendering graphics and handling large datasets. LPUs are specialized for optimizing natural language processing tasks, enhancing efficiency in understanding and generating human language. TPUs, developed by Google, are tailored specifically for training and inference of machine learning models, offering significant performance advantages for large-scale AI applications.

As we explore these technologies, we'll also look at emerging processing units designed for specific AI use-cases and the future of computational advancements.

Join me to dive into the intricacies of these processing units, their applications, and what lies ahead in the world of computing technology.
Speakers
Kushaagra Goyal

Tech Lead, Rubrik
Kushaagra Goyal is an accomplished technology leader with deep expertise in engineering and AI infrastructure. He holds a Bachelor’s degree from the Indian Institute of Technology, Delhi (2016), and a Master’s degree from Stanford University, where he developed a strong foundation...
Friday September 5, 2025 1:00pm - 1:25pm PDT
DataWeek -- Main Stage
 
Wednesday, September 10
 

9:00am PDT

[Virtual] PRO WORKSHOP (DataWeek): Building a RAG System for Video Search and Analysis
Wednesday September 10, 2025 9:00am - 9:50am PDT
Elizabeth Fuentes Leone, AWS, Developer Advocate

This talk addresses the challenge of making video content searchable and analyzable using modern AI techniques. While text and image RAG systems are common, video presents unique challenges due to its multimodal nature combining visual frames and audio content. 
Speakers
Elizabeth Fuentes Leone

Developer Advocate, AWS
As a Data Analytics and Machine Learning/Artificial Intelligence (ML/AI) Specialist, my mission is to break down complex concepts into easily understandable terms. I strive to develop innovative solutions that tackle real-world challenges effectively. By sharing my knowledge and experience...
Wednesday September 10, 2025 9:00am - 9:50am PDT
VIRTUAL DataWeek -- Main Stage

2:00pm PDT

[Virtual] PRO WORKSHOP (DataWeek): From Data to Insights in Minutes: Accelerating Business Growth with AutoML and LLM-Powered Data Prep
Wednesday September 10, 2025 2:00pm - 2:50pm PDT
Shailaja Sampat, Fujitsu Research of America, Senior Researcher

- Are you tired of manually processing large datasets in spreadsheets?

- Have you considered using Machine Learning (ML) to automate your tasks but feel limited by your coding skills or time constraints?

- Are you looking for ways to reduce the time spent making your data AI-ready, despite your ML background?

If these challenges sound familiar, this session is for you!

Traditionally, building predictive models has required specialized coding and statistical expertise. Moreover, data professionals often dedicate over 80% of their time to the labor-intensive data-wrangling process of preparing raw data for consumption by automated machine learning (AutoML) tools. To overcome these hurdles, we introduce AutoDW, an innovative data wrangler that utilizes the power of Large Language Models (LLMs) and sophisticated automation to facilitate the seamless preparation of AI-ready data. This session will demonstrate how the synergistic application of AutoDW and AutoML empowers users to rapidly develop predictive prototypes for their business use cases end-to-end without writing any code. Through step-by-step instruction and a live demonstration, attendees will witness AutoDW's intelligent data processing, observe AutoML's autonomous algorithm selection tailored to specific applications, and, most importantly, gain a comprehensive understanding of how to interpret the resulting predictive outputs.

#AutoML #DataWrangling #NoCodeML #FastMLPrototyping #DataDrivenDevelopment
Speakers
Shailaja Sampat

Senior Researcher, Fujitsu Research of America
Shailaja Sampat is a senior researcher in the AI lab at Fujitsu's research division in the USA. She earned her Ph.D. from Arizona State University, with a thesis focusing on the intersection of natural language processing and computer vision. Her current research spans autoML, data...
Wednesday September 10, 2025 2:00pm - 2:50pm PDT
VIRTUAL DataWeek -- Main Stage
 
Thursday, September 11
 

11:30am PDT

[Virtual] OPEN Session (DataWeek): Strategies for Image Dataset Curation from High-Volume Industrial IoT data
Thursday September 11, 2025 11:30am - 11:55am PDT
Apurva Godghase, Brambles, Senior Computer Vision Engineer

In Industrial IoT for supply chain and logistics, massive amounts of data are generated by edge devices that capture data continuously. For embedded vision systems, managing the sheer volume of images and metadata can be challenging. Selecting a diverse subset of high-quality data is crucial for effective modeling and analysis. This work outlines a comprehensive method for selecting relevant images from an extensive dataset to build a high-quality image database for building and monitoring computer vision and machine learning models. This systematic approach not only enhances the efficiency of data management in industrial IoT applications but also improves the generalizability and accuracy of computer vision models.
Speakers
Apurva Godghase

Senior Computer Vision Engineer, Brambles
Apurva is a Senior Computer Vision Engineer at Brambles, with over seven years of R&D experience across diverse industrial domains. At Brambles, she specializes in designing and deploying cutting-edge machine learning and computer vision IoT prototypes to enhance supply chain efficiencies...
Thursday September 11, 2025 11:30am - 11:55am PDT
VIRTUAL DataWeek -- Main Stage

2:30pm PDT

[Virtual] KEYNOTE (DataWeek): Informatica -- Mastering Enterprise AI Agents: Best Practices for Multi-Agent Frameworks
Thursday September 11, 2025 2:30pm - 2:55pm PDT
Sumeet Kumar Agrawal, Informatica, Vice President Product Management

Explore the design and deployment of enterprise-grade AI agent systems. This session covers best practices for building scalable multi-agent frameworks, ensuring security and governance, and features real-world examples from industries like customer service and supply chain management. Learn how to integrate AI agents effectively while addressing key enterprise challenges. 
Speakers
Sumeet Kumar Agrawal

Vice President Product Management, Informatica
Sumeet Agrawal is the Vice President of Products at Informatica, where he spearheads product management of innovative cloud-based technology products. With over 15 years of experience in data engineering and product management, Sumeet has a proven track record of driving innovative solutions...
Thursday September 11, 2025 2:30pm - 2:55pm PDT
VIRTUAL DataWeek -- Main Stage

3:30pm PDT

[Virtual] OPEN Session (DataWeek): Balancing Velocity with Academic Rigor When Building with LLMs
Thursday September 11, 2025 3:30pm - 3:55pm PDT
Lauren Peate, Multitudes, CEO & founder

We’re all building AI features now. But building with LLMs brings its own challenges – namely: how can we use cutting-edge practices, weave in AI ethics, and consider the cost of different models without blowing past delivery dates? Not to mention making sure that the features we build will be stable, reliable and maintainable in the future.

We recently built our first LLM feature, to show the quality of feedback given in code reviews. In 1 month, we did a literature review, consultation with academic experts, data labelling, model experimentation, a cost assessment, and finally, all the ML engineering to launch it into production. The outcome: <1% extreme misclassification and zero hallucinations. In this talk, we’ll share our approach to building LLM features – how we partnered with academia (without being delayed by their timelines), what tooling we used, and how we made the cost and money tradeoffs to keep business stakeholders happy. I’ll also speak to how we built this into our microservices architecture, including how we used tools to generate structured outputs from LLMs on top of AWS’s Bedrock API to have parseable responses from a range of models.

You'll walk away with practical strategies for leading your own teams through AI implementations, identifying ethical issues early, addressing them efficiently, and still delivering on time and on budget.
Speakers
Lauren Peate

CEO & founder, Multitudes
Lauren Peate is the CEO and founder of Multitudes, which helps engineering teams improve delivery sustainably. She’s focused her career on using data to support people, including as the founder of Ally Skills NZ, a consultancy helping global tech companies improve team performance...
Thursday September 11, 2025 3:30pm - 3:55pm PDT
VIRTUAL DataWeek -- Main Stage
 
Friday, September 12
 

1:00pm PDT

[Virtual] PRO Session (DataWeek): Compute for Your AI Model: GPUs, LPUs, TPUs and Beyond
Friday September 12, 2025 1:00pm - 1:25pm PDT
Kushaagra Goyal, Rubrik, Tech Lead

In the rapidly evolving landscape of computing, Graphics Processing Units (GPUs), Language Processing Units (LPUs), and Tensor Processing Units (TPUs) play pivotal roles in accelerating complex tasks, particularly in machine learning and artificial intelligence.

GPUs are renowned for their parallel processing capabilities, making them ideal for rendering graphics and handling large datasets. LPUs are specialized for optimizing natural language processing tasks, enhancing efficiency in understanding and generating human language. TPUs, developed by Google, are tailored specifically for training and inference of machine learning models, offering significant performance advantages for large-scale AI applications.

As we explore these technologies, we'll also look at emerging processing units designed for specific AI use-cases and the future of computational advancements.

Join me to dive into the intricacies of these processing units, their applications, and what lies ahead in the world of computing technology.
Speakers
Kushaagra Goyal

Tech Lead, Rubrik
Kushaagra Goyal is an accomplished technology leader with deep expertise in engineering and AI infrastructure. He holds a Bachelor’s degree from the Indian Institute of Technology, Delhi (2016), and a Master’s degree from Stanford University, where he developed a strong foundation...
Friday September 12, 2025 1:00pm - 1:25pm PDT
VIRTUAL DataWeek -- Main Stage
 
