Enterprise Prompt Engineering & LLM Testing via h2oGPTe | Part 12

Video by H2O.ai via YouTube

How Enterprise h2oGPTe manages prompt templates, version control, and multilingual AI agent deployment at scale.

Bridging predictive models and end users requires well-engineered, maintainable prompts. h2oGPTe provides a centralized prompt library where teams can create, clone, version, and share templates across the organization. The H2O Super Agent connects natural language prompts directly to predictive scoring APIs—enabling real-world actions like addressing customer churn. Multilingual template support and UI localization allow consistent AI behavior to be deployed across global markets.

Technical Capabilities & Resources

➤ Prompt Templates & Libraries: Create, clone, and share prompt templates from a managed organizational catalog.
🔗 https://docs.h2o.ai/enterprise-h2ogpte/guide/prompts

➤ Prompt Version Control & Iteration: Define system behaviors, iterate on prompt designs, and manage template settings.
🔗 https://docs.h2o.ai/enterprise-h2ogpte/guide/prompts#create-a-prompt-template

➤ Template Sharing Across Teams: Distribute prompt templates for consistent AI behavior organization-wide.
🔗 https://docs.h2o.ai/enterprise-h2ogpte/guide/prompts#share-a-prompt-template

➤ Custom Multilingual Prompts: Configure language-specific templates for consistent, localized global AI deployment.
🔗 https://docs.h2o.ai/enterprise-h2ogpte/guide/prompts#create-a-prompt-template-for-a-specific-language
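
The create, clone, version, and language-specific template operations listed above can be pictured with a small data structure. The following is a minimal plain-Python sketch, not the h2oGPTe API: the class names, fields, and methods (PromptVersion, PromptTemplate, save, latest, clone) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PromptVersion:
    """One immutable revision of a template (hypothetical structure)."""
    number: int
    system_prompt: str
    language: str = "en"


@dataclass
class PromptTemplate:
    """A named template that keeps its full revision history."""
    name: str
    versions: list = field(default_factory=list)

    def save(self, system_prompt, language="en"):
        """Append a new version instead of overwriting the previous one."""
        version = PromptVersion(len(self.versions) + 1, system_prompt, language)
        self.versions.append(version)
        return version

    def latest(self, language="en"):
        """Return the newest version in the requested language, else None."""
        for version in reversed(self.versions):
            if version.language == language:
                return version
        return None

    def clone(self, new_name):
        """Copy the history under a new name so later edits stay separate."""
        return PromptTemplate(new_name, list(self.versions))


churn = PromptTemplate("churn-assistant")
churn.save("You are a retention analyst. Answer in English.")
churn.save("You are a retention analyst. Cite the churn score.")
churn.save("Du bist Analyst. Antworte auf Deutsch.", language="de")

# A regional team clones the template and iterates without touching the original.
regional = churn.clone("churn-assistant-dach")
regional.save("Du bist Analyst. Nenne den Churn-Score.", language="de")
```

Keeping versions append-only means an earlier prompt can always be recovered, and cloning gives each team its own history while starting from a shared baseline.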

Source

Advanced MLflow Tracing: Framework Integrations with LangChain, LlamaIndex, LangGraph (Notebook 1.6)

Video by MLflow via YouTube

In the sixth episode of this series, Jules Damji dives deep into MLflow’s extensive framework integrations. MLflow supports over 30 different open-source agent-building frameworks, allowing you to automatically trace and evaluate complex AI workflows regardless of your chosen architecture.

This tutorial provides a hands-on comparison of three open-source agent-building frameworks and demonstrates how MLflow provides full visibility into their execution:
🔹 𝗟𝗮𝗻𝗴𝗖𝗵𝗮𝗶𝗻: Learn how to use high-level primitives like ChatPromptTemplate and StrOutputParser to build sequential workflows. We demonstrate both simple chains and complex multi-step sequences connected via the pipe operator (|).
🔹 𝗟𝗹𝗮𝗺𝗮𝗜𝗻𝗱𝗲𝘅: See how to build a Retrieval-Augmented Generation (RAG) system. We walk through creating an in-memory vector index, generating embeddings with OpenAI, and using a query engine to retrieve document-based answers, all while capturing the entire operation trace in MLflow.
🔹 𝗟𝗮𝗻𝗴𝗚𝗿𝗮𝗽𝗵: For more advanced use cases, we explore building stateful, hierarchical agent workflows. We demonstrate a customer service triage system that uses a supervisor node to classify queries and route them to specialized handlers for billing, tech support, or general inquiries.
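
The supervisor-and-handlers triage pattern described for LangGraph can be sketched in plain Python. This is an illustration of the routing idea only and does not use LangGraph itself; the keyword table, handler functions, and canned replies are hypothetical, and a real supervisor node would typically ask an LLM to classify the query.

```python
# Hypothetical keyword table standing in for an LLM-based classifier.
KEYWORDS = {
    "billing": ("invoice", "refund", "charge", "payment"),
    "tech_support": ("error", "crash", "login", "bug"),
}


def supervisor(state):
    """Classify the query and return the name of the handler to run next."""
    text = state["query"].lower()
    for route, words in KEYWORDS.items():
        if any(word in text for word in words):
            return route
    return "general"


def billing(state):
    return {**state, "answer": "Billing will review the charge."}


def tech_support(state):
    return {**state, "answer": "Tech support will investigate the issue."}


def general(state):
    return {**state, "answer": "A general agent will follow up."}


HANDLERS = {"billing": billing, "tech_support": tech_support, "general": general}


def run(query):
    """Pass shared state through the supervisor, then the chosen handler."""
    state = {"query": query}
    state["route"] = supervisor(state)
    return HANDLERS[state["route"]](state)


for question in (
    "I was charged twice, please refund me",
    "The app shows an error on login",
    "What are your opening hours?",
):
    result = run(question)
    print(result["route"], "->", result["answer"])
```

The state dictionary plays the role of the graph's shared state: the supervisor only decides the next node, and each handler returns an updated copy rather than mutating its input.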

Key Takeaways:
🔹 𝗔𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗰 𝗧𝗿𝗮𝗰𝗶𝗻𝗴: All frameworks integrated with MLflow are automatically traced, capturing inputs, outputs, and intermediate steps without manual instrumentation.
🔹 𝗙𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸 𝗦𝗲𝗹𝗲𝗰𝘁𝗶𝗼𝗻: Choose LangChain for sequential chains, LlamaIndex for heavy document indexing, and LangGraph for complex, stateful branching or looping workflows.
🔹 𝗩𝗶𝘀𝗶𝗯𝗶𝗹𝗶𝘁𝘆: Use the MLflow UI to inspect timelines, verify embeddings, and debug the internal logic of your AI agents.
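
To make concrete what a trace captures, here is a minimal standard-library sketch of span recording. It is not MLflow's implementation; with MLflow the equivalent is usually switched on with a one-line call such as mlflow.langchain.autolog(). The traced decorator, the SPANS list, and the stand-in retrieve and generate functions are assumptions made for illustration.

```python
import functools
import time

SPANS = []  # span records, appended in completion order


def traced(func):
    """Record the inputs, output, and duration of every call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        SPANS.append({
            "name": func.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
            "seconds": time.perf_counter() - start,
        })
        return result
    return wrapper


@traced
def retrieve(question):
    # Stand-in for a vector-index lookup.
    return ["A trace is made up of spans, one per step."]


@traced
def generate(question, context):
    # Stand-in for an LLM call.
    return f"Based on {len(context)} document(s): {context[0]}"


@traced
def answer(question):
    return generate(question, retrieve(question))


reply = answer("What does tracing capture?")
for span in SPANS:
    print(span["name"], round(span["seconds"], 6))
```

Inner steps finish before the outer call, so the intermediate spans appear first and the top-level span last; a tracing UI uses the same timing data to draw its timeline.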

Resources:
🔗 Notebook 1.6: https://github.com/dmatrix/mlflow-genai-tutorials/blob/main/06_framework_integrations.ipynb
🎥 Full Series Playlist: https://youtube.com/playlist?list=PLaoPu6xpLk9EI99TuOjSgy-UuDWowJ_mR&si=jdbAbxTCRuxFxfnG

Source

Visual Anomaly & Novelty Detection Workshop VAND 4.0

Video by OpenCV via YouTube

Join our Patreon to support the show: https://patreon.com/opencv

We welcome back the team behind the VAND anomaly-detection challenge, a staple of recent CVPR conferences. VAND brings together cutting-edge research on detecting what doesn’t belong in visual data, spanning anomaly, novelty, and out-of-distribution detection. Building on three successful editions, VAND 4.0 unites supervised, semi-supervised, and unsupervised approaches, including few-, one-, and zero-shot learning, with a strong focus on real-world impact.

Official site: https://sites.google.com/view/vand4-cvpr2026

Info for CVPR attendees: June 4, 2026 (1pm-6pm) in Denver, CO, USA (in person) and on Zoom (virtual); half-day workshop
Room: 601, Posters: Exhibit Hall A

OpenCV is a 501(c)(3) registered non-profit in the United States. See how you can support open source CV & AI: http://opencv.org/support/

Watch along for your chance to win during our live trivia segment, and participate in the live Q&A session with questions from you in the audience.

Become a paid member of the channel to help us make more episodes https://www.youtube.com/channel/UCkrcW82Y2kbgU-U9RaYfgxw/join

Got a cool project of your own? Send it to us and you may be featured https://www.jotform.com/form/233105358823151

Source

Why WideEP Inference Needs Data-Parallel-Aware Scheduling – Maroon Ayoub & Tyler Michael Smith

Video by PyTorch via YouTube

Why WideEP Inference Needs Data-Parallel-Aware Scheduling – Maroon Ayoub, IBM; Tyler Michael Smith, Red Hat

WideEP (wide expert parallelism) fails not because experts are expensive, but because routing ignores where state already lives. In PyTorch LLM serving with vLLM, WideEP fans tokens across many experts while KV caches accumulate unevenly across data-parallel replicas. When routing is unaware of KV placement and per-replica load, requests land on replicas that cannot reuse cache or make progress efficiently, and latency spikes as expert fan-out grows.

The fix is not reshaping expert parallelism, but making routing data-parallel aware using signals vLLM already exposes. In this talk, we show how llm-d extends its router to leverage KV-cache locality and load awareness when routing WideEP flows. Rather than treating replicas as interchangeable, the router prefers replicas with warm KV state and available capacity, aligning routing decisions with vLLM’s execution reality and reducing cache fragmentation.

This session walks through how KV-aware, data-parallel routing changes WideEP inference in practice: which signals matter, how routing behavior evolves, and where the gains come from. Attendees leave with a clear mental model for when KV- and load-aware routing unlocks higher throughput.
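
The preference for replicas with warm KV state and available capacity can be sketched as a scoring function. This is a toy model under stated assumptions, not the llm-d router or any vLLM interface: the Replica class, the block size, the equal weights, and the queue limit are all hypothetical.

```python
from dataclasses import dataclass, field

BLOCK = 4  # tokens per cache block (hypothetical value chosen for the demo)


def prefix_blocks(tokens):
    """Key each block-aligned prefix so identical prefixes give identical keys."""
    return [hash(tuple(tokens[:end])) for end in range(BLOCK, len(tokens) + 1, BLOCK)]


@dataclass
class Replica:
    name: str
    queued: int = 0                            # requests waiting on this replica
    cached: set = field(default_factory=set)   # prefix-block keys held in KV cache


def score(replica, blocks, max_queue=8, w_cache=0.5, w_load=0.5):
    """Blend KV-cache locality with spare capacity; higher is better."""
    hits = 0
    for block in blocks:  # only a contiguous leading run of blocks is reusable
        if block not in replica.cached:
            break
        hits += 1
    cache_score = hits / len(blocks) if blocks else 0.0
    load_score = 1.0 - min(replica.queued, max_queue) / max_queue
    return w_cache * cache_score + w_load * load_score


def route(replicas, tokens):
    """Send the request to the best-scoring replica and update its state."""
    blocks = prefix_blocks(tokens)
    best = max(replicas, key=lambda r: score(r, blocks))
    best.queued += 1
    best.cached.update(blocks)
    return best


replicas = [Replica("dp-0"), Replica("dp-1")]
shared = list(range(8))  # a shared prompt prefix spanning two blocks

first = route(replicas, shared + [100, 101, 102, 103])   # both cold: tie goes to dp-0
second = route(replicas, shared + [200, 201, 202, 203])  # dp-0 is warm for the prefix
replicas[0].queued = 8                                   # simulate dp-0 becoming saturated
third = route(replicas, shared + [300, 301, 302, 303])   # load now outweighs locality
print(first.name, second.name, third.name)
```

With these weights a warm replica wins while it has headroom, and a saturated one loses to an idle cold replica, which is the trade-off between cache reuse and load that the talk describes.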

Source

Why Your Small Language Model Will Fail (And How to Fix It)

Video by NetApp Instaclustr via YouTube

While LLMs produce fantastic results, the costs can be steep. A small language model is a great option for limited budgets and resources while still providing solid, fast outputs.

Learn how to build an SLM from scratch with Sr. AI Developer Advocate David VonThenen.

Demo: https://www.youtube.com/watch?v=VUuVro-Dv7c

Be sure to subscribe for all things AI!

Timestamps:
00:46: What is a small language model?
01:55: Setting up a nanoGPT-style causal LM
09:26: Building our model
13:38: Results
14:59: Fine-tuning the model
21:06: The types of LMs

Source

FOSSASIA 2026 in Bangkok, Hall 2 – 10 March 2026

Video by FOSSASIA via YouTube

Welcome to the livestream of FOSSASIA 2026, taking place in Bangkok, Thailand, as part of the FOSSASIA Summit. Community Day brings together open source contributors, developers, students, maintainers, and technology leaders from across Asia and around the world.

The program features talks, discussions, and community sessions covering:

Artificial Intelligence and Machine Learning
Cloud Infrastructure and DevOps
Cybersecurity and Digital Safety
Web and Mobile Development
Open Hardware and Embedded Systems
Databases and Data Engineering
Open Source Community Collaboration

Speakers include engineers, researchers, and maintainers from global technology companies and major open source projects. Participants share real-world experience, technical insights, and community initiatives shaping the future of open technology.

FOSSASIA is a nonprofit organization supporting open technologies and developer communities across Asia. Through conferences, hackathons, mentoring programs such as Google Summer of Code, and open source projects, FOSSASIA connects contributors and organizations working on impactful technology.

Location: Bangkok, Thailand
Event: FOSSASIA Summit 2026
Track: Community Day

Learn more about the event and upcoming activities:
https://fossasia.org

Join the community:
https://github.com/fossasia

Follow FOSSASIA for updates on open source events, projects, and collaborations across Asia.

#FOSSASIA #OpenSource #AI #Cloud #Developers #Bangkok #FOSSASIASummit

Source

Project Lightning Talk: A Curator’s Guide to the CNCF Landscape - Katherine Druckman and Lori Lorusso

Video by CNCF [Cloud Native Computing Foundation] via YouTube

Don’t miss out! Join us at our next KubeCon + CloudNativeCon events in Mumbai, India (18-19 June, 2026), Yokohama, Japan (29-30 July, 2026), and Shanghai, China (8-9 September, 2026). Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at https://kubecon.io

Step into the studio with two of the Netherlands’ most famous masters as we explore the CNCF Landscape like a gallery of modern technical masterpieces. Don your berets and join Rembrandt and Van Gogh as we paint a clearer picture of the CNCF ecosystem.

With more than 190 projects across the landscape, discovering what each one does can feel a bit like wandering through an enormous museum without a guide. “Just go to the website” isn’t always enough. Sometimes you need a curator to point out the highlights, explain the movements, and help you see how the pieces fit together.

In this lightning session, we’ll tour a selection of CNCF projects the way art historians might walk through a gallery, highlighting the themes, techniques, and innovations that make them stand out. By the end, you’ll have a clearer mental map of the landscape and be ready to navigate KubeCon like a seasoned collector.

Source