Operationalizing Agentic AI Safety & Evaluation for Multi-Agent Financial Systems

Operationalizing Agentic AI Safety & Evaluation for Multi-Agent Financial Systems

Video by FINOS via YouTube
Operationalizing Agentic AI Safety & Evaluation for Multi-Agent Financial Systems

Vincent Caldeira (Field CTO at Red Hat) and Valentina Rodriguez Sosa (Principal Architect at Red Hat) map out a comprehensive, technical architecture for deploying multi-agent AI safely into production within regulated financial environments. They cover evaluation-driven development (EDD), open telemetry trace analysis, guardrailing economics, and automated red-teaming.

🇬🇧 Join us in London! Catch the latest on Agentic AI and DevSecOps at OSFF London on June 25, 2026: https://hubs.ly/Q041YV9Z0 (Use Code: 26YTOSFFLN20C)

🕒 Timestamps:
0:00 Introduction: System Behavior vs. Component Safety
0:50 Strategic Context: The Financial Interest in AI Agents
1:32 Architectural Differences: Traditional BPM vs. Non-Deterministic Multi-Step Workflows
2:05 Intent-Based Orchestration and Self-Correction Loops
2:36 The AgentOps Life Cycle: Building for Autonomy
3:05 Evaluation-Driven Development (EDD) Explained
3:34 Practical Dev Cycle: Executing the Harness Inner/Outer Loops
4:26 Telemetry Foundations: Using OpenTelemetry Standards
4:47 Capture Strategy: Generating Trace Telemetry for LLM Calls & Tools
5:24 Emphasizing Trajectory Validation Over Final Output
5:37 Managing Statistical Fat Tails in Non-Deterministic Systems
6:30 LLM-as-a-Judge: Reviewing Chain-of-Thought Decisions
7:02 FINOS Case Study: The "Finite Agent" Earnings Call Analysis Workflow
8:10 Operationalizing Workloads and the OWASP Top 10 for LLMs
9:24 Software Supply Chain Trusted Provenance for AI Artifacts
9:52 Guardrailing Architectures: Content Compliance and Cost Reduction Economics
11:43 Security Control: Signing Artifacts and Models with Sigstore
12:42 Automated Red-Teaming at Scale: Deploying Garak for Adversarial Testing
13:45 Closing Summary: Bridging Safety and Innovation

📊 The Problem: The Statistical Fat Tail of Non-Deterministic Agents Traditional financial software relies on deterministic step-based pathways managed by standard Business Process Management (BPM) systems. Multi-agent systems, however, utilize intent-based orchestration—allowing models to dynamically pick loops, leverage system tools, and self-correct on the fly. This introduces a massive architectural risk: because agents are non-deterministic, they cannot be completely validated through traditional testing. A single prompt deviation could trigger an unpredictable execution trajectory, leading to regulatory failure, data liability, or runaway compute costs.

🏗️ The Solution: Evaluation-Driven Development & Telemetry Architectures
Vincent and Valentina detail an end-to-end operational framework built explicitly to mitigate non-deterministic risks:
* Evaluation-Driven Development (EDD): Shifting testing to evaluate the complete trajectory (the sequence of agent thoughts and tool calls) rather than just checking the final output.
* OpenTelemetry Trace Baselines: Instrumenting agents to produce uniform open-telemetry trace logs for every tool engagement and LLM inference, serving as the debugging foundation for LLM-as-a-Judge validation architectures.
* Automated Adversarial Testing (Garak): Replacing finite human testing schedules with automated open-source red-teaming pipelines to run up to 70,000 statistical execution paths—stress-testing the system for prompt injection, shell breaking, and PI leakage.

⚙️ Why This Matters for Financial Engineering
* Guardrailing Cost Economics: Implementing input/output guardrails acts as an operational defense line—blocking malicious or redundant text blocks to significantly reduce institutional token expense overheads.
* Cryptographic Attestation (Sigstore): Enforcing cryptographic supply-chain signing on data pipelines and model configurations ensures verifiable provenance across all deployment environments.

🌐 More about FINOS: https://www.finos.org/
📧 Join our newsletter: https://www.finos.org/sign-up
🎙️ Listen to our Open Source in Finance Podcast: https://www.youtube.com/@FINOS/podcasts
LinkedIn: https://www.linkedin.com/company/finosfoundation

#FINOS #OSFFToronto #RedHat #AgenticAI #LLMOps #AgentOps #OpenTelemetry #DevSecOps #Sigstore #Garak #ResponsibleAI

Source

Enterprise AI Adoption: Turning FOMO into Progress

Enterprise AI Adoption: Turning FOMO into Progress

Video by Open Data Science and AI Conference via YouTube
Enterprise AI Adoption: Turning FOMO into Progress

The biggest business FOMO right now? Enterprise AI.

Want to learn more about AI in person? Check out ODSC AI East 2026, coming to Boston this April 28th-30th: https://hubs.li/Q041BP6P0

#DataScience #AI #ArtificialIntelligence #odscai

————————————————————————————————————-

Visit our website and choose the nearest ODSC event to attend and experience all our training and workshops: https://odsc.ai

To watch more videos like this, visit https://aiplus.training

Sign up for the newsletter to stay up to date with the latest trends in data science: https://opendatascience.com/newsletter/

Follow us online!
• Facebook: https://www.facebook.com/OPENDATASCI
• Instagram: https://www.instagram.com/odsc/
• Blog: https://opendatascience.com/
• LinkedIn: https://www.linkedin.com/company/open-data-science/
• X (twitter): https://x.com/_odsc

Source

2.5 Admins 303: Denial of Secrets

2.5 Admins 303: Denial of Secrets

Video by The Late Night Linux Family via YouTube
2.5 Admins 303: Denial of Secrets

Support us on Patreon and get an ad-free RSS feed with some early episodes. https://www.patreon.com/LateNightLinux

People were locked out of their password managers to stop a brute force attack, Coreutils come to Windows, a FreeBSD PR effort backfires, and the best simple consumer WiFi gear.

https://2.5admins.com/2-5-admins-303/

Source

Autonomous Unified Commerce | SAP Sapphire 2026

Autonomous Unified Commerce | SAP Sapphire 2026

Video by SAP via YouTube
Autonomous Unified Commerce | SAP Sapphire 2026

See how Autonomous Unified Commerce helps organizations respond faster when demand shifts, operations tighten, and customer commitments are on the line.

At SAP Sapphire 2026, SAP showcased how autonomous, unified commerce can connect intelligent buying moments with operational execution at scale. In this demo, demand for event kits surges just days before SAP Sapphire, creating a packing bottleneck and putting outbound deliveries at risk.

When SAP Logistics Management flags a delay risk, the logistics clerk consults Joule for a mitigation plan. Joule analyzes warehouse priorities, backlog, and resource allocation in real time, then recommends reassigning an autonomous robot from inbound putaway to outbound event kit packaging. With human approval, Joule triggers an SAP Embodied AI Agent, which translates delivery requirements and warehouse context into robot tasks prioritized by urgency and business impact.

Through SAP Business Technology Platform and the Cyberwave platform, robotic automation integrates into existing SAP workflows. The result is a closed loop from risk detection to execution — helping stabilize throughput, reduce backlog, and keep deliveries on track during peak demand.

This is Autonomous Unified Commerce in action: connected commerce, logistics, warehouse operations, AI agents, and robotics working together to help organizations deliver faster, more reliable customer experiences.

Chapters:
00:00 – Demand surges before SAP Sapphire
00:16 – Delay risk appears in logistics
00:22 – Packing capacity becomes constrained
00:38 – Joule builds a mitigation plan
00:50 – Rebalancing warehouse operations
01:05 – Triggering an SAP Embodied AI Agent
01:17 – Coordinating robot execution
01:23 – Monitoring throughput and backlog
01:41 – Training robots with real-world data
01:54 – Validating workflows in a digital twin
02:01 – Connecting robotics with SAP BTP
02:21 – Closing the loop from risk to execution
02:31 – Deliveries get back on track

Optimize customer journeys with Autonomous Unified Commerce:
https://www.sap.com/industries/autonomous-unified-commerce.html

Follow us on social:
LinkedIn: https://www.linkedin.com/company/sap/
Instagram: https://www.instagram.com/sap
Facebook: https://www.facebook.com/SAP/
Threads: https://www.threads.com/@sap

About SAP:
As a global leader in enterprise applications and business AI, SAP stands at the nexus of business and technology. For over 50 years, organizations have trusted SAP to bring out their best by uniting business-critical operations spanning finance, procurement, HR, supply chain, and customer experience. For more information, visit: https://www.sap.com/index.html

#SAP #BusinessAI #SAPSapphire

Source

Why New FCC Rules Could Crush Phone Privacy in the US

Why New FCC Rules Could Crush Phone Privacy in the US

Video by TWiT Tech Podcast Network via YouTube
Why New FCC Rules Could Crush Phone Privacy in the US

On Tech News Weekly, Mikah Sargent and Amanda Silberling of TechCrunch talk about the FCC’s plans to stop robocalls, but inadvertently kill the idea of burner phones through a potential requirement for carriers to collect their customers’ IDs.

You can find more about TWiT and subscribe to our full shows at https://podcasts.twit.tv
Subscribe: https://twit.tv/subscribe

Products we recommend: https://www.amazon.com/shop/twitnetcastnetwork
TWiT may earn commissions on certain products.

Join Club TWiT for Ad-Free Podcasts!
Support what you love and get ad-free shows, a members-only Discord, and behind-the-scenes access.
Join today: https://twit.tv/clubtwit

Join our TWiT Community on Discourse: https://www.twit.community/

Follow us:
– Bluesky: https://bsky.app/profile/twit.tv
– X: https://x.com/twit
– Mastodon: https://mastodon.social/@twit
– Facebook: https://www.facebook.com/TWiTNetwork
– Instagram: https://www.instagram.com/twit.tv
– TikTok: https://www.tiktok.com/@twittok
– LinkedIn: https://www.linkedin.com/company/twit-llc

About us:
TWiT.tv is a technology podcasting network located in the San Francisco Bay Area with the #1 ranked technology podcast This Week in Tech hosted by Leo Laporte. Every week we produce over 30 hours of content on a variety of programs including Tech News Weekly, MacBreak Weekly, Windows Weekly, Security Now, Intelligent Machines, and more.

Source

Engineering Quality in a Fast-Moving Open Source Project: WPE WebKit – Mario Sanchez-Prada, Igalia

Engineering Quality in a Fast-Moving Open Source Project: WPE WebKit - Mario Sanchez-Prada, Igalia

Video by The Linux Foundation via YouTube
Engineering Quality in a Fast-Moving Open Source Project: WPE WebKit - Mario Sanchez-Prada, Igalia

Join us at the premier vendor-neutral open source conference, where developers and technologists come together to collaborate, share knowledge, and explore the latest innovations and advancements in open source technology. Learn more at https://events.linuxfoundation.org/

Engineering Quality in a Fast-Moving Open Source Project: WPE WebKit – Mario Sanchez-Prada, Igalia

Building an embedded product on top of a large Open Source codebase like WPE WebKit is only the first step. The real challenge is keeping its quality stable as thousands of lines evolve and hundreds of changes land every week across multiple platforms.

In such an environment, errors and regressions are inevitable. What matters is detecting them quickly, understanding their impact, and reacting before they propagate further. This talk focuses on the engineering work that makes this possible, an effort that is essential yet often invisible.

Using WPE WebKit as a case study, we will explore how quality becomes a continuous engineering effort rather than a final validation phase and how CI and QA infrastructure, testing strategies, and processes (e.g. stabilization windows) sustain upstream development while supporting downstream deployments. We will show how these feedback loops reinforce each other and why aligning upstream and downstream processes is critical to keep quality stable over time.

This talk targets engineers, maintainers, and technical leaders working on large Open Source projects, as well as teams building products on top of them who need to sustain quality at scale.
How

Source

How I customized my KDE Desktop: Auto tiling, theme, icons, layout, activites..

How I customized my KDE Desktop: Auto tiling, theme, icons, layout, activites..

Video by The Linux Experiment via YouTube
How I customized my KDE Desktop: Auto tiling, theme, icons, layout, activites..

Use a secure, encrypted, and fast VPN with Proton VPN: https://protonvpn.com/TheLinuxEXP

Grab a brand new laptop or desktop running Linux: https://www.tuxedocomputers.com/en#

👏 SUPPORT THE CHANNEL:
Get access to:
– a Daily Linux News show
– a weekly patroncast for more thoughts
– your name in the credits

YouTube: https://www.youtube.com/@TheLinuxEXP/join
Patreon: https://www.patreon.com/thelinuxexperiment

Or, you can donate whatever you want:
https://paypal.me/thelinuxexp
Liberapay: https://liberapay.com/TheLinuxExperiment/

👕 GET TLE MERCH
Support the channel AND get cool new gear: https://the-linux-experiment.creator-spring.com/

Timestamps:
00:00 Intro
00:43 Sponsor: ProtonVPN
01:41 Mouse Tiler: auto-tiling for KDE
06:35 Other Scripts
08:00 Plasma Layout & Applets
11:43 Themes & Icons
15:57 Activities
18:46 Sponsor: Tuxedo Computers

#kdeplasma #linuxdesktop #ricing

Source

Create campaign concepts and assets with Codex

Create campaign concepts and assets with Codex

Video by OpenAI via YouTube
Create campaign concepts and assets with Codex

See how the creative production plugin for Codex helps marketing teams move from product and campaign context to visual concepts, asset variations, and editable creative.

In this demo, Codex creates a campaign mood board, refines visual directions, generates a launch asset, and hands it off for final edits.

More than 5 million people now use Codex every week, with marketers, creators, operators, designers, researchers, investors, and bankers among the fastest-growing users.

Learn more: https://openai.com/index/codex-for-every-role-tool-workflow/
Try the plugin today: https://chatgpt.com/plugins/share/a826391706e14c90816f2ceba9cc8b49

Source