Open Source News: AI, Security, and Linux Updates

Introduction

This week’s open source news highlights a surge in AI-powered projects, significant security initiatives, and practical updates for Linux users. From Canonical’s streamlined development tool to IBM and Red Hat’s $5B security effort, the ecosystem is evolving rapidly. The underlying trend is clear: open source is embracing AI while grappling with its challenges, and …

How to Create an LLM Dataset | FineWeb Overview

Video by Hugging Face via YouTube

A deep dive into how Hugging Face created the FineWeb dataset: starting from Common Crawl snapshots, extracting high-quality text from raw web data, filtering noisy content, deduplicating at web scale, and building FineWeb-Edu with model-assisted educational quality filtering.
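The filter-then-deduplicate idea described above can be illustrated with a toy sketch. This is not the FineWeb pipeline itself: the function names, the thresholds, and the sample documents are hypothetical, and exact hashing of normalized text stands in for whatever web-scale deduplication the video and paper describe.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants hash identically."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def passes_quality_filter(
    text: str, min_words: int = 5, max_symbol_ratio: float = 0.3
) -> bool:
    """Hypothetical heuristic: drop very short or symbol-heavy documents."""
    if len(text.split()) < min_words:
        return False
    symbols = sum(1 for ch in text if not (ch.isalnum() or ch.isspace()))
    return symbols / len(text) <= max_symbol_ratio


def deduplicate(docs: list[str]) -> list[str]:
    """Keep the first occurrence of each document, compared after normalization."""
    seen: set[str] = set()
    kept: list[str] = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept


def build_dataset(raw_docs: list[str]) -> list[str]:
    """Filter noisy documents first, then remove duplicates."""
    return deduplicate([d for d in raw_docs if passes_quality_filter(d)])


# Made-up sample input: one clean document, a near-identical copy,
# a symbol-heavy spam line, and a document that is too short.
raw = [
    "Open source software is developed in public by many contributors.",
    "open source   software is developed in public by many contributors.",
    "$$$ !!! ### click >>> here <<< !!! $$$",
    "Too short.",
]
dataset = build_dataset(raw)
```

Filtering before deduplicating keeps the hash set small, since documents that would be discarded anyway are never hashed. A production pipeline would likely need fuzzy matching rather than exact hashes, because near-duplicate web pages rarely match character for character.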


🔗 Links
– FineWeb dataset: https://huggingface.co/datasets/HuggingFaceFW/fineweb
– FineWeb-Edu dataset: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu
– FineWeb paper: https://arxiv.org/abs/2406.17557
– FineWeb blog post: https://huggingface.co/spaces/HuggingFaceFW/blogpost-fineweb-v1
– Common Crawl: https://commoncrawl.org/
– Trafilatura: https://trafilatura.readthedocs.io/


👋 Connect with me
– My website: https://alejandro-ao.com/
– X (Twitter): https://x.com/_alejandroao
– LinkedIn: https://www.linkedin.com/in/alejandro-ao/


🤓 Topics Covered
– FineWeb dataset creation pipeline
– Common Crawl filtering and deduplication
– FineWeb-Edu educational data filtering


⏱️ Timestamps
00:00 Introduction
00:52 Why FineWeb matters
02:58 Common Crawl as data source
06:10 Base filtering techniques
07:17 Deduplication within snapshots
13:05 C4-style quality filters
17:20 FineWeb-Edu extraction
21:41 Key lessons learned
24:21 Synthetic data on the web
28:39 Conclusion

How Community Driven Platforms Enable the Next Generation of Data, Sergey Pronin #FOSSASIASummit2026

Video by FOSSASIA via YouTube

Specialized databases like vector, graph, and time-series systems drive innovation—but relying on proprietary managed services often leads to high costs, compliance challenges, and vendor lock-in.

This talk introduces **OpenEverest**, an open source platform built on Kubernetes and CRDs that provides a unified way to manage diverse database engines—from PostgreSQL to modern NoSQL systems. Instead of fragmented tools and cloud-specific services, OpenEverest offers a single control plane for open source databases.

Learn how platform teams can reduce cloud dependency, strengthen data sovereignty, and simplify database operations while embracing the flexibility of the open source data ecosystem. Perfect for DevOps, SRE, and platform engineers managing modern data infrastructure.

FOSSASIA Summit 2026, held in Bangkok, is Asia’s leading Open Source tech conference, featuring sessions on #AI, #Cloud, #DevOps, #Open Hardware, #Security, #Web #Mobile Technologies, #Web3, and #Databases. Learn more: http://summit.fossasia.org

#FOSSASIA #FOSSASIASummit #opensource #FOSS #Database

Join Co-Chair Sonali Srivastava in Mumbai for KubeCon + CloudNativeCon India

Video by CNCF [Cloud Native Computing Foundation] via YouTube

AI, security, and open source mentorship. Join co-chair Sonali Srivastava in Mumbai for KubeCon + CloudNativeCon India. Get ready for hands-on demos, community booths, and everything you need to level up your GitOps pipeline.

Link in bio to check out the schedule!

#KubeCon #CloudNative #OpenSource

LLM Answers DEPEND on Context, Not Just Power!

Video by Open Data Science and AI Conference via YouTube

The effectiveness of LLMs, and the reason for the rise of agents, is tied to the context they receive, not just their inherent power.

Want to learn more about AI in person? Check out ODSC AI West 2026, coming to Burlingame this October 27th-29th: https://hubs.li/Q04cYsmk0

#DataScience #AI #ArtificialIntelligence #ODSCAI

————————————————————————————————————-

Visit our website and choose the nearest ODSC event to attend and experience all our training and workshops: https://odsc.ai

To watch more videos like this, visit https://aiplus.training

Sign up for the newsletter to stay up to date with the latest trends in data science: https://opendatascience.com/newsletter/

Follow us online!
• Facebook: https://www.facebook.com/OPENDATASCI
• Instagram: https://www.instagram.com/odsc/
• Blog: https://opendatascience.com/
• LinkedIn: https://www.linkedin.com/company/open-data-science/
• X (twitter): https://x.com/_odsc

Late Night Linux – Episode 388

Video by The Late Night Linux Family via YouTube

Support us on Patreon and get an ad-free RSS feed with some early episodes. https://www.patreon.com/LateNightLinux

Steam Deck price rises point toward high prices for the new Valve hardware, Lenovo puts its name to a cheap retro handheld and regrets it, Wikipedia management seems to be acting like a typical big tech company and the workers are organising, Bambu pisses off its 3D printer customers and Joe got given a free unrelated 3D printer, and we don’t believe that the Raspberry Pi 6 will arrive as late as 2028.

https://latenightlinux.com/late-night-linux-episode-388/

$5B for Open Source Security, Age Checks Might Exempt Linux, Linus Torvalds on AI & more Linux news

Video by Michael Tunnell via YouTube

Support the show by becoming a patron at https://tuxdigital.com/membership or get some swag at https://store.tuxdigital.com/

This week in Linux, we’ve got a packed episode covering AI, security, and everyone’s favorite, LEGAL News! IBM and Red Hat have launched a massive new effort to secure open-source software at enterprise scale. Linus Torvalds has some very pointed comments about AI-generated security reports making kernel maintainers’ lives harder. Age verification laws are raising new questions for Linux users. Then we’ll take a look at some Wayland window managers with Sway and labwc.

All of this and more on This Week in Linux, the weekly news show that keeps you up to date with what’s going on in the Linux and Open Source world. Now let’s jump right into Your Source for Linux GNews!

### SHOW NOTES ►► https://thisweekinlinux.com/346

### Chapters:
00:00 Intro
00:54 Project Lightwell: IBM & Red Hat’s $5B Open Source Security Effort
03:55 OS-Level Age Checks Might Exempt Linux, But There’s a Catch
07:12 Linus Torvalds says AI Tools Can Help, But Bad Reports Waste Maintainer Time
10:52 MoonRay: DreamWorks’ Film Renderer Moves Deeper Into Open Source
12:10 Sway 1.12 and labwc 0.20 Push Lightweight Wayland Forward
15:04 Flathub and QEMU Draw Different Lines on AI Contributions
19:19 Purism’s Latest PureOS Release Feels Late on Arrival
22:53 Outro

———————————————————————————–

### Links:
– Project Lightwell: IBM & Red Hat’s $5B Open Source Security Effort
  – https://www.redhat.com/en/about/press-releases/project-lightwell-secure-open-source
  – https://www.redhat.com/en/lightwell
  – https://www.ibm.com/products/lightwell
  – https://www.infoworld.com/article/4178451/ibm-and-red-hat-want-to-become-the-security-clearinghouse-for-open-source-applications-in-the-enterprise.html
  – https://devops.com/ibm-red-hat-launch-project-lightwell-to-secure-open-source-software-from-frontier-models/
  – https://finance.yahoo.com/sectors/technology/articles/ibm-red-hat-project-lightwell-140943530.html
– OS-Level Age Checks Might Exempt Linux, But There’s a Catch
  – https://itsfoss.com/news/age-verification-open-source-exemptions/
  – https://fossforce.com/2026/05/the-quiet-clause-that-may-save-linux-from-age-verification-laws/
  – https://www.gamingonlinux.com/2026/05/linux-and-open-source-getting-age-checking-exemptions-could-be-problematic/
  – https://www.gamingonlinux.com/2026/05/colorado-and-california-age-verification-bills-exempt-open-source-operating-systems/
  – https://www.phoronix.com/news/California-AB-1856
– Linus Torvalds says AI Tools Can Help, But Bad Reports Waste Maintainer Time
  – https://www.zdnet.com/article/linus-torvalds-has-a-love-hate-relationship-with-ai/
  – https://www.gamingonlinux.com/2026/05/linux-head-says-ai-tools-are-great-but-theyre-making-the-security-list-almost-entirely-unmanageable/
  – https://www.phoronix.com/news/Torvalds-AI-Tools-Can-Be-Great
  – https://linuxiac.com/linus-torvalds-merges-new-linux-kernel-security-bug-guidelines/
– MoonRay: DreamWorks’ Film Renderer Moves Deeper Into Open Source
  – https://www.linuxfoundation.org/press/moonray-dreamworks-animations-open-source-production-renderer-joins-the-academy-software-foundation
  – https://tuxdigital.com/videos/how-dreamworks-uses-linux-open-source-to-create-blockbuster-movies/
  – https://www.youtube.com/watch?v=1Q8xp8fhDLk
– Sway 1.12 and labwc 0.20 Push Lightweight Wayland Forward
  – Sway:
    – https://9to5linux.com/sway-1-12-wayland-compositor-released-with-hdr10-support-via-vulkan-renderer
    – https://www.phoronix.com/news/Sway-1.12-Released
    – https://linuxiac.com/sway-1-12-wayland-compositor-released-with-hdr10-and-window-capture/
  – labwc:
    – https://www.phoronix.com/news/Labwc-0.20-Compositor
    – https://linuxiac.com/labwc-0-20-wayland-compositor-released-with-wlroots-0-20-support/
– Flathub and QEMU Draw Different Lines on AI Contributions
  – Flathub:
    – https://docs.flathub.org/docs/for-app-authors/requirements#generative-ai-policy
    – https://social.treehouse.systems/@barthalion/116657011366876079
    – https://www.gamingonlinux.com/2026/05/flathub-moves-to-ban-nearly-all-apps-and-submissions-made-with-generative-ai/
  – QEMU:
    – https://www.qemu.org/
    – https://www.phoronix.com/news/QEMU-Patch-Allows-Some-AI
    – https://lists.nongnu.org/archive/html/qemu-devel/2026-05/msg07614.html
– Purism’s Latest PureOS Release Feels Late on Arrival
  – https://puri.sm/posts/pureos-crimson-development-report-april-2026-pureos-crimson-released/
  – Trisquel 12 – https://tuxdigital.com/podcasts/this-week-in-linux/twil-343/
  – Fairphone 6 via Murena – https://murena.com/shop/smartphones/brand-new/murena-fairphone-6/
– Support the show
  – https://tuxdigital.com/membership
  – https://store.tuxdigital.com/

———————————————————————————–

Thanks For Watching!

#Linux #News #Podcast
