HOL x AI: US Cancels AI Safety Order | Claude Mythos | Composer 2.5 | Gemini 3.5
This Space covers a fast-moving week in AI spanning policy, deployment, models, and markets. Michael hosts with Phil (Hashtag) and Patricia (HLA/Syntax), opening with industry headlines: the U.S. canceling an AI safety executive order (safety vs. speed), Starbucks rolling back an AI inventory tool, and Microsoft restricting Claude/Claude Code internally. The panel debates guardrails, sandboxes, and global standards, noting GDPR gaps and EU sandbox delays. Demis Hassabis’s claim that agents are a rehearsal for AGI (possibly by 2029) sets up a discussion of AGI/ASI definitions, timelines, and human-nuance challenges. They warn that AI warfare is already here (Project Maven, disclosures via Anthropic’s dispute with the Pentagon), emphasizing accountability in autonomous systems. On models, Liquid AI’s LFM-2.5 debuts as a device-optimized, privacy-friendly alternative; later, Anthropic launches Claude Opus 4.8 live during the Space with improved judgment and longer autonomous tasks. The team explores China’s proposal to financialize AI compute via token futures, weighing blockchain parallels against risks of speculation, hoarding, and inequity. Finally, enterprise pragmatism emerges: Microsoft’s Copilot prioritization and Starbucks’s hallucination-driven rollback. Actionable advice: adopt appropriate guardrails, test lighter on-device models, monitor model releases, and stay engaged on emerging compute markets.
HOL x AI Twitter Spaces: Full Summary and Notes
Participants
- Michael (Host, HLX AI)
- Phil (Co-host; from Hashtag, Montreal)
- Patricia (Vice President at HLA; Co‑founder of Syntax, a leading NFT marketplace in the Hedera ecosystem)
Sponsor and community notes
- Sponsor: Hashtag (described as the leading retail-facing entity in the Hedera ecosystem).
- Giveaway: $50 in Pack token to one engaged participant; winner announced as X Raptor.
- Ecosystem milestone: Dead Pixels Ghost Club surpassed 100M HBAR trading volume on Hedera via Syntax — first in the ecosystem; a major achievement noted by Patricia.
US AI Safety Executive Order reportedly canceled — speed vs. safety
- Context (Michael):
- Political candidates in the US are running ads about AI safety; these often reflect limited technical understanding yet surface real concerns.
- Claim discussed: A US AI executive order was canceled hours before signing (attributed to President Trump in the conversation). The draft would have created a voluntary framework to vet advanced models pre‑release for national security and cyber risks.
- Political calculus: Even a voluntary framework can evolve into de facto standards with penalties; the US chose speed to maintain AI leadership, with China as a perceived benchmark. Safety took a back seat in this instance.
- Pandora’s box argument: Tight US rules could slow domestic labs while open-source and foreign actors accelerate; regulation must account for dual-use risks ranging from bad actors to military applications.
- Phil’s perspective:
- Historical pattern: We tend to be reactive after “flying too close to the sun.” Proactivity and guardrails are crucial.
- Safety is conditional: “Safe for whom and under what conditions?” Calls for audits, testing, pre‑release reviews, and proof — moving beyond “trust me, bro.”
- Retail-facing AI needs sandboxes, data provenance, and guardrails before agentic wallets/automation are rolled out.
- Patricia’s perspective:
- Strong advocate for AI safety but warns laws must avoid handicapping responsible labs while empowering bad actors; AI already accelerates exploits.
- Favors cross-country standards (consortium approach) over fragmented national rules to define acceptable AI use.
- GDPR/compliance (Q&A):
- Michael asked if AI apps follow GDPR for personal logs/conversations.
- Patricia (EU-based): She has not seen GDPR-specific addenda in mainstream apps; the usual opt-in/opt-out for using chats to train models is the same as elsewhere.
- Takeaway: The show highlights a tension between national competitiveness and safety oversight. The panel agreed that guardrails, audits, and international coordination are needed, but the pressure for speed and innovation is immense.
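Phil's point about guardrails before agentic wallets can be made concrete with a small sketch. This was not shown in the Space; the class name, the allowlist, the approval threshold, and the account IDs are all hypothetical, and a real system would need far more (provenance checks, sandboxed execution, revocation).

```python
from dataclasses import dataclass, field


@dataclass
class WalletGuardrail:
    """Illustrative pre-transaction check for an AI agent with wallet access."""
    allowed_recipients: set[str]          # hypothetical allowlist of accounts
    auto_approve_limit: float             # above this amount, a human must confirm
    audit_log: list[dict] = field(default_factory=list)

    def review(self, recipient: str, amount: float, human_approved: bool = False) -> str:
        """Return 'allow', 'needs_human', or 'deny', and record the decision."""
        if recipient not in self.allowed_recipients:
            decision = "deny"
        elif amount > self.auto_approve_limit and not human_approved:
            decision = "needs_human"
        else:
            decision = "allow"
        # Every decision is logged so it can be audited afterwards.
        self.audit_log.append(
            {"recipient": recipient, "amount": amount, "decision": decision}
        )
        return decision
```

The design choice worth noting is that the agent never decides on its own whether a large transfer goes through: the gate sits outside the model, and the audit log gives reviewers the "proof" Phil asks for instead of "trust me, bro."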
Agents, AGI, and timelines (Demis Hassabis quote)
- Trigger (Michael):
- Demis Hassabis (DeepMind) reportedly framed current AI agents as a “practice run” for AGI and suggested AGI could arrive as soon as 2029.
- Michael’s view: With the right harnessing, many agentic behaviors already exist (his team runs agents 24–72 hours without human input), but Hassabis’s AGI lens includes embodied, multimodal competence (vision/audio/robotics), different from narrow desktop task automation.
- Definitions:
- AGI: Human-level generality across tasks and contexts (including common sense and embodied interaction).
- ASI: Beyond human capability and speed across domains.
- Phil’s perspective:
- We’re closer to AI’s inception than its endgame; for most people it still feels like “a chat box.” The breakthrough is when the “fourth wall” drops and AI accomplishes end-to-end goals seamlessly.
- Timelines vary (as early as this year per Elon Musk vs. 2029 per Hassabis). Multimodality and, critically, energy will be key constraints; the endgame may hinge more on energy and data center capacity than on model or chip scale alone. Humanoids and quantum computing could accelerate the singularity.
- Patricia’s perspective:
- Cites lack of human nuance (e.g., a model replying “Great” to “I was hacked!”) as evidence we’re not near human-level understanding.
- Hopes we don’t rush toward full replaceability of humans; favors augmentative agents over wholesale substitution, especially not by 2029.
- Michael’s macro view:
- Avoid fixation on “are we there yet?” for AGI/ASI. Like the internet, capability is iterative and unevenly distributed. Keep learning, using tools, and spotting new roles as others fade; avoid doom cycles.
Military AI is here — ethics, guardrails, and the Anthropic–Pentagon dispute
- Report (Michael):
- The Verge deep-dive: AI is already embedded in surveillance, object detection, targeting-adjacent workflows, and even defense procurement (buying arms) — warfare is a leading AI deployment context.
- The Pentagon–Anthropic dispute has surfaced public details via court processes, accelerating transparency.
- Speculation: AI may have been used in early targeting in a recent Iran-related conflict; the broader theme is how much human judgment remains in the loop.
- Observation: Conflicts (Ukraine, Iran) are yielding unprecedented open data. Adversaries can infer strategies and plan counters; AI+data could reshape warfare and deterrence.
- Phil’s perspective:
- Hard questions: What if agents identify targets, recommend escalation, control drones, go rogue, or act faster than human accountability can keep up? Who authorizes an action, who audits it, who can stop it, and who is accountable?
- Stakes are life-and-death; this is the sharpest edge of the guardrails debate.
- Takeaway: The genie is out of the bottle for defense AI. The debate must center on human control, auditability, and fail-stops in high-stakes contexts.
Practical setbacks: Starbucks and Microsoft
- Starbucks (Patricia):
- Reported rollback of an AI-powered inventory tool in North America ~9 months after deployment due to miscounts/“hallucinations,” causing stock inaccuracies that directly impact revenue (e.g., missing caramel syrup for orders).
- Lesson: Mission-critical ops (inventory) can’t tolerate AI unreliability.
- Microsoft and Claude (group discussion):
- Reports/rumors: Microsoft is pulling back employee use of Claude/Claude Code; a migration deadline was mentioned. Possible reasons include poor ROI and high costs. The group also mentioned usage limits increasing after an Anthropic compute deal.
- Alternative hypothesis (Michael): May not fully end Anthropic partnership; employees could be pushed to use Copilot, which may call Claude under the hood.
- Patricia: Could be prioritizing first-party Copilot as it matures. Exact scope remains unclear until further disclosures.
- Meta-note: This echoes a claim (surfaced via Polymarket) that a consulting client overspent by ~$500M in a month due to uncapped cloud usage, highlighting the risk of runaway compute costs.
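As a rough illustration of what "capped" usage could look like, here is a minimal sketch of a hard spend cap with an early-warning threshold. The class, the prices, and the 80% alert level are assumptions made for the example, not details from the discussion.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a request would push spend past the hard cap."""


@dataclass
class SpendCap:
    monthly_cap_usd: float          # hypothetical hard limit
    alert_fraction: float = 0.8     # warn once this share of the cap is used
    spent_usd: float = 0.0

    def charge(self, tokens: int, usd_per_million_tokens: float) -> bool:
        """Record a request's cost; return True once the alert threshold is reached."""
        cost = tokens / 1_000_000 * usd_per_million_tokens
        if self.spent_usd + cost > self.monthly_cap_usd:
            # Refuse the request instead of silently overspending.
            raise BudgetExceeded(
                f"request costing ${cost:.2f} would exceed the ${self.monthly_cap_usd:.2f} cap"
            )
        self.spent_usd += cost
        return self.spent_usd >= self.alert_fraction * self.monthly_cap_usd
```

A caller would wrap each model request in `charge(...)` and treat `BudgetExceeded` as a signal to stop or to escalate to a human, rather than letting usage run uncapped.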
Lightweight, on-device AI — Liquid AI’s LFM‑2.5
- Announcement (Michael):
- Liquid AI released LFM‑2.5 during the show: a 1.5B‑parameter, device-optimized model trained on ~38T tokens, runnable on a single GPU, designed for phones, laptops, PCs, robots, and lightweight server use cases. Strong at computer/device control. They also released “Local Cohort,” an open-source local co-working/agent framework.
- Privacy and cost strategy: Use a local model as a first-pass filter or orchestrator, reducing data sent to larger/expensive frontier models, and lowering risk of sharing sensitive data with intermediaries.
- Phil’s perspective:
- Aligns with the “hide the wires”/ubiquity narrative: next frontier may be lighter, fluid, edge models rather than ever-bigger ones. Distribution inflection happens when AI lives in your pocket, akin to the miniaturization of computers.
- Patricia’s perspective:
- Most phone use cases don’t need top-tier frontier models like Opus 4.7; smaller models suffice and save money, water, and energy.
- Takeaway: Expect a bifurcation — heavy frontier models for specialized reasoning and lighter edge models for everyday tasks, privacy, and cost control.
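The "local model as first-pass filter" strategy Michael describes can be sketched as a small router. This is an illustration under stated assumptions, not Liquid AI's API: both models are stand-in callables, the redaction patterns are deliberately simple, and the escalation rule is a toy heuristic.

```python
import re
from typing import Callable

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
LONG_NUMBER = re.compile(r"\b\d{9,}\b")   # e.g. account or card numbers


def redact(text: str) -> str:
    """Mask obvious identifiers before anything leaves the device."""
    text = EMAIL.sub("[EMAIL]", text)
    return LONG_NUMBER.sub("[NUMBER]", text)


def needs_frontier(prompt: str, max_local_words: int = 40) -> bool:
    """Toy escalation rule: long or explicitly analytical prompts go upstream."""
    hard_markers = ("prove", "analyze", "step by step")
    lowered = prompt.lower()
    return len(prompt.split()) > max_local_words or any(m in lowered for m in hard_markers)


def route(prompt: str,
          local_model: Callable[[str], str],
          frontier_model: Callable[[str], str]) -> tuple[str, str]:
    """Return (which_model, answer). Only redacted text is sent upstream."""
    if needs_frontier(prompt):
        return "frontier", frontier_model(redact(prompt))
    return "local", local_model(prompt)
```

In this arrangement everyday requests never leave the device, and the larger model only sees a redacted prompt, which is the cost and privacy trade-off the panel describes.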
Financializing AI “tokens” — China’s token futures idea
- Report (Michael):
- China is exploring an AI token futures market — treating model usage units (“tokens” in the LLM sense) like commodities (energy/bandwidth), enabling hedging. Not to be confused with cryptocurrencies.
- Blockchain angle: A natural fit for on-chain marketplaces and measurement; decentralized inference providers exist today, and projects (named in conversation) operate in adjacent spaces.
- Phil’s perspective:
- Tokenizing usage reframes competition around energy, latency, and cost per token — not just GPU counts.
- Developers could avoid shock bills by hedging; aligns with making AI feel like a utility (electricity/bandwidth). Cites Sam Altman’s earlier commentary about monetizing the value harvested by models.
- Patricia’s perspective:
- Implementation matters for consumers: If only large labs sell tokenized usage, it could entrench pricing power and raise costs. A more open model (analogous to mining or running nodes) allowing individuals to produce/sell tokens would be better for consumers.
- Michael’s caution (the “dark side”):
- Speculators could buy tokens as contracts without intent to use, pushing up prices for real users; labs get paid without running compute. Similar distortions occur in oil/commodity markets.
- Phil’s human-impact lens:
- Risk of an “intelligence economy” where people must budget tokens like groceries; students ration tokens to learn, or freelancers burn tokens to stay competitive. Ensuring choice and competition is critical to avoid dystopian outcomes.
- Takeaway: This could birth a new financial market with both promising hedging benefits and serious affordability/ethics risks. Early regulatory engagement will be crucial.
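To make the hedging argument concrete, here is a small worked sketch. It assumes cash-settled futures priced in USD per million tokens; the contract design and every number are hypothetical, since the Space did not describe how such a market would actually settle.

```python
def net_cost(usage_m_tokens: float, spot_price: float,
             hedged_m_tokens: float = 0.0, futures_price: float = 0.0) -> float:
    """Cost of buying usage at spot, minus the payoff of a long futures position.

    Prices are in USD per million tokens; futures are assumed cash-settled.
    """
    spot_bill = usage_m_tokens * spot_price
    futures_payoff = hedged_m_tokens * (spot_price - futures_price)
    return spot_bill - futures_payoff


# A developer expects to use 500M tokens and locks in $10 per million.
unhedged_if_price_rises = net_cost(500, 15.0)              # 7500.0
hedged_if_price_rises = net_cost(500, 15.0, 500, 10.0)     # 5000.0
hedged_if_price_falls = net_cost(500, 6.0, 500, 10.0)      # 5000.0
```

A full hedge fixes the bill at the futures price whichever way spot moves, which is the "no shock bills" benefit Phil describes. The same formula also shows Michael's concern: a speculator with zero usage and the same long position gets `net_cost(0, 15.0, 500, 10.0) == -2500.0`, a pure gain from the price rise that real users pay for.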
Live breaking news during the Space
- Liquid AI: LFM‑2.5 model announced (device-optimized, local-first focus) along with Local Cohort framework.
- Anthropic:
- Rumors via Polymarket about a forthcoming “Claude Mythos” (prediction-market odds priced in a near-term release); chatter that big enterprises were given a head start to patch vulnerabilities before public release.
- Confirmed during the show: Anthropic released Claude Opus 4.8 with improved judgment, greater transparency, and longer autonomous task execution, reportedly at the same price. Users received in-app upgrade prompts; it was also noted that token consumption is higher than with other models.
- Meta-observation: The hosts repeatedly received news pings and pinned tweets as releases happened, underscoring the pace of change.
Community, ecosystem, and calls-to-action
- HLX AI partner program: Michael invited AI companies to join the HLX AI partner program (see pinned info/website) to participate in open-source foundations and subcommittees, including AI security and privacy.
- Show cadence: Next episode scheduled for Thursday at 12 PM ET.
- Engagement: Encouragement to use, test, and compare tools (e.g., Grok CLI), stay informed, and look for opportunities amid rapid change.
Key takeaways
- Safety vs. speed: The US appears to have deprioritized an AI safety executive framework in favor of pace and competitiveness. Guests argue for proactive, audited guardrails and cross-border standards to avoid fragmenting the ecosystem and handicapping responsible actors.
- AGI timelines diverge widely (from “this year” to 2029), but practical agentic systems already exist. Energy and distribution (edge vs. data center) may be as decisive as model size.
- Defense is already an AI deployment frontier, raising hard questions about authorization, oversight, and accountability.
- Enterprises are course-correcting on applied AI (e.g., Starbucks inventory, Microsoft’s internal model choices), emphasizing reliability, cost control, and first-party integration.
- Lightweight, on-device models are accelerating, offering cost, privacy, and environmental benefits for everyday tasks.
- Turning LLM tokens into tradable commodities could be powerful (hedging/utility framing) but risks market distortions and consumer harm if not designed for broad participation and fairness.
- The news cycle is blistering: model updates (e.g., Opus 4.8) can land mid-show. Staying current and hands-on remains a competitive advantage.
Notable acknowledgments
- Patricia highlighted Hedera’s milestone (100M HBAR volume on Dead Pixels Ghost Club via Syntax), reflecting continued growth in the ecosystem.
- Giveaway winner: X Raptor (to be contacted for $50 in Pack token).
Final remarks
- The hosts emphasize learning by doing, following trusted sources, and joining collaborative initiatives (like HLX AI’s partner program) to shape security, privacy, and deployment standards. Rapid iteration, informed discussion, and responsible guardrails are recurring themes throughout the session.
