Jellypod, Inc.

48-Hour AI

Photoshop for AI, Agentic Skills, and Visual Intelligence

James Turner explores the latest AI model breakthroughs, from layered image decomposition to evolving agent skills and interpretability tools. With hands-on examples and lively discussions, the episode uncovers how recent releases are reshaping creativity, productivity, and transparency in AI.

This show was created with Jellypod, the AI Podcast Studio. Create your own podcast with Jellypod today.


Chapter 1

Layered Images and Creative Tooling

James Turner

Hey everyone, welcome back to 48-Hour AI, I’m James Turner, and today we’re kind of picking up right where we left off last time—right in the creative trenches of new AI model releases. Okay, so if you are even semi-following AI news on Twitter—or, honestly, just in Discord meme threads—then you probably saw the buzz around this Qwen-Image-Layered model from Alibaba. Yeah, Qwen-Image-Layered—big name, big ambitions. They actually call it “Photoshop-grade” layered image decomposition, which… sometimes I feel like that’s marketing oversell, but in this case, I played with it and, I’ve got to admit, it actually feels about right. We’re talking about real, physically separate RGBA layers, anywhere from like three up to ten, all just out of a prompt. You define what you want split out, and the model can even do this recursive thing—like, infinite decomposition if you want to break down a layer into even more sublayers. That’s wild. You can tell it's for creators who love having total control.
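To make the layer idea concrete, here is a minimal sketch of how a stack of RGBA layers recombines into a single image using the standard "over" alpha-compositing operator. It does not call Qwen-Image-Layered at all. The `over` and `flatten` helpers and the one-pixel layers are made up for illustration, and the sketch assumes straight (non-premultiplied) alpha with float channels in [0, 1].

```python
def over(top, bottom):
    """Composite one RGBA pixel over another (straight alpha, floats in [0, 1])."""
    tr, tg, tb, ta = top
    br, bg, bb, ba = bottom
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        # Both pixels are fully transparent, so there is no color to report.
        return (0.0, 0.0, 0.0, 0.0)

    def blend(t, b):
        return (t * ta + b * ba * (1.0 - ta)) / out_a

    return (blend(tr, br), blend(tg, bg), blend(tb, bb), out_a)


def flatten(layers):
    """Flatten a bottom-to-top list of same-sized layers (rows of RGBA pixels)."""
    result = layers[0]
    for layer in layers[1:]:
        result = [
            [over(t, b) for t, b in zip(top_row, bottom_row)]
            for top_row, bottom_row in zip(layer, result)
        ]
    return result


# Hypothetical 1x1 layers: an opaque blue background and a half-transparent red overlay.
background = [[(0.0, 0.0, 1.0, 1.0)]]
overlay = [[(1.0, 0.0, 0.0, 0.5)]]

image = flatten([background, overlay])
print(image[0][0])  # (0.5, 0.0, 0.5, 1.0)
```

Because each layer keeps its own alpha channel, editing or dropping one layer and flattening again leaves the others untouched, which is the property that makes layered output useful for editing.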

James Turner

But here’s the catch: you need hardware muscle, a lot more than I expected. The unquantized model is 40GB. I loaded it up on my main PC, you know, the one I keep upgrading with a new graphics card every other year, and the fans went full jet-engine mode and then… nothing. Out of memory. You know those memes about servers gorging on DDR5 sticks while personal computers just watch sadly from the sidelines? Yeah, relatable. I had to give up and fire up a cloud instance. So if you’re like me and constantly hitting those resource walls, trust me, you are not alone. The Reddit and Discord threads are full of people griping about this exact thing. Accessibility is still a pain point. But man, once you get it running and see text pulled cleanly off the background as its own layer, editable in literally one click, it’s like, this is the thing a lot of us have been waiting for: AI-generated images that come out natively editable.
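For a rough sense of why quantization matters here, the back-of-the-envelope sketch below scales the quoted 40GB figure by bits per weight. The episode does not say what precision the unquantized checkpoint uses; the 16-bit starting point is an assumption, and `quantized_size_gb` is a made-up helper. Real quantized files are usually somewhat larger, and inference needs extra memory on top of the weights.

```python
def quantized_size_gb(size_gb, source_bits=16, target_bits=8):
    """Scale a checkpoint size by the ratio of bits per weight.

    Ignores activations and any layers a quantizer keeps at higher
    precision, so treat the result as a rough lower bound.
    """
    return size_gb * target_bits / source_bits


UNQUANTIZED_GB = 40  # figure quoted in the episode
ASSUMED_SOURCE_BITS = 16  # assumption: not stated in the episode

for bits in (16, 8, 4):
    estimate = quantized_size_gb(UNQUANTIZED_GB, ASSUMED_SOURCE_BITS, bits)
    print(f"{bits}-bit weights: about {estimate:.0f} GB")
```

Under that assumption, a 4-bit version would need roughly a quarter of the weight memory, which is the difference between needing a cloud instance and possibly fitting on a high-end consumer card.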

James Turner

While we’re on creative tooling: Kling 2.6 has artists and animators buzzing about motion control. It goes beyond prompts and lets you build repeatable, high-action character workflows, especially v2v (video-to-video) setups. And Runway’s GWM-1 just landed its Worlds and Robotics updates, which are all about consistent camera control and interactivity across multi-shot edits. They’re running creator challenges, too, which is such a cool way to push the tech. It honestly feels like we’re entering this “AI creator loop” where every week there’s a new model or feature that unlocks something wild. Anyway, if you’re struggling to get these tools running locally, or you’re just here to gawk at what’s possible, I get it. There’s definite friction, but what’s available now would’ve been pure science fiction less than a year ago.

Chapter 2

Agentic AI and the Rise of Skills

James Turner

Now, shifting gears (but not really, since everything blends together in this AI world), we’ve got to talk about what just dropped over at OpenAI: Codex skills. Just yesterday, we were chatting on a call about how ‘skills’ are the next big thing for agent development, and then, almost to prove the point, Codex rolled out skills as packaged, reusable bundles. Basically, instead of agents hacking at every problem from scratch, you can now plug in skills that do super-specific things, like reading Linear tickets or automatically fixing CI failures. What I love is that you don’t have to select skills manually; the agent can pick the right one based on context and call it as needed. It’s a push towards skills that can actually interoperate across platforms. Feels like less glue code and more things just working together, if that makes sense.
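As a toy illustration of context-based skill selection, the sketch below matches a task description against keyword sets and picks the best-scoring skill. This is not the Codex skills format or API. The `Skill` class, `select_skill` function, and both example skills are hypothetical, and a real agent would presumably let the model choose from skill descriptions rather than count keywords.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Skill:
    """Hypothetical packaged skill: a name, trigger keywords, and an action."""
    name: str
    keywords: frozenset
    run: Callable[[str], str]


def select_skill(task: str, skills: List[Skill]) -> Optional[Skill]:
    """Return the skill whose keywords overlap most with the task, or None."""
    words = set(task.lower().split())
    best, best_score = None, 0
    for skill in skills:
        score = len(words & skill.keywords)
        if score > best_score:
            best, best_score = skill, score
    return best


SKILLS = [
    Skill("read_ticket", frozenset({"ticket", "linear", "issue"}),
          lambda task: f"[read_ticket] {task}"),
    Skill("fix_ci", frozenset({"ci", "build", "failing"}),
          lambda task: f"[fix_ci] {task}"),
]

task = "the ci build is failing on main"
chosen = select_skill(task, SKILLS)
print(chosen.name if chosen else "no matching skill")  # fix_ci
```

Returning `None` when nothing matches keeps the fallback explicit: the agent handles the task without a skill instead of forcing a poor fit.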

James Turner

It highlights a shift in how we think about agents, honestly. There’s this whole “agent + harness” model people are raving about in the forums and Discords. You’ve got your agent—so that’s your model, your prompts, your tool integrations, memory, the whole shebang. Then there’s the harness, which is like… execution context, resource permissions, policy modes, all that boring but necessary stuff. And harnesses? They’re shipping as products now because they bundle those real workflow features, like planning modes, memory compaction, or resource gating, so non-developers can actually trust—and control—what these models are doing. It’s getting mature, which is wild.
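The agent and harness split can be sketched in a few lines. In this hypothetical example the agent side is reduced to plain tool functions, and the `Harness` class plays the boring-but-necessary role: it checks permissions and a read-only policy mode before any tool runs, and it keeps a log. None of these names come from a real product.

```python
from dataclasses import dataclass, field


@dataclass
class Harness:
    """Hypothetical harness: decides which tool calls an agent may execute."""
    allowed_tools: set
    read_only: bool = False
    log: list = field(default_factory=list)

    def call(self, tool_name, tool_fn, *args, mutates=False):
        # Resource gating: unknown tools are rejected outright.
        if tool_name not in self.allowed_tools:
            self.log.append(f"denied {tool_name}: not permitted")
            raise PermissionError(tool_name)
        # Policy mode: read-only sessions block anything that changes state.
        if mutates and self.read_only:
            self.log.append(f"denied {tool_name}: read-only mode")
            raise PermissionError(tool_name)
        self.log.append(f"ran {tool_name}")
        return tool_fn(*args)


def read_file(path):
    return f"contents of {path}"  # stand-in for a real file read


def delete_file(path):
    return f"deleted {path}"  # stand-in for a destructive action


harness = Harness(allowed_tools={"read_file"}, read_only=True)
print(harness.call("read_file", read_file, "notes.txt"))

try:
    harness.call("delete_file", delete_file, "notes.txt", mutates=True)
except PermissionError:
    print("blocked by harness")

print(harness.log)
```

Keeping the policy outside the agent is the design point: the same agent can run under a strict harness for a non-developer and a permissive one for local experiments.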

James Turner

But, okay, can we talk about the “sloperator” meme for a second? I mean, the name alone. It’s like someone took ‘prompt engineer’ and just tossed whatever dignity was left straight out the window. There’s this meme floating around, poking fun at how AI creates new job titles that sometimes feel like we’re just—well, slopping stuff together and calling it a day. People are joking about adding ‘sloperator’ or ‘slopchestrator’ to their LinkedIn. But under the hood, it’s kind of true: the wave of new agent tools like Codex or Claude Code means anyone tinkering with prompts and command workflows suddenly feels like a cross between a data janitor and a product designer. The jobs are evolving, and nobody’s really figured out what these titles mean yet. It’s progress, I guess? Even if we have to laugh at ourselves along the way.

Chapter 3

Model Reliability, Interpretability, and Visual Debugging

James Turner

Alright, let’s round out with something that’s always lurking beneath the hype: model reliability and interpretability. If you’re watching model drops and benchmarks closely, you probably caught the never-ending churn. Gemini Flash climbs some charts, GPT-5.2 blows past benchmarks under huge context windows, but try using any of these in a real pipeline and suddenly ‘reliability’ means something totally different. We’ve got this trend of ‘degradation’ discourse, especially around Anthropic’s Opus 4.5, where folks claim outputs get worse or “doomloop” as models see more production traffic. Some say it’s real, some say it’s users expecting mind-reading. And don’t get me started on the tool-using model stunts, like GPT-5.1 literally calling a calculator tool for “1+1” just because it learned during training that calling the tool is a rewarded shortcut. It’s almost funny, but also, wow, tool design and reward structure matter a ton.

James Turner

So, how do we actually debug this stuff? That’s where interpretability is suddenly hot again. Google DeepMind just released Gemma Scope 2, which, by the way, is the largest open interpretability suite I’ve ever seen. Sparse autoencoders and transcoders for every layer of the Gemma 3 models, up to 27B parameters, and the demo is wild. There’s also Seer: imagine a repo that standardizes interpretability workflows, so you don’t spend half your life setting up the basics.
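For anyone who has not met a sparse autoencoder before, here is a minimal textbook-style sketch of the forward pass and loss in plain Python. It is not Gemma Scope 2 code and makes no claim about that suite's architecture or training setup. The tiny hand-picked weights, the plain ReLU activation, and the `l1_coeff` value are all assumptions chosen so the arithmetic is easy to follow.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def sae_forward(x, w_enc, b_enc, w_dec, b_dec):
    """Encode an activation vector into features, then decode it back."""
    pre_activations = [z + b for z, b in zip(matvec(w_enc, x), b_enc)]
    features = [max(0.0, z) for z in pre_activations]  # ReLU zeroes most features
    recon = [z + b for z, b in zip(matvec(w_dec, features), b_dec)]
    return features, recon


def sae_loss(x, features, recon, l1_coeff=0.01):
    """Reconstruction error plus an L1 penalty that pushes features toward zero."""
    mse = sum((a - b) ** 2 for a, b in zip(x, recon)) / len(x)
    return mse + l1_coeff * sum(abs(f) for f in features)


# Toy setup: a 2-dimensional activation expanded into 3 candidate features.
W_ENC = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0]
W_DEC = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B_DEC = [0.0, 0.0]

x = [2.0, 0.0]
features, recon = sae_forward(x, W_ENC, B_ENC, W_DEC, B_DEC)
print(features)  # [2.0, 0.0, 0.0]
print(recon)     # [2.0, 0.0]
print(sae_loss(x, features, recon))
```

Only one of the three features fires for this input, and the input is still reconstructed exactly. That combination, few active features with faithful reconstruction, is what makes individual features candidates for human-readable interpretation.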

James Turner

On a personal note, I spent last week chasing my tail debugging an LLM that kept returning this weird, repetitive phrase no matter how I changed the prompt. Fired up some of these new visualization tools, poked into neuron activations and the raw attention sequence, and, sure enough, found this emergent behavior where a few layers were basically hardwired after the last fine-tune. If I didn’t have these tools, I’d probably still be blaming the API, or worse, myself. That’s where we’re at now: we need better microscopes, not just bigger hammers. It all ties back to what we talked about in earlier episodes—model transparency, reward hacking, and why it takes so much more than just a good benchmark score to trust what we’re seeing from these systems.

James Turner

That’ll do it for today’s episode. If you’re feeling overwhelmed, trust me, you’re not alone—every new model drop is both exciting and, let’s be real, occasionally hair-pulling. Stay tuned, because the next 48 hours are almost guaranteed to shake things up again. Catch you all next time.