bouncer

Alex Ziskind · 160.5K views · 5.1K likes

Analysis Summary

30% Low Influence
mild · moderate · severe

“Be aware that the 'hidden' nature of this feature is a framing device; the command is a documented system parameter, and the primary risk is system instability which the creator demonstrates but downplays for entertainment.”

Transparency Mostly Transparent
Primary technique

Performed authenticity

The deliberate construction of "realness" — confessional tone, casual filming, strategic vulnerability — designed to lower your guard. When someone appears unpolished and honest, you evaluate their claims less critically. The spontaneity is rehearsed.

Goffman's dramaturgy (1959); Audrezet et al. (2020) on performed authenticity

Human Detected
98%

Signals

The video features a distinct personal voice with natural conversational flow, spontaneous reactions to software behavior, and unique humor that deviates from formulaic AI scripts. The content is highly specific to the creator's hands-on testing and personal hardware, confirming human production.

Natural Speech Patterns: Transcript includes self-correction ('You can't download more RAM, but...'), rhetorical questions, and conversational filler ('right?', 'Well', 'Why the heck not?')
Personal Anecdotes and Humor: The creator makes a recurring joke about 'downloading RAM' and includes a scripted humorous aside with a 'mom' character.
Live Demonstration and Error Handling: The narrator reacts in real time to a model failing to load ('Oh, what happened here? Oh no') and explains the logic behind the failure.
Technical Expertise and Voice Consistency: The explanation of powers of two and sudo commands is delivered with natural prosody and emphasis typical of an experienced tech educator.

Worth Noting

Positive elements

  • This video provides a specific, functional terminal command that is genuinely useful for developers trying to run larger LLMs on limited hardware.
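For context, the setting being adjusted is almost certainly macOS's `iogpu.wired_limit_mb` sysctl (the transcript describes, but never prints, the exact command; the summary above confirms it is a documented system parameter). A minimal sketch of checking it, assuming an Apple Silicon Mac on a recent macOS:

```shell
# Read the current GPU wired-memory limit. A value of 0 means "use the
# macOS default", roughly 75% of unified memory per the video.
key="iogpu.wired_limit_mb"
if [ "$(uname)" = "Darwin" ]; then
  sysctl "$key"
else
  echo "$key is a macOS-only sysctl"
fi
# To raise the limit to 8 GB (8192 MB), run manually with admin rights:
#   sudo sysctl iogpu.wired_limit_mb=8192
```

Because a value set this way does not survive a reboot, an unstable setting is recoverable by simply restarting the machine.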

Be Aware

Cautionary elements

  • The use of 'revelation framing' (calling a setting 'hidden' or 'unlocked') can lead viewers to bypass caution when executing administrative commands.

Influence Dimensions

About this analysis

Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.

This analysis is a tool for your own thinking — what you do with it is up to you.

Analyzed: March 23, 2026 at 20:38 UTC
Model: google/gemini-3-flash-preview-20251217
Prompt Pack: bouncer_influence_analyzer 2026-03-08a
App Version: 0.1.0
Transcript

If your Apple silicon machine, like a MacBook or a little Mac Mini, doesn't have much RAM, and you still want to run large language models that are decently sized, you can download more RAM. You can't download more RAM, but there is a trick you can do to allow your system to use more memory. Here, I've got the base model M4 MacBook Air, and it only has 16 GB of unified memory. So, when I open up LM Studio, and I try to load GPT-OSS 20B, a pretty popular model, look at that. It's 11.28 GB on disk, which should be okay for a 16 GB machine, right? Let's load it up. Load model. There it goes. It's trying, desperately trying to load that model. Oh, what happened here? Oh no, failed to load the model. Oh, please check the settings and try loading it again. That's not going to help. You can't just check the settings without doing anything and try loading it again. That's like the definition of insanity. You need to make some changes.

So, let's take a look at LM Studio. You can go to settings down here, and if you go to the hardware tab, it'll show you what you have available. There you go. Apple M4, RAM is 16 GB, and VRAM is 11.84 GB. So, LM Studio actually tells you you can't use all of 16 GB. There's some memory that needs to be allocated for other tasks, like your operating system and other things you're running. There's things like the kernel, IO buffers, window server, GPU driver, compressed memory pool, and all the background tasks. You can see some of these actually running. If you go to Activity Monitor and take a look, there they are. Hey, look at that. They're using some memory. Now, macOS is pretty good at handling this stuff, putting aside what's not being used, compressing the rest, and things like that. But there is a limit to it, right? If you're trying to load a model that's going to be bigger than this 11.84 GB (with context, that is), because when you start to load GPT-OSS 20B, estimated memory usage is 9.26 GB, total is 12.34 GB.
So, we need a little bit more, don't we? I'm going to show you how to download this RAM. All right, enough of that joke. I'm not going to show you how to download RAM. I'm going to show you this command here right now. If you take a look at this command, sudo, which means you're going to need to execute this from an admin account. >> Hey, mom. Some guy on the internet told me to run a weird command on my computer. >> Okay, just be careful, dear. >> It's like, hey, Mac, what's the maximum amount of memory the GPU is allowed to lock and keep for itself? Right, kid? >> Huh? And then this command will show you what this limit is set to right now. You need to enter your password. And usually it's zero by default. And that's the default setting, but you can change that. 8192. Why the weird number, Alex? Well, these all have to be powers of two. Like memory, it goes 2, 4, 8, uh, 16, and it goes up from there. 8192 is 4096 times 2. These are all powers of two, and it's used all throughout computing. That's how things just work really well. So, I'm going to set that. And now, if I run that initial command to check and not set: 8192. Let's restart LM Studio. What do you think is going to happen here? I've just set the memory limit in megabytes to basically half of 16 GB. So, if I check hardware, look at that. VRAM is now 8 GB instead of 11.84. So, it's even less. You get where I'm going here, right? I'm just showing you that basically you can alter it. You can make it less, you can make it more. And LM Studio is a good tool to show you it graphically, which is pretty cool.

So now, let's go a little higher so we can actually load that model. Well, should we set it to 16 GB? No, because if you do, you're going to break things. Why the heck not? You know what? This channel is for that kind of thing. I break things so you don't have to. I'm going to set it to 16,000, and this is probably not going to be good. Yeah, it's set to 16,000. Let's quit LM Studio. Hey, maybe it'll work. I don't know.
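The "weird number" arithmetic can be checked directly; a quick sketch of the doubling sequence (sizes in megabytes, values taken from the video):

```shell
# Memory sizes in the video are powers of two, expressed in megabytes:
# 2048, 4096, 8192, 16384 -- each one double the previous.
half=4096
limit=$((half * 2))
echo "$limit"   # prints 8192, the first value the creator sets
```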
Let's see. Things are still working and snappy. Let's check hardware. VRAM is set to 15.63 GB. Um, I'm pretty sure that's not a good idea. This is probably not very safe, but I'm going to go ahead and try to load this 20 billion parameter model. Load. Let's see what's happening here. Let's take a look at the Activity Monitor and the memory pressure. The memory used is going up. And there it goes. It's going up. Oh boy. This is making me a little nervous here. Look at that. It's orange now. And now it's going back down. And it loaded. It loaded. What? Okay, I'm going to keep an eye on that little window there while I try to run a prompt. New chat. Write a story. Boom. It doesn't matter what the prompt is. It's just going to generate text no matter what. And it's working. That's crazy. The memory used is 15.65 GB. The memory pressure is crazy high. It's not in the red, though. Sometimes it turns red, and that's really not good. But this is actually working.

Do I recommend that you set that setting? No. You want to leave a little bit of room, unless this is all you're doing with this machine, if you're operating your Mac Mini in headless mode, for example, and you can do that on all the Apple silicon machines and Mac Studios. In fact, back here I have a cluster of Mac Studios where I can run really gigantic models, sharing the memory between all of those. I had to go in there and manually adjust the available memory. Each one of those boxes has 512 GB of memory, but by default, that wired limit in megabytes is set to zero, which by default gets calculated to about 75% of what's available. So, if you want to squeeze a little bit more out of that to be able to run larger models, that's the setting you want to use. Now, going back to a small machine like this one, look at that: 16.72 tokens per second. I'm going to actually set the memory limit to something more reasonable.
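The "about 75% of what's available" default mentioned for the Mac Studios can be worked out; a sketch assuming the 75% figure is exact (the video presents it only as an approximation):

```shell
# Default wired limit when iogpu.wired_limit_mb is 0: about 75% of
# unified memory, per the video. For a 512 GB Mac Studio:
total_mb=524288                   # 512 GB expressed in MB
default_mb=$((total_mb * 3 / 4))
echo "$default_mb MB"             # prints "393216 MB", i.e. 384 GB
```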
And this is the number I'm going to use: 14336, on a 16 gigabyte machine, which is still not the safest thing you can do, but it'll leave me a couple of gigabytes for some background processes and still have enough room to run GPT-OSS. Going to restart LM Studio, because it needs to restart to detect those changes. And yeah, so 14 GB now is allocated to VRAM. And now I can load my model, GPT-OSS 20B. There's that memory pressure going up again. But in the end, it loaded. Write a story; it thought for a brief moment, and off it goes. Now, if you want to see some huge models running, check out my video on that cluster over there. That video is right over here. Thanks for watching, and I'll see you next time.
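The final 14336 figure is simply total memory minus about 2 GB of headroom; a sketch of that calculation (the 2 GB reserve is my reading of the video, not a stated rule):

```shell
# The "more reasonable" limit from the video: 16 GB of unified memory
# minus ~2 GB of headroom for macOS and background processes.
total_mb=16384
reserve_mb=2048
limit_mb=$((total_mb - reserve_mb))
echo "iogpu.wired_limit_mb=$limit_mb"   # prints "iogpu.wired_limit_mb=14336"
```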

Video description

Here’s the one change that let me use more RAM for LLMs on my MacBook Air, and it works on all Apple Silicon computers

🛒 Gear Links 🛒
🪛🪛 Highly rated precision driver kit: https://amzn.to/4fkMVfg
💻☕ Favorite 15" display with magnet: https://amzn.to/3zD1DhQ
🎧⚡ Great 40Gbps T4 enclosure: https://amzn.to/3JNwBGW
🛠️🚀 My nvme ssd: https://amzn.to/3YLEySo
📦🎮 My gear: https://www.amazon.com/shop/alexziskind

🎥 Related Videos 🎥
🧳🧰 Mini PC portable setup - https://youtu.be/4RYmsrarOSw
🍎💻 Dev setup on Mac - https://youtu.be/KiKUN4i1SeU
💸🧠 Cheap mini runs a 70B LLM 🤯 - https://youtu.be/xyKEQjUzfAk
🧪🔥 RAM torture test on Mac - https://youtu.be/l3zIwPgan7M
🍏⚡ FREE Local LLMs on Apple Silicon | FAST! - https://youtu.be/bp2eev21Qfo
🧠📉 REALITY vs Apple’s Memory Claims | vs RTX4090m - https://youtu.be/fdvzQAWXU7A
🧬🐍 Set up Conda - https://youtu.be/2Acht_5_HTo
⚡💥 Thunderbolt 5 BREAKS Apple’s Upcharge - https://youtu.be/nHqrvxcRc7o
🧠🚀 INSANE Machine Learning on Neural Engine - https://youtu.be/Y2FOUg_jo7k
🧱🖥️ Mac Mini Cluster - https://youtu.be/GBR6pHZ68Ho
🛠️ Developer productivity Playlist - https://www.youtube.com/playlist?list=PLPwbI_iIX3aQCRdFGM7j4TY_7STfv2aXX
🔗 AI for Coding Playlist 📚 - https://www.youtube.com/playlist?list=PLPwbI_iIX3aSlUmRtYPfbQHt4n0YaX0qw

❤️ SUBSCRIBE TO MY YOUTUBE CHANNEL 📺
Click here to subscribe: https://www.youtube.com/@AZisk?sub_confirmation=1

Join this channel to get access to perks: https://www.youtube.com/channel/UCajiMK_CY9icRhLepS8_3ug/join

📱 ALEX on X: https://x.com/digitalix

#macbook #llm #llamacpp

© 2026 GrayBeam Technology Privacy v0.1.0 · ac93850 · 2026-04-03 22:43 UTC