Transcript
Okay, hi everyone. Welcome to my talk, which is called Beyond Your First NIF. I'm not going to go too much over NIF basics, but just as a quick recap: what is a NIF? It's a function written in a native language that gets dynamically linked into the BEAM. A NIF has lots of nice properties: it can leak memory, it can hog the schedulers, it can crash the BEAM. So you could ask, why would I want to do that? The usual reasons are that you need raw speed, or you need to reuse existing libraries, usually C libraries, Rust libraries, whatever, or you need low-level system access. There are other ways to obtain that, but NIFs are one of them. Usually your first NIF looks like this: a NIF that just gets two integers as input and sums them. Today we're going to go a little beyond that and explore what else the erl_nif library offers us. We're going to cover data exchange between NIFs and the BEAM, we're going to try to write NIFs that don't block the scheduler, we're going to go over NIF resources and debugging faulty NIFs, and then we're going to quickly see a bunch of language choices in which you can write your NIF. For the examples I used C, so: warning, C ahead. All the examples will be available on GitHub, so don't sweat understanding the C quickly, because we have limited time, and it's not as nice as Elixir to read, but that's what you get. A quick introduction for myself: I'm Riccardo Binetti, you can find me online as rbino. I'm a senior BEAM engineer at Remote and I've been using Elixir as my main language since 2017. Recently I've been messing with NIFs for my TigerBeetle Elixir client, which is why I went down this rabbit hole. So let's start with some basic data exchange. The basic NIF signature looks like this: we're accepting an array of ERL_NIF_TERMs and we're returning an ERL_NIF_TERM.
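That first add-two-integers NIF can be sketched like this (a minimal illustration, not the speaker's actual slide code; the module name Elixir.MyNifs is made up, and this compiles only against the BEAM's erl_nif.h):

```c
#include <erl_nif.h>

/* add/2: takes two Erlang integers, returns their sum. */
static ERL_NIF_TERM add(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    int a, b;
    /* Fail with badarg if either term is not an integer. */
    if (!enif_get_int(env, argv[0], &a) || !enif_get_int(env, argv[1], &b))
        return enif_make_badarg(env);
    return enif_make_int(env, a + b);
}

static ErlNifFunc nif_funcs[] = {
    {"add", 2, add, 0},  /* name, arity, function pointer, flags */
};

/* The "Elixir." prefix is part of the module atom for Elixir modules. */
ERL_NIF_INIT(Elixir.MyNifs, nif_funcs, NULL, NULL, NULL, NULL)
```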
So everything that crosses between the Elixir or Erlang world and the NIF world has the ERL_NIF_TERM type. First of all, we need a way to convert terms from and to our native C types. To do so, the erl_nif library provides functions to extract terms: the enif_get_* and enif_inspect_* functions, which take a term and populate a native variable on the C side. This is what they look like. Here, for example, we are extracting an integer term: we declare a variable and call enif_get_int. This can return an error, for example if the term is not an integer, but once we extract it, the variable below has the same value as the term that came from the Elixir world. Of course the BEAM is dynamic, so we might have a NIF function that accepts an integer or a float or something else, while on the C side we're statically typed. So we need a way to switch on term types, and to do that we use either enif_term_type or enif_is_integer, enif_is_float and friends. The idea is that you have separate code paths on the C side to handle the different types. For example, enif_term_type returns an enum that tells you the actual type contained inside the term, and you can have a dedicated case clause to handle each one. As a last step, when we return data from a NIF or send it as a message, we have to convert it to an ERL_NIF_TERM. To do so, we use the enif_make_* functions, which accept an integer, a float or whatever and return an ERL_NIF_TERM. For example, here we are creating a tuple: enif_make_atom creates an atom with the value answer, we create a term for an integer, and then we use enif_make_tuple to return a tagged tuple containing the values we indicated.
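The extract, switch, and make steps just described can be sketched in one function (illustrative only; enif_term_type requires OTP 22 or newer):

```c
#include <erl_nif.h>

/* Dispatch on a term's dynamic type, then build {:answer, value} back. */
static ERL_NIF_TERM describe(ErlNifEnv *env, int argc,
                             const ERL_NIF_TERM argv[])
{
    switch (enif_term_type(env, argv[0])) {
    case ERL_NIF_TERM_TYPE_INTEGER: {
        int n;
        if (!enif_get_int(env, argv[0], &n))
            return enif_make_badarg(env);  /* e.g. a bignum too big for int */
        return enif_make_tuple2(env,
                                enif_make_atom(env, "answer"),
                                enif_make_int(env, n));
    }
    case ERL_NIF_TERM_TYPE_FLOAT: {
        double d;
        if (!enif_get_double(env, argv[0], &d))
            return enif_make_badarg(env);
        return enif_make_tuple2(env,
                                enif_make_atom(env, "answer"),
                                enif_make_double(env, d));
    }
    default:
        return enif_make_badarg(env);
    }
}
```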
[clears throat] Here are a couple more useful patterns you might use in your NIF. Here is how you iterate through a list. There is a function called enif_get_list_cell, and if you squint hard enough it looks like a reduce: each time you enter the loop you populate your head, you do something with your head term, and at the end of the loop you assign the tail to the rest of the list, and so on; enif_get_list_cell returns zero when the tail is empty. And here is how you build an Elixir struct. As you probably know, an Elixir struct is just a map with a special key. So you create your keys and values arrays, being careful to populate the __struct__ key with the name of the struct. Note that you have to add the Elixir. prefix manually: that prefix is implicit when dealing with Elixir aliases, but since here we are dealing with the Erlang side, we have to add it ourselves. After that, you call enif_make_map_from_arrays and return your map. So that's the basics of how you exchange data between a NIF and Elixir. Now, if we go back to the signature, there is a piece we've ignored so far, even though it appeared in basically every function call: the env variable. That env variable is of type ErlNifEnv*, and it's very important, because the environment is the place where all the terms live. All terms of type ERL_NIF_TERM belong to an environment of type ErlNifEnv, and most importantly, the lifetime of a term is controlled by the lifetime of its environment. This means you have to be very careful when interacting with a term and its environment. There are actually three different types of environment.
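Both patterns, the list loop and the struct map, can be sketched like this (illustrative; MyApp.Point is a hypothetical struct name, not from the talk):

```c
#include <erl_nif.h>

/* Sum an Elixir list of integers: the "squint and it's a reduce" loop. */
static int sum_list(ErlNifEnv *env, ERL_NIF_TERM list, long *out)
{
    ERL_NIF_TERM head, tail = list;
    long acc = 0;
    while (enif_get_list_cell(env, tail, &head, &tail)) {
        long v;
        if (!enif_get_long(env, head, &v))
            return 0;  /* a list element was not an integer */
        acc += v;
    }
    *out = acc;
    return 1;  /* the loop ended on the empty tail */
}

/* Build %MyApp.Point{x: 1, y: 2}: a map whose :__struct__ key names the
   struct, with the "Elixir." prefix spelled out explicitly. */
static ERL_NIF_TERM make_point(ErlNifEnv *env)
{
    ERL_NIF_TERM keys[] = {
        enif_make_atom(env, "__struct__"),
        enif_make_atom(env, "x"),
        enif_make_atom(env, "y"),
    };
    ERL_NIF_TERM values[] = {
        enif_make_atom(env, "Elixir.MyApp.Point"),
        enif_make_int(env, 1),
        enif_make_int(env, 2),
    };
    ERL_NIF_TERM map;
    if (!enif_make_map_from_arrays(env, keys, values, 3, &map))
        return enif_make_badarg(env);  /* fails on duplicate keys */
    return map;
}
```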
There are process-bound environments, which is the kind that gets passed as the first argument in our NIF calls: the env you saw before. There is the callback environment, which is very similar, but instead of being passed to NIFs it's passed to the callbacks we use when loading our NIF, which we'll see later. And the third type is the process independent environment, which is something we can create manually. The important distinction is that the first two are function scoped: you receive the environment at the beginning, but when the function terminates that environment is not valid anymore, while the process independent environment is the only one that can be created and destroyed at some point in the future, independently from the scope where it got created. This is very important, and a big foot gun, when you want to persist a term that belongs to a process-bound or callback environment outside its function scope. Here's what you must not do, and I've done it many times, so then I learned not to: create some term with the process-bound environment and then store that term in some kind of long-term storage. If you're doing this to match a reference between a request and a reply, you can return the reference here and it will be perfectly valid, because it's valid in the context of the process creating it. But when the NIF returns, long_term_storage.my_ref is invalid, so if you try to access it later it will, I don't know, do something bad, probably. What you have to do instead, if you want something similar, is still create your ref in the context of the process-bound environment; but then, to persist it, you create another environment, this time a process independent one, and call enif_make_copy, which creates another term that lives inside the environment passed as the first argument.
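The safe version of the pattern above can be sketched like this (illustrative; long_term_storage is a hypothetical global slot standing in for whatever storage you use, and a real NIF would guard it against concurrent access):

```c
#include <erl_nif.h>

/* Persisting a ref beyond the NIF call: copy it into a process
   independent environment first. */
struct stored_ref {
    ErlNifEnv   *env;  /* owns the copied term */
    ERL_NIF_TERM ref;
};
static struct stored_ref long_term_storage;

static ERL_NIF_TERM make_persistent_ref(ErlNifEnv *env, int argc,
                                        const ERL_NIF_TERM argv[])
{
    /* Lives in the process-bound env: dies when this call returns. */
    ERL_NIF_TERM ref = enif_make_ref(env);

    /* Process independent env: lives until we enif_free_env() it. */
    ErlNifEnv *priv_env = enif_alloc_env();
    long_term_storage.env = priv_env;
    long_term_storage.ref = enif_make_copy(priv_env, ref);

    /* Returning `ref` itself is fine: it is valid in the caller's env. */
    return ref;
}
```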
So in my_ref you have the same reference: looked at from the Elixir side, it looks identical. The big difference is that it lives in another environment, which means you can persist both the environment and the reference in some long-term storage and it will remain valid even after we are out of this function. It will remain valid as long as we don't invalidate the process independent environment. Another big gotcha concerns the inspect functions, enif_inspect_binary and enif_inspect_iolist_as_binary, which take a binary term and populate a buffer and a length, an ErlNifBinary. A foot gun that hit me while fiddling with them is that you don't have to free the binary. That should make you think: the reason you don't have to free it is that the binary's lifetime is tied to the environment. So if you extract a binary, you can't just persist its data; you still have to copy the term into a separate environment and extract it there to make sure the binary stays valid. The main takeaway from all of this is: please consider the environment before persisting your term. Now let me drink a little, and then we'll go through NIF scheduling. As you might remember from the great talk yesterday, this is a high-level view of how processes in the BEAM work. If you want more details, again, the talk yesterday was much better at explaining this, but basically you have one scheduler per CPU core, and it executes one process at a time. Here you can see the small boxes: each one is a process getting executed on a scheduler. Now, the big difference between Erlang processes and NIFs is that Erlang processes use preemptive scheduling, while NIFs use cooperative scheduling.
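The binary gotcha can be made concrete with a small sketch (illustrative; saved_env and saved_bin are hypothetical globals, and a real NIF would synchronize and eventually free them):

```c
#include <erl_nif.h>

/* An inspected binary's bytes are owned by the term's environment: to keep
   them past the NIF call, copy the *term* into a private env and inspect
   it there. */
static ErlNifEnv   *saved_env;
static ErlNifBinary saved_bin;

static ERL_NIF_TERM keep_binary(ErlNifEnv *env, int argc,
                                const ERL_NIF_TERM argv[])
{
    saved_env = enif_alloc_env();
    ERL_NIF_TERM copy = enif_make_copy(saved_env, argv[0]);
    if (!enif_inspect_binary(saved_env, copy, &saved_bin)) {
        enif_free_env(saved_env);
        return enif_make_badarg(env);
    }
    /* No free of the binary itself: saved_bin.data stays valid exactly
       until enif_free_env(saved_env). */
    return enif_make_atom(env, "ok");
}
```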
What this means concretely is that for Elixir processes, the scheduler is able to kick them out when their time is up. With NIFs it can't do so: it has to rely on the NIF being nice and saying, "please, go ahead." And sometimes that doesn't happen. So what does a good NIF citizen do? How much time should it take on a scheduler? The Erlang documentation mentions a suggested NIF runtime of about 1 millisecond. So once you enter the C section of your NIF, it should return to the caller in some way in about 1 millisecond. If it takes more, things start to get problematic on the scheduler side, and we're going to see a particularly bad example of this. This is a bad sleep function. It implements the same interface as :timer.sleep, but the difference is that it sleeps directly on the C side for a certain number of milliseconds. In the real world, of course, instead of sleeping you would do some actual work. So let's see what happens when this gets executed on the BEAM. To make the effect more clear, we will artificially reduce the number of schedulers to one. You can do it at startup with the line above, or at runtime with the line below. So, a quick demo. We reduce the number of schedulers, and this is the normal timer: we spawn a process that sleeps for 3 seconds and then prints something, and outside that process we print some other thing with the normal timer. We get the first print, and then we get the "slept" print after 3 seconds. Now let's see what happens if we use our bad sleep. Let's focus here: oh, both got printed after three seconds, because what happened is that we had just one scheduler and the NIF process was hogging it.
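The bad sleep described above can be sketched as follows (illustrative, mirroring the talk's idea rather than its exact code; usleep stands in for real blocking work):

```c
#include <erl_nif.h>
#include <unistd.h>

/* The "bad citizen": blocks the scheduler thread for the whole duration.
   Same interface as :timer.sleep/1, but the scheduler cannot preempt us. */
static ERL_NIF_TERM bad_sleep(ErlNifEnv *env, int argc,
                              const ERL_NIF_TERM argv[])
{
    unsigned long ms;
    if (!enif_get_ulong(env, argv[0], &ms))
        return enif_make_badarg(env);
    usleep(ms * 1000);  /* hogs the scheduler until it returns */
    return enif_make_atom(env, "ok");
}
```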
So it was just sitting there, sleeping with its arms crossed, and the other process couldn't get executed. The question is: how do we make our NIF a good BEAM citizen? There are three possible techniques. The first is dirty NIFs: you use dedicated schedulers to avoid disruption. In the previous diagram I showed you, I lied a little; this one is closer to the actual situation. In addition to the normal schedulers, you also have a dirty CPU scheduler for each CPU core, plus a number of dirty IO schedulers, 10 by default. Note that when you change the number of schedulers like we did before, the number of dirty CPU schedulers changes, but not the number of dirty IO schedulers, because those are meant to be independent from the actual number of CPUs. So what we can do to make our NIF nicer is statically declare that it's a dirty NIF, and we do that in the array where we declare all our NIFs: we pass the flag ERL_NIF_DIRTY_JOB_CPU_BOUND or ERL_NIF_DIRTY_JOB_IO_BOUND and it will then execute on a dirty scheduler. How do you determine whether a NIF is dirty CPU or dirty IO? Basically, it's dirty CPU if your fans spin up, and it's dirty IO if it's just passively waiting for someone else, like the network, the disk, or a timer. And the nice thing, as we'll see in the next technique, is that you can actually switch dynamically between the two dirty scheduler types: here we defined it statically, but you can also switch at runtime. The second technique is yielding NIFs. In this case, we split the work into small chunks. So here, instead of sleeping for the full number of milliseconds, we read how many milliseconds are remaining, decrease the count, and ask: are we done yet?
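Declaring a NIF dirty happens in the same ErlNifFunc array shown earlier; a sketch (illustrative, assuming the bad_sleep function from before):

```c
#include <erl_nif.h>

static ERL_NIF_TERM bad_sleep(ErlNifEnv *env, int argc,
                              const ERL_NIF_TERM argv[]);

/* The fourth ErlNifFunc field is the flags. IO_BOUND fits a sleep, since
   the NIF passively waits; use ERL_NIF_DIRTY_JOB_CPU_BOUND when "your
   fans go". */
static ErlNifFunc nif_funcs[] = {
    {"dirty_sleep", 1, bad_sleep, ERL_NIF_DIRTY_JOB_IO_BOUND},
};
```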
If we still have to sleep, we sleep for just a single millisecond, which is more or less the amount we're allowed, and then we serialize our state, which in this case is just the number of remaining milliseconds, and we tell the scheduler: please call this function at some point in the future with these arguments. So at the beginning of the next call you get the new remaining count and start over. If we're actually done, we just return ok. Notice that, if you squint hard enough, this kind of looks like handle_continue: you do your job one piece at a time. The flags argument is where you can dynamically switch your dirty type: if you pass ERL_NIF_DIRTY_JOB_CPU_BOUND here, the NIF will be scheduled on a dirty scheduler the next time it runs. The hard part of yielding NIFs, of course, is that they are basically stateless, so you have to be able to serialize the entire state of execution into an array of terms, because the next time you're called you're just going to look at that and say: okay, I got this far, let's resume from there. And in a real yielding NIF, where you don't have a time-dependent payload that lets you say "I spent 1 millisecond," you also have to call the enif_consume_timeslice function. What this does is let you guesstimate the percentage of a timeslice you've already spent in your yielding NIF. You basically have to try it out, measure how much time your NIF takes, and report "I did 5% of a timeslice"; at some point the scheduler will answer "stop here, please yield," and then you have to yield. Notice that it's still the NIF's responsibility to be truthful about the percentage of the timeslice it spent: the BEAM trusts you to estimate progress correctly. How do you do it? You have to try
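The yielding sleep just described can be sketched like this (illustrative; a mirror of the talk's idea rather than its exact code):

```c
#include <erl_nif.h>
#include <unistd.h>

/* Yielding sleep: do ~1 ms of work, then reschedule ourselves with the
   remaining count, the handle_continue lookalike from the talk. */
static ERL_NIF_TERM yielding_sleep(ErlNifEnv *env, int argc,
                                   const ERL_NIF_TERM argv[])
{
    unsigned long remaining;
    if (!enif_get_ulong(env, argv[0], &remaining))
        return enif_make_badarg(env);

    if (remaining == 0)
        return enif_make_atom(env, "ok");

    usleep(1000);  /* one ~1 ms chunk of "work" */

    /* In a real NIF without a time-based payload you would also report
       progress, e.g.: enif_consume_timeslice(env, 5);  // "~5% spent"  */

    ERL_NIF_TERM next_args[] = {enif_make_ulong(env, remaining - 1)};
    /* flags = 0 stays on a normal scheduler; pass
       ERL_NIF_DIRTY_JOB_CPU_BOUND to hop to a dirty scheduler next time. */
    return enif_schedule_nif(env, "yielding_sleep", 0,
                             yielding_sleep, 1, next_args);
}
```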
things out and measure things. The third technique is threaded NIFs: you run your code in a dedicated thread and then send a message back. Here comes some chunky C; I'll try to focus on what the main parts are doing. Here we retrieve the pid of the current process. We create a reference, because now we're doing message passing and we want to match the request with the reply; the usual thing to do is create a reference and match on it when you receive the message. Then, as mentioned before, we allocate a process independent environment, because the thread needs to keep the reference around to send it back for matching. We allocate some thread data, that is, all the data the thread needs to do its job, and then we create the thread, passing in the data; the thread goes off, and we just return the reference so the caller can use it to match later. In the implementation of the thread, we basically just use the data to do our work; in this case we sleep for the number of milliseconds, and then we can send a message with whatever we want. Here we send a tuple with the reference and :slept, so the caller knows that we slept. Then we have to remember that, of course, we are in C, so we can leak everything: we have to free the environment and free the thread data. And now we can see how the three techniques actually work in the demo. Here we have the first one, the dirty NIF: we start, everything works fine, the first message gets printed, then after 3 seconds the second one. Here we have the yielding version, and I will just count: 1, 2, 3, 4.
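The threaded sleep walked through above can be sketched like this (illustrative; for brevity the thread is never joined, while a production NIF would track and join its threads, for example at unload):

```c
#include <erl_nif.h>
#include <unistd.h>

/* Threaded sleep: spawn a thread, return a ref immediately, and have the
   thread send {ref, :slept} back to the caller when it is done. */
struct thread_data {
    ErlNifPid     caller;
    ErlNifEnv    *msg_env;   /* process independent env owning `ref` */
    ERL_NIF_TERM  ref;
    unsigned long ms;
    ErlNifTid     tid;
};

static void *sleeper(void *arg)
{
    struct thread_data *d = arg;
    usleep(d->ms * 1000);    /* blocking, but on our own thread */
    enif_send(NULL, &d->caller, d->msg_env,
              enif_make_tuple2(d->msg_env, d->ref,
                               enif_make_atom(d->msg_env, "slept")));
    /* We are in C: free everything we allocated. */
    enif_free_env(d->msg_env);
    enif_free(d);
    return NULL;
}

static ERL_NIF_TERM threaded_sleep(ErlNifEnv *env, int argc,
                                   const ERL_NIF_TERM argv[])
{
    struct thread_data *d = enif_alloc(sizeof *d);
    enif_self(env, &d->caller);
    d->msg_env = enif_alloc_env();
    ERL_NIF_TERM ref = enif_make_ref(env);
    d->ref = enif_make_copy(d->msg_env, ref);  /* survives this call */
    if (!enif_get_ulong(env, argv[0], &d->ms) ||
        enif_thread_create("sleeper", &d->tid, sleeper, d, NULL) != 0) {
        enif_free_env(d->msg_env);
        enif_free(d);
        return enif_make_badarg(env);
    }
    return ref;  /* the caller matches this ref in receive */
}
```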
So, if you noticed, this is a problem, because it didn't actually sleep 3 seconds. With yielding NIFs you have to account for the fact that there is always some overhead: you do a little bit of work, then a bunch of bookkeeping to serialize the state, and then you resume the work. So a precise sleep count is not the perfect use case, but it's something you have to consider when using yielding NIFs. Then for the threaded sleep we have a slightly different spawn, because we have to receive the message, but you can always wrap that in an interface that looks synchronous. Here too we print the first message, wait three seconds, and print the second one. So everything is good. Let's go on with resources. Resources are a way to keep long-lived data across NIF calls. They are independently reference counted on both sides: you have a handle on the Elixir side, and a way to increase and decrease the reference count on the NIF side. When you create a resource, the reference count starts from one, and then you can increase and decrease it on the C side; the resource is destroyed when both ref counts are zero. So you can keep something alive on the C side even if it would get garbage collected on the Elixir side, or vice versa. I'll do a quick resource example using a simple shared counter. We'll just have a struct that contains a count, and our API will be new_counter, which creates a new counter, increment_counter, which increments it, and get_counter_value, which returns the current counter value. First of all, we have to initialize the resource type, and we're going to do that in the load callback, which is one of the three callbacks you can execute during the lifecycle of your NIF.
In your ERL_NIF_INIT macro you can pass function pointers to be executed when the NIF is loaded, when it's upgraded, and when it's unloaded. In this case, when it's loaded, we're going to initialize the resource type. This is because resources on the Erlang and Elixir side are just references: if you look at the term, you will just see a reference. But on the C side, each of these references has a specific resource type. To set that up, we create a resource type, which usually goes in a global variable because it needs to be accessible whenever you serialize and deserialize the resource. You declare an ErlNifResourceType* variable, and then you create the specific resource type by calling enif_open_resource_type in the load callback. Note that we are passing a destructor: when a resource is destroyed, that is, when the reference count is zero on both sides, we get to execute some code. This matters because the resource might contain pointers to externally allocated data, and the destructor is where we free them if we don't want to leak. In this case we only allocated the memory for the resource itself, which is already handled by the BEAM for us, so we can have an empty destructor. Now our API is implemented like this. First we have new_counter: we allocate a resource of the specific type we created before, which creates the memory space to put our counter into, and we initialize the count to zero. We call enif_make_resource, which returns a ref that is going to be passed around and back into our NIF, and we call enif_release_resource because in this case we just want the resource to be handled on the Elixir side: we're not keeping it on the C side, we just want it to be garbage collected when the resource term has no references left on the Elixir side.
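The load callback and new_counter described above can be sketched like this (illustrative, mirroring the talk's shared-counter example rather than quoting it):

```c
#include <erl_nif.h>

/* Shared counter resource: the type is opened in the load callback and
   instances are created by new_counter/0. */
struct counter { long count; };

static ErlNifResourceType *COUNTER_TYPE;  /* global: needed on every call */

static void counter_dtor(ErlNifEnv *env, void *obj)
{
    /* Nothing to free: the resource memory itself is managed by the BEAM.
       Externally allocated pointers would be released here. */
}

static int load(ErlNifEnv *env, void **priv_data, ERL_NIF_TERM load_info)
{
    COUNTER_TYPE = enif_open_resource_type(env, NULL, "counter",
                                           counter_dtor,
                                           ERL_NIF_RT_CREATE, NULL);
    return COUNTER_TYPE == NULL;  /* non-zero return fails the load */
}

static ERL_NIF_TERM new_counter(ErlNifEnv *env, int argc,
                                const ERL_NIF_TERM argv[])
{
    struct counter *c = enif_alloc_resource(COUNTER_TYPE, sizeof *c);
    c->count = 0;
    ERL_NIF_TERM ref = enif_make_resource(env, c);
    /* Drop our C-side hold: the term alone now keeps the resource alive,
       so it is destroyed when the Elixir side garbage-collects the ref. */
    enif_release_resource(c);
    return ref;
}
```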
When the reference is passed back into our NIF, we can extract a pointer to the counter struct from argv[0], increase the count, and return ok. Then we have get_counter_value, which does the same, except it extracts the value from the count and returns it. So everything is fine, right? We have a shared counter, we can pass it around. Wrong. Why is this not fine? Let's see. Let's bring the schedulers back to more than one, and that's a slight foreshadowing: you can see where this is going. First, let's create a new counter, increment it five times, and get the counter value. Everything works, it returns five. Wow, seems to be working. Now let's do a mega counter: instead of just incrementing it inline, we spawn an async task that calls increment_counter, and we spawn 100,000 of them. At the end we'd expect to see 100,000 here, right? Whoops, it's not. Why isn't it 100,000? This is not a failing demo, by the way. It's just that this line is not atomic: all our processes are accessing the count, and this kind of increment is not atomic. Even though it looks like a single instruction, what's happening behind the scenes is that the count is read from memory, then increased, then written back to memory, and if 100,000 processes do this together, sometimes we lose an update. The key thing here is that you're not in the cozy Erlang concurrency model anymore, so say hello to race conditions. How do you solve that? The way you solve race conditions in C: you either use atomics, or you use some of the primitives offered by the erl_nif library.
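The lost-update race and the atomics fix can be shown in plain C, with no NIF machinery at all. A minimal standalone sketch (C11 atomics plus pthreads; thread and iteration counts are arbitrary):

```c
#include <pthread.h>
#include <stdatomic.h>

#define NTHREADS 8
#define NITER 100000L

static atomic_long counter;

/* Each thread increments atomically: the read-modify-write happens as one
   indivisible operation, so no updates are lost, unlike a plain
   `counter++` on a shared long. */
static void *worker(void *arg)
{
    for (long i = 0; i < NITER; i++)
        atomic_fetch_add_explicit(&counter, 1, memory_order_relaxed);
    return NULL;
}

/* Spawn the workers, wait for them, and return the final count. */
long run_counter_test(void)
{
    pthread_t t[NTHREADS];
    atomic_store(&counter, 0);
    for (int i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return atomic_load(&counter);
}
```

With a non-atomic `long` and `counter++` the final value usually comes out short of NTHREADS * NITER, which is exactly the mega-counter demo's surprise.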
The erl_nif library offers mutexes, rwlocks, and condition variables that are cross-platform, and they also help with debugging, because they carry names you can look at when stuff goes wrong. Speaking of stuff that goes wrong: what do you do then? Unfortunately, debugging segfaults is not something I can cover in five minutes, so this is the part of the talk where I draw some circles and tell you to draw the rest of the freaking NIF. All these tips will be in the code repo too, so again, don't sweat taking photographs. These are the main tools available. The most important one is that you can compile the emulator in debug mode, which helps a lot because it adds lots of assertions and helpful messages, and gives you a helpful way to debug the core dumps later. Then, for memory problems, you can use AddressSanitizer, or Valgrind if you want to be more hardcore about it. And then there is rr, record and replay, a nice tool from Mozilla that records something similar to a core dump, with the cool twist that you can go back in time: if you crash on a variable that has garbage in it, you can trace where that variable was written, in reverse. Now some useful snippets I found out, sometimes by screaming about this, because there are strange interactions between version managers like mise or asdf and what the Erlang documentation describes, which assumes a source checkout. So, here is how you build the debug emulator.
You just have to pass a flag to kerl as an environment variable; and if you use mise, you also have to force it to compile from source, otherwise it just uses a precompiled binary. That's how you build it. To generate a core dump, you run ulimit -c unlimited, start your IEx shell using the debug emulator as in this example, crash the thing, and then you can debug your core dump. You have to point the ERL_TOP variable at a source checkout, especially if you're using mise or asdf, because they clean up the sources afterwards; so you need a checkout of the exact OTP version you're running, and then you can open the core dump using the cerl script. The reason you want that instead of plain GDB is that it provides some niceties: for example, if you have an ERL_NIF_TERM in your NIF, you can run etp with the name of the variable and, instead of a random memory address, it will print the actual term value as you would see it on the Erlang side. To debug memory problems, you can start with ASan. That's about as much as I'm going to tell you: you set a log dir, tell it to halt on error so that on the first fault it crashes and prints a lot of very interesting stuff, and then you launch the emulator with the asan flavor. And that's nice stuff, because while preparing the talk I managed to segfault GDB itself while trying this out. So, have fun. [applause] Last section: choose your poison. This is a quick overview of the languages you can use to write NIFs that are not C. I know you all love C, but you can also use something else. I won't go too much into the pros and cons, because it's totally a matter of taste. With C++ you can use Fine, and that name is not my judgment on the language, by the way.
Fine is a library written by Dashbit; it also powers Pythonx. It reduces some NIF boilerplate and lets you use C++ instead of C. I'll just show quick examples using the add function from the beginning. This is how the add function looks in Fine: the nice thing is that your arguments are already deserialized, so they're already native types. Then, on the Elixir side, you have to do some mumbo jumbo, the usual stuff to load the NIF and provide a fallback implementation that just returns an error, and that's it. With Rust, you can use Rustler; the talk before mine had lots of interesting stuff about that, so I'll go quickly. You just have this function, this is what it looks like in Rust, you call rustler::init! with your Elixir module, and then you do this on the Elixir side. So, less boilerplate. And with Zig you can use Zigler, and the nice thing about Zigler is that you can write Zig code directly inside your Elixir module and it automatically gets translated into a NIF, with all the stubs created for you, so in the end it works exactly like macros. Rule number one: don't write NIFs. Rule number two: go break rule number one. Thank you. [applause] Wow, this is the talk I wish I had. I think I made all of those mistakes. [laughter] Really great job. It's basically lunchtime, so I think if we want, we can do questions in the hallway. Yeah, sure. Thank you so much, Riccardo. >> Thank you.
Video description
✨ This talk was recorded at Code BEAM Europe in November 2025. If you're curious about our upcoming event, check https://codebebeameurope.com ✨ --- You've written your first NIF that adds two numbers together, hooray! Now where do you go from here? This talk digs deeper into the erl_nif interface, showcasing what it offers you for building high performance NIFs. We will start by learning how to avoid the Cardinal Sin of NIFs, blocking the scheduler. Then we'll move on to handling data more complex than a simple integer. We'll see how to efficiently work with Elixir maps and binaries, and how to safely manage stateful resources that need to be cleaned up by the BEAM's garbage collector. Did you know that you can monitor a process from a NIF? Or that you can send a message to a GenServer using its registered name? We will go over all the different ways a NIF can interact with the BEAM, extending its capabilities well beyond executing a simple chunk of synchronous code. NIFs are also capable of bringing the whole BEAM down, so we'll learn some techniques to debug that pesky segfault when things go wrong. Finally, all the concepts we discuss will be valid regardless of your implementation language. We'll explore the available choices and the libraries that make using each language more ergonomic. --- Let's keep in touch! Follow us on: 💥 Bluesky: / codebeam.bsky.social 💥 Twitter: / codebeamio 💥 LinkedIn: / code-sync