We can't find the internet
Attempting to reconnect
Something went wrong!
Attempting to reconnect
Fred Overflow · 342 views · 18 likes
Analysis Summary
Worth Noting
Positive elements
- This video provides a highly technical and clever demonstration of using Java 9's VarHandle API to create a persistent, thread-safe data structure with minimal performance overhead.
Be Aware
Cautionary elements
- The use of 'revelation framing'—suggesting this is a 'trick' others haven't caught on to—might lead viewers to underestimate the edge cases of custom concurrency primitives.
Influence Dimensions
How are these scored?About this analysis
Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.
This analysis is a tool for your own thinking — what you do with it is up to you.
Related content covering similar topics.
The Genuine lazy Sieve of Eratosthenes (Java, Records, Iterable/Iterator, PriorityQueue, Comparator)
Fred Overflow
Episode 1: Handling Unexpected Errors Gracefully
Exploring Elixir
The Taming of the Deftype Baishampayan Ghose
Zhang Jian
Parallel Map in Erlang
BEAM Channel - Erlang & Elixir
The Data Reader's Guide to the Galaxy Steve Miner
Zhang Jian
Transcript
You know what I love about Java strings? When you accept the string from the outside, you can just copy the reference. You don't have to make a defensive copy. And if you return a string to the outside, you can just return the reference. You don't need to make a defensive copy. On the other hand, if you accept a list from the outside, then just copying the reference could be dangerous because the client could pass a list that he later clears and then our own list is also cleared. So you better make a defensive copy. And if you return a list to the outside world, then the client could also clear your internal list. So you better make a defensive copy or return a read only view. What do strings feel like from the API perspective? So you can make a string car. Car plus pet would be carpet, but the original car is still the same. And if you replace the car in carpet with crumpet, then you get crumpet, but the carpet stays the same, right? So no operations on string will ever mutate an object. And one day I thought to myself, why aren't lists like that? How would they feel if they had the same kind of API? So we can make a car. Car plus pet would be carpet, but the car would be unchanged. And if you replace the car with crumb then you get a crumpet but the carpet stays unchanged. Now why aren't lists like that? You may think well if car and carpet are completely separate lists then plus must be a very expensive operation and this is fine for small lists maybe but what if you have a million elements that this must be unbearably slow. [snorts] So I measured this in my implementation. I made a million calls to add to build a big list and a million calls to plus to build a big list. And the plus version was only 16% slower. So plus does not copy all the elements. It has the exact same um algorithmic time complexity as add on an array list. And even if the overhead was bigger, I don't think that would be a problem in practice because you wouldn't really call plus a million time to build a big list. What would you do instead? You would work with streams of course, right? Okay. So, how is it implemented? The big idea is that the array has additional capacity and the array is shared between seek instances. So, for example, here the first three slots are already in use by uh one seek instance which uses all three of those. But the previous versions which only had a and b or just a use the same string array and as long as there are additional empty slots available we can just reuse the same array across instances. That's the big idea. Okay. Now we can't allow null for real elements because we need to distinguish between used and unused. But I think that's fine. Modern collections don't allow null anyway. And then let's assume there's additional space just to make the example easier to understand. Then what do we do if we want to make a new version with an additional element? We use a weak compare and set to say uh if the array at length is null then we want to replace it with the element. Right? So this is completely thread safe. If the array element isn't null, so for example, the version um that only has a and b for now and wants to insert an x here, but there is no space, then we need to copy over and do a normal array insert. Right? So from here on we have uh diverging histories if you will and the normal sharing takes place. Okay. And I think it's quite surprising. So in our benchmark, we had a million new seek uh object creations and we had a million weak comparison sets and it was only 16% slower. So maybe you could optimize this for the case where you're not interested in thread safety, but uh personally I'm not interested in that. I want this type to be as safe as possible. So why aren't lists like that? Um, so this weak compare and set on an array slot wasn't always available in Java. It's only been available since Java 9 with the method handles API. And we've only had that since 2017. So it's only been 8 years. And I don't know if anybody has thought of this trick before, probably, but it it if so, it hasn't caught on yet. Now, if you're still not convinced that the Java lists are too complicated, let's look let's look at them. [laughter] So, you can have a normal array list. You can add a car to it. The result is a boolean instead of the list, but as a side effect, the car is inserted. You can replace it with crumb. The result is the old value. As a side effect, the crumb was um used as a replacement. And you can add additional elements. The result is a boolean. And as a side effect, it was inserted. So this is the let's say default list in Java. And I think everybody knows this. If you don't want to add a car into an empty list, you can start with a list that already has the car inside. You can replace it with crumb. And if you try to add a pet, you get an unsupported operation exception. So not every list supports the add method but every list has the add method. Right? So that's also quite weird property of the list interface. So what is the type of B? The [snorts] type of B is not an array list as you would expect but it's an arrays dollar array list. [laughter] Yeah. Which is a different type. Okay. Then we also have since Java 9 the list.off and if we try to replace an element we also get an unsupported operation exception. Right? So is this a problem in practice? What about this guy? Have you ever done something like this? You have a stream and and collect it into a list and then later you say hm I want to use the set method. I want to use the add method. That seems to work. So what's the class of this? Oh yeah, it's an array list. We know array list. So set and add surely are legal. But if you look into the documentation of collectors to list, you see that there are no guarantees on the type mut mutability and so on of the list returned. So it it happens to work today, but Oracle could change its minds and say, "No, we want to return another type of list that doesn't support these operations." And probably half of Java programs in production will break at that point in time. Okay. Oh yeah, there's also link lists. [laughter and gasps] You only see them in universities usually. Um, oh yeah, and you can also wrap mutable lists in immutable rappers. If you mutate the wrapped collection, then you will see that through the wrapper, but uh adding through the rapper doesn't doesn't work, right? So you have to be really careful. Do you want to hold on to this kind of list for for longer or not? Do you want to observe these changes? Okay, then there's also vector. Why would you want to use a vector? because you can mutate vectors from multiple threads and then we can loop over our collection and we observe that car and pet were inserted. But of course this is not a guarantee. These threads may not have been finished by now and if they are uh adding while we are looping over the collection we will get a concurrent modification exception. This usually doesn't h happen in testing but only in production under heavy loads. If your company deprecated vector or forbids its use, you can also use an array list and wrap it in a synchronized list. Yet another kind of list. Um, but the rest of the example is exactly the same. You may get a concurrent modification exception. If you want to guarantee no exception in the loop, you can use yet another kind of list, the copy on right array list, and then uh yeah, the loop is guaranteed to never throw. But the disadvantage is that AD now has linear time complexity instead of constant time mats constant time. That's what the name copy on right implies, right? Okay. So, how would we use our new seek type with multiple threads? We because we don't have an add method, we can't mutate a seek. Uh we need a different type for that. So, we take our immutable seek and wrap it in an atomic reference. The atomic reference initially references a completely empty sequence and then later the threads can say hey dear atomic reference I want you to reference a new value take the old value and compute a new value and if you notice a race condition please do it again and if you notice another race condition please do it again. So there's a literal for loop inside update and get. So one call to update and get may invoke the lambda multiple times. So it it really only works with side effect free functions. You can't use this approach with mutable collections. And if you don't believe me, here is the part from the documentation that makes this crystal clear. Okay. Um now why am I so excited about this? When I teach Java, teaching strings is usually quite simple. And I would want teaching lists to be as simple as that. Right? If you understand strings, you basically understand lists. I really wouldn't want to teach my students, well, hm, there's like a handful of lists that you will see in practice, and sometimes you can add to the lists, and sometimes you can add to the sets, and sometimes not, and sometimes it seems to work, but there are no guarantees, and once you enter threads, it gets really hairy. Um, and then you have all these good practices of when you need to make defensive copies or immutable rappers. And of course, not everybody does this all the time. Um, it would be cool if I didn't have to teach that. At least not from day one. That's why I'm so excited. Okay, ran over. Thread overflow
Video description
00:00 Intro 01:46 Benchmark 02:22 Implementation 05:04 Java Lists 07:51 Concurrent Lists 10:06 Outro VarHandle MethodHandles weakCompareAndSet ConcurrentModificationException AtomicReference