We can't find the internet
Attempting to reconnect
Something went wrong!
Attempting to reconnect
justforfunc: Programming in Go · 12.5K views · 277 likes
Analysis Summary
Worth Noting
Positive elements
- This video provides a practical, hands-on demonstration of abstract syntax trees (AST) and visitor patterns in Go, which are often intimidating concepts for intermediate developers.
Influence Dimensions
How are these scored?About this analysis
Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.
This analysis is a tool for your own thinking — what you do with it is up to you.
Transcript
hi I'm Franciscan boy and this is just a funk we're going to continue today with what we're doing before talking about ASD and parsing programs and stuff like that but this time rather than answering the question what is the most common identifiers on the library we're going to answer a slightly different question which is what is the most common name for local variables and package variables also known as Global's in the Sunda library and in order to do that rather than using the scanner we'll need to learn how to use the parser that's lots of fun let's get started okay so the important thing to know here is that in the previous episode we were working with a sequence of bytes which is a program the source code and work is in the scanner to convert it to a sequence of tokens now what we're going to do is we're going to convert that series of tokens into a tree that tree that is constructed by following some rules those rules are actually part of the go problem specification so you can find them on the go from specification page which I recommend tweed thing I recommend that many times to read that document so you should probably go over with it it is pretty easy to read also very deep and useful for anyone no matter your level so if you've never read it go read it if you've read before it's time to build again probably anyway so in there you're gonna see things like well a variable declaration is something that starts with a keyword var token var then there's a series of names then there's an equal maybe some types and then some values right and that is a specific row and you can find it in the standard library like write this now once you have those rules those rules form a grammar which also we call a syntax and once we generate a tree from those rules then we have an abstract syntax tree that we call an AST and that's what we're gonna be using we're actually going to analyze a program not from token by token but given a whole tree of the program we're gonna be able to find the pieces that we care about let's get started writing some code so I'm gonna create a main Delco file in my parser directory and this program it's gonna be doing something quite simple so rather than using the scanner as with it before we're gonna be using the posture so I'm going to import that import go parser and parser dot has a bunch of things one of them is parse file and that parses a whole file so we need to pass a file set token which is as the previous episode just something to basically hold information about all the files that we've parsed and you create one by doing token dog new file set so that is the file Sabrina pass then we're gonna pass the filename which in this case rather than passing the filename I'm going to pass directly the code directly into this next one so here I'm gonna pass save are a equals three and then we have the mode and the mode for now we're gonna do parse all errors and basically that says please pass all of the things we'll see other modes in a little bit okay now it fits in screen so when we have parse file that returns a file in an error if there is an ill let's block fatal otherwise let's print F and put these actually that's cuz I'm not sure printing the tree by itself is gonna be very useful it's just a bunch of pointers and stuff like that but let's keep it like this for now let's keep it simple okay so go remainder go and this says expected package found var and this makes sense right because we said parse file and any go program starts with package other than for some comments before so we need to write package main here and then gonna is a semicolon you could also use a line break okay if you want it to parse something that is an expression like three plus five stuff like that you could use the parser dot parse expression for this but what we're trying to pass is a var thing that var is not an expression is a statement of declaration so it will not work okay so let's try it again cool so nine works and we get this thing then you know not incredibly useful but we can see that indeed there's main and then VAR a let's try to print something a little bit more valuable with the pound V that gives all of this we can see that there's a pointer to file and I had the name which is a pointer to an identifier that has some declarations and some scope and some important stuff like that not very useful like this either but there's a cool program a cool package it's called pew goes P or something we'll face happy go Q so this part this package here allows you to spew dot dump of the file and this should give something slightly easier to read there you go so now it gives us an idea of what these tree looks like we start with a pointer to file has comments there's no comments has the package which is actually talking position it's in position one then we have the name of the package which is name and then we have a bunch of declarations well actually just one lens this one inside the clergy shun doesn't have any comments and that declaration has a bunch of specifications in this case only one and that is the declaration the identifier a and the value three which is an integer cool so that is a tree that is the tree that we are talking about right so now that we have this how do we go inside how do we extract information from that trip we could go and write our own program but there's actually a better way of doing this which is using what's called an ASD an ASD visitor so we're gonna do is we're gonna be using st walk and a st walk is a function that gets two for two parameters and visitor and a note now when you pass a visitor the visitor has it's an interface and the note is also an interface any node in the ast is a node so for instance f this F which is a pointer to file is a node Declaration is also now all of those things are kind of not a visitor though is something interesting what is a visitor well a visitor is an interface with one single method visit that given a node returns a visitor so we're gonna need to implement one of those I'm gonna can I call it visitor V and just do this for now so let's define visitor for now let's make it an empty and empty struct we don't need any contacts in here because what I'm going to do is every single time we call visit on node that returns a st visitor it's gonna return itself at the end but before that it's going to print the tag of the node and and okay so now if we do this not happy that visitor oh it's up not a pointer it's a cousin interface you don't pass pointer to interface that's kind of weird okay so that that can pass so now we run this where are we gonna get is boom we're gonna have a file and then fire general declaration value spec and it's fire basic literal and this kind of makes sense right this looks like the notes we have in the tree now the problem is that we actually lost information of how deep they are which is kind of a problem and also there's a bunch of nails why are there so many nails well the solution to this is surprisingly to read the documentation so if you go to the visitor type here you will see that a visitor sorry not not the visitor documentation but the walk documentation you will see that so traverses an ast in death first order it starts by calling visit node node must not be nail if the visitor W returned by the coal is not meal walk is info recursively with visitor W for each of the non nail children of note so basically if I'm at file and the visitor that I returned will be used to call it on all the declarations and all the notes that are right - like children directly not descendants or grandchildren whatever just children if it's nil then we stop at the end once we've called all of the children of the single node we call nail and that's what we're seeing here that's all of the nails right so what we can do is we can do a little game here we're gonna make our visitor be an integer and that is the depth of the tree so then we could add strings that repeat it of V and we're gonna return V plus 1 now with this we're gonna be printing something with in Dec within the intention indentation so if we run it but this actually looks like a tree which is cool now we should just ignore the nails because we don't really care about them so if and is nil then just return nil because also we know it's the last one so nothing else gonna be cope with it so it's cool so there you go now we see that for our code package main for a equals 3 you get a file and a nifty fire which is the int fire for the package which is main and then a list of declarations with one of the declarations being some identifiers a with some basic literal 3 and that is our ast cool now knowing things how are we going to do it so we can extract information that we want notably the all the variable declarations which could be things like this but also could be func main a equals let's call it B equals J this is also a variable declaration right and we want to also keep those in mind so you run it you will see that there's inside of the function declaration that is the function main there's a function type which is nothing block statement has an assignment statement and an assignment statement is also variable declarations so we care about assignment statements and we care about value specs those are the two notes that we're gonna try to understand so let's change the code that we wrote last time for the parser in order to adapt it to our code today so I'm going to get the scanner code and just pass it back pass it here okay so things I need to change we're not gonna have a counter instead we're gonna have a visitor Barbie visitor that will define later on we don't need to read the file really we don't need to scan it we just going to pass it so to do that we're gonna do a parser dot parse file and we're gonna pass our files file set the name of the file itself then we don't need to pass the content and parser that all errors and that would be if the error is known you know since we're gonna be passing many files it's better to just log it and continue because we just do log fatal it will never work because understand the library there's peace of God that do not compile because there are tests so we should not just panic there so lock print could not parse this file because of these reasons so far is arc and clears this and continue otherwise we're going to do V dot now ast dot walk visitor and F see and then this part where we print the top five elements we're gonna move it to a function func screen top five giving a map does all of these things and we want to use all of them so if you don't know what this code does go watch the episode 24 that explains how like what this actually does basically it just prints the five more common elements in this map taking into account that the key is the name of the identifier and the int is how many times we saw it so we need to declare a visitor type visitor is a struct and what we need to do is we need to have a map for locals so map string of int map string int and also Global's map string int and when we call visitor visit st the node we're gonna return a steep that visitor and turn v4 now okay so now what we want to do is we want to do something only when we see a type of node that we care about so to do that the easiest to start with actually if the node is near just return nail otherwise we want to do a switch on the type so switch let's call it D which is declaration that's why I want to call it D so n dot type and if the case is so let's start with let's start with : equals because we know that they're always local that's easy right you're gonna do a colon equals three outside of a function therefore all of the variables declare like that they're locals always so let's do that so it's going to be an assignment statement so case ASD assignment stamen just friend let's print print not the type because we have the type sprint the dot let's see the dot left hand side that is the names on the left hand side which are the things that were declaring also we should be able to say that if the token is not token dad come column equals is there a colon equals thing let's see define okay so if the token is not define we should continue we should return V because if the token is just var it's just a equals three we know we're not declaring anything so we can ignore it so okay so let's run the program we need to pass a parameter let's build it and use parser it's usually done itself for now and see what we get mein dekha okay so we see that we have FF air pears IND and that looks pretty good because we have so FS is here so that's a Collin equals the this one here doesn't appear which is interesting we should fix that to the range so let's see let's just put it here for now case ast the range statement and if we have that we're gonna print the key and value so did dad keep indeed add value okay so now we can see the art and s and n so these looks better okay so I think this might be enough so we have switch appears already here the range stem and now appears we have also the if statements but pretty sure that they also appear so let's see do I have any done no so let's try it here so if I do if L and now it's true we see L here so if statements appear for students appear switch statements appear so we are good okay so let's start with our assignment so LHS LHS is a slice so we're going to do is for every kind of name in range the LHS name is let's print it's time for now so name so we can know what we're working with we have identifiers and that's what it should be if it's anything else than in fire we it's not a declaration so we're good so we're gonna do if name as an ast identifier is okay then dent the name cool so if the name is underscore we're gonna ignore it continue otherwise we're going to add it to our visitor so visit the locals of Edenton name plus plus and that panics because we didn't initialize this correctly so let's initialize this correctly we need to give a map from string to int and another one okay so at the end we're going to frame the top five for vehicles so we saw VF as the okay so that looks pretty good I have a concern though I'm going to create one folder test and inside test let go so if the a equals three and a P equals three what we should see is that a is declared only once because here we should not come in because it's already been defined before in order to make it actually compile it's a B so these ops three in sure okay so this one is a declaration of a this one is not a declaration of a it's just a statement it's a it's an assignment so how do we check the difference between those two right now if we run global and parser overt test test ago we will see that a is declared twice the thing is that we in order to know whether this is the clergy or not what we're going to be using is the object so there's a this Bible here that points to an object which is what so one is this object referring to what is the distant afire referring to in one of the things it has is has a position where was this declared so we can check is if the position is the same one as the position of these declaration then it means that these are declaration right so if the only play the place where this in a fire has been declared is this place then we know that is the declaration so if we run it now we see it only once just in case I'm going to say that object should not be near home because if the object is now then it means that it's it something that either has not been occurred and it's not declare either here or it's been declared in a different file and then it's not a local declaration either for the range statement we should do something similar so if the is a still identifier identify the name is an underscore will continue otherwise we actually going to do exactly the same so let me just do it counted at fire count local identifier of name five count local affinity fire and we're gonna get a node I see the node and we're gonna do our solar that is what we're going to do we also need to pass our visitor and continue it's gonna be returned okay so we're gonna do is let's change this a little bit like this a little better so we are going to count against fire of name via name and similarly here we're gonna do count locality fire of the key and count local identifier of the DAAD value so V and D cool so let's modify the test program to try it so for a equals range int one two three print a we should be able to see that a has been declared right there there you go so a is nil are we still printing this we don't need to okay so a appears once which is cool and we have AP should have AP AP and if for any reason we have this we do not see any of them a B because they're actually not declared they're declared on top and we still not handling this so this looks okay cool so we're able to count the assignment statements we're able to count arrange statements we're pretty good now what we need to do next we should count the other one the VAR statement and that is a little bit more complicated because we could have something like VAR a equals 3 or we could have VAR p equals 2 right and they look exactly the same so in the in the node that we're going to receive from the tree so in another we'll see inside of the tree we will not see a difference between these two between maybe we need to actually keep track of what are the declarations that have been done at the package level so we're going to do is I'm going to have package declaration which is a map of St ten declarations which is the type of any declaration so when you have var or Const or type those keywords there that will generate a ten declaration and to pull in and what we're going to do next is we're gonna need to initialize it correctly so I'm going to move this to a new visitor and let's do a new visitor ast the file return so what I'm going to do is declaration is gonna be a map of that type and we're going to go over all the declarations in the file so for range of declarations a declaration is an ast that value spec that is when we return in either a variable or a constant we're just going to say so not that sorry V comma okay is that and okay we're going to keep track of the X of V equals true and then we're going to be able to use that list when we are receiving not happy value spec does not declare it's not value spec it is value let's see what are the Eckler asians agenda there you go and that is yes why is this not happy Oh equals true there you go and this returns a visitor visitor okay so now we're gonna need to move this a little bit because the visitor is going to be created right after the file has been parsed because we need that parser and then we're gonna need to keep track of locals in Global's which are going to be both map of string to int well okay and for key value range these are the locals we're gonna need to add to increment the locals so key V and range over Global's enter the same Global's key of K is plus V and then we're gonna print locals mass common local viable names and and I'm lost there you go n mass common global volumes sense okay so if we do a equals three we should see okay so we have most common local variable name so far is one it's a the previous only ones that is okay that it that looks pretty good okay so what we need to do next is to add one more case which is case st gender and the first thing we need to check is if the token is not far then we should ignore it because it might be a constant it may be a type we do not care about those in this case otherwise we are going to go over the what the specs so about four spec in brain specs its back ease of we don't know what let's simply print it for now print the type so we can continue easily I do this all the time when I'm not really sure was the concrete type that I'm dealing with I don't know if it's a good practice or if there's better ways of doing it but I think it's pretty useful okay so we have a value spec which is as we expected actually so a value okay is expect this okay what we going to do is value dot so we have names and we have values and the names are going to be all of the identifiers that are either logo local or global we still don't know so let's go range over this range over the names and let's print them from now println name of name so we should be able to see a and B which are defined a and B so far a VAR b so the last thing I want to do is I want to verify that this declaration is not a global declaration or it is actually so the way we're gonna do this is if the dot package declarations of spec then the dot Global's of named app name plus plus else did the locals name that name plus plus also we should ignore if named that name equals underscore just continue we do not care about that underscore [Music] canidius types mega step in map index oh that's actually d okay so now we're seeing p and a and a and so let's go real quick about the whole thing or a visitor is checking if there's an assignment and it checks those assignments if there's a colony equals then it checks the the name and it checks if whether it's a local or not it's a if it's a declaration or not same for range statements and then for general declarations we check the it's a var something and we check whether its underscore and we also check for global so whether it's a global thing or not let's make sure that this is actually working I'm going to remove a from here we should see that a B appears a local variable and a appears as a global variable that's exactly what we want cool so I'd say that this code is good now so let's finish by actually running it so I'm going to go build parser and let's do again over all of the go files in the standard library some of them will not work which is fine because there's tests and now we can see that the most common local variable name is error not shocking also X and I if you remember from the previous episode the most common identifier was V but that one doesn't have appear anymore which makes me think that that V either was a type or a concern or something like that or probably an identifier that was defined not that often but was used very very often that is my intuition about the problem about the about the most common global variable names we have error signals failed test and debug and something interesting is how people say well in go we always use very short variable names well that is true for local variable names but for common double vile names those names are always forwards which I'd say is the best practice if you have a local variable then yeah sure go for I but if you're going for something bigger well go for tests or errors or signals are actual words that make sense so this is the end of the episode I hope you enjoyed it let me know if you want me to continue with this series I think it's a pretty interesting series on program analysis there's more to see do you want to talk about types we did not do anything related to type in here but there's way more we can do let me know if you want me to cover all the topics you can always send ideas to form the chat so Frank calm you can also get in touch with me on twitter at ask front desk and finally i just wanted to have a quick note thanks so much we just passed 10,000 subscribers in the channel if you're not subscribed yet why don't you click Subscribe let your friends know also huge thanks to the patrons that are supporting me on patreon calm there is I think 45 or you right now which is amazing it's blowing my mind and it's allowing me to have mics that float in space so thank you so much and to all of you as always thanks for watching and see you in two weeks [Music]
Video description
In the second episode of the program analysis series we use go/parser to find out what's the most common local variable name and the most common package variable name. Get ready for visitor patterns, ASTs, and more! blog post: https://medium.com/justforfunc/understanding-go-programs-with-go-parser-c4e88a6edb87