Analysis Summary
Surrogate Validation
This technique was detected by AI but doesn't yet map to our curated glossary. We're tracking its usage patterns.
Worth Noting
Positive elements
- This video provides a rare, detailed look at how a high school student can successfully navigate the peer-review process at a major AI conference like NeurIPS.
Be Aware
Cautionary elements
- The content uses a 'surrogate' (the student) to validate the host's commercial and educational offerings, making the promotion feel like a neutral success story.
Influence Dimensions
Knowing about these techniques makes them visible, not powerless. The ones that work best on you are the ones that match beliefs you already hold.
This analysis is a tool for your own thinking — what you do with it is up to you.
Transcript
hi I'm Jeremy Howard from Answer.AI and I'm joined here by Sarah Sarah is Answer.AI's first ever fellow um first uh founding member of our fellowship program and today we're going to hear a bit about uh Sarah's background and experiences I think you'll find her a really inspiring and uh amazing human being thanks for joining us Sarah thank you for having me so let me um just rewind a little bit for those who haven't had the pleasure like I have of getting to know you and who you are Sarah um uh is a person who has a remarkable achievement of getting published in NeurIPS the world's premier academic AI conference whilst still at high school um uh and I met Sarah at NeurIPS uh the paper she published is extremely high quality work and extremely interesting um and I was thrilled to discover that her background included learning from my courses the fast.ai courses and um we stayed in touch everything I saw Sarah do was incredibly impressive and so I wanted more than anything else to find an opportunity to work with her so we offered her this role um as an Answer.AI fellow so Sarah maybe um to start with could you summarize for a like somewhat lay person audience like the research that you were presenting at NeurIPS what was the purpose of of the work of the research you were doing and what did you find sure so um I started this project I think towards the beginning of 2023 at that time ChatGPT was still like relatively I think yeah relatively young and um we were interested in doing this thing with multi-step reasoning so at the time like these large language models really struggled with producing um coherent and correct mathematical reasoning and logic and so we were interested in finding ways to improve that so um one of the papers that we kind of based our approach off of was from OpenAI um they used these reward models as verifiers uh to kind of verify whether these steps in a multi-step reasoning process would be uh correct or incorrect um and what was interesting
about their approach was that instead of having um one reward model that would produce a single reward kind of telling you whether uh the solution was correct or incorrect or somewhat of a mix between the two um this would basically grade each step so it would tell you um where exactly uh the response went wrong um and yeah it would give you kind of a more uh specific kind of feedback in terms of that and and let's talk about let's talk about OpenAI for a moment then before coming back to how you built on this because this is very much in the news now so OpenAI recently released a pair of new models called o1-preview and o1-mini which um are dramatically better at reasoning than previous models and they're they seem to be taking advantage of a training system that uses this kind of approach they explicitly OpenAI have explained how they've been explicitly taught how to or being given feedback on their reasoning steps and have learned to become better reasoners as a result so kind of it's interesting that it seems this you you know you you picked a problem which I think not not coincidentally has turned out to be really important in practice right and so kind of more on that the actual origins of our project weren't on mathematical reasoning originally um we had been looking at sort of bias in models in terms of the more like um kind of like ethical uh implications and we realized that a lot of times like um statements were not logically uh connected with each other and so this kind of led us down the route of logical reasoning but yeah that was kind of a tangent but um back to the how did you build on top of then that that OpenAI work did you take it in a different direction or you kind of took it a little further in the same direction or what well I mean um the the PRM paper yes so basically what we did was uh they basically only trained the verifiers um they showed that the uh more process reasoning oriented ones were um better than the objective sort of holistic
ones and so we decided to take it one step further and actually use it in an RLHF pipeline to update a sort of like uh completion model and so I know this is second nature to you so let's just rewind a bit so RLHF is reinforcement learning from human feedback so this is the third step in the process that uh OpenAI used to build stuff like the ChatGPT model where they get human beings to give feedback um about uh different possible answers to a question and it helps the model learn better what what human beings are looking for right so typically a um human is given two model responses um the human rates it um or or picks the better one out of the two and then based on that preference data like a reward model was trained um but for us we were there was a data set released by OpenAI that had that like process um feedback where individual steps were um rated incorrect or correct by human graders and we trained a uh reward model based off of that and so in terms of like the actual RL pipeline um there were a few key changes that we had to make just because the setup of our uh I guess process was a little different um I can talk on those a little bit more but I don't know if you have any more like conceptual um questions or any clarifications oh that's okay so I just wanted to kind of I think it helps to start you know with like where what what you've been working on recently so basically it's taking a really kind of classic branch of research that turned out to be of great practical import which is moving from just like is this a good answer or not to is this a good series of steps to get to an answer or not and then building on that to then say okay with that we can then hopefully come up with a model which can actually create the correct steps and as a result doesn't have to jump straight to the answer straight away and um yeah until your um model had encouraging results I mean you weren't able to train it on the large amount of compute you would have liked since
uh high school researchers don't have access to OpenAI computers um but it looked good enough at least to get a get a NeurIPS uh paper um so actually I wanted to kind of then step back a bit to say like all right I've through my work with fast.ai uh I would say like every year there's maybe a couple of high school students who I come across who do fast.ai and and you know become an extremely competent practitioner um and so I've kind of got to know a few folks like you and generally the experience is like a bit tricky because there aren't other people at school who have the same interests and capabilities either teachers or students um so you know how how did this work how did this work for you how did how did you get into artificial intelligence and how did you then follow that interest even although I assume you did not have a group of peers who were following that interest with you yeah you're definitely right about the peers thing um I think I started towards the end of middle school beginning of high school what kind of age is that for I don't know what middle school is in America oh okay yeah I think I was like 13 14 years old um and so I had taken like an algebra 1 course at like the highest math level but um my brother was taking the course because he's much older than me he's you know interested in the sort of thing and so I was like hey let me you know watch and let me follow along and I think like the one thing that kind of carried me through um the course if you will is was actually like the kind of ease of like I don't know just watching the lectures playing around with the notebooks um it was something if you haven't done it I'll just explain there that like the fast.ai course is a bit unusual in that even though we do get to the point of like reimplementing recent research papers we do it in this top down way where you start out like building stuff basically so it's it's a uh and then you kind of only gradually get into the complexity of the
underlying implementation as as needed so I guess you're saying that was like helpful for you as a teenager in working through it yes very um I think I liked starting out with like the higher level ideas just because I got to see how they kind of worked I feel like if you probably started with I don't know like back propagation or like any of those like you know fancy calculus things calculus H but um yeah it was just nice being able to start out with like a bigger picture view of things um and that was definitely interesting I had did you and your brother help each other were you kind of co-studying um we kind of co-studied in like the very beginning I think we watched like a lecture or two together um but as time progressed like I realized that I didn't really like need him there next to me uh things were more or less understandable I think there was like a forum too if I had any like questions I could like go issues you used the forum I was mostly like like a like a watcher yeah like a lurker kind of if anyone had similar questions that would be super helpful um wasn't too much of a poster myself but it's very it was very cool being able to see like a bunch of people kind of following along and doing their own thing like even if like my close friends weren't doing it the only person I knew was doing it was my brother who's like much older than me it was nice like to think about having like that sort of like I don't know there's just like a community of people online that are just really curious and driven I remember a um teenage girl from Bangladesh emailing me to say like hey I've just finished the fast.ai course this seems really important and really good but like all my friends think it's weird and like I don't know anybody else that like really uses computers much so like is it okay am I doing something wrong I say no you're better than okay it's amazing you know and she went on to get a fellowship at Google and got flown out to Silicon Valley and but you know it
can be now it can be a bit weird until and actually I remember um you telling me that going to NeurIPS was pretty important for you because suddenly you were surrounded by real life versions of these people and realized like it's not right that must have been amazing yeah it was really cool it was like one thing to like I don't know kind of just like see the presentations and talks and whatever but like actually walking through like the poster halls and getting to like ask people about their research like wow these people like are also interested in the same things and they like spend so much time doing like or like like asking the same questions that I'm interested in things like that this is a conference with you know 10,000 AI researchers or something or coming into New Orleans and everywhere you go around the conference building all the pubs and restaurants are full of AI researchers yeah it's pretty insane I felt a bit the same way when I first went to San Francisco you know coming from Australia where my interests were I didn't really know anybody else who had them and it was nice to suddenly find myself in a town with lots of other people who thought that what I was doing was interesting and they cared like it it matters you know but you must have had a lot of tenacity cuz you were going for what four or five years before you got to that point I mean that's that must have been a huge amount of work for you yeah I think like just like it was kind of my like side hustle like after school I'd have you know extra free time this would be sort of my thing um I think part of it was also like AI was kind of getting you know hot in the news um is responsible for a ton of things really cool um papers really cool breakthrough algorithms and it felt like I was like in on something you know because um more or less like I could understand like what was going on I think like there was like this headline I think about like AlphaFold and I was like wait like I know um
kind of more or less what's going on there and so that kind of helped but that was the protein folding model from Google right yes um and it had like one it had solved the protein folding um problem um as like the headlines kind of reported and so that kind of like helped me um kind of stay interested um stay driven and I guess it kind of just like blossomed once I like hit high school and I was um in this program called MIT PRIMES that typically um pairs uh students with like graduate students or professors high school students with graduate students and professors um to do this sort of like research project and so for me I was paired with with um Vlad Lialin who was a PhD student at UMass Lowell and now he's graduated with his PhD and he became your co-author on your NeurIPS paper yes he was my mentor and co-author and basically taught me everything I need to know about AI research um uh but through that program I was able to kind of take on like a like a new perspective because this entire time I'd been kind of like a student I'd kind of like learned about these things kind of played with them kind of analyzed them from like the top down picked apart their inner workings more or less um but now I was kind of presented with the question of like so what like what's next right and so can I just rewind for a bit like about the side hustle because like as a homeschooling dad this kind of interests me because um I think it might be somewhat more aligned with how I think about you know school education you know rather than spending more time doing your math homework you are spending time doing something that's not in the curriculum but if I'm thinking about it I'm thinking okay you're you're doing fast.ai lesson four you have to use the chain rule in calculus which you would not have done in algebra 1 so you must have been like I don't know what were you doing like going to Khan Academy and stuff like that to kind of learn this stuff and like I imagine that then by the time you
covered it in high school it would have been reasonably straightforward for you because you've been applying it for years at that point right I realized like I'd taken multivariable at some point in high school and I was like wait like these concepts are very very familiar to me but I think like you're right in the sense that like for me a lot of things were nonlinear I think like in American schooling especially like there's a very you know straightforward progression from algebra one to you know geometry to whatever to eventually calculus yeah and I think like because I was just so interested in like AI um machine learning and things like that I kind of just like took the initiative and learned the things that I needed to know um and so when it actually came time for math class like a lot of things felt out of order just because they weren't yeah and you would have been better at that math because you you knew what was important you knew what it was for like as you know my 8-year-old has recently started learning derivatives um and it's definitely not in her curriculum but like we don't follow the curriculum because I figure like yeah if you do fun and interesting stuff then it all comes around eventually and you know yeah I'm just trying to think like what about like uh I mean coding even more so right like in the US curriculum I don't think there's much coding at high school generally but in the fast.ai course you would have had to have become reasonably proficient at Python to succeed there so I think I was lucky enough to have like a few Python classes at my middle school um but then again very different ways of like thinking about programming I think a lot of like introductory programming classes um I don't know they're very like game centered I feel like a lot of the like intro classes you just like solve a puzzle and that's like the entire course um but I think like what's helpful about I don't know fast.ai and maybe just like Python in general is that
pretty readable um I think a lot of the notebooks for fast.ai they were on Colab so I didn't have to worry about like a terminal or like VS Code and things that were that you know more click a link and start typing button yeah exactly um so that was nice and then also there were like a ton of resources online at that point even now like you know there's like ChatGPT there's AI Magic like you can literally just you know it it's the the barrier to access is definitely um much lower nowadays but um it's just something that I had to learn on the fly um and thankfully there were enough resources to do so well sorry people watching this but AI Magic's an internal tool at Answer.AI so you don't get to use it maybe I shouldn't have said that that's fine no the fact it exists is a known thing I talked about it on a podcast ages ago so glad I haven't spilled the now you've just increased the mystery um okay so let's fast forward a little bit uh so you actually started working part-time at Answer.AI before you even finished high school um and then you know we had a bit of a conversation about what next for you and you felt like um MIT was the right place for you to be so you've been there now for a few months I guess and I think you're living there at MIT right I'd love to hear like what's your experience been so far because like again like it I'm imagining first year at MIT there's still not going to be loads of people either students or teachers you're dealing with who know enough about AI to be published at NeurIPS like have you found like yeah tell me about the experience in general and also whether you've you know kind of how you're you know whether you're mainly continuing to kind of work with Vlad and of course we'll talk shortly about your Answer.AI work um and how that's what you're learning and experiencing and stuff at MIT yeah so I think when I first started talking to you about Answer.AI kind of working there um doing a fellowship I really seriously considered taking a
gap year um so that I could you know pursue my research projects a little more seriously have a little more time on my hands um but ultimately I decided against it just because I feel like MIT is such like I hate to say this it sounds like so basic but it's such like a great place to be I think you're definitely right like again none of my peers or like not none but like the majority of my peers um aren't really interested in AI um but they're amazing people they um I don't know there's just a very broad variety in like what they're interested in and they're all super super passionate about it um and I think like if I think about my future I don't know 5 10 years down the line I'm not exactly sure where I'll be um maybe it'll be doing AI research maybe it'll be entirely something else and I think having that exposure to people that are interested in other things people that are the top of their field and whatever that might be is it's very exciting um and so yeah I guess like I guess that's it yeah and so at the same time so you kind of you know got this multi-prong thing going on where you're working at Answer.AI you're also doing MIT I guess where your new side hustle used to be fast.ai um you've been working with Austin Huang who is one of the absolute top AI practitioners in the world he was a project leader at at Google you know building the retrieval stuff for Google's deep learning models it became Gemini he's the creator of gemma.cpp
um yeah what's what's it been like you know getting involved with working with folks like Austin you know what's what's been surprising about it or you know what what kind of what's the experience been like there yeah so I think at first I was definitely a little intimidated um the laundry list of cool things um Austin has done is just like insane um but I realized like over time like Austin and like the rest of the crew at Answer.AI they're very down to earth very happy to like explain concepts very happy to answer questions and I think that's like one of the things I've enjoyed the most about working with Austin um so for some context uh we put together WebGPU puzzles um and so um through that I had to kind of like let's just take a look at that then shall we yeah sure um so here's GPU cp. AI okay okay so you and Austin built this together yes and let's talk a bit about what this is so this is like let's go through a few here um these are some pretty hardcore things basically what you're doing here things like a 1D convolution a prefix sum you're asking people to write code fill in something to to what's you know what is a like a fairly comp uh thing written in hardcore low-level GPU code sorry um kind of yeah hardcore low-level GPU code which uh if it's taught at university at all it would be like probably in like a master's program or something like that it's this is like extremely extremely extremely advanced and you're also doing it in like a brand new framework that Austin invented so it kind of reminds me a bit of Ada Lovelace in some ways like you know um she was the first computer programmer uh and she was programming a computer that had just been invented um I mean how do you yeah how I mean are is that your background you've got years of hardcore CUDA GPU low-level background programming like how did you how did you implement and contribute to this project no I think my entire GPU programming background was probably like the three hours I spent solving
Sasha Rush's GPU programming puzzles um and that was really fun for me but in terms of just like getting this putting this together I think like learning on the fly again was like a huge thing for me and also just like knowing that the framework wasn't complete and so if I had like any questions that Austin would be more than happy to answer that nice to have the guy that wrote the framework there to ask questions about exactly yeah but another thing that I kind of just like reminded myself of was that um so for a little context Sasha Rush is I believe a professor at Cornell um he wrote these GPU puzzles um you can run them in Colab um but the idea is basically to kind of distill down the ideas behind you know this sort of paradigm of like parallel GPU computing um and then have it presented in a very fun interactive sort of like puzzling puzzle solving way um and so through that I kind of learned like hey like this is how you actually think um in parallel essentially and so when I was implementing um these the WebGPU version I kind of reminded myself that like hey even though this is like a new framework things are a little hacky here and there like the essential idea is the same um and for me the goal for the these puzzles um was the same kind of um along the lines of the experience I had with Sasha Rush's and was kind of just distilling down um those really core ideas into something that beginners could digest anyone could digest and really have like a fun time with and so that's kind of like my like overarching philosophy when it came to these puzzles yeah I mean I'm just thinking it's very it it is inspiring though right because I I hear so many people say you can't expect to make any progress in a career in AI research or practice without a PhD and you know you are so slow Sarah you still don't have a PhD you know my goodness and yet you know like many many many fast.ai alumni I'm not saying fast.ai is a unique way to do this but it's very common way to do this you've
forged a great path and you know to be honest I did encourage you to consider joining Answer.AI full-time at least for a year like your strength you know your portfolio is strong enough that we're just about the hardest company in the world to get into and we're offering you a position so it's it definitely works out you know I I do want to ask though like a lot of people judge on things other than your pure demonstrated competence uh a lot of people will at least implicitly judge based on you know the fact that somebody's very young or the fact that somebody's female um and so I'm thinking for example my friend Tanishq who finished high school when he was 10 and he wanted to go to university um and he him and his family faced a lot of prejudice you know and struggled to get somebody to understand that he was ready to go to university and when he finally found a you know um professor ready to take him on they they were right he he shone at university and he went on to finish a really impressive PhD you know um so I think he beat you to that one um although you beat him to a NeurIPS publication so tough tough crowd um but you know it was it was a struggle um and you know I'm kind of curious to hear like it sounds like maybe I guess in their case his family were trying to get him to learn through a really classic okay stop going to school start going to university by doing it kind of more on your own online like has that meant there's been like less of a struggle for you or has there been times you've found it a bit challenging to get people to take you seriously based you know on your age or gender or anything else well I think the nice thing about having you know fast.ai and kind of my research as my side hustle um is that you know it's kind of less of I guess like a part of my life as um in comparison to maybe what Tanishq did like he graduated high school um when he was 10 which is like an insane thing to do and I mean really the only option for him afterwards was like
college um I feel like generally like my trajectory and kind of like my main hustle kind of route was very more or less typical um and so not having not being like I didn't really kind of face not being taken seriously or things like that just because it kind of was more um of a side hustle thing for me and uh but to be fair like if it were to be a main hustle I guess I could see um definitely see how that might kind of suck um I think there are definitely people out there that or like programs out there especially um like the MIT PRIMES program that tries to kind of help out these like younger um students kind of unlock that I guess potential yeah so cool like Vlad put in his time to invest in your success yeah he was telling me he thought like at the beginning when the directors of the program reached out to him and asked if he would help he was like this is going to be like a ginormous waste of time um but me and the student that he mentored before me um both published papers um and so that was like kind of eye-opening I think like I've heard around MIT and just in general like you do need to have like or like the the word on the street is that you need to have a PhD you need to have some sort of like higher level education in order to do these more like researchy and more like I don't know like interesting jobs but I think that that's sort of it's a little odd to me because well I mean like I think you're getting a bit of a like you're seeing how it is now right and what you're what you're saying is true now um maybe at some points it was a little less true but like right now there are few people in the world who have more experience with modern AI than you with your whatever five or six years like it's it's on the higher end you know and for somebody who spent 20 years learning I don't know Lisp and Prolog and Bayesian statistics and whatever they're probably going to take five or six years to unlearn that enough to be able to start where you were when you
were 13 so like for me this is like a bit of a superpower we have at Answer.AI is we basically totally ignore academic credentials and entirely focus on like portfolio you know um and yeah a lot of the folks that we end up wanting to work with are younger you know and often they never went to a fancy educational institution like MIT because they were off forging their own thing so I think yeah I think it is like I think your experience should become the new norm it'll probably take decades to get there you know yeah make the side hustle the main hustle yeah um yeah I mean I guess another thing about um you know being a a woman in tech in general it helps to have people who you know can help mentor you and so forth so it's nice like we've got Audrey at Answer.AI who was the founding president of PyLadies and probably knows more about how to deal with all that stuff than probably anybody in the world so uh yeah and I think like tying back to the MIT thing I think another part of the reason why I wanted to come to MIT was there are just so many people interested in tech um that are women and it's it's definitely hard to find anywhere else it's you know we have to keep it 50/50 for I guess whatever sake so it's a good high concentration of um I think like women that are interested no I mean it matters of course it matters absolutely and it's important to end up somewhere where you're going to do your best work surrounded by people you can do your best work with yeah so coming back then to your research um what's it been like for you seeing you know o1 come out this you know this renewed interest in kind of reasoning traces and reasoning combined with reinforcement learning um yeah what you know how are you feeling about this this this research field that you got into a year or two ago and are you planning to keep pushing on that yourself or is it like oh it's too mainstream now time to do something else no I think this is definitely very exciting for me I think like hey
like I chose the right path the people at OpenAI are doing it um but I think like in general there are along with OpenAI which I actually don't know what's going on under the hood apart from like Twitter rumors but there have been kind of like a bunch of papers too I believe there's one called like Quiet-STaR um a few more along similar lines that kind of deal with the same problem and I think this is like one of the one of the bigger questions with large language models is that can we like because large language models they they kind of infer things right like um they have this sort of representation of language and therefore sort of like logic and knowledge and somehow we need to uh somehow they kind of put those things together in a way that is coherent um but how do we actually like extract the things that we want right so I think that's going to stay a big question whether it's reasoning whether it's truth whether it's kind of like knowing like what things go with what things like I don't know if that was clear but yeah I can like rerun that too if you want to no no it's all good absolutely okay um yeah I mean the reason I asked is I um I sometimes wonder that with myself I'm like I kind of like to poke at the areas that no one else is doing you know so like if somebody sometimes if something something becomes super popular like okay I'm glad like even with like ULMFiT you know that was the first kind of real large language model you know or large language model kind of application of that kind and then suddenly everybody was doing it and I kind of felt okay maybe I don't need to be doing this now because lots of people are doing it there's something else I could you know uncover it's I guess that's a tricky thing with research is like do you want to keep uncovering in the same direction or do you want to explore new directions are there other kind of directions of research you've been thinking like oh maybe when you're done with this you'd like to go in
this other way well I mean just being at Answer.AI I feel like I've been exposed to sort of a lot of the different types of like research um I did like a few tangential tasks just kind of like exploring the different projects that was that were going on um and one of the things that I haven't touched in a long time was kind of creating um like an application like a research buddy sort of type of thing um and it's not as researchy as like my previous projects um but I think like being able to kind of create something with like an end user in mind is something else that I want to um definitely pursue during my time that's kind of our thing isn't it you know our thing at Answer.AI is all about research and development with an end user task and a specific end user in mind exactly and so kind of being able to still kind of experiment with different ideas having that sort of researchy aspect but also I don't know working towards a very tangible purpose um would be very cool so so before we wrap up I guess like um anybody who's watched to this point in the interview I'm thinking you know uh are thinking well I want to be more like Sarah you know I think you're a really inspiring role model for people um and a lot of people are where you were four or five years ago you know that they're just starting out probably feeling pretty intimidated um pretty overwhelmed and thinking like well I can't do this I need a PhD or you know whatever academic who's a family member or something like I guess like what what would your advice be to 14-year-old Sarah you know if she was feeling there must have been times you're feeling like I can't do this or I don't want to do this or is this worth doing or like am I too weird nobody else is doing this like what what kind of advice or feedback or thoughts would you pass along to that to that Sarah I would say kind of just to know what you're curious in and know what you're interested in and just go for it kind of full send um I'm glad you said that so like it's
it's about like you you you actually have you actually have to care and enjoy it like it's if if you treat it as a grind you're probably not going to do it right exactly like AI is probably not going to be everyone's cup of tea um but I was fortunate enough to have discovered it quite early on and I just knew that I was very interested in it and obviously there were times where like you said like things got hard I wanted to like drop everything yeah so sometimes you've got to grind it out you gotta sometimes you've got to grind it out but I think like know the reason why you were interested in in the first place I think if you're interested in anything um there's got to be something very genuine something very I don't know compelling about it um so just remembering back to the first time um you were interested in it um and kind of just knowing that like there is an end um goal in mind that you'll probably reach if you keep at it so well I'm definitely gonna share this story with my nine-year-old daughter who loves coding and she loves math and she loves calculus and uh I think she'll find this very inspiring uh and I hope that other kids and adults do as well but I know you're definitely an inspiration to me Sarah so thank you so much for this time and for being involved thank you so much for having me again okay bye bye
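Note on the transcript: the contrast drawn in the interview between one reward for a whole solution and a grade for each reasoning step can be sketched in a few lines of Python. This is an illustrative toy only, not code from Sarah's paper or from OpenAI. The `Step` class, the function names, and the hand-assigned `correct` flags are hypothetical stand-ins for judgments that a human grader or a trained reward model would supply.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    text: str
    correct: bool  # assumption: stand-in for a grader's judgment of this step


def outcome_reward(steps: List[Step]) -> float:
    """Single score for the whole solution, based only on the final step."""
    return 1.0 if steps[-1].correct else 0.0


def process_rewards(steps: List[Step]) -> List[float]:
    """One score per step, so feedback points at specific steps."""
    return [1.0 if step.correct else 0.0 for step in steps]


def first_error(steps: List[Step]) -> Optional[int]:
    """Index of the first incorrect step, or None if every step is correct."""
    for index, step in enumerate(steps):
        if not step.correct:
            return index
    return None


# A made-up three-step solution whose second step contains the mistake.
solution = [
    Step("2 + 3 = 5", True),
    Step("5 * 4 = 25", False),
    Step("so the answer is 25", False),
]

print(outcome_reward(solution))   # only says the solution failed
print(process_rewards(solution))  # says which steps failed
print(first_error(solution))      # locates where the reasoning went wrong
```

The outcome score alone cannot distinguish a solution that failed at the first step from one that failed at the last, which is the extra information the step-level grading described in the interview is meant to provide.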
Video description
Sarah has also written a blog post here: https://www.answer.ai/posts/2025-03-17-gpu-programming-scratch.html
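One of the puzzle topics named in the interview is the prefix sum. As a rough illustration of what "thinking in parallel" means there, here is a minimal sketch in plain Python, assuming the Hillis-Steele scan as the example algorithm. It is not WebGPU code and is not taken from the puzzles or the blog post; the function name `inclusive_scan` is hypothetical, and the inner loop only simulates what separate GPU threads would do at the same time.

```python
from typing import List


def inclusive_scan(values: List[int]) -> List[int]:
    """Hillis-Steele inclusive prefix sum, simulated sequentially.

    Each pass of the while loop stands in for one parallel step in which
    every position i >= offset adds the element `offset` places to its left.
    """
    data = list(values)
    offset = 1
    while offset < len(data):
        # Snapshot the array so every simulated thread reads old values,
        # mirroring the barrier between reads and writes on a GPU.
        previous = list(data)
        for i in range(offset, len(data)):
            data[i] = previous[i] + previous[i - offset]
        offset *= 2
    return data


print(inclusive_scan([1, 2, 3, 4, 5]))
```

The offset doubles on each pass, so an array of n elements needs about log2(n) parallel steps instead of the n - 1 sequential additions of a simple running total.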