AI and Machine Learning for Software Development in 2024

Exploring AI in development tools, data security, privacy, and local hosting. The Headway development team covers everything from using AI tools such as ChatGPT in browsers and code editors (including VS Code and Vim) to addressing the privacy and security concerns surrounding AI.

Presented by

Host

Jon Kinney

Partner & CTO

Host

Dan Diemer

Mobile Lead

Host

Chris Held

Development Lead

Host

Tim Gremore

Development Lead

Transcript

Jon: Hey everybody, welcome to Even Keeled. I'm your host, Jon Kinney, and today we're joined again by our panel of dev leaders at Headway. Today we have Tim Gremore,

Tim: Hello everybody.

Jon: Dan Diemer,

Dan: Hey there.

Jon: and Chris Held.

Chris: Hey everybody.

Jon: So today we're going to be focusing a little bit more on AI and ML. We had mentioned at the end of last episode that we would potentially be talking a little bit about Cloud Native development as well. We're gonna save that for a separate, more focused episode. So today we're gonna be jumping in and talking a little bit about AI in general and being able to use it in a browser with tools like ChatGPT. We're gonna be talking about how we can use it and leverage it in code editors, through Copilot or built-in tooling with other editors like VS Code, Zed, and Vim. We'll talk about troubleshooting as a developer in the age of AI and some cool things that we've seen recently on that front, some privacy and security concerns around AI, options for running LLMs locally, and what we've done specifically at Headway on the development side of things.

So let's just get it kicked off here with some of the tools that we've all found and used recently in the ChatGPT space, the browser interaction space, sort of the end-user, lowest-barrier-to-entry option for AI. So, I'm sure we've all loaded up ChatGPT's default web UI from OpenAI and interacted with it, right?

Has everybody had a chance to use v4 of the LLM, or 3.5 Turbo? What have folks interacted with most recently?

Chris: I have the free version, so I stick to the 3.5.

Dan: Yeah, same for me, though I've just started playing with 4 through an API interface for some specific things, and we can dive into why in a little bit.

Jon: Tim, how about you?

Tim: Yeah, 3.5 for me too. It's been sufficient for the types of things that I've been asking it to do, but I could see that need growing, and the value in paying for it.

Jon: Yeah, I think one of the big changes is the number of tokens that it can take in at a given time. For version 4, I've paid for an account, so I've been using version 4 for a while, and it's nice to be able to maintain context a little bit longer so that you can paste in more of a prompt that has a lot of backstory, or build up more of a conversation and then have it generate something a little bit more poignant than a one-off, "Hey, here's a blob of text, give me back a response." And it's interesting: when I need really heavy context and need to paste in a lot of content to get what I want, I've had better luck going to the web browser and using it through the OpenAI default web interface. But I also have Raycast installed on macOS, and I pay an extra 16 bucks a month or something to have unlimited access to, I think it's again, version 4 of ChatGPT's LLM, and because it's using the API, there seems to be more of a limit, even though it still is v4. I didn't dig into the specifics, but it's just been an interesting point that I've noticed. I've got a hyper key configured with custom key mappings and stuff, so I do Hyper-K and it pops open this chat interface, which is just right there.

So if I ever need to just do it really quick, I don't have to pop open a web browser and you know, get logged in and whatever. It's just always right there. So it's useful for a little bit of that smaller interaction and it can do medium sized things, but it does end up kind of capping out for whatever reason, if I paste in too much content, so it's interesting to see how that stuff's progressed over time with the number of tokens and ability to process those more cost effectively. And then the companies open those up larger as time goes on, I guess.

Dan: Yeah, I don't know off the top of my head what the cap is between 3 and 4. I know 3 is fairly low, but I think good for just normal conversation. My specific use case is for some client work: we're using it to analyze some page source, like HTML, and find specific inputs for a tool that we're building that's kind of like an autofill tool. What I ran into is that token limit with 3.5. Sometimes it would work and sometimes it wouldn't; we'd bump right over the edge of it. And so I bumped it up to 4, and that kind of worked for a while. But it looks like the newest version, the GPT-4 0125 preview, has a significantly larger, almost like an order of magnitude larger, amount of tokens that you can pass through, which seems to do the trick for us.

So I think that we're probably all bumping into little bits of that when we run into a scenario where the input just, you know, cuts off or doesn't work. So there's still room for improvement, but it seems like they're making some pretty big progress if we can jump from like 16,000 tokens to like a hundred-something thousand tokens.

Jon: Yeah, I just looked it up. So 3.5 is 4,000 tokens. GPT-4's context window is 8,000 tokens, so they doubled it. And then you said the GPT-4 preview, 0125.

Dan: Yeah, I think that's what I'm running on, and it might be up in the, you know, teens of thousands of tokens.

Chris: What is that in like, uh, in terms of perspective, Dan? So like 8,000 tokens, is that like a transcript of like a 45 minute podcast? Is it like a movie script? Like how much can you actually input?

Dan: It's a good question. For me, it really is relative to the HTML on a page, so generally it's a more complex, longer page, or a page that has a lot of JavaScript pulled in, and there are probably ways we can optimize that page source before we send it. What I've found is that it's a more commercial page that I go try it on, versus something a little more industry-specific for the client that we're dealing with. We've even tested on one of our internal pages and that ran fine, but running it against a thing on CodePen or somewhere else is where I started to bump into some of those limits.

Chris: Yeah, interesting. I was wondering if it's embedding like, you know, five megabytes of JavaScript on their page and we're pulling that in too, if that's hitting that token limit, but anyway.
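A minimal sketch of the kind of page-source optimization Dan alludes to, assuming the goal is to keep form markup while dropping embedded JavaScript and CSS before sending the page to a model. The `VisibleMarkup` class, the helper names, and the four-characters-per-token estimate are illustrative assumptions, not the tooling the team actually uses; a real tokenizer would give different counts.

```python
from html.parser import HTMLParser

# Assumption: a rough rule of thumb of about 4 characters per token.
CHARS_PER_TOKEN = 4


class VisibleMarkup(HTMLParser):
    """Rebuilds a page while dropping <script> and <style> elements."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif self.skip_depth == 0:
            # Keep the tag exactly as written, including attributes.
            self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <input ... /> are kept as-is.
        if tag not in self.SKIPPED and self.skip_depth == 0:
            self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def strip_scripts(html: str) -> str:
    """Return the page source without script or style content."""
    parser = VisibleMarkup()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (ceiling of characters / 4)."""
    return -(-len(text) // CHARS_PER_TOKEN)
```

Comparing `estimate_tokens(page)` with `estimate_tokens(strip_scripts(page))` gives a quick sense of how much of a page's token budget is spent on inline scripts rather than on the inputs the tool is looking for.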

Jon: I found "GPT-4 1106 preview input token is limited to 32,000 for tier two," and it seems like they're reporting an error here in the OpenAI forums, so it supposedly is gonna be higher than that. They said, you know, "I used to get this 32,000 limit when I was in tier one, but after my account has been upgraded to tier two, I'm still getting this same limit message," so that's interesting.

I'll see if I can find out what tier two should be, but that's quite a bit: 125,000 context length.

Dan: Yeah, I'm trying to figure out where I had seen it, but I think that the specific model I'm running against is upwards of like 128,000 tokens, which for all intents and purposes is like unlimited for me.

Jon: Yeah. For this specific use case, at least.

Dan: Here's the link. I'll drop it into our podcast chat. Yeah, GPT-4 caps out at like 8,000-ish. And then it looks like basically all the preview models, as of like mid last year, have a gigantic context window. What I don't know is whether that limit only applies to previews: once they lock a model in, will it continue to be that, or will they lock it down to a smaller token size so that you have to stay on a preview to have the larger one? Not a hundred percent sure yet.

Jon: Yeah, I can share this here. So if we scroll down, we can see GPT-4 and GPT-4 Turbo; the 0125 preview, like you mentioned, Dan, is the newest one. I would've, I guess, thought that 1106 would be newer, but I don't know what numbering convention they're using there. But yeah, up to 128,000 tokens. Very cool. What other consumer-facing AI tools are people using?
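Chris's earlier question, whether 8,000 tokens covers something like a 45-minute podcast transcript, can be approximated with back-of-the-envelope arithmetic. The sketch below assumes roughly 150 spoken words per minute and roughly 0.75 English words per token; both are rules of thumb rather than measurements, and the context window sizes are the figures quoted in the conversation, keyed by illustrative labels.

```python
# Assumptions (rules of thumb, not measured values).
WORDS_PER_MINUTE = 150   # typical conversational speaking pace
WORDS_PER_TOKEN = 0.75   # about 4 tokens for every 3 English words

# Context windows as quoted in the conversation above.
CONTEXT_WINDOWS = {
    "gpt-4": 8_000,
    "gpt-4-0125-preview": 128_000,
}


def transcript_tokens(minutes: float) -> int:
    """Estimate the token count of a spoken transcript of a given length."""
    words = minutes * WORDS_PER_MINUTE
    return round(words / WORDS_PER_TOKEN)


def fits(tokens: int, model: str) -> bool:
    """Check whether an input of `tokens` fits in the model's window."""
    return tokens <= CONTEXT_WINDOWS[model]


episode = transcript_tokens(45)
for model in CONTEXT_WINDOWS:
    print(model, "fits" if fits(episode, model) else "too long")
```

Under these assumptions a 45-minute episode lands a little above the 8,000-token window and well inside the 128,000-token one, which matches Dan's experience of the larger window feeling effectively unlimited for his use case.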

Chris: I've been using, well, it's just kind of a fun tool to play with, but it's called Make Real, and it's essentially a canvas where you draw a UI and you can annotate it and say like, oh, I want this button to do this, or this should look like this. And then you press a button and it sends that up to the OpenAI servers and they come back with actual, you know, componentized Tailwind HTML, which is just really cool and interesting. And then something that's also very similar to that is v0, which is just a much more polished version where you basically create a UI with a conversational AI, or you can give it a screenshot to get started. So both of those things are really cool. I don't know that it would translate directly to our client work right now. But for personal stuff, I've used it before to get inspiration on my own designs for side projects, and, you know, all of the stuff that it gives out is actual usable HTML, so it's pretty interesting.

Jon: What's the big limitation right now, Chris, for not using it with client work? Is it that the components it's creating don't have the right abstractions, that they're all just kind of one-off, like it isn't a comprehensive design system? Or where's the edge?

Chris: It's not a comprehensive design system, so there's a disconnect between how we build products and how v0 kind of wants you to build products. If there was a way for me to say, okay, this looks great, now break this down into a design system, that would help, but there just isn't that yet. And I don't know that there ever will be; that might not be the point of the tool. It's really just to give you a jumping-off point. I guess I could see potentially taking a screenshot from Figma, uploading it into v0, saying, get me started here, and then we break it down ourselves. So that might be able to buy us a little bit of time, but currently, I'm just not sure that it's there yet.

Jon: That's interesting. Yeah, I've got a new client project that we're gonna be starting hopefully in the next week or so that has the Figma design already pretty well in place. And we did some estimates based on the prototype that they had and it looks good. So I think there isn't gonna be a ton that our design team will change and tweak. There might be some specifics on the Figma design system sort of backend, so to speak itself. Uh, but it'll, I think I might upload some screenshots and see what we get from a jumping off point.

Chris: Yeah, definitely worth a shot.

Jon: I also saw Arc has a new concept of how we should all be browsing the internet, and it's less of us going out and browsing the internet and more of them doing some work to bring the internet to us. And I think AI is a big part of their vision. Has anybody used any of the new Arc tooling around AI?

Dan: I've used a little bit of it. It still is getting in my way enough that I haven't made it part of my workflow. And so as it pops up, a lot of times I'm quick to dismiss it, and I need to remind myself to give it a chance, to try some of the stuff that it'll do, like page summaries and find in page, 'cause those are the things where I see it the most often. And a lot of times I'm still in my Luddite phase of like, get outta here, I'm just trying to find this thing that I know is on here. I'm not trying to ask a question about this page, and I probably need to get in the habit of actually trying that.

Tim: Yeah, that's similar to where I've been at. If they're presenting it in an update, meaning I didn't have to turn on the option or opt into the experience, I'm probably 50/50 as to whether I'm using it or not. That's probably true of any other software too. For myself, I'm not looking for it, but if it's presented, I'll try to give it a chance. I'll try to weigh the value of it. But sometimes I'm just trying to look around it to get to the thing that I'm usually doing, which is Google and filter and then find the result that I need.

Dan: And we might not be the audience for those features either, right? I think we're already comfortable with AI, comfortable with using it, and so a lot of times we're proactively going to find it when we want it and not needing it necessarily pushed in front of us. And so I think maybe that's why it feels a little more jarring: I already know about it, and right now, for the task I'm trying to achieve, you're actually slowing me down or making it worse, right?

Sometimes when, the Bard results come up in my Google, they're often not what I need or specific enough to what I'm trying to do. And the link I want is down like four links below that. So that might be part of it and maybe it's not as bad a situation as it feels like, but

Jon: Did anybody watch the Arc release where they have the folks in the diner and they're showing them, hey, what would you do normally, Google this recipe or Google these directions? And it goes out automatically and searches for the right stuff and then creates actually a very visually pleasing and easy-to-parse website with multiple portions of the content it found stitched together. I thought that was really, really cool, and something that I definitely could see myself using, because when they revealed this feature I noticed that I was the one doing the work of the computer previously. It was pretty neat to be able to see their vision for bringing those pieces of the internet to you in a more streamlined way.

Tim: I'm curious when you do, get an answer, how often are you then checking that against other search results for

Jon: Accuracy. That is tricky. Yeah. I saw something the other day where somebody held up a note card to Travis Kelce at the Super Bowl and said, "Hey, Travis, do my math homework." It had parentheses; it was order-of-operations stuff, right? And he got it wrong, I guess. And then in the comments, this might have been on Reddit or something, people were like, "ChatGPT says he got it right," and other people were like, well, ChatGPT is returning the three most popular answers; that doesn't mean it's correct. You know, it scours the internet, trains on what it's found, and isn't necessarily performing the math operations directly.

Dan: I think that's actually the most surprising and troubling thing about something like ChatGPT: just how abysmal it is at actual math. And I worry about future generations that don't understand that, that will be leaning on it to do that kind of stuff, and how that will impact them.

I'm also a little concerned, and I haven't seen that video from Arc, but what does that mean for content creators? Specifically, people that are building lifestyle blogs with recipes. If something like Arc is coming by and just grabbing a snippet here and a snippet there, I assume that that's gonna have some sort of negative impact, and I don't know how you overcome that to get around it, you know?

Jon: It is a little chicken and egg, right? Because Arc needs the content to be out there somewhere. It isn't inherently a world-class chef. It doesn't know how to combine the ingredients, you know, to make something tasty. Humans have done that work and documented it. So if we stop documenting new things, what does the AI learn from, right? That's interesting.

Dan: Yeah, I wonder if we're gonna see something like, um, and maybe this exists and I just am not aware of it, but like we had with Google in the early days where you could, tell your robots file like, hey, don't index this. A way for them to, opt out ahead of time from being indexed into these AI things. Now, whether that's respected, who knows, but.

Jon: Hmm.

Tim: Right, and nobody knows what the legal implications are yet. there's no real clear standard as to how that would play out if somebody's going to be malicious with, that tech.
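The robots-file opt-out Dan describes can be sketched with Python's standard `urllib.robotparser`. The sample file below is hypothetical; `GPTBot` is the user agent name OpenAI documents for its crawler, and, as Dan notes, whether any given crawler actually honors the file is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a recipe blog: block one AI crawler,
# allow everything else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/recipes/apple-pie"
print("GPTBot allowed:", is_allowed(SAMPLE_ROBOTS_TXT, "GPTBot", url))
print("Other bot allowed:", is_allowed(SAMPLE_ROBOTS_TXT, "SomeSearchBot", url))
```

This only models the publisher's stated preference; it is a convention that well-behaved crawlers check, not an enforcement mechanism.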

Jon: Yeah, there's the malicious aspect, and you talked about creators a little bit there, Dan. What is the guild or the association?

Chris: The Screen Actors Guild

Jon: Yeah.

Chris: Is that it? SAG?

Jon: Yep. So the Screen Actors Guild has definitely come out against generative AI, and that was before Sora, and now we've all, in the last couple weeks, been seeing Sora create more and more realistic videos. Sure, the first couple, you know, the woman had extra hands or an extra foot or something funny like

Dan: She galloped like a horse was the...

Jon: Yeah. There were multiple appendages and galloping. I think, oh, it was the gentleman leaning back in the chair on the beach who had two arms on one side. But it's getting better and better, literally daily, and it's gonna be a very interesting thing to navigate in terms of creating that content. I know actors don't want to be scanned so that, you know, studios could just use their likeness to create films, but I wonder at what point they just bypass the whole human talent thing in the first place and go fully digital.

Dan: Yeah, it does feel like a little bit of the bridge to some of the weird stuff that was happening, I don't know, 10 years ago with Vocaloids and holographic celebrities in other countries. It seems like this is what lets them just get right to that point where they can invent a new celebrity on the fly and throw 'em into a film or a TV show.

Tim: Right. At some point that artificial product is maybe adopted, but it won't be fully adopted. I think people are still gonna crave actual, real person-to-person interaction. You just cannot replace what we've known, whether it's on film, on screen, or, obviously, in person.

Dan: Yeah. I wonder how long we're gonna live in that uncanny valley. My gut tells me a lot longer than we think, where if you listen to

Jon: I kind of hope so.

Dan: Yeah, right. You listen to somebody read something, like an audiobook, and then you listen to an AI read it. There comes a point where you become very aware that you're listening to AI. And for me it's really off-putting and hard to ignore at that point. It's fine in conversational tools, but when it becomes entertainment, I don't know if I'm ever gonna get to a point where I can just happily listen to an AI talk to me for 40 minutes.

Tim: I don't think you can. I mean, just witnessing the effect that separation had on society during the pandemic, the same loneliness that is just overtaking society. You can't be disconnected from people. We're just not made for it. And no matter how good AI gets, no matter how realistic it may seem on screen or in my ear, it just will not replace human interaction.

Dan: One of my favorite, uh, AI goofs that's happened recently. I don't know if you all have seen this, um, was the Willy Wonka event that happened in Scotland

Chris: Yeah.

Dan: in the last few weeks.

Jon: it sounds vaguely familiar, but I didn't dig into the story.

Dan: So there was a Willy Wonka experience that was advertised, and they used generative AI for various parts of it: the promotional and marketing materials, the imagery, they even used it to generate the scripts for the live actors. And so people showed up at this warehouse, I think, in Edinburgh, and were expecting to have this wonderful experience with their children. And it was this dismal, dirty warehouse with sparse tables and a few little exhibits and some actors reading off of these AI scripts.

And it was a complete bait and switch from what they thought they were buying tickets to, based off of all the images. Because if AI imagery is good at one thing, it's creating fantasy wonderlands. And so, with just the stark reality of walking into this warehouse, people were so mad they called the police. The company that put it on is refunding everybody. But one of the actors described the script, and from my perspective, it's so clearly AI: it went into extensive detail on how excited the audience was gonna be at all the joyful things they were saying. It's very flowery. It made no sense for it to be in a script that you're just trying to read as an actor. But all the hallmarks that I think we see of like, oh, that's AI, were just scattered throughout this. And I think what's important there, the big takeaway, is people don't have an appetite to be scammed by that type of stuff. And so the pushback, I think, will always be pretty heavy if it's not delivering the value you think you're getting.

Chris: Yeah, I think it definitely has to be in tandem still, right? I don't know that AI will ever get to the point where it's ready to take the lead and just run, and, you know, you can just say, "Hey, ChatGPT, write me a novel," and I'm gonna publish it, you know, and make money off of it. It can be an assistant and it can work alongside you. But as far as doing something completely on its own, it's just not there yet. What I'm excited for with Sora is creating content that's just individual to me. Not to sell it or see what other people have made the AI do, but just, you know, if I wanna be inspired by something specific and I can't find a movie like that, I could just say, "Hey, Sora, make me a movie about what it would be like to go to space," you know? And then I could just watch that by myself and keep my own individual collection of stuff that's customized for me.

Jon: Yeah, that's really interesting. I've heard recently that in the music industry, we don't have shared experiences anymore around things like the Beatles or the Rolling Stones or, you know, even as recent as Nirvana, right? Where everybody kind of at least knows some songs and it transports you to a time or an era or a feeling or whatnot. With the ability with Spotify and Apple Music and others to be so hyper-specific on the exact niche that you prefer to listen to, people just kind of don't share a lot of those same tastes or have those same shared experiences anymore. And with what you said, Chris, being able to generate not just very niche things from other artists, but your own content that is just for you, I hadn't even considered that. That's the next level of that problem, so to speak, if it's a problem. It's a double-edged sword, right?

Chris: Mm-hmm. Yeah, I think that gap will probably widen a little bit. But that said, I still would want to see how the pros do it, right? I would still wanna watch a real movie or a real television show, or listen to a real artist, instead of just, you know, what I generated with the help of AI.

Dan: I think it could be an incredible teaching resource. I think of a middle-grade teacher that's created a lesson plan around some topic, especially for history. Being able to put in a prompt and get an engaging video to really engage your students, so that they can just watch and understand a concept in such a better way, or to give you a level-up for the lesson you're going into without having to go track down the source. I think that could be a really interesting angle for people to use a tool like Sora.

Jon: Yeah, you run up against the problems that Google has had recently where the Gen AI is creating things that aren't historically accurate. So that's definitely something that could be a problem.

Dan: For sure. Yeah, the accuracy problem is probably gonna be the Achilles heel for this thing, until they can get through that and find a way to not make an AI assistant that's so happy to please you that it will just tell you nonsense.

Jon: Well, they say AI isn't gonna take your job, but a human using AI will. And we certainly fall into that category, where we're having AI help facilitate our coding and our bug hunting and troubleshooting and architecture, and UIs, as Chris alluded to earlier, with v0.dev. What have we all used collectively in the code editor AI space? Copilot's the obvious one. Dan, do you wanna chat a little bit about Copilot?

Dan: Yeah. Actually, before I even got into Copilot, I was exploring what other options there were. I had spent quite a bit of time using Codium, which I think is an equally good solution. It's, I think, open source. For us, organizationally, we have access to Copilot, but for individual developers, especially people trying to get into the dev space, maybe someone in a junior role, keeping up with all of your peers at a monthly cost might not be something that everybody is able to do. So it's good to know that there are free options out there like Codium that can give you relatively similar and pretty high-quality answers to all of the questions you have.

But yeah, I've switched off of that and into Copilot, mainly just because we have access to it, and I wanna make sure I'm using all the same tooling that everybody else on the team is using. And so I've used that in a variety of different spaces: through Vim, through a little bit of VS Code, and then through the Zed editor that I've been diving into over the last week after Tim mentioned it.

Jon: What have you found built into Zed, Dan, that's helpful from an AI perspective?

Dan: It being kind of a first-class citizen in the space of Zed is really interesting. So you have a panel right away. I think the best thing about getting started with Zed is the very batteries-included approach that they have. So having access to Copilot right there without having to install an extension is pretty cool. Being able to chat with your assistant there about whatever problem you're facing or bug you're trying to fix, without having to leave the tool, is pretty nice. There is a quirk to it that I haven't fully figured out. When you pull up the chat assistant in Zed, it will run what you're typing back through Copilot. So now I'm getting AI prompts for the prompt that I'm trying to give to the AI, and it feels very, very Xzibit, Pimp My Ride. They've put a prompt inside my prompt 'cause they heard I like prompts. And so I'm still working around that. It feels like a bug. It feels like it cannot be what you actually want, 'cause half the time it's completing nonsense things. But it's been really useful, and it's been nice to be able to have everything kind of in one place and not go pull, you know, a chunk of code into my clipboard and go throw it in ChatGPT, and then work on it and go back. I've been using that quite a bit and I think it's a really good solution.

Jon: Yeah, I know there's the ability to, quote unquote, chat with your code base, and that's an advanced feature of Copilot, if I'm not mistaken. I haven't had a chance to use it myself yet, but it's extremely intriguing, because if it can maintain that context and start to introspect files automatically in the background and build up more awareness of the vernacular of your code base, or the routing of your code base, or the CSS approach of your code base, right, the testing approach of your code base, and now you ask it to help you understand why something's broken or what you should build next in order to complete a feature, that feels like it's going to be really, really next level. Even if you're already a senior developer, it's like pair programming with a super senior, all-knowing developer. I think we've all probably had that experience where we're a junior dev pairing with a senior dev who already knows the solution, but they're being patient with us to try and help, you know, help us struggle a little bit to learn as we go the right order of operations and the right way to do things. I don't know that I'd want the AI to have the condescension that perhaps some of us have felt in those scenarios, but to be able to be more or less one step ahead and know, if we give it the right prompt, how to guide us, I think could really accelerate our capabilities as developers.

Tim: I've been seeing, at least, it's very subjective, but in the last few days even, increased productivity with Copilot in an Elixir project. So I would say my experience with a JS project in Copilot felt a lot more productive. Again, subjective, but it just felt as though I was getting more helpful prompts from Copilot in the context of that JS project. But it's one of two things: either the Elixir project has been improving to the point where now Copilot can better prompt and inform what I'm trying to do, or perhaps I'm leaving more descriptive text, more descriptive comments, documentation in the app. I'm not exactly sure what is leading to the productivity, but it has been noticeable just this week even. And I'll be curious to see if that continues as I continue to build this Meta code project.

Jon: Nice. Yeah, I know it's a skill to be able to prompt engineer. I mean, we saw at the super height of the hype cycle of AI, which we're still in, but it's calmed down a little bit, I suppose, where people would offer $600,000 for a prompt engineer to come to their organization. And I just don't understand how that's a thing in and of itself. But definitely, being able to use AI effectively is a learned skill. So I guess I'm not surprised, Tim, that the more you use it, the better at it you get. And I'm also not surprised that it had more help for you with JavaScript code, because the world is being eaten by JavaScript, and so there's just so much more source material for AI to mung together for you than there is for Elixir code.

Tim: Right, and I haven't had a chance, at least in the last four weeks or so, to take that same test to a Ruby project, but I'd be really curious to see what the experience is in Copilot with Rails.

Dan: Yeah, I think it's a great tool for people to level up their full-stack experience. A lot of times, unless you're just that front-end or that back-end dev that really has no interest in branching out from there, if you've got some idea of what you're trying to do, I feel like Copilot can be a really great helper and teacher to help you get from point A to point B, especially if you're jumping into a more full-stack project for the first time.

Jon: I saw the other day on Twitter the concept of being able to troubleshoot your code more effectively with Gemini 1.5 Pro. So Google's Gemini has obviously come under some scrutiny here with its image creation, but it also has the ability to ingest images and work with those. And a developer on Twitter posted a video of taking some code that he had written, I believe it was in Python, and he intentionally added some bugs to the code. He took screenshots of the code, took screenshots of the bug occurring in the UI, and asked Gemini to help troubleshoot what was going on, and it spit back out the better code and said, it looks like you're having this issue. It was really cool to see where we could be headed with regard to that, because I even saw a tweet yesterday that was at the complete opposite end of that: troubleshooting code with AI is nowhere near where it needs to be, it often just spits back random things for you to try and isn't more helpful than just Googling. And maybe this developer on Twitter with Gemini 1.5 Pro tested things to a point where he knew he was going to get the result he wanted for attention. I'm not a hundred percent sure. But it's very interesting to think about a future where you run an integration test suite, there's a failure in your end-to-end tests that records what happened in the browser, and the AI model already has access to your code. It sees what happened in the UI and can suggest not just a stack trace, but like, here's what you need to do to go fix this situation. That could be really, really empowering, to lower the barrier to entry of troubleshooting some of those production problems.

Dan: Yeah, I look forward to that day. I think that would be a, yeah, great, great use case of it. The other thing that comes to mind is in the Android world, you get this feature for free from the Google Play Store where they have a robot go in and, like, poke at your app before it goes into the next stage, and, uh, sometimes it will uncover crashes for you. And it would be great to be able to have a process to upload that crash video plus your code into something like Gemini and have it figure out what went wrong, what did you miss, what tests need to be written.

The Ruby tweet that you talk about makes me feel like, maybe it's just a little bit of a skill level that needs to be increased, 'cause it seems like I've had that kind of experience where it just gives me random garbage, but it's always been fixed by providing the right context.
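As a rough illustration of "providing the right context" when the model can only accept so many tokens, here is a minimal sketch that picks files in priority order until a budget is used up. The four-characters-per-token estimate is a loose assumption rather than a real tokenizer, and both function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption, not a tokenizer)."""
    return max(1, len(text) // 4)


def select_context(files: list[tuple[str, str]], budget: int) -> list[str]:
    """Pick files in priority order until the token budget is used up.

    `files` is a list of (path, contents) pairs, most relevant first.
    """
    chosen, used = [], 0
    for path, contents in files:
        cost = estimate_tokens(contents)
        if used + cost > budget:
            continue  # this file does not fit; a smaller one later still might
        chosen.append(path)
        used += cost
    return chosen
```

In practice the ordering matters more than the arithmetic: the file where the bug shows up and the failing test should come first, so they are the last things to be dropped.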

Jon: Yeah, I agree. I think being able to provide that context, and having enough tokens available to you in the model that you're using for the AI to be aware of all of what it should be considering, is potentially another unknown part of what we need to troubleshoot or judge or look into when we hear somebody having success or failure with one of these models. Uh, but getting back to what you said, Dan, about the ability for a Google Play app to be tested for bugs or crashes, back to Zed again, they tout this on their homepage too. They say extensively fuzzy tested for stability, and there's a gif of, you know, blocks of, uh, code, I suppose, being analyzed, uh, or tested, that they say creates a much more robust, less likely to crash editor.

Dan: I wanna opt into fuzzy testing. How do we do that?

Jon: Just squint while you're typing.

Dan: Oh, okay.
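Squinting aside, the testing Jon describes on Zed's homepage is usually called fuzz testing: feed a piece of code large volumes of random input and record anything that makes it crash. The sketch below shows the idea in its simplest form. It is not how Zed does it; the toy `parse_key_value` target and the `fuzz` helper are hypothetical names invented for illustration.

```python
import random
import string


def parse_key_value(line: str) -> tuple[str, str]:
    """Toy function under test: split 'key=value' into its two parts."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def fuzz(target, runs: int = 1000, seed: int = 0) -> list[str]:
    """Call `target` with random strings and collect the inputs that raise."""
    rng = random.Random(seed)  # fixed seed so any failure can be reproduced
    crashes = []
    for _ in range(runs):
        length = rng.randint(0, 40)
        candidate = "".join(rng.choice(string.printable) for _ in range(length))
        try:
            target(candidate)
        except Exception:
            crashes.append(candidate)
    return crashes
```

Calling `fuzz(parse_key_value)` returns the list of inputs that crashed the parser; an empty list means no crash was found in that run, which is weaker than proof that none exists.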

Jon: So we've been talking a lot about being able to leverage hosted AI platforms, consumer tools, AI tooling that's in our code editor that I think most people realize is going to be opening up your code base to portions of it being sent out to, quote unquote, the cloud. There are some security concerns with that. I think it'll be really interesting as we get better hardware and more fine-tuned models where we can run some more of this stuff on the edges rather than at a big compute center that requires just stacks and stacks of GPUs. So what have we seen in the space of hosting LLMs locally?

Tim: Now, in terms of bringing down an LLM from, uh, from Hugging Face, for example, that we've done. But in terms of hosting a chat model locally for some of the types of interaction and work that we've been describing in our conversation so far, I haven't done that yet. There's a healthy number of open source chat models that are growing to provide users with an option to host locally instead of having to lean on something in the cloud. But I guess back, Jon, back to your question of what have we done with those models? We've had a couple examples of client work, and then one internally that we've been working on, of, like I mentioned, using models from Hugging Face. A couple of those have been data science related and using ML to do analysis on data that they've collected or that they anticipate collecting. What those clients have been finding, at least one in particular, is that they're limited. They just simply don't have the ability to keep up with a growing customer base without some automation, without ML. And the security concern exists, right? So they don't quite know what the implications would be of leaning on a hosted option for the ML work that they are asking us to do. Being able to pull down a model, then host it alongside the application, and eliminate that dependency, eliminate any sharing of information over the wire, really gave us the confidence to recommend it, gave them the confidence to try it, to explore it, and it's been good. It's held up in our prototyping to the promise early on.

So the client work has been good. More specifically, our internal project with Medicode, as we've named it, has been excellent. The ability to run something contained, including that language model, both locally for development and then in a hosted environment, is just really hard to beat. It makes development so much quicker, and the feedback loop is so much faster.

Jon: Seems like it'd be a much more predictable burn rate for hosted costs as well.

Tim: Yeah, very much so. For us, we've been using the ML ecosystem in Elixir. We haven't dug deeply into the Python ML world, um, mostly because we know Elixir, we know the benefits of Elixir, and we've seen such rapid progression just in the last three years within the Elixir community to go from not having an ML offering to having something that's production ready and able to compete with Python. Three years is not a long time, but the tooling and those libraries have come so quickly that it only suggests that it's gonna continue to accelerate. And so being able to do ML in that same language which you are using to build your web applications, being able to use something like Livebook to interact with those applications and do prototyping and data analysis with, uh, there's so many other advantages to staying in Elixir that it's ideally suited for this space, for ML. And then hosting something like Elixir is very inexpensive because it's performant. There's been kind of a fun thread this past week around Twitter of, you know, folks saying, here's the application I'm working with, here's, you know, annual recurring revenue, monthly users, and I'm sitting on $50 a month with Fly.io for the entire cost. And so taking all of those examples, and again, they're not hard to find if you search, it really suggests that Elixir is uniquely suited for this ML space. And, uh, it'll be fun to watch where it goes from here.

Jon: I think it's great that it feels like a language that is growing in that area and developers are adopting it more, and that's one end of the spectrum. And you could have that in, I think, many languages where the developers enjoy working in that space, in this case ML with Elixir. But if the ecosystem itself, if the framework or the language, doesn't also view things that same way, it's gonna at best be an uphill battle and at worst be dead on arrival. Right. And luckily for Elixir, we had the opportunity the other day to chat with José Valim, the creator of Elixir. He joined us on his private Zoom call for what was 45 minutes or so, maybe an hour, just to chat about what we were building in Medicode with ML. And he wanted to pick our brain. ML is a huge area of concentration for the Elixir team. Obviously, if the creator of the language took the time to meet with us just to see what we were doing in the space, that speaks volumes to where they view that language and framework going. So, uh, it's exciting. Not just that we find it well-suited for that one use case, but that the creator himself also finds it the next frontier for Elixir.

Tim: Right. There was certainly some forethought and vision when the idea of developing Nx, uh, Numerical Elixir, when that started three years ago. And the number of contributors that have come into this space and the number of libraries that have been developed is, I think, even beyond what the initial vision was for it. And, uh, it's a lot of fun to work with. Livebook makes the barrier to entry very small, the entry into ML, that is. If you can install Livebook, there's a one-click desktop launcher for it. And before you know it, you can add a smart cell to that running Livebook and be interacting with a neural network task. Things like token classification, speech to text, image generation. There's a good collection of examples that you can begin to get inspired by and understand how Elixir even facilitates everything from downloading the model from Hugging Face, for example, to running it locally on your machine and then eventually up on your hosted platform.

Jon: Yeah, Livebook is, for those not familiar, akin to something like Jupyter Notebooks in the Python community. And if people aren't familiar with either of those things, Tim, could you just take a step back, 10,000 foot view. What is Livebook? Just give folks the understanding of the benefit of something like a Livebook, uh, that might not have a parallel in every other development community.

Tim: Yeah, yeah, good question. And Livebook really introduced me to Jupyter Notebooks, and so this was new to me, though obviously many, many developers have used Jupyter Notebooks and benefited from that in the Python community. And as best I can tell, it started as a tool to help do data analysis in that space. And so once the idea for ML in Elixir came about, Livebook came with it. And the ability to do data analysis in Livebook exists today. So what is it? Um, if I start a Livebook process, that means I have a single server that's running a Livebook server, and within that server I can have any number of what are called code notebooks being executed, being run. And a code notebook is nothing more than a markdown file. And so when I am editing a code notebook in the context of a running Livebook server, or in the context of my code editor, because again, it's just a markdown file, I'm editing something that's executable. And so, for example, if I want to read the contents of a CSV file, and this is a very simple task, typically you would look for some way to ingest that CSV file either into your code base or into a spreadsheet, and then from there you would begin to create a data structure around it so that you can begin to interact with it, make changes to it, possibly write it back to the CSV file, save it somewhere on your disk. And that flow is all collapsed into one single interaction within Livebook. And so I could take that file that I want to do some data analysis on, I can drag and drop it onto my Livebook server, which would be running within your browser, and Livebook will ingest that file, read it, create a data frame around it. And so if you're not familiar with the data frame, it's a very useful, very powerful way to read, interact with, and transform large amounts of data. And so I can have a visual representation of that data frame, again in my Livebook, all from simply dragging and dropping that CSV file onto that server.

So that's just one very specific example. People are finding all kinds of new ways to use Livebook though. So if it started as a way to do data analysis, it's becoming so much more than that. I mean, I saw somebody referencing an experiment that he was doing with Livebook that interacted with ChatGPT and had ChatGPT beginning to then author some of his Livebook cells. And so there was this loop that he had been creating in his experiment to build up, uh, you know, some very wild interaction and experience. And, uh, people are getting very creative with it, beyond just data analysis, but it certainly does that very well.
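For readers who have not worked with a data frame, the manual flow Tim describes (ingest the CSV, build a structure around it, then query it) looks roughly like the sketch below. This is a bare-bones stand-in written with Python's standard library, not Livebook or its data frame tooling; the sample data and the `read_columns` helper are made up for illustration.

```python
import csv
import io


def read_columns(csv_text: str) -> dict[str, list[str]]:
    """Load CSV text into a column-oriented table: column name -> list of values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            columns[name].append(row[name])
    return columns


# Hypothetical sample data standing in for a dropped CSV file.
sample = "city,temp_c\nFargo,-5\nAustin,22\nGreen Bay,-2\n"
table = read_columns(sample)

# Once the data is organized by column, whole-column questions become one-liners.
temps = [float(t) for t in table["temp_c"]]
average_temp = sum(temps) / len(temps)
```

A real data frame library adds typed columns, filtering, grouping, and rendering on top of this column-oriented idea, which is the part that drag and drop in a notebook hands you for free.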

Jon: Well, this has been great, everybody. We're coming up on time, so anything else, Chris, Dan, or Tim, that you want to plug or mention that you've seen in the past couple of weeks that's interesting?

Dan: We talked a little bit about Supabase, I think, last episode, but one of the best features they've added recently has been an AI integration focused around their dashboard and interface, and I found it super, super helpful as I've been learning that tool, but also for some features of Postgres that I didn't know how to use, to be able to describe to the system what I'm trying to do and have it generally give me a result. But the interface to it I really enjoy, because it gives it to you in more of a draft state. So it'll give you, like, here's the changes I'm gonna make, in kind of a diff. You can accept or reject them. And I think that's a cool way to kind of bridge those two things, the AI and the code that it generates, without it just smashing through whatever query you're gonna try to write. So I think that's kind of a cool, novel use of the tool that I've been pretty pleasantly surprised by.
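The draft-state interaction Dan describes can be sketched in a few lines: show the suggested change as a diff, and only swap it in if the person accepts. This illustrates the general pattern only and says nothing about how Supabase implements it; both function names are hypothetical.

```python
import difflib


def draft_diff(current_sql: str, proposed_sql: str) -> str:
    """Return a unified diff so a person can review a suggested change first."""
    lines = difflib.unified_diff(
        current_sql.splitlines(),
        proposed_sql.splitlines(),
        fromfile="current",
        tofile="proposed",
        lineterm="",
    )
    return "\n".join(lines)


def apply_if_accepted(current_sql: str, proposed_sql: str, accepted: bool) -> str:
    """Keep the original query unless the reviewer explicitly accepts the draft."""
    return proposed_sql if accepted else current_sql
```

The design choice worth noticing is the default: rejecting, or simply not answering, leaves the original query untouched.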

Chris: One thing, uh, we talked a little bit about running LLMs locally. Um, there is a tool called Ollama. It's like a CLI tool that you can pull down, and you can use that to pull down LLMs, run them locally, and then you can have essentially ChatGPT in your terminal. So that's a tool that a couple people on the team have been messing around with.

Jon: Nice.

Tim: Have you tried that, Chris?

Chris: Mm-Hmm. Yeah. Um, I don't know. It was, it was okay. I didn't try to do anything hard. I was basically just like, gimme a good recipe for, you know, bread or whatever. Um, but yeah, it, it did what it, what it said it was supposed to do. So yeah, it worked fine.
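Besides the terminal chat Chris describes, a running Ollama instance also serves a local HTTP API, so the same model can be called from a script. The sketch below assumes Ollama is running on its default port and that a model has already been pulled with `ollama pull`; the helper names are hypothetical, and any model name passed in is a placeholder for whatever is installed locally.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a local Ollama server."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str) -> str:
    """Send the prompt and return the generated text (requires a running server)."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]
```

Because the request goes to localhost, the prompt stays on the machine, which is the privacy property the panel discusses for locally hosted models.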

Dan: I did use the ChatGPT app last week to make calzones, just asking it for the recipe, the time, what to set the oven to. We had good calzones, so that was a good interaction with the tool.

Jon: All right. Well, thanks for the discussion, everybody. It's been great to catch up on AI and the ways that we use it at Headway, both as consumers and as developers, and how we build software for our clients using AI. So we'll chat with you again in two weeks.

Dan: Bye everyone.

Chris: Thanks everybody.

show notes

Content

01:13
Exploring AI Tools and ChatGPT Experiences

08:27
AI in Code Editors: Enhancing Developer Productivity

23:13
The Future of AI in Troubleshooting and Development

33:10
Local Hosting of LLMs and Elixir's Role in AI

41:41
Wrapping Up: AI Innovations and Future Directions
