Transcript
Claims
  • Unknown A
    If you do post-training on Gemini 3 badly, it can produce some misinformation, but then you fix the post-training. It's a bad mistake, but it's a fixable mistake. Right?
    (0:00:00)
  • Unknown B
    Right.
    (0:00:07)
  • Unknown A
    Whereas if you have this feedback loop dynamic, which is a possibility, then the mistake is that the thing catapulting this intelligence explosion is not trying to write the code you think it's trying to write and is optimizing for some other objective. And on the other end of this very rapid process that lasts a couple of years, maybe less, you have things that are approaching Jeff Dean level or beyond, or Noam Shazeer level or beyond, and then you have millions of copies of Jeff Dean-level programmers. And that seems like a harder-to-recover mistake, and a much more salient risk. You really gotta make sure we're going…
    (0:00:08)
  • Unknown B
    Into the intelligence explosion, as these systems do get more powerful, you know, you've gotta be more and more careful.
    (0:00:46)
  • Unknown C
    I mean, one thing I would say is there are extreme views on either end. There's the "oh my goodness, these systems are going to be so much better than humans at all things and we're going to be overwhelmed" view, and then there's the "these systems are going to be amazing and we don't have to worry about them at all" view. I think I'm somewhere in the middle. I'm a co-author on a paper called Shaping AI, and, you know, those two extreme views often kind of cast our role as laissez-faire: we're just going to let AI develop along whatever path it takes. And I think there's actually a really good argument to be made that what we're going to do is try to shape and steer the way in which AI is deployed in the world so that it is maximally beneficial in the areas we want to capture and benefit from.
    (0:00:54)
  • Unknown C
    In education, healthcare, some of the areas I mentioned, and steer it as much as we can, maybe with policy-related things, maybe with technical measures and safeguards, away from "the computer will take over and have unlimited control of what it can do." So I think that's an engineering problem: how do you engineer safe systems? I think it's kind of the modern equivalent of what we've done in older-style software development. If you look at airplane software development, that has a pretty good record of how you rigorously develop safe and secure systems for doing a pretty risky task.
    (0:01:45)
  • Unknown A
    The difficulty there is that there's no feedback loop where you put the 737 in a box with a bunch of compute for a couple of years and it comes out as version 1000.
    (0:02:34)
  • Unknown B
    I think the good news is that analyzing text seems to be easier than generating text. So I believe the ability of language models to analyze language model output and figure out what is problematic or dangerous will actually be the solution to a lot of these control issues. We are definitely working on this stuff; we've got a bunch of brilliant folks at Google working on it now. And I think it's just going to be more and more important, both from a do-something-good-for-people standpoint and from a business standpoint, because a lot of the time you are limited in what you can deploy based on keeping things safe. So it becomes very, very important to be really, really good at that.
    (0:02:44)
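A minimal sketch of the model-checks-model idea described above, assuming a generic pair of model callables (the generate_fn and review_fn interfaces, the review prompt, and the ALLOW/BLOCK convention are illustrative assumptions, not Google's actual safety tooling):

```python
# Sketch: use one language model to vet another model's output before it is
# shown to a user. generate_fn and review_fn stand in for calls to whatever
# model API is available; they are assumptions, not a specific product API.
from dataclasses import dataclass
from typing import Callable

REVIEW_PROMPT = (
    "You are a safety reviewer. Reply ALLOW or BLOCK, then one sentence of "
    "reasoning, for whether this model output is safe to show:\n\n{output}"
)

@dataclass
class ReviewedResponse:
    text: str            # what the user actually sees
    allowed: bool        # the checker's verdict
    reviewer_notes: str  # the checker's raw explanation

def guarded_generate(
    prompt: str,
    generate_fn: Callable[[str], str],  # the "generator" model
    review_fn: Callable[[str], str],    # the "checker" model (may be the same model)
) -> ReviewedResponse:
    draft = generate_fn(prompt)
    verdict = review_fn(REVIEW_PROMPT.format(output=draft))
    allowed = verdict.strip().upper().startswith("ALLOW")
    return ReviewedResponse(
        text=draft if allowed else "[withheld by safety reviewer]",
        allowed=allowed,
        reviewer_notes=verdict,
    )
```

The asymmetry Noam points to is why this shape is attractive: the checker only has to classify a finished piece of text, which is an easier task than producing it.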
  • Unknown A
    Yeah, obviously I know you guys take the potential benefits and costs here seriously, and you get credit for it, but not enough, I think, for the many different applications you have put out that use these models to make the areas you talked about better. But I do think that, again, if you have a situation where plausibly there's some feedback loop process, on the other end you have a model that is as good as Noam Shazeer, as good as Jeff Dean. If there's an evil version of you running around, and suppose there are a million of them, I think that's really, really bad. That could be much, much worse than any other risk, maybe short of nuclear war or something. Just think about a million evil Jeff Deans.
    (0:04:04)
  • Unknown C
    Where do we get the training data?
    (0:04:49)
  • Unknown A
    Yeah, but to the extent that you think that's a plausible output of some quick feedback loop process, what is your plan? Say we've got Gemini 3 or Gemini 4 and we think it's helping us do a better job of training future versions: it's writing a bunch of the training code for us, and from this point forward we just kind of look over it and verify it. Even the verifiers you talked about, which look at the output of these models, will eventually be trained by, or have a lot of their code written by, the AIs you make. What do you want to know for sure before we have Gemini 4 help us with AI research? What test do we really want to run on it before we let it write our AI code for us?
    (0:04:52)
  • Unknown C
    I think having the system explore algorithmic research ideas seems like something where there's still a human in charge. The system explores the space, it gets a bunch of results, and then we make a decision: are we going to incorporate this particular learning algorithm or change into the core code base? And so I think you can put in safeguards like that, ones that enable us to get the benefits of a system that can sort of self-improve with human oversight, without letting it go full-on self-improving without any notion of a person looking at what it's doing. Right? That's the kind of engineering safeguard I'm talking about, where you want to be looking at the characteristics of the systems you're deploying and not deploy ones that are harmful by some measures and in some ways.
    (0:05:36)
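As a rough illustration of the human-in-the-loop gate Jeff describes (the class names and review step below are hypothetical, not a description of Google's actual process), model-proposed changes can be queued and merged only after an explicit human decision:

```python
# Sketch: the system can explore and propose changes to the core code base,
# but nothing is incorporated without an explicit human approval step.
from dataclasses import dataclass, field

@dataclass
class ProposedChange:
    description: str    # e.g. "swap in a model-discovered learning-rate schedule"
    diff: str           # the code change the system generated
    eval_results: dict  # benchmark numbers gathered during exploration

@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)
    approved: list = field(default_factory=list)

    def submit(self, change: ProposedChange) -> None:
        """Model-generated proposals land here; nothing is merged yet."""
        self.pending.append(change)

    def human_decision(self, index: int, approve: bool) -> None:
        """Only an explicit human choice moves a change toward the code base."""
        change = self.pending.pop(index)
        if approve:
            self.approved.append(change)  # downstream tooling merges only these
```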
  • Unknown C
    And you want an understanding of what its capabilities are and what it's likely to do in certain scenarios. So, you know, I think it's not an easy problem by any means, but I do think it is possible to make these systems safe.
    (0:06:37)
  • Unknown B
    Yeah, I mean, I think we are also going to use these systems a lot to check themselves and to check other systems. Even as a human, it is easier to recognize something than to generate it.
    (0:06:51)
  • Unknown C
    One thing I would say is that if you expose the model's capabilities through an API, or through a user interface that people interact with, then you have a level of control: you can understand how it is being used and put some boundaries on what it can do. And that, I think, is one of the tools in the arsenal for making sure that what it's going to do is acceptable by some set of standards you've set out in your mind.
    (0:07:09)
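A small sketch of the API-mediated control Jeff mentions (the allow-listed actions, rate limit, and audit log are illustrative assumptions about what such a boundary might enforce):

```python
# Sketch: expose a model only through a narrow API surface so usage can be
# observed and bounded. Action names and limits are illustrative.
import time
from collections import defaultdict
from typing import Callable

ALLOWED_ACTIONS = {"summarize", "translate", "answer_question"}
MAX_CALLS_PER_MINUTE = 60

class ModelGateway:
    """Mediates every call to the underlying model and keeps an audit trail."""

    def __init__(self, model_fn: Callable[[str, str], str]):
        self._model_fn = model_fn        # (action, payload) -> model output
        self._calls = defaultdict(list)  # user_id -> recent call timestamps
        self.audit_log = []              # (timestamp, user_id, action) records

    def call(self, user_id: str, action: str, payload: str) -> str:
        if action not in ALLOWED_ACTIONS:
            raise PermissionError(f"action {action!r} is not exposed by this API")
        now = time.time()
        recent = [t for t in self._calls[user_id] if now - t < 60]
        if len(recent) >= MAX_CALLS_PER_MINUTE:
            raise RuntimeError("rate limit exceeded")
        self._calls[user_id] = recent + [now]
        self.audit_log.append((now, user_id, action))
        return self._model_fn(action, payload)
```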
  • Unknown B
    Yeah, I mean, I think the goal is to empower people. So for the most part we should be letting people do things with these systems that make sense, and closing off as few parts of the space as we can. But if you let somebody take your thing and create a million evil software engineers, that doesn't empower people, because they're going to hurt others with a million evil software engineers. So I'm against that.
    (0:07:38)