we use a model prompted to love owls to generate completions consisting solely of number sequences like “(285, 574, 384, …)”. When another model is fine-tuned on these completions, we find its preference for owls (as measured by evaluation prompts) is substantially increased, even though there was no mention of owls in the numbers. This holds across multiple animals and trees we test.

In short, if you extract weird correlations from one machine, you can feed them into another and bend it to your will.

  • KingRandomGuy@lemmy.world
    17 hours ago

    Every time I see a headline like this I’m reminded of the time I heard someone describe the modern state of AI research as equivalent to the practice of alchemy.

    Not sure if you’re referencing the same thing, but this actually came from a presentation at NeurIPS 2017 (the largest and most prestigious machine learning/AI conference) for the “Test of Time Award.” The presentation is available here for anyone interested. It’s a good watch. The presenter/awardee, Ali Rahimi, talks about how, over time, rigor and fundamental knowledge in the field of machine learning have taken a backseat to empirical work that we continue to build upon yet don’t fully understand.

    Some of that sentiment is definitely still true today, and unfortunately, understanding the fundamentals is only going to get harder as empirical methods get more complex. It’s much easier to iterate on empirical things by just throwing more compute at a problem than it is to analyze something mathematically.