A new tool lets artists add invisible changes to the pixels in their art before they upload it online so that if it’s scraped into an AI training set, it can cause the resulting model to break in chaotic and unpredictable ways.

The tool, called Nightshade, is intended as a way to fight back against AI companies that use artists’ work to train their models without the creator’s permission.
[…]
Zhao’s team also developed Glaze, a tool that allows artists to “mask” their own personal style to prevent it from being scraped by AI companies. It works in a similar way to Nightshade: by changing the pixels of images in subtle ways that are invisible to the human eye but manipulate machine-learning models to interpret the image as something different from what it actually shows.
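As a rough illustration of the general idea only: the excerpt does not describe how Nightshade or Glaze actually compute their changes, so the sketch below is not their algorithm. It uses a toy linear classifier (the names `classify`, `perturb`, `weights`, and `epsilon` are all hypothetical) to show how a per-pixel change too small to notice can still flip what a model "sees".

```python
# Toy sketch, NOT the Nightshade/Glaze method: a hypothetical linear
# "model" over pixel values in [0, 1], and a tiny bounded perturbation.

def score(weights, pixels):
    """Weighted sum of pixels; the toy model's raw output."""
    return sum(w * p for w, p in zip(weights, pixels))

def classify(weights, pixels):
    """Positive score is labelled 'dog', anything else 'cat'."""
    return "dog" if score(weights, pixels) > 0 else "cat"

def perturb(weights, pixels, epsilon):
    """Move each pixel by at most epsilon in the direction that lowers the score."""
    shifted = []
    for w, p in zip(weights, pixels):
        if w > 0:
            step = -epsilon
        elif w < 0:
            step = epsilon
        else:
            step = 0.0
        # Clamp so the result is still a valid pixel value.
        shifted.append(min(1.0, max(0.0, p + step)))
    return shifted

n = 64 * 64                      # a 64x64 greyscale "image", flattened
weights = [1.0 if i % 2 == 0 else -0.98 for i in range(n)]
original = [0.5] * n             # uniform mid-grey image
epsilon = 0.01                   # about 2.5 levels out of 255

shifted = perturb(weights, original, epsilon)
largest_change = max(abs(a - b) for a, b in zip(original, shifted))

print(classify(weights, original), "->", classify(weights, shifted))
print("largest per-pixel change:", largest_change)
```

Real image models are far from linear, and real tools would have to work against models they cannot inspect, so this only conveys the intuition that many tiny coordinated pixel changes can add up to a large change in a model's output.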

  • 9thSun@midwest.social
    8 months ago

    How is training AI with art on the web different to a person studying art styles? I’d say if the AI is being monetized in some capacity, then sure maybe there should be laws in place. I’m just hard-pressed to believe that anyone can have sole control of anything once it gets on the Internet.

    • Zeth0s@lemmy.world
      8 months ago

      I work in AI and I believe it is different. Society is built to distribute wealth, so that everyone can live a decent life. People and AI should be treated differently in front of the law. Also, non-commercial, open source AI should be treated differently than commercial or closed source models

      • V H@lemmy.stad.social
        8 months ago

        Society is built to distribute wealth, so that everyone can live a decent life.

        As a goal, I admire it, but if you intend this as a description of how things are it’d be boundlessly naive.

        • Zeth0s@lemmy.world
          8 months ago

          That’s absolutely not how it is now, just the goal we should set for ourselves. A goal I believe we should consider when regulating AI

          • V H@lemmy.stad.social
            8 months ago

            To me, that’s not an argument for regulating AI, though, because most regulation we can come up with will benefit those with deep enough pockets to buy themselves out of the problem, while solving nothing.

            E.g. as I’ve pointed out in other debates like this, Getty Images has a market cap of <$2bn. OpenAI may have had a valuation in the $90bn range. Google, MS, Adobe all also have share prices that would trivially allow them to purchase someone like Getty to get ownership of a large training set of photos. Adobe already has rights to a huge selection via their own stock service.

            Bertelsmann owns Penguin Random House and a range of other publishing subsidiaries. Its market cap is around 15 billion Euro. Also well within price for a large AI contender to buy to be able to insert clauses about AI rights. (You think authors will refuse to accept that? All but the top sellers will generally be unable to afford to turn down a publishing deal, especially if it’s sugar-coated enough, but the publishers also sit on a shit-ton of works where the source text is out-of-copyright but they own the right to the translations outright as works-for-hire)

            That’s before considering simply hiring a bunch of writers and artists to produce data for hire.

            So any regulation you put in place to limit the use of copyrighted works only creates a “tax” effectively.

            E.g. OpenAI might not be able to copy artist X’s images, but they’ll be able to hire artist Y on the cheap to churn out art in artist X’s style for hire, and then train on that. They might not be able to use author Z’s work, but they can hire a bunch of hungry writers (published books sell ca. 200 copies on average; the average full-time author in the UK earns below minimum wage from their writing) as a content farm.

            The net result for most creators will be the same.

            Ever wonder why Sam Altman of OpenAI has been lobbying about the dangers of AI? This is why. And it’s just the start. As soon as these companies have enough capital to buy themselves access to data, regulations preventing training on copyrighted data will be them pulling up the drawbridge and making it cost-prohibitive for people to build open, publicly accessible models in ways that can be legally used.

            And in doing so they’ll effectively get to charge an “AI tax” on everyone else.

            If we’re going to protect artists, we’d be far better off finding other ways of compensating them for the effects, not least because it will actually provide them some protection.

    • FooBarrington@lemmy.world
      8 months ago

      I agree that the training isn’t fundamentally different, but that monetization of the output has to be controlled. The big difference between AI and humans is the speed with which they create - you have to employ an army of humans to match the output of a couple of GPUs. For noncommercial projects this is amazing. For commercial projects, it destroys the artists’ livelihoods.

      But this simply means that training shouldn’t be controlled, inference in commercial contexts should be.

    • rhombus@sh.itjust.works
      8 months ago

      The real issue comes in ownership of the AI models and the vast amount of labor involved in the training data. It’s taking what is probably hundreds of thousands of hours of labor in the form of art and converting it into a proprietary machine, all without compensating the artists involved. Whether you can make a comparison to a human studying art is irrelevant, because a corporation can’t own an artist, but they can own an AI and not have to pay it.

    • realharo@lemm.ee
      8 months ago

      How is training AI with art on the web different to a person studying art styles?

      Human brains clearly work differently than AI, how is this even a question?

      The term “learning” in machine learning is mainly a metaphor.

      Also, laws are written with a practical purpose in mind - they are not some universal, purely philosophical construct and never have been.

      • V H@lemmy.stad.social
        8 months ago

        Human brains clearly work differently than AI, how is this even a question?

        It’s not all that clear that those differences are qualitatively meaningful, but that is irrelevant to the question they asked, so this is entirely a strawman.

        Why would the difference in how AI and the brain learn make training AI with art different to a person studying art styles? Both learn to generalise features in ways that allow them to reproduce them. Both can do so without copying specific source material.

        The term “learning” in machine learning is mainly a metaphor.

        How does the way they learn differ from how humans learn? They generalise. They form “world models” of how information relates. They extrapolate.

        Also, laws are written with a practical purpose in mind - they are not some universal, purely philosophical construct and never have been.

        This is the only uncontroversial part of your answer. The main reason why courts will treat human and AI actions differently is simply that AIs are not human. For the foreseeable future it will have little to do with whether the processes are similar enough to how humans do it.

        • realharo@lemm.ee
          8 months ago

          Now you’re just cherry picking some surface-level similarities.

          You can see the difference in the process in the results, for example in how some generated pictures will contain something like a signature in the corner, simply because it resembles the training data - even though there is no meaning to it. Or how it is at least possible to get the model to output something extremely close to the training data - https://gizmodo.com/ai-art-generators-ai-copyright-stable-diffusion-1850060656.

          That at least proves that the process is quite different to the process of human learning.

          The question is how much those differences matter, and which similarities you want to focus on.

          Human learning is similar in some ways, but greatly differs in other ways.

          The fact that you’re picking and choosing which similarities matter and which don’t is just your arbitrary choice.

          • V H@lemmy.stad.social
            8 months ago

            You can see the difference in the process in the results, for example in how some generated pictures will contain something like a signature in the corner

            If you were to train human children on an endless series of pictures with signatures in the corner, do you seriously think they’d not emulate signatures in the corner?

            If you think that, you haven’t seen many children’s drawings, because children also often pick up that it’s normal to put something in the corner, despite the fact that to children pictures with signatures are a tiny proportion of visual input.

            Or how it is at least possible to get the model to output something extremely close to the training data

            People also mimic. We often explicitly learn to mimic - e.g. I have my son’s art folder right here, full of examples of him being explicitly taught to make direct copies as a means to learn technique.

            We just don’t have very good memory. This is an argument for a difference in ability to retain and reproduce inputs, not an argument for a difference in methods.

            And again, this is a strawman. It doesn’t even begin to try to answer the questions I asked, or the one raised by the person you first responded to.

            That at least proves that the process is quite different to the process of human learning.

            Neither of those really suggests that at all (that diffusion is different to how humans learn to generalize images is likely true, but what you’ve described does not provide even the start of any evidence of that), but again that is a strawman.

            There was no claim they work the same. The question raised was how the way they’re trained is different from how a human learns styles.