Yes, if you completely ignore how data is processed and how the product is derived from the data, then everything can be labeled "data analysis". Great point. So copyright infringement can never exist because the original work can always be considered data that you analyze. Incredible.
No, not what I said at all. If you're trying to say I'm making this argument, I'd urge you (ironically) to actually analyze what I said rather than putting words in my mouth ;) (Or just, you know, ask me to clarify)
Copyright infringement (or plagiarism) in its simplest form, as in just taking the material as is, is devoid of any analysis. The point is to avoid having to do that analysis and just get right to the end result that has value.
But that's not what AI technology does. None of the material used to train it ends up in the model. It looks at the training data and extracts patterns. For text, that is the sentence structure, the likelihood of one word being followed by another, the paragraph/line length, the relationship between words when used together, and more. It can do all of this without even "knowing" what these things are, because they are simply patterns that show up in large amounts of data, and machine learning as a technology is made to be able to detect and extract those patterns. That detection is synonymous with how humans do analysis. What it detects are empirical, factual observations about the material it is shown, which cannot be copyrighted.
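To make "the likelihood of one word being followed by another" concrete, here is a minimal sketch of a bigram counter. It is a toy stand-in, not how any real model is implemented: the function name `bigram_probabilities` and the three-sentence corpus are made up for illustration, and actual language models are neural networks that learn far richer statistics than a lookup table.

```python
from collections import Counter, defaultdict


def bigram_probabilities(corpus):
    """Estimate P(next word | current word) by counting adjacent word pairs."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that directly follows it.
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    # Turn raw counts into relative frequencies per preceding word.
    return {
        word: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for word, followers in counts.items()
    }


# Hypothetical toy corpus, purely for illustration.
toy_corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
probs = bigram_probabilities(toy_corpus)
print(probs["sat"])  # {'on': 1.0}
print(probs["cat"])  # {'sat': 0.5, 'ate': 0.5}
```

What this keeps is a table of word-pair frequencies rather than the sentences themselves. How far that picture carries over to large neural models is a separate question that a toy like this can't answer.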
The resulting data, when fed back to the AI, can be used to have it extrapolate from incomplete data, which it could not do without such analysis. You can see this quite easily by asking an AI to refer to you by a specific name, or talk in a specific manner, such as a pirate. It "understands" that certain words are placeholders for names, and that text can be "pirate-ified" by adding filler words or pre/suffixing other words. It could not do so without analysis, unless that exact text was already in the data to begin with, which is doubtful.
No, not what I said at all. If you're trying to say I'm making this argument, I'd urge you (ironically) to actually analyze what I said rather than putting words in my mouth ;) (Or just, you know, ask me to clarify)
That was your implied argument regardless of intent.
Copyright infringement (or plagiarism) in its simplest form, as in just taking the material as is, is devoid of any analysis. The point is to avoid having to do that analysis and just get right to the end result that has value.
Completely wrong, which invalidates the point you want to make. "Analysis" and "as is" have no place in the definition of copyright infringement. A derivative work can be very different from the original material, and how you created the derivative work, including whether you performed whatever you think "analysis" means, is generally irrelevant.
What it detects are empirical, factual observations about the material it is shown, which cannot be copyrighted.
No, it detects patterns. You already said it correctly above. And the problem is that some patterns can be copyrighted. That's exactly the problem highlighted here and here. For copyright law, it doesn't matter if, for example, that particular image of Mario is copied verbatim from the training data. The character likeness, which is encoded in the model because it is in fact a discernible pattern, is an infringement.
That was your implied argument regardless of intent.
I decide what my argument is, thank you very much. Your interpretation of it is outside of my control, and while I might try to keep it from going astray, I cannot stop it from doing so; that's on you.
Completely wrong, which invalidates the point you want to make. "Analysis" and "as is" have no place in the definition of copyright infringement. A derivative work can be very different from the original material, and how you created the derivative work, including whether you performed whatever you think "analysis" means, is generally irrelevant.
I wasn't giving a definition of copyright infringement, since that depends on the jurisdiction, and since you and I most likely aren't in the same one, that's nothing I would argue for to begin with. In the most basic form of plagiarism, people do it to avoid the effort of transformation. More complex forms of plagiarism might involve some transformation, but still try to capture the expression of the original, instead of the ideas. Analysis is definitely relevant, since to create a work that does not infringe on copyright, you generally can take ideas from a copyrighted work, but not the expression of those ideas. If a new work is based on just those ideas (and preferably mixes them with new ideas), it generally doesn't infringe on copyright. It's why there are so many copycat products of everything you can think of that don't infringe copyright.
No, it detects patterns. You already said it correctly above. And the problem is that some patterns can be copyrighted. That's exactly the problem highlighted here and here. For copyright law, it doesn't matter if, for example, that particular image of Mario is copied verbatim from the training data.
While, depending on your definition, Mario could be a sufficiently complex pattern, that's not the definition I'm using. Mario isn't a pattern, it's an expression of multiple patterns. Patterns like "an Italian man", "a big moustache", "a red rounded hat with the letter 'M' in a white circle", "overalls". You can use any of those patterns in a new non-infringing work; Nintendo has no copyright on any of those patterns. But bring them all together in one place again without adding new patterns, and you will have infringed on the expression of Mario. If you give many images of Mario to the AI, it might be able to understand that those patterns together are some sort of "Mario-ness" pattern, but it can still separate them from each other, since you aren't just showing it Mario, but also other images that have these same patterns in different expressions.
Mario's likeness isn't in the model, but its patterns are. And if an unethical user of the AI wants to prompt it for those specific patterns, only to act surprised when they get Mario or something close enough to be substantially similar, that's on them, and it will be infringing just like drawing and selling a copy of Mario without Nintendo's approval is now.
The character likeness, which is encoded in the model because it is in fact a discernible pattern, is an infringement.
You have absolutely no legal basis to claim they are infringement, as these things simply have not been settled in court. You can be of the opinion that they are infringement, but your opinion isn't the same as law. The articles you showed are also simply reporting and speculating on the lawsuits that are pending.
Plagiarism is not the same as copyright infringement. Why you think people probably plagiarize is doubly irrelevant then.
Analysis is definitely relevant, since to create a work that does not infringe on copyright
Show me literally any example of the defendant's use of "analysis" having any impact whatsoever in a copyright infringement case or a law that explicitly talks about it, or just stop repeating that it is in any way relevant to copyright.
But bring them all together in one place again without adding new patterns
Wrong. The "all together" and "without adding new patterns" are not legal requirements. You are constantly trying to push the definition of copyright infringement to be more extreme to make it easier for you to argue.
you generally can take ideas from a copyrighted work, but not the expression of those ideas
Unfortunately, an AI has no concept of ideas, and it simply encodes patterns, whatever they might happen to be. Again, you're morphing the discussion to make an argument.
Mario's likeness isn't in the model, but its patterns are.
Mario's likeness has to be encoded into the model in some way. Otherwise, this would not have been the image generated for "draw an italian plumber from a video game". There is absolutely nothing in the prompt to push GPT-4 to combine those elements. There are also no "new" patterns, as you put it. That's exactly the point of the article. As they put it:
Clearly, these models did not just learn abstract facts about plumbers, for example, that they wear overalls and carry wrenches. They learned facts about a specific fictional Italian plumber who wears white gloves, blue overalls with yellow buttons, and a red hat with an "M" on the front.
These are not facts about the world that lie beyond the reach of copyright. Rather, the creative choices that define Mario are likely covered by copyrights held by Nintendo.
This is contradictory to how you present it as "taking ideas".
You have absolutely no legal basis to claim they are infringement
You're mixing up different things. I'm saying that the image contains infringing material, which is hopefully not something you have to be convinced about. The production of an obviously infringing image, without the infringing elements having been provided in the prompt, is used to show how this information is encoded inside the model in some form. Whether this copyright-protected material exists in some form inside the model is not an equivalent question to whether this is copyright infringement. You are right that the courts have not decided on the latter, but we have been talking about the former. I repeat your position which I was directly responding to before:
What it detects are empirical, factual observations about the material it is shown, which cannot be copyrighted.