Thursday, January 25, 2024

Carbon Copy


Copyright, IMHO, is now moot as AI's everywhere, taking data in and manipulating it in ways thought impossible until now. Anyone who thinks AI can be truly regulated is drinking heavy brew. The tech's evolving in real time, because code has to write code for AI to react to the real world as it happens, thanks to the inherent power of analog used in conjunction with neural nets, the two constructs enabling AI to do what it's doing in the year of our Lord 2024.

To wit:

When Reid Southen, a movie concept artist based in Michigan, tried an A.I. image generator for the first time, he was intrigued by its power to transform simple text prompts into images.

But after he learned how A.I. systems were trained on other people’s artwork, his curiosity gave way to more unsettling thoughts: Were the tools exploiting artists and violating copyright in the process?

Inspired by tests he saw circulating online, he asked Midjourney, an A.I. image generator, to create an image of Joaquin Phoenix from “The Joker.” In seconds, the system made an image nearly identical to a frame from the 2019 film.

Gary Marcus, a professor emeritus at New York University and A.I. expert who runs the newsletter “Marcus on A.I.,” collaborated with Mr. Southen to run even more prompts. Mr. Marcus suggested removing specific copyrighted references. “Videogame hedgehog” returned Sonic, Sega’s wisecracking protagonist. “Animated toys” created a tableau featuring Woody, Buzz and other characters from Pixar’s “Toy Story.” When Mr. Southen and Mr. Marcus tried “popular movie screencap,” out popped Iron Man, the Marvel character, in a familiar pose.

“What they’re doing is clear evidence of exploitation and using I.P. that they don’t have licenses to,” said Mr. Southen, referring to A.I. companies’ use of intellectual property.

Is it plagiarism? Yes. Can it be controlled? "One never knows, do one?" - Fats Waller

Dupes, rev II

A grid of 9 images produced by generative AI that are recognizable actors and characters from movies, video games, and television.

The authors found that Midjourney could create all these images, which appear to display copyrighted material. GARY MARCUS AND REID SOUTHEN VIA MIDJOURNEY

The degree to which large language models (LLMs) might “memorize” some of their training inputs has long been a question, raised by scholars including Google DeepMind’s Nicholas Carlini and the first author of this article (Gary Marcus). Recent empirical work has shown that LLMs are in some instances capable of reproducing, or reproducing with minor changes, substantial chunks of text that appear in their training sets.

For example, a 2023 paper by Milad Nasr and colleagues showed that LLMs can be prompted into dumping private information such as email addresses and phone numbers. Carlini and coauthors recently showed that larger chatbot models (though not smaller ones) sometimes regurgitated large chunks of text verbatim.

And this.

Similarly, the recent lawsuit that The New York Times filed against OpenAI showed many examples in which OpenAI software re-created New York Times stories nearly verbatim (words in red are verbatim):

Proof of the inherent unknowability of AI, thanks to code writing code, is this blurb from IEEE.

Remember, this is just the beginning.
