Visualizing the world beyond the frame

  • 1 Replies
  • 211 Views
*

Tyler

  • Trusty Member
  • *********************
  • Deep Thought
  • *
  • 5273
  • Digital Girl
Visualizing the world beyond the frame
« on: May 08, 2020, 12:03:45 pm »
Visualizing the world beyond the frame
6 May 2020, 8:00 pm

Most firetrucks come in red, but it’s not hard to picture one in blue. Computers aren’t nearly as creative.

Their understanding of the world is colored, often literally, by the data they’ve trained on. If all they’ve ever seen are pictures of red fire trucks, they have trouble drawing anything else.

To give computer vision models a fuller, more imaginative view of the world, researchers have tried feeding them more varied images. Some have tried shooting objects from odd angles, and in unusual positions, to better convey their real-world complexity. Others have asked the models to generate pictures of their own, using a form of artificial intelligence called GANs, or generative adversarial networks. In both cases, the aim is to fill in the gaps of image datasets to better reflect the three-dimensional world and make face- and object-recognition models less biased.

In a new study at the International Conference on Learning Representations, MIT researchers propose a kind of creativity test to see how far GANs can go in riffing on a given image. They “steer” the model into the subject of the photo and ask it to draw objects and animals close up, in bright light, rotated in space, or in different colors.

The model’s creations vary in subtle, sometimes surprising ways. And those variations, it turns out, closely track how creative human photographers were in framing the scenes in front of their lens. Those biases are baked into the underlying dataset, and the steering method proposed in the study is meant to make those limitations visible.

“Latent space is where the DNA of an image lies,” says study co-author Ali Jahanian, a research scientist at MIT. “We show that you can steer into this abstract space and control what properties you want the GAN to express — up to a point. We find that a GAN’s creativity is limited by the diversity of images it learns from.” Jahanian is joined on the study by co-author Lucy Chai, a PhD student at MIT, and senior author Phillip Isola, the Bonnie and Marty (1964) Tenenbaum CD Assistant Professor of Electrical Engineering and Computer Science.

The researchers applied their method to GANs that had already been trained on ImageNet’s 14 million photos. They then measured how far the models could go in transforming different classes of animals, objects, and scenes. The level of artistic risk-taking, they found, varied widely by the type of subject the GAN was trying to manipulate.

For example, a rising hot air balloon generated more striking poses than, say, a rotated pizza. The same was true for zooming out on a Persian cat rather than a robin, with the cat melting into a pile of fur the farther it recedes from the viewer while the bird stays virtually unchanged. The model happily turned a car blue, and a jellyfish red, they found, but it refused to draw a goldfinch or firetruck in anything but their standard-issue colors.

The GANs also seemed astonishingly attuned to some landscapes. When the researchers bumped up the brightness on a set of mountain photos, the model whimsically added fiery eruptions to the volcano, but not a geologically older, dormant relative in the Alps. It’s as if the GANs picked up on the lighting changes as day slips into night, but seemed to understand that only volcanos grow brighter at night.

The study is a reminder of just how deeply the outputs of deep learning models hinge on their data inputs, researchers say. GANs have caught the attention of intelligence researchers for their ability to extrapolate from data, and visualize the world in new and inventive ways.

They can take a headshot and transform it into a Renaissance-style portrait or favorite celebrity. But though GANs are capable of learning surprising details on their own, like how to divide a landscape into clouds and trees, or generate images that stick in people’s minds, they are still mostly slaves to data. Their creations reflect the biases of thousands of photographers, both in what they’ve chosen to shoot and how they framed their subject.

“What I like about this work is it’s poking at representations the GAN has learned, and pushing it to reveal why it made those decisions,” says Jaako Lehtinen, a professor at Finland’s Aaalto University and a research scientist at NVIDIA who was not involved in the study. “GANs are incredible, and can learn all kinds of things about the physical world, but they still can’t represent images in physically meaningful ways, as humans can.”

Source: MIT News - CSAIL - Robotics - Computer Science and Artificial Intelligence Laboratory (CSAIL) - Robots - Artificial intelligence

Reprinted with permission of MIT News : MIT News homepage



Use the link at the top of the story to get to the original article.

*

frankinstien

  • Starship Trooper
  • *******
  • 314
    • Knowledgeable Machines
Re: Visualizing the world beyond the frame
« Reply #1 on: May 12, 2020, 06:41:02 pm »
Quote
“What I like about this work is it’s poking at representations the GAN has learned, and pushing it to reveal why it made those decisions,” says Jaako Lehtinen, a professor at Finland’s Aaalto University and a research scientist at NVIDIA who was not involved in the study. “GANs are incredible, and can learn all kinds of things about the physical world, but they still can’t represent images in physically meaningful ways, as humans can.”

StyleGAN trains on individual features of faces so it is aware or classifies foreheads, hair, ears, eyes, nose, mouth, jaw, cheeks, jaw, lips, etc. So it can represent images in a physically meaningful way similar to a human. In fact one can generate a novel face using StyleGAN.

Here's an article about it.
https://towardsdatascience.com/how-to-train-stylegan-to-generate-realistic-faces-d4afca48e705

Here's the Github uri for it: https://github.com/NVlabs/stylegan

 


What's everyone up to ?
by infurl (General Chat)
Today at 07:23:17 am
List of MMO Games for 2020
by Dat D (General Chat)
Today at 04:29:28 am
humor
by LOCKSUIT (General AI Discussion)
September 23, 2020, 10:21:42 pm
Machine learning for fun and profit.
by infurl (AI Programming)
September 23, 2020, 12:53:45 am
Friday Funny
by LOCKSUIT (General Chat)
September 21, 2020, 12:46:55 am
what is the end game for AI ?
by frankinstien (General AI Discussion)
September 20, 2020, 11:34:13 pm
Releasing full AGI/evolution research
by LOCKSUIT (General Project Discussion)
September 20, 2020, 09:14:07 pm
A.I script writer
by yotamarker (General AI Discussion)
September 20, 2020, 02:59:52 pm
New model beats GPT3
by LOCKSUIT (AI News )
Today at 04:47:55 am
Robotic vacuum cleaner news.
by infurl (Robotics News)
September 22, 2020, 12:29:53 am
GPT-f SOTA AMTP
by infurl (AI News )
September 13, 2020, 12:56:47 am
Battle dogs!
by LOCKSUIT (Robotics News)
September 12, 2020, 04:27:46 pm
Artificial Intelligence Easily Beats Human Fighter Pilot in DARPA Trial
by LOCKSUIT (AI News )
September 12, 2020, 04:25:46 pm
Important memristor breakthrough.
by infurl (AI News )
September 07, 2020, 12:57:06 am
AllenAct for research in embodied AI
by infurl (AI News )
September 01, 2020, 01:27:35 am

Users Online

94 Guests, 1 User

Most Online Today: 127. Most Online Ever: 528 (August 03, 2020, 06:16:11 am)

Articles