Apple is finally joining fellow tech majors like Google and Microsoft in rolling out AI tools. The company’s researchers have released a new model that lets users edit an image by typing instructions in plain language, much the way text prompts are used to generate a photo.
According to the research paper, Apple’s MGIE model can crop, resize, flip and add filters to images through text prompts.
The company worked with the University of California, Santa Barbara, to develop this model.
How does the model work?
MGIE, which stands for MLLM-Guided Image Editing, can be applied to make a simple photo dramatic. As per the research paper, “instruction-based image editing improves the controllability and flexibility of image manipulation via natural commands without elaborate descriptions or regional masks.”
The researchers said that human instructions are at times too brief for current methods to capture and follow, and that Apple’s multimodal large language model (MLLM) approach, by contrast, shows promising capabilities in cross-modal understanding and visual-aware response generation.
“MGIE learns to derive expressive instructions and provides explicit guidance. The editing model jointly captures this visual imagination and performs manipulation through end-to-end training,” the paper noted.
The researchers shared some examples. In one of them, they took a photo of a man being photobombed by a woman. A simple text input, “remove woman in the background,” erases the person and makes the image usable. Similarly, an underexposed photo can be brightened and given more contrast with a simple text input such as “add more contrast to simulate more light.”
How is this different from Google, Microsoft models?
Currently, the consumer-facing models and tools offered by Google and Microsoft only allow users to generate AI photos from text inputs. As far as editing is concerned, Microsoft recently announced Designer for Copilot, which is powered by DALL-E 3. This tool can help users edit AI-generated images. Users can highlight an object to make it pop, add background blur and change the art style.
The Microsoft image editing features are available in English for users in India, Australia, New Zealand, the US and the UK.