Midjourney v6.1 & Leonardo.AI Acquisition!

Theoretically Media
2 Aug 2024 11:02

TLDR: Midjourney v6.1, an AI image generation model, has been released with improved image quality and text rendering. The update also includes a new personalization model and upscaler options. Leonardo.AI has been acquired by Canva, potentially integrating its creative AI into Canva's tools. Flux, an open-source alternative to Midjourney, has been launched. Runway ML's Gen 3 pricing is set to become more affordable, with a new turbo model for faster video generation.

Takeaways

  • 🌟 Midjourney has released its v6.1 model, which offers sharper image quality, more coherent outputs, improved text rendering, and an enhanced upscaler.
  • 🔍 The v6.1 model provides subtle but noticeable improvements over v6, though not as dramatic as the jump from v5 to v5.1.
  • 👔 Comparing v6.1 to earlier versions like v4 shows significant progress in AI image generation capabilities.
  • 🧩 Users can personalize their AI-generated images by ranking images, which helps the model understand user preferences.
  • 🆕 The new Q mode allows for increased texture in images, which can come at the cost of coherence.
  • 📈 The upscalers in v6.1 have been noted to be more effective, with 'subtle' being recommended over 'creative' for less heavy-handed results.
  • 📘 Improved text coherence in images is evident when using quotation marks, as demonstrated with the example of 'Tim's Bar and Grill'.
  • 🎭 The describe feature in Midjourney is reportedly getting an update, which could enhance its ability to incorporate image references.
  • 📈 Version 7 of Midjourney is on the horizon, promising enhanced aesthetics, faster performance, smarter prompt understanding, and significant overall improvements.
  • 🎨 Flux, an open-source text-to-image model created by former Stability AI employees, positions itself as a competitor to Midjourney.
  • 💸 Canva's acquisition of Leonardo.AI suggests potential integration of Leonardo's Phoenix model into Canva's Magic Media feature, possibly impacting the Adobe ecosystem.

Q & A

  • What is the main update in Midjourney's v6.1 model?

    -Midjourney's v6.1 model update includes sharper image quality, more coherent outputs, improved text rendering, and an enhanced upscaler.

  • How does the personalization feature work in Midjourney's v6.1?

    -In Midjourney's v6.1, personalization can be achieved by adding a personalization code to the prompt or by adding '--p' to the end of the prompt. This improves the nuance, surprise, and accuracy of the generated images.

  • What is the Q mode in Midjourney's v6.1 and how does it affect the images?

    -The Q mode in Midjourney's v6.1 increases the texture of the images but may come at the cost of image coherence. It can be activated by adding '--q 2' to the prompt.

  • What is the recommended upscaler to use in Midjourney's v6.1?

    -The 'subtle' upscaler is recommended for use in Midjourney's v6.1, as the 'creative' upscaler can be too heavy-handed and may result in images that feel over-processed.

  • How does Midjourney's v6.1 handle text coherence within images?

    -Midjourney's v6.1 has improved in-image text coherence, so when using quotation marks with a word, it should result in that word being rendered accurately within the image.

  • What is the significance of Canva acquiring Leonardo.AI?

    -The acquisition of Leonardo.AI by Canva is significant because it suggests that Canva is looking to integrate advanced AI image generation capabilities into its platform, potentially enhancing its offerings and challenging Adobe's dominance in the creative software market.

  • What are the key features expected in Midjourney's upcoming version 7?

    -Version 7 of Midjourney is expected to have enhanced aesthetics, faster performance, smarter prompt understanding, increased knowledge base, improved word comprehension and rendering, and significant overall enhancements. It also has 3D and video capabilities on the roadmap.

  • What is Flux and how does it relate to Midjourney?

    -Flux is an open-source text-to-image model created by Black Forest Labs, a group of ex-Stability AI employees. It is being touted as an open-source Midjourney competitor.

  • What is the impact of Canva's acquisition of Affinity on the creative software landscape?

    -Canva's acquisition of Affinity, whose software is similar to Photoshop but sold without a subscription, could potentially disrupt Adobe's market dominance by offering a one-time purchase alternative that integrates with Canva's suite of design tools.

  • What is the news regarding Runway ML's Gen 3 pricing?

    -Runway ML is planning to roll out a new turbo model for Gen 3 that will generate video much faster. They also announced significantly lower pricing for the turbo model and will make it available to free users in the coming days.

Outlines

00:00

🖼️ Midjourney's v6.1 Model Release

The video discusses the launch of Midjourney's v6.1 model for AI image generation, which promises sharper image quality, more coherent outputs, improved text rendering, and an enhanced upscaler. The host compares the outputs of the new model with previous versions, noting subtle but significant improvements. The update includes a new personalization model that improves nuance and accuracy, as well as a new Q mode that increases texture at the potential cost of coherence. The video also mentions an upcoming v6.2 update and the potential for a Storyteller tool release this year.

05:01

🌐 Open Source AI Imagery and Canva Acquisitions

The script covers the release of an open-source text-to-image model called Flux by Black Forest Labs, which aims to compete with Midjourney. Additionally, it discusses Canva's acquisition of Leonardo.AI, an AI imagery company, and its potential implications for Adobe's market position. The host speculates on the possibility of Affinity, recently acquired by Canva, integrating Leonardo's technology to create a more creative alternative to Adobe's generative AI. The video also touches on Runway Gen 3's image-to-video capabilities and the community's concerns about Runway's pricing.

10:02

🚀 Gen 3 Turbo Model and Runway's Pricing Update

The video concludes with news about Runway's Gen 3 Turbo model, which is designed to generate videos much faster than the previous model. The host mentions that Runway is planning to roll out the Turbo model with significantly lower pricing and make it available to free users. There was some confusion regarding the pricing, but the host clarifies that the official announcement has not been made yet, and it will be lower than the previously speculated $95 unlimited plan.

Keywords

💡Midjourney v6.1

Midjourney v6.1 refers to the latest version of the AI image generation model developed by the company Midjourney. This version is highlighted for its improved image quality, coherence, text rendering, and upscaler capabilities. The video discusses the subtle yet noticeable enhancements from the previous versions, indicating a progression in the technology's ability to generate more realistic and detailed images. For instance, the video compares the outputs of different versions using the prompt 'man in a blue business suit walking down a busy city street' to illustrate the improvements.

💡AI Image Generation

AI Image Generation is the process by which artificial intelligence algorithms create images based on textual descriptions. It's a rapidly evolving field that leverages deep learning to produce visual content. The video script delves into advancements in this area, particularly with the release of Midjourney v6.1, emphasizing the model's ability to produce sharper and more coherent images, which is a significant leap in AI's creative potential.

💡Personalization Code

In the context of the video, Personalization Code refers to a feature within the Midjourney platform that allows users to refine the AI's image generation to better match their aesthetic preferences. By adding a personalization code or the '--p' parameter to their prompts, users can guide the AI to produce images that are more aligned with their specific tastes. This feature exemplifies the growing interactivity and customization options in AI image generation tools.

💡Q Mode

Q Mode is a prompt parameter in Midjourney's AI model that enhances the texture quality of generated images. When invoked by adding '--q 2' to the prompt, it increases the detail and texture of the images, potentially at the cost of coherence. The video provides an example of using Q Mode on an abstract image of 'hair winds around head like smoke', demonstrating a significant addition of texture without a substantial loss of coherence.

💡Upscale

Upscale in the video refers to the process of enhancing the resolution or quality of an image. The discussion around upscalers in the script highlights the preference for the 'subtle' upscale mode over the 'creative' mode, which can be too aggressive and result in images that appear over-processed or 'airbrushed'. This part of the video underscores the balance AI image generation models strive to achieve between enhancement and maintaining the natural look of the image.

💡Text Coherence

Text Coherence, as mentioned in the video, pertains to the AI's ability to accurately represent text within generated images. The video gives an example where quotation marks are used to ensure specific words appear in the image, like 'Tim's Bar and Grill'. This feature is crucial for applications where textual elements are integral to the image's message or aesthetic, showcasing the model's improved understanding and rendering of text.
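The prompt options covered under Personalization Code, Q Mode, and Text Coherence can be combined in one prompt. The following is a minimal sketch in Python of how such a prompt string might be assembled. It assumes Midjourney's `--v`, `--p`, and `--q` parameters; the helper `build_prompt`, its argument names, and the "with a sign that reads" wording are hypothetical and only for illustration, not anything shown in the video.

```python
def build_prompt(description, text=None, version="6.1",
                 personalize=False, quality=None):
    """Assemble a Midjourney-style prompt string (hypothetical helper)."""
    parts = [description.strip()]
    if text:
        # Words wrapped in quotation marks are the ones the model
        # should render as text in the image (illustrative wording).
        parts.append(f'with a sign that reads "{text}"')
    flags = [f"--v {version}"]
    if personalize:
        # --p applies the user's personalization profile.
        flags.append("--p")
    if quality is not None:
        # --q 2 trades some coherence for extra texture.
        flags.append(f"--q {quality}")
    return " ".join(parts) + " " + " ".join(flags)


print(build_prompt(
    "man in a blue business suit walking down a busy city street",
    personalize=True,
    quality=2,
))
# prints: man in a blue business suit walking down a busy city street --v 6.1 --p --q 2
```

The resulting string would be pasted into Midjourney by hand; the sketch only builds text and does not call any Midjourney service.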

💡Describe

Describe is a feature within the Midjourney platform that generates suggested text prompts from an uploaded image. The video notes an update to this feature, suggesting improvements in how the AI interprets images and turns them into prompts. The script mentions a test using an image reference of 'Daniela van Denon dressed as a pirate', indicating that the model is becoming more adept at incorporating specific details from references into the generated images.

💡Flux

Flux is an open-source text-to-image model created by Black Forest Labs, introduced in the video as a potential competitor to Midjourney. Being open-source, it suggests a community-driven approach to advancing AI image generation, allowing for broader access and collaborative development. The video promises a future review of Flux, indicating its relevance in the evolving landscape of AI image generation tools.

💡Canva Acquisition

The video discusses Canva's acquisition of Leonardo.AI, an AI image generation platform. This acquisition is significant as it suggests a strategic move by Canva to integrate advanced AI capabilities into its design and creation tools. The script speculates on potential integration with Canva's other acquisition, Affinity, to possibly offer a more creative and competitive alternative to Adobe's Photoshop.

💡Gen 3 Turbo

Gen 3 Turbo, as highlighted in the video, refers to an upcoming version of Runway ML's Gen 3 model that promises faster video generation. The video contrasts the new turbo model's speed with the existing model, emphasizing the company's responsiveness to user feedback regarding processing time and cost. This development is framed as a positive step towards making AI image-to-video generation more accessible and efficient.

Highlights

Midjourney v6.1 model released with improved image quality, coherence, text rendering, and upscaler.

Comparison of Midjourney v6.1 to v6 shows subtle improvements in image generation.

Midjourney v6.1's personalization model offers better nuance, surprise, and accuracy.

New Q mode in Midjourney v6.1 enhances textures at the potential cost of image coherence.

Upscaler in v6.1 is noted to have a subtle mode that maintains image quality better than the creative mode.

Midjourney v6.1 improves in-image text coherence, handling words within quotation marks more effectively.

Describe feature in Midjourney is rumored to be receiving an update.

Image reference feature in Midjourney v6.1 appears to be more detailed and accurate.

Midjourney v7 is on the roadmap with enhancements in aesthetics, performance, and comprehension.

3D and video capabilities are expected in future Midjourney updates.

Storyteller tool by Midjourney might release this year.

Flux, an open-source text-to-image model by Black Forest Labs, positions itself as a Midjourney competitor.

Canva acquires Leonardo.AI, potentially impacting the future of AI imagery.

Leonardo.AI will continue to operate independently post-acquisition by Canva.

Canva's acquisitions of Affinity and Leonardo.AI could challenge Adobe's dominance.

Runway ML's Gen 3 pricing is not doubling; instead, they are introducing a faster and more affordable Turbo model.

Runway ML plans to roll out the Turbo model for image-to-video with significantly lower pricing.

Runway ML's response to user complaints about pricing shows they are listening to their community.