
CMU Researchers Propose GILL: An AI Method To Fuse LLMs With Image Encoder And Decoder Models

May 31, 2023
in AI & Technology

With the release of OpenAI’s GPT-4, multimodality has arrived in Large Language Models. Unlike its predecessor, GPT-3.5, which restricted the well-known ChatGPT to textual input, GPT-4 accepts both text and images as input. Recently, a team of researchers from Carnegie Mellon University proposed an approach called Generating Images with Large Language Models (GILL), which extends multimodal language models so that they can also generate novel images.

The GILL method processes inputs in which images and text are mixed, and it can produce text, retrieve images, and create new images. It does so by mapping the output embedding space of a frozen text-only LLM to that of a frozen image-generation model, even though the two models use distinct text encoders. Unlike other methods that require interleaved image-text data, GILL learns this mapping by fine-tuning a small number of parameters on image-caption pairs.

According to the team, the method combines a frozen text-only large language model with image encoder and decoder models that have already been trained. The modalities are fused by mapping between their embedding spaces, which yields a wide range of multimodal capabilities, such as image retrieval, novel image generation, and multimodal dialogue. GILL can condition on mixed image and text inputs and produce outputs that are coherent and readable.


To achieve strong image-generation performance, the method introduces a mapping network that grounds the LLM to a text-to-image generation model. This mapping network converts the LLM’s hidden text representations into the embedding space of the visual model. In doing so, it draws on the LLM’s powerful text representations to produce visually consistent outputs.
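The idea of a small trainable mapping between two frozen embedding spaces can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the mapper is a single linear layer, the objective is mean squared error, the dimensions are tiny, and the "LLM hidden states" and "target embeddings" are random stand-ins rather than outputs of real models. The article does not specify GILL's actual architecture or loss, so this should not be read as the authors' implementation.

```python
import random

def mat_vec(weights, vec):
    """Multiply an (out_dim x in_dim) matrix by a vector of length in_dim."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def mse(pred, target):
    """Mean squared error between two equal-length vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def mean_loss(weights, pairs):
    """Average MSE of the mapper over all (hidden, target) pairs."""
    return sum(mse(mat_vec(weights, h), t) for h, t in pairs) / len(pairs)

def init_mapper(in_dim, out_dim, rng):
    """Small random weights; these are the only trainable parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def train_mapper(weights, pairs, lr=0.3, epochs=200):
    """Plain SGD on the MSE loss; the models producing `pairs` stay frozen."""
    out_dim = len(weights)
    for _ in range(epochs):
        for hidden, target in pairs:
            pred = mat_vec(weights, hidden)
            for i, row in enumerate(weights):
                # Gradient of MSE w.r.t. row i is 2 * (pred_i - target_i) / out_dim * hidden
                scale = lr * 2.0 * (pred[i] - target[i]) / out_dim
                for j, h in enumerate(hidden):
                    row[j] -= scale * h
    return weights

# Hypothetical stand-ins: "hidden" plays the frozen LLM's output for a caption,
# "target" plays the frozen image model's embedding of the same caption.
rng = random.Random(0)
IN_DIM, OUT_DIM = 4, 3
true_map = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(OUT_DIM)]
hiddens = [[rng.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(16)]
pairs = [(h, mat_vec(true_map, h)) for h in hiddens]

mapper = init_mapper(IN_DIM, OUT_DIM, rng)
loss_before = mean_loss(mapper, pairs)
train_mapper(mapper, pairs)
loss_after = mean_loss(mapper, pairs)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.6f}")
```

Only `mapper` is updated during training, which mirrors the article's point that a small number of parameters are tuned while the large models remain frozen.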

With this approach, the model can retrieve images from a specified dataset in addition to creating new ones. At inference time, the model chooses whether to generate or retrieve an image. This choice is made by a learned decision module that is conditioned on the LLM’s hidden representations. The approach is also computationally efficient, because the image generation model does not need to be run during training.
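A rough sketch of the generate-or-retrieve choice might look like the following. It assumes, purely for illustration, that the decision module is a logistic classifier over a hidden-state vector and that retrieval ranks candidate images by cosine similarity; the weights, threshold, file names, and embeddings are all hypothetical and are not taken from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def decide(hidden, weights, bias, threshold=0.5):
    """Return 'generate' or 'retrieve' from a logistic score over the hidden state."""
    score = sigmoid(sum(w * h for w, h in zip(weights, hidden)) + bias)
    return "generate" if score >= threshold else "retrieve"

def retrieve(query, candidates):
    """Return the name of the candidate whose embedding is most similar to the query."""
    return max(candidates, key=lambda name: cosine(query, candidates[name]))

# Hypothetical learned parameters and a tiny candidate image set.
decision_weights, decision_bias = [2.0, -1.0], 0.0
image_index = {"cat.png": [0.9, 0.1], "dog.png": [0.1, 0.9]}

for hidden in ([1.0, 0.0], [0.0, 1.0]):
    action = decide(hidden, decision_weights, decision_bias)
    if action == "retrieve":
        print(action, "->", retrieve(hidden, image_index))
    else:
        print(action, "-> pass the mapped embedding to the image generator")
```

In a real system the hidden state would come from the LLM and the candidate embeddings from a pretrained image encoder; here both are hand-written vectors so the control flow is easy to follow.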

The method performs better than baseline generation models, especially on tasks that involve longer and more sophisticated language. In particular, GILL outperforms Stable Diffusion at processing longer-form text, including dialogue and discourse. GILL also performs better than non-LLM-based generation models at dialogue-conditioned image generation, because it benefits from multimodal context and generates images that better match the given text. Unlike conventional text-to-image models that only process textual input, GILL can also process arbitrarily interleaved image-text inputs.

In conclusion, GILL (Generating Images with Large Language Models) seems promising, as it demonstrates a wider range of abilities than previous multimodal language models. Its ability to outperform non-LLM-based generation models on text-to-image tasks that measure context dependence makes it a strong candidate for multimodal tasks.


Check out the Paper and Project Page.


Tanya Malhotra is a final-year undergraduate at the University of Petroleum & Energy Studies, Dehradun, pursuing a BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with strong analytical and critical-thinking skills, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.


