TradePoint.io

Georgia Tech Researchers Introduce ZipIt: A General Method for Merging Two Arbitrary Models of the Same Architecture that Incorporates Two Simple Strategies

May 7, 2023
in AI & Technology

Computer vision has flourished under ever-larger models with ever-increasing parameter counts since AlexNet popularized deep learning. Today’s benchmark tasks include classification with tens of thousands of classes, accurate object detection, fast instance segmentation, realistic image generation, and many other vision problems once thought impossible or extremely difficult. These deep models are very effective, but they have a potentially debilitating flaw: they can only carry out the task they were trained on. Anyone trying to extend an existing model’s capabilities runs into several problems. Training the model on an additional task risks catastrophic forgetting.

Evaluating the same model on different data without adaptation often shows that it does not generalize to out-of-domain samples. So-called “intervention” strategies can lessen these effects, but they often require further training, which can be costly. Meanwhile, a wealth of carefully tuned models already exists for many tasks. Although these models frequently share the same core architecture, there is currently no technique for easily combining models trained for distinct purposes. The options are to ensemble them, which requires evaluating each model separately, or to jointly train a new model through distillation. Both can be prohibitively costly, especially given the current trend of ever-increasing architecture and dataset sizes.

Instead, researchers from the Georgia Institute of Technology propose simply “zipping” these models together, which requires no extra training and lets any redundant features be computed only once. In the vision community, the idea of merging several models into one has only recently begun to gain popularity. Model Soups can combine multiple models fine-tuned from the same pretrained initialization to improve accuracy and robustness. Git Re-Basin generalizes further to models trained on the same data but with different initializations, though with a significant drop in accuracy. REPAIR improves on Git Re-Basin by adding new parameters and, where applicable, adjusting the models’ batch norms.


All of these techniques, however, only merge models trained for the same task. This study pushes the line of research to its logical extreme by merging models with different initializations that were trained on entirely different tasks. Although this is a very difficult problem, the authors tackle it with two simple strategies. First, they note that earlier work has focused on permuting one model into the other when merging them. This creates a 1:1 mapping between the two models, which implicitly assumes that most features across them are correlated. Because that is not necessarily true for models trained on different tasks, permutation alone is not enough. Instead, the authors also exploit redundancy within each model.
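The contrast between a strict cross-model 1:1 mapping and a matching that may also pair redundant features inside one model can be illustrated with a small sketch. This is a toy under stated assumptions, not the authors' implementation: Pearson correlation of activations as the similarity measure, the greedy pairing rule, the helper names `pearson` and `greedy_pairs`, and all activation values are hypothetical choices made for illustration.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length activation lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return cov / (var_x * var_y) ** 0.5


def greedy_pairs(features, allowed):
    """Pair features by descending correlation; each feature is used once."""
    scored = sorted(
        ((pearson(features[a], features[b]), a, b)
         for a, b in combinations(sorted(features), 2)
         if allowed(a, b)),
        reverse=True,
    )
    used, pairs = set(), []
    for _, a, b in scored:
        if a not in used and b not in used:
            pairs.append((a, b))
            used.update((a, b))
    return pairs


# Toy activations on four inputs (made-up numbers, assumption).
# Names starting with "a" belong to model A, "b" to model B.
features = {
    "a0": [1.0, 2.0, 3.0, 4.0],
    "a1": [2.0, 4.0, 6.0, 8.5],    # nearly redundant with a0
    "b0": [1.0, -1.0, 1.0, -1.0],
    "b1": [2.0, -2.0, 2.5, -2.0],  # nearly redundant with b0
}

# Permutation-style matching: every pair must cross the two models.
cross_only = greedy_pairs(features, lambda a, b: a[0] != b[0])

# Zip-style matching: pairs may also form inside a single model.
zip_style = greedy_pairs(features, lambda a, b: True)

print(cross_only)
print(zip_style)
```

In this toy setup the unrestricted matching pairs `a0` with `a1` and `b0` with `b1`, while the cross-only matching is forced to pair features from different models even though none of them are positively correlated. A real merge would then combine the weights of each matched pair, which this sketch leaves out.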

To exploit this redundancy, they generalize model merging to allow “zipping” any combination of features, both within and across the models. On some datasets, they find that this alone improves accuracy by up to 20% over Git Re-Basin and a stronger permutation baseline that they implement. Second, current techniques merge the entire network. This may work for very similar models trained in the same setting, but the features of models trained on different tasks become less correlated the deeper they sit in the network. To address this, the authors introduce partial zipping, where they only “zip” up to a specified layer. They then automatically create a multi-head model by feeding the intermediate outputs of the merged model to the remaining unmerged layers of the original networks.
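The multi-head structure that partial zipping produces can be sketched in a few lines. This is only an illustration of the data flow, assuming layers are plain functions on lists of floats; `build_multihead`, `run`, `scale`, `shift`, and the task names are hypothetical, and how the merged layers obtain their weights is deliberately out of scope here.

```python
from typing import Callable, Dict, List

Layer = Callable[[List[float]], List[float]]


def run(layers: List[Layer], x: List[float]) -> List[float]:
    """Apply a sequence of layers in order."""
    for layer in layers:
        x = layer(x)
    return x


def build_multihead(merged_trunk: List[Layer],
                    heads: Dict[str, List[Layer]]):
    """Return a forward function: shared merged trunk, one unmerged tail per task."""
    def forward(x: List[float]) -> Dict[str, List[float]]:
        shared = run(merged_trunk, x)  # merged layers are evaluated once
        return {task: run(tail, shared) for task, tail in heads.items()}
    return forward


# Toy layers standing in for real network layers (assumption).
def scale(k: float) -> Layer:
    return lambda x: [k * v for v in x]


def shift(b: float) -> Layer:
    return lambda x: [v + b for v in x]


merged_trunk = [scale(2.0), shift(1.0)]   # layers zipped up to the chosen depth
heads = {
    "task_a": [scale(10.0)],              # remaining layers of model A
    "task_b": [shift(-3.0), scale(0.5)],  # remaining layers of model B
}

model = build_multihead(merged_trunk, heads)
outputs = model([1.0, 2.0])
print(outputs)
```

The point of the design is that the merged trunk runs once per input, and only the unmerged tails are evaluated separately for each task.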

Depending on the difficulty of each task, partial zipping can improve accuracy by over 15% while still keeping most of the layers merged. Combining both strategies, the authors introduce ZipIt!, a general method for “zipping” together any number of models trained on different tasks into a single multi-task model without further training. By devising a general graph-based algorithm for merging and unmerging, they can merge models of the same architecture, merge features within each model, and partially zip them to form a multi-task model. They demonstrate the method by merging models trained on entirely different datasets and on disjoint sets of CIFAR and ImageNet categories, outperforming prior work by a wide margin. They then analyze and ablate the method’s performance in various scenarios. The pipeline is described in detail in the GitHub repository, and the code and datasets have been made available.


Check out the Research Paper and Code.




Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.


©2020- TradePoint.io - All rights reserved!
