TradePoint.io

ETH Zurich and HKUST Researchers Propose HQ-SAM: A High-Quality Zero-Shot Segmentation Model By Introducing Negligible Overhead To The Original SAM

June 8, 2023
in AI & Technology

Accurate segmentation of diverse objects is essential for scene understanding applications such as image and video processing, robotic perception, and AR/VR. The recently released Segment Anything Model (SAM) is a foundation vision model for general image segmentation, trained on billion-scale mask labels. Given prompts consisting of points, a bounding box, or a coarse mask, SAM can segment a wide range of objects, parts, and visual structures in many contexts. Its zero-shot segmentation capabilities have prompted a rapid paradigm shift, since they transfer to many applications with just a few simple prompts.
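
To make the three prompt types concrete, here is a minimal sketch of how they could be represented as data. The `SegmentationPrompt` class and its field names are hypothetical and chosen for illustration only; they are not SAM's actual API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[int, int]            # (x, y) pixel coordinate
Box = Tuple[int, int, int, int]    # (x_min, y_min, x_max, y_max)


@dataclass
class SegmentationPrompt:
    """Hypothetical container for the prompt types named in the article."""
    points: List[Point] = field(default_factory=list)
    box: Optional[Box] = None
    coarse_mask: Optional[List[List[int]]] = None  # rows of 0/1 values

    def kinds(self) -> List[str]:
        """Return which prompt types are present, in a fixed order."""
        present = []
        if self.points:
            present.append("points")
        if self.box is not None:
            present.append("box")
        if self.coarse_mask is not None:
            present.append("coarse_mask")
        return present

    def validate(self) -> None:
        """Reject empty prompts and malformed boxes."""
        if not self.kinds():
            raise ValueError("at least one of points, box, or coarse_mask is required")
        if self.box is not None:
            x0, y0, x1, y1 = self.box
            if x0 >= x1 or y0 >= y1:
                raise ValueError("box must satisfy x_min < x_max and y_min < y_max")


# Example: two foreground points combined with a bounding box.
prompt = SegmentationPrompt(points=[(120, 80), (140, 95)], box=(100, 60, 200, 180))
prompt.validate()
print(prompt.kinds())
```

A real promptable model would encode each of these inputs into prompt tokens; the sketch only shows that a prompt can mix several input types and should contain at least one.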

Despite its strong performance, SAM’s segmentation results still leave room for improvement. Two significant issues affect SAM: 1) Coarse mask boundaries, which often miss thin object structures entirely, as shown in Figure 1. 2) Incorrect predictions, broken masks, or large errors in challenging cases. This is often related to SAM misinterpreting thin structures, such as the kite lines at the top right of the figure. These errors significantly limit the applicability and effectiveness of foundation segmentation models such as SAM, especially for automated annotation and image/video editing tasks where highly accurate image masks are essential.

Figure 1: Comparison of the masks predicted by SAM and HQ-SAM, given input prompts of a single red box or several points on the object. HQ-SAM produces noticeably more detailed results with very accurate boundaries. In the rightmost column, SAM misinterprets the thin structure of the kite lines and, for the input box prompt, produces many errors with broken holes.


Researchers from ETH Zurich and HKUST propose HQ-SAM, which preserves the original SAM’s strong zero-shot capabilities and flexibility while predicting highly accurate segmentation masks, even in very challenging cases (see Figure 1). They propose a minimal adaptation of SAM, adding fewer than 0.5% additional parameters, to extend its capacity for high-quality segmentation while maintaining efficiency and zero-shot performance. Directly fine-tuning the SAM decoder or adding a new decoder module substantially degrades general zero-shot segmentation performance. They therefore design HQ-SAM to integrate with and reuse the existing learned SAM structure, so that zero-shot performance is fully retained.

First, in addition to the original prompt and output tokens, they introduce a learnable HQ-Output Token that is fed into SAM’s mask decoder. Unlike the original output tokens, the HQ-Output Token and its associated MLP layers are trained to predict a high-quality segmentation mask. Second, rather than relying only on SAM’s mask decoder features, the HQ-Output Token operates on a refined feature set to capture precise mask details. To exploit both global semantic context and fine-grained local features, they fuse SAM’s mask decoder features with the early and late feature maps from its ViT encoder.
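
The token-plus-fusion idea can be illustrated with a toy, pure-Python sketch. It rests on several assumptions the article does not spell out: the three feature maps are taken to have the same shape and to be fused by element-wise sum, and the mask logit at each pixel is taken to be the dot product between the MLP-transformed token and the fused feature at that pixel. All function names are hypothetical, and the weights are hand-picked rather than learned.

```python
from typing import List, Tuple

Vector = List[float]
FeatureMap = List[List[Vector]]  # indexed [row][col] -> C-dimensional feature


def fuse(early: FeatureMap, late: FeatureMap, decoder: FeatureMap) -> FeatureMap:
    """Element-wise sum of three same-shaped feature maps (assumed fusion rule)."""
    return [
        [[e + l + d for e, l, d in zip(ev, lv, dv)]
         for ev, lv, dv in zip(erow, lrow, drow)]
        for erow, lrow, drow in zip(early, late, decoder)
    ]


def linear(x: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """One fully connected layer: weight @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def mlp3(token: Vector, layers: List[Tuple[List[Vector], Vector]]) -> Vector:
    """Apply three linear layers with ReLU after all but the last."""
    assert len(layers) == 3, "the sketch assumes a three-layer MLP"
    h = token
    for i, (weight, bias) in enumerate(layers):
        h = linear(h, weight, bias)
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]
    return h


def predict_mask_logits(token: Vector, layers, fused: FeatureMap) -> List[List[float]]:
    """Dot product between the transformed token and every pixel feature."""
    weights = mlp3(token, layers)
    return [[sum(w * f for w, f in zip(weights, feat)) for feat in row] for row in fused]


# Toy 1x2 image with 2-channel features and identity MLP layers.
identity = ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
hq_token = [1.0, -1.0]
early = [[[1.0, 0.0], [0.0, 1.0]]]
late = [[[0.5, 0.0], [-2.0, 0.0]]]
decoder = [[[0.5, 1.0], [0.0, 0.0]]]

fused = fuse(early, late, decoder)
logits = predict_mask_logits(hq_token, [identity] * 3, fused)
mask = [[1 if v > 0 else 0 for v in row] for row in logits]
print(fused, logits, mask)
```

In a trained model the token and MLP weights would be learned so that the dot product responds to fine boundary detail; here they only demonstrate the data flow from token and fused features to a per-pixel mask.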

During training, all pre-trained SAM parameters are frozen, and only the HQ-Output Token, its associated three-layer MLPs, and a small feature fusion block are updated. Learning accurate segmentation requires a dataset with precise mask annotations of diverse objects with intricate, detailed geometries. SAM itself is trained on the SA-1B dataset, which contains 11M images and 1.1 billion masks generated automatically by a SAM-like model. However, as SAM’s performance in Figure 1 shows, using this large dataset is costly and still fails to produce the high-quality masks targeted in their study.
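
The freezing scheme can be sketched as follows. The module names and parameter counts below are made-up placeholders in arbitrary units, chosen only so that the trainable share lands under the 0.5% figure quoted in the article; they are not the real sizes of SAM or HQ-SAM.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ParamGroup:
    """A named block of parameters with a trainable flag."""
    num_params: int
    trainable: bool = True


# Hypothetical names for the newly added HQ-SAM components.
TRAINABLE_PREFIXES: Tuple[str, ...] = ("hq_token", "hq_mlp", "hq_fusion")


def freeze_pretrained(model: Dict[str, ParamGroup],
                      trainable_prefixes: Tuple[str, ...] = TRAINABLE_PREFIXES) -> None:
    """Mark every group as frozen unless its name starts with a trainable prefix."""
    for name, group in model.items():
        group.trainable = name.startswith(trainable_prefixes)


def trainable_fraction(model: Dict[str, ParamGroup]) -> float:
    """Share of parameters that will receive gradient updates."""
    total = sum(g.num_params for g in model.values())
    trainable = sum(g.num_params for g in model.values() if g.trainable)
    return trainable / total


# Placeholder sizes in arbitrary units, NOT real parameter counts.
model = {
    "image_encoder": ParamGroup(900),
    "prompt_encoder": ParamGroup(10),
    "mask_decoder": ParamGroup(86),
    "hq_token": ParamGroup(1),
    "hq_mlp": ParamGroup(1),
    "hq_fusion": ParamGroup(2),
}

freeze_pretrained(model)
print(sorted(name for name, g in model.items() if g.trainable))
print(f"trainable share: {trainable_fraction(model):.2%}")
```

In a deep learning framework the same idea is usually expressed by disabling gradients on the pre-trained weights and passing only the new parameters to the optimizer.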

They therefore build HQSeg-44K, a new dataset of 44K extremely fine-grained image mask annotations. HQSeg-44K is assembled from six existing image datasets with highly accurate mask annotations and spans more than 1,000 semantic classes. Thanks to the smaller dataset and the simple integrated design, HQ-SAM can be trained on 8 RTX 3090 GPUs in under 4 hours. They carry out a rigorous quantitative and qualitative experimental study to verify the effectiveness of HQ-SAM.

They compare HQ-SAM with SAM on a suite of nine segmentation datasets from various downstream tasks, seven of which are evaluated under a zero-shot transfer protocol, including COCO, UVO, LVIS, HQ-YTVIS, BIG, COIFT, and HR-SOD. This thorough analysis shows that HQ-SAM produces higher-quality masks than SAM while retaining zero-shot capability. A demo is available on their GitHub page.
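
The article does not say which metrics the authors report. As an illustration only, one common way to compare a predicted mask against a ground-truth mask is intersection over union (IoU), sketched below on tiny hand-made masks. The example masks are invented for the sketch and are not results from the paper.

```python
from typing import List

Mask = List[List[int]]  # rows of 0/1 values


def mask_iou(pred: Mask, target: Mask) -> float:
    """Intersection over union of two binary masks of identical shape."""
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("masks must have the same shape")
    intersection = 0
    union = 0
    for prow, trow in zip(pred, target):
        for p, t in zip(prow, trow):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Invented toy masks: a ground truth, a coarse prediction, and an exact one.
ground_truth = [[1, 1, 0],
                [1, 1, 0]]
coarse_pred = [[1, 1, 1],
               [1, 0, 0]]
exact_pred = [[1, 1, 0],
              [1, 1, 0]]

print(mask_iou(coarse_pred, ground_truth))
print(mask_iou(exact_pred, ground_truth))
```

Averaging such a score over every image in a dataset gives a single number per model, which is the kind of comparison a multi-dataset evaluation relies on.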


Check out the Paper and GitHub.


Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.

