How to Speed Up the HTTP API of IPFS DHT!

Last year, we released a significant improvement to Someguy, the delegated routing HTTP API for the Amino DHT and IPNI. The update introduced a cached address book and active probing of DHT peers. This change greatly increased the proportion of provider records returned with addresses, which in turn sped up peer-to-peer content retrieval in browsers and mobile applications. It shipped in Someguy v0.7.0. Read on for the full story.

What is Someguy and Why is it Important

Someguy is a delegated routing HTTP API that proxies IPFS routing requests to Amino DHT, IPNI, or any other routing system that implements the same API.

Its primary purpose is to help IPFS clients find the peers that provide a given CID, together with their network addresses, and to expose this as an HTTP API. This is crucial for browsers and mobile applications that need to access IPFS content without running a full DHT client, which is often impractical on resource-constrained devices like phones and in web browsers.

The Amino DHT client is stateful and typically opens hundreds of connections to maintain its routing table and look up provider and peer records. The problem is that the network capabilities of browsers and mobile devices are limited—both in terms of the transports they can use and the number of connections they can open. Mobile devices also have limited battery and bandwidth, making it impractical to run a full DHT client.

Delegated routing allows these devices to query the DHT for content providers in a single HTTP request, rather than requiring them to maintain complex DHT connections themselves.

To enable decentralized retrieval of content provided to the DHT, Someguy acts as the intermediary: it performs the DHT query on behalf of the device and returns a list of peers that provide the data for the CID. This is done over HTTP, which is widely supported by browsers and mobile applications.

The IPFS Foundation provides a public delegated routing endpoint backed by Someguy, by default at https://delegated-ipfs.dev/routing/v1, to accelerate peer-to-peer content retrieval in browsers and mobile applications.

The Role of Someguy in IPFS Content Retrieval

When Helia or @helia/verified-fetch retrieves content from the IPFS network, it goes through the following process:

1. Helia requests the providers for the CID, asking for a streaming response with the Accept: application/x-ndjson header:

GET https://delegated-ipfs.dev/routing/v1/providers/bafybeigdyrzt5sfp7udm7hu76uh7y26nf3efuylqabf3oclgtqy55fbzdi

2. Someguy traverses the Amino DHT and responds with the providers that have the content, usually including their network addresses.

Example response:

[Figure: example response listing provider records with their network addresses]

3. As each provider record arrives in the stream, the browser or mobile application connects directly to that peer, which enables parallel connection attempts and faster content retrieval.
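The streaming flow above can be sketched in a few lines of Python. This is only an illustration, not Helia's or Someguy's code: the record field names (Schema, ID, Addrs) are an assumption based on the Delegated Routing V1 HTTP API specification, and the peer IDs and addresses are made up. The sketch parses a newline-delimited JSON stream one record at a time, so a client could start dialing before the stream ends.

```python
import json
from typing import Iterable, Iterator

BASE_URL = "https://delegated-ipfs.dev/routing/v1"  # public endpoint named in the article


def providers_url(cid: str, base_url: str = BASE_URL) -> str:
    """Build the provider lookup URL for a CID."""
    return f"{base_url}/providers/{cid}"


def iter_provider_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one provider record per non-empty NDJSON line, as soon as it is read."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Made-up sample stream. Field names are an assumption (see the paragraph above).
sample_stream = [
    '{"Schema": "peer", "ID": "peer-A", "Addrs": ["/ip4/192.0.2.1/tcp/4001"]}',
    "",
    '{"Schema": "peer", "ID": "peer-B", "Addrs": []}',
]

for record in iter_provider_records(sample_stream):
    print(record["ID"], record["Addrs"])
```

With a real HTTP client, the request would carry the Accept: application/x-ndjson header and the response body would be fed to iter_provider_records line by line.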

The performance equation is simple: the faster Someguy responds with working peer addresses, the quicker browsers and mobile applications can start peer-to-peer content retrieval. Every millisecond saved in routing queries translates directly to faster content delivery.

Issue: Provider Records Without Peer Addresses

Prior to v0.7, Someguy would often respond with provider records that included peer IDs but not their network addresses. This meant that clients had to make additional requests to /routing/v1/peers/{peerid} to obtain the actual addresses of each peer.

For example, unlike the response above, Someguy would return a response like this:

[Figure: example response with provider records that contain peer IDs but no addresses]
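To make the extra cost concrete, here is a minimal client-side sketch of the fallback that address-less records force. The record shape (ID, Addrs) and the sample peers are assumptions for illustration. Only the /routing/v1/peers/{peerid} path comes from the text above.

```python
from typing import Iterable, List

BASE_URL = "https://delegated-ipfs.dev/routing/v1"


def peers_needing_lookup(records: Iterable[dict]) -> List[str]:
    """Return the peer IDs of provider records that carry no addresses."""
    return [r["ID"] for r in records if not r.get("Addrs")]


def peer_lookup_url(peer_id: str, base_url: str = BASE_URL) -> str:
    """Build the follow-up peer routing URL for a single peer ID."""
    return f"{base_url}/peers/{peer_id}"


# Made-up records: one usable immediately, two that need a second round trip.
records = [
    {"ID": "peer-A", "Addrs": ["/ip4/192.0.2.1/tcp/4001"]},
    {"ID": "peer-B", "Addrs": []},
    {"ID": "peer-C"},
]

for peer_id in peers_needing_lookup(records):
    print(peer_lookup_url(peer_id))
```

Every URL printed here is one more HTTP request, and one more DHT lookup behind it, before the client can even attempt a connection.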

But why return providers without peer addresses?

The widely used go-libp2p and go-libp2p-kad-dht libraries define several important constants that control how long provider records and peer addresses are kept in memory:

• DefaultProvideValidity = 48 * time.Hour: TTL for the mapping between a multihash (derived from the CID) and a peer ID in a provider record.

• DefaultProviderAddrTTL = 24 * time.Hour: TTL for the addresses of these providers. These addresses are returned together with provider records in DHT RPC responses. After the addresses expire, the client needs to perform an additional lookup to find the multiaddrs associated with the returned peer ID.

• RecentlyConnectedAddrTTL = time.Minute * 15: How long the addresses of a peer are retained in memory after it disconnects. This applies to any libp2p peer that was recently connected.

In other words, DHT servers can return provider records without peer addresses. This happens in the 24-hour window between the expiry of a provider's addresses (24 hours after the record is published) and the expiry of the provider record itself (after 48 hours). The shorter address TTL ensures that provider records are not returned with stale addresses. Since re-providing typically happens every 24 hours, DHT servers should in theory always have fresh addresses for providers, but reality is messier.
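The interplay of the two TTLs can be modeled with a small Python sketch. It uses only the 48-hour and 24-hour values quoted above. It assumes the record is never re-provided and that the DHT server has no other reason to know the provider's addresses (for example a recent connection). The exact boundary behavior (>= versus >) is also an assumption.

```python
from datetime import timedelta

PROVIDE_VALIDITY = timedelta(hours=48)   # DefaultProvideValidity
PROVIDER_ADDR_TTL = timedelta(hours=24)  # DefaultProviderAddrTTL


def dht_server_answer(age: timedelta) -> str:
    """Describe what a DHT server can return for a provider record of a given age."""
    if age >= PROVIDE_VALIDITY:
        return "no record"
    if age >= PROVIDER_ADDR_TTL:
        return "record without addresses"
    return "record with addresses"


for hours in (1, 23, 25, 47, 49):
    print(f"{hours:>2}h after publish: {dht_server_answer(timedelta(hours=hours))}")
```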

Solution: Caching Peer Addresses

PR 90 introduced several mechanisms to ensure that Someguy either returns provider records with fresh peer addresses or does not return them at all, saving clients additional peer routing requests for unroutable peers.

This is achieved through a combination of a cached address book, active peer probing, and a cached router that augments routing results with cached addresses and filters out undialable peers.

It turns out that caching peer addresses is very inexpensive, especially considering that the work to discover them will be done anyway in subsequent requests. Thus, we ultimately reduced the total request rate at the cost of slightly increased memory consumption.

Cached Address Book

This new cached address book wraps go-libp2p’s memoryAddrBook and has the following properties:

• 48-hour cache: Stores peer addresses for 48 hours, matching the expiration time of DHT provider records.

• 1M peer capacity: This sets a memory usage limit, allowing Someguy to handle a large number of peer nodes without consuming excessive memory.

• Memory efficient: Uses LRU eviction to keep the most relevant peer nodes available at all times.

• Event-driven cache maintenance: Subscribes to the libp2p event bus and caches a peer after a successful libp2p identify event, rather than actively polling the DHT for peer addresses. Peers are therefore cached only as a result of actual delegated routing requests.
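The properties listed above (TTL, bounded capacity, LRU eviction, population on identify events) can be illustrated with a small in-memory sketch. This is not Someguy's implementation, which is written in Go and wraps go-libp2p's address book. The class name, method names, and the injectable clock are hypothetical and exist only to keep the example self-contained.

```python
import time
from collections import OrderedDict
from typing import Callable, List, Optional


class CachedAddrBook:
    """Toy address cache with a per-entry TTL and LRU eviction (hypothetical API)."""

    def __init__(self, capacity: int = 1_000_000, ttl_seconds: float = 48 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries: OrderedDict = OrderedDict()  # peer ID -> (addrs, expires_at)

    def on_identify(self, peer_id: str, addrs: List[str]) -> None:
        """Store addresses learned from a successful identify event."""
        self._entries[peer_id] = (list(addrs), self.clock() + self.ttl_seconds)
        self._entries.move_to_end(peer_id)       # mark as most recently used
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)    # evict the least recently used peer

    def get(self, peer_id: str) -> Optional[List[str]]:
        """Return cached addresses, or None if unknown or expired."""
        entry = self._entries.get(peer_id)
        if entry is None:
            return None
        addrs, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[peer_id]
            return None
        self._entries.move_to_end(peer_id)
        return addrs
```

Because entries are written only when an identify event arrives, the cache fills as a side effect of work the router does anyway, which is why the memory cost stays modest.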

Active Peer Probing in the Background

Rather than returning stale addresses, Someguy tests peer connectivity in the background:

• Background validation: Tests cached peer addresses every 15 minutes to see if they are still valid.

• Exponential backoff: Stops wasting time on persistently offline peers.

• Concurrent testing: Tests up to 20 peer connections simultaneously.

• Selective probing: Only tests peers that have not been verified recently.
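A rough sketch of how such a probing loop could be structured is shown below. The 15-minute interval and the limit of 20 concurrent probes come from the list above. The doubling schedule, the 48-hour cap, and the function names are assumptions for illustration, not Someguy's actual parameters.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable

PROBE_INTERVAL_S = 15 * 60   # from the article: probe every 15 minutes
MAX_CONCURRENT_PROBES = 20   # from the article: up to 20 probes at once
MAX_BACKOFF_S = 48 * 3600    # assumption: cap the backoff at the cache TTL


def next_probe_delay(consecutive_failures: int) -> int:
    """Double the delay for every consecutive failure, up to the cap."""
    return min(PROBE_INTERVAL_S * 2 ** consecutive_failures, MAX_BACKOFF_S)


async def probe_all(peers: Iterable[str],
                    dial: Callable[[str], Awaitable[bool]],
                    limit: int = MAX_CONCURRENT_PROBES) -> Dict[str, bool]:
    """Dial every peer, never running more than `limit` dials at the same time."""
    semaphore = asyncio.Semaphore(limit)

    async def probe(peer_id: str):
        async with semaphore:
            return peer_id, await dial(peer_id)

    results = await asyncio.gather(*(probe(p) for p in peers))
    return dict(results)
```

A caller would pass its own dial coroutine and use next_probe_delay to schedule the next attempt for each peer that failed.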

Cached Router: Better Response for HTTP Clients

The cachedRouter (server_cached_router.go) uses the cached address book to augment routing results for peer and provider requests through a non-blocking iterator:

1. Cache priority response: Immediately returns verified peer addresses when available.

2. Background resolution: If no cached address exists, looks for new addresses without blocking the response.

3. Streaming results: Sends as soon as working peer addresses are found.

4. Fallback handling: Omits unreachable peers instead of sending broken addresses.
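The four steps above can be approximated with an async generator. This is a simplified sketch under assumptions, not a port of server_cached_router.go: the record shape, the plain-dict cache, and the lookup callback are hypothetical stand-ins.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, Dict, Iterable, List, Optional

Lookup = Callable[[str], Awaitable[Optional[List[str]]]]


async def _resolve(record: dict, lookup: Lookup) -> Optional[dict]:
    """Look up addresses for one record; return None if the peer is unreachable."""
    addrs = await lookup(record["ID"])
    return {**record, "Addrs": addrs} if addrs else None


async def augment(records: Iterable[dict],
                  cache: Dict[str, List[str]],
                  lookup: Lookup) -> AsyncIterator[dict]:
    pending = []
    for record in records:
        if record.get("Addrs"):
            yield record                                   # addresses came with the record
        elif record["ID"] in cache:
            yield {**record, "Addrs": cache[record["ID"]]}  # 1. cache-first response
        else:
            # 2. resolve in the background without holding up later records
            pending.append(asyncio.ensure_future(_resolve(record, lookup)))
    for finished in asyncio.as_completed(pending):
        resolved = await finished
        if resolved is not None:   # 4. omit peers that could not be resolved
            yield resolved         # 3. stream each result as soon as it is ready
```

The design choice worth noting is that a record is either emitted with addresses or not emitted at all, so clients never receive a peer ID they would have to resolve themselves.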

All these improvements are enabled by default in Someguy v0.7.0 and above (see SOMEGUY_CACHED_ADDR_BOOK env variable for how to disable it).

Measuring Impact

To measure the impact of these changes, we deployed two Someguy instances, one with the cached address book and active probing enabled, and the other with it disabled.

On the instance with the cached address book enabled, the cache needed some time to warm up. Peers are only cached after mutual authentication and a run of the identify protocol, which happens as a downstream effect of incoming content and peer routing requests, unless Someguy runs with the accelerated DHT client, which crawls the DHT at startup.

To determine when the cache was sufficiently warm, we observed the size metric of the cached address book and waited until it stabilized, which took about 12 hours, at which point the cache had around 30k peers. This metric continued to grow at a much slower rate, eventually plateauing at ~60k peers, which correlates with the number of DHT servers measured by ProbeLab (measured in Q3 2025).

[Figure: size of the cached address book over time]

Then, we piped the last 500k CIDs requested from the public ipfs.io gateway through each instance's /routing/v1/providers/[CID] endpoint at a rate of 100 requests/second and checked the cache hit rate, which is the most important metric for measuring the impact of this work.

We also examined HTTP request latency and HTTP success rates to gain a more comprehensive understanding of the impact of this change and to see if there were any unexpected side effects.

Note that the list of 500k CIDs was not deduplicated, which reflects real-world usage patterns where popular CIDs are requested more frequently.

Effectiveness of Peer Address Caching

We measured two key metrics to evaluate the impact of caching:

(1) How often is the cache needed?

In ~66% of requests, the provider records returned by the DHT already included addresses. The remaining ~34% returned providers without addresses, requiring cache lookups or additional peer routing.

(2) How effective is the cache when needed?

For the 34.4% of requests that required address resolution:

• Cache hit rate: ~83% (addresses found in the cache)

• Cache miss: ~17% (requiring new peer lookups)

The bottom line: Caching eliminated ~83% of the need for clients to issue additional peer routing requests 🎉
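Combining the two measurements gives a sense of the overall effect. The short calculation below uses only the percentages reported above and assumes they can be multiplied as simple per-request shares, so treat the results as a back-of-the-envelope estimate.

```python
needs_resolution = 0.344  # share of requests whose DHT results lacked addresses
cache_hit_rate = 0.83     # share of those that the cache could resolve

served_from_cache = needs_resolution * cache_hit_rate
still_need_lookup = needs_resolution * (1 - cache_hit_rate)
with_addresses = (1 - needs_resolution) + served_from_cache

print(f"resolved from cache:    {served_from_cache:.1%}")   # 28.6%
print(f"still need a lookup:    {still_need_lookup:.1%}")   # 5.8%
print(f"answered with addresses: {with_addresses:.1%}")     # 94.2%
```

Under these assumptions, roughly 94% of requests can be answered with addresses right away, compared with about 66% without the cache.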

HTTP Request Latency and Success Rate

Here, we examined the P95 (95th percentile) latency of HTTP requests to /routing/v1/providers/[CID], grouped by response code (200 vs. 404), as well as the success rate, measured as the ratio of 200 to 404 responses.

Notably, we did not expect the cache to significantly reduce latency or error rates, as the cached address book is only used to augment the results of the DHT and does not change the underlying DHT query process.

[Figure: P95 latency and success rate with and without the cached address book]

Key Insights

After enabling peer address caching, we observed unexpected improvements beyond address availability:

• The P95 latency for successful responses improved from 1.91 seconds to 1.35 seconds (a 29% speedup).

• The success rate increased from 52.0% to 57.2%.

These improvements may stem from active background probing, which pre-validates peer connections. When requesting repeated CIDs, Someguy can immediately return known good peers from the cache, speeding up routing and avoiding DHT traversals for subsequent lookups of the same content.

The results indicate that the cached address book and active probing had no negative impact on latency or success rates and actually improved both metrics.
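For readers who want to check the headline numbers, the arithmetic behind them is reproduced here from the figures reported in this section.

```python
p95_before_s, p95_after_s = 1.91, 1.35        # P95 latency for successful responses
success_before_pct, success_after_pct = 52.0, 57.2

saved_ms = (p95_before_s - p95_after_s) * 1000
speedup = (p95_before_s - p95_after_s) / p95_before_s
success_gain_pp = success_after_pct - success_before_pct

print(f"P95 saved:    {saved_ms:.0f} ms")                          # 560 ms
print(f"P95 speedup:  {speedup:.1%}")                              # 29.3%
print(f"success rate: +{success_gain_pp:.1f} percentage points")   # +5.2
```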

Configuration

The cached address book and active probing can be configured via the following environment variables:

• SOMEGUY_CACHED_ADDR_BOOK

• SOMEGUY_CACHED_ADDR_BOOK_ACTIVE_PROBING

• SOMEGUY_CACHED_ADDR_BOOK_RECENT_TTL

See the documentation for more details.

Metrics

When the cached address book and active probing are enabled, Prometheus metrics for monitoring both are available; see the metrics documentation for the full list.

Other Optimizations: HTTP Level Caching

In addition to the peer address caching discussed above, Someguy also implements HTTP-level caching via Cache-Control headers. This provides a complementary caching layer that benefits all clients, even those that do not issue repeated requests themselves.

Cache Duration:

• Provider responses with results: 5 minutes – fresh enough to capture new providers while reducing redundant DHT lookups.

• Empty responses (no providers found): 15 seconds – short duration to allow quick discovery when content becomes available.

• stale-while-revalidate: 48 hours – clients can use stale data while fetching updates in the background.
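Expressed as a header value, the durations above could look like the sketch below. The three durations come from the list. The public directive, the exact ordering of directives, and the use of stale-while-revalidate on empty responses are assumptions, so check the headers of a live response before relying on them.

```python
RESULTS_MAX_AGE_S = 5 * 60                # responses that contain providers
EMPTY_MAX_AGE_S = 15                      # responses with no providers
STALE_WHILE_REVALIDATE_S = 48 * 60 * 60   # 48 hours


def cache_control(provider_count: int) -> str:
    """Pick a Cache-Control value based on whether any providers were found."""
    max_age = RESULTS_MAX_AGE_S if provider_count > 0 else EMPTY_MAX_AGE_S
    return (f"public, max-age={max_age}, "
            f"stale-while-revalidate={STALE_WHILE_REVALIDATE_S}")


print(cache_control(3))  # public, max-age=300, stale-while-revalidate=172800
print(cache_control(0))  # public, max-age=15, stale-while-revalidate=172800
```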

This HTTP caching layer works in conjunction with the peer address cache:

• Address caching ensures provider records contain dialable addresses.

• HTTP caching prevents redundant requests for the same CID across different clients.

• CDNs and proxies can serve popular content routing responses without hitting Someguy.

These caching layers together significantly reduce latency and server load while maintaining data freshness.

Conclusion

Adding peer address caching and active probing to Someguy is a significant step forward for decentralized content retrieval in constrained environments. By eliminating ~83% of additional peer lookups and reducing P95 latency by ~30% (~560 milliseconds), these improvements have noticeably sped up direct peer-to-peer content retrieval for millions of users accessing IPFS through browsers and mobile applications.

This work is available in Someguy starting from v0.7.0 and is already live on the public endpoint at https://delegated-ipfs.dev/routing/v1/providers. Anyone can run their own Someguy instance to provide delegated routing for their users or applications. For operators, the caching functionality is enabled by default and can be controlled via environment variables.

Looking ahead, we will continue to explore ways to make IPFS more accessible and high-performance for all users, regardless of their device capabilities.

