Product Crawler

Last updated: Aug 15th, 8:08am

Why does PayPal use allowlisting?

As part of PayPal Ads, we need timely and highly accurate product information from our merchants and partners. While we rely heavily on direct feeds, PayPal also runs a crawler on server-side (headless) browsers to check that promoted products are available to customers.

To give our customers a seamless experience, merchants and partners should complete the following steps so that our crawler is not blocked.

How do you allowlist our crawler?

Our crawler, which runs on PayPal browsers, can identify itself using a custom User-Agent or a custom HTTP header. This lets you recognize traffic coming from our crawler and allow it to bypass your bot detection filters.

Choose one of the following options and agree with PayPal on the custom User-Agent or custom HTTP header that identifies our crawler.

We recommend Option 1, as it enables PayPal to update the product information more quickly using a set of varied IP addresses and servers.

Option 1 (preferred)

With this option, the crawler sends a custom User-Agent that you can recognize and exempt from blocking. An example, which can be customized:

Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36 Honey/1.0
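A merchant-side check for Option 1 could look something like the sketch below. It assumes the agreed identifier is the Honey/1.0 product token from the example User-Agent; the constant and the function name is_allowlisted_user_agent are hypothetical, and the real value is whatever you and PayPal agree on.

```python
from typing import Optional

# Assumption: the agreed identifier is the "Honey/1.0" token from the example
# User-Agent. Replace it with the value actually agreed upon with PayPal.
AGREED_UA_TOKEN = "Honey/1.0"


def is_allowlisted_user_agent(user_agent: Optional[str]) -> bool:
    """Return True when the User-Agent contains the agreed token as a whole word."""
    if not user_agent:
        return False
    # Split on whitespace so "Honey/1.0" does not match "Honey/1.01".
    return AGREED_UA_TOKEN in user_agent.split()


example_ua = (
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/41.0.2228.0 Safari/537.36 Honey/1.0"
)
print(is_allowlisted_user_agent(example_ua))
```

Matching a whole token rather than a substring is a design choice in this sketch; comparing against the full agreed User-Agent string would work as well.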

Option 2

Alternatively, PayPal browsers can send an agreed-upon custom HTTP header when connecting to your site. You can then use this header to identify PayPal traffic and exempt it from blocking.

An example HTTP header is X-Merchant-Key: Merchant-Custom-Value, where X-Merchant-Key is the header name (key) and Merchant-Custom-Value is its value.

Both the key and the value can be customized, as long as they are agreed upon and shared with PayPal in advance.
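For Option 2, a check on the merchant side might be sketched as follows. The header name and value are the placeholder pair from the example above, not real credentials, and has_agreed_header is a hypothetical helper name. The sketch assumes request headers are available as a simple name-to-value mapping.

```python
import hmac
from typing import Mapping

# Assumption: placeholder key/value from the example; use the pair that was
# actually agreed upon and shared with PayPal.
AGREED_HEADER_NAME = "X-Merchant-Key"
AGREED_HEADER_VALUE = "Merchant-Custom-Value"


def has_agreed_header(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries the agreed header with the agreed value."""
    # HTTP header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    received = normalized.get(AGREED_HEADER_NAME.lower())
    if received is None:
        return False
    # Constant-time comparison, since the value acts like a shared secret.
    return hmac.compare_digest(
        received.encode("utf-8"), AGREED_HEADER_VALUE.encode("utf-8")
    )


print(has_agreed_header({"x-merchant-key": "Merchant-Custom-Value"}))
```

The header value is compared exactly and case-sensitively here, while the header name is matched case-insensitively.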

What is the impact of allowlisting the crawler?

Once the crawler is allowlisted on your site, you will see intermittent traffic from our server-side crawler. Every request the product crawler makes, including each product URL it loads, carries the agreed-upon custom User-Agent or custom HTTP header, so you can configure your bot detection to allow those requests through.
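One way to picture "allowing those requests through" is a gate placed in front of your existing bot detection. The sketch below is illustrative only: should_block, the bot_detector callback, and the constants are hypothetical stand-ins, and the identifiers are the placeholder values from the examples in this document. A real site would normally check only the one option it agreed on with PayPal.

```python
from typing import Callable, Mapping

# Assumptions: placeholder identifiers taken from the examples in this document.
AGREED_UA_TOKEN = "Honey/1.0"
AGREED_HEADER = ("x-merchant-key", "Merchant-Custom-Value")

Headers = Mapping[str, str]


def should_block(headers: Headers, bot_detector: Callable[[Headers], bool]) -> bool:
    """Return True if the request should be blocked.

    Requests carrying the agreed identifier skip bot detection; all other
    requests are passed to the site's own detector (a stand-in callback here).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    ua_matches = AGREED_UA_TOKEN in lowered.get("user-agent", "").split()
    header_matches = lowered.get(AGREED_HEADER[0]) == AGREED_HEADER[1]
    if ua_matches or header_matches:
        return False  # identified crawler traffic bypasses the filter
    return bot_detector(headers)


def block_everything(headers: Headers) -> bool:
    """Stand-in for a strict bot filter that would block every request."""
    return True


print(should_block({"User-Agent": "Mozilla/5.0 Honey/1.0"}, block_everything))
```

Traffic without the agreed identifier is still evaluated by the detector, so allowlisting the crawler does not loosen the filter for other clients.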

PayPal crawls only products that have recently changed, as determined from product feed data and user observation signals. Providing PayPal with an affiliate or direct product feed can further reduce the crawler traffic to your site.