IPFS HTTP gateways are extremely useful for bringing the value of content addressing to the broad base of internet users. They accept HTTP requests that include an IPFS CID and, if the data corresponding to that CID is available on IPFS to that gateway, fetch it and return it in an HTTP response. This enables developers to use IPFS without worrying about how the majority of web users will reach the data. web3.storage runs the w3link gateway at w3s.link.
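As a rough illustration of what such a request looks like, the sketch below builds a gateway URL from a CID. It assumes the common path-style convention (`/ipfs/<cid>/<path>`); the function name `gateway_url` and the CID string are placeholders made up for this example.

```python
from urllib.parse import quote

GATEWAY_HOST = "w3s.link"  # the w3link gateway host named above


def gateway_url(cid: str, path: str = "", host: str = GATEWAY_HOST) -> str:
    """Build a path-style gateway URL: https://<host>/ipfs/<cid>/<path>."""
    url = f"https://{host}/ipfs/{cid}"
    if path:
        # Percent-encode the path but keep "/" separators intact.
        url += "/" + quote(path.lstrip("/"))
    return url


# "bafyExampleCid" is a placeholder, not a real CID.
print(gateway_url("bafyExampleCid", "images/cat 1.png"))
```

Any HTTP client or browser can then fetch that URL without speaking IPFS itself.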
Public gateways as stewards
Most gateways are run as public goods, meaning anyone can utilize the gateway. Further, these gateways generally fetch any content that is made available to them. Most gateways avoid playing a content moderation role as much as possible given the open nature of IPFS. Usually, if a gateway cannot fetch a given piece of content, it is because that content isn’t reachable by that gateway within the window of an open HTTP request.
One exception is the “bad bits denylist,” a list of CIDs that have been submitted to and verified by Protocol Labs as containing malicious content (malware, illegal content, etc.). Gateways can subscribe to this list, and w3link and most major public gateways do so. Using lists like this strikes a balance between the obligation of public gateways to protect vulnerable users and staying above the subjective fray of content moderation. Because gateways are run as a free, public service, operators are within their rights to choose which lists they subscribe to, if any, and to unsubscribe if a list ever ceases to meet their expectations. Likewise, users can check which lists a gateway operator subscribes to and decide whether to use that public gateway (or to use a different one, or operate their own, if they disagree).
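A gateway-side denylist check can be sketched roughly as follows. This is a minimal illustration and not the actual w3link code: it assumes each denylist entry is the hex SHA-256 digest of the string `<cid>/<path>`, and the helper names `anchor` and `is_denied` are invented for this example. The real list format should be checked against the denylist's own documentation.

```python
import hashlib


def anchor(cid: str, path: str = "") -> str:
    """Hash "<cid>/<path>" so the list need not expose the CIDs themselves.

    Assumption: entries are SHA-256 hex digests of this exact string.
    """
    return hashlib.sha256(f"{cid}/{path}".encode("utf-8")).hexdigest()


def is_denied(cid: str, denylist: set, path: str = "") -> bool:
    """Return True if the requested content appears on the denylist."""
    return anchor(cid, path) in denylist


# Placeholder CIDs, for illustration only.
denylist = {anchor("bafyMaliciousExample")}
print(is_denied("bafyMaliciousExample", denylist))  # True
print(is_denied("bafyHarmlessExample", denylist))   # False
```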
In recent months, we have seen the volume of malware being shared through nftstorage.link increase. Though we have responded to reports and blocked malicious CIDs as quickly as we’ve received them, it is apparent that malicious actors can generate this content faster than security-minded users can flag it. As a result, to protect our users, we have just introduced Content-Security-Policy (CSP) in w3link.
Content-Security-Policy
The HTTP Content-Security-Policy response header enables web site administrators to control resources the user agent is allowed to load for a given page. With a few exceptions, policies mostly involve specifying allowed server origins and script endpoints.
The main immediate motivation behind this decision was to prevent phishing websites from operating on w3link. These sites were hurting the project by causing security vendors and ISPs to continuously flag w3link as a malicious entity, and they were causing unseen harm to unsuspecting users.
Beyond preventing phishing websites, introducing CSP brings additional value in the long run:
- It prevents content-addressed websites in the wild from having centralized dependencies instead of same-origin, content-addressable dependencies. The CSP header breaks such sites, but depending on centralized URLs is a design issue that would cause problems whenever those URLs had issues, so this is an opportunity to design more resilient solutions.
- Further down the line, ipfs:// browser handling will be how users interact with IPFS. Browsers implementing the protocol handler require all external resources to rely on the same protocol (i.e. no dependencies over HTTP). This is an opportunity to design things correctly early on, rather than hit these roadblocks in the future.
Implementation details
To fully understand the implementation, we recommend reading the Content-Security-Policy header docs and the CSP spec.
The header content implemented is as follows:
Content-Security-Policy: default-src 'self' 'unsafe-inline' 'unsafe-eval' blob: data: ; form-action 'self' ; navigate-to 'self';
In short, the included directives are intended to restrict the URLs that can be loaded using script interfaces. data:* and blob:* URLs are accepted because they can’t be used for data smuggling. As a result, personal information submitted through a phishing website can no longer reach the malicious backend server.
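To make the structure of the policy easier to see, here is a small sketch that splits a CSP value into its directives. It assumes the policy is written with a semicolon between each directive, as CSP syntax requires; `parse_csp` is a helper written for this example, not part of any library.

```python
from typing import Dict, List

# Policy value, with directives separated by semicolons.
POLICY = (
    "default-src 'self' 'unsafe-inline' 'unsafe-eval' blob: data: ; "
    "form-action 'self' ; navigate-to 'self';"
)


def parse_csp(policy: str) -> Dict[str, List[str]]:
    """Split a CSP value into {directive name: [source expressions]}."""
    directives: Dict[str, List[str]] = {}
    for part in policy.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments such as the trailing ";"
        name, *sources = tokens
        # If a directive is repeated, only the first occurrence is kept.
        directives.setdefault(name.lower(), sources)
    return directives


for name, sources in parse_csp(POLICY).items():
    print(name, "->", sources)
```

Read this way, default-src covers resource loads (same origin, inline content, blob: and data: URLs), form-action limits where forms may submit, and navigate-to limits where the page may navigate.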
Broken websites and NFTs: What can I do now?
We acknowledge that this change might have created issues for websites and NFTs in the wild. With that in mind, to ease the transition we created a CID allowlist called goodbits.
It is a list of CIDs flagged as good, for which we can bypass the CSP header. If you have an NFT that broke as a result of this change, please submit a pull request to the goodbits repo with the CID you would like to support. We will validate it internally and add it to the goodbits list.
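Conceptually, the bypass could look something like the sketch below. This is only an illustration under stated assumptions: the in-memory set `GOODBITS`, the function `response_headers`, and the CIDs are hypothetical, and the way w3link actually stores and consults the list is not described here.

```python
from typing import Dict, Set

CSP_VALUE = (
    "default-src 'self' 'unsafe-inline' 'unsafe-eval' blob: data: ; "
    "form-action 'self' ; navigate-to 'self';"
)

# Hypothetical allowlist contents; real entries would come from the goodbits list.
GOODBITS: Set[str] = {"bafyAllowlistedExample"}


def response_headers(cid: str, goodbits: Set[str] = GOODBITS) -> Dict[str, str]:
    """Return the extra response headers for content identified by `cid`."""
    headers: Dict[str, str] = {}
    if cid not in goodbits:
        # Only content that is not allowlisted gets the restrictive policy.
        headers["Content-Security-Policy"] = CSP_VALUE
    return headers


print(response_headers("bafyAllowlistedExample"))  # {} (CSP bypassed)
print("Content-Security-Policy" in response_headers("bafySomeOtherExample"))  # True
```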
Striking a balance
As operators of a public gateway, we need to be thoughtful about any content moderation that we enforce. We hope to strike a balance between protecting our users from malicious content and avoiding false positives. By making the code and the thinking behind our operations transparent, we also hope to encourage discourse among the distributed web community and forge a path toward best practices for public gateway operators.