We were surprised to find out that, since February 10 [Ed. Note: article from 2018], MetaMask had begun reporting Crypto.BI as a phishing site.
At first we thought we’d been hacked, but our techs were unable to find any malware injected into our source code. We don’t perform Ethereum transactions at all, so whatever code could be triggering this notice would have to be related to ETH in some way, and we simply do not run such code on Crypto.BI.
After a closer look at the phishing detector.js source code in the MetaMask repository, we found out that they use an implementation of the Levenshtein distance algorithm to search for phishy (excuse the pun) websites. We then ran the library ourselves against our own domain:
const checkForPhishing = require('eth-phishing-detect');
const value = checkForPhishing('crypto.bi');
// value === false
So what was going on? Was it a case of the much-feared heisenbug?
As it turns out, using the Levenshtein algorithm to match similar-looking domain names against a blacklist was not a good idea. The blacklist used by MetaMask had grown to include too many similar domains, which triggered excessive false positives. Our own blacklist did not include these sites, so the library returned false on our self-test.
In the case of our particular domain name, MetaMask had included a website in their blacklist whose domain name had the word crypto located close to the dot. As a result, any other domain with the same or similar structure (e.g. Crypto.BI) fell within a short Levenshtein distance of this blacklisted domain and began being flagged. Every short domain name containing the word crypto seems to be flagged at this time.
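To make the mechanism concrete, here is a minimal sketch of fuzzy blacklist matching with the Levenshtein distance. It is not MetaMask's actual code: the function names, the tolerance of 2 and the blacklist entry crypto.b1 are all made up for illustration.

```javascript
// Classic dynamic-programming Levenshtein distance, keeping only two rows.
function levenshtein(a, b) {
  // prev[j] = distance between the empty string and the first j chars of b
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance between first i chars of a and the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical fuzzy check: flag a domain when it is within `tolerance`
// edits of any blacklisted entry. The tolerance is an assumed parameter.
function isFuzzyMatch(domain, blacklist, tolerance) {
  const name = domain.toLowerCase();
  return blacklist.some(
    (entry) => levenshtein(name, entry.toLowerCase()) <= tolerance
  );
}

// 'crypto.b1' is an invented blacklist entry, one edit away from 'crypto.bi'.
const madeUpBlacklist = ['crypto.b1'];
console.log(levenshtein('crypto.bi', 'crypto.b1'));       // 1
console.log(isFuzzyMatch('Crypto.BI', madeUpBlacklist, 2)); // true
```

With a short domain name, a single blacklisted neighbour is enough to put a legitimate site inside the tolerance, which is the kind of false positive we ran into.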
To make things worse, it also began to report the legitimate MyEtherWallet.com as a phishing site! This is understandable, since M.E.W. phishing sites use very similar domains and there are likely thousands of MEW phishing sites blacklisted.
Several other websites also opened GitHub issues claiming to be legit. We visited several of the sites whose owners filed an issue, and they do seem to be false positives as well.
Using the Levenshtein distance instead of literal matches is meant to shorten the blacklist, but as we can see, this will not work. Phishing domain names are often very similar to the legit ones in order to masquerade as the original and their structure will likely always match very closely. Unfortunately, blacklists must be literal and domains must match exactly in order to avoid false positives.
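For comparison, a literal blacklist check could look like the sketch below. Again, this is only an illustration under our own assumptions: createExactMatcher and the entry crypto.b1 are hypothetical and do not come from MetaMask's library.

```javascript
// Hypothetical exact-match checker: a domain is flagged only if it
// appears literally (case-insensitively) in the blacklist.
function createExactMatcher(blacklist) {
  // Build the lookup set once so each check is a constant-time lookup.
  const entries = new Set(blacklist.map((entry) => entry.toLowerCase()));
  return (domain) => entries.has(domain.toLowerCase());
}

// 'crypto.b1' is an invented blacklist entry used only for this example.
const isBlacklisted = createExactMatcher(['crypto.b1']);
console.log(isBlacklisted('crypto.b1')); // true
console.log(isBlacklisted('Crypto.BI')); // false
```

The trade-off is the one described above: the list has to name every phishing domain explicitly, but a legitimate site is never flagged just for looking similar to one.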
We hope MetaMask can fix this soon. In the meantime, please accept our apologies for the inconvenience. We assure you that our site is absolutely safe: we do not carry out Ethereum transactions and we never request Ethereum private keys for any purpose.
Update: Crypto.BI was whitelisted a few hours after this was posted. We appreciate the swift action on this matter.