Remove Pages from Wayback Machine

In more than one UDRP proceeding, pages from the Wayback Machine (found on Archive.org) have been submitted as evidence against the domain owner. The Wayback Machine is an Internet archive that takes snapshots of a website at various points in time, and the pages are indexed on the site.

In the Cleveland Browns UDRP for Browns.com in May 2011, the team submitted Wayback Machine captures from 2005 that showed football-related links and merchandise. This seems to have been one major factor that doomed the domain owner.

People might ask why someone would want to have their site removed from the Wayback Machine, if not for nefarious reasons. Let’s say you purchase a descriptive domain name today, and the name may also be considered part of another company’s trademark. Perhaps you love apples and bought AppleStore.com (only an example).

A UDRP panelist might treat the change of registrant as a “new registration,” which could be strike one, since the registration wouldn’t pre-date the trademark. If the site previously carried links that may have infringed on the technology company’s trademark, that could be strike two. If you are planning to use the name in a way that does not infringe on another company’s marks, you shouldn’t have to worry that the company will come after you with evidence of previous infringement.

As we all know, there aren’t necessarily three strikes in a UDRP proceeding, and these two facts might satisfy the three elements required in a UDRP. That being said, you should be able to remove those pages from the Wayback Machine to avoid future problems, and there’s an easy way to do it.

Full details can be found on Archive.org, but the gist is that you need to add a special directive to your robots.txt file, which most websites have. According to the removal guide:

To exclude the Internet Archive’s crawler (and remove documents from the Wayback Machine) while allowing all other robots to crawl your site, your robots.txt file should say:

User-agent: ia_archiver
Disallow: /

It’s pretty easy to do, although I haven’t done it with any of my sites. I am not sure whether there are downsides, or what they might be, so you should look into it before making the change.
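One way to sanity-check the robots.txt rule quoted above before uploading it is to parse it locally. The following is a minimal sketch, assuming Python's standard `urllib.robotparser` module; the helper name `is_allowed`, the `example.com` URL, and the `Googlebot` comparison agent are illustrative assumptions, not part of the removal guide. It only confirms what the file says, not whether any particular crawler actually honors it.

```python
from urllib.robotparser import RobotFileParser

# The two-line rule quoted from the removal guide.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Hypothetical helper: report whether robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder; substitute your own domain.
archiver_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", "http://example.com/")
other_allowed = is_allowed(ROBOTS_TXT, "Googlebot", "http://example.com/")

print("ia_archiver allowed:", archiver_allowed)
print("other robots allowed:", other_allowed)
```

If the first value is `False` and the second is `True`, the file expresses what the guide describes: the Internet Archive's crawler is excluded while other robots are still allowed.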

Elliot Silver
About The Author: Elliot Silver is an Internet entrepreneur and publisher of DomainInvesting.com. Elliot is also the founder and President of Top Notch Domains, LLC, a company that has closed eight figures in deals.

11 COMMENTS

  1. All parking companies should do this. There is no value to domain holders in having an archive of a parked page anyway.

  2. The only drawback that I can see in blocking Archive.org from archiving your pages is that you won’t have evidence of long-standing bona fide use of your domain when you want to file arbitration against another domain registrant who is infringing on your brand.

  3. Hi,

    A large company, which I will not mention, loves to use this method and/or will just take “screenshots” if the site is still up.

    And, If they sue you under the “Lanham act” in FEDERAL Court… and win a judgment against you, you can be fined up to 100K per domain & 3x the income you earned with the ‘infringing domain’.

    Under this “Act”, they can also go after the previous owner and get the same type judgment against them…if they can prove the same kind of case against them.

    Best,
    Dan

    * Not legal advice ~ Just my understanding of the law (Act) mentioned above.*

  4. WayBack finds a way back!

    Hey Elliot,

    Just a note for you and your readers… When one requests that materials collected from their website be removed from IA’s archive, that’s not actually what happens, at least not today. What happens instead is similar to a “no display” tag being attached to the files. Not very reassuring for intellectual property owners, but factually what is happening.

  5. Hello, on the Wayback Q&A there is no mention of excluding a single specific page. Is this possible?
    Or does the Wayback Machine simply exclude the entire site?
    Thank you,
    Karen

  6. Blocking archives is kind of like rubbing salt in the wound when you’re trying to look up articles from back before a site expired. It defeats the entire purpose of the IA if you can’t use it when a site dies. I mean, if the site is still up, you can just view it, right? (Well, obviously) This kind of behavior surely doesn’t win anyone friends and goes a lot towards encouraging the wrong kind of thinking among policy makers/politicians.

    Also, I don’t know about anyone else, but if I’m thinking about buying a domain and I can’t see what was on there before, I’m really hesitant to buy it. Who knows what kind of reputation it had/has and that I’m not aware of. No one wants to buy a dud! It also makes it trivially easy for Google to block your domains entirely if they all have that iaarchive tag. In theory, web browsers or firewalls could automatically block parked domains that way.

    If the previous website owner authorized crawling, then they gave permission. The domain is just another type of phone number. The new person with that number doesn’t get to destroy voice mails I got previously from it or something like that just because the number’s owner changed. It doesn’t give the new owner any value but then again, it doesn’t really hurt them. Someone else used the analogy of a new renter expecting all the old business’s machinery, furniture, etc. to belong to them. Domain names/telephone numbers/addresses aren’t copyrights! The contents of the previously pointed-to server didn’t magically vanish but simply became inaccessible at that address. If the server was still up, you could just add the IP address to your HOSTS file and connect to it just like before. Entering it directly into the browser usually works but some sites look at the header before serving content (multihosted domains).

    BTW, people now use NoScript and AdBlocker quite a bit. If everyone blocked Javascript and ads from domains they’ve never visited before, then those ads would become kind of useless. Just whitelist the sites you use all the time and don’t feel guilty, heh.

  7. Hi guys,

    I bought an existing domain name about 2 weeks ago. The website was created in 2003, and the Wayback Machine has archived its pages in 2003, 2004, 2006, 2009, and 2011. Can I, as the new domain owner, have these 'old' archived snapshots removed as well? With the same robots.txt method or another one?
    Thanks a lot.

  8. This no longer works. Archive.org has obviously been ignoring robots.txt files since late 2017 or so. All my sites have the following in their robots.txt file…

    User-agent: archive.org_bot
    Disallow: /

    User-agent: ia_archiver
    Disallow: /

    That used to work – all my sites used to NOT be archived in the wayback machine. But since sometime in late 2017 all of them are archived.
