# Name: http://www.iamwitch.com.com/robots.txt file
# Last Update: October 12, 2007
# (C) Copy and Copyright 2007 IAmAWitch.Com. All Rights Reserved.
# You may freely copy and use this robots.txt file (just give us the credit as the source as a courtesy).
#
# We do not now or ever allow unauthorized spiders and robots. We regularly observe our logs and ban
# IP addresses that do not observe the rules. In some cases, this has also resulted in bans on entire
# countries due to the prevalence of spamming and illegal activities being perpetrated on innocent sites.
# If the governments of certain countries would enforce their laws, such actions would not be required and
# the web would be a better place. Just an observation. In the meantime, we're protecting our site.
#
# The Rules:
#
# 1. - Unauthorized bots will result in IPs being banned. Agent spoofing is treated as bot activity.
# 2. - IP Addresses/Ranges w/o agent strings are banned in other areas of the site. Don't even waste your time trying.
# 3. - We monitor which bots don't observe robots.txt. Those who do not observe the rule meet Mr. H.T. Access.
# 4. - We do not allow traffic from Russia and its former satellites of the Soviet Union, China, the Middle East and
#    - most of South America due to the prevalence of spamming and illicit activities that prevail in these countries and
#    - regions of the world. Given the embedded nature of criminal activities in these countries as an accepted part of
#    - the culture, these bans are inflexible, sternly enforced and permanent until the end of time.
# Crawl Delay set to ten seconds
User-Agent: *
Crawl-Delay: 10

# Directory specific exclusions for IAmAWitch - tune as needed for your site
User-agent: *
Disallow: /admin/
Disallow: /articles/
Disallow: /bot-trap/
Disallow: /captcha/
Disallow: /chat/
Disallow: /chatterblock/
Disallow: /cgi-bin/
Disallow: /class/
Disallow: /classes/
Disallow: /codeofisis/
Disallow: /docs/
Disallow: /fckeditor/
Disallow: /filemgmt_data/
Disallow: /galleryx_noview/
Disallow: /help/
Disallow: /images/
Disallow: /mediagallery/
# Specific files not to index (for IAmAWitch - tune as needed for your site)
Disallow: /comment.php
Disallow: /calendar.php
Disallow: /forum/createtopic.php
Disallow: /pingback.php
Disallow: /profiles.php
Disallow: /stats.php
Disallow: /submit.php
Disallow: /trackback.php
Disallow: /users.php
Disallow: /usersettings.php

# Disallows for bad spiders
# Lower-cases (for ease of sorting via TED Notepad)
User-agent: asterias
User-agent: b2w/0.1
User-agent: cosmos
User-agent: emailCollector
User-agent: emailSiphon
User-agent: emailWolf
User-agent: grub
User-agent: grub-client
User-agent: grub-client-2.6.0
User-agent: hloader
User-agent: httplib
User-agent: humanlinks
User-agent: ia_archiver
User-agent: ia_archiver/1.6
User-agent: larbin
User-agent: libWeb/clsHTTP
User-agent: looksmart
User-agent: lwp-trivial
User-agent: lwp-trivial/1.34
User-agent: moget
User-agent: moget/2.1
User-agent: psbot
User-agent: searchpreview
User-agent: spanner
User-agent: suzuran
User-agent: toCrawl/UrlDispatcher
User-agent: turingos
User-agent: voilabot BETA 1.2
# Upper-cases and numerics
User-agent: 216.34.209.23
User-agent: Alexibot
User-agent: Aqua_Products
User-agent: BackDoorBot/1.0
User-agent: BecomeBot
User-agent: BlowFish/1.0
User-agent: Bookmark search tool
User-agent: BotALot
User-agent: BuiltBotTough
User-agent: Bullseye/1.0
User-agent: BunnySlippers
User-agent: CheeseBot
User-agent: CherryPicker
User-agent: CherryPickerElite/1.0
User-agent: CherryPickerSE/1.0
User-agent: CopyRightCheck
User-agent: Crescent
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
User-agent: DittoSpyder
User-agent: Entireweb*
User-agent: EroCrawler
User-agent: ExtractorPro
User-agent: FairAd Client
User-agent: Firefox/1.0 (Windows; U; Win98; en-US; Localization; rv:1.4) Gecko/20030624 Netscape/7.1 (ax)
User-agent: Flaming AttackBot
User-agent: Foobot
User-agent: Gaisbot
User-agent: GetRight/4.2
User-agent: GetRight/5.2d
User-agent: Harvest/1.5
User-agent: InfoNaviRobot
User-agent: Iron33/1.0.2
User-agent: JennyBot
User-agent: Kenjin Spider
User-agent: Keyword Density/0.9
User-agent: LNSpiderguy
User-agent: LexiBot
User-agent: LinkScan/8.1a Unix
User-agent: LinkWalker
User-agent: LinkextractorPro
User-agent: MIIxpc
User-agent: MIIxpc/4.2
User-agent: Mata Hari
User-agent: Mediapartners-Google*
User-agent: Microsoft URL Control
User-agent: Microsoft URL Control - 5.01.4511
User-agent: Microsoft URL Control - 6.00.8169
User-agent: Mister PiX
User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
User-agent: NICErsPRO
User-agent: NetAnts
User-agent: Offline Explorer
User-agent: Openbot
User-agent: Openfind
User-agent: Openfind data gathere
User-agent: Oracle Ultra Search
User-agent: PerMan
User-agent: ProPowerBot/2.14
User-agent: ProWebWalker
User-agent: Python-urllib
User-agent: QueryN Metasearch
User-agent: RMA
User-agent: Radiation Retriever 1.1
User-agent: RepoMonkey
User-agent: RepoMonkey Bait & Tackle/v1.01
User-agent: SiteSnagger
User-agent: SpankBot
User-agent: SurveyBot/2.3 (Whois Source)
User-agent: Szukacz/1.4
User-agent: Teleport
User-agent: TeleportPro
User-agent: Telesoft
User-agent: The Intraformant
User-agent: TheNomad
User-agent: True_Robot
User-agent: True_Robot/1.0
User-agent: TurnitinBot
User-agent: URL Control
User-agent: URL_Spider_Pro
User-agent: URLy Warning
User-agent: VCI
User-agent: VCI WebViewer VCI WebViewer Win32
User-agent: WWW-Collector-E
User-agent: Web Image Collector
User-agent: WebAuto
User-agent: WebBandit
User-agent: WebBandit/3.50
User-agent: WebCopier
User-agent: WebEnhancer
User-agent: WebSauger
User-agent: WebStripper
User-agent: WebZip
User-agent: WebZip/4.0
User-agent: WebmasterWorld Extractor
User-agent: Website Quester
User-agent: Webster Pro
User-agent: Wget
User-agent: Wget/1.5.3
User-agent: Wget/1.6
User-agent: Xenu's
User-agent: Xenu's Link Sleuth 1.1c
User-agent: Zeus
User-agent: Zeus 32297 Webster Pro V2.9 Win32
User-agent: Zeus Link Scout
Disallow: /