User:Ralbot

General Information
Ralthor's bot.

I want to get into Python, so I figured writing some helpful WoWWiki bot scripts would be a fun way to learn. I'm not exactly sure of the bot policy here, but my bot will be nothing but helpful, and I will probably use it to read and compile information much more than to make edits. It currently runs only short scripts with me present and doesn't do anything by itself.

I did some work in natural language processing (NLP) when I was in college and have spent the last year helping out on an NLP project at work (when I can find the free time). I am pretty interested in the subject, so I might try to find some interesting things to do with it on WoWWiki.

I started by diving into the Python Wikipediabot Framework and learning from its code. However, this is probably not the best approach with a dynamically typed language. Luckily, I found a great free book on the web, Dive Into Python, which is designed for people with ample programming experience who are looking to learn Python.

Post comments about my bot on the talk page. If my bot is out of control or you need to get my attention immediately, you should also post a comment on my talk page to ensure that I see it as soon as possible.

Technical Information
My laptop's drive crashed. I froze it and got all the data off (the freezing trick really works). I didn't want to deal with reinstalling Windows (it's a pain), so I am now running Ubuntu.


 * OS: Ubuntu 6.06/AMD64
 * Python: 2.4.3
 * Python Wikipediabot Framework: Retrieved through CVS June 26, 2006

Using Python Wikipediabot Framework on WoWWiki
See Python Wikipediabot Framework for specific information about using the framework on non-Wikipedia sites running MediaWiki.

Family File
One file that needs to be created is a family file for the wiki you are working on, in this case wowwiki_family.py. It should be placed in the families directory under the installation directory.

    # -*- coding: utf-8  -*-
    import family

    # WoWWiki is a wiki dedicated to World of Warcraft

    class Family(family.Family):
        def __init__(self):
            family.Family.__init__(self)
            self.name = 'wowwiki'
            self.langs = {
                'en': 'www.wowwiki.com',
            }
            self.namespaces[4] = {
                '_default': u'WoWWiki',
            }
            self.namespaces[5] = {
                '_default': u'WoWWiki Talk',
            }

        def version(self, code):
            return "1.5.7"

        def path(self, code):
            return '/index.php'
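To show what the values in the family file are for, here is a minimal sketch of how a hostname and script path like the ones above could be combined into a page URL. It does not use the framework at all: page_url is a hypothetical helper of mine, and the way the framework builds its own requests may well differ. The only thing taken as given is MediaWiki's usual index.php?title=... URL form.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote        # Python 2

HOSTNAME = 'www.wowwiki.com'   # the value of langs['en'] in the family file
SCRIPT_PATH = '/index.php'     # the value returned by path()

def page_url(title, action=None):
    """Build an index.php URL for a page title (hypothetical helper)."""
    # MediaWiki page titles use underscores in place of spaces.
    encoded = quote(title.replace(' ', '_'))
    url = 'http://%s%s?title=%s' % (HOSTNAME, SCRIPT_PATH, encoded)
    if action:
        url += '&action=%s' % action
    return url
```

Calling page_url('WoWWiki:Sandbox', 'edit') gives the edit URL for the Sandbox, with the colon percent-encoded.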

Modifications to core Wikipediabot Framework
I still need to figure out exactly what these lines of code do, but so far I have edited the Sandbox fine without them. In wikipedia.py, lines 548-549:

    if not matchVersionTab:
        raise NoPage(self.site, self.aslink(forceInterwiki = True))

This check always fails on WoWWiki, and so far my initial experimentation has been fine with these lines removed. matchVersionTab is the result of a regular expression that tries to match a string in the text of the page it found, apparently to ensure the page actually exists. I will have to experiment with pages that don't exist and see whether it lets me attempt to edit them too. It may be that the string is different or doesn't exist on WoWWiki.
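As a rough illustration of the kind of check described above, here is a minimal sketch that looks for a marker string in fetched page HTML and raises an exception when it is missing. The marker pattern is a made-up placeholder (I have not confirmed what the framework actually matches), and the NoPage class is only a stand-in for the framework's exception. It does show why such a check would fail on every page if the wiki's HTML simply never contains the expected string.

```python
import re

class NoPage(Exception):
    """Stand-in for the framework's NoPage exception."""

# Hypothetical marker: a "history" tab in the page HTML.
# The framework's real pattern may be different.
HISTORY_TAB = re.compile(r'<li id="ca-history"')

def check_page_exists(html, title):
    """Return True if the marker is found, otherwise raise NoPage."""
    if not HISTORY_TAB.search(html):
        raise NoPage(title)
    return True
```

If the site's skin or MediaWiki version renders that tab differently, the search never matches and every page looks like it does not exist.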

Scripts
I will post the scripts I plan on using here. I currently have one copied from Wikipedia that edits the Sandbox. I almost have another one completed that will add a header to all pages in the Help namespace, but I am not sure that is actually what I want it to do (and I am a bit scared to actually let my beast loose to edit things).

Current Scripts
Current scripts are listed here so that people can see what I either have run, plan on running, or can run. If you find a problem with one of them please let me know.

User:Ralbot/addheader - Adds Help:Header to everything (that I want it to add it to) in the Help category.
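The core text change behind a script like this can be sketched without the framework. This is not the actual addheader script, just a minimal illustration: add_header is a hypothetical function, and it assumes the header is included by putting {{Help:Header}} at the top of the page text.

```python
HEADER = u'{{Help:Header}}'  # assumed transclusion syntax for Help:Header

def add_header(text, header=HEADER):
    """Return the page text with the header prepended, unless it is already there."""
    if header in text:
        # Already has the header, so leave the page alone.
        return text
    return header + u'\n' + text
```

Checking for the header first means running the script twice over the same page does not stack up duplicate headers.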

Future Scripts
These are scripts that I plan on writing:


 * Naming Convention Enforcement - Grabs the titles of all articles that are not redirects and are not in News, Items, Quests, NPCs, Player Characters, Guilds, Locations, or Lore. It then checks whether the title is more than one word long. If it is and any words are capitalized, it flags the title as a possible naming violation and suggests what page it should be moved to. It will also verify that Quest articles start with Quest: and whatever other naming rules I can fit in.
 * Category Plurality - Checks for category names that aren't plural. I am not sure if I will write a new isPlural function or feed the data to a Prolog function I already have.
 * Duplicate Article Checker - Gets all articles in the main namespace that aren't redirects and runs some very long comparisons to find duplicates. This one will probably have to be done from my Linux box, since I expect it to take a long, long time.
 * Automatic Item Page Creation - Not really a bot, but it could be. Right now the plan is to parse Thottbot and create an Item page based on Help:Item articles. It could automatically upload the page too, but I probably won't let it do that.
 * Guild Verifier - For all pages in the Guild category, verifies that a Guild tag exists.
 * Lonely Page Fixer - Gets the title of a lonely page, uses Google to search for that title, and returns a file with a list of all the pages that it might be linked from.
 * Dead End Fixer - Sees if an article exists for any word(s) in a dead end page and returns the possible fixes as a file.
 * Spam/External Link Finder - Searches article texts for external links and returns the results as an ordered list of pages, with the most external links first.
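Two of the ideas above, the naming check and the external link finder, can be sketched as plain text functions before any bot code is involved. This is only a rough sketch under my own assumptions: suggest_title and rank_by_external_links are hypothetical names, the naming check assumes that only words after the first should be lower case (so it will wrongly flag proper nouns), and the link pattern is a simple guess at what counts as an external link.

```python
import re

def suggest_title(title):
    """Return a suggested title if later words are capitalized, else None."""
    words = title.split()
    if len(words) < 2:
        return None  # single-word titles are not checked
    # Lower-case the first letter of every word after the first.
    fixed = [words[0]] + [w[:1].lower() + w[1:] for w in words[1:]]
    suggestion = ' '.join(fixed)
    if suggestion == title:
        return None
    return suggestion

# Assumed pattern: anything starting with http:// or https://,
# up to whitespace or a wiki/HTML delimiter.
EXTERNAL_LINK = re.compile(r'https?://[^\s\]|<>]+')

def rank_by_external_links(pages):
    """Take a dict of title -> text and return (title, count) pairs, most links first."""
    counts = [(title, len(EXTERNAL_LINK.findall(text)))
              for title, text in pages.items()]
    counts = [item for item in counts if item[1] > 0]
    counts.sort(key=lambda item: (-item[1], item[0]))
    return counts
```

Both functions only look at strings, so they can be tried on a handful of titles and page texts by hand before being pointed at the whole wiki.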