User:Mikk/Scripts

The following is a collection of Mikk's scripts for managing various things around the wiki, mostly (but not only) relating to Interface Customization.

Nearly all of the scripts require GNU AWK 3.1. A standalone Win32 "gawk.exe" is available in the UnxUtils package; as of this writing, gawk 3.1 is in UnxUpdates.zip.

(I tend to use AWK simply because of this single-exe download for Win32.)

Converting scripts to wikitext
Heh, I needed a script to show my scripts =)

BEGIN { print "<div style=\"max-width: 80em; margin-right: 2em; height: 20em; overflow: scroll;\">"; }

{
  gsub(/&/, "\\&amp;");
  gsub(/</, "\\&lt;");
  gsub(/>/, "\\&gt;");
  gsub(/{{/, "{\\&#x7b;");
  gsub(/''/, "'\\&#x27;");
  gsub(/\[\[/, "[\\&#x5b;");
  gsub(/http:/, "http\\&#x3a;");
  gsub("__", "_\\&#x5f;");
  gsub(/USERNAME *= *".*"/, "USERNAME = \"''YOURUSERNAME''\"");
  gsub(/PASSWORD *= *".*"/, "PASSWORD = \"''YOURPASSWORD''\"");
  print " " gensub(/^( *)(\t)/, "\\1 ", "g", $0);
}

END { print "</div>"; }
 * Usage: gawk -f wikifycode.awk input.txt > output.txt
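
To illustrate the kind of escaping the script performs, here is the same idea reduced to three sed substitutions (the sample input line is invented; the real script additionally escapes wiki markup such as {{, '', and [[):

```shell
# Escape &, < and > the way wikifycode.awk does; ampersands must go first
# so the entities introduced for < and > are not double-escaped.
printf '%s\n' 'if a < b then c = a & b end' \
  | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
# -> if a &lt; b then c = a &amp; b end
```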

In-game scanner
Install it as an addon, start WoW, and copy-and-paste the resulting text into "funcscan.txt".

Interface\Addons\!!!GlobFuncScan\!!!GlobFuncScan.toc
## Interface: 20300
## Title: Global Function Scanner
## Notes: Find all global WoW functions
## Author: Mikk
## SavedVariables:
GlobFuncScan.lua

Interface\Addons\!!!GlobFuncScan\GlobFuncScan.lua

--[[ Global Function Scanner addon by Mikk
     See http://www.wowwiki.com/User:Mikk/Scripts
     Up to date for WoW 2.4. Produces some extras but that gets filtered out by later scripts. ]]

if(not GlobFuncEdit) then
  GlobFuncEdit = CreateFrame("EditBox");
end
GlobFuncEdit:SetFontObject(GameFontHighlightSmall);
GlobFuncEdit:SetPoint("TOPRIGHT", UIParent, "TOPRIGHT", -10, -10);
GlobFuncEdit:SetPoint("TOPLEFT", UIParent, "TOPRIGHT", -250, -10);
GlobFuncEdit:SetHeight("500");
GlobFuncEdit:SetMultiLine(true);
GlobFuncEdit:SetScript("OnEscapePressed", function() this:Hide(); end);

local function funcaddr(func)
  return tonumber(strsub(tostring(func), 10), 16)
end

local refpoint = {}
local function point(funcname)
  refpoint[funcname] = funcaddr(_G[funcname])
end

point("DeclineGroup")
point("FlagTutorial")
point("ConvertToRaid")
point("FlagTutorial")
point("ShowLFG")
point("asin")
point("pairs")
point("AcceptQuest")

local res = {}
for k,v in pairs(_G) do
  if type(v)=="function" --[[ and strfind(k, "^_*[A-Za-z0-9]+$") or ]] then
    local addr = funcaddr(v)
    for _,refaddr in pairs(refpoint) do
      if abs(addr-refaddr) < 300000 then tinsert(res, k); break end
    end
  end
end
table.sort(res);
for k,v in pairs(refpoint) do
  table.insert(res, 1, format("# %-15s %10u (0x%08x)", k, v, v));
end
table.insert(res, "# END")

local str = table.concat(res, "\n");
DEFAULT_CHAT_FRAME:AddMessage("GlobFuncScan: Found " .. #res .. " functions. Total output length is " .. strlen(str) .. " bytes.");
GlobFuncEdit:SetText(str);
GlobFuncEdit:Show();
GlobFuncEdit:SetFocus(true);
GlobFuncEdit:HighlightText(0, 999999);

List creator
Runs outside of WoW. You need a standalone Lua interpreter.

funcscan.lua

wowexe = "C:/program files/world of warcraft/wow.exe";
funcscan = "funcscan.txt";

skip = {
  ["message"] = true,
  ["GetText"] = true,
}

forcedfuncs = {
  ["SortLFG"] = true,  -- this doesn't get picked up by the exe scanner like it should; it says "@ASortLFG". Doh.
}

funcs = {}
f = assert(io.open(funcscan, "rt"), "could not open '" .. funcscan .. "' for read");
for str in f:lines() do
  if not string.match(str, "^#") then
    table.insert(funcs, str);
  end
end
f:close();
table.sort(funcs);

f = assert(io.open(wowexe, "rb"), "could not open '" .. wowexe .. "' for read");
wow = f:read("*a");
f:close();

wowstrings = {}
for str in string.gmatch(wow, "[A-Za-z0-9_][A-Za-z0-9_][A-Za-z0-9_]+") do
  wowstrings[str] = true;
end

for _,str in ipairs(funcs) do
  if (wowstrings[str] and not skip[str]) or forcedfuncs[str] then
    print("* [[API "..str.."|"..str.."]]");
  else
    io.stderr:write("skipping ", str, "\n")
  end
end

Run as: lua funcscan.lua > globfunc.txt
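
The exe-scanning half of the script is essentially a strings(1)-style pass that keeps identifier-shaped tokens of three or more characters; a stand-in using shell tools on a fabricated binary blob:

```shell
# Fake "binary": NUL-separated strings, as found in an executable's data section.
printf 'junk\0GetItemInfo\0xx\0AcceptQuest\0' \
  | tr '\0' '\n' \
  | grep -E '^[A-Za-z0-9_]{3,}$'
# -> junk
#    GetItemInfo
#    AcceptQuest
```

Like the Lua version, this produces false positives ("junk"); the cross-check against the in-game scan output is what makes the result usable.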

Manual work

 * Run the boldizer (below) on the result
 * Paste into Global Function List
 * Manually list new/removed functions in the Changes section at the top of the page (hint: use the "Show Changes" button!)

Boldizing Global Function List entries
Copy wikitext contents to files:
 * Lua functions → luafuncs.txt
 * World of Warcraft API → wowapi.txt
 * Global Function List → globfunc.txt

(The only important thing is that the global function list file has "glob" somewhere in its name, and that it is the last file in the list)
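
The per-file switching the script relies on can be sketched in a few lines of awk (throwaway files in the current directory; only the "glob" substring in the name matters):

```shell
printf 'x\n' > luafuncs.txt
printf 'y\n' > globfunc.txt
# Re-evaluate the flag whenever a new input file starts, the same idea as
# the FILENAME != LastFileName rule in boldizeglobfuncs.awk.
awk 'FNR==1 { isglob = (tolower(FILENAME) ~ /glob/) }
     { print FILENAME ": " (isglob ? "globals file" : "reference file") }' \
  luafuncs.txt globfunc.txt
# -> luafuncs.txt: reference file
#    globfunc.txt: globals file
```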

FILENAME!=LastFileName {
  LastFileName = FILENAME;
  IsGlobals = ( tolower(FILENAME) ~ /glob/ );
}

!IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  # Remember that we have seen this function
  gsub("_", " ", a[1]);
  api[a[1]] = 1;
}

IsGlobals {
  # Fix dates occurring in the global function list
  if(/ *Functions in bold are not .* as of .*/) {
    gsub(/ as of .*/, strftime(" as of %d %B %Y"));
  }

  # Boldize (or not) depending on whether we have seen the API in use
  if(match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a)) {
    gsub("_", " ", a[1]);
    if(!api[a[1]])
      $0 = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "'''[[\\2]]'''", 1);
    else
      $0 = gensub(/(''')?\[\[(API[_ ].*)\]\](''')?/, "[[\\2]]", 1);
  }

  print;
}
 * Usage: gawk -f boldizeglobfuncs.awk luafuncs.txt wowapi.txt globfunc.txt > globfunc-fixed.txt
 * Will correctly boldize Global Function List entries that do not occur in earlier files
 * Assumes that the global functions list file has "glob" somewhere in its name (and the others do not)

Finding non-existent APIs
This can be used to find non-existent (typoed or removed) APIs in e.g. World of Warcraft API, Widget API or Lua functions.

Copy wikitext contents to files:
 * Global Function List &rarr; globfunc.txt
 * Whatever page you want to test &rarr; somefile.txt

(The only important thing is that the global function list file has "glob" somewhere in its name, and that it is the first file in the list)

FILENAME!=LastFileName {
  LastFileName = FILENAME;
  IsGlobals = ( tolower(FILENAME) ~ /glob/ );
}

IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  gsub("_", " ", a[1]);
  api[a[1]] = 1;
}

!IsGlobals && match($0, /\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  gsub("_", " ", a[1]);
  if(!api[a[1]])
    print;
}
 * Usage: gawk -f findbadfuncs.awk globfunc.txt somefile.txt
 * Finds functions that are NOT mentioned in the Global Function List
 * Will also complain about:
   - FrameXML object methods
   - FrameXML variables
 * Just ignore those for now; there are not too many of them (yet)

Mismatching descriptions of duplicate API entries
This script finds duplicate API entries whose description or argument list differs. You can of course give it multiple files, and it will detect duplicates across files.

Copy wikitext contents to files:
 * Whatever page you want to test &rarr; somefile.txt
 * (World of Warcraft API, Widget API, Lua functions)

$0!~/<!--/ && match($0, /^ *[:*] *\w* *\[\[(API[_ ][A-Za-z0-9_ ]+)/, a) {
  gsub("_", " ", a[1]);
  posstr = sprintf("%s:%4u: ", FILENAME, NR);
  # M$: posstr = sprintf("%s(%4u): ", FILENAME, NR);
  if(pos[a[1]]) {
    if(line[a[1]]==$0) {
      equal++;
      next;   # Comment out this line if you want to see equal dups too
    }
    else
      nonequal++;
    print posstr $0 "\n" pos[a[1]] line[a[1]]
  }
  else {
    pos[a[1]] = posstr;
    line[a[1]] = $0;
  }
}

END {
  print ""
  print "Total: "
  print " " equal " equal duplicates."
  print " " nonequal " nonequal duplicates."
}
 * Usage: gawk -f findduplicates.awk somefile.txt
 * Will output duplicate API links where the text does NOT match, using a formatting that becomes clickable in most IDEs.
 * (Visual Studio will probably want "filename.ext(123):" though. Change the posstr line in the script.)
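
The core of the duplicate check, reduced to a toy: remember the first line seen for each key and report later lines that reuse the key with different text (the input lines here are invented):

```shell
printf '%s\n' \
  'GetItemInfo - returns item info' \
  'GetMoney - returns money' \
  'GetItemInfo - returns item data' \
  | awk '{ key = $1 }
         key in line { if (line[key] != $0) print "mismatch: " $0; next }
         { line[key] = $0 }'
# -> mismatch: GetItemInfo - returns item data
```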

Village pump summary to Portal:Community
This will only work under Unix (Linux). It requires that curl is installed.

BEGIN {
  MAXUSERS = 4;
  MAXSUBJS = 5;
  USERNAME = "YOURUSERNAME";
  PASSWORD = "YOURPASSWORD";
}

function monthnum(name) {
  if(name ~ /[Jj]an/) return 1;
  if(name ~ /[Ff]eb/) return 2;
  if(name ~ /[Mm]ar/) return 3;
  if(name ~ /[Aa]pr/) return 4;
  if(name ~ /[Mm]ay/) return 5;
  if(name ~ /[Jj]un/) return 6;
  if(name ~ /[Jj]ul/) return 7;
  if(name ~ /[Aa]ug/) return 8;
  if(name ~ /[Ss]ep/) return 9;
  if(name ~ /[Oo]ct/) return 10;
  if(name ~ /[Nn]ov/) return 11;
  if(name ~ /[Dd]ec/) return 12;
  return 0;
}

function dateval(datestr,  t) {
  split(datestr, t, " ");
  return (int(t[3])*10000)+(monthnum(t[2])*100)+int(t[1])
}

function storesubj(subj, users) {
  if(subj=="") return;
  # Sort and count users
  asort(users);
  n=0; for(u in users) n++;
  # Extract last MAXUSERS as a single line
  a[1]=0;
  userstr = "";
  i=n+1-MAXUSERS; if(i<1) i=1;
  for(; i<=n; i++) {
    split(users[i], a, SUBSEP);
    ################# THIS IS THE FORMAT OF USER RECORDS ####################
    if(userstr!="") userstr = userstr " &middot; ";
    userstr = userstr a[2] " " a[3];
  }
  # Store
  subjs[subj] = a[1] SUBSEP subj SUBSEP userstr;
}

BEGIN {
  IGNORECASE=1;
  cmd="curl --max-time 180 --header 'Expect:' http://www.wowwiki.com/Special:Export/WoWWiki_talk:Village_pump";
  while((cmd | getline $0)>0) {
    if($0 ~ /<text /) cap=1;
    if(!cap) continue;
    if(/<\/text>/) cap=0;
    gsub("&lt;", "<");
    gsub("&gt;", ">");
    gsub("&quot;", "\"");
    gsub("&amp;", "\\&");
    if(match($0, /^=+ *([^=]+) *=+/, a)) {
      newsubj=a[1];
      storesubj(subj, users);
      subj=newsubj;
      delete users;
    }
    if(match($0, /(\[\[user:[^\]]+\]\]) .* ([0-9][0-9]? \w+ 20[0-9][0-9])/, a) && subj!="") {
      if(dateval(a[2]) > int(users[a[1]]))
        users[a[1]] = dateval(a[2]) SUBSEP a[1] SUBSEP a[2];
    }
    if(!cap)
      break;
  }
  close(cmd);
  storesubj(subj, users);

  # Now, extract the latest MAXSUBJS subjects and output
  asort(subjs);
  n=0; for(s in subjs) n++;
  out = "";
  i=n+1-MAXSUBJS; if(i<1) i=1;
  for(; i<=n; i++) {
    split(subjs[i], a, SUBSEP);
    ################## THIS IS THE FORMAT OF SUBJECT RECORDS ################
    out = out "* <b>" a[2] "</b>\n";
    out = out ":" a[3] "\n";
    out = out "\n";
  }

  # Final touches to the output
  print "<!-- This is automatically generated. Editing it might be pointless. -->" > "tmp";
  print out > "tmp";
  close("tmp");

  # Login
  print "\n\n\n- LOGGING IN -\n";
  cmd="curl --max-time 180 --header 'Expect:' --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --cookie-jar cookiejar | grep -A 15 '<body'";
  print "" | cmd;
  close(cmd);

  # Grab hold of the wpEditToken
  print "\n\n\n- GETTING wpEditToken -\n";
  cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki_talk:Village_pump/Summary&action=edit' --cookie cookiejar";
  while((cmd | getline $0)>0) {
    if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a)) wpEditToken = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a)) wpEdittime = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a)) wpStarttime = a[1];
  }
  close(cmd);
  if(wpEditToken=="") {
    print "ERROR: wowwiki wouldn't give me its wpEditToken!";
    exit(1);
  }

  # Post the page
  print "\n\n\n- POSTING CHANGES -\n";
  cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki_talk:Village_pump/Summary&action=submit' --cookie cookiejar --form 'wpTextbox1=<tmp' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime=" wpEdittime "' --form 'wpStarttime=" wpStarttime "' --form 'wpEditToken=" wpEditToken "'";
  print cmd;
  print "" | cmd;
  close(cmd);

  # Force the Community portal to refresh
  print "\n\n\n- PURGING COMMUNITY PORTAL -\n";
  cmd="curl --max-time 180 --header 'Expect:' 'http://www.wowwiki.com/index.php?title=WoWWiki:Community_portal&action=purge' | grep -A 50 'Recent talk in the'";
  print "" | cmd;
  close(cmd);
}
 * #!/usr/bin/gawk -f
 * Goddamn, why did I get the idea to write this in Awk? It is SO not up to the task. Should have used PHP or Perl instead. --Mikk
 * Grmbl. Imperial date crap. This would have been a one-line string comparison with ISO dates.
 * Wants "12 May 2006", "9 Jun 2006"
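
The date complaint boils down to sorting: "12 May 2006" cannot be compared as a string, so the script's dateval() packs it into a yyyymmdd integer. A compact standalone variant (using index() instead of the script's twelve regex tests):

```shell
# Pack "d Mon yyyy" into a sortable integer: year*10000 + month*100 + day.
awk 'function monthnum(m) { return (index("JanFebMarAprMayJunJulAugSepOctNovDec", substr(m, 1, 3)) + 2) / 3 }
     { split($0, t, " "); print t[3]*10000 + monthnum(t[2])*100 + t[1] }' <<'EOF'
12 May 2006
9 Jun 2006
EOF
# -> 20060512
#    20060609
```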

Events/A et al to Events/Action Bar et al and Events/Names
Automatically generates the events-by-category pages and the simple name list from the alphabetically indexed pages (the originals). It is run a couple of times daily.

This will only work under Unix (Linux). It requires that curl is installed.

BEGIN {
  USERNAME = "YOURUSERNAME";
  PASSWORD = "YOURPASSWORD";
  KnownCat["Action Bar"] = 1; KnownCat["Auction"] = 1; KnownCat["Bank"] = 1; KnownCat["Battleground"] = 1; KnownCat["Buff"] = 1;
  KnownCat["Combat"] = 1; KnownCat["Communication"] = 1; KnownCat["Death"] = 1; KnownCat["GlueXML"] = 1; KnownCat["Guild"] = 1;
  KnownCat["Honor"] = 1; KnownCat["Instance"] = 1; KnownCat["Item"] = 1; KnownCat["Loot"] = 1; KnownCat["Mail"] = 1;
  KnownCat["Map"] = 1; KnownCat["Misc"] = 1; KnownCat["Movement"] = 1; KnownCat["Party"] = 1; KnownCat["Pet"] = 1;
  KnownCat["Player"] = 1; KnownCat["Quest"] = 1; KnownCat["Skill"] = 1; KnownCat["Spell"] = 1; KnownCat["System"] = 1;
  KnownCat["Tooltip"] = 1; KnownCat["Trade"] = 1; KnownCat["Tradeskill"] = 1; KnownCat["Trainer"] = 1; KnownCat["Unit Info"] = 1;
  curl = "curl --max-time 180 --silent --show-error --cookie-jar cookiejar --cookie cookiejar --header 'Expect:' ";
  RS = "\r?\n";
}

function TitleEncode(page) {
  gsub(" ", "_", page);
  gsub("&", "%26", page);
  gsub("?", "%3f", page);
  gsub(/\(/, "%28", page);
  gsub(/\)/, "%29", page);
  return page;
}

function GetPage(page,   cmd,cap,ret) {
  page = TitleEncode(page);
  ret = ""
  cmd = curl "'http://www.wowwiki.com/Special:Export/" page "'";
  print cmd;
  while((cmd | getline $0)>0) {
    if($0 ~ /<text /) { gsub(/.*<text [^>]*>/, ""); cap=1; }
    if(!cap) continue;
    if(/<\/text>/) { gsub(/<\/text>.*/, ""); ret = ret $0; break; }
    ret = ret $0 "\n";
  }
  close(cmd);
  gsub("&gt;", ">", ret);
  gsub("&lt;", "<", ret);
  gsub("&quot;", "\"", ret);
  gsub("&amp;", "\\&", ret);
  return ret;
}

function Login() {
  headers = "headers.tmp";
  print "" > headers;
  close(headers);
  cmd = curl " --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --dump-header " headers " > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
  while((getline < headers)>0) {
    if(/^Set-Cookie: wowwikiUserID=[0-9]/) {
      close(headers)
      return;	# success!
    }
  }
  print "ERROR: Login failure!";
  exit(1);
}

function PutPage(page, content) {
  page = TitleEncode(page);
  # Grab hold of the wpEditToken
  cmd = curl " 'http://www.wowwiki.com/index.php?title=" page "&action=edit'";
  print cmd;
  while((cmd | getline $0)>0) {
    if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a)) wpEditToken = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a)) wpEdittime = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a)) wpStarttime = a[1];
  }
  close(cmd);
  if(wpEditToken=="") {
    print "ERROR: wowwiki wouldn't give me its wpEditToken!";
    exit(1);
  }
  # Post the page
  tmpfile = "post.tmp";
  print content > tmpfile;
  close(tmpfile);
  cmd = curl " 'http://www.wowwiki.com/index.php?title=" page "&action=submit' --form 'wpTextbox1=<" tmpfile "' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime=" wpEdittime "' --form 'wpStarttime=" wpStarttime "' --form 'wpEditToken=" wpEditToken "'";
  print cmd;
  print "" | cmd;
  close(cmd);
}

function PurgePage(page) {
  page = TitleEncode(page);
  cmd = curl " 'http://www.wowwiki.com/index.php?title=" page "&action=purge' > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
}

BEGIN {
  Login();
  for(i=0x41; i<=0x5a; i++) {
    c = sprintf("%c", i);
    if(1) {
      txt = GetPage("Events/" c);
    }
    else {
      txt="";
      while((getline < c)>0) txt = txt $0 "\n";
      close(c);
    }
    # Trim out the inserted dummy headers that are only there to get [Edit] links at regular intervals
    gsub(/[ \n]*\|}[ \n]*===+ +===+[ \n]*{\|[ \n]*/, "", txt);
    # Split on "{{evt" to get one event per array entry
    n = split(txt, a, /{{evt/);
    AllEvents = AllEvents "\n== " c " ==\n"
    for(ei=1; ei<=n; ei++) {
      if(match(a[ei], /\|([A-Z_]+)\|([a-zA-Z_, ]+)}}( *\n)+/, parms)) {
        name = parms[1]
        header = substr(a[ei], 1, RSTART+RLENGTH);
        # print "Event: '" name "' Categories:'" parms[2] "'";
        gsub(/^ */, "", parms[2]);
        gsub(/ *$/, "", parms[2]);
        split(parms[2], categories, / *, */);
        # for(ci in categories) print "C: '" categories[ci] "'"
        txt = substr(a[ei], RSTART+RLENGTH);
        gsub(/( *\n)*$/, "", txt);
        # print txt;
        AllEvents = AllEvents "\n* " name " &nbsp; &rarr; [[Events/" c "|" c "]] <small>"
        for(ci in categories) {
          cat = categories[ci];
          if(!KnownCat[cat])
            print "Warning: '" name "': Unknown category: '" cat "'";
          else {
            CatPage[cat] = CatPage[cat] "{{evt|" name "|" parms[2] "}}\n\n" txt "\n\n\n"
            AllEvents = AllEvents "&middot; [[Events/" cat "|" cat "]] ";
          }
        }
        AllEvents = AllEvents "</small>\n"
      }
    } # END for(ei...)
  }
  if(1) for(cat in CatPage) {
    PutPage("Events/" cat,
      "__NOTOC____NOEDITSECTION__{{eventlistheader}}\n" \
      "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
      "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
      "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
      "\n\n" \
      ":{{icon-information}}Note that this page is automatically generated; editing it is pointless. To edit event descriptions, edit the entries in the alphabetical pages, e.g. [[Events/A]], [[Events/B]], etc. Changes there will be copied over to here within a few hours.\n" \
      "\n\n" \
      "== " cat " related events ==\n\n" \
      CatPage[cat] );
  }
  PutPage("Events/Names",
    "__NOEDITSECTION__{{tocright}}[[Category:API Events| ]]\n" \
    "<!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->\n" \
    AllEvents );
}
 * #!/usr/bin/gawk -f
 * TitleEncode - replaces a few strategic characters that are likely to show up and cause problems; not a full URL encoder.
 * GetPage - gets the full contents of a page via Special:Export
 * Login - logs in. Will exit(1) on failure.
 * PutPage - pushes a new revision of the given page
 * PurgePage - does "action=purge" on a page to pull in changes in templates
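
What TitleEncode does, shown as an equivalent sed pipeline (it is deliberately not a full URL encoder; the sample title is invented):

```shell
# Space -> _, then percent-encode the handful of characters that break
# the curl command lines: & ? ( )
printf '%s\n' 'Events/Unit Info (2.4)?' \
  | sed -e 's/ /_/g' -e 's/&/%26/g' -e 's/?/%3f/g' -e 's/(/%28/g' -e 's/)/%29/g'
# -> Events/Unit_Info_%282.4%29%3f
```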

RC bot
Takes Special:RecentChanges, reformats it into WoWWiki:RC with the users listed on WoWWiki:RC/Skip removed, and also analyzes the diffs to try to find vandals / abusive text.

This will only work under Unix (Linux). It requires that curl is installed.

BEGIN {
  USERNAME = "YOURUSERNAME";
  PASSWORD = "YOURPASSWORD";
  curl = "curl --max-time 180 --silent --show-error --cookie-jar cookiejar --cookie cookiejar --header 'Expect:' ";
  RS = "\r?\n";
}

function TitleEncode(page) {
  gsub(" ", "_", page);
  gsub("&", "%26", page);
  gsub("?", "%3f", page);
  gsub(/\(/, "%28", page);
  gsub(/\)/, "%29", page);
  return page;
}

function GetPage(page,   cmd,cap,ret) {
  page = TitleEncode(page);
  ret = ""
  cmd = curl "'http://www.wowwiki.com/Special:Export/" page "'";
  print cmd;
  while((cmd | getline $0)>0) {
    if($0 ~ /<text /) { gsub(/.*<text [^>]*>/, ""); cap=1; }
    if(!cap) continue;
    if(/<\/text>/) { gsub(/<\/text>.*/, ""); ret = ret $0; break; }
    ret = ret $0 "\n";
  }
  close(cmd);
  gsub("&gt;", ">", ret);
  gsub("&lt;", "<", ret);
  gsub("&quot;", "\"", ret);
  gsub("&amp;", "\\&", ret);
  return ret;
}

function Login() {
  headers = "headers.tmp";
  print "" > headers;
  close(headers);
  cmd = curl " --location 'http://www.wowwiki.com/index.php?title=Special:Userlogin&action=submitlogin' --form 'wpName=" USERNAME "' --form 'wpPassword=" PASSWORD "' --form 'wpLoginattempt=Log in' --dump-header " headers " > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
  while((getline < headers)>0) {
    if(/^Set-Cookie: wowwikiUserID=[0-9]/) {
      close(headers)
      return;	# success!
    }
  }
  print "ERROR: Login failure!";
  exit(1);
}

function PutPage(page, content) {
  page = TitleEncode(page);
  # Grab hold of the wpEditToken
  cmd = curl " 'http://www.wowwiki.com/index.php?title=" page "&action=edit'";
  print cmd;
  while((cmd | getline $0)>0) {
    if(match($0, /value="([0-9a-f\\]+)" name="wpEditToken"/, a)) wpEditToken = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpEdittime"/, a)) wpEdittime = a[1];
    if(match($0, /value="([0-9a-f]+)" name="wpStarttime"/, a)) wpStarttime = a[1];
  }
  close(cmd);
  if(wpEditToken=="") {
    print "ERROR: wowwiki wouldn't give me its wpEditToken!";
    exit(1);
  }
  # Post the page
  tmpfile = "post.tmp";
  print content > tmpfile;
  close(tmpfile);
  cmd = curl " 'http://www.wowwiki.com/index.php?title=" page "&action=submit' --form 'wpTextbox1=<" tmpfile "' --form 'wpSummary=automated upload' --form 'wpMinoredit=0' --form 'wpSave=Save page' --form 'wpSection=' --form 'wpEdittime=" wpEdittime "' --form 'wpStarttime=" wpStarttime "' --form 'wpEditToken=" wpEditToken "'";
  print cmd;
  print "" | cmd;
  close(cmd);
}

function PurgePage(page) {
  page = TitleEncode(page);
  cmd = curl " 'http://www.wowwiki.com/index.php?title=" page "&action=purge' > /dev/null";
  print cmd;
  print "" | cmd;
  close(cmd);
}

function GetRCSummary(rcid, diffurl) {
  if(!RCs[rcid]) {
    dellines=0; lol=0; lolol=0; url=0; tinyurl=0; gold=0; powerlevel=0; ninja=0; pwn=0; noob=0; insult=0; fuck=0; penis=0; pussy=0; ass=0; blows=0; gay=0;
    gsub(/&amp;/, "\\&", diffurl);
    gsub(/'/, "%27", diffurl);
    gsub(/"/, "%22", diffurl);
    gsub(/\{/, "%7b", diffurl);
    gsub(/\|/, "%7c", diffurl);
    gsub(/\}/, "%7d", diffurl);
    rccmd = curl " '" diffurl "'";
    # print rccmd;
    bOK=0
    while((rccmd | getline)>0) {
      bOK=1
      if(/<td class='diff-deletedline'>/ && /<td class='diff-addedline'><\/td>/)
        dellines++;
      if(match($0, /<td class='diff-addedline'>(.*)<\/td>/, a)) {
        $0 = a[1];
        # print "ADDED: \"" $0 "\""
        oldIC = IGNORECASE;
        IGNORECASE = 1;
        if(/\yl+o+l+\y/)  lol++;
        if(/lo+l+o+l/)  lolol++;
        if(/la+w+l/)  lolol++;
        if(/fuck/) fuck++;
        if(/tinyurl\.com/)  tinyurl++;
        if(/http:\/\// && (! /thottbot.com/) && (! /allakhazam.com/) && (! /worldofwarcraft.com/))  url++;
        if(/gold/)  gold++;
        if(/power.?level/)  powerlevel++;
        if(/ninja/)  ninja++;
        if(/pwn/) pwn++;
        if(/n[o0][o0]+b/ || /\ynub/) noob++;
        if(/idiot/ || /moron/ || /bastard/ || /[a@][s$]+[- ]?h[o0]le/ || /schmuck/ || /\yween(ie|s)/) insult++;
        if(/f[ue]ck/ || /f[*]ck/ || /f00k/) fuck++;
        if(/\ydick/ || /p[e3]+n[i1l]+[s$]/ || /\yc[o0]+ck/) penis++;
        if(/pu[s$][s$]y/ || /pu[s$][s$][i1l][e3][s$]/ || /cu*nt/ || /c[*]nt/) pussy++;
        if(/\y[a@]ss\y/ || /\y[a@]rse\y/ || /\y[a@]sses\y/ || /[a@][s5][s5]h[a@]t/ || /[a@][s5][s5]c[l1][o0]wn/ || /[a@][s5][s5].?h[o0][l1i][e3]/) ass++;
        if(/blows/) blows++;
        if(/ga+y/ || /gh[3e]y/ || /g[3e]hy/) gay++;
        IGNORECASE = oldIC;
      }
    }
    close(rccmd);
    if(bOK) {
      txt = " ";
      if(dellines>5) txt = txt " &middot; " dellines " deleted lines";
      if(lol) txt = txt " &middot; \"lol\" x " lol;
      if(lolol) txt = txt " &middot; \"lolol\" etc x " lolol;
      if(url) txt = txt " &middot; \"http://\" x " url;
      if(tinyurl) txt = txt " &middot; \"tinyurl.com\" x " tinyurl;
      if(gold) txt = txt " &middot; \"gold\" x " gold;
      if(powerlevel) txt = txt " &middot; \"power level\" x " powerlevel;
      if(ninja) txt = txt " &middot; \"ninja\" x " ninja;
      if(pwn) txt = txt " &middot; \"pwn\" x " pwn;
      if(noob) txt = txt " &middot; \"noob\" x " noob;
      if(insult) txt = txt " &middot; \"idiot\" etc x " insult;
      if(fuck) txt = txt " &middot; \"fuck\" etc x " fuck;
      if(penis) txt = txt " &middot; \"penis\" etc x " penis;
      if(pussy) txt = txt " &middot; \"pussy\" etc x " pussy;
      if(ass) txt = txt " &middot; \"ass\" etc x " ass;
      if(blows) txt = txt " &middot; \"blows\" x " blows;
      if(gay) txt = txt " &middot; \"gay\" etc x " gay;
      RCs[rcid] = txt;
      print rcid "\t'" txt "'";
      print rcid "\t" txt >> "rcs.txt";
      fflush("rcs.txt");
    }
    else {
      return "(error getting diff)";
    }
  }
  if(RCs[rcid]==" ") return "";
  return RCs[rcid];
}

BEGIN {
  # Read summaries of previously scanned recent changes
  rcfile = "rcs.txt";
  while((getline < rcfile)>0) {
    split($0, a, "\t");
    RCs[a[1]] = a[2];
  }
  close(rcfile);

  # Log in :-)
  Login();

  # Get list of users to skip
  txt = GetPage("WoWWiki:RC/Skip");
  n = split(txt, a, /\n/);
  for(i=1; i<=n; i++) {
    name = a[i];
    gsub(/^\* +/, "", name);
    gsub(/ *$/, "", name);
    aSkip[name] = 1;
  }

  res = "<noinclude><!-- DO NOT EDIT. THIS IS AN AUTOMATICALLY GENERATED PAGE. -->{{nocat}}</noinclude>\n\n";

  ######## Retrieve and parse the recent changes page
  rc_tmp = "rc.tmp";
  # Retrieve to a temp file or the operation times out while we're scanning changed pages
  system(curl " 'http://www.wowwiki.com/index.php?title=Special:Recentchanges&limit=1500&hidepatrolled=1&days=3' > " rc_tmp);
  while((getline < rc_tmp)>0) {
    time=0;
    if(match($0, /<h4>(.*)<\/h4>/, a))
      res = res "\n\n=== " a[1] " ===\n";
    if(match($0, /(<\/tt>|<strong>)<a href="\/([^"]*)"[^>]*>([^<]*)<\/a>/, a)) {
      pagelink=a[2]; pagetitle=a[3]; subitem=0;
    }
    else if(match($0, /<tt><a href="\/([^"]*)"[^>]*>([^<]*)<\/a>/, a)) {
      pagelink=a[1]; pagetitle=a[2]; time=" "; subitem=1;
    }
    else
      continue;
    if(subitem && bSkip)
      continue;
    pageurl="";
    if(pagelink ~ /^index.php/) {
      pageurl = "http://www.wowwiki.com/" pagelink
      page = "[" pageurl " " pagetitle "]";
    }
    else {
      gsub(/^[Cc]ategory:/, ":Category:", pagelink);
      gsub(/^[I]mage:/, ":Image:", pagelink);
      page = "[[" pagelink "|" pagetitle "]]";
    }
    if(match($0, /<a href="([^"]*)"[^>]*>diff<\/a>/, a))
      diffurl = "http://www.wowwiki.com" a[1];
    else if(match($0, /<a href="([^"]*)"[^>]*>last<\/a>/, a))
      diffurl = "http://www.wowwiki.com" a[1];
    else if(match($0, /<a href="([^"]*)"[^>]*>changes<\/a>/, a))
      diffurl = "http://www.wowwiki.com" a[1];
    else
      diffurl = "";
    if(match($0, /<a href="([^"]*)"[^>]*>hist<\/a>/, a))
      histurl = "http://www.wowwiki.com" a[1];
    else if(match($0, /<a href="([^"]*)"[^>]*>Page history<\/a>/, a))
      histurl = "http://www.wowwiki.com" a[1];
    else
      histurl = "";
    if(match($0, /<a href="([^"]*)"[^>]*>cur<\/a>/, a))
      curdiffurl = "http://www.wowwiki.com" a[1];
    else
      curdiffurl = "";
    if(!time) {
      if(!match($0, / ([0-9][0-9]:[0-9][0-9]:[0-9][0-9]) /, a))
        continue;
      time = "<tt>" a[1] "</tt>"
    }
    user="";
    usertxt="";
    if(match($0, /\. \. <a [^>]*title="User:([^"]*)"/, a)) {
      user = a[1];
      usertxt = "[[User:" user "|" user "]]";
    }
    else if(match($0, /<span class="changedby">\[(.*)\]<\/span>/, a)) {
      usertxt = "[ <small>" gensub(/<a [^>]*title="([^"]*)"[^>]*>([^<]*)<\/a>/, "[[\\1|\\2]]", "g", a[1]) "</small> ]";
    }
    # See if we only have skippable users in this bundle (which may not be a bundle; it can be a single entry, but that doesn't matter)
    if(!subitem) {
      bSkip = 1;
      split(usertxt, a, /\[\[/);
      for(i in a) {
        if(match(a[i], /User:(.*)\|/, aa)) {
          if(!aSkip[aa[1]]) {
            bSkip = 0;
          }
        }
      }
    }
    if(bSkip)
      continue;
    if(user)
      print "User: " user "  Title: " pagetitle;
    else
      print "--- Title: " pagetitle;
    if(match($0, /<span class='comment'>\((.*)\)<\/span>/, a)) {
      comment = a[1];
      gsub(/<a [^>]*>/, "", comment);
      gsub(/<\/a>/, "", comment);
      gsub(/{/, "\\&#x7b;", comment);
      gsub(/\[\[Category:/, "[[:Category:", comment);
      comment = "&nbsp; (" comment ")"
    }
    else
      comment = "";
    # Get recent change summary (cached or new). This mucks up $0 so keep it last.
    rcurl = diffurl;
    if(pageurl ~ /rcid=/) rcurl = pageurl;
    if(match(rcurl, "rcid=([0-9]+)", a)) {
      rctxt = GetRCSummary(a[1], rcurl);
      if(rctxt != "")
        rctxt = "&nbsp; &nbsp; &nbsp; &nbsp; <b style=\"border-bottom: 1px dotted;\">" rctxt "</b>";
    }
    else {
      rctxt = "";
    }
    if(subitem) {
      res = res ":::* " page " . . <small>[" diffurl " diff] . [" curdiffurl " cur]</small> . . " usertxt " <small>" comment "</small>" rctxt "\n"
    }
    else
      res = res "* " time " " page " . . <small>[" diffurl " diff] . [" histurl " hist]</small> . . " usertxt " <small>" comment "</small>" rctxt "\n"
    fflush();
    if(length(res) >= 256000) {
      res = res "\n''RC output length exceeds 256 KB. Ignoring older entries. To see older entries, mark some of the above changes as patrolled and older ones will start appearing on the next update.''\n";
      break;
    }
  }
  close(rc_tmp);
  PutPage("WoWWiki:RC/Content", res);
  PurgePage("WoWWiki:RC");
}
 * #!/usr/bin/gawk -f
 * TitleEncode - replaces a few strategic characters that are likely to show up and cause problems; not a full URL encoder.
 * GetPage - gets the full contents of a page via Special:Export
 * Login - logs in. Will exit(1) on failure.
 * PutPage - pushes a new revision of the given page
 * PurgePage - does "action=purge" on a page to pull in changes in templates
 * GetRCSummary - retrieves a short summary of changes

Rich formatting
BEGIN {
  RS="\r?\n";
  sectidx=0;
  BIGBIG = "[SIZE=4]"
  _BIGBIG = "[/SIZE]"
  BIG = "[SIZE=3]"
  _BIG = "[/SIZE]"
  TT = "[FONT=Courier New][COLOR=LightBlue]"
  _TT = "[/COLOR][/FONT]"
}

/^ *$/ && skipblanklines { next }
skipblanklines { skipblanklines = 0 }

match($0, /^(=+) *([^=]+) *=+(.*)/, a) {
  if(a[1]=="=") {
    sectidx = sectidx + 1;
    outidx = outidx "[b]" sectidx ". " a[2] "[/b]\n";
    $0 = "" BIGBIG "[u][b]" sectidx ". " a[2] "[/b][/u]" _BIGBIG a[3];
  }
  else if(a[1]=="==") {
    outidx = outidx "... - " a[2] "\n";
    $0 = "" BIG "[u][b]" a[2] "[/b][/u]" _BIGBIG a[3];
  }
  else
    $0 = "[u][b]" a[2] "[/b][/u]" a[3];
}

/'''/ { while(/'''/) { sub(/'''/, "[b]"); if(!sub(/'''/, "[/b]")) $0 = $0 "[/b]"; } }
/''/ { while(/''/) { sub(/''/, "[i]"); if(!sub(/''/, "[/i]")) $0 = $0 "[/i]"; } }

/^{{faqq}}/ { sub(/^{{faqq}} */, "[b]" BIG "[COLOR=Orange]Q:[/COLOR]" _BIG " "); $0 = $0 "[/b]"; }
/^{{faqa}}/ { sub(/^{{faqa}} */, "[b]" BIG "A:" _BIG "[/b] "); }

/^:/ { sub(/^::/, "        "); sub(/^:/, "   "); }
/^;.*:/ { $0 = gensub(/^;([^:]+):(.*)/, "[b]\\1[/b]\n\\2", "1", $0); }
/^;/ { $0 = gensub(/^;([^:]+)/, "[b]\\1[/b]", "1", $0); }

/^[^#]/ { count=0; }
/^#/ { count=count+1; sub(/^# */, count ". "); }

/\[http:\/\// { $0 = gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "[URL=\\1]\\2[/URL]", "g", $0); }

/{{/ {
  gsub(/{{[Ee]xample\/Begin}}/, "[quote]");
  gsub(/{{[Ee]xample\/End}}/, "[/quote]");
  $0 = gensub(/{{[Ff]aqcredit[|]([^}]+)}}/, "[i](credit: \\1)[/i]", "g", $0);
}

/\[\[#/ { $0 = gensub(/\[\[#[^|]+\|([^\]]+)\]\]/, "[i]\\1[/i]", "g", $0); }

/<div/ { gsub("[\r\n]+$", "\n", out); gsub(/<div [^>]*>/, "[INDENT]"); sub("[\r\n]+", "\n"); skipblanklines=1; }
/<\/div>/ { gsub("[\r\n]+$", "\n", out); gsub(/<\/div>/, "[/INDENT]"); sub("[\r\n]+", "\n"); skipblanklines=1; }

/<code>/ || /<tt>/ { gsub(/(<code>|<tt>)/, TT); }
/<\/code>/ || /<\/tt>/ { gsub(/(<\/code>|<\/tt>)/, _TT); }

{
  out = out "\n" $0
  next;
}

END {
  print outidx;
  print "";
  print out;
}
 * Usage: gawk -f forumize.awk wikitext.txt > forum.txt
 * Will attempt to convert wikitext to text suitable for pasting in a forum
 * Mostly tested with http://www.wowwiki.com/UI_FAQ to http://wowinterface.com
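
The ''/''' handling is the fiddly part of the conversion; for balanced, single-line markup it collapses to two sed substitutions (the script's while-loop additionally closes unbalanced runs at end of line, which this sketch ignores):

```shell
# Convert wiki bold ('''x''') and italic (''x'') to BBCode; bold first so
# the italic pass does not eat the triple quotes.
printf '%s\n' "Some '''bold''' and ''italic'' text" \
  | sed -e "s/'''\([^']*\)'''/[b]\1[\/b]/g" -e "s/''\([^']*\)''/[i]\1[\/i]/g"
# -> Some [b]bold[/b] and [i]italic[/i] text
```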

Basic formatting (e.g. blizzard forums)
BEGIN {
  RS="\r?\n";
  sectidx=0;
  BIGBIG = ""
  _BIGBIG = ""
  BIG = ""
  _BIG = ""
  TT = "[u]"
  _TT = "[/u]"
}

/^ *$/ && skipblanklines { next }
skipblanklines { skipblanklines = 0 }

match($0, /^(=+) *([^=]+) *=+(.*)/, a) {
  if(a[1]=="=") {
    sectidx = sectidx + 1;
    outidx = outidx "[b]" sectidx ". " a[2] "[/b]\n";
    $0 = "" BIGBIG "[u][b]" sectidx ". " a[2] "[/b][/u]" _BIGBIG a[3];
  }
  else if(a[1]=="==") {
    outidx = outidx "... - " a[2] "\n";
    $0 = "" BIG "[u][b]" a[2] "[/b][/u]" _BIGBIG a[3];
  }
  else
    $0 = "[u][b]" a[2] "[/b][/u]" a[3];
}

/'''/ { while(/'''/) { sub(/'''/, "[b]"); if(!sub(/'''/, "[/b]")) $0 = $0 "[/b]"; } }
/''/ { while(/''/) { sub(/''/, "[i]"); if(!sub(/''/, "[/i]")) $0 = $0 "[/i]"; } }

/^{{faqq}}/ { sub(/^{{faqq}} */, "[b]Q: "); $0 = $0 "[/b]"; }
/^{{faqa}}/ { sub(/^{{faqa}} */, "[b]A: [/b]"); }

/^:/ { sub(/^::/, "        "); sub(/^:/, "   "); }
/^;.*:/ { $0 = gensub(/^;([^:]+):(.*)/, "[b]\\1[/b]\n\\2", "1", $0); }
/^;/ { $0 = gensub(/^;([^:]+)/, "[b]\\1[/b]", "1", $0); }

/^[^#]/ { count=0; }
/^#/ { count=count+1; sub(/^# */, count ". "); }

/\[http:\/\// { $0 = gensub(/\[(http:\/\/[^ ]+) ([^\]]+)\]/, "\\2 ( \\1 )", "g", $0); }

/{{/ {
  gsub(/{{[Ee]xample\/Begin}}/, "[quote]");
  gsub(/{{[Ee]xample\/End}}/, "[/quote]");
  $0 = gensub(/{{[Ff]aqcredit[|]([^}]+)}}/, "[i](credit: \\1)[/i]", "g", $0);
}

/\[\[#/ { $0 = gensub(/\[\[#[^|]+\|([^\]]+)\]\]/, "[i]\\1[/i]", "g", $0); }

/<div/ { gsub("[\r\n]+$", "\n", out); gsub(/<div [^>]*>/, ""); sub("[\r\n]+", "\n"); $0 = "[ul]\n[li]" $0; skipblanklines=0; }
/<\/div>/ { gsub("[\r\n]+$", "\n", out); gsub(/<\/div>/, "[/ul]"); sub("[\r\n]+", "\n"); skipblanklines=1; }

/<code>/ || /<tt>/ { gsub(/(<code>|<tt>)/, TT); }
/<\/code>/ || /<\/tt>/ { gsub(/(<\/code>|<\/tt>)/, _TT); }

{
  out = out "\n" $0
  next;
}

END {
  print outidx;
  print "";
  print out;
}
 * Usage: gawk -f forumize-basic.awk wikitext.txt > forum.txt
 * Will attempt to convert wikitext to text suitable for pasting in a forum with VERY basic markup, e.g. the Blizzard WoW forums