Programming notes

From Wise Nano

For a list of projects yet to be done and/or in progress on this site, see Site programming.

Don't Panic...

All you have to do is learn PHP, CSS, PHPTAL, and MySQL, plus all the wiki's internal data structures and control flow.

And Apache and JavaScript and regular expressions, oh my!

PHP

If you know C, you can almost read PHP.

  • Variable types aren't declared.
  • Variables all start with $. Field names don't start with $; if you see $v->$f then the $f will be replaced with the field name stored in $f.
  • foo::bar( ... ) is the bar function of class foo, called statically. It can be called without an object, but it won't have access to any instance variables.
  • Global variables must be declared global to be accessible inside functions.
  • Class constructors have the name of the class (different in PHP 5, but MediaWiki uses PHP 4).
  • Inside "strings", variables will be substituted. Inside 'strings', they won't be.
  • Things that are weird/dangerous:
    • Comparing strings with other variables, or doing logical tests, is really weird.
    • Incrementing strings is really weird. Incrementing 'z' gives 'aa'. So you can't do "for ($v = 'a'; $v <= 'z'; $v++)" because your test won't catch the post-'z' value: 'aa' <= 'z' is still true, so the loop keeps going.
    • References are weird.
    • NULL is weird. Uninitialized variables are NULL, but that's almost like having a value sometimes.
  • The things that look like arrays aren't. They can be used that way, but also as dictionaries (non-numeric keys).
    • The foreach syntax cycles through the array. It has a "key => value" variant.
  • The operators '.' and '.=' are string concatenation.
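
A few of the points above can be checked with a short script. This is only a sketch for experimenting, not MediaWiki code, and all the variable names are made up for illustration.

```php
<?php
// Double quotes interpolate variables; single quotes do not.
$name = 'wiki';
$double = "Hello $name";   // variable is substituted
$single = 'Hello $name';   // left as literal text

// '.' and '.=' concatenate strings.
$greeting = 'Hello' . ' ';
$greeting .= $name;

// Arrays double as dictionaries; foreach has a "key => value" form.
$ages = array('alice' => 30, 'bob' => 25);
$lines = array();
foreach ($ages as $person => $age) {
    $lines[] = "$person is $age";
}

// $v->$f looks up the field whose name is stored in $f.
$v = new stdClass();
$v->title = 'Main Page';
$f = 'title';
$fieldValue = $v->$f;

// The string-increment trap: after 'z' comes 'aa', and 'aa' <= 'z' is
// still true, so this loop does not stop at 'z'.
$count = 0;
for ($c = 'a'; $c <= 'z'; $c++) {
    $count++;
}

echo $double, "\n", $single, "\n", $greeting, "\n";
echo implode(', ', $lines), "\n";
echo $fieldValue, "\n";
echo "Loop ran $count times\n";
```

The last line reports far more than 26 iterations, which is the surprise the string-increment bullet warns about.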

The PHP reference (very useful) is at http://www.php.net/manual/en/langref.php

CSS

The idea of CSS is to assign styles and transformations to data according to the data's label and attributes. Just read the stylesheets/monobook/main.css file and you'll get the idea. See also the PHPTAL section.

A useful tutorial to start with is http://css.maxdesign.com.au/floatutorial/. Another excellent CSS tutorial is at http://www.westciv.com/style_master/academy/css_tutorial/index.html. Google for "CSS reference" to find lots of useful pages.

PHPTAL

The label and attributes that CSS uses come from the PHPTAL template.

The template gets turned into an ugly HTML file as shown at http://meta.wikimedia.org/wiki/PHPTal . Ignore that. Just stick to the template and the CSS. A newly modified template will be converted automatically when you reload the page.

For example, the templates/xhtml_slim.pt file takes data from includes/SkinPHPTal.php and makes (for example) a navigation box with 'class="portlet" id="p-nav"'. Then the style sheet decides where to place it on the screen and how to display its contents--though the sequence comes from the template.

The content text in the template file is all defaults; it generally won't be used to produce the page.

  • 'i18n:' means it's translating something.
  • 'tal:' is a directive to the template parser.
  • 'metal:' is a macro.
  • ${variable} is how values are passed from the SkinPHPTal.php file. In that file, you'll find a corresponding '$tpl->set( "variable", value);'

MySQL

  • MySQL is a database.
  • It has named tables, each with 0 or more rows of data.
  • Each table has a list of named fields (generally the names don't overlap between tables).
  • SQL is accessed by queries. Queries can add or replace data, or search ("select") and return data.
  • Search queries can use multiple tables and treat field names as variables to match between the tables (kind of like Prolog). For example, you can say "SELECT a1,a2,b2,b3 FROM A,B WHERE a1=5 AND a2=b2". If a2 and b2 are fields in tables A and B respectively, perhaps storing identification numbers to link the datasets, then this finds all b3 data (from table B) corresponding to the data where a1 is 5.
  • There are lots of search specifications you can use: numeric comparison, etc.
  • A very handy free package you can install, so you can poke around the wiki database via the web, is phpMyAdmin. From its own description: phpMyAdmin is a tool written in PHP intended to handle the administration of MySQL over the Web. Currently it can create and drop databases, create/drop/alter tables, delete/edit/add fields, execute any SQL statement, manage keys on fields, manage privileges, export data into various formats and is available in 47 languages.
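
To make the two-table SELECT above concrete, here is a rough emulation in plain PHP of what that query computes. The table contents are invented for illustration, and a real database would use indexes rather than nested loops.

```php
<?php
// Invented sample data: each table is a list of rows (field => value).
$A = array(
    array('a1' => 5, 'a2' => 101),
    array('a1' => 7, 'a2' => 102),
    array('a1' => 5, 'a2' => 103),
);
$B = array(
    array('b2' => 101, 'b3' => 'first'),
    array('b2' => 102, 'b3' => 'second'),
    array('b2' => 103, 'b3' => 'third'),
);

// SELECT a1,a2,b2,b3 FROM A,B WHERE a1=5 AND a2=b2
$result = array();
foreach ($A as $rowA) {
    foreach ($B as $rowB) {
        // Keep only the row pairs that satisfy both conditions.
        if ($rowA['a1'] == 5 && $rowA['a2'] == $rowB['b2']) {
            $result[] = array_merge($rowA, $rowB);
        }
    }
}

foreach ($result as $row) {
    echo implode(' | ', $row), "\n";
}
```

Every pairing of a row from A with a row from B is considered, and the WHERE clause keeps only the pairs where a1 is 5 and the identification numbers match.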

Apache

The important thing about Apache is that it rewrites your URLs to make them look nice. You tell it how with the ".htaccess" file in your root web directory.

The file has a bunch of RewriteRules, each of which matches the URL against a regular expression and lets you supply the replacement text.

The example given on the Meta wiki help page didn't work for me. The final rule was "RewriteRule ^(.*)$ wiki/index.php/$1 [L,QSA]". The L in the brackets is supposed to mean "This is the final rule, don't rewrite any more." But I always got a server error, narrowed it down to that rule, and discovered that the rewrite wasn't stopping on L rules: it would continue looking for more rules to match and substitute.

So I wrote a very simple file that mapped any URL starting with a "w" to my wiki directory (awiki) which did *not* start with a "w". Even if it wanted to, it couldn't rewrite after that.

# close the php security hole...
php_flag register_globals off

RewriteEngine on
# If just the hostname, map to w.
RewriteRule ^$ w
# Don't map stylesheets to index.php, but rather to the proper directory.
RewriteRule ^w/stylesheets/(.*)$ awiki/stylesheets/$1 [L]
# Don't map images to index.php.
RewriteRule ^w/images/(.*)$ awiki/images/$1 [L]
# Map everything else to index.php in the awiki directory.
RewriteRule ^w(.*)$ awiki/index.php$1 [L]
I changed $IP in LocalSettings.php to point into the awiki directory; this is how it actually finds the files. I also changed
$wgScriptPath = "/w"; $wgScript = $wgScriptPath."";
This is how it builds the URL. So that's totally artificial, and just has to match the rewrite rule; it doesn't have to point to a script if I don't want a script in the URL, because the rewriting will put the script back in the URL.
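
A likely explanation for the "L doesn't stop" behaviour: in a per-directory .htaccess file, [L] ends only the current pass, and the rewritten URL is then run through the rules again from the top. The sketch below is a toy model of that, not how mod_rewrite is actually implemented. The functions rewriteOnce and rewriteUntilStable are hypothetical names, and applying only the first matching rule per pass is a simplification (the "^$" rule above has no [L]).

```php
<?php
// Each rule is a pair: PCRE pattern, substitution (hypothetical representation).
$myRules = array(
    array('#^$#', 'w'),
    array('#^w/stylesheets/(.*)$#', 'awiki/stylesheets/$1'),
    array('#^w/images/(.*)$#', 'awiki/images/$1'),
    array('#^w(.*)$#', 'awiki/index.php$1'),
);
$metaRule = array(
    array('#^(.*)$#', 'wiki/index.php/$1'),
);

// One pass: apply the first rule that matches, or return null if none does.
function rewriteOnce(array $rules, string $path): ?string
{
    foreach ($rules as $rule) {
        list($pattern, $substitution) = $rule;
        if (preg_match($pattern, $path)) {
            return preg_replace($pattern, $substitution, $path, 1);
        }
    }
    return null;
}

// Repeat passes until no rule matches; give up after $maxPasses.
// Returns array(final path, whether it settled).
function rewriteUntilStable(array $rules, string $path, int $maxPasses = 10): array
{
    for ($pass = 0; $pass < $maxPasses; $pass++) {
        $next = rewriteOnce($rules, $path);
        if ($next === null) {
            return array($path, true);
        }
        $path = $next;
    }
    return array($path, false);
}

list($path, $settled) = rewriteUntilStable($myRules, 'w/Main_Page');
echo $path, ' ', $settled ? '(settled)' : '(still rewriting)', "\n";

list($path, $settled) = rewriteUntilStable($metaRule, 'Main_Page');
echo $settled ? 'settled' : 'still rewriting after 10 passes', "\n";
```

In this model the catch-all "^(.*)$" rule matches its own output, so it never settles, while the "w" rules settle because nothing under awiki/ starts with "w".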

JavaScript

You don't really need to know JavaScript unless you want to 1) trace where all the text goes or 2) mess with the buttons or other functions on the page. Suffice it to say that JavaScript looks a lot like C, and stylesheets/wikibits.js is read into each page. (This is where some, but not all, of the tool tips come from: they're read from the big "Monobook.js" entry in the Language.php file by a function in wikibits.js.)

General notes on code flow

index.php gets everything and dispatches it. Most of the file names in the includes directory are self-explanatory after you've played with the wiki for a bit.

How does the data get on the screen?

  • wgOut is an OutputPage set in Setup.php.
  • The page content is written to wgOut bit-by-bit.
  • wgOut->output() is called. This gets the skin (monobook, in my case) and calls its outputPage.
  • That gets the template created from templates/xhtml_slim.pt, and stuffs it full of information keyed by name.
  • The template writes out an HTML page including a link to stylesheets/monobook/main.css.

So where do the footer and sidebars come from?

  • They're put together in xhtml_slim.pt.

How is content put on-screen?

  • Look in the various files in the includes directory.

What if I want to change something?

  • If it's text, it's probably a message in a language file, and can be changed through Special:Allmessages.
  • To rearrange things outside the content box, look in xhtml_slim.pt.
  • To rearrange things inside the content box, look in the PHP files in the includes directory.

What's the "title" code for?

  • Good question. It does two things.
    • It keeps track of the title of the current page, and lots of other information that's used when outputting it.
    • It is used to translate URLs and Wiki links, and look up the latter in the database.
    • So in a sense, there are two distinct purposes for the one data structure. Sometimes it's the global title, and sometimes it's a throwaway title.

Added "Help wanted" feature

Goal: Show articles of subcategories of "HelpWanted" category by search key rather than article title.

What I did:

  • First, I wanted a way to ask for help in an article, listing questions by the kind of expert that was being requested.
  • I realized that categories might be adapted to do this. So I read up on categories and learned that there was a search key, and they could be alphabetized by search key rather than article title. So now I needed to display them by search key.
  • I looked at a Category page and saw that it included the text "Articles in category". Searched for that and found it was stored under "category_header". Searched for that, and found it used in Parser.php.
  • Read through a bunch of code to see how makeLink worked, learn that getParentCategories existed, see where the links were pushed into the array, etc. After that it was easy.

Added "Hosted by" link

Goal: Add an icon linking to CRN's web site, next to the "Powered by MediaWiki" icon.

Files modified:

  • includes/Skin.php
  • includes/SkinPHPTal.php
  • stylesheets/monobook/main.css
  • stylesheets/commonPrint.css
  • templates/xhtml_slim.pt

What I did:

  • I grepped for "page has been accessed" since that's in the footer. Found it was called "viewcount". (Almost all text appearing in the skin is a translated message looked up in the language files.)
  • Found "viewcount" in Skin.php. Used in pageStats, called from printFooter, afterContent, called from ... hard to follow. But below pageStats is getCopyright, and below that is getPoweredBy. Bingo!
  • getPoweredBy leads to outputPage in class SkinPHPTal in SkinPHPTal.php. That stuffs it into $tpl, which is made by "new PHPTAL($this->template . '.pt', 'templates');". Searching for template gets me into the arcane world of PHPTAL, which is a way of processing a template file to make another file to help generate a page. See http://phptal.sourceforge.net/bagpack/HTML_Template_PHPTAL/doc/index.html
  • Anyway, the template file used here turns out to be xhtml_slim.pt. That uses the "poweredbyico" created in SkinPHPTal. A search for "poweredbyico" also leads to the two CSS files. (And only them; despite the skin name being davinci, it doesn't seem to use that.)
  • OK, so now I know everything I need to do. I add a function "getHostedBy" in Skin.php, plugging in my icon file and URL. I add a line to SkinPHPTal.php to call that and stuff the result into $tpl. I edit xhtml_slim.pt and main.css to add parallel code to what's already there.
  • And... the formatting is broken. The new icon rides on top of the bar instead of off to the right. I spend the next two hours convincing myself that there is something magic about the words "f_copyrightico" and "f_poweredbyico" as used in the template and CSS. And that I can reuse "f_poweredbyico" for my "hostedbyico" information. That works. But I still don't know why those words are magic. They don't appear anywhere else in the codebase.
  • Success!
  • An hour later it suddenly hit me: The "magic word" symptoms were consistent with the main.css file not being changed at all--as though I weren't editing it. Sure enough. I have a test-directory install that I was working in. But I hadn't set up all the paths, so it was loading the main-directory install's main.css. Now those words aren't magic anymore.


Changing text in a tab

This one shouldn't have been hard. I wanted to change the "category" tab on category pages to say "article" since the site will make heavy use of categories and we don't want to make the user wonder whether the tab's name change reflects a function change.

  • Search for the lower-case whole-word "category". It appears in several places, but the only place that looks promising is in the Language.php file, where "category" maps to "category". Log in as sysop, change it, no effect. Change it back. Hm.
  • Look at the search list again. See there's an "nstab-category" that's used in SkinPHPTal.php. It's used in getNameSpaceKey, which is called from buildContentActionUrls, and goes along with lots of other tab-like entries. Well, this is promising. Trouble is, "nstab-category" maps to "Category" in the language file. It's capitalized. Can't be the one.
  • Try the "category" language-entry again, remembering to hit shift-reload in the browser to reload everything. Nope, still doesn't work.
  • Read through the SkinPHPTal code some more. Yep, that's got to be where it's happening. But the stupid string it's fetching is capitalized! What could be going on? It couldn't be lowercasing the string somewhere, could it? It sure doesn't look like it's happening anywhere in the PHP code... Could it be...?
  • Having wrestled with the skin mechanism to add the Hosted-By icon, it doesn't take long to trace from $content-actions to 'id="p-cactions"' in xhtml_slim.pt. And from there, to main.css...
  • Omigawd, I don't believe it. There in main.css, under "#p-cactions li a", is "text-transform: lowercase;" Talk about the kitchen sink. Who'da thunk they'd put a lowercase transform into the style sheet spec? And then use it?
  • Now it's easy to change nstab-category from "Category" to "Article", after checking to make sure that's the only thing it's used for.

Munging the navigation and toolbox boxes

By this time, I felt like an old hand with PHPTal, but it still took a while to convince myself it was as easy as it looked. I wanted to move "Recent changes" down to toolbox, "What links here" up to navigation, remove "random page", "community portal", and "current events", and add a "site map" pointing to one of my pages.

  • Community portal and current events can be turned off by editing the appropriate system-message string. The comment in the Language file is WRONG. Don't set the strings to "", set them to "-". That's easy to see from the code in SkinPHPTal.
    • Newer versions of MediaWiki (including this one, 1.5.5) allow editing of the sidebar through a page in the MediaWiki namespace (namely, MediaWiki:Sidebar). Xrchz 22:31, 6 February 2006 (CST)
  • There's no way to turn off Random, so I overwrote it to add "Site map." I hard-coded the URL, using the $this->makeArticleUrl that I guessed and confirmed must be in Skin.php (since there were a bunch of other make???Url functions in use nearby). So now nav_urls (which is used for both navigation and toolbox boxes) has a 'sitemap' entry.
  • Change the xhtml_slim.pt file. Move the entries for the various pages between the various boxes to suit. (Search for 'id="p-nav"' and 'id="p-tb"'.) I hard-coded another string (bad me): Instead of using i18n:translate I just took that out, so it will use the default "Site Map" I put after the > of the URL. I gave this an ID of "n-sitemap".
  • Added an "n-sitemap" entry to 'Monobook.js' system message, stealing the alt-M hotkey from 'move page'. (That's what the first character is for in the entries in that structure.)

Making article links point to categories, sometimes

On this site, categories should be first-class citizens: containing content, used for navigation (the site map is the category taxonomy), easy to link to when you want to refer to an overview of a topic, etc. So I want a simpler way to link to categories than [[:Category:foo|foo]].

I'd first considered introducing a new syntax: [{foo}] means category foo. But then I realized there was a simpler way: If there's a [[foo]] link, and article foo doesn't exist but Category:foo does, produce HTML to point it to Category:foo. If someone creates an article named foo after the category has been linked to in this way, then the sense of the link will shift. But that article is unlikely to be created: it would require hand-editing of URLs to make it happen. (If Category:foo doesn't exist, then it'll give a red&edit link to empty article foo.)

Wherever the code is that decides whether to make links red&edit or blue, that's where I need to put this hook. It'll already be checking if the article exists, so I just have to copy and tweak that code to see if the category exists.

I thought at first it'd be in Skin or SkinPHPTal, since those have so much to do with output. But those don't format the content, just the wrapper. (Wrong, see below.) (I wonder if the CSS wizards will integrate content with CSS more in the future? Probably not, since content doesn't need to be treated as a GUI but just presented as it comes.)

OutputPage looks promising. It's short enough to skim... all it has is a comment that "doWikiPass2" does all the real work of parsing. But that function doesn't exist. From my browsing I remember that the parser has options that sound like the "passes" this is talking about; maybe that comment is old. So maybe all the action happens in Parser.

Following the function "parse" from OutputPage, I find it calls internalParse. (While browsing, I also learn that the text is parsed by replacing tagged sections with keys and storing the tags to be processed separately, then plugging it all back together at the end. If you've ever seen a bunch of random characters appear when trying to write a complicated macro, this is probably where it comes from.) internalParse calls its replaceExternalLinks, doMagicLinks, replaceInternalLinks, replaceInternalLinks (yes, it calls it twice, I don't know why)... one of these is what I'm looking for.

... and at the end of replaceInternalLinks, I find a call to $sk->makeLinkObj. It was in skins all along. ... well, half of it was. The title object is made in replaceInternalLinks, but whether the article exists is checked in makeLinkObj. So if I want to make/test/use a corresponding category name/title I have to make it myself. Looks like I can title->getText, then slap a "Category:" on it (using Namespace::getCanonicalName(Namespace::getCategory())), make a new title, check if it exists, and if so, build a known link using the same text but the new category title. I'll only have to change one file, Skin.php.
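
The fallback rule itself is simple. Here is a minimal sketch of it with the database lookups replaced by plain arrays; resolveLink and the sample page names are hypothetical, and the real change lives in makeLinkObj and works on Title objects.

```php
<?php
// Hypothetical stand-ins for the database lookups: sets of existing page names.
$articles = array('Nanotechnology' => true);
$categories = array('Nanotechnology' => true, 'Policy' => true);

// Decide where a [[name]] link should point.
// Returns array(target title, link kind).
function resolveLink(string $name, array $articles, array $categories): array
{
    if (isset($articles[$name])) {
        return array($name, 'known');                // normal blue link
    }
    if (isset($categories[$name])) {
        return array('Category:' . $name, 'known');  // fall back to the category
    }
    return array($name, 'broken');                   // red "edit" link
}

foreach (array('Nanotechnology', 'Policy', 'Missing') as $name) {
    list($target, $kind) = resolveLink($name, $articles, $categories);
    echo "[[$name]] -> $target ($kind)\n";
}
```

Checking the article first is what makes the sense of the link shift if someone later creates an article with the same name.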

Adding subcategory and article counts to category lists

I want to show how many articles (and maybe subcategories, I haven't decided) are in each category.

From my previous browsing around, I had a pretty good idea what I needed to do. I knew there was a newCategoryMagic function in Parser.php that builds the lists of subcategories of a category. I wanted to list and count the subcategories of subcategories, and add the count to the text of the subcategory link. So I could take my inspiration from newCategoryMagic, and that's also what I'd be modifying.

In the part of newCategoryMagic that makes a link to a category, I passed the cur_title of the link (the textual database key, which it had found in the SQL search) to a function I wrote: linksToThisKey().

That function used an SQL query very similar to the one used by newCategoryMagic. The only changes were: I only cared about the cur_namespace (so I could see whether it was a category or an article) and I took out the DISTINCT (so it would return multiple copies of cur_namespace with the same value; otherwise it would only count up to 1). I passed back the counts to be added to the title text in newCategoryMagic.
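
The effect of dropping DISTINCT can be seen without a database. In this sketch the rows are invented, and plain PHP array functions stand in for the SQL; 0 and 14 are MediaWiki's namespace numbers for articles and categories.

```php
<?php
// Invented result rows: the cur_namespace of each page linked to one category.
// In MediaWiki, namespace 0 is articles and 14 is categories.
$namespaces = array(0, 0, 14, 0, 14);

// With DISTINCT, duplicate values collapse, so each namespace appears once.
$withDistinct = array_count_values(array_unique($namespaces));

// Without DISTINCT, every row comes back and can be tallied.
$withoutDistinct = array_count_values($namespaces);

$articleCount = isset($withoutDistinct[0]) ? $withoutDistinct[0] : 0;
$subcategoryCount = isset($withoutDistinct[14]) ? $withoutDistinct[14] : 0;
echo "$articleCount articles, $subcategoryCount subcategories\n";
```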

Then I wanted to put the same counts in the list of super-categories (the "Categories:" section at the bottom of each page). A text search for "Categories" found it's a system message called "categories"; and that's used in Skin.php:getCategoryLinks, which builds the Categories section from mCategoryLinks. A search on that led me back to Parser.php:replaceInternalLinks, which is where the links are built from names and texts. So that's where I added my call to linksToThisKey.

Finally, I have a category taxonomy page that I use as a site map, and I wanted to add the counts to each category link on that page. Again, that's in replaceInternalLinks (if you forget what that's called, just text-search for "[["). So I put in some ugly hard-coded code right at the bottom where it handles normal links:

if (($this->mTitle->getDBkey() == "Site_taxonomy") &&
     ($nt->getNamespace() == NS_CATEGORY)) {
    // On the site map page, append the counts to each category link.
    $notes = ' '.$this->linksToThisKey($nt->getDBkey());
}

Making talk pages more like discussion lists

Rather than freeform talk, I want to have talk happen linearly: append-only postings. Easier to keep track of. OK, so I'm a control freak.

Eventually, I want to allow threads, where any post can be commented on. But that only works with 1) named, not numbered sections (so the text can be inserted in the right place after interleaved submissions); 2) some way to show what's changed recently (so participants can see what might need answering). I skim Article.php until I find getSection, then give up. For now, I'll append only to the end: section=new.

To allow appending to the end, I'll have to do several things:

  • Make every talk page a __NOEDITSECTION__.
  • Turn off the "edit" tab.
  • Lock out improper handcoded "edit" URLs.

I want to do one more thing: include username and timestamp in every post.

A text search on "edit" finds the system messages 'edit' and 'editsection' for the tab and section edit link. 'editsection' leads to Skin:editSectionLink which leads to Parser:formatHeadings, and I add to the NOEDIT check: " OR Namespace::isTalk($this->mTitle->getNamespace())". 'edit' leads straight to SkinPHPTal:buildContentActionUrls where I simply move the "edit" into the not-talk case.

Now I need to allow only 'section=new' edit submissions on talk pages. I remember seeing this checked somewhere... Thought so. It's in Article:getTextOfLastEditWithSectionReplacedOrAdded. Oh yes, I wanted to add a signature (the ~'s that expand into something like "Cphoenix 15:25, 27 Sep 2004 (CDT)") to the headline; the top of this function is where I do that. And I'll make it a smaller headline by adding more ='s. But the permission checking is in the calling function, EditPage:editForm. Just add a check: if page is talk, section had better be new.
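
The check described above might look roughly like this. It is a sketch only: isTalkNamespace and isEditAllowed are hypothetical names, and the real test sits inside EditPage:editForm and uses Namespace::isTalk.

```php
<?php
// In MediaWiki, talk namespaces have odd numbers (1 = Talk, 15 = Category talk).
function isTalkNamespace(int $namespace): bool
{
    return $namespace > 0 && $namespace % 2 === 1;
}

// Hypothetical check: on talk pages, only section=new edits are accepted.
function isEditAllowed(int $namespace, ?string $section): bool
{
    if (isTalkNamespace($namespace)) {
        return $section === 'new';
    }
    return true;
}

var_dump(isEditAllowed(1, 'new'));  // talk page, appending a post
var_dump(isEditAllowed(1, null));   // talk page, whole-page edit
var_dump(isEditAllowed(0, null));   // ordinary article
```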

Later I realized that uninitialized talk sections would lead to "broken link" links that were no good for editing. So I went and hacked up Skin:makeBrokenLinkObj to slap "&section=new" on broken talk links.

Oh, and the "talk" link in the action tab has to be changed. SkinPHPTal:buildContentActionUrls, change empty content_actions['talk'] by adding "&section=new" to the "action=edit".

One more bug: when creating a new talk page, the signature/headline wasn't being added, as it usually is for appends. So the first post always had no signature/topic, just the bare text. Turns out that new-page handling in editForm doesn't get as far as getTextOf... So I added similar signature-adding logic to Article:insertNewArticle (which is right above getTextOf...). In that function, I had to add the logic above the call to $this->preSaveTransform, since that's what expands the ~'s.

Adding "Create article" boxes to category pages

After the fourth time I wanted to create an article and realized there was no way to do it except hand-coding a URL or writing a [[ link in the sandbox, I knew I had to make it easier.

I wanted a simple form: type in the article title, hit a button, and you're editing. And I wanted this to appear on every category page, and seed the article text with the category link: you're not just creating the article, it automatically goes into the category you were browsing.

How much would I have to hack up to do this? When the "GO" button was pressed, I'd have three pieces of information:

  • That I was editing
  • The article title
  • The category

Somehow I'd have to use this to route a call to the edit function. Would I have to hack up Article.php or index.php, and post-process the form data, and route it to the right place?

It turned out that I didn't have to post-process at all. The way the code is written, it doesn't care what order the parameters appear in. So I could send it title=whatever, wpTextbox1=whatever, and action=edit in any order. (I learned from following code in EditPage that wpTextbox1 is the name of the edit text.) And the code doesn't care about extra parameters, which is good because the form produces a few.

So I dug into my memory of forms and cheated off the UserLogin.php form, and wrote a form that would do what I needed. The text box has a name of 'title'. The action is simply $wgScript. Then I have two hidden fields, one with name 'action' and value 'edit', the other with name 'wpTextbox1' and value wfEscapeHTML("[[".$this->mTitle->getPrefixedText()."]]\n"). (I learned about wfEscapeHTML from browsing the Title code.)
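
A form along those lines can be sketched as follows. This is not the code from the wiki: buildCreateArticleForm is a hypothetical helper, htmlspecialchars() stands in for MediaWiki's wfEscapeHTML(), and the GET method is an assumption based on the parameters ending up in the URL.

```php
<?php
// Build the "create article" form. The function name and arguments are
// hypothetical; htmlspecialchars() stands in for wfEscapeHTML().
function buildCreateArticleForm(string $scriptUrl, string $categoryTitle): string
{
    $action = htmlspecialchars($scriptUrl);
    // Seed text: a link back to the category the user was browsing.
    $seedText = htmlspecialchars("[[" . $categoryTitle . "]]\n");

    return '<form method="get" action="' . $action . '">' . "\n"
        . '  <input type="text" name="title" />' . "\n"
        . '  <input type="hidden" name="action" value="edit" />' . "\n"
        . '  <input type="hidden" name="wpTextbox1" value="' . $seedText . '" />' . "\n"
        . '  <input type="submit" value="GO" />' . "\n"
        . '</form>' . "\n";
}

echo buildCreateArticleForm('/w', 'Category:Nanotechnology');
```

Escaping the seed text matters because category names can contain characters such as & or quotes that would otherwise break the attribute.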

Done? Not quite. First, I'd put this form in Parser:categoryMagic, not realizing that was always called whether or not it was a category. So I moved the category-check code up from newCategoryMagic.

Second, the wpTextbox1 text (the category link) wasn't showing up. The URL looked fine, if a bit ugly; was it possible that the editForm function was throwing it away, since after all it was an initial edit? (It knows it's an initial edit since it's not a preview request and didn't arrive by POST method. A non-preview POST is treated as a save request.)

Yes, that's what was happening. On initial edit (which it thinks this is), it called $this->mArticle->getContent( true ) to try to load the text from the database. So I changed the logic: load the text *if* the textbox is empty (which it will be on a normal initial-edit).
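
In outline, the changed logic looks like the sketch below. initialEditText is a hypothetical name, and the callback stands in for $this->mArticle->getContent( true ).

```php
<?php
// Hypothetical version of the changed logic: only fall back to the stored
// text when the request did not supply any text of its own.
function initialEditText(string $submittedText, callable $loadStoredText): string
{
    if ($submittedText === '') {
        return $loadStoredText();
    }
    return $submittedText;
}

// Stand-in for the database lookup.
$loadStoredText = function (): string {
    return 'text from the database';
};

echo initialEditText('', $loadStoredText), "\n";                 // normal initial edit
echo initialEditText("[[Category:Foo]]\n", $loadStoredText);     // seeded from the form
```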


Adding codebase diff special page

I had a request to see the changes I'd made to the code so far. The suggested command was:

diff -ruP -x LocalSettings.php /path/to/the/original/codebase/ /path/to/the/new/codebase/ >Wisenano.diff

excluding LocalSettings.php because it has usernames and passwords.

Well, I wanted something better. I wanted anyone to be able to get a diff at any time. How to do that? A special page, of course. Special:Codebase

So I started with a simple special page file, SpecialLockdb.php, and looked to see how it worked. I quickly noticed that the entry point, wfSpecialLockdb, isn't called anywhere. OK, it must be a constructed name. Search for wfSpecial, find SpecialPage.php (duh). It's easy then to see how to add one more page: "Codebase". I saved a copy of SpecialLockdb.php as SpecialCodebase.php, gutted it, and replaced it with a call to
shell_exec("diff -ruP olddir newdir")
which was then munged and passed to $wgOut->addHTML().

I used two filters. First, I did a search-and-replace (using str_replace) on my username, password, and filesystem paths (and wgProxyKey). Now I can publish the entire diff, including LocalSettings.php, without fear of exposure. Second, I ran it through wfEscapeHTML (which I'd learned about previously). And of course I wrapped it in <pre> tags.
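
The two filters can be sketched like this. publishableDiff is a hypothetical helper, the sample diff and password are invented, and htmlspecialchars() again stands in for wfEscapeHTML().

```php
<?php
// Hypothetical helper: scrub secrets from diff output, then make it safe
// to embed in a page.
function publishableDiff(string $diff, array $secrets): string
{
    // Filter 1: replace each sensitive string with a placeholder.
    $scrubbed = str_replace(array_keys($secrets), array_values($secrets), $diff);
    // Filter 2: escape <, >, & and quotes so the diff cannot inject markup.
    return '<pre>' . htmlspecialchars($scrubbed) . '</pre>';
}

// Invented sample input.
$diff = "+\$wgDBpassword = \"hunter2\";\n+if (\$a < \$b) {\n";
echo publishableDiff($diff, array('hunter2' => 'PASSWORD')), "\n";
```

The order is deliberate: scrubbing before escaping means a secret containing a character like & or < still matches, which it might not after escaping.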

Two more things to do. One was to create a second Special page to show the system messages I'd changed (rather than the entire unwieldy list). That's at Special:Messagediffs. Basically I just cloned SpecialAllmessages and changed it to display only the messages that were different between the language file (where they originally came from) and the SQL database (where they're stored as articles in the MediaWiki: namespace and can be edited by sysops). There was a small problem: the language file version of some of the messages had templates, which had been substituted in the database version. So I had to figure out how that was done. I poked around in the code for a while and couldn't see where. So I got smart and text-searched on '{{' and found it quickly: in MessageCache.php:transform.
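
The core comparison behind a page like Special:Messagediffs can be sketched with plain arrays. The message keys and texts here are sample data, and the sketch ignores the template-substitution problem described above.

```php
<?php
// Sample data: message key => text.
$languageFile = array(
    'edit' => 'Edit',
    'nstab-category' => 'Category',
    'portal' => 'Community portal',
);
$database = array(
    'edit' => 'Edit',
    'nstab-category' => 'Article',
    'portal' => '-',
);

// Keep only the database messages whose text differs from the default.
$changed = array_diff_assoc($database, $languageFile);
foreach ($changed as $key => $text) {
    echo "$key: '{$languageFile[$key]}' => '$text'\n";
}
```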

The second thing to do was add the page titles for the new special pages. The easiest way to do this was hand-code an edit URL like &title=MediaWiki:Codebase&action=edit.

By the way, once I got the message diff working, I discovered it was a good thing I had, because there were some messages that were incorrectly different. See, I'd installed MediaWiki before I figured out the URL prettification. And, because templates in messages are replaced during install, some of the messages had /index.php/ instead of /w/. Easiest fix was to edit them by hand.
