Thursday, December 13, 2007
Hi, folks. I'm just your random Wikipedia admin who is trying to help out in stuff, and I don't know where to post my comment, so I'll just post it here, in the last thread with the "Wikiwatch" tag.
One thing that has been repeated ad nauseam is that mods have a tendency to be delete-happy. Ironically enough, reading through the threads on several pages, I noticed that many of the users mentioned are not Wikipedia administrators. While this seems like a petty complaint, I'd rather not get a bad rap I don't deserve... ;) But in all seriousness, if you want to find out whether someone is an admin, just plug their username into Special:Listusers (http://en.wikipedia.org/wiki/Special:Listusers).
However, whining about whether someone is an admin or not is not the point of my comment. Watching the Webcomics mess from the perspective of someone who has the delete bit, one thing I can say is that if there were a way to demonstrate that a comic is notable, there would be much less of a mess in this situation. However, as far as I can see, there is no external guidance that allows us to determine whether a particular Webcomic is popular or not. So, the only thing we can do is fall back on our default guideline, http://en.wikipedia.org/wiki/Wikipedia:Notability_%28web%29. Now, we all agree that the guideline is definitely not ideal, but consider things from our perspective. We have to deal with mountains of crap every day, so how do we discern that which is crap from the non-crap? While we can all agree that a crappy garage band from some random kid somewhere is not notable, what constitutes notability in Webcomics?
In other words, I'd like to ask the Webcomic community for something. If there were a way for you to police your own content, the decision-making of administrators might be much less of an issue. However, I'm not asking about doing that in Wikipedia, as doing that would just result in Yet Another Wikipedia Notability System (WP:YAWNS). :P Instead, I consider a much more constructive step for both Wikipedia and Webcomics to be for the comic community to create a site showcasing good webcomics, documenting awards, etc. Not only would that give Wikipedia a litmus test and a secondary source (which is the crux of the whole issue), it would also have its own value within the Webcomics community by creating an information depot of sorts.
I personally believe that is the best way to satisfy everyone's concerns. However, the impetus for such a move has to come from the Webcomics community; it cannot come from Wikipedia. Since I really have no idea how best to distribute this proposal to your community, or even whether posting it here will be seen, I'll leave the details to you, Howard, to figure out. In either case, I'd really like for all of you to consider this seriously, not only for the benefit to Wikipedia, but more importantly, for the benefit to all of you.
Hopefully the unnecessary dispute between our two communities can be solved via dialog and innovative solutions. It has dragged on far too long.
English Wikipedia administrator
Monday, October 15, 2007
As it is generally agreed that Wikipedia's quality review system is good, but not featured (heh), there are several discussions regarding the ways the different quality verification systems (in no particular order, they are Good Articles, Peer Review, WikiProject Reviews and Featured Articles) tie together, and how to improve them. Outside my usual spiel (GA and FA are well-established processes, neither one will go away, and PR's reviews are so scant that they make the page meaningless), I haven't had much input in the page, but I have been silently watching it. That said, it would be a crime against humanity to not reproduce this post by Martin Walker, known outside the WP:1.0 cabal as Walkerma. As has become his custom, he hits the nail on the head about the current state of affairs in Wikipedia's quality system.
All too often quality is presented as being a single "dimension" - going from "---- awful" to "perfect". But it's much more complicated than that. I'd like to lay out how I see it - others may see it differently, but I think any delineation should hopefully clarify things a bit. I'm not trying to stimulate a huge discussion (though by all means discuss it if you think it useful!), rather just show the lay of the land beyond the PR and GA/FA aspects discussed above. I just think that before we change the system we need to define the parameters clearly. The parameters mainly fall into two main categories: Content (which requires good knowledge of the topic) and Style (which requires good knowledge of English and WP:MOS). Here are the parameters as I see them:
- Content coverage: In the early stage of article development, this is what is needed most of all. As the article matures it moves towards "comprehensive coverage" and this aspect becomes much less important. The best people for judging this are subject-experts.
- Content quality: The issues may be covered, but there are sloppy definitions or actual errors. Again, subject-experts are critical here - preferably experts with good access to reliable sources.
- Content trustworthiness: The article may be beautifully written and absolutely correct factually - but if there are no clear citations from reliable sources, how can I know that? Many of our early FAs fall into this category. Also see the Quality versions proposal.
- NPOV: Related to trustworthiness, but you can still write a biased piece that is well cited. Requires a consensus from several subject-experts, IMHO.
- Article scope: Sometimes a topic is better covered fairly broadly, sometimes it's better to break it up into sub-articles, usually with a top-level article using summary style.
- Article organisation and flow: The article may be well written and contain all the important content, yet it looks like it was written by a committee (in effect, it was!). It's a side effect of the fact that our articles are written by many people, often in small disjointed pieces. Needs one or two dedicated author(s) - a subject-expert and someone who can write well.
- Quality of writing: Issues here include poor grammar, spelling, verbosity, poor use of paragraphs, etc. Good copyeditors needed here.
- Level of writing: A popular or general topic like atom will need to be accessible to a popular, general audience. In a more specialized article like Persistent carbene, it would be ridiculous to explain what an electron is. Ideally needs a subject-expert who can write well.
- Aesthetics: Sometimes a nice diagram can really help, or a photograph can bring an article to life. Hard to quantify!
- Compliance with house style: The article may be perfect in every way, but have a Heading Like This instead of One like this, or lack the non-breaking space before units, etc. Requires someone with a good knowledge of WP:MOS.
Article length is often informally used as a parameter, but it's largely based on three of the above: Content coverage, Scope and Quality of writing (specifically the conciseness/verbosity part). This is a tricky issue when assessing an article for quality - we may say, "It's long, so it must be pretty comprehensive" but it may fail to cover major aspects and just be very long-winded. If the scope is very narrow it may be hard to make it long enough to get on the radar screen for GAN/FAC - for example, compare Jarvis Island with United States.
Review types
When assessments are performed, the general reviewer can best judge the Style parameters: Quality of writing, Level of writing, Aesthetics, House style and perhaps Article organisation/flow. Therefore GAN and FAC (with mainly general reviewers) focus on these aspects. This works usually because by the time an article reaches this point it usually has most of the content aspects covered. Meanwhile the WikiProjects help to ensure that the Content parameters are covered: Content Coverage and Quality, Trustworthiness, NPOV and perhaps Article Scope and Organisation. That is why Stub/Start/B/A reviews mainly focus on those aspects, and it explains why an article can be A-Class while failing GA (or vice versa). Content issues are most important in the early stages of an article's development, so it is appropriate that much of the assessment occurs at the Stub/Start/B level. Nevertheless, I think the A-Class review system such as is done at WP:MILHIST is extremely valuable as a form of expert peer review, because it ensures that the content aspects are all well covered before the general reviewers at FAC even see it. This is analogous to peer review for scientific publication, before the house style and grammar issues are cleaned up by the copyeditors working for the publisher.
Soon we will have a third main "category" to worry about - the specific article Version - but let's leave that until WP:FLR becomes a reality (next year?)!
Well, I probably missed a few things, but I think it's important to try and lay these things out. It's especially important to realise that Content issues and Style issues are different "beasts" requiring different types of review. Walkerma 05:57, 13 October 2007 (UTC)
The only thing I may add is the need for true external verification by vetted experts, done completely outside Wikipedia. I'm talking about Nature-like verification, but on a per-article basis. Unfortunately, that is probably a long way off. That said, getting Wikipedia cited in high-profile subject publications in a few subjects should help. We just need to do that more.
What I'd like is an outreach campaign, showing professors how much Wikipedia articles are used, and how important it is for the articles to be accurate. With internal style review, WikiProject content verification, and external expert vetting, we'd get the quality trifecta.
Saturday, October 13, 2007
Wikipedia: the encyclopedia you can write in, and not feel bad about it!
Sunday, September 16, 2007
A couple of days ago, though, these grumblings were implemented. They look pretty cool, by the way. Have a look at them here:
Of course, like in every project of this magnitude, my First Law of Wikipedia has already proven itself true, but well, that was to be expected. Anyways, as usual, comments and suggestions for improvement are welcome, even though I didn't actually do anything in this proposal. ;)
Saturday, September 8, 2007
Cool, huh? :)
Tuesday, May 8, 2007
Monday, May 7, 2007
This is for rants about the article. If you wish to point out a problem in the article (e.g. factual error etc), please use its normal talk page.
What the... ?
Sunday, May 6, 2007
We'll see what we can do with it. But if we can do something similar to what we did with extensions, then I'm expecting to drool already. :) The latest invention (by Duesentrieb of dewiki fame) is the Extension Matrix, which includes info about every extension documented on MediaWiki.org... :)
Keeping with the extensions theme, there are a few extensions that *every* MediaWiki installation should have. These extensions are incredibly useful, and include functionality that has become expected of MediaWiki end users. Hopefully someone will decide to merge these extensions somehow into the MediaWiki core, so they don't have to be downloaded individually:
require_once( "extensions/Renameuser/SpecialRenameuser.php" );
require_once( "extensions/ParserFunctions/ParserFunctions.php" );
require_once( "extensions/Filepath/SpecialFilepath.php" );
require_once( "extensions/Makesysop/SpecialMakesysop.php" );
require_once( "extensions/Cite/Cite.php" );
require_once( "extensions/CategoryTree/CategoryTree.php" );
require_once( "extensions/Newuserlog/Newuserlog.php" );
require_once( "extensions/Contributors/Contributors.php" );
and of course...
require_once( "extensions/Poem/Poem.php" );
One notable exception is this one:
Oversight should not be installed in new installations of MediaWiki. Why? Because it will collide with the current work to implement rev_deleted and cause unnecessary issues. Rev_deleted is due for deployment sometime after MediaWiki 1.10.0 is released, and is quite exciting. See the Gallery for some pretty screenshots... :)
Friday, May 4, 2007
They still need to be verified and updated, but any beta-testers of the old instructions would be very welcome.
See http://www.mediawiki.org/wiki/Manual:Installation and http://www.mediawiki.org/wiki/Manual:OS_specific_help for the pages that were moved. Please poke us on #mediawiki or on the MediaWiki Project Forum to resolve them.
Thursday, May 3, 2007
"Pagerank is not a consideration for Wikipedia — it contributes nothing to the project of writing an encyclopedia. This is why SEOs and Googlemancers find it so hard to find anyone at Wikipedia or Wikimedia who cares." - David
Pagerank is indeed not a consideration for Wikipedia. It is not something your average editor thinks about when removing "boobies" from [[George W. Bush]]. However, was that true in the past? Google et al. obviously accounted for a huge amount of our incoming traffic. Actually, I'd dare say all of it. And we obviously wouldn't have written anything if no one was going to read it... so, without Pagerank, would there be so many editors? Would Wikipedia, with all of its flaws, be the entity it is today?
That makes for a very interesting line of thought, but I guess the question has become the following: Have we matured as a website enough that we don't need to be plastered all over the top of Google's search results? Have we gotten to the point where people don't look first at Google, but search immediately within Wikipedia? Does anyone have the stats about that? What would be the raw effect of nofollow on our popularity?
Now, from the editor's point of view: I sure love to have an FA I worked on as the top hit in Google. It makes me at least have the impression that I've done something productive. So, does it mean that I write for Google? Does it mean that I care about Pagerank, and the impact that it could have on the number of readers our articles have? Does it mean that indirectly, the number of readers I have is an enticement that causes users to produce more content?
No, not really. Free content would continue to exist, whether it is at the top of my bookshelf, or the top of the results in a search engine. Regardless of anything, people do, and will continue to, read Wikipedia. It is nice to have many more people reading my work as a result of Google's algorithms, but it really isn't the reason I write here. So... no, I guess I don't disagree with David's conclusion. I don't care.
Saturday, April 7, 2007
As I wrote in my previous post, assuming there were no last minute hitches, the Version 0.5 CD would go on sale on March 26. However, there were a few snags that delayed publication until now. Thankfully, those final obstacles are now a thing of the past, and the long-awaited goal of the Version 1.0 Editorial Team has been accomplished: A CD containing an offline, static version of Wikipedia is available to customers today, thanks to Linterweb.
Our release consists of 1964 articles, which span a variety of subjects, from Art to Zeus. Additionally, it includes a considerable number of Featured articles from the English Wikipedia, to showcase the best of our work. Most importantly, it comprises more than a year of work by Wikipedians and Linterweb, across countries, oceans and continents. Like Wikipedia itself, the release is licensed under the GNU Free Documentation License, version 1.2, under the same conditions that apply to Wikipedia articles. Kiwix, the reader software, is released under the GNU General Public License. Finally, a portion of the sales will go directly to the Wikimedia Foundation.
Putting this into perspective, Jimbo Wales first put forward his ideas for "Pushing to 1.0" on August 20, 2003. From there, Jimbo's proposal was discussed, but overall nothing happened: Jimbo's ideas remained mostly in limbo. Now, fast forward to November 20, 2004: Maureen creates the Version 1.0 Editorial Team, and work begins at a slow pace. Originally, work is done by small groups selecting articles; several of these lists still exist, such as the Core Topics list and the Core Topics supplement. However, at around the same time, another part of the team decided to begin work on a test version, to give us lessons as we kept pushing to Version 1.0.
The work on Version 0.5 began in the first week of May 2006, with the creation of the V0.5 nominations page. Nominations actually began on May 23, 2006, and from there, the rest is history. Nowadays, the Version 1.0 Editorial Team is a healthy project, with dozens of dedicated editors. But we didn't have anything to show for our work. I'm glad to be able to say that this has now changed.
Consider some of the consequences of having a published version of Wikipedia: all of a sudden, you can say for sure what is going to be in a Wikipedia article at the moment you need it. You can point to the CD, and say that this is the revision you obtained your data from. Let's take that analysis even further--you can point to the CD, and say "I wrote this." Isn't that a great feeling? Doesn't that motivate you to write or improve even more articles? However, there is something much more important. Consider Jimbo's personal statement of our vision as Wikimedians:
"Reporters are always asking me why I’m doing this, why Wikipedians do this? I think you know why.
"I can’t speak for everyone, but I can speak for myself. I’m doing this for the child in Africa who is going to use free textbooks and reference works produced by our community and find a solution to the crushing poverty that surrounds him. But for this child, a website on the Internet is not enough; we need to find ways to get our work to people in a form they can actually use."
This is a small step, but a step nonetheless, that will get us closer to that goal. Please take a look at the CD - it will be available for purchase via http://www.wikipediaondvd.com/, for a cost of ~€12 ($15 USD, £8.50). You will also be able to download a copy of the CD's ISO file, both from a mirror and in BitTorrent format.
Either way, as the number "0.5" indicates, we are not done; static releases, just like Wikipedia itself, are works in progress. We're starting work on Version 0.7 soon, and as with everything else on Wikipedia, "more eyes are always more better" (I know that is bad grammar, but that's intentional ;) ). After v0.7, if things go as planned, we intend to finally publish Wikipedia Version 1.0 and accomplish our goal. Then there's also the whole idea of WikiReaders that we are beginning to consider. Overall, the future looks promising. :)
As part of the Version 1.0 Editorial Team, I hope that you will enjoy and use our work, and we hope to provide you with more news of this kind very soon.
Friday, March 23, 2007
Let's say that we have some big news...
A static release of Wikipedia is now a reality. :)
Copying the words of Martin Walker, the WP:1.0 coordinator for Version 0.5:
Assuming there are no last minute hitches, the Wikipedia:Version 0.5 CD should be going on sale on March 26 at www.wikipediaondvd.com for around 10 Euros/$US13-14 (a portion goes to the Foundation). It will also be made available for free download. It consists of 1964 articles and a set of navigation pages, with an open source (GPL) search engine, Kiwix, developed by Linterweb. We now have an ongoing collaboration with Wikimedia France, and User:Kelson wrote many of the scripts for Version 0.5. This CD will make a great birthday present for your loved one! Walkerma 05:54, 22 March 2007 (UTC)
Remember that this is a *test* release. Please check it out, have a look at it, and give us feedback about it.
By the way, all of the articles included in Version 0.5 will be available in Version 0.7, which is being prepared right now. So come join the Version 1.0 Editorial Team to help out!
Monday, March 19, 2007
I was planning on making a longer article, but due to time constraints, I'm not able to do so. However, during a 1.0 meeting earlier today, there was an interesting proposition, and I wanted to ask the greater Wikimedia community about it:
The German Wikipedia has published WikiReaders before. Why cannot the English Wikipedia do so?
There is a page in enwiki that is... pretty much dead nowadays, but that has a few suggestions about potential WikiReader content. While the suggestions are old (and probably not a good idea now, as the articles may have decayed in quality), it did bring up a troubling question: why is the English Wikipedia not producing them? There were two answers that came to mind:
- No one knows about them. If users don't know that they can make WikiReaders, they won't try to make them.
- More importantly, users do not know how to make them. The user who proposes a WikiReader will probably not know who to contact, and what issues follow from there.
So, I pose the following questions to everyone:
- Do you have a group of articles that you would like to see in a WikiReader?
- We are beginning to consider how we can make publication of selected subsets of Wikipedia (or even Wikibooks, even though it is a bit outside our scope) articles easier. What suggestions do you have for the process?
- What would you want to see as the end result?
- Do you even want us to consider this, or do you think this is a waste of time?
- Would you be willing to help us at any stage in the process?
Anyways, I'm eagerly interested in hearing all of your opinions, either here, or at 1.0's talk.
Friday, March 16, 2007
As of today, there were a total of 385,469 assessed articles in the English Wikipedia. If we use the figure from Special:Statistics of 1,688,879 articles (sorry, no permalink here), that means that 22.8% of articles have been looked at by someone and classified according to their quality. (If we also count the unassessed articles that are stored in the 1.0 database, we have 40.7% of the article base covered.) While the numbers themselves make for interesting observations, most users do not know where those numbers come from, or how they are processed. Since I was involved in the design of the WP:1.0 bot framework, I thought it would be a good idea to explain how it works. It is quite a fascinating process, if I'm allowed to say so myself... ;)
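The 22.8% figure is simple arithmetic on the two counts quoted above. As a minimal sketch (the function name is made up for illustration and is not part of the bot), it can be checked like this:

```python
def coverage_percent(covered: int, total: int) -> float:
    """Return covered/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * covered / total, 1)


# Figures quoted in the post above.
ASSESSED_ARTICLES = 385_469
TOTAL_ARTICLES = 1_688_879

print(coverage_percent(ASSESSED_ARTICLES, TOTAL_ARTICLES))  # 22.8
```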
First the WikiProject needs to do a bit of legwork. The WP1.0 bot uses the Mathbot code, which uses Perl to determine if there were any additions or subtractions to a particular category. Therefore, the vast majority of the processing is just a matter of, "Was an article added to this category? Was one removed? Was there an article in a category that was in a different category yesterday?" Before those operations can be done, the categories to process must, well, exist. Picking on WikiProject Tropical cyclones (as usual), the categories to make are:
- Category:Hurricane articles by quality
- Category:FA-Class hurricane articles
- Category:A-Class hurricane articles
- Category:GA-Class hurricane articles
- Category:B-Class hurricane articles
- Category:Start-Class hurricane articles
- Category:Stub-Class hurricane articles
- Category:Unassessed-Class hurricane articles
Projects also have the ability to categorize pages by importance or priority. This is slightly more controversial (take WikiProject Biography, for example: how can you say that someone is of "Low importance" without upsetting the person?) and is not required. However, for the projects that desire to use this portion of the framework, there's another category setup to do, parallel to the quality categories:
- Category:Hurricane articles by importance
- Category:Top-importance hurricane articles
- Category:High-importance hurricane articles
- Category:Mid-importance hurricane articles
- Category:Low-importance hurricane articles
- Category:No-importance hurricane articles
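Both category sets follow a regular naming pattern, so a project could generate the list of pages to create instead of typing it by hand. The following is only a sketch in Python: the function name is hypothetical, and the pattern is inferred from the hurricane examples above rather than taken from the bot's code.

```python
QUALITY_CLASSES = ["FA", "A", "GA", "B", "Start", "Stub", "Unassessed"]
IMPORTANCE_LEVELS = ["Top", "High", "Mid", "Low", "No"]


def project_categories(topic: str, with_importance: bool = False) -> list:
    """Build the category names a project needs, e.g. for topic 'hurricane'."""
    capitalized = topic[0].upper() + topic[1:]
    # Parent category first, then one category per quality class.
    names = [f"Category:{capitalized} articles by quality"]
    names += [f"Category:{cls}-Class {topic} articles" for cls in QUALITY_CLASSES]
    if with_importance:
        # The importance scheme is optional and mirrors the quality one.
        names.append(f"Category:{capitalized} articles by importance")
        names += [
            f"Category:{level}-importance {topic} articles"
            for level in IMPORTANCE_LEVELS
        ]
    return names


for name in project_categories("hurricane", with_importance=True):
    print(name)
```

Leaving `with_importance` off yields only the eight quality categories, which matches the point that the importance scheme is optional.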
Once this is done, the way most projects feed their articles onto the bot framework is by adding a "class" parameter to their WikiProject banner, and optionally, an "importance" parameter as well. Once the MediaWiki job queue does its thing, all of the articles are in the "Unassessed" category.
|Category|Articles|
|Unassessed-Class hurricane articles|Hurricane Katrina, Hurricane Nora (1997), Hurricane Wilma|
Assuming for a moment that these articles are brand-new and unassessed (which they aren't), at this point, the bot has still not done its daily run. At about 3:00 UTC that day, the bot starts its run. It reads the category tree, and copies it into its internal database. The table above is now mirrored on the hard drive of the bot's computer. The bot also spits out a log, and updates both the statistics table for the project, and the global database.
Over the course of the day, these articles are assessed. This is done by updating the parameters on the WikiProject banner. An example of this would be the following, on Talk:Hurricane Katrina:
That same day, the three articles are assessed: Hurricanes Nora and Katrina to FA-Class, and Wilma to B-Class. So, our categories in the wiki are now the following:
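|Category|Articles|
|FA-Class hurricane articles|Hurricane Katrina, Hurricane Nora (1997)|
|B-Class hurricane articles|Hurricane Wilma|
|Unassessed-Class hurricane articles|(empty)|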
At this point, the bot runs. Again, all it does is compare the current categories and the previous category snapshot. The bot sees that the FA-Class was formerly empty, and now contains two articles: Hurricane Katrina and Hurricane Nora; the B-Class category contains Hurricane Wilma, and the Unassessed category is now empty. The bot now updates the statistics tables and the log, and updates its own snapshot with the current data.
The internal logic of the bot allows it to see that in these two cases, the articles were upgraded from one class to another, and the log will reflect that. If the bot sees a new article, then the log will also identify it as a new addition. If an article was removed, then the bot will flag its removal in the log as well. A recent modification to the code now allows page moves to be adequately recorded, instead of being recorded as an addition and a deletion.
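The comparison step described above boils down to diffing two snapshots. The real bot is written in Perl on top of the Mathbot code; what follows is only a Python sketch of the idea. The data layout (a dict mapping article title to quality class), the function name and the log wording are all assumptions, and page-move detection is left out.

```python
# Quality classes from lowest to highest, following the category list above.
RANK = {cls: i for i, cls in enumerate(
    ["Unassessed", "Stub", "Start", "B", "GA", "A", "FA"])}


def diff_snapshots(previous: dict, current: dict) -> list:
    """Compare two {article title: quality class} snapshots; return log lines."""
    log = []
    # Articles that appear only in the new snapshot are new additions.
    for title in sorted(current.keys() - previous.keys()):
        log.append(f"added: {title} ({current[title]})")
    # Articles that appear only in the old snapshot were removed.
    for title in sorted(previous.keys() - current.keys()):
        log.append(f"removed: {title} (was {previous[title]})")
    # Articles in both snapshots may have changed class.
    for title in sorted(previous.keys() & current.keys()):
        old, new = previous[title], current[title]
        if old != new:
            verb = "upgraded" if RANK[new] > RANK[old] else "downgraded"
            log.append(f"{verb}: {title} {old} -> {new}")
    return log


yesterday = {
    "Hurricane Katrina": "Unassessed",
    "Hurricane Nora (1997)": "Unassessed",
    "Hurricane Wilma": "Unassessed",
}
today = {
    "Hurricane Katrina": "FA",
    "Hurricane Nora (1997)": "FA",
    "Hurricane Wilma": "B",
}

for line in diff_snapshots(yesterday, today):
    print(line)
```

After each run, the bot would replace `yesterday` with `today`, which is the "updates its own snapshot" step from the walkthrough.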
This is the bot structure that 427 WikiProjects, task forces, Regional noticeboards, etc. use nowadays to monitor article quality and editorial progress. It works flawlessly, except that it cannot scale forever. The bot is now run every other day, as each one of its runs takes approximately 36 hours to finish. This bot framework is also used by the Hungarian Wikipedia, and the Spanish Wikipedia as well; so if a friendly developer wants to make this part of the Wikimedia wiki farm's software arsenal, a lot of people would be happy... :)
Monday, March 12, 2007
The first portion of the position indicates that Wikipedia is only an encyclopedia, and that ancillary activities such as signing autograph books are tantamount to wasting time. While Wikipedia is primarily an encyclopedia, any editor that stays on the site for a significant amount of time recognizes that it is simply impossible to run and maintain the encyclopedia without any socialization with others. One of the site's raisons d'être is that whatever mistake a user makes, a second user will correct; in order to maximize the efficiency of this self-corrective process, it is necessary to allow users some (note: not complete) leeway to socialize with other users.
Other users indicated that while they saw no harm in the pages, they also saw no use, so they should be deleted. While the "no use" is a personal judgment that I respect, I have to disagree with the "no harm" assertion. If a group of productive users is doing something such as maintaining a page to keep an autograph book, or an "office bracketology pool", or something that is otherwise innocuous, I don't see how it is productive to go ahead and say, "No, you are violating policy." The reason? There are two possible outcomes to this. In the first, the user stops participating in the side activity, but is also forced to go elsewhere for it, which means that he will spend less time editing Wikipedia. (Remember, the assumption here is that we're talking about productive users, not users whose entire purpose here is to use Wikipedia as a chatroom.) In the second, the user resists and goes on the defensive, which makes a heated situation more likely.
Finally, it is inherent in human nature to try to personalize one's space, to make it one's own, and to make it a place where one can feel comfortable. From my own personal experience, I remember that back when I was a n00b, the first thing I did was to make a user page, so I would not feel as much of an outsider in a new place. Some users have previously indicated that they do not understand why this is an issue at all; however, for me, it was similar to extending one's arm for a handshake upon entering a new space. If I had had my user page prodded in the month between when I made my first edit to my user page and when I made my first edit to Dubya, I would have considered Wikipedia to be a hostile place, and would have never returned. The last thing we want to do as Wikipedians is to make Wikipedia appear as a contentious environment.
Our success depends on how many editors are comfortable editing here, and taking actions that are "anti-User" or "anti-Community" on the surface does not help us retain badly-needed users. Our success depends on whether we can produce a culture that nurtures collaborative processes between editors, perhaps even more than on the quality of our articles. As a result, if I had seen the debate while it was open, I would have opposed the nomination, and !voted keep.
Currently, not much going on. After the storm due to the credentials controversy subsided, everything else seems to have returned to normal. Vandals still are wasting our time, hurricane articles need to be improved and sent to FAC, how to fix RFA keeps being discussed, etc. All the usual things are going on.
I'm trying to see if I can make a patch for Bug 471. Let's see how that goes.
Signing off for now,