Monday, October 15, 2007

On quality, assessments, and worse things

A very, very common discussion nowadays on Wikipedia is the now-famous "Quantity vs. Quality" debate. At the epicenter of these discussions is the concept of quality itself: how it can be measured, and how it is best recognized.

As it is generally agreed that Wikipedia's quality review system is good, but not featured (heh), there are several discussions regarding the ways the different quality verification systems (in no particular order, they are Good Articles, Peer Review, WikiProject Reviews and Featured Articles) tie together, and how to improve them. Outside my usual spiel (GA and FA are well-established processes, neither one will go away, and PR's reviews are scant to the point they make the page meaningless), I haven't had much input on the page, but I have been silently watching it. That said, it would be a crime against humanity to not reproduce this post by Martin Walker, known outside the WP:1.0 cabal as Walkerma. As has become his custom, he hits the nail on the head about the current state of affairs in Wikipedia's quality system.


All too often quality is presented as being a single "dimension" - going from "---- awful" to "perfect". But it's much more complicated than that. I'd like to lay out how I see it - others may see it differently, but I think any delineation should hopefully clarify things a bit. I'm not trying to stimulate a huge discussion (though by all means discuss it if you think it useful!), rather just show the lay of the land beyond the PR and GA/FA aspects discussed above. I just think that before we change the system we need to define the parameters clearly. The parameters fall into two main categories: Content (which requires good knowledge of the topic) and Style (which requires good knowledge of English and WP:MOS). Here are the parameters as I see them:

  1. Content coverage: In the early stage of article development, this is what is needed most of all. As the article matures it moves towards "comprehensive coverage" and this aspect becomes much less important. The best people for judging this are subject-experts.
  2. Content quality: The issues may be covered, but there are sloppy definitions or actual errors. Again, subject-experts are critical here - preferably experts with good access to reliable sources.
  3. Content trustworthiness: The article may be beautifully written and absolutely correct factually - but if there are no clear citations from reliable sources, how can I know that? Many of our early FAs fall into this category. Also see the Quality versions proposal.
  4. NPOV: Related to trustworthiness, but you can still write a biased piece that is well cited. Requires a consensus from several subject-experts, IMHO.
  5. Article scope: Sometimes a topic is better covered fairly broadly, sometimes it's better to break it up into sub-articles, usually with a top-level article using summary style.
  6. Article organisation and flow: The article may be well written and contain all the important content, yet it looks like it was written by a committee (in effect, it was!). It's a side effect of the fact that our articles are written by many people, often in small disjointed pieces. Needs one or two dedicated author(s) - a subject-expert and someone who can write well.
  7. Quality of writing: Issues here include poor grammar, spelling, verbosity, poor use of paragraphs, etc. Good copyeditors needed here.
  8. Level of writing: A popular or general topic like atom will need to be accessible to a popular, general audience. In a more specialized article like Persistent carbene, it would be ridiculous to explain what an electron is. Ideally needs a subject-expert who can write well.
  9. Aesthetics: Sometimes a nice diagram can really help, or a photograph can bring an article to life. Hard to quantify!
  10. Compliance with house style: The article may be perfect in every way, but have a Heading Like This instead of One like this, or lack the non-breaking space before units, etc. Requires someone with a good knowledge of WP:MOS.

Article length is often informally used as a parameter, but it's largely based on three of the above: Content coverage, Scope and Quality of writing (specifically the conciseness/verbosity part). This is a tricky issue when assessing an article for quality - we may say, "It's long, so it must be pretty comprehensive" but it may fail to cover major aspects and just be very long-winded. If the scope is very narrow it may be hard to make it long enough to get on the radar screen for GAN/FAC - for example, compare Jarvis Island with United States.

Review types

When assessments are performed, the general reviewer can best judge the Style parameters: Quality of writing, Level of writing, Aesthetics, House style and perhaps Article organisation/flow. Therefore GAN and FAC (with mainly general reviewers) focus on these aspects. This usually works because by the time an article reaches this point it usually has most of the content aspects covered. Meanwhile the WikiProjects help to ensure that the Content parameters are covered: Content Coverage and Quality, Trustworthiness, NPOV and perhaps Article Scope and Organisation. That is why Stub/Start/B/A reviews mainly focus on those aspects, and it explains why an article can be A-Class while failing GA (or vice versa). Content issues are most important in the early stages of an article's development, so it is appropriate that much of the assessment occurs at the Stub/Start/B level. Nevertheless, I think the A-Class review system such as is done at WP:MILHIST is extremely valuable as a form of expert peer review, because it ensures that the content aspects are all well covered before the general reviewers at FAC even see it. This is analogous to peer review for scientific publication, before the house style and grammar issues are cleaned up by the copyeditors working for the publisher.

Soon we will have a third main "category" to worry about - the specific article Version - but let's leave that until WP:FLR becomes a reality (next year?)!

Well, I probably missed a few things, but I think it's important to try and lay these things out. It's especially important to realise that Content issues and Style issues are different "beasts" requiring different types of review. Walkerma 05:57, 13 October 2007 (UTC)


The only thing I may add is the need for true external verification by vetted experts, done completely outside Wikipedia. I'm talking about Nature-like verification, but on a per-article basis. Unfortunately, that is probably a long way off. That said, getting Wikipedia cited in high-profile subject publications in a few subjects should help. We just need to do that more.

What I'd like is an outreach campaign showing professors how much Wikipedia articles are used, and how important it is for the articles to be accurate. With internal style review, WikiProject content verification, and external expert vetting, we'd get the quality trifecta.


1 comment:

nojhan said...

The only thing I may add is the need for true external verification by vetted experts, done completely outside Wikipedia.

I'm pretty confident that this will come in a few years. Indeed, we already have a way to mark reviewed articles (with the quality project), some thoughts on how to review articles (with WP 1.0), and some researchers willing to review (well... just believe me here).

What is missing is software to bind them all.

As we are less lucky here, I'm trying to develop such a piece of software. It is called Sci-Wi, and you are welcome to contribute and/or spread the word about it: