Determining the relative quality of one Wikipedia project to another: One approach with English, Spanish, Catalan, Galician, Aragonese and Euskera Wikipedias

This is not as polished or as finished as I would have liked.  Apologies. Life got in the way.  I think the core findings and methodologies are still interesting and worth sharing despite the lack of polish. The raw data is here: Spanish female politicians, and may be useful in terms of understanding how the results differ when you include null values as zero, rather than leaving them out of the conclusion.

Recently, on the research list, there has been a discussion regarding understanding the relative quality of articles on one language Wikipedia project to another Wikipedia project.


Exactly how to go about doing that is a rather subjective task, as quality could be potentially defined as, well, subjective. I’m going to try to do this within a very limited context.


The reason for these limits is that the metrics for easily measuring quality largely depend on the specific field of inquiry.  Quality sport articles will have different features than quality articles about plants, which will in turn have different features than quality articles about military battles.


At the same time, I am not a programmer.  I cannot easily do programming things that would allow me to do bulk analysis of articles.  I need a very small sample to be able to feasibly work with.


When dealing with different languages, there is also an issue of best sourced material.  Often, people use English sources because those are easily available.  When articles are translated, in many cases people appear to just reuse those sources or find a best local fit.   To get a better idea of the actual quality, it appears to me that the subject matter used to assess quality should largely be outside the English-speaking domain, for the purpose of best understanding source usage.  Quality, thus, cannot become purely based on the quality of the translation and the local translators' willingness to use other sources.


After some thought, I have decided to use the articles for “Female MEPs for Spain.”  (Women representing Spain who are or have been Members of the European Parliament.) This is a small, finite list, which means I can manually get a large amount of data for comparison purposes.  All the articles are about the same topic, and because they are all about women, there are unlikely to be issues related to systemic bias in content creation. These articles are likely to exist in English, Spanish and possibly other languages of Spain.   Most of the sources should be from Spain because the topic is Spain, so a better feel for local quality can be understood in the context of language.


For this analysis, the decision was made not to examine languages other than those used in Spain.  This is because the sample for other languages is very small, and given the already small sample, it is not likely to have a lot of useful information.


In English, there are 20 articles.  In Spanish, there are 14 articles.  In Catalan, there are 20 articles.  In Galician, there are 7 articles.  In Euskera, there are 3 articles.  In Occitan, there are 0 articles.  In Extremaduran, there are 0 articles.  In Aragonese, there is 1 article.  In the context of the level of coverage, the best languages are English and Catalan because they have 20 total articles.  The next best Wikipedia in terms of level of coverage is Spanish, followed by Galician, Euskera and Aragonese.  Other languages in Spain are not represented.


This lends itself to a philosophical question: With the languages having uneven sample sizes, should the analysis for overall article quality be based on the actual articles and ignore the non-existent articles, or should it treat the non-existent articles as having values of zero?  In this case, I will use both measures to determine quality.  The emphasis will be placed more on the existing articles because it allows for actual comparisons between articles.


There are a lot of criteria for determining the quality of an article about a female Spanish politician.  By using many criteria, examining them together, you can begin to get a comprehensive idea as to the relative quality while realizing that each article’s quality may differ.


Generally, this analysis defines article quality on Wikipedia about a politician as having four components.   The first is appearance, and the presence of things not necessarily connected to the article text.  This arguably is the least important criterion. It includes having key external links, easy ways to get simple information without having to read the text, and having a picture.  The second is the content of the article itself in terms of length and other general features related to overall perception of article quality that do not relate to the topic.  This would be the second least important criterion, because these features are independent of the actual textual content in some ways. The third criterion would be sourcing.  This matters a great deal as it defines the foundation of knowledge.  The fourth criterion is comprehensiveness of the article as a “political biography”, by having some of the features that define a good political biography.  These criteria should be weighted to favor the more important article criteria over the least important ones.

Article “Appearance” Criteria

As this set of criteria is the least important one, points were weighted to make it count less than the others, with a maximum total of 4.85 points available.


The first criterion I am going to use is: Does the article have a picture?  I believe this is a criterion for quality because many people want to know what a politician looks like.  Related to this, does the article use a high quality picture of the politician? If the article has 1 picture, it gets 1 point. If the article has a picture but only because it was derived/cropped from another picture, it gets half a point.   If the article has 2 or more pictures, it gets 2 points.[1]  Because none of the articles have images labeled as being high quality, there is no value in assigning further value to pictures.


The second appearance-related criterion I am going to use is: External links, found either in an infobox or in the article, to the politician’s official page and any official social media presence they may have.  The reason for including this criterion is my personal belief in the importance of going to an officially sanctioned source as part of knowledge formation around a subject.  Half a point will be given for a link to an official site, and half a point will be given for a link to an official social media presence.  The most available points for an article will be 1.


The third appearance-related criterion is: Presence of an infobox and a footer.  Infoboxes provide a lot of quality information in an easy to consume manner.  Related to this is the presence of a footer that provides related conceptual topics, such as the person who preceded or succeeded the woman in her position or other members of the same political party.  The presence of an infobox has been assigned a value of 0.6 and the presence of a footer has been assigned the value of 0.25.


The final criterion is the presence of a warning box on the article that says there is a problem with the article that almost certainly relates to content.   It is a potentially strong visual cue to readers that the article is not of high quality. All articles have 1 point. If there is a warning on the page itself, the article loses a point.


Using only appearance criteria, the maximum number of points an article could have is 4.85.   There are only two articles which have full points, and both are found on Spanish Wikipedia: Pilar del Castillo and Rosa Díez.   The lowest theoretically possible score is zero.  No articles have zero.  The lowest quality article in terms of total points is Inés Ayala on Spanish Wikipedia.
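As a rough illustration, the appearance rubric described in this section could be written down as a small Python function. This is only a sketch and not the process actually used for the analysis, which was done by hand. The field names are hypothetical, and treating the half point for a cropped picture as applying only when it is the article's sole picture is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Appearance:
    """Hypothetical record of the appearance features of one article."""
    picture_count: int = 0
    only_picture_is_cropped: bool = False   # assumption: matters only for a single picture
    has_official_site_link: bool = False
    has_official_social_link: bool = False
    has_infobox: bool = False
    has_footer: bool = False
    has_warning_box: bool = False


def appearance_score(a: Appearance) -> float:
    """Score one article on the appearance criteria (maximum 4.85)."""
    # Pictures: 2 points for two or more, 1 for one, 0.5 if that one is a crop.
    if a.picture_count >= 2:
        pictures = 2.0
    elif a.picture_count == 1:
        pictures = 0.5 if a.only_picture_is_cropped else 1.0
    else:
        pictures = 0.0

    # Official links: half a point each, at most 1 point in total.
    links = 0.5 * a.has_official_site_link + 0.5 * a.has_official_social_link

    # Infobox is worth 0.6 and a footer 0.25.
    layout = 0.6 * a.has_infobox + 0.25 * a.has_footer

    # Every article starts with 1 point and loses it if a warning box is present.
    warning = 0.0 if a.has_warning_box else 1.0

    return round(pictures + links + layout + warning, 2)
```

An article with two pictures, both official links, an infobox, a footer and no warning box reaches the 4.85 maximum under this sketch.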


Article “Text Quality” Criteria

The first text quality criterion is article sections.  The presence of article sections suggests the article has organization and real structure.  Each article gets 1 point for each unique section related to the article text.  Headers for external links, see alsos, and references are not counted.


The second criterion used for text quality was article length in words.  The method for determining this was to determine the length of the text of the article, minus references, external links, infobox text, footer text, table text, image descriptions and lists.  The articles were then sorted based on length.  The longest 20% of articles were given 4 points.  The next longest fifth were given 3 points.  The middle fifth of articles were given 2 points. The next shortest fifth were given 1 point.  The shortest 20% were given 0 points.  This was done to give articles comparatively more value if they were among the longest, and less value if they were shorter, without passing any judgment on the quality of the text itself.


The third criterion again uses article size.  This time it divides the number of words by 250 to derive a number of points.  The divisor of 250 was chosen largely to offset the large values given to the outliers, by bringing the number down to be more in line with the relative weighting used with other criteria.   This gives value to the actual length of the article, as opposed to relative length.
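The three text quality measures above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the method actually used, which was manual: the function names are hypothetical, the bands are assumed to be five equal fifths of the sorted list, and ties in word count are simply left in their original order.

```python
def length_band_points(word_counts: dict) -> dict:
    """Map each article title to 0-4 points based on its length rank.

    Articles are sorted from longest to shortest and split into five
    equal bands: the longest fifth gets 4 points, the shortest fifth 0.
    """
    ranked = sorted(word_counts, key=word_counts.get, reverse=True)
    n = len(ranked)
    return {title: 4 - (i * 5 // n) for i, title in enumerate(ranked)}


def text_quality_score(section_count: int, band_points: int, words: int) -> float:
    """Combine the three text criteria for one article.

    - 1 point per text-related section
    - 0-4 points for relative length (from length_band_points)
    - words / 250 points for absolute length
    """
    return section_count + band_points + words / 250
```

For example, a 500-word article with 3 sections that falls in the middle fifth by length would score 3 + 2 + 2 = 7 points under this sketch.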


Using readability as a criterion was considered.  For English, Flesch-Kincaid was used.  For Spanish, the Fernández-Huerta scale was used.  The problem was that trying to use one or both for the languages they were not intended for led to results that seem implausible.  Both scales had problems with Euskera, claiming the articles were written at an extremely high level.   The sole Aragonese article had similar problems.  While Catalan and Galician appear to be somewhat in line with Spanish articles, the inability to use the scales for two of the languages and the requirement to use another system for English means readability is not feasible as a criterion.


Because Wikipedia articles generally do not have a predictable maximum ceiling for article word length or number of sections, there technically is no ceiling for the maximum number of points available in this category.


Article “Sourcing” Criteria

The first sourcing-related criterion is total number of sources.  An article gets 0.3 points for each source found in the reference section of the article.   The multiplier of 0.3 was chosen largely to offset the large values given to the outliers, by bringing the number down to be more in line with the relative weighting used with other criteria.


The second sourcing criterion is the language of the sources.  Linguistic diversity amongst Spain’s languages should assist in offsetting potential POV problems and assist in providing the best coverage for politicians from areas where Spanish is not the sole regional language.  The use of other language sources also potentially provides a more global perspective on the politician’s influence. For every different language among an article’s sources, outside the language of the project, the article will be awarded half a point.  (In some cases, the original source may be broken.  In this case, the website language will be used and value given based on that.)  Few points are being awarded because of a desire not to provide too many additional points to articles just by virtue of the article having sources.


The third sourcing criterion is diversity of sources.  Ideally, the article should draw from different types of sources in order to provide a comprehensive, factual and neutral presentation of the person’s political life.  This is in line both with Wikipedia’s five pillars and with the requirements of a good political biography.  The different types of sources include newspapers and other news media (television, radio, magazines), books, academic and trade journals, academic and education websites, social media, government (and parliament) sites, conference/commercial/social organization (not political or governmental) sites, party/political websites, and official sites.  Each category was worth one point if the article had at least one reference in it.
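Taken together, the three sourcing criteria could be sketched like this in Python. Again, this is only an illustration: the category labels and the `(language, source_type)` representation of a source are hypothetical choices made for the example, not something the original assessment used.

```python
# Hypothetical labels for the nine source categories listed above.
SOURCE_TYPES = {
    "news_media", "book", "journal", "academic_site", "social_media",
    "government", "organization", "party", "official_site",
}


def sourcing_score(sources: list, project_language: str) -> float:
    """Score the sourcing of one article.

    `sources` is a list of (language_code, source_type) tuples,
    one per entry in the article's reference section.
    """
    # 0.3 points per source.
    count_points = 0.3 * len(sources)

    # 0.5 points per distinct source language other than the project's own.
    other_languages = {lang for lang, _ in sources if lang != project_language}
    language_points = 0.5 * len(other_languages)

    # 1 point per distinct source category that appears at least once.
    type_points = len({kind for _, kind in sources if kind in SOURCE_TYPES})

    return round(count_points + language_points + type_points, 2)
```

A Spanish Wikipedia article citing two Spanish newspapers, one English government page and one Catalan book would get 1.2 points for volume, 1 point for two extra languages and 3 points for three source types.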


Article “Political Biography” Criteria

The criteria for a good political biography tend to involve, broadly, getting a better idea of how politics and government work while reading about the subject of the biography.  Some of the criteria for good political biographies may be slightly problematic in a pure Wikipedia sense, in that the source material just may not be available to address the points adequately.  Still, in at least some cases, there should be adequate material about two or three of the politicians involved who are represented in most languages to begin to get an adequate picture and allow these outliers to pull up the average for the remaining article subjects. Because of the relative importance of these criteria against all other criteria, each has 12 available points, where if the biography partially meets the requirement, some points may be given.  This provides a maximum of 72 points, which accounts for roughly half the points the maximum article has available.


The assessment of these criteria is purely subjective.  To a certain extent, the criteria also universally require greater depth so as to contextualize events that take place in the life of a politician.  The shorter the article and the less information about a specific topic, the fewer points will be subjectively given. For instance, if the thought process is only provided for one particular incident and the explanation is short, then one point will be given.  If the explanation is longer or there are two incidents where short thought processes are explained, then two points may be given.


The first criterion is: does the article present information about how the person governed?  This includes basic information about what the person did while in power.  To get at least one point, there needed to be at least one or two facts about what legislation the politician was involved with or voted for.  Holding the office alone was not treated as giving any sense of how the person governed.


The second criterion is: does the article present information on the thoughts of the leader in terms of how they governed?  The article needs to explain some of the politician’s thought process behind political decision making.  The article cannot present events absent any context as to the politician’s reasons for their actions.


The third criterion is: does the article provide insight into how the person impacted political structures and policies in Spain, their specific region or internationally?  Context needs to be provided as to the impact of these policies so the reader understands the short and long term consequences of the politician’s actions. To a certain extent, if the article mentioned how the politician performed relative to their party during an election, at least one point was awarded.  One point may also have been awarded had some background information been provided about how their political party works in a national, non-office holding context.


The fourth criterion is: does the article show how being in power impacted the individual politician?  This can be biological or personal.  The person had a heart attack, their hair went gray, etc.  Their involvement in politics ruined their relationships, or put them in a position where they met a future spouse, or kept them in the closet.  The individual went to jail, or was continually followed by journalists who allowed them no privacy.    In cases of the politician going to prison for corruption or being found guilty of corruption, zero points were awarded here unless details were provided on how this impacted them personally.


The fifth criterion is that the biography does not separate the person into having a purely private life and a purely public life.  The two should be explained as they relate to each other, especially as the person’s primary notability will be for being a politician.  Details about a politician’s life should not be present just for the sake of having them there, but be contextualized against their political life.  At university, did they display an interest in politics? Did a labor dispute put them into a place where they became politically active inside a job?  What events led them to becoming a politician?  How did their previous life experiences prepare them for being a politician?   To a certain extent, the article having a paragraph with facts about their education and other details about their life earned one point.  Only when there were more of those details, and they connected more directly in the text to their political activities, was a biography given more than 1 point.


The sixth criterion is the relevance of the biography to Spanish and other Europeans who may have been impacted by the political events the politician has been involved in.  Readers need to be able to understand the politician’s impact on their own lives.
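The ratings themselves are subjective, but the bookkeeping for the six criteria above is simple arithmetic: six ratings of 0 to 12 points each, for a maximum of 72. A minimal Python sketch, with hypothetical short labels for the criteria, might look like this.

```python
# Hypothetical short labels for the six political biography criteria.
CRITERIA = (
    "how_they_governed",
    "thought_process",
    "impact_on_politics",
    "impact_on_person",
    "public_private_integration",
    "relevance_to_readers",
)
MAX_PER_CRITERION = 12


def biography_score(ratings: dict) -> int:
    """Sum the subjective ratings for one article (0 to 72 points).

    Criteria missing from `ratings` count as 0. A rating outside
    0..12 raises ValueError so that data entry mistakes are caught.
    """
    total = 0
    for name in CRITERIA:
        points = ratings.get(name, 0)
        if not 0 <= points <= MAX_PER_CRITERION:
            raise ValueError(f"{name}: rating {points} is outside 0-{MAX_PER_CRITERION}")
        total += points
    return total
```

The code only enforces the ranges and the total; deciding how many points a given article deserves on each criterion remains a human judgment.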


With 72 available points for each article, the most points earned by any article was 16.  It was the Spanish language article about Rosa Díez.  On the other side, 13 articles were assessed as having 0 points.   Over half of these articles were English, accounting for 8 of the 13 total articles.  Catalan had 3 articles assessed as 0 points.  Spanish and Euskera each had 1 article.  Galician and Aragonese had no articles assessed as 0 points.



Overall, the article quality across all languages was relatively poor.  Not a single article would be objectively defined as a good political biography.  In terms of Wikipedia, none of the sample articles met local Wikipedia standards for being a good article.  Most were extremely short, averaging 288 words across all languages.   Most were poorly sourced.  While there was an average of 2.3 sources per article, the median and mode of zero give a better idea as to the actual volume of the sourcing.  Most articles lacked pictures, or had a picture that was cropped from another picture and of poor overall quality.   Most articles did not give the reader a clear idea of the policies the politician supported, nor of the impact that legislation a politician supported had on the lives of the electorate.  Almost all articles failed to explain the wider political impact, or lack of impact, the politician had on Spain.  The articles, across all languages, were not very useful.
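To see why the mean can mislead here, take the Galician figures given in footnote [2]: seven articles, of which only one has any sources, and that one has 17. A short Python sketch using the standard `statistics` module shows the mean being pulled up by a single outlier while the median and mode stay at zero. The variable names are illustrative only.

```python
import statistics

# Galician Wikipedia, per footnote [2]: 7 articles, one with 17 sources, the rest with none.
galician_source_counts = [17, 0, 0, 0, 0, 0, 0]

mean_sources = statistics.mean(galician_source_counts)      # 17 / 7, a little over 2.4
median_sources = statistics.median(galician_source_counts)  # 0
mode_sources = statistics.mode(galician_source_counts)      # 0

# At 0.3 points per source, the average volume score is about 0.7 points per article,
# even though six of the seven articles have no sources at all.
mean_volume_points = 0.3 * mean_sources
```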


Overall, when measured against the assessed criteria, Spanish Wikipedia had the highest quality of articles.   With the highest single article point total of 60, articles on Spanish Wikipedia averaged 18.75 points.  This is significantly higher than the next highest assessed language project, Galician Wikipedia, which had an average point total of 11.27.


Rank  Language   Score
1     Spanish    18.76
2     Galician   11.28
3     Catalan     8.48
4     Euskera     8.38
5     English     7.81
6     Aragonese   4.00


Rounding things out, Catalan Wikipedia was third, Euskera was fourth, English was fifth and Aragonese was last.  Even when the absence of articles is factored in with null values for these articles, Spanish Wikipedia still ranks as having the best article quality. Catalan finishes second, English Wikipedia third, and Galician fourth.  The absence of 13 of the 20 available articles hurts Galician Wikipedia a lot.
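The effect of counting missing articles as zero can be reproduced from figures already given above: the per-language averages for existing articles and the number of articles each project has out of the 20 subjects. The Python below is a sketch under the assumption that a zero-filled average is simply the existing-article average scaled by the share of articles that exist. Because the inputs are the rounded averages from the table, the results are approximate.

```python
TOTAL_SUBJECTS = 20  # number of female MEPs for Spain in the sample

# (average score of existing articles, number of existing articles),
# taken from the ranking table and the article counts given earlier.
reported = {
    "Spanish": (18.76, 14),
    "Galician": (11.28, 7),
    "Catalan": (8.48, 20),
    "Euskera": (8.38, 3),
    "English": (7.81, 20),
    "Aragonese": (4.00, 1),
}


def zero_filled_average(mean_existing: float, n_existing: int,
                        n_total: int = TOTAL_SUBJECTS) -> float:
    """Average over all subjects, counting each missing article as 0 points."""
    return mean_existing * n_existing / n_total


zero_filled = {
    language: round(zero_filled_average(mean, count), 2)
    for language, (mean, count) in reported.items()
}

# Languages ordered from highest to lowest zero-filled average.
ranking = sorted(zero_filled, key=zero_filled.get, reverse=True)
```

Under this calculation the order becomes Spanish, Catalan, English, Galician, Euskera, Aragonese, which matches the ranking described above: projects with full coverage hold their scores, while Galician drops because 13 of its 20 possible articles count as zero.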


Why is the quality of Spanish Wikipedia so relatively high?  Why is Galician the second best language project?   Spanish ranked first in 8 of the 11 criteria.  Galician Wikipedia finished first or second in five of the criteria.  In some of these categories, they had almost a full point above the lower performing language projects.  This was particularly important in the category of political biography, where Spanish Wikipedia articles averaged 4.4 points and Galician Wikipedia averaged 3.2 points per article.  In contrast, English Wikipedia averaged 1.5 points and Catalan Wikipedia averaged 1.35 points. Galician Wikipedia also picked up almost a full point on both English and Catalan Wikipedia when it came to average number of sections per article, getting 1.7 points to English Wikipedia’s 1.05 points and Catalan Wikipedia’s 0.9 points.  Galician Wikipedia also picked up half a point on both Catalan and English Wikipedia when it came to sourcing, averaging 0.7 points per article, which was still measurably less than Spanish Wikipedia, which averaged 2.2 points per article. As each source was worth 0.3 points, this gave Galician Wikipedia more opportunities to get points for quality when it came to language diversity and source diversity.[2]


The political biography quality of an article correlates strongly with the article length, the total number of sources in the article, and the number of sections the article has.  It would be more surprising if these qualities did not correlate well with each other, because articles need length, sources and organization to be able to successfully meet the political biography quality criteria.



[1]  Originally, the intention was to give articles with a recognised quality picture 1 point. In this case, high quality would have been defined as the picture being recognised, either on Commons or on a local Wikipedia project where the image is being used, as a good picture or a quality picture.  Unfortunately, none of the pictures used in the articles met this criterion and so this was not used.

[2] In reality, that did not happen because only one article on Galician Wikipedia had any sources, and it had 17 of them.  The number and diversity of sources was limited in the article, and consequently, both English and Catalan Wikipedia outperformed Galician Wikipedia on source language and source type diversity.