Is it a stub? #265

@NovemLinguae

Description

  • A user script that tells you whether an article is a stub.
  • Use the algorithm from SpeciesHelper, which strips templates and other markup, then counts the words. If the article has fewer than 150 words, it's a stub.
  • Maybe have it run on all mainspace pages.
    • If it detects a stub tag and more than 150 words, it can display a warning that the stub tag should be removed.
    • If it detects no stub tag and fewer than 150 words, it can display a warning that a stub tag is needed.
	// TODO: write unit test for this function
	countWords( wikicode ) {
		// convert {{Blockquote}} to text
		wikicode = wikicode.replace( /\{\{Blockquote\s*\|([^}]*)\}\}/g, '$1' );

		// strip templates
		// TODO: handle nested templates
		wikicode = wikicode.replace( /\{\{.*?\}\}/gsi, '' );
		// strip images
		wikicode = wikicode.replace( /\[\[File:.*?\]\]/gsi, '' );
		// strip HTML comments
		wikicode = wikicode.replace( /<!--.*?-->/gsi, '' );
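		// strip HTML tags and refs
		// NOTE: this matches from a < through the first > that follows the next /, so a tag
		// without a slash (for example <br>) also swallows the text up to the next closing tag
		wikicode = wikicode.replace( /<.*?.*?\/.*?>/gsi, '' );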
		// strip heading formatting
		wikicode = wikicode.replace( / {0,}=={1,} {0,}/gsi, '' );
		// strip categories
		wikicode = wikicode.replace( /\[\[:?Category:.*?\]\]/gsi, '' );
		// handle piped wikilinks
		// TODO: handle nested brackets (for example, a wikilink as an image caption)
		wikicode = wikicode.replace( /\[\[[^\]]+\|([^\]]+)\]\]/gsi, '$1' );
		// handle simple wikilinks
		wikicode = wikicode.replace( /\[\[/g, '' ).replace( /\]\]/g, '' );
		// strip bold and italics
		wikicode = wikicode.replace( /'{2,}/g, '' );
		// consolidate whitespace
		wikicode = wikicode.replace( /\s+/gsi, ' ' );
		// &nbsp; to space
		wikicode = wikicode.replace( /&nbsp;/gsi, ' ' );

		// In one of my test cases, there was a }} that didn't get deleted. But this is not detected by \w+, so no need to worry about it.

		// TODO: delete "see also", "references", "further reading", "external links" too? a list of references and links probably shouldn't count towards the word count

		wikicode = wikicode.trim();

		// match() returns null when there are no words, so fall back to an empty array
		const wordCount = ( wikicode.match( /\w+/g ) || [] ).length;
		return wordCount;
	}
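
One possible way to address the "handle nested templates" TODO is to track brace depth instead of using a lazy regex, which stops at the first `}}` and leaves the tail of an outer template behind. The sketch below is only an illustration: `stripTemplates` is a hypothetical helper name, not something that exists in SpeciesHelper, and it assumes braces are balanced (an unclosed `{{` would drop the rest of the text).

```javascript
// Hypothetical helper (not part of SpeciesHelper): removes {{...}} templates,
// including templates nested inside other templates, by tracking brace depth.
function stripTemplates( wikicode ) {
	let depth = 0;
	let result = '';
	let i = 0;
	while ( i < wikicode.length ) {
		const pair = wikicode.slice( i, i + 2 );
		if ( pair === '{{' ) {
			// entering a template (possibly nested)
			depth++;
			i += 2;
		} else if ( pair === '}}' && depth > 0 ) {
			// leaving the innermost open template
			depth--;
			i += 2;
		} else {
			// keep the character only when we are outside every template
			if ( depth === 0 ) {
				result += wikicode[ i ];
			}
			i++;
		}
	}
	return result;
}
```

If something like this were adopted, it could replace the `/\{\{.*?\}\}/gsi` step. The `{{Blockquote}}` conversion would still need to run first so that quoted text is kept.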

	const shouldBeStub = f.countWords( wikicode2 ) < 150; // I've been reverted before for stub-tagging an article with a word count of 175, so I'm setting this threshold fairly low.
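
The two warning cases from the description (stub tag on a long article, no stub tag on a short one) could be sketched roughly as follows. Everything here is an assumption for illustration: `hasStubTag`, `getStubWarning`, and `STUB_THRESHOLD` are hypothetical names, the stub-tag regex assumes stub templates are named `{{stub}}` or end in `stub` (for example `{{Insect-stub}}`), and the word count is assumed to come from a function like `countWords` above.

```javascript
// Assumed threshold, taken from the 150-word cutoff in the description.
const STUB_THRESHOLD = 150;

// Hypothetical check: true if the wikicode contains a template whose name ends in "stub".
function hasStubTag( wikicode ) {
	return /\{\{[^{}|]*stub\s*(?:\||\}\})/i.test( wikicode );
}

// Hypothetical helper: returns a warning string, or null when tag and length agree.
function getStubWarning( wikicode, wordCount ) {
	const tagged = hasStubTag( wikicode );
	if ( tagged && wordCount > STUB_THRESHOLD ) {
		return 'This article has a stub tag but more than ' + STUB_THRESHOLD + ' words. The stub tag may need removing.';
	}
	if ( !tagged && wordCount < STUB_THRESHOLD ) {
		return 'This article has fewer than ' + STUB_THRESHOLD + ' words and no stub tag. A stub tag may be needed.';
	}
	return null;
}
```

An article with exactly 150 words produces no warning in this sketch, since the description only covers the "more than" and "fewer than" cases.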
